Thursday, November 29, 2012

Control Entity Framework, do not let it control you


At the company where I work, they are currently facing serious problems with an application that (mis)uses Entity Framework (EF).
I have not yet worked with EF in my own projects (so do not take this post too seriously), but I spent some hours investigating how it should probably be used in a "real world" application (I mean a mid-sized or even large data-centric business application, in contrast to the "hello world" style tutorials you usually see on the internet, where DbContexts are created directly inside controller actions).
First of all, what are the main features and benefits that ship with EF?

  • Built-in mapping functionality and relationship management (foreign keys)
  • Automatic generation of CRUD SQL statements
  • No need to define SQL parameters manually (increased security due to a reduced risk of SQL injection)
  • Automatic data migrations (managed through the NuGet Package Manager) can replace non-integrated data migrations
  • Data access layer validation (data annotations)
  • Concurrency handling (with timestamps)
  • Support for all major RDBMSs
  • Quick (re-)generation of databases (possible use case: "quickly" creating testing databases within nightly builds)
  • Precompilation of queries (the translated query is cached and reused), but only from version 5 on
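
To make the data annotation and timestamp points a bit more concrete, here is a minimal code-first sketch. The Customer and ShopContext names are made up for illustration, and I am assuming a project that references the EntityFramework NuGet package; take it as a rough outline, not as a recommended model.

```csharp
using System.ComponentModel.DataAnnotations;
using System.Data.Entity;

// Hypothetical entity; the class and property names are assumptions for illustration.
public class Customer
{
    // Picked up as the primary key by EF's naming convention.
    public int Id { get; set; }

    // Data annotations: validated by EF before saving and used for the column definition.
    [Required]
    [StringLength(100)]
    public string Name { get; set; }

    // Maps to a rowversion column that EF checks on every update (optimistic concurrency).
    [Timestamp]
    public byte[] RowVersion { get; set; }
}

// Hypothetical context exposing the entity as a queryable set.
public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}
```

With a model like this, SaveChanges() should throw a DbEntityValidationException when Name is missing, and a DbUpdateConcurrencyException when somebody else changed the row since it was loaded.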

In enterprise scenarios, you are usually dealing with existing databases. The question is then how to apply EF to an existing database.
There are two possibilities: if you prefer working with code rather than designer tools (my guess is that most developers do), you would use reverse engineering tools (e.g. EF Power Tools) to generate classes and mappings for your existing database. The alternative to this code-centric way is the designer-centric way ("database first"), where you reverse engineer an .edmx model (classes and mappings are auto-generated from the .edmx).

Now, what should be taken into consideration when working with EF (most of the hints are from a TechEd 2012 session by Adam Tuliper)?
  • DbContext is not thread-safe; instantiate a new one per request (ideally via DI)
  • Do not cache it or use a static instance.
  • Dispose the DbContext when done (a DI container with a per-request lifetime does that for you)
  • Utilize the repository pattern and make EF your repository implementation
  • No EF code anywhere other than in your repository implementation (e.g. do not use EF entities as view models), and no references to EF from layers other than data access
  • Return fetched data with .ToArray() / .ToList(). Reason: EF uses deferred execution, and you usually want control over when a database query is executed (note that deferred execution outside the DbContext scope fails because the DbContext has already been disposed). Calling .ToArray() or .ToList() forces immediate execution.
  • Always check the SQL statements EF generates (MiniProfiler is a convenient way to do so); in non-trivial scenarios, replace them by telling EF to use custom stored procedures
  • Performance was improved in version 5 (see above), but be aware that EF is still slower than "raw" ADO.NET access (SqlDataReader etc.). Consider a more lightweight ORM (e.g. Dapper) if saving a few milliseconds per query is crucial for your application.
  • Stay in control of the loading process; avoid lazy loading when it is not necessary
  • EF does not have out-of-the-box support for “nolock”; you have to use transactions with READ UNCOMMITTED (or call stored procedures)
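
Several of the hints above (short-lived context, disposal, repository as the only place that touches EF, .ToList() before leaving the context scope, READ UNCOMMITTED via a transaction) can be sketched in one small repository. All type and method names below (Customer, ShopContext, ICustomerRepository, EfCustomerRepository, FindByNamePrefix, CountWithoutLocking) are hypothetical, and I am assuming references to the EntityFramework package and the System.Transactions assembly. The context factory is a simplification that stands in for a DI container with a per-request lifetime.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;
using System.Transactions;

// Hypothetical entity and context, for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}

// The only type other layers depend on; it does not mention EF at all.
public interface ICustomerRepository
{
    IList<Customer> FindByNamePrefix(string prefix);
    int CountWithoutLocking();
}

public class EfCustomerRepository : ICustomerRepository
{
    // Factory instead of a shared instance: every call gets its own context.
    private readonly Func<ShopContext> _createContext;

    public EfCustomerRepository(Func<ShopContext> createContext)
    {
        _createContext = createContext;
    }

    public IList<Customer> FindByNamePrefix(string prefix)
    {
        using (var db = _createContext())
        {
            // Building the query sends nothing to the database yet.
            // ToList() executes it while the context is still alive.
            return db.Customers
                     .Where(c => c.Name.StartsWith(prefix))
                     .OrderBy(c => c.Name)
                     .ToList();
        } // the context is disposed here
    }

    public int CountWithoutLocking()
    {
        // An ambient transaction with READ UNCOMMITTED plays the role of "nolock".
        var options = new TransactionOptions
        {
            IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted
        };

        using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
        using (var db = _createContext())
        {
            int count = db.Customers.Count();
            scope.Complete();
            return count;
        }
    }
}
```

A caller would create it with something like new EfCustomerRepository(() => new ShopContext()) and only ever work with ICustomerRepository and plain lists. In a web application, the container would typically hand the repository one context per request instead of a factory.
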
Let me know if you think other things are also important when using EF beyond “hello, world”. By the way, most of the points mentioned apply not only to EF but to every ORM.
