How to get the best performance out of NHibernate (and when not to use it at all)

Edit this page | 9 minute read

Use the right tool for the right problem

A very common sentiment I'm getting from the .NET community is the aversion against object-relational mappers like NHibernate and Entity Framework. Granted, if I could, I would use an (embeddable) NoSQL solution like RavenDB myself. They remove the object-relational friction OR/Ms try to solve and allow you to decouple your code from scalability bottlenecks like shared database servers. And quite often they provide some cool features such as map-reduce indexes and faceted search. If a NoSQL product is not an option, some would argue that writing native SQL is always the better option. But in my opinion, unless you need to squeeze out the last bit of performance from a database, writing your own SQL statements is a waste of time. But even then, I would probably prefer some lightweight mapping library such as Dapper then writing the mapping code myselfclip_image001Those same people would also argue that NHibernate adds a lot of overhead and complexity, and there's some truth in there. It's a very powerful OR/M that is very good at mapping complex object-oriented designs to a relational database schema, but there's also a lot you can do wrong that will completely kill your performance. I once made the mistake of creating an abstraction on top of NHibernate (inspired by this article). It sounded like a nice idea for testability purposes, but treating NHibernate as a persistent LINQ-enabled collection forced me to limit myself to the common denominator (a.k.a. LINQ).

Regardless, if you can't use a NoSQL solution like RavenDB, you can't apply an architectural style that avoids the object-relation mismatch (e.g. Event Sourcing and/or CQRS), you need to support multiple database vendors, and you don't need the raw performance of native SQL, I would still recommend NHibernate over Entity Framework.

Now, when you do, please don't make the mistakes I made, and apply some or all of the following tips & tricks. For the record, I'm assuming you use Fluent NHibernate to avoid those ugly XML files. I never bothered with the built-in fluent API because I kind of got the impression it is still work-in-progress (hopefully somebody can convince me otherwise).

Don't abstract NHibernate

If you're practicing Test Driven Development, don't abstract away the code that uses NHibernate and write unit tests against an in-memory Sqlite or SQL Server LocalDB using NHibernate's built-in schema generation tool. It will surface any edge cases in NHibernate's LINQ support much earlier, and you will be able to profile the underlying raw queries right from inside your unit test. Which brings me to the next point…

Understand the run-time behavior

Ayende's NHProf is an awesome tool to find performance bottlenecks and common mistakes. It will not only show you’re the queries you've been executed, but also show you the entire stack trace of the code that was involved. Next to that, it can provide you with the actual results of that query as well as the full query execution plan from the underlying database. For each query it shows what part of the execution time was spent in the database and what part is added by NHibernate as well. It will even show you whether or not the query benefitted from NH's 2nd-level cache. And did I mention all the warnings it will give you when you're making common mistake such as N+1 selects, inefficient transaction management, or requesting unbounded result sets? In a way NHProf gives you a holistic view of your application's database interaction.

clip_image002

Prefer NH's own QueryOver API over LINQ

LINQ is a common denominator and doesn't support everything NH supports such as e.g. inner joins, left and right out joins, aliasing, projection transformers, etc. One more reason not to abstract NHibernate…

decimal mostExpensiveProduct = session.QueryOver<Product>().Select(Projections.Max<Product>(x => x.Price)).SingleOrDefault<decimal>();

Use optimistic concurrency and dynamic updates

NH's default behavior is to include all columns in every UPDATE or INSERT statement, regardless if the mapped property has changed or not. NH will also include all columns in the WHERE clause when doing an optimistic concurrency check. You can improve the speed of the latter by adding some kind of incremental number or timestamp and map that one as the version for the entity. That ensures that only the versioning column is included in the WHERE clause. But you can do even better by enabling dynamic updates on the mapping, e.g. using the DynamicUpdate method of the ClassMap<T>. This will tell NH to only include the actual columns that changed in the UPDATE and INSERT statements. I don't need to explain why that will give you a nice performance boost.

public class ProductClassMap : ClassMap<Product>
{
    public ProductClassMap ()
    {
        DynamicUpdate();

        Id(x => x.ProductId);
        Version(x => x.Version);
        Map(x => x.Name);
        Map(x => x.Price);
    }
}

Avoiding the insert-insert-update for child collections

A long-standing issue that has caused a lot of confusion in many of my projects is the way NH deals with parent-child relationships (also known as HasMany associations). For reasons related to association ownership, NH will first insert the children without any foreign key, and then issue another update of those children after the parent has been added. Because of this weird algorithm, the foreign key column on the child table has to be nullable. Because of all of this, inserting a new parent with 5 childs involves a total of 11 SQL statements. 5 to insert the children, one to insert the parent and another 5 to update the foreign keys of those children. I only recently discovered that this has been changed in NHibernate 3.2 and you can now fix this by using the following (fluent) construct.

public class OrderClassMap : ClassMap<Order>
{
    public OrderClassMap ()
    {
        HasMany(x => x.Products)
            .Not.Inverse()
            .Not.KeyNullable()
            .Not.KeyUpdate()
            .Cascade.AllDeleteOrphan();
    }
}

Notice the Not.KeyNullable() and Not.KeyUpdate(). You need both to make this work. The additional Not.Inverse() is not really needed, but can be used to emphasize that the Order in this association is responsible for maintaining the foreign key relationships. In NH jargon, this means that this side owns the association. You only need the Inverse() option in bi-directional associations so that NHibernate knows whether the Parent or the Child is responsible for properly setting up foreign keys. If this inverse thing still confuses you, I can highly recommend this article.

Lazy-loading heavy loaded properties

Sometimes you need to map a property on your entity that is pretty expensive to load and save, e.g. a byte array or some serialized Json or Xml. I know, you may want to avoid that in the first place, but if you can't, you need to know that you can mark properties as lazy loading like this:

Map(x => x.Thumbnail).LazyLoad();

So, assuming the ProductClassMap from earlier examples, when you fetch one or more orders, it will exclude the thumbnail data from the SELECT statement. But as soon as you access that property, it will issue a separate SELECT to get the actual column data. One caveat though. If you have multiple lazy-loaded properties, NH will fetch all them as soon as you access any of them.

Eager fetching of associations

Associations between entities are never initialized by default. This is well-known source of the infamous N+1 SELECT problem that happens when you load a bunch of entities using a query and then iterate over them. The first SELECT will get the parent entities, but accessing the association property of each parent entity will cause another SELECT to fetch the related childs. You can tell NH to fetch those children as part of a query using @ on a case by case. But if you know you'll always need them together and you can't merge those two tables in a nice and eficient cartesion product, map the association as a Not.LazyLoad().Fetch.Join().

Components without value semantics

A very much misunderstood aspect of component mapping is that the classes that are mapped as a component must behave like a component. They must expose value type semantics and have no identity other than the combined values of all of its properties. In other words, they must override Equals() and GetHashCode(). This is especially important when you map a property to a collection of components, like this:

HasMany<Address>(x => x.Shipments)
    .KeyColumn("OrderId")
    .Table("OrderShipments")
    .Component(x =>
    {
        x.Map(c => c.ZipCode);
        x.Map(c => c.Number);
        x.Map(c => c.State);
    })
    .Cascade.AllDeleteOrphan();

If you don't, NH can't determine the equality of the objects in your collection property resulting in some weird behavior. I’ve seen NH delete and insert the same set of child objects every time somebody added an additional child to the collection. You don't notice that until you run that profiler again.

Some more tips & tricks

  • If you're in a position that you can't change too much of an existing database scheme, and you're application has vastly different needs in the way that data is read or written, you can consider multiple ClassMap to the same table. As long as all but one class maps are declared as ReadOnly, Nhibernate wil happily allow you. This has proved to be a very efficient technique to have different lazy-loading settings for the same table structure.
  • If you're into Domain Driven Design like me, you might be tempted to create all kinds of domain-specific and rich custom NHibernate types and map them to your columns. So rather than having a string-valued property to represent an ISBN number, you might define your own Isbn type. Now, if you value performance, don't. Just run a good CPU profiler like JetBrains' dotTrace to understand the impact of that.
  • Don't underestimate the power of NHibernate's second level cache offered by the likes of SysCache2. It can give you an enormous performance boost, especially if you deal with a lot of immutable data and you can avoid the infrastructural complexity of a distributed cache. Just don't forget to wrap all your code with a call to EnlistTransaction and CompleteTransaction. Ayende has written enough about that.
  • Consider you need to remove an entire range of entities from your database. You could query for them using LINQ or QueryOver and then issue individual Delete() statements on the session, but you can do better by employing NH's DML operations API. It supports HQL statements that resemble native SQL without any coupling to a specific database vendor like this:

    session.CreateQuery("delete Order order where order.CreatedAt > :minData")
        .SetDateTime(minData).ExecuteUpdate();
  • You might know that you need to define cascading operations on parent-child collections. Just don't make the mistake to do this on HasManyToMany or References mappings. They are meant to create associations between entities which lifetime is indepednent of other entities. Doing it wrong caught us by surprise a couple of times, only to discover somebody added a Cascade.All or Cascade.AllDeleteOrphan().
  • NHibernate allows you to map simple collections of single elements such as numbers or strings to a child collection. But if you do, think hard about the uniqueness of those elements. By default, NH will treat an IList or array as a bag and allow duplicate items. If you don't want that, add an AsSet to the mapping like this:
  • HasMany(x => x.Options).Table("Options").KeyColumn("ParentId").Element("OptionValue").AsSet();

Well, that got a bit out of hand. What do you think? Did I tell you something you didn't know yet? And what about you? Any tips to add to this post? Love to hear your thoughts by commenting below. And follow me at @ddoomen to get regular updates on my everlasting quest for better solutions.

Leave a Comment