Thursday, June 16, 2016

Event Sourcing from the Trenches: Aggregates

While visiting QCon New York this year, I realized that a lot of the architectural problems that were discussed there could benefit from the Event Sourcing architecture style. Since I've been in charge of architecting such a system for several years now, I started to reflect on the work we've done: what worked for us, what I would do differently next time, and what I still haven't made up my mind about. So please let me share some of my thoughts on this.

Event Sourcing (ES) is an awesome architecture style for high-performance systems that supports some powerful concepts such as fine-grained conflict handling, optimized projections, potentially unlimited horizontal scaling, and great business buy-in. But it also introduces a lot of complexity, such as eventual consistency, event versioning and projection migration challenges. As with every (design) pattern, methodology or tool, you need to consider the trade-offs and the problem you're trying to solve. Don't jump on the ES train just because it sounds like a cool thing to work on. We only migrated from a CQRS-based architecture to ES to build an application-level replication protocol, even though we knew about ES when the entire project started. Granted, because of my positive experience in the current project, I would definitely consider ES for any non-trivial system. But I'm fully aware I might be falling into the second-system trap.

Use Event Storming
Event Storming is a technique to identify your (business) events from conversations with the business rather than extract them from your domain. Since we migrated from a relational-database-based domain model loosely based on Domain-Driven Design principles to event sourcing, we had to extract our events from the existing code. This sometimes resulted in what Yves Reynhout amusedly called "property sourcing": events that were rather technical and never captured the actual business intent. Event Storming helps you identify the dynamics of the process you're trying to model, in contrast to the state-oriented domain modelling approach. A nice side effect is that it surfaces potential conflicts in definitions, which may warrant the introduction of separate bounded contexts.
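To make the difference between "property sourcing" and intent-revealing events a bit more tangible, here is a small Python sketch. All event names and fields are made up for illustration and do not come from our project; the point is only that the event type itself carries the business reason.

```python
from dataclasses import dataclass
from datetime import date


# "Property sourcing": the event only records that a field changed.
# A reader cannot tell why the delivery date moved.
@dataclass(frozen=True)
class OrderDeliveryDateChanged:
    order_id: str
    new_delivery_date: date


# Intent-revealing alternatives, the kind of events that might come out
# of an Event Storming session (hypothetical names).
@dataclass(frozen=True)
class DeliveryPostponedByCustomer:
    order_id: str
    new_delivery_date: date


@dataclass(frozen=True)
class DeliveryDelayedBySupplier:
    order_id: str
    new_delivery_date: date
    supplier_id: str


def needs_apology(event) -> bool:
    # This decision is only possible because the event type carries intent;
    # OrderDeliveryDateChanged gives us nothing to base it on.
    return isinstance(event, DeliveryDelayedBySupplier)
```

With the property-style event, a projection or policy that wants to react to a supplier delay would have to guess the reason from other data.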

Don't rely on aggregates to be in sync at all times
If your order (logically) references a product, don't rely on that product to exist or to be in a certain state. By following this principle, your code will be designed to handle non-existing data from day one. If, in the future, your performance requirements grow so much that you need to partition your event store, you can do so. Trying to make your code handle these situations later on is going to be extremely painful. Projection code that aggregates events from multiple aggregates or maintains lookups is particularly susceptible to this. Prepare for it.
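As a rough illustration of what "prepare for it" could look like in projection code, here is a minimal Python sketch. The event and class names (OrderPlaced, ProductRegistered, OrderSummaryProjection) are hypothetical and the projection is kept in memory for brevity. It accepts an order whose product has not been seen yet and fills in the missing detail when the product event eventually arrives.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProductRegistered:
    product_id: str
    name: str


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    product_id: str


@dataclass
class OrderSummary:
    order_id: str
    product_id: str
    product_name: Optional[str] = None  # None until the product is known


class OrderSummaryProjection:
    """Builds order summaries without assuming the product already exists."""

    def __init__(self) -> None:
        self.summaries: Dict[str, OrderSummary] = {}
        self.product_names: Dict[str, str] = {}

    def handle(self, event) -> None:
        if isinstance(event, OrderPlaced):
            # The product may live in another partition and arrive later,
            # so a missing name is a normal situation, not an error.
            name = self.product_names.get(event.product_id)
            self.summaries[event.order_id] = OrderSummary(
                event.order_id, event.product_id, name
            )
        elif isinstance(event, ProductRegistered):
            self.product_names[event.product_id] = event.name
            # Back-fill summaries that were created before the product showed up.
            for summary in self.summaries.values():
                if summary.product_id == event.product_id:
                    summary.product_name = event.name
```

The order of the two events does not matter for the end result, which is exactly the property you want once streams are processed independently.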

Postpone snapshots until you really need them
Snapshots add complexity that you may not need. For instance, in our project we have two kinds of aggregates. Either they live a couple of days and receive a lot of events, but get abandoned after that, or they live very long (like a user aggregate) but receive only a couple of events over a period of months. In both cases, we had no need for snapshots since the number of events per aggregate is rather low.
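For reference, this is roughly what loading an aggregate without snapshots boils down to: replay every event of the stream from the start. The Python sketch below uses a hypothetical user aggregate with two made-up events; it is not our actual implementation. As long as streams stay short, this full replay is cheap and there is nothing extra to version or migrate.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class UserRegistered:
    user_id: str
    email: str


@dataclass(frozen=True)
class EmailChanged:
    user_id: str
    new_email: str


class User:
    def __init__(self) -> None:
        self.user_id: Optional[str] = None
        self.email: Optional[str] = None
        self.version = 0  # number of events applied so far

    def apply(self, event) -> None:
        if isinstance(event, UserRegistered):
            self.user_id = event.user_id
            self.email = event.email
        elif isinstance(event, EmailChanged):
            self.email = event.new_email
        self.version += 1


def load_user(events: Iterable[object]) -> User:
    """Rebuild the aggregate by replaying its full event stream."""
    user = User()
    for event in events:
        user.apply(event)
    return user
```

A snapshot would only replace the first part of that loop with a stored state, so it pays off only once the loop itself becomes a measurable cost.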

Identify a partition key for your aggregate
Even though you won't need it immediately, it allows you to scale in the future by partitioning the event store by that key. For instance, orders in a purchase system may be tied to a country. It's not like an order suddenly moves from one country to the other. And if the unexpected still happens, you always have the choice to copy the aggregate into a new aggregate with a different partition key. If there's no natural partition key, try to come up with something synthetic anyway.
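A possible way to carry the partition key around is to make it part of the stream identifier and to derive the physical partition from it with a stable hash. The Python sketch below is only one option under that assumption; the function names, the "key/id" stream format and the bucket count are invented for illustration. It uses hashlib rather than Python's built-in hash(), because the latter is randomized per process for strings.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition number in a stable way."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def stream_id(partition_key: str, aggregate_id: str) -> str:
    # Keeping the key in the stream id means it is known from day one,
    # even while everything still lives in a single event store.
    return f"{partition_key}/{aggregate_id}"


def synthetic_partition_key(aggregate_id: str, bucket_count: int = 16) -> str:
    # Fallback for aggregates without a natural key such as a country:
    # derive a bucket from the aggregate id itself.
    return f"bucket-{partition_for(aggregate_id, bucket_count)}"
```

For example, stream_id("NL", "order-42") yields "NL/order-42", and every order with the key "NL" ends up in the same partition once you decide to split the store.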

Feedback, please!
So what do you think? Do my thoughts make sense? Am I too pragmatic here? Are you using Event Sourcing yourself? If so, care to share some experiences? I'd really love to hear your thoughts in the comments below. Oh, and follow me at @ddoomen to get regular updates on my everlasting quest for better solutions.