Sometimes, you want things fast…really fast!

Db2 Event Store — How fast is it?

Taking on the competition, one query at a time

When you build a database system for performance (as one always should), you’re bound to get competitive questions. This is especially true if you make objective claims about the system’s performance.

Why Druid?

Largely because Event Store and Druid have a lot in common. For starters, they’re both designed to ingest large volumes of data and perform analytics over that data in real time. Secondly, both systems are designed for high availability and built to scale to very large clusters. Finally, Apache Druid has been around for a number of years and has become popular for a number of large use cases in that time (Airbnb, Uber, Netflix). Given this significant overlap, it made sense to compare Event Store performance to that of Druid.

But how to run the benchmark?

The Event Store performance team struggled a bit when looking for a suitable way to benchmark against Apache Druid. In-house, we test Event Store’s performance (and guard against performance regressions) using the TPC-DS workload. Unfortunately, this workload wouldn’t work here, because Druid doesn’t support multi-table joins, which TPC-DS requires. After a few failed attempts at choosing a benchmark, we stumbled across this benchmark, published by the folks at Apache Druid to compare their system against MySQL (to which, it’s worth noting, it compares very favourably).

The nine benchmark queries

Benchmark results — Insert

To run the query benchmark, we first had to load the data. We generated a 100 GB data set using the publicly available TPC-H tools and loaded it into both Druid (using the Hadoop Batch method) and Db2 Event Store (using insert-subselect from an external table).

Fast data load means less time waiting, more time querying

Benchmark results — Query

With the data loaded, we were able to run the query benchmark. To do this, we ran the nine queries mentioned above and found that Db2 Event Store ran the workload noticeably faster than Druid. The entire workload took 76.4 seconds to complete in Druid and only 40.4 seconds in Event Store, making Db2 Event Store nearly 1.9x faster.
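For readers who want to check the arithmetic or repeat the measurement on their own cluster, here is a minimal sketch of how a total workload time and the resulting speedup could be computed. It is not the harness we used: `time_workload` and `speedup` are illustrative names, and `run_query` is a hypothetical callable standing in for whatever client you use to submit a query to Druid or Event Store and wait for the result.

```python
import time
from typing import Callable, Iterable


def time_workload(queries: Iterable[str], run_query: Callable[[str], object]) -> float:
    """Return total wall-clock seconds spent running every query once, in order."""
    start = time.perf_counter()
    for sql in queries:
        # Hypothetical client call: submit the query and block until it finishes.
        run_query(sql)
    return time.perf_counter() - start


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate finished compared to the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Workload totals reported in this post for the nine benchmark queries.
DRUID_SECONDS = 76.4
EVENT_STORE_SECONDS = 40.4

ratio = speedup(DRUID_SECONDS, EVENT_STORE_SECONDS)
print(f"Event Store speedup over Druid: {ratio:.2f}x")
```

With the totals above, the ratio works out to roughly 1.89, which is where the "nearly 1.9x" figure comes from. A real harness would likely also repeat each run several times and separate warm-up runs from measured ones.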

Almost twice the insights in the same amount of time

Additional queries

After analyzing the benchmark queries, we noticed that they were all fairly simple. This led us to wonder how the two systems would perform with more complex queries. To test this, we wrote two additional, more complex queries and ran them on both Db2 Event Store and Druid. Here are the queries we created:

When life gets complex, you need the right tools

Db2 Event Store — High performance for Fast Data

If you’ve got a Fast Data problem and are considering Apache Druid, I’d encourage you to look at Db2 Event Store as well. In addition to dramatically better performance, it’s highly available by default, is more storage efficient than Druid, stores its data in an open data format (Apache Parquet) to avoid vendor lock-in, and is tightly integrated with Watson Studio to leverage advanced analytics.

Adam has been developing and designing complex software systems for the last 15+ years. He is also a son, brother, husband and father.
