Sometimes, you want things fast…really fast!

Db2 Event Store — How fast is it?

Taking on the competition, one query at a time

Adam Storm
6 min read · Sep 18, 2019


When you build a database system for performance (as one always should), you’re bound to get competitive questions. This is especially true if you make objective claims about the system’s performance.

When we built Db2 Event Store, we targeted 1 million inserts per second for each of the nodes of the cluster (with scalability to 100s of nodes). Several years after setting this target we were easily able to achieve this for 40 byte events (a common size for IoT data where events typically consist of an 8 byte timestamp, an 8 byte sensor ID, and a small number of 8 byte readings).
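As a back-of-envelope check on that event size and the resulting per-node ingest bandwidth, here is a short Python sketch. The count of three readings is an assumption for illustration; the article only says "a small number":

```python
# Sizing the 40-byte IoT event described above.
# NOTE: NUM_READINGS = 3 is an assumed value; the text only says
# "a small number of 8 byte readings".
TIMESTAMP_BYTES = 8
SENSOR_ID_BYTES = 8
NUM_READINGS = 3
READING_BYTES = 8

event_bytes = TIMESTAMP_BYTES + SENSOR_ID_BYTES + NUM_READINGS * READING_BYTES
print(event_bytes)  # 40

# At the 1 million inserts/second/node target, the raw data rate is:
ingest_mb_per_sec = 1_000_000 * event_bytes / 1e6
print(ingest_mb_per_sec)  # 40.0 (MB/s per node)
```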

In the past year, we set out to pair this high-speed ingest with lightning-fast, in-memory-optimized query performance. We did this by pairing Event Store’s modern storage engine with the industry’s most mature SQL engine, Db2, and the results are impressive. Event Store can now easily handle point queries (serviced by an automatically generated index) in a handful of milliseconds, and complex analytics queries (through the benefit of synopsis metadata) in a fraction of the time they’d take on the traditional systems servicing IoT workloads today.

Where queries differ from ingest, however, is in the difficulty of turning absolute performance numbers into anything meaningful. While our customers typically know the ingest volume they’d require a system to handle in a given period of time, every query workload is different. As a result, the most meaningful way to evaluate a system’s query performance is to compare it against a competitor’s system on a collection of well-defined queries; in other words, to run a benchmark.

While there are many competitors in the data space (and, it seems, more every week), the one that comes up most often is Apache Druid.

Why Druid?

Largely because Event Store and Druid have a lot in common. For starters, both are designed to ingest large volumes of data and perform analytics over that data in real time. Secondly, both systems are designed for high availability and built to scale to very large clusters. Finally, Apache Druid has been around for a number of years and has been adopted for several large use cases in that time (Airbnb, Uber, Netflix). Given this significant overlap, it only made sense to compare Event Store’s performance to Druid’s.

But how to run the benchmark?

The Event Store performance team struggled a bit to find a suitable way to benchmark against Apache Druid. In-house, we test Event Store’s performance (and guard against performance regressions) using the TPC-DS workload. Unfortunately, that workload wouldn’t work here, as Druid doesn’t support multi-table joins, which TPC-DS requires. After a few failed attempts at choosing a benchmark, we stumbled across this benchmark, published by the folks at Apache Druid to compare their system against MySQL (to which, it’s worth noting, it compares very favourably).

The benchmark contains nine single-table SQL queries based on the LINEITEM table from the TPC-H benchmark. Each query performs a count, sum, group by, or order by, operations commonly found in fast data query workloads.

The nine benchmark queries

Since the original benchmark was run several years ago, on hardware we couldn’t procure today, we set up an Apache Druid cluster (using Hortonworks HDP version 3.1.0, which contains Druid version 0.12.1) on 3 physical machines, each with 28 cores, 386 GB of RAM, and 2 direct-attached SSDs. On the same hardware, we set up Db2 Event Store version 2.0. We then configured Druid for the environment by moving the segment cache location to the locally attached SSDs and increasing the number of processing threads to 55. Db2 Event Store used its default configuration, without any modifications.

From a storage perspective, Apache Druid was running on HDFS and Db2 Event Store was leveraging NFS storage (one of its many storage options).

Benchmark results — Insert

To run the query benchmark, we first had to load the data. We generated a 100 GB data set using the publicly available TPC-H tools and loaded it into both Druid (using the Hadoop Batch method) and Db2 Event Store (via insert-subselect from an external table).

The ingest performance difference between Druid and Event Store was significant. Inserting the data into Druid took 7734 seconds (2 hours and 8 minutes) while the insert into Db2 Event Store took only 730 seconds (12 minutes and 10 seconds) — a 10.6x difference.

Fast data load means less time waiting, more time querying

Benchmark results — Query

With the data loaded, we were able to run the query benchmark. To do this we ran the above-mentioned nine queries and found that Db2 Event Store ran the workload noticeably faster than Druid. The entire workload took 76.4 seconds to complete in Druid and only 40.4 seconds in Event Store, making Db2 Event Store nearly 1.9x faster.
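The headline ratios follow directly from the reported timings; a quick sanity check in Python:

```python
# Reported timings from the benchmark runs above (in seconds).
druid_insert_s, event_store_insert_s = 7734, 730
druid_query_s, event_store_query_s = 76.4, 40.4

insert_speedup = druid_insert_s / event_store_insert_s
query_speedup = druid_query_s / event_store_query_s
print(round(insert_speedup, 1))  # 10.6
print(round(query_speedup, 2))   # 1.89
```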

Almost twice the insights in the same amount of time

When we looked at the queries one-by-one, we found that in all but two queries, Db2 Event Store outperformed Druid, often significantly.

Additional queries

After analyzing the benchmark queries, we noticed that they were all fairly simple. This led us to wonder how the two systems would perform on more complex queries. To test this, we created two more complex queries and ran them on both Db2 Event Store and Druid. Here are the queries we created:

select sum(L_EXTENDEDPRICE)
from lineitem
group by L_ORDERKEY
order by sum(L_EXTENDEDPRICE) desc
fetch first 100 rows only

select sum(L_EXTENDEDPRICE)
from lineitem
group by L_PARTKEY, L_ORDERKEY
order by sum(L_EXTENDEDPRICE) desc
fetch first 100 rows only
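To make the shape of these queries concrete, here is a minimal, runnable sketch using Python’s built-in sqlite3 module against a tiny synthetic stand-in for LINEITEM. This is purely illustrative: the data is made up (not benchmark data), SQLite uses LIMIT where Db2 uses FETCH FIRST … ROWS ONLY, and the grouping key is added to the select list for readability:

```python
import sqlite3

# Tiny in-memory stand-in for TPC-H LINEITEM; column names match TPC-H,
# but the rows are synthetic sample data, not benchmark data.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE lineitem (
    L_ORDERKEY INTEGER, L_PARTKEY INTEGER, L_EXTENDEDPRICE REAL)""")
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?, ?)",
    [(1, 10, 100.0), (1, 11, 50.0), (2, 10, 300.0), (3, 12, 25.0)],
)

# First additional query: total extended price per order, top 100.
# (The second query differs only in grouping by L_PARTKEY, L_ORDERKEY.)
top_orders = conn.execute("""
    SELECT L_ORDERKEY, SUM(L_EXTENDEDPRICE) AS total
    FROM lineitem
    GROUP BY L_ORDERKEY
    ORDER BY total DESC
    LIMIT 100""").fetchall()
print(top_orders)  # [(2, 300.0), (1, 150.0), (3, 25.0)]
```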

When these queries were run against Event Store and Druid, we found that the performance advantage that Event Store illustrated on the base benchmark was amplified. The first additional query took 31 seconds to run on Event Store and 861 seconds to run on Druid (a difference of more than 27x), while the second query took 159 seconds to run on Event Store, and 1819 seconds to run on Druid (a difference of more than 11x).

When life gets complex, you need the right tools

It’s also worth noting that while these additional queries are somewhat more complex than the original benchmark queries, they’re by no means complex. Truly complex queries would need to include joins, which, as mentioned above, Druid doesn’t support.

Db2 Event Store — High performance for Fast Data

If you’ve got a Fast Data problem and are considering Apache Druid, I’d encourage you to look at Db2 Event Store as well. In addition to dramatically better performance, it’s highly available by default, is more storage efficient than Druid, stores its data in an open data format (Apache Parquet) to avoid vendor lock-in, and is tightly integrated with Watson Studio to leverage advanced analytics.

For more info, or to start your trial, visit our website. You can also take a product tour or reserve an Event Store demo system free of charge in the cloud here. If you have further questions, please feel free to reach out.


Written by Adam Storm

Adam has been developing and designing complex software systems for the last two decades. He is also a son, brother, husband and father.
