Apache Druid Turns 10: The Untold Origin Story

“We did something crazy: we rolled our own database.” – Eric Tschetter, creator of Druid

Ten years ago today, the Druid data store was introduced to the world by Eric Tschetter, its creator, working at a small start-up named Metamarkets. Eric had left LinkedIn six months earlier to join us as the first full-time employee, and I was the CTO and co-founder, working in a shoebox office[1] off South Park in San Francisco. In his blog post, Introducing Druid: Real-Time Analytics at a Billion Rows Per Second, he shared the rationale for Druid’s creation:

“Here at Metamarkets we have developed a web-based analytics console that supports drill-downs and roll-ups of high dimensional data sets – comprising billions of events – in real-time. This [post introduces] Druid, the data store that powers our console. Over the last twelve months, we tried and failed to achieve scale and speed with relational databases (Greenplum, InfoBright, MySQL) and NoSQL offerings (HBase). So instead we did something crazy: we rolled our own database. Druid is the distributed, in-memory OLAP data store that resulted.”

The initial responses from HackerNews were predictably skeptical:

“It’s always tempting to build it yourself. “
“They should have just used QlikView.”
“HANA has been doing this for at least 5 years now.”

Ignoring the naysayers, Eric continued to lead the engineering team in building out Druid as the core engine for the Metamarkets platform. A year and half later, in October 2012, we open sourced Druid in a talk we gave at O’Reilly Strata’s conference [2]. Open source Druid has now been adopted by hundreds of leading companies around the globe, notably Netflix [3], Lyft [4], eBay [5], Netflix [6], Salesforce[7], Pinterest [8], Yahoo! [9], and Snap [10].

While most of this is public, there is one piece of history about Druid that hasn’t previously been shared. Before Metamarkets’ acquisition by Snap in 2017, I retained a few keepsakes from the early days. One of them was an email titled, quite simply, “Druid — the spec” from February 7, 2011. It is 78 lines and 553 words long. It lays out a simple proposal for Druid’s architecture, with a postscript “I’m going to take this project on as a background thread, working on it whenever there aren’t other more pressing things to deal with.” In the subsequent eight weeks, Eric not only wrote Druid but he pushed it into production. He sunset our HBase cluster on April 4, 2011, replacing it with the first Druid service. That original Druid cluster has been continuously operational for over a decade, having processed over 100 trillion events, 100 billion queries, and 1000s of end users in its lifetime.

Druid architecture — “the spec”

---------- Forwarded message ---------
From: Eric Tschetter <cheddar@metamarketsgroup.com>
Date: Mon, Feb 7, 2011 at 6:09 PM
Subject: Druid -- the spec
To: <eng@metamarketsgroup.com>

Given what we've learned about our data needs and various operational
issues,  I think we need the following backend in order to support our
dashboarding and analytics front-end needs.

I'll start with the requirements:

1) Fast scans of the data
2) Operationally simple to manage


So, in order to meet these requirements, the initial implementation should be

1) In-memory: the only way to ensure fast scans is to remove disk from
the equation as much as possible.  This starts with everything in
memory
2) Able to push code to data: this means to me that it should be
implemented on the JVM as it provides functionality for distributing
code and the various scripting languages on the JVM allow for a wide
variety of languages.  This is because the only way we are going to
get fast scans of data is to be able to push code down to data in
parallel.
3) Symmetrical (all nodes should be able to do the same job that other nodes do)


What I propose is:

1) Take "beta" style data and produce an "index" which is the beta
style data indexed for all of the dims.  Partition the indexes by
timestamp (daily for now) and store them on s3
2) Build up a simple Java process that, when requested (HTTP) for data
in a specific time range, will go and pull the index and load it in
memory
-- there are a number of optimizations that can be done to reduce
latency here, but we'll start really simple
3) Build a fat client that will consistently hash over the current
instances and request data in parallel to as many nodes as it needs to

What will not be in v1:

Pushing of code.  At the beginning, there will be a set of operations
that are pre-implemented and clients will be limited.  Eventually, if
this works out as hoped we will add that ability.

Benefits:

1) Simple to manage and deploy a new cluster of these things
-- Just fire up some java processes on nodes that have access to s3
-- Capacity expansion is as simple as adding new machines
-- Vertical scalability possible
2) Fast (hopefully)
-- Initial tests seem to indicate that it will be fast enough, but
this is yet to be seen.
3) Changes the problem of more dimensions from a
combinatorial(exponential) problem into a multiplicative problem
-- Makes it significantly faster to backfill
4) Allows for more flexibility in the future
5) Possible to turn this into a real-time aggregating and serving system as well
6) All dimensions available
7) Ability to do or/and queries inside of a specific dimension

Downsides:

1) A new system
-- I don't think we have a choice, though
2) Indexing implies the existence of a schema.
-- Our data isn't schema-less anyway, so I think this is acceptable


I obviously think there is more up-side than down-side, but it's my
proposal so there's most likely some bias.  Please provide feedback.
Specifically, if there are any more downsides that people can think
of, it will be good to have them out and known ahead of time.

Barring objections that might be raised, I'm going to take this
project on as a background thread, working on it whenever there aren't
other more pressing things to deal with.

--Eric

While I could share my own interpretations about the modest beginnings of technology innovation, often stories and source materials speak for themselves. That’s the appeal of the collections of stories in books like Revolution in the Valley and Founders at Work. I hope this story of Druid’s origins will be valuable to others out there who have a crazy idea for a new software architecture, and steel them with the confidence that their contribution could indeed make a dent in the technology universe.

Footnotes

[1] The shoebox was 300 Brannan Street, San Francisco. It was packed to the gills with startups, like many office buildings around South Park at the time. It was an unofficial, physical incubator where the rents were (then) affordable and the amenities were few. Guillermo Rauch, who went out to create Next.js and Vercel, was downstairs from us; Daniel Gross, was then a 19-year prodigy and media darling working on Greplin (eventually Cue) down the hall. The entire building reeked of a sickly sweet odor from its first floor restaurant, aptly named Ozone Thai.

[2] Beyond Hadoop: Fast Ad-Hoc Queries on Big Data Michael Driscoll, Eric Tschetter. O’Reilly Strata Conference 2012. I remember the night before the Strata talk, our head of sales & marketing, Eric, and I were holed up in a Manhattan hotel room rehearsing while Eric pored over the code base to clear it for public release.

[3] How Superset and Druid Power Real-Time Analytics at AirBnB. (June 2017) by Maxime Beauchemin, YouTube.

[4] Streaming SQL and Druid at Lyft. (August 2018) by Arup Malakar, YouTube.

[5] Monitoring at eBay with Druid (May 2019). Mohan Garadi, eBay Blog.

[6] How Netflix uses Druid for Real-time Insights to Ensure a High-Quality Experience (March 2020). Ben Sykes, Netflix Tech Blog.

[7] Salesforce – Delivering High-Quality Insights Interactively Using Apache Druid at Salesforce (2020). Dun Lu, Salesforce Engineering Blog.

[8] Powering Pinterest ad analytics with Apache Druid (Jan 2020). Filip Jaros and Weihong Wang, Pinterest Engineering Blog.

[9] Yahoo Casts Real-Time OLAP Queries with Druid (Aug 2015). Datanami.

[10] Data analytics and processing at Snap (Sep 2018). Charles Allen, Slideshare.

Apache, Apache Druid, Druid, the Apache Druid logo, Apache Superset, and Superset are registered trademarks or trademarks of Apache Software Foundation.

Apache Druid Turns 10: The Untold Origin Story

Druid architecture — “the spec”

Footnotes

Like this:

Published by Michael Driscoll

Leave a ReplyCancel reply

Druid architecture — “the spec”

Footnotes

Share this:

Like this:

Published by Michael Driscoll

Leave a ReplyCancel reply

Discover more from m. e. driscoll