
In this post I’ll share some of the thinking behind our choices for the Big Data stack that powers our petabyte platform, consisting of three layers (i) a processing and storage substrate based around Hadoop and HBase, (ii) an analytics engine that mixes R, Python, and Pig and (iii) a visualization console and data API built principally in Javascript.