what to feed the mythical machine learning beast?

One of the holy grails of machine learning is the creation of a system that can “read the web” and learn from it, as Isaac Newton read Euclid’s Elements and taught himself geometry. Imagine a mythical beast that could speed-read one hundred million pages per second, consuming every Wikipedia entry, every scientific article on arxiv.org, every… Continue reading “what to feed the mythical machine learning beast?”

beyond hadoop: fast queries from big data

There’s an unspoken truth lurking behind the scourge of Big Data and the heralding of Hadoop as its savior: While Hadoop shines as a processing platform, it is awkward as a query tool. Hive was developed by the folks at Facebook in 2008, as a means of providing an easy-to-use, SQL-like query language that would… Continue reading “beyond hadoop: fast queries from big data”

how Oracle, the Goliath of data, could stumble

This week’s Oracle OpenWorld was bracketed by two events. First: the unveiling of Oracle Exalytics, a beefy in-memory appliance dedicated to large-scale analytics, during Larry Ellison’s opening keynote. Second: the undressing of Oracle’s cloud computing initiatives by Marc Benioff, Salesforce’s CEO, and the unceremonious cancellation of his keynote on Wednesday morning. Both events highlight that… Continue reading “how Oracle, the Goliath of data, could stumble”

the secret guild of silicon valley

A couple of weeks ago, I was drinking beer in San Francisco with friends when someone quipped: “You have too many hipsters, you won’t scale like that. Hire some fat guys who know C++.” It’s funny, but it got me thinking. Who are the “fat guys who know C++”, or as someone else put it, “the… Continue reading “the secret guild of silicon valley”

node.js and the javascript age

Three months ago, we decided to tear down the framework we were using for our dashboard, Python’s Django, and rebuild it entirely in server-side JavaScript, using node.js. (If there is ever a time in a start-up’s life to remodel parts of your infrastructure, it’s early on, when your range of motion is highest.) This decision… Continue reading “node.js and the javascript age”