Big data is heading down the path to becoming a $50 million industry by 2017. The incumbent big boys of the tech sector – HP, IBM, Intel, Oracle – dominate the emerging big data market at the moment. But I’m here at the Hadoop Summit this week, where legacy vendors, startups, service providers, systems integrators and basically everyone else are racing to become the first to offer a truly enterprise-grade Hadoop-based big data solution. Whoever gets there first wins.
As SiliconANGLE’s John Furrier explained earlier today, Hortonworks, the Hadoop solution developer angling to become the Red Hat of enterprise big data with a free product backed by paid services, has boosted its own go-to-market play and mainstream appeal considerably thanks to partnerships with the likes of Microsoft and VMware.
Other vendors, too, are using Hadoop as the basis for their own products, using the shortcomings of the platform (weak interface, difficult management, complexity of deployment, I/O speeds) as the springboard to build business and add value. NetApp is at the summit hyping its Hadoop-ready storage infrastructure offerings. Talend is pitching a connector to help integrate a visual interface with Hadoop’s HCatalog and Oozie services. Datameer provides an Excel spreadsheet-like view into the big data table.
But there’s a common theme here, and it’s less than auspicious for the big data ecosystem as a whole. Many vendors here are promoting the ability to use SQL (or at least SQL-like) queries to get real-time insight from unstructured data stored on Hadoop. Kognitio, Karmaworks, and Drawn to Scale are all here promoting just that.
And I’ve heard at least one conference attendee mention that they prefer Apache Hive to Apache Pig for analytics and warehousing purposes, if only because of that same SQL likeness.
In and of itself, it’s not a problem: Easier, more familiar access to the insights a professional-grade Hadoop deployment can bring is definitely worth its weight in gold.
But if customers are so nervous about Hadoop that they want and need the security blanket of SQL just to be comfortable using it, that speaks to the increasingly dire skills gap and talent shortage in the big data marketplace. There’s still plenty of value in the traditional relational database model. But if the compromise of pairing petabyte-scale data sets with SQL queries is the best the market can come up with, Hadoop may just be in trouble.
The Hadoop ecosystem is standing on the cusp. Hot Silicon Valley companies like Facebook, Twitter and Netflix are vocal about their increasing usage of Hadoop here at the summit. Vendors like Hortonworks, Cloudera, DataStax and other pure-play Hadoop providers are exhibiting market leadership, sure.
But there needs to be a redoubled focus on education and thought leadership, too, if we’re to see that $50 million milestone marker go past in the rear-view mirror.