YARN – the fledgling next-gen MapReduce subproject of the incredibly popular Apache Hadoop big data platform – may already have some competition as Dremel, an analytics platform born in Google’s labs, picks up steam, reports Wired Enterprise.
Hadoop, as it stands today, has some limitations. It requires the use of MapReduce, leaving other data analysis frameworks out in the cold, which is especially a problem if you’re looking for real-time analytics. That’s why the big data world is looking to YARN as the future of the field: it takes the groundwork that MapReduce laid and makes it more versatile and flexible.
That’s also why Dremel is getting some attention. Like Hadoop, it’s used to analyze huge data sets. But unlike Hadoop, or at least Hadoop as it stands today, Dremel is designed to provide instant answers, bringing the best of SQL and MapReduce together while technically using neither (Dremel uses an “SQL-like” querying language).
“The size of the data and the speed with which you can comfortably explore the data is really impressive. People have done Big Data systems before, but before Dremel, no one had really done a system that was that big and that fast. Usually, you have to do one or the other. The more you do one, the more you have to give up on the other. But with Dremel, they did both,” UC Berkeley professor Armando Fox told Wired.
Dremel is already packaged in a public-friendly offering: Google BigQuery, which we’ve written about before, is a core part of the Google Cloud Platform, and the search giant is trying to make big data a pillar of its enterprise cloud computing strategy. Meanwhile, as Wired says, an Israeli startup is developing OpenDremel, based on Google’s research, though it hasn’t gotten very far yet.
In short, YARN definitely deserves plenty of attention. But in the next generation of big data analytics, Hadoop may not have the stranglehold it’s enjoying in these frontier days.