The Hadoop Distributed File System, or HDFS, the core of any Hadoop cluster, has been taking a lot of flak lately. Many of the criticisms focus on HDFS’s well-known drawbacks, including its lack (until recently) of High Availability, and on questions about its security and management capabilities.
Put another way, critics are coming out of the woodwork, declaring that HDFS, and by extension Hadoop, is not enterprise-ready and that an alternative file system is needed. Well, Steve Loughran, for one, isn’t going to take it anymore.
Loughran, who is an active committer to the Apache Hadoop project and an engineer at Hortonworks, took to his blog to refute some of the criticisms and otherwise defend HDFS. He notes that some of HDFS’s critics are legacy storage vendors whose business models are under threat thanks to Hadoop’s scale-out, commodity hardware approach:
Everyone whose line of business is selling storage infrastructure has realised that not only are they not getting new sales deals for hadoop clusters … [but that] Hadoop HDFS is making it harder to justify the prices of “Big Iron” storage.
Some of the criticisms of HDFS, Loughran acknowledges, are valid. HDFS is certainly not without its flaws, and reasonable people can disagree about its maturity as an enterprise-ready platform. But Loughran also points out, correctly, that because HDFS is an open source project, its deficiencies are transparent to anyone who cares to look, and that there is a thriving community of developers and engineers working quickly to address them. You can’t say the same for any proprietary file system. Writes Loughran:
This does make it easy for anyone to point at the JIRA and say “look, the namenode can’t…”, or “look, the filesystem doesn’t…” That’s something we just have to recognise and accept.
The whole post is worth checking out. Loughran reviews a number of HDFS alternatives and gives even-handed evaluations. He’s also got a killer list of 10 questions you should ask your current storage vendor when they come calling, claiming they support Hadoop.
Also, below is a video from Hadoop Summit earlier this summer. Hortonworks’ Arun Murthy discusses the state of Hadoop from an enterprise-readiness perspective, including how to leverage existing data management skills in Hadoop environments: