Big data’s getting bigger by the day, and unless we can get a handle on what to do with it all, it’s going to cause some real headaches for those in the data storage business.
Just how much data is there?
As it stands today, around 10 trillion gigabytes of data are generated every year, a figure that HP reckons will grow to 50 trillion gigabytes by 2020. That’s an awfully big number, but to really understand just how much data is being created, it helps to see it in physical terms that we can better relate to.
Fortunately, a new infographic designed by Avalaunch Media and presented by PC Wholesale does a pretty good job of it. The infographic estimates it would take a staggering 600 billion smartphones to be able to store that 10 trillion gigabytes of data we’re currently generating each year. To get an idea of just how many smartphones that is, well, let’s just say that if you laid them all out end-to-end, you’d have enough phones to stretch across every single road in the United States… twelve times!
That’s a fantastic number of smartphones to be sure, and while in reality it’s not just handsets we rely on to store all of our data, it gives us an idea of the tremendous strain that our existing data infrastructure will be put under if things keep going the way they are.
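As a rough sanity check on those figures, the short sketch below works out the storage capacity per handset that the infographic's estimate implies, and how many such handsets HP's 2020 projection would need. The function and variable names are illustrative, and decimal units (1 trillion = 10^12) are assumed, since the article does not specify.

```python
# Figures quoted in the article; decimal units are an assumption.
DATA_PER_YEAR_GB = 10e12   # ~10 trillion GB generated per year today
DATA_2020_GB = 50e12       # HP's projection for 2020
SMARTPHONES = 600e9        # the infographic's smartphone estimate


def implied_capacity_gb(total_gb: float, devices: float) -> float:
    """Average capacity each device must have to hold total_gb."""
    return total_gb / devices


def devices_needed(total_gb: float, capacity_gb: float) -> float:
    """Number of devices of a given capacity needed to hold total_gb."""
    return total_gb / capacity_gb


per_phone = implied_capacity_gb(DATA_PER_YEAR_GB, SMARTPHONES)
phones_2020 = devices_needed(DATA_2020_GB, per_phone)

print(f"Implied capacity per phone: {per_phone:.1f} GB")
print(f"Phones needed for the 2020 projection: {phones_2020:.2e}")
```

On these numbers the infographic is assuming handsets of roughly 16 to 17 GB each, and the 2020 projection would need about five times as many of them.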
Facebook, the biggest data hog in the business right now, is already feeling the strain. The social media giant is attempting to be proactive with its newly announced Project Prism, an ambitious undertaking that will eventually allow it to physically scatter its data centers far and wide whilst maintaining a single view of all of its data, something that it has so far been unable to do using its current MapReduce framework.
Facebook’s project is interesting, but from what we can tell it doesn’t address the most fundamental problem: the sheer amount of hardware needed to hold all this data. The company is still going to need a shed load of infrastructure, regardless of whether it’s housed in one place or scattered around all four corners of the globe, and it’s this issue that will become the most pressing over time.
Data in a bottle
Storage systems have come a long way; there’s no doubt about that. Twenty years ago we would never have believed that each of us would be carrying several dozen gigabytes of storage around on our key rings (flash drives) or in our handbags (phones), but indeed we are.
The big question is just how much smaller can we go?
Infinitesimally small, if the researchers at Harvard University have anything to do with it. Led by the genomics pioneer George Church, the team has just discovered a mind-boggling new way to store big data using a DNA sequence.
The Wall Street Journal first reported on Church’s incredible discovery, explaining how he recently completed an experiment that saw him encode the contents of an entire book into strands of DNA molecules, which were then transformed into a viscous liquid that could be kept safely in a test tube for hundreds of years without deteriorating:
The Harvard researchers started with the digital version of the book, which is composed of the ones and zeros that computers read. Next, on paper, they translated the zeros into either the A or C of the DNA base pairs, and changed the ones into either the G or T.
Then, using now-standard laboratory techniques, they created short strands of actual DNA that held the coded sequence—almost 55,000 strands in all. Each strand contained a portion of the text and an address that indicated where it occurred in the flow of the book.
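The scheme described in that passage can be sketched in a few lines of Python. This is only an illustration of the idea, not the Harvard team's actual procedure: the payload and address sizes, the rule for choosing between the two permitted bases, and all function names are assumptions made for the example.

```python
ZERO_BASES = "AC"    # a 0 bit may be written as A or C (per the quoted passage)
ONE_BASES = "GT"     # a 1 bit may be written as G or T
PAYLOAD_BITS = 96    # assumed payload size per strand (illustrative)
ADDRESS_BITS = 19    # assumed address size per strand (illustrative)


def text_to_bits(text: str) -> str:
    """Turn text into a string of ones and zeros via UTF-8."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_bases(bits: str) -> str:
    """Map each bit to a base, alternating between the two allowed choices.

    Alternation is an arbitrary rule picked for this sketch.
    """
    return "".join(
        (ONE_BASES if bit == "1" else ZERO_BASES)[i % 2]
        for i, bit in enumerate(bits)
    )


def bases_to_bits(bases: str) -> str:
    """Invert the mapping: A or C reads as 0, G or T reads as 1."""
    return "".join("0" if base in ZERO_BASES else "1" for base in bases)


def encode(text: str) -> list[str]:
    """Split the text into short strands, each prefixed with its address."""
    bits = text_to_bits(text)
    strands = []
    for address, start in enumerate(range(0, len(bits), PAYLOAD_BITS)):
        payload = bits[start:start + PAYLOAD_BITS]
        strands.append(bits_to_bases(f"{address:0{ADDRESS_BITS}b}" + payload))
    return strands


def decode(strands: list[str]) -> str:
    """Reassemble the text from strands supplied in any order."""
    chunks = {}
    for strand in strands:
        bits = bases_to_bits(strand)
        chunks[int(bits[:ADDRESS_BITS], 2)] = bits[ADDRESS_BITS:]
    bits = "".join(chunks[address] for address in sorted(chunks))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


message = "Big data in a test tube"
strands = encode(message)
recovered = decode(list(reversed(strands)))  # strand order does not matter
```

Because every strand carries its own address, the strands can sit jumbled together in the liquid and still be put back in the right order when they are read.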
Admittedly, the technique is still some way off from being viable on a commercial basis, but there’s no denying DNA’s potential as a stable and long-term information archive.
DNA data is a pretty mind-blowing concept, but would you believe that people are looking at making data even smaller than that? A test tube is small enough, but its contents still run to an astronomical number of atoms – wouldn’t it be nice to shed most of them and store a bit of data with, say, just 12 atoms?
IBM wants to. A few months ago the computer giant revealed that it’s looking into the use of nanomaterials to create next-generation memory chips that will not only be much smaller than anything we’ve seen to date, but also consume far less power. What they came up with was little short of miraculous: a data array made up of just 12 atoms – two rows of six iron atoms laid out on a surface of copper nitride – that can be programmed to remember a binary “0” or “1”. Using eight of these arrays, 96 atoms in all, the IBM team was able to create an entire byte of data, upon which they later encoded the IBM motto “Think” by repeatedly programming the memory block to store representations of its five letters, one after another.
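The arithmetic behind those figures is easy to reproduce. The sketch below counts the atoms needed at 12 atoms per bit and shows the five bit patterns a single byte would hold in turn to spell "Think". The use of 8-bit ASCII codes is an assumption made for illustration, as the article does not say which representation IBM used, and the names are hypothetical.

```python
ATOMS_PER_BIT = 12   # from the article: two rows of six iron atoms
BITS_PER_BYTE = 8


def atoms_for_bytes(n_bytes: int) -> int:
    """Atoms needed to store n_bytes at 12 atoms per bit."""
    return n_bytes * BITS_PER_BYTE * ATOMS_PER_BIT


def byte_patterns(word: str) -> list[str]:
    """8-bit pattern for each letter (ASCII assumed for this sketch)."""
    return [f"{ord(letter):08b}" for letter in word]


atoms_per_byte = atoms_for_bytes(1)
patterns = byte_patterns("Think")

print(f"Atoms in one byte: {atoms_per_byte}")
for letter, pattern in zip("Think", patterns):
    # The same 96-atom block is reprogrammed once per letter.
    print(letter, pattern)
```

One byte comes to 96 atoms, matching the array described above, and under the ASCII assumption the letter "T" would be stored as 01010100.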
Again, this is a technology that’s still in its infancy, but IBM has a history of leading innovation – could it be that this is just the first step in a big data revolution?