Big Data – What the Heck is... A Zettabyte or Brontobyte?

If you bought your first home computer in the early days – the 80s, or even the 70s if you were a very early adopter – you will probably remember when storage space was measured in kilobytes (1,000 bytes) or megabytes (1 million bytes).

The first commercially available hard disk drives made to fit popular home PCs were 5MB in size – just big enough to hold perhaps one or two songs in MP3 format today, or one large color image – and cost around $3,000.

These days, a modern smartphone is likely to hold 32 or 64GB of data – that’s around 12,800 times the capacity of those early hard disk drives, with the cost of the storage component coming in at under $50.
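
If you want to sanity-check that comparison, the back-of-the-envelope arithmetic (in decimal units, where 1 GB = 1,000 MB) looks something like this quick Python sketch:

early_drive_mb = 5     # capacity of those first home-PC hard disks
phone_gb = 64          # a typical modern smartphone

ratio = (phone_gb * 1000) / early_drive_mb
print(f"A {phone_gb} GB phone holds about {ratio:,.0f} times as much as a {early_drive_mb} MB drive")
# -> about 12,800 times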

As data has gotten bigger, we have obviously needed bigger places to store it – and increasingly weird words to describe these sorts of capacities in human language.

Hence, we are well into the age of the terabyte – where your average home computer is endowed with at least 1,000 gigabytes of storage capacity. And we are heading into the age of the petabyte and exabyte.

A petabyte (PB) is (usually*) 1,000 terabytes. While it may be a while before most of us need that sort of storage on hand, industry is already frequently dealing with data on this scale. For example, Google is said to currently process around 20 PB of data per day. While this is transient data, not all of which needs to be stored (or in fact can be – see my article on the capacity crisis), storage on this scale does happen. Facebook’s Hadoop clusters were said, in 2012, to have the largest storage capacity of any single cluster in existence, at around 100 PB.

Putting that into perspective is quite difficult, although that hasn’t stopped a lot of people from trying. One way to think of it is that everything ever written by mankind, in any language, since the beginning of time, is thought to amount to about 50 PB of text.

Of course, text is quite easy to compress and quite efficient to store. Most of the really big data we deal with today is likely to be in the form of pictures or video, which require a lot more space – if you felt like watching 1 PB of HD-quality video, for example, it would keep you occupied for a mere 13 years.

Taking into account the speed at which we moved from kilobytes to megabytes to gigabytes as our standard unit of personal storage capacity, we could expect to have a petabyte of storage in our homes within perhaps 10 to 20 years.

Which brings us to one of the other reasons we continuously need to store more data – the increase in the quality (and therefore size) of that data. By the time we have PB hard disk drives in our home PCs, we will probably be used to watching (at least) 4K Ultra HD video as standard – meaning that if we fill our PB drive with video, we will blaze through it in just five years.
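
For the curious, here is a rough sketch of where viewing-time figures like those come from. The bitrates are my own assumptions (roughly Blu-ray-quality HD and a typical 4K stream), chosen to illustrate the scale rather than as exact values:

SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def years_of_video(storage_bytes, bitrate_mbps):
    """How many years of continuous video fit in the given storage."""
    seconds = storage_bytes * 8 / (bitrate_mbps * 1_000_000)
    return seconds / SECONDS_PER_YEAR

one_petabyte = 10**15  # a decimal petabyte, in bytes

print(f"HD (~20 Mbit/s): about {years_of_video(one_petabyte, 20):.0f} years of viewing")
print(f"4K (~50 Mbit/s): about {years_of_video(one_petabyte, 50):.0f} years of viewing")
# -> roughly 13 and 5 years, in line with the figures above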

Beyond that, we will be looking at moving into the age of the exabyte – 1,000 petabytes, or one million terabytes (or even a billion gigabytes, if you want!)
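
To keep all these prefixes straight, here is the decimal ladder spelled out in a short Python sketch (each step is ×1,000 – see the footnote at the end regarding 1,000 versus 1,024):

units = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]

for power, name in enumerate(units, start=1):
    print(f"1 {name} = 10^{power * 3} bytes")

# So one exabyte is 1,000 petabytes, a million terabytes,
# or a billion gigabytes.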

No one stores information in that sort of quantity today, but annual global internet traffic is expected to reach 950 exabytes by 2015. Which means we will very nearly need an even bigger unit of measurement to describe it – the zettabyte (1,000 exabytes).

With a word like that, we are equipped to talk about numbers as large as the entire size of the internet – calculated in 2013 to stand at around 4 zettabytes.

(Bear in mind, however, that it is very difficult to know how big the total internet is, as there is no single index – probably the biggest, Google’s, was said in 2010 to cover a mere 0.004% of the total internet! It probably covers a lot more now, but there are huge swathes of it cut off from its web-crawling bots that will forever be “dark” – for example, private corporate networks.)

A further factor of 1,000 higher, we find the yottabyte – 1,000 zettabytes. No one makes yottabyte-scale storage media, so if you had a yottabyte of data (which no one does, or will have for some time), you would need to spread it across a great many smaller disks – and at today’s rates that would cost you around $25 trillion. If you somehow had this much data stored in the cloud somewhere and wanted to download it to your computer over current high-speed internet, it would take you roughly 11 trillion years.
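
The cost estimate is easy to reproduce as a back-of-the-envelope calculation. The $25-per-terabyte figure below is my own assumption for cheap hard disk storage at the time of writing – the point is the scale, not the exact price:

price_per_tb_usd = 25        # assumed price of one terabyte of disk
yottabyte_in_tb = 10**12     # one yottabyte is a trillion decimal terabytes

cost = price_per_tb_usd * yottabyte_in_tb
print(f"Storing one yottabyte would cost roughly ${cost:,}")
# -> roughly $25,000,000,000,000, i.e. about $25 trillion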

Still not enough? Try the brontobyte (1,000 yottabytes) for size. Of course, there is nothing in existence measurable on this scale. Take 1,000 of those and you have one geopbyte. Perhaps the only way to convey the scale of that unit is this: no one has yet bothered to think of a name for 1,000 of them.

Thinking about quantities such as zettabytes and brontobytes may seem very theoretical now, but remember, it wasn’t too long ago that Bill Gates (allegedly) said: “No computer will ever need more than 640k of memory”. This quote is now widely considered to be misattributed, but the point remains: our need for storage space has grown far more quickly than we ever thought it would. So while we will probably live to see the days when it is common to carry a petabyte in our pocket, our grandchildren and great-grandchildren might one day be carrying around a bronto on their bionic implants.

As always, I am interested to hear your views on big data. Please share your thoughts in the comments below.

-------------------

I really appreciate that you are reading my post. Here, at LinkedIn, I regularly write about management and technology issues and trends. If you would like to read my regular posts then please click 'Follow' and send me a LinkedIn invite. And, of course, feel free to also connect via Twitter, Facebook and The Advanced Performance Institute.

About: Bernard Marr is a globally recognized expert in strategy, performance management, analytics, KPIs and big data. He helps companies and executive teams manage, measure, analyze and improve performance. His new book is: Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance

*Because of the way computers work, the increase between each jump in unit of measurement (kilobyte to megabyte to gigabyte, and so on) is actually ×1,024 rather than ×1,000. However, for simplicity’s sake it is usually treated as an increase of ×1,000 when absolute accuracy is not required – for example, in marketing.
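
In numbers, the difference the footnote describes looks like this (a quick Python sketch comparing a decimal terabyte with its binary counterpart, the tebibyte):

decimal_tb = 1000**4    # 1 TB as marketed: 1,000,000,000,000 bytes
binary_tib = 1024**4    # 1 TiB, as many operating systems count it

print(f"1 TB  = {decimal_tb:,} bytes")
print(f"1 TiB = {binary_tib:,} bytes")
print(f"Difference: about {binary_tib / decimal_tb - 1:.1%}")
# -> about 10.0%, which is why a "1 TB" drive often shows up as roughly 931 "GB"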
