If there's something fundamentally wrong with the universe, it certainly shows in our old friend the kilobyte. It may not have been around for long - the word was first recorded in about 1970 - but the kilobyte is not what it seems. Like other words beginning with 'kilo-', it ought to mean 1,000 of something - bytes in this case. The trouble is it doesn't. If you buy a kilobyte of memory you get 24 bytes extra, free!
You get 1,024 bytes, which is two raised to the power of ten. It's out by 2.4%, which isn't a lot. The trouble is, these days we can't do a lot with a kilobyte of memory and prefer to work with megabytes. The 'mega-' prefix tells us this should be a million bytes. But again it isn't. It's actually 1,024 x 1,024, or 2²⁰, or 1,048,576 bytes. This, as you can see, is even more inaccurate - it's nearly 5% out.
And so it goes on for larger units. Our one gigabyte memory chip has over 7% more than the billion bytes we may have expected. A terabyte of memory will have a trillion bytes, plus an extra 10%. The decimal prefixes we use are becoming increasingly inaccurate with the advance of technology.
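The growing gap is easy to check for yourself. The following is a minimal Python sketch; the function name binary_excess_percent and the PREFIXES list are illustrative names chosen here, not part of any standard.

```python
# Compare each decimal prefix with the power of 1,024 it is commonly used for.
# The names below are illustrative assumptions, not taken from any standard.
PREFIXES = ["kilo", "mega", "giga", "tera"]


def binary_excess_percent(n: int) -> float:
    """Return the percentage by which 1,024**n exceeds 1,000**n."""
    return (1024 ** n / 1000 ** n - 1) * 100


for n, name in enumerate(PREFIXES, start=1):
    binary = 1024 ** n
    decimal = 1000 ** n
    print(f"{name}: {binary:,} bytes, "
          f"{binary_excess_percent(n):.2f}% more than {decimal:,}")
```

Run as written, this should report excesses of roughly 2.40%, 4.86%, 7.37% and 9.95% for kilo-, mega-, giga- and tera- respectively.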
The reason is that computers use binary logic. Each tiny circuit embedded deep within your microprocessor has one of two states: one or zero, on or off, true or false. We say that each circuit can store a bit, or binary digit of information, and we combine them in groups of eight to make bytes. When we gather these bytes into an area of memory which can be addressed by the processor, each extra address line doubles the number of bytes that can be reached, so memory naturally comes in sizes which are ever larger powers of two. At some point a few years ago it just happened to be convenient for one microchip manufacturer that 1,024 was close to 1,000, so the 'kilo-' prefix stuck.
But there's inconsistency as well as inaccuracy. Computer hard disk manufacturers may sell you a terabyte of storage, but this time you'll only get exactly 1 trillion bytes - disk storage technology isn't so reliant on the binary expansion. The size of flash drives and DVDs is also quoted using decimal prefixes. Floppy disks muddy the water further: the familiar '1.44MB' disk holds 1,440 x 1,024 bytes, which mixes both conventions. CDs are an exception too - a 700MB compact disc actually holds 700 x 1,024 x 1,024, or around 734 million bytes.
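To see what this inconsistency means in practice, here is a small Python sketch of the arithmetic. It assumes a tool that reports capacity in binary units while the label on the box uses decimal ones; the constant and variable names are made up for illustration.

```python
# Decimal versus binary sizes for the storage examples in the text.
# All names here are illustrative assumptions.
TB = 10 ** 12   # a terabyte as sold by a disk manufacturer
TIB = 2 ** 40   # the binary quantity often also called a 'terabyte'

# A '1 terabyte' disk, expressed in binary units
disk_in_binary_units = TB / TIB
print(f"1 TB disk = {disk_in_binary_units:.3f} binary 'terabytes'")

# A '700MB' CD, where MB is taken to mean 1,024 x 1,024 bytes
cd_bytes = 700 * 1024 * 1024
print(f"700MB CD = {cd_bytes:,} bytes")
```

The disk works out at about 0.909 of the binary quantity, which is why a new drive can appear to have gone missing some of its capacity, while the CD holds 734,003,200 bytes.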
Data transmission is another confusing area. If you’re accessing the Internet through a dial-up modem then maybe it’s been rated at 56kbps (kilobits per second), or your broadband link could be in megabits per second. Even industry professionals can't agree among themselves on what these network bandwidth prefixes mean. Generally, network designers prefer the megabit to mean 1,048,576 bits; whereas telecommunications engineers prefer it to be a straight 1,000,000 bits.
In fact, the 56kbps modem transfers your data at neither 56,000 nor 56 x 1,024 bits per second; it's nominally 57,600 bits per second. Historically, the first baud rate above telegraph speed was 300bps. This figure then rose in a binary fashion to 4,800, before it was tripled to 14,400. Successive doubling then brought us to the figure of 57,600bps. Not a lot of people know that!
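The ladder of speeds described above can be reproduced in a few lines. This Python sketch simply follows the steps as the text gives them (doubling from 300 to 4,800, one tripling, then doubling again); the function name modem_rate_ladder is invented for the purpose.

```python
# Rebuild the sequence of line speeds as described in the text:
# double from 300 up to 4,800, triple once, then double up to 57,600.
def modem_rate_ladder() -> list[int]:
    rates = [300]
    while rates[-1] < 4800:
        rates.append(rates[-1] * 2)   # 600, 1,200, 2,400, 4,800
    rates.append(rates[-1] * 3)       # 14,400
    while rates[-1] < 57600:
        rates.append(rates[-1] * 2)   # 28,800, 57,600
    return rates


ladder = modem_rate_ladder()
print(ladder)

# Neither reading of '56k' matches the top of the ladder
print(56 * 1000, 56 * 1024, ladder[-1])
```

The last line prints 56000, 57344 and 57600, showing that the nominal figure fits neither the decimal nor the binary reading of the prefix.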
It was becoming clear by the 1990s that we needed a standard. It fell to a very august committee to put us straight in all things binary.
Lord Kelvin to the Rescue
Way back in 1906, a group of scientists and engineers met at the Hotel Cecil in London and formed the International Electrotechnical Commission (IEC), with the aim of bringing standardisation to the fledgling electrical industry. It was seen as important to collaborate on matters of specification and safety, terminology and testing. Delegates from 11 countries not only met and dined together, but toured the UK, taking in both the Lake District and Stratford-upon-Avon. The commission's first president was the eminent scientist Lord Kelvin, after whom the SI unit of temperature is named.
This spirit of cooperation has prevailed. In the 100 years since its founding, the IEC has published countless documents and standards which have shaped the electrical and electronic industry as we know it. In 1999, the IEC decided that if decimal prefixes were inaccurate we just needed another set which weren't, and produced a groundbreaking publication which went by the natty title of IEC 60027-2.
Thanks to the IEC, we now have a spanking new set of prefixes. So long as we know what our actual figure is, we should be able to express it unambiguously in both words and symbols. That 56k modem unfortunately isn't covered; it may just have to remain an anomaly until modems are a thing of the past.
Take-up of these prefixes has so far been slow - they aren't part of the SI system, after all - but they may one day catch on, hopefully before it's too late. If nothing else, they give us an interesting set of Scrabble bonus words.
Based on the decimal prefix kilo-, the kibibyte (KiB) is 1,024 or 2¹⁰ bytes. We also have the kibibit (Kib).
Based on the decimal prefix mega-, the mebibyte (MiB) is 1,024² or 2²⁰ bytes. We also have the mebibit (Mib).
Based on the decimal prefix giga-, the gibibyte (GiB) is 1,024³ or 2³⁰ bytes. We also have the gibibit (Gib).
Based on the decimal prefix tera-, the tebibyte (TiB) is 1,024⁴ or 2⁴⁰ bytes. We also have the tebibit (Tib).
Based on the decimal prefix peta-, the pebibyte (PiB) is 1,024⁵ or 2⁵⁰ bytes. We also have the pebibit (Pib).
Based on the decimal prefix exa-, the exbibyte (EiB) is 1,024⁶ or 2⁶⁰ bytes. We also have the exbibit (Eib).
Based on the decimal prefix zetta-, the zebibyte (ZiB) is 1,024⁷ or 2⁷⁰ bytes. We also have the zebibit (Zib).
Based on the decimal prefix yotta-, the yobibyte (YiB) is 1,024⁸ or 2⁸⁰ bytes. We also have the yobibit (Yib).
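For anyone wanting to put the new prefixes to work, here is one way a byte count might be expressed with them. It is only a sketch in Python: the function name format_iec, the two-decimal rounding and the output layout are choices made for this example, not anything laid down by the IEC.

```python
# Express a byte count using the binary prefixes listed above.
# format_iec and its output layout are illustrative assumptions.
IEC_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]


def format_iec(n_bytes: int) -> str:
    """Divide by 1,024 until the value drops below 1,024 or prefixes run out."""
    value = float(n_bytes)
    index = 0
    while value >= 1024 and index < len(IEC_PREFIXES) - 1:
        value /= 1024
        index += 1
    return f"{value:.2f} {IEC_PREFIXES[index]}B"


print(format_iec(1024))       # 1.00 KiB
print(format_iec(10 ** 12))   # 931.32 GiB
print(format_iec(2 ** 80))    # 1.00 YiB
```

The middle example is the hard disk manufacturer's terabyte again: a trillion bytes comes out at about 931 gibibytes, with no ambiguity about which kind of 'giga' is meant.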