What's in a Megabyte?
Dana J. Parker
March 2000
Whenever we discuss a time to meet, I ask my in-laws: "Is that real time or Starrett time?" This is shorthand for "Yes, I know you said 5:30, but we all know perfectly well none of you will show up until at least 6." Asking for clarification is a nice way of letting them know I know they'll be late, by my terms, even if they're on time by theirs, and that there's no problem as long as we agree--on terms.
In the computer industry, we have similarly two-sided phraseology--a kind of shorthand--for measures of capacity. There's the Marketing Megabyte and the Techie Megabyte, for example. When techie types talk among themselves about megabytes and gigabytes, it's taken for granted that they mean techie megabytes and techie gigabytes; add a marketing person to the conversation, and you can no longer be sure there's agreement on terms among all present. The same thing happens in a marketing venue, such as a trade show booth--any megabytes are automatically recalculated by the techies. DVD is commonly referred to as holding 4.7 gigabytes in a single layer. Only it doesn't really; it holds 4.7 billion bytes, or 4.38 true (techie) gigabytes. Before DVD came along, the makers of CD-R media labeled them as anywhere from 700 megabytes to 650 megabytes--and when OSTA finally got around to convincing its members to standardize on 650 megabytes to denote the true capacity of standardized CD-R media, some customers felt shortchanged. There were numerous calls to media suppliers from customers wanting to know why the manufacturers decided to make smaller-capacity media, and asking if it was still possible to get the older, "bigger" kind.
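The DVD figure is easy to check for yourself. What follows is a minimal Python sketch, assuming that a marketing gigabyte is one billion bytes and a techie gigabyte is 2^30 bytes; the function and constant names are mine, purely for illustration.

```python
# Assumption: "marketing" gigabyte = 10**9 bytes, "techie" gigabyte = 2**30 bytes.
MARKETING_GB = 10**9
TECHIE_GB = 2**30


def marketing_to_techie_gb(marketing_gb):
    """Convert a capacity quoted in billions of bytes to binary gigabytes."""
    total_bytes = marketing_gb * MARKETING_GB
    return total_bytes / TECHIE_GB


dvd = marketing_to_techie_gb(4.7)
print(f"{dvd:.2f}")  # 4.38
```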
This annoying discrepancy in the way kilobytes, megabytes, gigabytes (and soon terabytes) are measured dates from the earliest days of data processing. We humans count in tens, so we think in terms of powers of ten--such as 10^2, or 100. But computers use binary, so it's handy to talk of capacities in terms of powers of 2--such as 2^10 = 1,024 bytes. It's much faster to say "one K" (for kilo, or thousand) than to say "one thousand twenty-four."
At the kilobyte level, this isn't a huge discrepancy--what're 24 little bytes among friends, after all? The few computer professionals talking about kilobytes in those good old days, when kilobytes were big, all knew what they meant. But pretty soon, the personal computer came along, and computer professionals found themselves talking to physicists and chemists and even ordinary people. Shortly thereafter, kilobytes were too small, and they were talking about megabytes. When you get into megabytes, it's still not that big a deal, even though a true megabyte is 48,576 bytes more than a million.
Unfortunately, the megabyte level is also where marketing got involved, so that we currently have three different values for a megabyte: one million bytes, which is still used by some hard drive manufacturers and makers of network hardware; 1,048,576 bytes, the techie megabyte; and 1,024,000 bytes, used by floppy diskette manufacturers. This last one is the most confusing. If you calculate the capacity of a "1.44MB" floppy using techie megabytes, it's 1,509,949.44 bytes. Using one million bytes, it's 1,440,000 bytes. But the true capacity of a "1.44MB" floppy is neither--it's really 1,474,560 bytes, because the floppy diskette marketing geniuses defined a megabyte as 1,024,000 bytes--or 1,000 times 1,024 bytes.
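To see the three megabytes side by side, here is a small Python sketch that multiplies the "1.44" on the label by each definition. The dictionary keys and the function name are illustrative labels of my own, not industry terms.

```python
# The three competing definitions of a megabyte, in bytes.
MEGABYTE_DEFINITIONS = {
    "marketing": 1_000_000,   # 10**6
    "techie": 1_048_576,      # 2**20
    "floppy": 1_024_000,      # 1,000 * 1,024
}


def floppy_capacity(bytes_per_megabyte):
    """Bytes on a "1.44MB" floppy under a given megabyte definition."""
    # Work in hundredths (144 / 100) so the multiplication stays in integers.
    return 144 * bytes_per_megabyte / 100


for name, size in MEGABYTE_DEFINITIONS.items():
    print(f"{name:>9}: {floppy_capacity(size):,.2f} bytes")
```

Only the floppy definition comes out to a whole number of bytes, 1,474,560, which is a fair hint that it is the one the label was built on.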
The problem gets progressively worse as values increase, so that a true gigabyte is 73,741,824 bytes more than a billion bytes, and so on. Pretty soon, you graduate into the terabytes and exabytes, and you're talking real numbers.
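How fast the gap widens can be sketched in a few lines of Python. This assumes each step up the prefix ladder multiplies by 1,000 in the decimal reading and by 1,024 in the binary reading; the function name is hypothetical.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa"]


def binary_excess(level):
    """Bytes by which the binary unit exceeds the decimal one.

    level 1 is kilo, level 2 is mega, and so on.
    """
    return 1024**level - 1000**level


for level, prefix in enumerate(PREFIXES, start=1):
    excess = binary_excess(level)
    percent = 100 * excess / 1000**level
    print(f"{prefix}byte: {excess:,} extra bytes ({percent:.1f}%)")
```

The first three rows reproduce the 24, 48,576, and 73,741,824 bytes quoted above, and the percentage keeps climbing with every prefix.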
I know what you're thinking. This is why we have standards organizations, right? Exactly! And that's why, in March 1999, the National Institute of Standards and Technology (NIST) announced in its TechBeat newsletter that it was time to "Get ready for the mebi, gibi, and tebi." In conjunction with the International Electrotechnical Commission (IEC), which writes international standards for electronic technologies, NIST has adopted the kibi- (Ki), mebi- (Mi), gibi- (Gi), tebi- (Ti), pebi- (Pi), and exbi- (Ei) byte to take the place of the ambiguous but widely used kilo-, mega-, giga-, tera-, peta-, and exabyte.
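For what the new prefixes look like in practice, here is a rough Python sketch of a formatter that reports a byte count in IEC units. The function name and the two-decimal format are my own choices, not anything NIST or the IEC prescribes.

```python
# IEC binary prefixes, each a factor of 1,024 larger than the last.
IEC_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"]


def format_iec(num_bytes):
    """Render a byte count using the largest IEC prefix that fits."""
    value = float(num_bytes)
    for prefix in IEC_PREFIXES:
        # Stop once the value drops below 1,024 or the prefixes run out.
        if value < 1024 or prefix == IEC_PREFIXES[-1]:
            return f"{value:.2f} {prefix}B"
        value /= 1024


print(format_iec(4_700_000_000))  # 4.38 GiB
print(format_iec(1024))           # 1.00 KiB
```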
A brief canvass of the Web yielded a range of reactions to the proposed new names for values, including the following:
- "Are you joking with me?"--Bill Robbins, Dell spokesman
- "Has the 8-bit usage of the term 'byte' been internationally standardized? Or is 'kibibyte' still ambiguous (for Multics and PDP-10 hackers at least)? 'Kibioctet'--sounds like a cat food."--David Henkel-Wallace, Zembu Labs
- "This may lead to kibi-tsars, mebi-or-mebi-not, gibi-ous moonings, tele-tebi, pebi-le-mocha, and exbi-gated Websites. But, it does create a new solution to the Y2K problem: Change K from kilo to kibi, which puts the problem off until 2048."--Edupage editors
Perhaps they should have titled the NIST report, "Get ready to sound like a babbling idiot." And mind you, this is supposed to simplify things.
kibibyte, mebibyte, gibibyte, tebibyte, pebibyte, exbibyte
Say that three times, fast. Feel silly? Maybe that explains why the terms haven't caught on. And are not likely to, either, it seems--except as an opportunity to play with words.
Dana J. Parker (email@example.com) is a Denver, Colorado-based independent consultant and writer and regular columnist for STANDARD DEVIATIONS. She is also a contributing editor for EMedia, co-author of CD-ROM Professional's CD-Recordable Handbook (Pemberton Press, 1996), and chair of Online Inc.'s DVD PRO Conference & Exhibition.