# The confusing prefixes for binary multiples

Expressing sizes in computing can be confusing at times. This is due to the lack of standardization among hardware manufacturers, programs and operating systems regarding the use of prefixes (K, M, G) when expressing binary multiples. A standard does exist: ISO/IEC 80000-13:2008 defines prefixes for binary multiples. However, it has not been widely adopted as of the writing of this post.

This confusion originates from the early days of computing. The most basic unit of information is the bit, which can be thought of as a switch with only two states, on and off (1 and 0). Because a single bit can store nothing more than one of those two states, a bigger unit was created, the byte, which is just a group of bits. A byte consists of 8 bits, which allows it to represent 2^{8} = 256 distinct values, from 0 to 255.
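
As a quick check of that range, here is a minimal Python sketch. The helper names `distinct_values` and `max_unsigned` are my own, made up for illustration; it only assumes that each bit doubles the number of possible states.

```python
BITS_PER_BYTE = 8


def distinct_values(bits: int) -> int:
    """Number of distinct states that `bits` two-state switches can take."""
    return 2 ** bits


def max_unsigned(bits: int) -> int:
    """Largest value when counting starts at 0."""
    return distinct_values(bits) - 1


print(distinct_values(1))              # 2 (on or off)
print(distinct_values(BITS_PER_BYTE))  # 256
print(max_unsigned(BITS_PER_BYTE))     # 255
print(format(255, "08b"))              # 11111111 (all eight bits on)
```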

The SI (International System of Units) defines K (kilo) as 1000 and M (mega) as 1000×1000, based on our base-10 numeric system. Originally 1 kilobyte meant 1000 bytes (10^{3}) and there was no confusion. However, in the world of computing, because everything is based on the bit, the numeric system in use is base-2. People noticed the convenience of denoting 1 kilobyte as 1024 bytes (2^{10}) and 1 megabyte as 1024×1024 bytes (2^{20}), instead of following the SI kilo and mega convention, in which 1KB is 10^{3} bytes and 1MB is 10^{6} bytes.
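
The two conventions are close for kilo but drift apart with every larger prefix. The sketch below shows by what percentage each power of 1024 exceeds the matching power of 1000. The function name `relative_gap` is hypothetical, chosen just for this example.

```python
def relative_gap(power: int) -> float:
    """Percent by which 2**(10*power) exceeds 10**(3*power)."""
    binary = 2 ** (10 * power)
    decimal = 10 ** (3 * power)
    return (binary - decimal) / decimal * 100


# power 1 is kilo, 2 is mega, 3 is giga, 4 is tera
for power, name in enumerate(["kilo", "mega", "giga", "tera"], start=1):
    print(f"{name}: {relative_gap(power):.2f}%")
```

For kilo the difference is only 2.4%, which is probably why nobody cared much at first.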

This is where the confusion began. In the early days, people working in computing were knowledgeable, and computer manufacturers specified the capacity of their products in full precision. Within computing circles everyone started using 1KB as 1024 bytes. But as the capacity of storage and memory grew, capacities began to be expressed with prefixes. Hardware manufacturers started to express capacities in SI units, while programs reported them with the customized base-2 prefixes. A manufacturer would specify a capacity of 800MB based on the SI equivalence of 1MB = 1000×1000 bytes, and programs would report that size as 763MB because they were using the custom equivalence of 1MB = 1024×1024 bytes.
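
The 800MB example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the constant and function names are mine, not something any particular program uses.

```python
SI_MEGABYTE = 1000 * 1000      # what the manufacturer means by 1MB
BINARY_MEGABYTE = 1024 * 1024  # what the program means by 1MB


def advertised_to_reported(advertised_mb: float) -> float:
    """Convert a capacity in SI megabytes to base-2 'megabytes'."""
    total_bytes = advertised_mb * SI_MEGABYTE
    return total_bytes / BINARY_MEGABYTE


reported = advertised_to_reported(800)
print(f"{reported:.2f}")  # 762.94, which rounds to the 763 shown by programs
```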

Nowadays a standard exists: new prefixes were created to represent the binary multiples, leaving the SI prefixes to represent the decimal multiples. The new prefixes (yes, they sound funny) are kibi (Ki) for 2^{10}, mebi (Mi) for 2^{20}, gibi (Gi) for 2^{30}, etc. While they are not widely adopted yet, you have probably seen KiB, MiB and GiB around.

Name | Symbol | Base 2 | Base 10 | Value in full precision |
---|---|---|---|---|
Kibi | Ki | 2^{10} | ≈1.02×10^{3} | 1,024 |
Mebi | Mi | 2^{20} | ≈1.05×10^{6} | 1,048,576 |
Gibi | Gi | 2^{30} | ≈1.07×10^{9} | 1,073,741,824 |
Tebi | Ti | 2^{40} | ≈1.10×10^{12} | 1,099,511,627,776 |
Pebi | Pi | 2^{50} | ≈1.13×10^{15} | 1,125,899,906,842,624 |
Exbi | Ei | 2^{60} | ≈1.15×10^{18} | 1,152,921,504,606,846,976 |
Zebi | Zi | 2^{70} | ≈1.18×10^{21} | 1,180,591,620,717,411,303,424 |
Yobi | Yi | 2^{80} | ≈1.21×10^{24} | 1,208,925,819,614,629,174,706,176 |
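
If you want your own programs to label sizes unambiguously, something along these lines would do. It is a rough sketch, not taken from any existing tool: `format_size` and the two unit lists are names I made up, and I am assuming two decimals are enough. Note that the official SI symbol for kilo is a lowercase k.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def format_size(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count with IEC (base-2) or SI (base-10) prefixes."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(num_bytes)
    index = 0
    # Divide by the base until the value fits under the next prefix.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


print(format_size(1_500_000_000_000, binary=False))  # 1.50 TB
print(format_size(1_500_000_000_000, binary=True))   # 1.36 TiB
```

The same number of bytes gets two different labels, and the suffix tells the reader which convention was used.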

At the time of this post, this is how things are:

These use SI prefixes, where 1KB = 10^{3} bytes and 1MB = 10^{6} bytes:

- The capacity of HDDs
- The capacity of flash memory
- The capacity of DVDs
- Mac OS X
- fdisk
- cfdisk
- apt-get

These use custom prefixes, where 1KB = 2^{10} bytes and 1MB = 2^{20} bytes:

- The capacity of RAM
- The capacity of the cache memory
- The capacity of CDs
- Windows
- ls
- df
- Almost every website, paper and publication

These use the new ISO/IEC standard, where 1KiB = 2^{10} bytes and 1MiB = 2^{20} bytes:

- Linux (although many utilities still use the custom prefixes)
- gparted
- ifconfig
- pidgin
- thepiratebay.org
- arstechnica.com
- The UK government
- jveweb.net

Yes, it is a mess. Unfortunately, new standards can take many years to be adopted by the industry and by the field. As far as this website and whatever project I make in the future go, I will be using this standard (ISO/IEC 80000-13:2008) to express base-2 prefixes, with the goal of avoiding confusion in my posts. I will double-check all the posts published before this one, but chances are that by the time this post is indexed by search engines I will already be done.