CS 100 -- Module 01


Bits and Bytes


NOTE: If your internet access is restricted and you do not have access to YouTube, we have provided alternate video links.

TRANSCRIPT

In this video we are going to learn some of the terminology used for bits and bytes.

Digital information is really just a long sequence of zeros and ones. It is helpful to think about them as tiny switches that are either off or on. Each tiny switch is a bit of information, where bit stands for binary digit.

Early in computing, engineers realized that it was a little tedious to deal with individual bits, so for convenience they decided to group eight bits together and called that a byte. The name "byte" was actually a pun on the phrase "bits and bites". To distinguish byte from bite (as in taking a bite of an apple), they spelled it with a y. There is also a "nibble", which is a grouping of four bits -- but that terminology is rarely used today. (I swear, I am not making this up.)

The reason we use bytes is for convenience. It is a lot like a carton of eggs. It can be quite a hassle to carry around a single egg. It is much more convenient to carry around a carton. Even if you only need to carry around a single egg or have one egg left in the fridge, you still keep it in a carton. Bytes are pretty much just cartons of bits. One important distinction between cartons and bytes is that bytes are always exactly eight bits. If you only need five bits and want to store them in a byte, you can just ignore the other three bits, but they will always be there.
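To make the carton idea concrete, here is a small Python sketch. The value 19 is just an arbitrary example: it needs only five bits, but as a byte it still occupies all eight bit positions.

```python
# A byte always has eight bit positions, even when the value stored
# in it needs fewer.  The value 19 (an arbitrary example) needs only
# five bits; the remaining three are simply zeros.
value = 19

print(bin(value))            # 0b10011   -> five significant bits
print(format(value, "08b"))  # 00010011  -> the full eight-bit byte
print(bytes([value]))        # b'\x13'   -> one byte in memory
```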

If you recall, with one bit you can represent two different values. With two bits you can represent four possible values: zero-zero, zero-one, one-zero and one-one [00, 01, 10 and 11]. With three bits you can represent eight values, and so on. With "k" bits, you can represent two to the k [2^k] possible values. A byte has eight bits, so it can represent two to the eight [2^8], or 256, different possible values.
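Here is a quick Python check of that arithmetic; the specific bit counts are just examples.

```python
# With k bits you can represent 2 ** k distinct values.
for k in [1, 2, 3, 8]:
    print(f"{k} bit(s) -> {2 ** k} possible values")

# Output:
# 1 bit(s) -> 2 possible values
# 2 bit(s) -> 4 possible values
# 3 bit(s) -> 8 possible values
# 8 bit(s) -> 256 possible values
```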

For a quick perspective on how large digital information gets:

In the early days of computing, it was generally understood that a kilobyte was actually one thousand and twenty-four bytes [1,024] because it was an exact power of two (two to the ten [2^10]), and a megabyte would be two to the twenty [2^20] or one million forty-eight thousand five hundred and seventy-six bytes [1,048,576].

In today's world it is harder to say for sure, but "kilobyte" now usually means just 1,000 bytes and "megabyte" means one million bytes [1,000,000]. It all got muddled because of business and marketing practices designed to make products appear more impressive.
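A small Python sketch comparing the two meanings (the constant names here are just for illustration):

```python
# The two competing meanings of "kilobyte" and "megabyte".
# (The constant names are hypothetical, chosen for clarity.)
KILOBYTE_DECIMAL = 10 ** 3   # 1,000 bytes     (modern/marketing usage)
MEGABYTE_DECIMAL = 10 ** 6   # 1,000,000 bytes

KILOBYTE_BINARY = 2 ** 10    # 1,024 bytes     (the traditional meaning)
MEGABYTE_BINARY = 2 ** 20    # 1,048,576 bytes

# The gap grows with the size of the unit.
print(KILOBYTE_BINARY - KILOBYTE_DECIMAL)  # 24
print(MEGABYTE_BINARY - MEGABYTE_DECIMAL)  # 48576
```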

Sometimes, businesses will use bits instead of bytes. For example, your internet provider may advertise a download speed of one gigabit per second, but that is really one hundred and twenty-five [125] megabytes per second -- which doesn't sound nearly as impressive.

They are supposed to use a capital B when referring to bytes and a lowercase b when referring to bits, so, err... b-ware of misleading marketing.
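To double-check the gigabit-per-second arithmetic above, here is a short Python sketch; it assumes the decimal meanings of giga (10^9) and mega (10^6).

```python
# Converting an advertised "1 gigabit per second" into megabytes per second,
# using the decimal meanings of giga (10**9) and mega (10**6).
advertised_gigabits_per_second = 1

bits_per_second = advertised_gigabits_per_second * 10 ** 9
bytes_per_second = bits_per_second / 8        # eight bits per byte
megabytes_per_second = bytes_per_second / 10 ** 6

print(megabytes_per_second)  # 125.0
```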

We know that digital information is really just a long sequence of zeros and ones, usually grouped together in bytes. We still don't have a good idea how those bits can represent numbers, words, pictures or anything else on a computer, but we'll get there soon.