Creative Works, Greed and Free Information, Pt. 3

This series of posts was started to address a common justification for the piracy of digital products, the idea that information should be free. In the first post, I examined a related argument, that corporations are evil, and argued that it is no excuse for piracy. In part two, I started to look at the information-should-be-free argument by trying to define information. I asserted that there are two definitions of information relevant to the discussion and argued that the first, knowledge, cannot be applied to digital products. I concluded with the following:

…since music, movies, or software all fail to fit the definition of knowledge, this falls flat as an excuse to pirate digital goods. Where the ambiguity arises, and where you can get your ammunition to call me a moron, is when information means data.

Are you ready to call me a moron? OK. Here’s your ammunition. The word data has different meanings in different contexts. It could be the results of scientific experimentation or observation, the results of a public poll, a set of statistics such as those collected during a Census, and more. The meaning of data relevant to our discussion is given to us by Merriam-Webster as:

3 : information in numerical form that can be digitally transmitted or processed

So there you go. Digital products such as books, movies, music and software all fit that definition nicely. Now you can sit back, call me a moron, and laugh about how you’re entitled to all of that music you’ve been downloading because it is information and information should be free! Finished? Good. Now I’m going to show why you’re a bigger moron than I am.

Before we go further, it would be helpful to understand a bit about information in numerical form that can be digitally transmitted or processed. Technically, computer data is not stored or processed in a numerical format. But it is often visualized that way, so the definition works for our purposes. Computer data is often viewed in base 16, or hexadecimal, format by software developers and crackers because it’s not too hard for a human to interpret. But on the computer architectures we use today, it is stored, transmitted, and processed in a binary format. If you were to directly translate computer data to an equivalent numeric format, you would use base 2, which is a series of zeros and ones (11110110, for example). Not so easy for us humans to interpret.
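If you want to see the two views side by side, here is a minimal sketch in Python. The language and the helper names to_binary and to_hex are my own choices for illustration; nothing about the argument depends on them.

```python
def to_binary(data: bytes) -> str:
    """Render each byte as eight binary digits (base 2)."""
    return " ".join(format(byte, "08b") for byte in data)


def to_hex(data: bytes) -> str:
    """Render each byte as two hexadecimal digits (base 16)."""
    return " ".join(format(byte, "02X") for byte in data)


sample = bytes([0b11110110])  # the example byte from the paragraph above
print(to_binary(sample))      # the base 2 view
print(to_hex(sample))         # the same byte in base 16

greeting = b"Hi"              # two bytes of ordinary text
print(to_binary(greeting))
print(to_hex(greeting))
```

Same bytes, two notations. The hexadecimal view is shorter, which is most of the reason developers prefer it.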

If you were to print out any file from your computer in binary form, regardless of the format of the file itself, you would see an unbroken sequence of ones and zeros. It would all appear meaningless. And you know what? It would be. Those ones and zeros only have meaning to the computer. If your printout is of an image file, you have absolutely no way of reproducing that image yourself without feeding it manually (using special software) to another computer. If it is a music file, you have no way of reproducing or transcribing it. And you certainly wouldn’t be able to do anything with the binary printout of a program. Even if you have the patience to learn to interpret binary printouts, they would still be out of context and, therefore, meaningless to you.

Here’s the problem. A given sequence of binary numbers can have several different meanings, depending on its context. As an example, look at this sequence: 11110110. By itself, meaningless. But in an executable file (a program) on an Intel processor (or any processor that supports the x86 architecture) it might be a command to execute one of several types of multiply operations on a couple of numbers. Or, instead of a CPU instruction, it might represent the number 246. Or, in some character sets, it might represent the division sign, ÷. Or it might be part of a longer sequence, something like 1111011001101110, which could again have a different meaning on different CPU architectures. Furthermore, these numbers could represent operating system commands. Or they could represent other commands inside a program.
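To make that concrete, here is a rough sketch in Python that reads the single byte 11110110 four different ways. The function name interpretations is made up for this example, and the two character sets are just ones I picked (the old IBM PC code page 437 and Latin-1); other encodings would give other answers, which is rather the point.

```python
def interpretations(raw: bytes) -> dict:
    """Read the same bytes under several conventions."""
    return {
        # As a plain unsigned integer.
        "unsigned": int.from_bytes(raw, "big", signed=False),
        # As a signed (two's complement) integer.
        "signed": int.from_bytes(raw, "big", signed=True),
        # As a character in IBM PC code page 437.
        "cp437": raw.decode("cp437"),
        # As a character in Latin-1 (ISO 8859-1).
        "latin-1": raw.decode("latin-1"),
    }


raw = bytes([0b11110110])
result = interpretations(raw)
for convention, meaning in result.items():
    print(f"{convention:10} {meaning!r}")
```

One byte, four answers: 246, -10, a division sign, and an o with an umlaut. None of them is the "real" meaning until something decides how to read it.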

The point is, computer data out of context is meaningless. It needs to be interpreted by the CPU, the operating system and, ultimately, the software you run on your system. Take a Windows executable file and try to run it natively on Linux or Mac OS X. Not going to happen. By themselves, those operating systems do not know how to interpret the binary data in Windows executable format. Try to run a program written for an Intel CPU on a PowerPC CPU. Won’t happen. Each CPU architecture and operating system has its own way of interpreting binary data. This is a very important factor in deciding whether or not movies, music and software are information.
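One small way to see this: executable files announce their format in their first few bytes, and an operating system that doesn't recognize the announcement won't even try. The sketch below, in Python, checks a few well-known signatures. The function name guess_executable_format is hypothetical, the list of signatures is deliberately incomplete, and a real loader checks far more than this.

```python
# Mach-O signatures, in both byte orders, for 32-bit and 64-bit files.
MACH_O_MAGICS = {
    bytes.fromhex("feedface"),
    bytes.fromhex("cefaedfe"),
    bytes.fromhex("feedfacf"),
    bytes.fromhex("cffaedfe"),
}


def guess_executable_format(header: bytes) -> str:
    """Guess an executable format from the first bytes of a file."""
    if header.startswith(b"\x7fELF"):
        return "ELF"       # Linux and most other Unix-like systems
    if header.startswith(b"MZ"):
        return "PE/MZ"     # Windows (and DOS before it)
    if header[:4] in MACH_O_MAGICS:
        return "Mach-O"    # Mac OS X
    return "unknown"


print(guess_executable_format(b"\x7fELF\x02\x01\x01\x00"))
print(guess_executable_format(b"MZ\x90\x00"))
print(guess_executable_format(bytes.fromhex("cffaedfe07000001")))
print(guess_executable_format(b"11110110"))
```

To try it on a real file, read the first four bytes with open(path, "rb") and pass them in.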

To illustrate, let’s take something from the real world. Preferably, something that qualifies as knowledge. A phone number would do just fine. It qualifies as the knowing-what type of knowledge. If someone were to ask you for your phone number, you might write it down on a piece of paper. Or, you might send it via email. Whether it’s on paper, or in binary format on an email server, the number itself is still the same. All that has changed is the medium on which it was delivered.

Now, would you consider a piece of paper to be information? If you would, you’re a lost cause and can just stop reading now. The paper is not information. But the phone number is, in the form of knowledge. As for the email, that’s a different story. Not only is the phone number a kind of information, but so is the binary code used to represent it. Now we’re dealing with two types of information — knowledge, the phone number, and data, the binary code. One type for human consumption and the other for computer consumption. When the recipient reads your email, he’ll be able to see your phone number in text format on his monitor. But there’s more to the story.

I’ve decided to write a new program for fun (not really, but indulge me). My program will go through emails, extract phone numbers, and save them to disk as JPEG image files. What does that mean? It means that software which knows how to interpret the JPEG image file format will be able to read the file and interpret the binary data that represents your phone number not as text, but as the color values of a digital image. The binary data is the same, but it no longer represents your phone number. Did your phone number change? Even if I directly edited the binary data that represents your phone number such that it became a different phone number, your real phone number would still be the same. If not, the universe as I know it no longer exists and we can all go home now.
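I'm not actually going to write that program, but a stripped-down sketch in Python shows the idea. Two loud assumptions: the phone number 555-0100 is made up, and I'm treating the bytes as raw grayscale pixel intensities rather than building a real JPEG, which would add headers and compression on top.

```python
phone = "555-0100"            # made-up number, for illustration only
data = phone.encode("ascii")  # the binary data that represents it

# Interpretation 1: the bytes are text.
as_text = data.decode("ascii")

# Interpretation 2: each byte is the brightness of one grayscale
# pixel, from 0 (black) to 255 (white).
as_pixels = list(data)

print("as text:  ", as_text)
print("as pixels:", as_pixels)
```

The bytes never changed between the two print statements. Only the reading of them did.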

If you haven’t figured it out, I’ll summarize the point I’m trying to make. The binary data that represents your phone number in an email is distinct from your phone number itself. The two are not inextricably tied together. Changing one does not change the other. They are separate entities. It goes back to that word that I emphasized in the last post, representation. I’ll explain.
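Here is a small Python sketch of that separation. The phone number is made up, and the three encodings (ASCII, UTF-16, and the EBCDIC code page cp037) are simply ones I picked because they produce visibly different bytes.

```python
phone = "555-0100"  # made-up number, for illustration only

# Three different binary representations of the same number.
encodings = ["ascii", "utf-16-le", "cp037"]  # cp037 is an EBCDIC code page
encoded = {name: phone.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:10} {data.hex()}")

# Each representation decodes back to the same phone number.
decoded = {name: data.decode(name) for name, data in encoded.items()}

# Editing one representation changes that copy of the data and
# nothing else; the original number is untouched.
tampered = bytearray(encoded["ascii"])
tampered[-1] = ord("1")
tampered_text = tampered.decode("ascii")

print("tampered copy:", tampered_text)
print("actual number:", phone)
```

Three sets of bytes, one phone number. And scribbling on one set of bytes does nothing to the number itself.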

Take a board game. Any board game. Remember, we determined in part two that board games do not qualify as information. The rules do, but the representation of the rules, in the form of the game board and accessories, does not. Now let’s make a computer version of the board game for Windows. The rules are the same (assuming we didn’t intentionally modify them), but this version does not have a physical representation. Instead, it has a graphical representation on a computer monitor. Furthermore, the graphical representation is enabled by an executable computer file, which is a collection of binary data that can be interpreted on a Windows operating system. Now let’s port the game to Mac OS X. If we were to compare the binary data of the Mac version with that of the Windows version, we would find that while a lot of it is the same, there are several differences. However, the graphical representation will essentially be the same. And, importantly, the rules will not have changed. The game itself is distinct from the executable file that creates its representation.

So what does all of this mean? I’m about to tell you. It means that books, music, movies, and software are not computer data and, therefore, are not information. Now, I know you’re going to wave your hands in protest, especially with my claim that software isn’t computer data. After all, software, by definition, cannot exist except in digital form. And most software does not have a real world equivalent like a board game. But it doesn’t matter. Whether you’re talking about computer games, office suites, graphics editors, or any kind of software imaginable, the representation of that software, be it in text mode or in graphical GUI goodness, is distinct from the binary code that creates it.

I’m simplifying a bit. OK, a lot. Most of the software you use, games especially, has a lot going on behind the scenes. The computer is crunching numbers, saving files to disk, and doing a lot of work that you don’t see graphically. But, again, it doesn’t matter. That, too, is a representation of the software. Any program can be ported to multiple operating systems and CPU architectures, performing the same general operations. The binary changes, but the program itself remains largely intact. Open Office is still Open Office on multiple platforms, even though the binary data is different. And in the future, if we ever represent computer data using something other than binary, we can still port old software to the new format and the representation can, conceivably, be largely the same.

So in two posts, I’ve laid out the rationale behind my assertion that digital works are not information. You may agree with me, you may not. That’s fine. You are free to do either. Ah, there’s that word free again. He just keeps popping up. And in part 4, I’m going to tackle head on the argument that information should be free. Despite all of the effort I went to in explaining that I don’t think digital works are information. Because I really love freedom, but all of those information-should-be-free hippies keep trying to take it away from me.
