23 September 2007

Exploration of the Digital Image

A couple of days ago I was thinking about photos and their properties when existing in the digital domain. A frame of life is captured and stored as a series of pixels, each pixel a set of numbers defining color. A random number generator could provide data to create a series of pixels and would result in an image. That got me thinking... if I were to have a file for each set of possible pixels, wouldn't I have the set of all images? Yes!

This excited me greatly.

So, how do we define the set of all images? First, we must decide on a resolution, or, the number of pixels in width and height. Then, we must define how specific to get when defining the set of colors. That's it. Our formula for the set of all images becomes:

(number of possible colors a pixel can be) ^ (width * height of the image)

Simple formula, huh? Let's start defining a sample resolution and color depth. What is a common image size found on computers? How about the icon? Icons are 10 pixels wide, 10 pixels high, and can have 256 colors. So how many possible icons could we ever see?

(256) ^ (10*10) = 6.67 * 10 ^ 240

or

6000000000000000000000000000000000000000000000000
00000000000000
00000000000000000000000000000000000
0000000000000000000000000000
000000000000000000000
000000000000000000000000000000000000000000
0000000
000000000000000000000000000000000000000000000
icons

Holy mackeral! It would take an astronomical amount of time and disk space to generate the set of all icons. On the bright side, we won't be running out of unique program icons for awhile...

Let's make a huge jump to photographs. A common format for photos stored and viewed on the web is 800x600, 24-bit color. 24-bit color refers to the number of bits needed to represent possible colors. A pixel has three primary colors: red, blue, and green. Each color has a range of values from 0 to 255. Each color range can be represented by a hexadecimal digit. For example, a pure red pixel has values of R=255, B=0, G=0, which in hex is R=FF, B=00, G=00. Each of those hex digits is four bits in size (2^4 in binary), so one pixel uses 24 bits (four bits per digit, two digits per color, three colors per pixel) to store it's color value. This provides for 16,777,216 (16^6) possible colors; a decent range for representing real life images. Good grief, what does this look like in our formula?

(16^6) ^ (800*600) = (My calculator commits suicide)

Yep, should have seen that one coming. What if we mix and match the icon properties and the photo properties?

Icon size with photo color range:
(16^6) ^ (10*10) = 2.96 * 10 ^ 722

Photo size with icon color range:
(256) ^ (800*600) = (Calculator chokes on itself again)

Raising the color depth of an icon to photo quality keeps us in the land of reality, but even with a drastically reduced color depth, we still aren't anywhere near fathoming the set of all photos at 800x600. Let's continue reducing the size and color depth until we can at least computet a number my computer will understand. How about 200x150 (a quarter of the resolution) at 16-color greyscale?
(16) ^ (200*150) = 3.98 * 10 ^ 36123

Hooray! Finally, a number my computer can display without imploding.

Moving along, let's think about why the numbers are so massive. There are so many fricken images in the set of all images that are very nearly the same image. Can you really tell the difference between these two images?


All I did was modify the value of a single pixel!

How many images look nearly exactly like this one with variation in a single pixel? Well, there are sixteen million values for a pixel and 480,000 different pixels that could be different by a single value. What does this mean for the set of all images? Just looking at one example, there are (16^6)*(800*600), or 8 trillion images that basically look exactly the same. If from the set of all images, we threw out 8 trillion entries for every good entry we wanted to keep, we reduce the number of images we care about. However, dividing an incalculably large number by a "measly" 8.00 * 10 ^ 12 is relatively insignificant. After all, in a much smaller set of 200x150, 16-color images, the magnitude of the total set is still in the range of 10 ^ 36123.

Why stop at discarding images with a single pixel difference? If ten pixels were different, would I still consider this an interesting image? Of course! Determining how many relatively similar images we could throw out of our random image generator is one path of a possible research project. In terms of "duplicate" images, what else can we throw out? Is there another factor for deciding whether a set of images are similar enough that we can keep one and discard the rest? Are there other factors for determining garbage images that we don't care about? It was here that I arrived at a very interesting question. What determines whether a set of 480,000 pixels is "worthwhile"?

This is where the fun starts. I can imagine this being a great research project. Creating a set of algorithms that define an image as either "interesting" or garbage pixels. Once enough algorithms are in place, how long would it take a random image generator to produce an interesting image? How long to produce two? This stuff fascinates me and is what computer science is all about: translating real life into mathematics executed on a computer.

No comments:

Post a Comment