Thursday, January 6, 2011

Let's Be Formal

Warning to those who have a tendency to tl;dr - this is going to be long and technical. If you're brave enough to stick it out through this post, you'll understand the major formal qualities of filetypes in digital image making. I'm going to limit this discussion to popular filetypes exchanged on the web. From raster filetypes, we'll look at JPGs, GIFs, PNGs, and also BMPs to explain concepts behind the other types. I will also explain general qualities of vector images, which are used online but are not yet as widely supported as raster images. Lastly, all of this is heavily annotated with links to Wikipedia, because their sample images of concepts are very helpful.

Raster Images - the Bitmap
Raster images are essentially two-dimensional grids of color values, called pixels. The most basic qualities that all raster images have are a resolution and a color depth. These basically just mean that each raster image has a fixed width and height, and a number of possible color values that can exist in the grid. When computers first started becoming popular, screen resolutions were low and computers could represent very few colors, if any. Nowadays, the standard number of colors that a computer can display is about 16.7 million. Each pixel displayed on your computer screen is a value made up of three bytes - one for red, one for green, and one for blue. Typically for colors on the web, values are expressed in hexadecimal, so for example a bright red is #ff0000. The code denotes that the red byte is set to its max value of 255, or #ff, and that green and blue are set at #00 each.
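To make the three-byte layout concrete, here's a minimal Python sketch that splits a web color code into its red, green and blue bytes and back again. The function names hex_to_rgb and rgb_to_hex are my own, made up for illustration.

```python
def hex_to_rgb(code):
    """Split a web color such as '#ff0000' into (red, green, blue) bytes."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits, e.g. '#ff0000'")
    # Each pair of hex digits is one byte, 0 to 255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Pack three bytes back into a web color code."""
    return f"#{red:02x}{green:02x}{blue:02x}"


print(hex_to_rgb("#ff0000"))   # (255, 0, 0)
print(rgb_to_hex(255, 0, 0))   # #ff0000
print(256 ** 3)                # 16777216 possible colors, the "16.7 million"
```

Three bytes with 256 possible values each is where the 16.7 million figure comes from.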

The most basic way to represent a true color image as a file is to simply lay these three-byte values out in a grid. The most common filetype that does this is the BMP (bitmap), well known to MSPaint aficionados. However, as true as the colors are, the files tend to be very large at high resolutions, making them unsuitable for web distribution. Thankfully (sort of), the human eye is only capable of distinguishing about 10 million colors on a good day. This means that we can cheat, and reduce our palette to reduce file size.
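A quick back-of-the-envelope sketch of why uncompressed files get big. The helper name raw_size_bytes is hypothetical, and I'm assuming a 512 x 512 image (the usual size of the Lena test image) and ignoring the small header a real BMP adds.

```python
def raw_size_bytes(width, height, bytes_per_pixel=3):
    """Size of the bare pixel grid: one three-byte value per pixel."""
    return width * height * bytes_per_pixel


size = raw_size_bytes(512, 512)
print(size)          # 786432 bytes
print(size / 1024)   # 768.0 KB
```

Doubling both the width and the height quadruples the size, which is why high resolutions get out of hand so quickly.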

Image Cheats - Compression, Indexing and Dithering
This is where things start to get fun. The most common technique used in reducing file size is called compression, and I'll admit the exact methods of compression are beyond my scope of knowledge. However, I can describe the general principle and the effects of compression. JPEGs, possibly the most common filetype exchanged online, use lossy compression, which means that the original bitmap is broken into chunks by an image editing program and those chunks are converted from pixels into mathematical averages of the color content of each chunk. Here's an example:

This is Lena, a popular digital test image made some time in the 70s. I think someone scanned her from an old Playboy. Anyway, this file is a BMP; you might have noticed that it loaded from the bottom up (an odd quality of BMPs on web browsers). This image takes up 768KB, almost a megabyte.

If we compress the image extensively as a JPG, we get this:

This image only takes up 3KB! Obviously, this is very exaggerated compression; usually it's much slighter on most images. Notice that the chunks that the image has been reduced to are converted into large pixels or gradients. People working with JPGs call these chunks "artifacts" when they're noticeable, as they are reminders of the compression. Although someone interested in high-res photography might not want any artifacts in their image, some digital artists like the abstract qualities that compression can provide, and use it expressively. Sometimes digital artists will intentionally change random bits within JPG images to "bring out" glitches and therefore wild, colorful artifacts. A good example of such images would be Maxwell Paparellas' Intoxications series. PNGs also use compression to some extent, but the effects are not nearly as dramatic as the JPG. More on that later.
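Here's a toy Python sketch of the chunking idea, to show why heavy compression looks blocky. To be clear, this is not the real JPEG algorithm: actual JPEG encoders run a discrete cosine transform on 8 x 8 blocks and then throw away fine detail, which is more sophisticated than a plain average. The function average_blocks is a made-up stand-in that works on a grayscale grid (a list of rows of 0-255 values).

```python
def average_blocks(pixels, block=8):
    """Replace every block x block chunk with its average brightness."""
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Clip the chunk at the image edge.
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            total = sum(pixels[y][x] for y in rows for x in cols)
            mean = round(total / (len(rows) * len(cols)))
            for y in rows:
                for x in cols:
                    out[y][x] = mean
    return out


tiny = [
    [0, 10, 100, 110],
    [20, 30, 120, 130],
    [200, 200, 50, 50],
    [200, 200, 50, 50],
]
for row in average_blocks(tiny, block=2):
    print(row)
```

Each 2 x 2 chunk of the tiny grid collapses to a single value, so only one number per chunk needs storing. The detail inside each chunk is gone for good, which is what "lossy" means.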

The infamous GIF is not a filetype capable of this sort of compression. This is largely due to its age - developed in 1987, when color on computers was still pretty hot stuff. GIFs use another method of reducing file size, called indexing. Indexing essentially changes how pixels work. Instead of referring to a color value, an indexed pixel refers to a color in what's called a color map or lookup table, like a palette. Here's Lena again:

Here we've reduced the image to 32 colors in the color map. I've drawn in the map on the right side, so that you can see all the colors that the image uses. This drastically shrinks file size (76KB, 1/10 of the original BMP) while maintaining good color. Indexing also gives the image a texture, as can be seen especially on Lena's shoulder. GIF makers often enjoy the added textures of various indexing methods, and look for interesting ways to employ them. To use a personal example, check out my GIF of Scott Ostler at dump IRL: 32 colors used for the total animation gives the light on his face a gritty, laser-like quality.
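A minimal Python sketch of indexing, assuming the palette has already been chosen (picking a good palette is its own problem, and I'm not showing how any particular program does it). The names index_image, expand and bits_per_index are hypothetical.

```python
def index_image(pixels, palette):
    """Swap each RGB pixel for the index of the closest palette color."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [min(range(len(palette)), key=lambda i: distance(px, palette[i]))
         for px in row]
        for row in pixels
    ]


def expand(indexed, palette):
    """Look each index back up in the color map to get displayable pixels."""
    return [[palette[i] for i in row] for row in indexed]


def bits_per_index(palette_size):
    """Bits needed to store one index into a palette of this size."""
    return max(1, (palette_size - 1).bit_length())


palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [[(250, 10, 10), (5, 5, 5)],
          [(200, 200, 200), (255, 0, 0)]]
indexed = index_image(pixels, palette)
print(indexed)              # [[2, 0], [1, 2]]
print(bits_per_index(32))   # 5 bits per pixel instead of 24
```

With 32 colors, each pixel only needs 5 bits for its index rather than 24 bits for a full color, and that's before the file format applies any further compression of its own.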

When I reduced the colors, I asked the program to simply average the whole palette down. However, we can again take advantage of the human eye and "cheat" if we so desire to make the image appear to have more colors. The way we do this is called dithering, arranging colors in patterns at the pixel level so that colors appear to blend. The textures from different dithering methods can bring interesting visual sensations to images. Here's Lena with the most common methods to illustrate:

Notice how much truer to the original the dithered versions are. Also notice the difference between the positioned quarter's checkerboard patterns and the Floyd-Steinberg quarter's texture. I like to think of the F-S method as "pore-like." Anyway, GIFs have an upper limit on the number of colors that they can display as well, 256 max. This puts their color depth much lower than the average JPG, so the filetype is more suited both for images with very few colors and for images where a dithered texture is desired. Again, personal example - in Mirage I pitted F-S dithering of a black and white image against the antialiasing (slight blurring of edges) that most web browsers apply to images. This encourages hallucinatory moire patterns to occur in the shadows of the wavy buildings. In this way, you can anticipate how browsers treat images and design your base images to take advantage of that output. Browser rendering of images is a whole different subject which we'll save for the future.
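For the curious, here's a small Python sketch of Floyd-Steinberg dithering, reduced to the simplest case: a grayscale grid going down to pure black and white. It uses the standard 7/16, 3/16, 5/16, 1/16 error weights. The function name and the 128 threshold are my own choices, and real image editors dither in color against a full palette.

```python
def floyd_steinberg(gray):
    """Dither a grid of 0-255 gray values down to just 0 and 255."""
    height, width = len(gray), len(gray[0])
    work = [[float(v) for v in row] for row in gray]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= 128 else 0
            out[y][x] = new
            # Push the rounding error onto neighbors not yet visited.
            err = old - new
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


flat_gray = [[128] * 8 for _ in range(8)]
for row in floyd_steinberg(flat_gray):
    print("".join("#" if v else "." for v in row))
```

A flat mid-gray comes out as a scatter of black and white pixels that the eye averages back into gray. Because each pixel's error spills onto its neighbors, the pattern is irregular rather than a strict checkerboard.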

Animation and Transparency
GIFs have two more formal qualities that make them interesting as filetypes, transparency and capability for animation. GIF animations can have as many frames as the artist might want, but more frames add to the file size quickly. We're going to use a sample other than Lena this time, the popular crab monster from dump:

Interesting things to note about this guy: the reduced color palette, very few frames of animation, and the transparent background surrounding the crab. The pixelation is an intentional move on the artists' part. GIFs have the capability for pixels to be either completely opaque or completely transparent. This allows artists to put an emphasis on the form of a shape or animation. This also makes things interesting for building animations, as GIFs treat each frame as a layer of the same image. When making animated GIFs, the artist must specify frame by frame the duration of each frame (editors usually show it in milliseconds, though the file itself stores delays in hundredths of a second) and the method of adding the frame to the image. The minimum duration for a frame to be visible is about 17ms, one refresh of a monitor running at the standard 60 Hz. Frames can be as slow as you want. A frame can either replace the previous frame, or combine, i.e. overlap onto the previous frame. The above image replaces each frame with the next frame. If we change the layers to combine, we get this:

The crab leaves "tracers" of the previous frames, then erases them when it returns to frame 1. This can lead to some interesting effects; check out Tom Moody's "GIF Heavy Hitters" figure, which brings an emphasis onto the path of motion in space. Beyond these possibilities for GIF animation, there are also the GIF's loop qualities. GIFs can loop infinitely, or loop only once per display. Finite GIFs have been largely unexplored due to poor browser support and general lack of apparent possibility, but they can exist.
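The replace and combine modes described above can be sketched in a few lines of Python. This is a simplified model of what a viewer does, not a GIF decoder: frames are grids where None marks a transparent pixel, and render_frames is a hypothetical name.

```python
def render_frames(frames, mode="replace"):
    """Return what the viewer sees after each frame of one loop."""
    height, width = len(frames[0]), len(frames[0][0])
    canvas = [[None] * width for _ in range(height)]
    shown = []
    for frame in frames:
        if mode == "replace":
            # Wipe the canvas so nothing from earlier frames survives.
            canvas = [[None] * width for _ in range(height)]
        # Opaque pixels overwrite the canvas; transparent ones leave it alone.
        for y in range(height):
            for x in range(width):
                if frame[y][x] is not None:
                    canvas[y][x] = frame[y][x]
        shown.append([row[:] for row in canvas])
    return shown


frames = [[["A", None, None]],
          [[None, "B", None]],
          [[None, None, "C"]]]
print(render_frames(frames, "replace")[-1])   # [[None, None, 'C']]
print(render_frames(frames, "combine")[-1])   # [['A', 'B', 'C']]
```

In combine mode the earlier frames show through wherever the new frame is transparent, which is exactly where the tracers come from. The canvas starts empty on each pass through the loop, so the tracers vanish when the animation returns to frame 1.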

Returning to transparency as a topic, the advantage that the PNG filetype has over other graphics formats is its capability for gradual transparency. Unlike the on-off method of transparency used in GIFs, PNGs can add another byte onto each pixel value for an alpha value. PNGs also support up to 16.7 million colors outside of transparency. This makes PNGs very useful in web design, where gradual transparency can be used for all sorts of interesting effects.
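Here's a minimal Python sketch of what that extra alpha byte does when the image is drawn over a background. The function name blend is hypothetical; the arithmetic is the usual weighted mix, with alpha 0 meaning fully transparent and 255 fully opaque.

```python
def blend(foreground, background, alpha):
    """Mix a pixel over a background color using an alpha byte (0-255)."""
    a = alpha / 255
    return tuple(round(a * f + (1 - a) * b)
                 for f, b in zip(foreground, background))


red = (255, 0, 0)
gray = (128, 128, 128)
print(blend(red, gray, 255))   # (255, 0, 0), fully opaque
print(blend(red, gray, 0))     # (128, 128, 128), fully transparent
print(blend(red, gray, 128))   # roughly halfway between the two
```

A GIF only gets the first two cases. A PNG with an alpha channel gets every step in between, per pixel.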

I mentioned that I was going to return to a comparison of PNGs and JPGs regarding compression. I honestly have a poor understanding of PNG compression, but I know from personal experience that when PNG images use few colors, their file size is usually much smaller than a comparable JPG, and artifacts do not occur. When PNGs are used for photographs, often they are much larger in file size than their JPEG counterparts. Basically, JPEGs are more suited for photos, and PNGs are better for layering, abstract images and illustrations.

I think it's time for another Lena sample. Here she is as a PNG with varying alpha values. 532KB - not great size-wise, but pretty good for having an extra byte added on to each pixel. Notice how she blends into the gray background. If you were to remove the alpha channel on this image, the missing areas would reappear.

Wow, kinda misty and romantic. I should point out, you can dither alpha values and save an image like this as a GIF. You get what's called the "screen door effect." Depending on your computer and browser, this image might be antialiased and gain a moire pattern around the outside. You might want that for your images, but if you just want simple, gradual transparency, stick with PNGs.

Vectors and Reproducibility
We're almost finished! Vector images are a completely different ballgame. Remember how I mentioned that compressed JPGs are made up of lots of small "chunked" mathematical functions? Vectors take that a step further. Think of a curve, like a sine curve. In math, we can use points, curves and calculus to define the areas of regions of space. Vector image editors like Illustrator and Inkscape allow users to draw simple shapes, which are stored as mathematical functions. This makes these sorts of files incredibly small, and also scalable. Rather than having to define a value at every point in a grid, the computer is given a mathematical function to trace and fill in.
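As a small illustration of "a function to trace," here's a Python sketch using a quadratic Bezier curve, one of the curve types vector formats commonly store. Three control points are the entire stored shape, and the same three points can be traced at any size. The function names are my own.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Point on the curve at t (0 = start, 1 = end), steered by p1."""
    u = 1 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def trace(p0, p1, p2, scale, steps):
    """Sample the curve and scale it up to pixel coordinates."""
    points = []
    for i in range(steps + 1):
        x, y = quadratic_bezier(p0, p1, p2, i / steps)
        points.append((round(x * scale), round(y * scale)))
    return points


arch = ((0, 0), (1, 2), (2, 0))   # three points describe the whole shape
print(trace(*arch, scale=10, steps=4))     # a thumbnail-sized trace
print(trace(*arch, scale=1000, steps=4))   # same curve, banner-sized
```

Making the image bigger just means tracing the same function with a larger scale and more steps. Nothing gets stretched, so the edges stay sharp.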

Lena, you've changed somehow... As you can see, this drawing is composed of curves, simple shapes, fills and gradients. Simple vectors can be very reminiscent of airbrush drawings due to the ease of using gradients and sharp edges. Because of the clean appearance and the scalability aspect, vector images are very useful for graphic designers who make illustrations and logos for publications. Most print work needs to be at very high resolutions, and so pixels just don't cut it when you need a banner graphic 8 feet wide. Also, I should mention that above I posted a raster version of the original vector graphic. Some of you might be able to view the original in browser. Try scaling the image up and down in browser, if you can; you'll notice the edges stay sharp instead of becoming pixelated, like the scaled up crab monster above. For those of you unfamiliar with how to do that or unable to see the SVG file, here's a larger raster version. The two most commonly browser-supported vector formats are SVG, an open standard, and SWF, the Flash standard. SWFs allow for interactivity and animations, but require proprietary software to write and view.

This brings us to our conclusion, the subject of reproducibility. As I said, vector images are notable for their scalability. Keep in mind, though, that since the raster versions of a vector file are limited to their native resolution, they are not as easily reused. When a raster image is uploaded to the internet, anyone can edit it, take credit for it, and reuse it at the resolution it exists at or smaller. An artist can have ownership over a vector image by distributing it online as rasters and keeping the vector file private. That way, if authorship is ever called into question, the artist can produce the highest resolution possible, the vector, as proof of ownership. This is an issue I haven't seen discussed in the art community before, so I'd love to hear any artists weigh in on this possibility of content management. Unlike raster files, where there is no original source, so to speak, vectors are like paintings in that they are original objects, but also are like digital images in that they can be infinitely reproduced. It's an interesting paradox.

Many digital artists know that by putting their work online, they expose it to a remix culture where the work will be infinitely reproduced, lose authorship and take on new meanings in different contexts. This is a formal quality of sorts for all online content, and I believe it's one of the main reasons why collectors and institutions are hesitant about purely digital art. Finally, it should be kept in mind that all of the image filetypes I've discussed (except for the BMP) share what might be called a formal quality at this point: common support on all web browsers. Because these images are viewable by most people online, they are suited for online display. There are ways to translate these files into formats viewable outside of a computer, but their suitability for the online environment is part of what makes them unique and important as art media.

If anyone thinks that I've missed formal qualities of these filetypes, please share your thoughts in the comments. This essay is meant to be constructive and open to growth.


  1. A note on GIF combine and replace modes for frames: "combine" mode is also very useful for "optimizing" GIFs. Not all frames have to be the same size in a GIF, so if only a small area needs to be updated in a GIF, the following frames need only be the size of the changed area. Here's an example (thanks, noisia!):

  2. Good essay, very clear. One issue about enlargement for raster images: most browsers use "resampling," essentially taking a snapshot of the image, running it through some math formulas, and redrawing it larger. I think I was the first to enlarge "The Crab" up to "dump size" (400 pixels wide). It was originally from a game--not Final Fantasy, but something like that--see for the original GIF. The pixels look sharp in my enlargement because I used "nearest neighbor" resampling. If I had enlarged it in, say, LunaPic, or Photoshop's default enlargement setting, it would have used "bicubic" resampling and looked like this:

    I personally don't find the latter appealing--it is having visual information added to it through interpretive edge smoothing. Essentially Photoshop is acting as a third artist. Most imaging software assumes you are editing a photo, not a graphic image, and that you want a nice, pleasing smoothness. That is often exactly what you don't want.

  3. Noisia says "the crab is the Ohumein-Conga from Metal Slug 3, the big brother of the Chowmein-Conga." Here are the brothers together:


  5. the nearly 80 percent of the worlds population without internet access (January 9, 2011 at 9:27 AM)


  6. I love that argument! Anyway, the nearly 10% of the world's population with dialup (a large number) said "thank you, ditherers, for keeping file sizes small over the years!"

  7. two things on gifs; one of them isn't true--

    pronounced "jif"!
    can actually support more than 256 colors!