Image compression

By quadtree decomposition and vector quantisation

Most of the time, the 'gif' and 'jpeg' standards provide very good performance; still, it is always tempting to experiment. This section develops an image compression scheme based on the fundamental notions of quadtree decomposition and vector quantisation. (They are not as menacing as they sound, really!)

Vector quantisation means merging similar colours. It is usually accompanied by additional methods that reduce the number of bits needed to represent the data: for example, it is very likely that half the available colour range (or more) will not be in use in a particular image. It then makes sense to renumber the colours that do get used, so that they occupy only the lowest value slots.
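The renumbering step can be sketched in a few lines of Python. (The original program is not reproduced here, so the function and variable names below are illustrative assumptions, not the program's own.)

```python
# Sketch of colour renumbering: remap only the colours actually used in an
# image onto the lowest slots, so fewer bits are needed per pixel.
# 'image' is a list of rows of integer colour indices (0..255).

def renumber_colours(image):
    used = sorted({pixel for row in image for pixel in row})
    remap = {old: new for new, old in enumerate(used)}  # old index -> compact index
    recoded = [[remap[p] for p in row] for row in image]
    return recoded, used  # 'used' doubles as the lookup table for decoding

image = [[200, 10, 200],
         [10, 10, 37]]
recoded, table = renumber_colours(image)
# Only three colours are used, so the recoded pixels fit in 2 bits each
# instead of 8.
```

Decoding simply looks each compact index back up in the returned table.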

Quadtree decomposition works as follows: part of an image is compressed by some technique, the reconstruction is compared to the original, and the compression is accepted if the difference is small enough. If it isn't, that part of the image is split into four smaller subblocks, each of which is handled more carefully by the compression subroutine. Only demanding parts of the image attract this more rigorous treatment; the entire image does not have to be processed thoroughly just because a small part of it does.
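The accept-or-split decision at the heart of this can be illustrated as follows. This is a hedged sketch, not the original program's code: the names, the error measure, and the mean-fill "compressor" used for the example are all assumptions.

```python
# Accept-or-split: compress a block, reconstruct it, and accept the cheap
# encoding only if the worst pixel error stays within a tolerance.

def block_error(original, reconstructed):
    # Maximum absolute pixel difference between the two blocks.
    return max(abs(a - b)
               for row_a, row_b in zip(original, reconstructed)
               for a, b in zip(row_a, row_b))

def accept_or_split(block, compress, tolerance):
    reconstructed = compress(block)
    if block_error(block, reconstructed) <= tolerance:
        return True   # keep the cheap encoding
    return False      # caller splits the block into four subblocks

# Example compressor: replace every pixel with the block mean.
def mean_fill(block):
    flat = [p for row in block for p in row]
    mean = round(sum(flat) / len(flat))
    return [[mean] * len(row) for row in block]
```

A flat block passes at any tolerance; a block with a hard edge fails and gets subdivided.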

Quadtree decomposition could, in principle, continue down to the pixel level, but the program can do better than that: once a specific subblock size has been reached, fragmentation stops. Several alternatives are then possible, with profoundly different effects on reconstruction quality and compression ratio. Both vector quantisation and quadtree decomposition are generic, in the sense that any image compression algorithm can be applied within them; this section employs the rather naive technique of encoding only those partitions which are uniform (or almost so).
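The "uniform (or almost so)" test might look like this. It is a sketch under the assumption that "almost uniform" means only a small fraction of pixels deviate from the dominant value; the threshold and all names are illustrative.

```python
def nearly_uniform(block, max_fraction_off=0.1):
    # A block counts as "almost uniform" when at most max_fraction_off of
    # its pixels differ from the most common (dominant) value.
    flat = [p for row in block for p in row]
    dominant = max(set(flat), key=flat.count)
    off = sum(1 for p in flat if p != dominant)
    return off / len(flat) <= max_fraction_off, dominant

# This block would be encoded as the single value 5, one stray pixel
# notwithstanding:
ok, value = nearly_uniform([[5, 5, 5, 5],
                            [5, 5, 5, 5],
                            [5, 5, 5, 5],
                            [5, 5, 5, 7]])
```

A block that fails the test is either subdivided or, at the minimum size, stored losslessly.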

All practical implementations must set certain limits so that they can deal effectively with images belonging to different classes. This section allows images up to 256 pixels on a side, and up to 256 values for colour (or grayscale). Some compilers will not allow an array to extend over 64k, so the size may have to be reduced somewhat, though this is not a limitation of the program itself. Using fewer than 16 initial subblocks (of the same size) does not seem profitable for most images. Given the above constraints, the boundary between lossy and lossless encoding has been placed at 4x4 subblocks.

The algorithm prefers images with large uniform backgrounds or objects, and not too many colours. Compression is very good for images mainly containing horizontal or vertical edges, such as buildings and the like. For a very detailed image, the program can become more of a hindrance than a help: if the compressed output exceeds the bitmap file length, the routine admits defeat and prints a message to that effect. Another attempt could then follow, after performing some vector quantisation (or more of it).

The routine is surely not as good as 'jpeg', but it should do better than run-length encoding: it is, in effect, a kind of two-dimensional run coding. There are several ways in which its performance could clearly be improved. Part of one of last month's reflection images has been used for experimenting; it is a 128x128 picture. The following table shows the maximum acceptable and actual error for several settings:

Max acceptable error    Actual average error
lossless                -
1%                      0.07%
2%                      0.2%
3%                      0.4%

Decimation maps for a 3% maximum acceptable error are also shown; the lower one depicts the most demanding blocks (those encoded losslessly).

Decimation map

Demanding subblocks

The program is given after it has been made more presentable, in line with the usual practice of this site. It is chiefly meant to investigate the tradeoff between decompression quality and compression factor, so it cuts quite a few corners.

The bulk of the compression is done by the 'partition' procedure. The support routines are as follows:
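Since the original listing is not reproduced here, a hedged Python sketch of what a 'partition'-style routine might look like is given instead. The record format, the constant names, and the uniformity criterion are all illustrative assumptions.

```python
# Recursive quadtree partition: accept a block if it is uniform enough,
# otherwise split it into four subblocks, down to a minimum side length
# at which blocks are stored losslessly instead.

MIN_SIDE = 4  # below this, blocks are stored losslessly (assumed boundary)

def partition(image, x, y, side, tolerance, out):
    flat = [image[y + j][x + i] for j in range(side) for i in range(side)]
    dominant = max(set(flat), key=flat.count)
    off = sum(1 for p in flat if p != dominant)
    if off / len(flat) <= tolerance:
        out.append(('uniform', x, y, side, dominant))     # one value per block
    elif side <= MIN_SIDE:
        out.append(('lossless', x, y, side, flat))        # store pixels verbatim
    else:
        half = side // 2
        for dx in (0, half):
            for dy in (0, half):
                partition(image, x + dx, y + dy, half, tolerance, out)
    return out

# 8x8 test image: uniform left half, varied right half.
image = [[0] * 4 + [(i * 7 + j * 13) % 256 for j in range(4)] for i in range(8)]
records = partition(image, 0, 0, 8, 0.0, [])
# The two left-hand quadrants come out as single 'uniform' records; the two
# right-hand quadrants reach the minimum size and are stored losslessly.
```

In a real encoder the records would of course be packed into bits rather than kept as Python tuples.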

Only square images whose number of pixels on a side is an integral power of 2 can be handled. Albeit a nuisance, this is easily circumvented: the width (or depth) of partitions in a quadtree may be allowed to differ by one unit, and the width and depth of rectangles, rather than the size of squares, recorded.

The program should really have accepted a standard bitmap input file; instead, it presently uses a simplified bitmap file, and a test file for the image above is provided. It has the following format: the program expects to find a file called 'image.bin' in the standard directory used by the language for its files. If it doesn't, it creates a random image and proceeds anyway. The compressed output is placed in a file called 'compress.bin' in the same directory. A more robust program would have prompted for the input and output filenames, and checked that the output file isn't already there; it is therefore a good idea to exit the program immediately and rename, move, or lock any compression file which is to be kept. The program also doesn't check that the input image size is an integral power of two.

The structure of the compressed data is shown here: about 35% of blocks are encoded losslessly in the above image, and a compression factor of just under 2 is obtained with respect to the bitmap size. For the next attempt, most blocks will be approximated using a low-level transform: the mean value of the block, plus the prevailing component, will be tried in the first instance, with lossless encoding accepted only as a last resort. The site is not through with image compression.
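The input handling described above can be sketched as follows. The filenames come from the text, but the byte layout (raw bytes, one per pixel, for a square image) and the function name are assumptions, since the original format description is not reproduced here.

```python
import os
import random

def load_image(path='image.bin', side=128):
    # Read a simplified bitmap if it exists; otherwise create a random
    # image and proceed anyway, as the text describes.
    if os.path.exists(path):
        data = open(path, 'rb').read()
        side = int(len(data) ** 0.5)  # assume a square, one byte per pixel
        return [list(data[r * side:(r + 1) * side]) for r in range(side)]
    return [[random.randrange(256) for _ in range(side)]
            for _ in range(side)]
```

A robust version would also refuse to overwrite an existing 'compress.bin' and reject sizes that are not a power of two.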

Above, it was stated that a small percentage of pixels in error may be tolerable, but not a large one, even though the total variance may be the same. This should be refined somewhat: the boundaries between objects, or between an object and the background, must be reproduced faithfully, while the noise on a rough surface does not have to be represented accurately. (The viewer will, however, notice if a rough surface is replaced by a uniform block.)

Thus it is necessary to distinguish between important edges and the rest of them. A large percentage of pixels differing from their neighbours is likely to indicate noise.
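One simple heuristic for this distinction is to count how often a pixel differs from its neighbour: a clean edge contributes few transitions, while noise contributes many. The function and threshold below are illustrative assumptions, not part of the program.

```python
def transition_fraction(block):
    # Fraction of horizontally adjacent pixel pairs that differ.
    changes = total = 0
    for row in block:
        for a, b in zip(row, row[1:]):
            total += 1
            if a != b:
                changes += 1
    return changes / total

edge  = [[0, 0, 9, 9]] * 4   # one vertical edge: few transitions
noise = [[0, 9, 0, 9]] * 4   # alternating pixels: every pair differs
```

A low fraction suggests an important edge worth preserving; a high fraction suggests texture or noise that can be approximated.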
