Image compression

By quadtree decomposition and vector quantisation

Most of the time, the 'gif' and 'jpeg' standards provide very good performance; still, it is always tempting to experiment. This section develops an image compression scheme based on the fundamental notions of quadtree decomposition and vector quantisation. (They are not as menacing as they sound, really!)

Vector quantisation means merging similar colours. It is usually accompanied by additional methods that reduce the number of bits needed to represent the data: for example, it is very likely that half the available colour range (or more) will not be in use in a particular image. It then makes sense to renumber the colours that do get used, so that they occupy only the lowest value slots.
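The renumbering step can be sketched in a few lines of Python. (The original program is not reproduced here, so the function and variable names below are illustrative assumptions, not the program's own.)

```python
# Sketch of colour renumbering: remap only the colours actually used in an
# image onto the lowest slots, so fewer bits are needed per pixel.
# 'image' is a list of rows of integer colour indices (0..255).

def renumber_colours(image):
    used = sorted({pixel for row in image for pixel in row})
    remap = {old: new for new, old in enumerate(used)}  # old index -> compact index
    recoded = [[remap[p] for p in row] for row in image]
    return recoded, used  # 'used' doubles as the lookup table for decoding

image = [[200, 10, 200],
         [10, 10, 37]]
recoded, table = renumber_colours(image)
# Only three colours are used, so the recoded pixels fit in 2 bits each
# instead of 8.
```

Decoding simply looks each compact index back up in the returned table.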

Quadtree decomposition works as follows: part of an image is compressed by some technique, the reconstruction is compared to the original, and the compression is accepted if the difference is small enough. If it isn't, that part of the image is split into four smaller subblocks, each of which is handled more carefully by the compression subroutine. Only demanding parts of the image attract this more rigorous treatment; the entire image does not have to be processed thoroughly just because a small part of it does.
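The accept-or-split decision at the heart of this can be illustrated as follows. This is a hedged sketch, not the original program's code: the names, the error measure, and the mean-fill "compressor" used for the example are all assumptions.

```python
# Accept-or-split: compress a block, reconstruct it, and accept the cheap
# encoding only if the worst pixel error stays within a tolerance.

def block_error(original, reconstructed):
    # Maximum absolute pixel difference between the two blocks.
    return max(abs(a - b)
               for row_a, row_b in zip(original, reconstructed)
               for a, b in zip(row_a, row_b))

def accept_or_split(block, compress, tolerance):
    reconstructed = compress(block)
    if block_error(block, reconstructed) <= tolerance:
        return True   # keep the cheap encoding
    return False      # caller splits the block into four subblocks

# Example compressor: replace every pixel with the block mean.
def mean_fill(block):
    flat = [p for row in block for p in row]
    mean = round(sum(flat) / len(flat))
    return [[mean] * len(row) for row in block]
```

A flat block passes at any tolerance; a block with a hard edge fails and gets subdivided.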

Quadtree decomposition could, in principle, continue down to the pixel level, but the program can do better than that: once a specific subblock size has been reached, fragmentation stops. Several alternatives are then possible, with profoundly different effects on reconstruction quality and compression ratio. Both vector quantisation and quadtree decomposition are generic, in the sense that any image compression algorithm can be applied within them; this section employs the rather naive technique of encoding only those partitions which are uniform (or almost so).
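The "uniform (or almost so)" test might look like this. It is a sketch under the assumption that "almost uniform" means only a small fraction of pixels deviate from the dominant value; the threshold and all names are illustrative.

```python
def nearly_uniform(block, max_fraction_off=0.1):
    # A block counts as "almost uniform" when at most max_fraction_off of
    # its pixels differ from the most common (dominant) value.
    flat = [p for row in block for p in row]
    dominant = max(set(flat), key=flat.count)
    off = sum(1 for p in flat if p != dominant)
    return off / len(flat) <= max_fraction_off, dominant

# This block would be encoded as the single value 5, one stray pixel
# notwithstanding:
ok, value = nearly_uniform([[5, 5, 5, 5],
                            [5, 5, 5, 5],
                            [5, 5, 5, 5],
                            [5, 5, 5, 7]])
```

A block that fails the test is either subdivided or, at the minimum size, stored losslessly.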

All practical implementations must set certain limits so that they can deal effectively with images belonging to different classes. This section allows images up to 256 pixels on a side, and up to 256 values for colour (or grayscale). Some compilers will not allow an array to extend over 64k, so the size may have to be reduced somewhat, though this is not a limitation of the program itself. Using fewer than 16 initial subblocks (of the same size) does not seem profitable for most images. Given the above constraints, the boundary between lossy and lossless encoding has been placed at 4x4 subblocks.

The algorithm prefers images with large uniform backgrounds or objects, and not too many colours. Compression is very good for images mainly containing horizontal or vertical edges, such as buildings and the like. For a very detailed image, the program can become more of a hindrance than a help: if the compressed output exceeds the bitmap file length, the routine admits defeat and prints a message to that effect. Another attempt could then follow, after performing some vector quantisation (or more of it).

The routine is surely not as good as 'jpeg', but it should do better than run-length encoding: it is, in effect, a kind of two-dimensional run coding. There are several ways in which its performance could clearly be improved. Part of one of last month's reflection images has been used for experimenting; it is a 128x128 picture. The following table shows the maximum acceptable and actual error for several settings:

Max acceptable error    Actual average error
lossless                -
1%                      0.07%
2%                      0.2%
3%                      0.4%

Decimation maps for a 3% maximum acceptable error are also shown; the lower one depicts the most demanding blocks (those encoded losslessly).

Decimation map

Demanding subblocks

The program is given after it has been made more presentable, in line with the usual practice of this site. It is chiefly meant to investigate the tradeoff between decompression quality and compression factor, so it cuts quite a few corners.

The bulk of the compression is done by the 'partition' procedure. The support routines are as follows:
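Since the original listing is not reproduced here, a hedged Python sketch of what a 'partition'-style routine might look like is given instead. The record format, the constant names, and the uniformity criterion are all illustrative assumptions.

```python
# Recursive quadtree partition: accept a block if it is uniform enough,
# otherwise split it into four subblocks, down to a minimum side length
# at which blocks are stored losslessly instead.

MIN_SIDE = 4  # below this, blocks are stored losslessly (assumed boundary)

def partition(image, x, y, side, tolerance, out):
    flat = [image[y + j][x + i] for j in range(side) for i in range(side)]
    dominant = max(set(flat), key=flat.count)
    off = sum(1 for p in flat if p != dominant)
    if off / len(flat) <= tolerance:
        out.append(('uniform', x, y, side, dominant))     # one value per block
    elif side <= MIN_SIDE:
        out.append(('lossless', x, y, side, flat))        # store pixels verbatim
    else:
        half = side // 2
        for dx in (0, half):
            for dy in (0, half):
                partition(image, x + dx, y + dy, half, tolerance, out)
    return out

# 8x8 test image: uniform left half, varied right half.
image = [[0] * 4 + [(i * 7 + j * 13) % 256 for j in range(4)] for i in range(8)]
records = partition(image, 0, 0, 8, 0.0, [])
# The two left-hand quadrants come out as single 'uniform' records; the two
# right-hand quadrants reach the minimum size and are stored losslessly.
```

In a real encoder the records would of course be packed into bits rather than kept as Python tuples.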

Only square images whose number of pixels on a side is an integral power of 2 can be handled. Albeit a nuisance, this is easily circumvented: the width (or depth) of partitions in a quadtree may be allowed to differ by one unit, and the width and depth of rectangles, rather than the size of squares, recorded.

The program should really have accepted a standard bitmap input file; instead, it presently uses a simplified bitmap file, and a test file for the image above is provided. It has the following format: the program expects to find a file called 'image.bin' in the standard directory used by the language for its files. If it doesn't, it creates a random image and proceeds anyway. The compressed output is placed in a file called 'compress.bin' in the same directory. A more robust program would have prompted for the input and output filenames, and checked that the output file isn't already there; it is therefore a good idea to exit the program immediately and rename, move, or lock any compression file which is to be kept. The program also doesn't check that the input image size is an integral power of two.

The structure of the compressed data is shown here: about 35% of blocks are encoded losslessly in the above image, and a compression factor of just under 2 is obtained with respect to the bitmap size. For the next attempt, most blocks will be approximated using a low-level transform: the mean value of the block, plus the prevailing component, will be tried in the first instance, with lossless encoding accepted only as a last resort. The site is not through with image compression.
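The input handling described above can be sketched as follows. The filenames come from the text, but the byte layout (raw bytes, one per pixel, for a square image) and the function name are assumptions, since the original format description is not reproduced here.

```python
import os
import random

def load_image(path='image.bin', side=128):
    # Read a simplified bitmap if it exists; otherwise create a random
    # image and proceed anyway, as the text describes.
    if os.path.exists(path):
        data = open(path, 'rb').read()
        side = int(len(data) ** 0.5)  # assume a square, one byte per pixel
        return [list(data[r * side:(r + 1) * side]) for r in range(side)]
    return [[random.randrange(256) for _ in range(side)]
            for _ in range(side)]
```

A robust version would also refuse to overwrite an existing 'compress.bin' and reject sizes that are not a power of two.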

Above, it was stated that a small percentage of pixels in error may be tolerable, but not a large one, even though the total variance may be the same. This should be refined somewhat: the boundaries between objects, or between an object and the background, must be reproduced faithfully, while the noise on a rough surface does not have to be represented accurately. (The viewer will, however, notice if a rough surface is replaced by a uniform block.)

Thus it is necessary to distinguish between important edges and the rest of them. A large percentage of pixels differing from their neighbours is likely to indicate noise.
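One simple heuristic for this distinction is to count how often a pixel differs from its neighbour: a clean edge contributes few transitions, while noise contributes many. The function and threshold below are illustrative assumptions, not part of the program.

```python
def transition_fraction(block):
    # Fraction of horizontally adjacent pixel pairs that differ.
    changes = total = 0
    for row in block:
        for a, b in zip(row, row[1:]):
            total += 1
            if a != b:
                changes += 1
    return changes / total

edge  = [[0, 0, 9, 9]] * 4   # one vertical edge: few transitions
noise = [[0, 9, 0, 9]] * 4   # alternating pixels: every pair differs
```

A low fraction suggests an important edge worth preserving; a high fraction suggests texture or noise that can be approximated.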
