Image compression

By pattern matching

Last month’s section on image compression is enhanced by adding pattern matching for demanding subblocks.

The 'gif' and 'jpeg' standards offer adequate performance most of the time. The best way to improve understanding of any matter, though, if there ever was one, could be by snooping around and fiddling with things.

The recent compression section has already discussed vector quantisation and quadtree decomposition. Briefly, to refresh the memory: Vector quantisation involves merging similar colours. Quadtree decomposition ensures that only demanding partitions receive extra attention. All subblocks are subject to no more processing than necessary, which implies that quadtree decomposition is adaptive.

Clearly enough, pattern matching involves comparison to a limited set of templates only- the best fit is selected. Partitions are presumed basically to include two colours only. For 4x4 subblocks, this is not a tough prerequisite, and if it fails, this portion of the image can be encoded losslessly, anyway. If there are any additional colours, they are approximated by one of the two major colours.

Again, the same image is used for tests, (though it was shown magnified by 2 in the initial compression section.) Examples include several settings of the quality control:

Here is a table which relates compressed file sizes and actual average error. The length of the simplified bitmap is 16,384 bytes:

Lossless 0.1% error

Error: Actual average~ 0.1%

0.2% error

Error:~ 0.2%. 0.3% error

Error:~ 0.3%

Size Error (%)

4934 Loss free

4496 0.1

4202 0.2

4068 0.3

Before claims for compression figures are made, a dozen of different images should probably be tested. The compression factor for the image above is about 4 times for an actual average error of 0.33%.

Here are the decimation maps for demanding compartments: Both pattern encoded subblocks (top) and losslessly encoded partitions are shown.

Pattern encoding

Lossless encoding

Modifications made to the initial version are as follows: The ‘pattern’ subroutine performs most of the pattern selection process. The number of bytes in the pattern matching block follows the corresponding figure for losslessly encoded partitions (two bytes.) Lastly, the pattern matching section is appended to the compression file. It has the format detailed below:

location of top left corner (2 bytes)
the two major colours in the pattern (2 bytes)
the pattern itself- starting from the left column, top to low, and then the remaining columns, each bit is on if the corresponding pixel is closer to the first major colour (2 bytes)

The shortcuts in the original version are still present, for the time being: The input file is in a simplified bitmap format, and only square images are handled- the number of pixels on a side is an integer power of two. The input image is expected to be in 'image.bin' in the default directory used by the language for its files. Compression data is put to the 'compress.bin' file in the same directory, overwriting any existing file of the same name. If the input file is not found, a random image is generated and the program runs anyway. The source code for this section is supplied.

A low level transform will be applied to demanding partitions for the next version of the compression code.

Lossless	Error: Actual average~ 0.1%
Error:~ 0.2%.	Error:~ 0.3%