Fractal compression?

This project was approached with a good deal of scepticism, and this site has not been converted to the self-similarity creed. For example, see [1], [2].

[Images: tulip (jpeg) and tulip (fractal collage, error 2.0%)]

First comes a brief review of image compression, as presented on this site so far.

[Images: Monalisa (jpeg) and Monalisa (fractal, error 1.8%)]

The algorithm which is widely known for fractal compression involves the following steps: the image is partitioned into range blocks; a pool of larger domain blocks is shrunk to range size; and every range is compared with the shrunken domains, computing for each pair the brightness and contrast values that give the least-squares fit:

bright = (Σy - contrast·Σx) / n   and

contrast = Σ[(x-x')(y-y')] / Σ[(x-x')²]

where x stands for the (shrunken) domain pixels, y for the corresponding range pixels, x' and y' for their respective averages, and n for the number of pixels in a block.

[Images: mandrill (jpeg) and mandrill (fractal, error 6.5%)]

Most of the time, the 'fit' will, of course, be lamentable, but if a large number of domains is used, then a few may provide an acceptable match. If the contrast value cannot be obtained from the formulae above (a flat domain block), it is set to zero. Convergence is guaranteed if the contrast values lie between -1 and +1. Actually, this condition is stricter than necessary; in practice, a fair proportion of contrast values closer to 2 was retained, and the decoder simply constrained its results between the minimum and maximum brightness values in the original image. In this implementation, it also proved advantageous to record, for each range block, the difference of its average brightness from that of the previous block.
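The least-squares fit above is straightforward to sketch. The following Python mirrors the two formulae, including the flat-domain case where the contrast is set to zero (the names `fit` and `rms_error`, and the flattened pixel lists, are mine):

```python
def fit(x, y):
    """Least-squares brightness/contrast fit of a shrunken domain x
    to a range y, both given as flat lists of pixel values."""
    n = len(x)
    mx = sum(x) / n                    # x' in the text
    my = sum(y) / n                    # y' in the text
    den = sum((xi - mx) ** 2 for xi in x)
    if den == 0:                       # flat domain: contrast undefined
        contrast = 0.0
    else:
        contrast = sum((xi - mx) * (yi - my)
                       for xi, yi in zip(x, y)) / den
    bright = (sum(y) - contrast * sum(x)) / n
    return contrast, bright

def rms_error(x, y, contrast, bright):
    """Root-mean-square residual of the fit, for ranking domains."""
    n = len(x)
    return (sum((contrast * xi + bright - yi) ** 2
                for xi, yi in zip(x, y)) / n) ** 0.5
```

For example, fitting the domain [0, 1, 2, 3] to the range [10, 12, 14, 16] gives a contrast of 2 and a brightness of 10, with zero residual.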

[Images: monarch (jpeg) and monarch (fractal, error 6.0%)]

Fractal compression is a highly asymmetric process: encoding takes far longer than image restoration. Domain blocks which do not stand a chance of providing a good match must be excluded from further processing as early as possible. There are several ways to speed up compression; some techniques are suitable for all block-based lossy methods, while the last two points apply to fractal coders in particular:
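One widely used pruning trick of this kind is block classification: every block is assigned to a class from some cheap statistic, and a range is then compared only against domains of the same class. A minimal Python sketch, classifying by the brightness ordering of the four quadrants (the function name is mine, and this is not necessarily the exact scheme used on this page):

```python
def quadrant_class(block):
    """Classify a square block (list of equal-length rows, even side)
    by the brightness ordering of its four quadrants.

    Returns one of 24 permutations; only domains in the same class as
    a range need be examined further.  A positive-contrast fit cannot
    change the ordering, so matching blocks share a class."""
    n = len(block)
    h = n // 2
    def mean(r0, c0):
        return sum(block[r][c]
                   for r in range(r0, r0 + h)
                   for c in range(c0, c0 + h)) / (h * h)
    means = [mean(0, 0), mean(0, h), mean(h, 0), mean(h, h)]
    return tuple(sorted(range(4), key=lambda i: means[i]))
```

Because the class is invariant under brightness and (positive) contrast changes, a domain in a different class from the range can be discarded without computing a fit at all.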

[Images: Lenna (gif) and Lenna (fractal, error 3.5%)]

When you are not happy with the compression quality, it may be better to try another compression method altogether: there are a couple of options to take here, actually, but they will almost certainly do little to improve image quality and everything to render the compression time completely unacceptable. There are several ways to increase the number of hopeful domain blocks:

Reflections and rotations comprise the eight principal isometries. All classes can be derived from a single class by using at most three reflections: in effect, a rotation is a double reflection. A 'partial reflection' affects a single row (or column) only, and is needed before the remaining 16 classes can be formed. All of these are depicted in the following sketch. Initially, it seems that only the eight principal isometries will be useful, but the ordering scheme employed is not perfect: it does not take into account the fact that some partial brightness levels may be equal.

[Sketch: the isometries]
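As a sketch, the eight principal isometries can be generated from two reflections (the names below are mine). Note how `rotate90` really is a double reflection: a top-to-bottom flip followed by a reflection about the diagonal (a transpose):

```python
def mirror(block):
    """Reflect a square block left-to-right."""
    return [row[::-1] for row in block]

def rotate90(block):
    """Rotate clockwise: a top-to-bottom flip (one reflection)
    followed by a transpose (a reflection about the diagonal)."""
    return [list(row) for row in zip(*block[::-1])]

def principal_isometries(block):
    """Return the eight principal isometries: the identity, the three
    rotations, and the mirror image of each of the four."""
    out = []
    b = [list(row) for row in block]
    for _ in range(4):
        out.append(b)
        out.append(mirror(b))
        b = rotate90(b)
    return out
```

For a generic block all eight results are distinct; for symmetric blocks some coincide, which is one reason an ordering-based scheme can misfile a block.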

Thus a range is approximated by a brightness and a contrast value, plus a domain location and, possibly, an isometry descriptor. The contents of the domains themselves are not needed in the compressed file. Some quantisation of values and a little bit manipulation are in order, so that the data is packed efficiently in the fractal file.
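As an illustration of the packing, a Python sketch; the bit widths (5-bit contrast, 7-bit brightness, 3 bits for one of the eight isometries) are illustrative choices of mine, not necessarily the ones used here:

```python
CONTRAST_BITS, BRIGHT_BITS, ISO_BITS = 5, 7, 3

def quantise(value, lo, hi, bits):
    """Map a value in [lo, hi] onto an unsigned integer of `bits` bits."""
    levels = (1 << bits) - 1
    q = round((value - lo) / (hi - lo) * levels)
    return max(0, min(levels, q))

def pack(contrast, bright, domain_index, isometry, domain_bits):
    """Pack one range's parameters into a single integer.
    The domain location is assumed already encoded as an index
    of `domain_bits` bits."""
    word = quantise(contrast, -1.0, 1.0, CONTRAST_BITS)
    word = (word << BRIGHT_BITS) | quantise(bright, 0, 255, BRIGHT_BITS)
    word = (word << domain_bits) | domain_index
    word = (word << ISO_BITS) | isometry
    return word
```

The decoder reverses the shifts and rescales the quantised values; the fractal file is then just a stream of such words.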

The site has tried using domain and range blocks of the same size, in the hope that self-similarity might be more apparent at the same scale. The results, predictably, were plagued by convergence problems. Restraining contrast values between -1 and +1 and shifting domain and range blocks as shown in the following sketch improved matters, but only so much. This direction has been abandoned.

[Sketch: shifted domain and range blocks]

The decompression process has not been given as much attention; it is much faster on any reasonable hardware. If subsampling has been used for encoding, there is a technique called pixel chasing (or is it pixel chaining?), which exploits the fact that pixels are classified into a small number of families, every family having a limited number of cycles. Usually, though, decompression is recursive. It is faster to start from a uniform image of the average brightness of the original (but much more impressive to start with a completely different photograph).
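A toy illustration of the recursive decoder, reduced to one dimension to keep it short (the names, the 2:1 block sizes and the flat 128 starting value are my choices): each two-pixel range is rebuilt from a four-pixel domain, shrunk by pairwise averaging, and the whole map is iterated from an arbitrary starting image.

```python
def decode_step(pixels, transforms):
    """Apply every range transform once.  Each transform rebuilds one
    two-pixel range from the four-pixel domain at dom_start."""
    out = []
    for dom_start, contrast, bright in transforms:
        dom = pixels[dom_start:dom_start + 4]
        shrunk = [(dom[0] + dom[1]) / 2, (dom[2] + dom[3]) / 2]
        out.extend(contrast * p + bright for p in shrunk)
    return out

def decode(transforms, n_pixels, iterations=10, start=128.0):
    """Iterate the transforms from a uniform image; with contrast
    values inside (-1, +1) any start converges to the same attractor."""
    pixels = [start] * n_pixels
    for _ in range(iterations):
        pixels = decode_step(pixels, transforms)
    return pixels
```

Starting from a blank image and from a completely different one yields, after enough iterations, the same attractor, which is exactly why the choice of starting picture is only a matter of speed (or showmanship).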

Strictly speaking, there is a difference between the fractal approximation (the so-called collage) and the final decompressed image (the attractor), but it is small if the photograph was suitable for fractal encoding. The following table gives the residual error for every iteration of the Lenna image. Ten iterations are usually adequate, though given the collage error, even three iterations would be acceptable in many cases. Normally, convergence seems to take place more or less at a geometric rate:

Iteration Error (%)
1 23.0
2 9.8
3 4.7
4 3.8
5 3.7
10 3.7
collage 3.5

Only the ratio, not the actual sizes, of range and domain blocks has been fixed during compression. The decompressed image may therefore be of any size. (This seems to be the source of some unfortunate claims of compression ratios in the past.) The images below show a part of Lenna magnified twice; for comparison, a jpeg blow-up is displayed too. The fractal zoom is said to have the edge at much higher zoom factors, though. Both images are badly in need of postprocessing, but here you get the chance to compare the raw results.

[Images: Lenna zoom (jpeg, error 2.5%) and Lenna zoom (fractal, collage error 2.6%)]

For some images, the error in the fractal model is under 1.5%. Why then the controversy surrounding fractal compression? Well, the widely known algorithm will not, for most images, provide a compression factor of 100 if the results are to be presentable. It does work at more reasonable compression figures, but even then it may not be the best way to go.

The quest for self-similarity and the iterative decoding process are needed only so that the implicit codebook (the domain pool) need not be transmitted in the compressed file. If a fixed codebook is used for all images, it need not form part of the compressed files either: it is simply attached to the encoder and the decoder. Decoding then becomes a single-step process, the averaging of domain blocks is no longer necessary, and the quality will look similar. A fixed-codebook coder will be described in two months.
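A hypothetical sketch of such a fixed-codebook coder in Python (the names and the flattened-list representation are mine, not a preview of the coder promised above): the codebook lives with both ends, each range stores just a codeword index plus contrast and brightness, and decoding is a single lookup pass with no iteration.

```python
def fit(x, y):
    """Least-squares brightness/contrast fit of codeword x to range y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    den = sum((xi - mx) ** 2 for xi in x)
    contrast = 0.0 if den == 0 else \
        sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / den
    bright = (sum(y) - contrast * sum(x)) / n
    return contrast, bright

def encode(ranges, codebook):
    """For every range block store (codeword index, contrast, bright)."""
    code = []
    for y in ranges:
        best = None
        for k, x in enumerate(codebook):
            contrast, bright = fit(x, y)
            err = sum((contrast * xi + bright - yi) ** 2
                      for xi, yi in zip(x, y))
            if best is None or err < best[0]:
                best = (err, k, contrast, bright)
        code.append(best[1:])
    return code

def decode(code, codebook):
    # a single pass: no iteration and no averaging of domain blocks
    return [[c * xi + b for xi in codebook[k]] for k, c, b in code]
```

This is, of course, just shape-gain vector quantisation; the fractal machinery has disappeared along with the implicit codebook.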

Black-and-white photographs are shown in many papers; this could be because of printing costs. Cost is not a problem on the web, and excellent-quality colour photographs can be rebuilt at low compression ratios. The rgb colour planes have been considered individually for the images on this page, but they are correlated, so this is not the best way to go: for example, the same domain block will often serve all three colour planes equally well.

For a fair comparison, images are transmitted as 24-bit bitmaps. This takes time, and some browsers may not display them. Images of 256 gray tones, on the other hand, can be sent losslessly in Gif files, and will download faster. None of these images was subject to any postprocessing whatsoever.

Modern processors seem to run most programs in no time; this program will, at the higher quality settings, show them who the boss is! A 168x168, 256-gray-level image takes about 3 seconds to compress on a 1.8 GHz Celeron at the lowest quality setting, that is, with no overlapping of domains and with isometry transformations disabled. If the size of the domain pool is increased sixteen times by reducing the domain separation to two pixels (or further still, by allowing all isometries), the program will take about a minute. Choose the highest quality setting before you go to lunch! If you have an old computer, don't bother downloading this one. The source program will be added to this section later.

The executable file is now in place. In addition, you may also use the images given or referenced at the end of the image compression error section.


1. On fractal compression and vector quantisation, Skjalg Lepsoy et al, in Fractal Image Encoding and Analysis, Y. Fisher (ed), Springer 1998 (ISBN 3-540-63196-8): cross-transformation of 11 images with peppers or couple gives better results than self-transformation; see figures 2.1, 2.2 and 2.7-2.10, and table 2.4.
2. Fractal Imaging, Ning Lu, Academic Press 1997 (ISBN 0-12-458010-6), table 3.5.2: a fractal proponent can only demonstrate a 0.4% improvement between self-transformability of four images and cross-transformability with the image Plava Laguna.
3. Some researchers speak of reference and destination blocks. When single-letter prefixes are used for indices, there is ample opportunity for confusion!