
Master Thesis Software Engineering
Thesis no: MSE-2008:05
February 2008

School of Engineering
Blekinge Institute of Technology
Box 520
SE – 372 25 Ronneby
Sweden

Fractal Compression of Medical Images

Wojciech Walczak


This thesis is submitted to the School of Engineering at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full-time studies.

Contact Information:
Author: Wojciech Walczak
Address: ul. Krzętowska 3A, 97-525 Wielgomłyny, Poland
E-mail: [email protected]

University advisors:

Bengt Aspvall
School of Engineering
Blekinge Institute of Technology, Sweden

Jan Kwiatkowski
Institute of Applied Informatics
Wrocław University of Technology, Poland

School of Engineering
Blekinge Institute of Technology
Box 520
SE – 372 25 Ronneby
Sweden

Internet: www.bth.se/tek
Phone: +46 457 38 50 00
Fax: +46 457 271 25


Faculty of Computer Science and Management
Field of study: Computer Science
Specialization: Software Engineering

Master Thesis

Fractal Compression of Medical Images

Wojciech Walczak

Keywords: fractal compression, fractal magnification, medical imaging

Short abstract: The thesis investigates the suitability of fractal compression for medical images. The fractal compression method is selected through a survey of the literature and adapted to the domain. The emphasis is put on shortening the encoding time and minimizing the loss of information. Fractal magnification, the most important advantage of fractal compression, is also discussed and tested. The proposed method is compared with existing lossy compression methods and magnification algorithms.

Supervisor: Jan Kwiatkowski

Wrocław 2008


Contents

Introduction
    Background
    Research Aim and Objectives
    Research Questions
    Thesis Outline

Chapter 1. Digital Medical Imaging
    1.1. Digital Images
        1.1.1. Analog and Digital Images
        1.1.2. Digital Image Characteristics
    1.2. Medical Images

Chapter 2. Image Compression
    2.1. Fundamentals of Image Compression
        2.1.1. Lossless Compression
        2.1.2. Lossy Compression
    2.2. Fractal Compression
        2.2.1. Decoding
        2.2.2. Encoding
    2.3. Fractal Magnification

Chapter 3. Fractal Compression Methods
    3.1. Partitioning Methods
        3.1.1. Uniform Partitioning
        3.1.2. Overlapped Range Blocks
        3.1.3. Hierarchical Approaches
        3.1.4. Split-and-Merge Approaches
    3.2. Domain Pools and Virtual Codebooks
        3.2.1. Global Codebooks
        3.2.2. Local Codebooks
    3.3. Classes of Transformations
        3.3.1. Spatial Contraction
        3.3.2. Symmetry Operations
        3.3.3. Block Intensity
    3.4. Quantization
        3.4.1. Quantization During Encoding
        3.4.2. Quantization During Decoding
    3.5. Decoding Approaches
        3.5.1. Pixel Chaining
        3.5.2. Successive Correction Decoding
        3.5.3. Hierarchical Decoding
        3.5.4. Decoding with Orthogonalization
    3.6. Post-processing
    3.7. Discussion

Chapter 4. Accelerating Encoding
    4.1. Codebook Reduction
    4.2. Invariant Representation and Invariant Features
    4.3. Nearest Neighbor Search
    4.4. Block Classification
        4.4.1. Classification by Geometric Features
        4.4.2. Classification by Intensity and Variance
        4.4.3. Archetype Classification
    4.5. Block Clustering
    4.6. Excluding Impossible Matches
    4.7. Tree Structured Search
    4.8. Multiresolution Search
    4.9. Reduction of Time Needed for Distance Calculation
    4.10. Parallelization

Chapter 5. Proposed Method
    5.1. The Encoding Algorithm Outline
    5.2. Splitting the Blocks
    5.3. Codebook
        5.3.1. On-the-fly Codebook
        5.3.2. Solid Codebook
        5.3.3. Hybrid Solutions
    5.4. Symmetry Operations
    5.5. Constructing the Fractal Code
        5.5.1. Standard Approach
        5.5.2. New Approach
        5.5.3. Choosing the Parameters
        5.5.4. Adaptation to Irregular Regions Coding
    5.6. Time Cost Reduction
        5.6.1. Variance-based Acceleration
        5.6.2. Parallelization
    5.7. Summary of the Proposed Compression Method

Chapter 6. Experimental Results
    6.1. Block Splitting
    6.2. Number of Bits for Scaling and Offset Coefficients
    6.3. Coding the Transformations
    6.4. Acceleration
        6.4.1. Codebook Size Reduction
        6.4.2. Breaking the Search
        6.4.3. Codebook Optimization
        6.4.4. Variance-based Acceleration
        6.4.5. Parallelization
        6.4.6. Spiral Search
    6.5. Comparison with JPEG
    6.6. Magnification

Conclusions
    Research Question 1: Is it possible, and how, to minimize the drawbacks of the fractal compression method to a satisfying level in order to apply this method to medical imaging?
    Research Question 2: Which fractal compression method suits medical images best and gives the best results?
    Research Question 3: Does fractal compression preserve image quality better or worse than other irreversible (information-lossy) compression methods?
    Research Question 4: Can the results of fractal magnification be better than the results of traditional magnification methods?
    Future Work

Appendix A. Sample Images
Appendix B. Glossary
Appendix C. Application Description and Instructions for Use
    C.1. How to Run the Program
    C.2. Common Interface Elements and Functionalities
        C.2.1. Menu Bar
        C.2.2. Tool Bar
        C.2.3. Status Bar
        C.2.4. Pop-up Menus
    C.3. WoWa Fractal Encoder
        C.3.1. Original Tab
        C.3.2. Comparison Tab
        C.3.3. Partitions Tab
        C.3.4. Transformations Tab
        C.3.5. Log Tab
        C.3.6. Image Comparator
    C.4. WoWa Fractal Decoder
    C.5. Settings
        C.5.1. Application Preferences
        C.5.2. Encoder Settings
        C.5.3. Decoder Settings
Appendix D. Source Code and Executable Files
List of Figures
Bibliography


Abstract

Medical images, like any other digital data, require compression in order to reduce the disk space needed for storage and the time needed for transmission. Lossless compression methods for still images can shorten the file only to a very limited degree; applying fractal compression to medical images would allow much higher compression ratios, while fractal magnification – an inseparable feature of fractal compression – would be very useful for presenting the reconstructed image in a highly readable form. However, like all irreversible methods, fractal compression brings the problem of information loss, which is especially troublesome in medical imaging. A very time-consuming encoding process, which can last even several hours, is another bothersome drawback of fractal compression. Based on a survey of the literature and the author's own reflections, this thesis attempts to provide a solution, adapted to the needs of medical imaging, that overcomes the unfavorable traits of fractal compression methods. The thesis offers not only theoretical deliberations but also an implementation of the proposed algorithm, which is used to test the suitability of fractal compression for medical imaging. The results of the work are more than satisfying: the fidelity of images compressed with the proposed fractal compression method meets the requirements imposed on medical images, and fractal magnification outperforms other magnification techniques.


Summary

Information systems in hospitals and clinics store an enormous number of examination results. Long-term storage of medical data can bring measurable benefits, because physicians can consult the results of previous examinations and the case history at any time. Hospital databases grow rapidly, which collides with the limited capacity of storage devices, so financial outlays are necessary to extend that capacity. Telemedicine is also developing quickly, and remote surgeries have recently been gaining popularity. During such an operation, one of the specialists, although far away from the operating site, can follow its course in real time thanks to image transmission. Unfortunately, the quality of such an image is limited by the bandwidth of the Internet connection over which the transmission takes place.

Data compression can be of great help both in long-term storage and in the transmission of image data. In the case of medical images, however, where the correct assessment of the patient's condition depends on the quality of the picture, the information carried by the picture is especially important.

The aim of this work is to check whether fractal compression can be applied to medical images. This entails minimizing the drawbacks of fractal compression, the most important of which are the irreversible loss of part of the information carried by the image and the long compression time. Fractal compression also has advantages, the greatest being the possibility of fractal magnification of images. This work therefore tries to answer the question of whether, and how, the drawbacks of fractal compression can be reduced to a level that makes it possible to use it for compressing medical images. Fractal compression is also compared with other lossy compression methods, as well as with other image magnification algorithms. To achieve this goal, the following steps were carried out:

1. a review of fractal compression methods based on the literature
2. a review of techniques for accelerating the compression process
3. implementation of the selected fractal compression method, the one best suited to the medical imaging domain
4. implementation of selected acceleration methods
5. experiments aimed at estimating the amount of information lost and the compression time of the selected method
6. experiments aimed at measuring the impact of the implemented acceleration techniques on compression time
7. comparison of the results obtained with the implemented fractal compression method against other lossy compression methods and magnification algorithms

In effect, it turned out that fractal compression can be successfully applied to the compression of medical images. Compressed ultrasound images retained a fidelity acceptable for medical images while the image data file was shortened ninefold; a losslessly compressed file would be about twice as long.

The time needed for compression was also successfully reduced. Typical images can be compressed with the developed method in no more than a few tens of seconds, and an image of size 256×256 in only about 5 seconds.

Compared with the JPEG algorithm, the developed fractal compression method performs worse in terms of image quality for compression ratios below 14:1, whereas for ratios above 18:1 fractal compression turns out to be better. For ratios between 14:1 and 18:1, objective measures do not indicate unambiguously which method gives the more faithful image. JPEG, however, lacks the capabilities of fractal compression, including the most important one – decompression of the image to an arbitrary size, i.e. fractal magnification.

Fractal magnification was compared with linear and cubic interpolation, but neither of these methods matched it. The fractally magnified image was considerably sharper, and details and edges were better visible. A comparison of the quality of the magnified images using objective measures also unambiguously indicated the superiority of fractal magnification.


Introduction

Background

More and more areas of human life are becoming computerized, which generates a huge and still growing amount of information stored in digital form. All this is possible thanks to technological progress in the registration of different kinds of data. This progress is also visible in the wide field of digital images, which covers scanned documents, drawings, images from digital or video cameras, satellite images, medical images, works of computer graphics and many more.

Many disciplines, such as medicine, e-commerce, e-learning or multimedia, are bound up with a ceaseless interchange of digital images. A live on-line transmission of a sporting event, a surgery with the remote participation of one or more specialists, or a teleconference in a worldwide company are good examples. Such uses of technology related to digital images are becoming very popular nowadays.

Long-term storage of any data can often be very profitable. In medicine, Hospital Information Systems contain a large number of medical examination results. Thanks to them, doctors can familiarize themselves with the case history and make a diagnosis based on many different examination results. Such systems are also very useful for patients, because they gain access to their own medical data. A very good example is IZIP, a Czech system that gives patients Internet access to their health records. These hospital databases are growing rapidly – each day tens or hundreds of images are produced, and most or even all of them are archived for some period.

Both mentioned aspects of digital data – sharing and storage – are linked with problems that restrain progress in new technologies and the growth of their application. When exchanging image data, one wishes to keep the quality at a high level while keeping the time needed for transmission and the disk space needed for storage as low as possible. The increase of throughput in communication links is unfortunately insufficient, and additional solutions must be introduced to satisfy rising expectations and needs.

Collecting any kind of data results in a demand for increased capacity of storage devices. Since the capacity of such devices grows quite fast, almost any demand can be technically satisfied; however, extending the capacity involves expenses that cannot be passed over.

The above-mentioned problems resulted in research on and development of data compression techniques. Over time, many different compression methods, algorithms and file formats have been developed. In still image compression there are many different approaches, and each of them has produced many compression methods; however, each technique proves useful only in a limited area.

Image compression methods are, of course, also much desired or even necessary in medicine. However, medical images require special treatment, because the correctness of diagnosis depends on it. A low-quality medical image, distortions in the image or untrue details may be harmful to human health. Thus, any processing of such images, including compression, should not interfere with the information carried by the images.

Research Aim and Objectives

The aim of the thesis is to investigate whether it is possible to apply fractal compression to medical images and thereby bring the benefits of fractal compression to medical imaging. Fractal compression has some serious drawbacks, such as the information loss problem and the long encoding time, that could thwart plans to use it for medical images. However, it also offers some great features: compression ratios comparable with the JPEG standard, fractal magnification, asymmetric compression, and the absence of most distortion effects. These positive aspects of fractal compression would be very useful in medical imaging.

In the author's opinion, the aim of the thesis can be reached by finding or constructing a fractal compression method that best fulfills the needs and requirements of medical imaging and, at the same time, retains all the advantages of fractal compression. Thus, the correspondence of existing fractal compression techniques to the domain is investigated, and the best-suited methods and techniques are adapted and improved. The following objectives are realized to attain this goal:

• Review and discuss fractal compression methods with special consideration of their suitability to medical imaging.
• Implement the chosen fractal compression algorithms.
• Perform experiments in order to evaluate the amount of information loss in the implemented algorithm.
• Perform experiments in order to acquire information about the time needed for encoding in the chosen (and improved) methods.
• Review speed-up techniques, adapt chosen techniques to the fractal compression method, implement them and measure their impact on the duration of encoding.
• Compare the results of the implemented algorithm with other magnification and lossy compression techniques.

Research Questions

The thesis addresses the following research questions:


Research Question 1: Is it possible, and how, to minimize the drawbacks of the fractal compression method to a satisfying level in order to apply this method to medical imaging?

Research Question 2: Which fractal compression method suits medical images best and gives the best results?

Research Question 3: Does fractal compression preserve image quality better or worse than other irreversible (information-lossy) compression methods?

Research Question 4: Can the results of fractal magnification be better than the results of traditional magnification methods?

Thesis Outline

This work is organized as follows:

Introduction
Chapter 1 explains basic concepts of digital imaging and discusses characteristics and special requirements of medical imaging.
Chapter 2 provides basic information about fractal image compression, preceded by a general description of image compression.
Chapter 3 goes into more detail about fractal compression and characterizes the large scope of existing fractal coding methods. The emphasis is placed on features important for the compression of medical images.
Chapter 4 scrutinizes a variety of attempts to solve one of the most important disadvantages of fractal compression – long encoding time.
Chapter 5 gives a look inside the implemented algorithm.
Chapter 6 presents and discusses the results of experiments performed on the implementation of the proposed fractal compression method.
Conclusions present the discussion of the results and recommendations. The answers to the research questions can be found there.

The first two chapters compose the introductory part of the thesis and can be skipped by knowledgeable readers.


Chapter 1

Digital Medical Imaging

Before medical imaging is discussed, it is necessary to provide some basic information about digital imaging. Digital images are described in the first section, and the next section concentrates on a specific class of digital images – digital medical images. Since this is an introductory chapter, it can be skipped by readers knowledgeable in digital medical imaging; readers who are familiar with digital imaging only may omit just the first section of this chapter.

1.1. Digital Images

1.1.1. Analog and Digital Images

Two classes of images can be distinguished – analog and digital. Both types belong to the nontemporal multimedia category. Analog images are painted or created through a photographic process. During this process, the image is captured by a camera on a film that becomes a negative. We have a positive when the film is developed; no processing is possible from this moment. When the photograph is made on a transparent medium, we are dealing with a diapositive (slide). Analog images are characterized by a continuous, smooth transition of tones: between any two different points of the picture there is an infinite number of tonal values.

It is possible to transform an analog image into a digital one. The digitization process is usually driven by a need for digital processing. The output of digitization is a digital approximation of the input analog image – the analog image is replaced by a set of pixels (points organized in rows and columns), and every pixel has a fixed, discrete tone value. Therefore, the image is no longer a continuous range of tones. The precision and accuracy of this transformation depend on the size of a pixel: the larger the area of the analog image transformed into one pixel, the less precise the approximation. [Gon02]

A digital image can be captured with a digital camera or scanner, or created with a graphic program. The transition from a digital to an analog image also takes place – in such devices as computer monitors, projectors or printing devices.


One can distinguish many different types of digital images. First of all, digital images are divided into recorded and synthesized images. To the first group belong, for example, analog images scanned by a digital scanner; to the second group belong all images created with graphical computer programs – they come into being already as digital images.

A second possible classification of digital images divides them into vector images and raster images. Both of these groups can contain recorded as well as synthesized images. Vector images are mostly created with graphic software. Analog images can be recorded only to a raster image, but they can then be converted to a vector image; the opposite conversion (rasterization) is also possible. Vector images are treated as a set of mathematically described shapes and are most often used for drawings like logos, cartoons or technical drawings. This work concerns only raster graphics, where an image (bitmap) is defined as a set of pixels (picture elements), each filled with a color identified by a single discrete value. This kind of image is usually used for photographic images. [Dal92]

1.1.2. Digital Image Characteristics

Digital images are characterized by multiple parameters.

The first feature of a digital image is its color mode. A digital image can have one of three modes: binary, grayscale or color. A binary (bilevel) image is an image in which there are only two possible values for each pixel. In a grayscale image, each pixel contains only a shade of gray.

As already mentioned, a digital image is a set of pixels. Each pixel has a value that defines its color, and all the pixels are composed into one array. The resolution of a digital image is the number of pixels within a unit of measure [AL99]; typically, it is measured in pixels per inch (ppi). The higher the image resolution, the better its quality. The image resolution can also be understood as the dimensions of the pixel array, specified with two integers [Dal92]:

number of pixel columns × number of pixel rows

Bit depth, also called color depth or pixel depth, stands for how many bits are used to describe the color of each pixel. A higher color depth means that more colors are available in the image but, at the same time, that more disk space is needed to store the image. Monochrome images use only one bit per pixel; grayscale images usually use 8 bits, which gives 256 gray levels. Color images can have a pixel depth of 4, 8 or 16 bits; full color can be achieved with 24 or 32 bits.

Colors can be described in various ways. The next feature of a digital image – the color model – not only specifies how colors are represented but also determines the spectrum of possible pixel colors. The gamut of colors that can be displayed or printed depends on the color model employed; this is why a digital image in a particular color model can use only a portion of the visible spectrum – the portion characteristic for that model. There are many different color models; the most popular are RGB, CMY, CMYK, HSB (HSV), HLS, YUV and YIQ. These color models are divided into two classes: "subtractive" and "additive". CMY (Cyan, Magenta, Yellow) and CMYK (Cyan, Magenta, Yellow, Black) are subtractive models; one of them should be used when color is displayed with printing inks, i.e. in the presence of external light reflected by the printed image. RGB (Red, Green, Blue), one of the additive models, is used when color is displayed by emission of light – e.g. when the image is shown on a computer monitor. The main difference between the two classes is that in subtractive models black is achieved by combining colors, whereas in additive models combining colors produces white. In subtractive models, colors are displayed thanks to the light absorbed (subtracted) by inks; in additive models, colors are displayed thanks to the transmitted (added) light. HSB (Hue, Saturation, Brightness), also called HSV (Hue, Saturation, Value), is more intuitive – the color of a pixel is specified by three values: hue (the wavelength of light), saturation (the amount of white in the color) and brightness (the intensity of the color). Similar to HSB, and also very intuitive, is HLS (Hue, Lightness, Saturation). The YUV color model is part of the PAL television system and contains three components – one for luminance and two for chrominance. YIQ also has one luminance component and two chrominance components; it is used in NTSC television.

Channels are closely related to color models. A channel is a grayscale image that reflects one component of the color model used. Channels have the same size as the original image. Thus an image in RGB has three channels (red, green, blue), an image in CMYK four (cyan, magenta, yellow, black), an image in HSV three (hue, saturation, brightness), a grayscale image only one, and so on. There can be additional channels, called alpha channels, which store information about the transparency of pixels. Therefore, the number of channels, although it partially depends on the color model, is also a feature of a digital image.

Color indexing is another image feature related to the color model. An indexed color model is only an option; it means that the number of colors that can be used is limited to a fixed number (e.g. 256 in GIF) in order to reduce the bit depth and the size of the whole file. Most often, indexing is done automatically according to standard palettes or system palettes. Palettes in different operating systems are not the same – they only partially overlap. From the 216 colors that are common to operating systems, a standard palette was created for the purposes of the World Wide Web. There are also other standard palettes – a palette with 16 colors is commonly used for simple images. Besides indexing to standard or system palettes, there is also adaptive indexing, in which the color space is reduced to a fixed number of colors that most accurately represent the image; not all colors needed by the image must be indexed, but they can be. The difference is that adaptive indexing requires the definitions of the palette colors to be stored at the beginning of the file, whereas standard palettes do not have to be attached. [AL99]

File format is the next characteristic of a digital image. A digital image can be stored in one of many file formats. Some formats are bound to one specific program, but there are also common formats understood by different graphic programs. There is a very close relation between file formats and compression: images stored in a particular format are usually compressed in order to reduce the size of the file. Each format supports one or a few compression methods; there are also formats that store uncompressed data. [Dal92]

The last characteristic of a digital image is the compression method used to reduce the size of the file containing the image.
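To make the relation between resolution, bit depth and storage size concrete, here is a minimal sketch (illustrative Python, not part of the thesis) that computes the uncompressed size of a raster image:

def raw_image_size_bytes(columns: int, rows: int, bit_depth: int) -> int:
    # One value of bit_depth bits per pixel; round up to whole bytes.
    return (columns * rows * bit_depth + 7) // 8

# A 2048 x 2048 X-ray image stored with 12 bits per pixel:
print(raw_image_size_bytes(2048, 2048, 12))  # 6291456 bytes (6 MiB)

This raw size is what the compression methods discussed later try to reduce.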


Figure 1.1. Examples of X-ray images: (a) human hand (from [Wei98]); (b) human chest (from [Wei98]); (c) human knee (from [Pin04]); (d) human skull (from [Pin04]); (e) human neck (from [Pin04]).

In conclusion, there is more than one way to reduce the amount of disk space needed to store a digital image. The most obvious one is compression, but there are also simpler ways, like reducing the image resolution, decreasing the number of colors, or introducing an index into the color palette used.

1.2. Medical Images

Medical imaging came into being in 1895, when W. K. Roentgen discovered X-rays. This invention was a great step forward for non-invasive diagnostics and was rewarded with the Nobel Prize in 1901. With time, other discoveries were made in the field of medical imaging that, like X-rays, support medicine and make more accurate and effective diagnosis possible. It is not feasible to list all of these discoveries and inventions, or to describe all types of medical images; thus, only the most important discoveries will be mentioned, with a characterization of the images that these technologies produce.

Figure 1.2. Example of a Computerized Tomography image (from [Gon02])

Although X-rays were discovered over a century ago, they are still in common use. During an examination, the patient is placed between an X-ray source and a detector. Different tissues absorb X-rays with different strength, so the X-rays that have passed through the patient have different energy depending on what tissues they ran into: dense tissues, e.g. bones, block the X-rays, while soft tissues give them little resistance. Parts of the detector that lie behind tissue absorbing 100% of the X-rays produce white areas on the image; the "softer" a tissue is, the darker the image becomes in the parts that represent it. Many different X-ray detectors can be used during a medical examination. They can be divided into two classes. One class contains detectors, like the photographic plate, that give analog images; these can be transformed into digital images by the process of digitization. The second class consists of devices that directly produce digital images. The most familiar detectors in the second class are photostimulable phosphors (PSPs), direct semiconductor detectors, and combinations of a scintillator with semiconductor detectors.

In 1971, G. Hounsfield built the first computerized tomograph (computerized axial tomograph) – an X-ray machine that produces a set of two-dimensional images (slices) which together represent a three-dimensional object. For this invention, G. Hounsfield was awarded the Nobel Prize in 1979. Pictures created during tomography are called tomograms, and they form a specific class of X-ray images. There are also other classes, for example mammography images.

Apart from X-rays, other technologies are also used in medical imaging. Gamma-ray imaging is used in the field of nuclear medicine. In contrast to X-rays, there is no external source of gamma rays: a radioactive isotope, which emits gamma rays during decay, is administered to the patient, and the gamma radiation is then measured with a gamma camera (gamma-ray detectors). The most popular applications of gamma rays in medical diagnosis are the bone scan and positron emission tomography (PET). A bone scan with gamma rays can detect and locate pathologies like cancer or infections. PET generates a sequence of images that, as in X-ray tomography, represent a 3-D object.

Figure 1.3. Examples of gamma-ray images (from [Gon02])

Medical imaging also employs radio waves. Magnetic resonance imaging (MRI) is a technique in which short pulses of radio waves penetrate the patient. Each such pulse entails a response pulse of radio waves generated by all tissues, and different tissues emit pulses of different strength. The strength and source of each response pulse are calculated, and a 2-D image is created from all the gathered information.

Ultrasound imaging in medical diagnostics constitutes ultrasonography. An ultrasound system consists of a source and a receiver of ultrasound, a display and a computer. High-frequency sound, from 1 to 5 MHz, is sent into the patient. Boundaries between tissues partially reflect the signal and partially allow it to pass, which means that the waves can be reflected at various depths. The ultrasound receiver detects each reflected signal, and the computer calculates the distance between the receiver and the tissue whose boundary reflected the waves. The determined distances to tissues and the strengths of the reflected waves are presented on the display, i.e. they constitute a two-dimensional image. Such an image typically contains information about millions of ultrasound signals and is updated every second.
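The distance computation just described reduces to one formula: the echo's round-trip time multiplied by the propagation speed of sound, halved, because the pulse travels to the boundary and back. A minimal illustration – the 1540 m/s value is a textbook average for soft tissue, not a figure taken from the thesis:

SPEED_OF_SOUND_TISSUE = 1540.0  # m/s, average for soft tissue (assumed value)

def echo_depth_m(echo_delay_s: float) -> float:
    # Halved because the measured delay covers the path there and back.
    return SPEED_OF_SOUND_TISSUE * echo_delay_s / 2.0

print(echo_depth_m(65e-6))  # a 65-microsecond echo comes from ~5 cm deep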

There are also other medical imaging techniques that were not described here; the most important are Optical Transmission and Transillumination Imaging, Positron Emission Tomography (Nuclear Medicine), Optical Fluorescence Imaging, and Electrical Impedance Imaging.

This review of medical imaging techniques unveils the large diversity of medical image classes and of technology used in medical diagnosis. Nevertheless, all these images have some common characteristics.

Figure 1.4. Examples of magnetic resonance images (from normartmark.blox.pl, 21.09.2007).

All the above-mentioned classes of medical images are characterized by a very restricted size. Although there are color medical images, most are monochromatic. Images from different classes have different sizes. The largest are X-ray images, which can measure up to 2048 pixels vertically and horizontally. Other medical images are much smaller: for example, Computerized Tomography images are smaller than 512×512 pixels, Magnetic Resonance images are up to 256×256 pixels, and USG images are 700×500 or less. [Sta04]

Medical images also have a limited bit depth (the number of bits used to describe the color of a single pixel). X-ray images have a bit depth of 12 bits and USG images only 8 bits. The matter is not so clear with Magnetic Resonance images: the image format used here can store $2^{16}$ tones of gray (bit depth 16) but, in fact, there are far fewer tones – about $2^9$ (bit depth 9). [Sta04]

There are also other, more important issues that distinguish medical images from others. Medical images form a particular class of digital images in which the information carried is extremely important: high fidelity of compression and of any other processing is required, or the diagnosis could be erroneous. The loss of information may mislead not only when a physician personally examines the image, but also when software is used to analyze it.

The receiver operating characteristic (ROC) analysis is an evaluation method used to measure the quality and diagnostic accuracy of medical images. It is performed by trained observers who rate the perceptible loss of information. For each medical image type, the analysis gives the maximal compression ratio at which the fidelity of the images still meets the expectations of the observers. For sample image types, the ratios are [Kof06, Oh03]:

• Ultrasonography: 9:1
• Chest radiography: 40:1, 50:1 – 80:1 (JPEG2000)
• Computed Tomography: 9:1 (chest), 10:1 – 20:1 (head)
• Angiography: 6:1
• Mammography: 25:1 (JPEG2000)
• Brain MRI: 20:1

Figure 1.5. Example of USG image.

Information loss should be avoided during processing, but the quality of presentation of the image, especially of its most important details, is also very important. One should care about the faithfulness of the image not only when it is presented at scale 1:1. Due to the small resolutions of medical images, their physical size on a display device will also be rather small. Because of this, it is difficult for a physician to perform measurements by hand during diagnosis, or even to read the image. Thus, magnification of the image is often very desirable, and this means that a zoomed-in image should also be maximally true, legible and clear.

If it were certain that images will never be magnified, probably the best choice for a compression method would be one of the lossless methods. This group of compression techniques assures that no information is lost during the encoding and decoding processes; the image recovered from a compressed file is exactly the same as the original image.

Fractal compression has one large advantage over lossless methods – it enables fractal magnification, which gives much better effects than traditional magnification algorithms, e.g. nearest neighbor, bilinear interpolation or even bicubic interpolation. Fractal magnification is actually the same process as fractal decompression – an image encoded with the fractal method can be decompressed to an arbitrarily given size. An image compressed with one of the lossless methods must be subjected to an interpolation algorithm if it is to be magnified. This means that although the compression algorithm did not cause any distortion to the image, the interpolation algorithm will introduce some faults: block effects, pixelization or blurring may appear. Fractal compression makes it possible to keep the distortion rate at a much lower level, and the image remains sharp regardless of the size to which it is magnified.

Fractal magnification is not the only quality of fractal compression.

As opposed to most other compression methods, fractal coding is asymmetric. On the one hand, this is a drawback, because encoding lasts much longer than in other methods. At the same time, it is an advantage, because the decoding process is very fast – it usually takes less time to decode an image with the fractal method than to read the same image, uncompressed, from the hard drive. This feature is useful when the image must be sent through the Internet: the transmission time is shorter, because the image representation encoded with the fractal method (a lossy algorithm) is shorter than with any lossless method, and there are no significant additional time costs caused by decoding.

Another feature of fractal compression that attracts attention is the magnitude of the compression ratios that can be achieved with this method. Since it is a lossy method, it gives a much smaller compressed file than any lossless compression algorithm. However, medical images cannot be compressed with too high a compression ratio, because the loss of information can turn out to be too high.
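To make the comparison with traditional magnification concrete, the following sketch shows bilinear interpolation, one of the interpolation algorithms mentioned above (illustrative code, not the thesis implementation; the grayscale image is a list of rows):

def bilinear_magnify(img, factor):
    h, w = len(img), len(img[0])
    out_h, out_w = int(h * factor), int(w * factor)
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            sy, sx = y / factor, x / factor           # position in source coordinates
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = sy - y0, sx - x0
            top = (1 - dx) * img[y0][x0] + dx * img[y0][x1]
            bottom = (1 - dx) * img[y1][x0] + dx * img[y1][x1]
            out[y][x] = (1 - dy) * top + dy * bottom  # weighted average of 4 neighbors
    return out

print(bilinear_magnify([[0, 255], [255, 0]], 2)[1])  # averaged values blur the sharp edge

The averaging step is exactly what blurs edges in the enlarged image; fractal decoding avoids it by re-applying the stored transformations at the larger size.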


Chapter 2

Image Compression

This chapter introduces the reader to the field of image compression (section 2.1) and provides a general explanation of fractal compression (section 2.2). Readers who are familiar with these topics may skip the entire chapter or parts of it.

2.1. Fundamentals of Image Compression

A compression method consists of definitions of two complex processes: compression and decompression. Compression is a transformation of the original data representation into a different representation characterized by a smaller number of bits. The opposite process – reconstruction of the original data set – is called decompression.

Two types of compression can be distinguished: lossless and lossy. In lossless compression methods, the data set reconstructed during decompression is identical to the original data set. In lossy methods, the compression is irreversible – the reconstructed data set is only an approximation of the original one. At the cost of lower conformity between the reconstructed and original data, better compression effectiveness can be achieved. A lossy compression method is called "visually lossless" when the loss of information caused by compression and decompression is invisible to an observer (during presentation of the image in normal conditions). However, the assessment of whether the compression of an image is visually lossless is highly subjective. Besides that, the visual difference between the original and decompressed images can become visible when the observation circumstances change. In addition, further processing of the image, like image analysis or noise elimination, may reveal that the compression actually was not lossless.

There are many ways to calculate the effectiveness of compression. The factor most often used for this purpose is the compression ratio (CR), which expresses the ability of the compression method to reduce the amount of disk space needed to store the data. CR is defined as the number of bits of the original image ($B_{org}$) per one bit of the compressed image ($B_{comp}$):

$$CR = \frac{B_{org}}{B_{comp}}$$


The compression percentage (CP) serves the same purpose:

$$CP = \left(1 - \frac{1}{CR}\right) \cdot 100\%$$

Another measure of compression effectiveness is the bit rate (BR): the average number of bits in the compressed representation per element (symbol) of the original data set. High effectiveness of a compression method manifests itself in a high CR and CP but a low BR. When the time needed for compression is also important, a different factor must be used – the product of time and bit rate. Only the most commonly used factors are mentioned here; there are many more ways to estimate effectiveness.
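A minimal sketch of these effectiveness measures (illustrative, not the thesis implementation):

def compression_ratio(bits_original: int, bits_compressed: int) -> float:
    return bits_original / bits_compressed

def compression_percentage(cr: float) -> float:
    return (1.0 - 1.0 / cr) * 100.0

def bit_rate(bits_compressed: int, n_symbols: int) -> float:
    # Average number of compressed bits per original symbol (e.g. per pixel).
    return bits_compressed / n_symbols

# A 256 x 256, 8-bit image compressed to 58 254 bits (hypothetical numbers):
cr = compression_ratio(256 * 256 * 8, 58254)
print(cr)                          # ~9.0, i.e. a CR of about 9:1
print(compression_percentage(cr))  # ~88.9 %
print(bit_rate(58254, 256 * 256))  # ~0.89 bits per pixel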

2.1.1. Lossless Compression

Most lossless image compression methods are adapted universal compression techniques. Lossless compression converts an input sequence of symbols into an output sequence of codewords, so it is nothing other than a coding process. One codeword usually corresponds to one element (symbol) of the original data; in stream coders, it corresponds to a sequence of symbols. The codewords can have fixed or variable length. Decompression, of course, is decoding of the code sequence: the output of decoding in lossless compression is the same as the input of the coding process. The division of the stream to be encoded into parts bound to codewords is unambiguous.

A lossless compression method comprises two phases – modeling and coding. Creation of a method boils down to specifying how those two phases should be realized.

The modeling phase builds a model for the data to be encoded which best describes the information contained in this data. The choice of the modeling method for a particular compression technique depends to a large extent on the type of data to be compressed, but it always concentrates on recognition of the input sequence, its regularities and similarities [Deo03]. The model is a different, simpler representation of the original data that eliminates the redundancy [Prz02].

The coding phase is based on statistical analysis and strives for the shortest binary code for the sequence of symbols obtained from the modeling phase [Prz02]. In this phase, analytical tools from information theory are commonly used [Prz02]; typically, entropy coding is applied at this stage [Deo03].

Not all compression methods can be divided into these two stages. There are older algorithms, like the Ziv-Lempel algorithms, that escape this classification. [Deo03]

Three groups are distinguished among lossless compression methods:
• entropy-coding methods,
• dictionary-based methods,
• prediction methods.

In the first group – entropy coding methods – a great number of compression techniques can be found, for example Shannon-Fano coding, Huffman coding, Golomb coding, unary coding, truncated binary coding, and Elias coding. Among entropy coding methods, arithmetic methods can also be found, e.g. range coding.

Dictionary-based methods include, for example, Lempel-Ziv-Welch (LZW) coding, LZ77 and LZ78, the Lempel-Ziv-Oberhumer algorithm, and the Lempel-Ziv-Markov algorithm.
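A minimal sketch of the dictionary-based idea in the LZW style (illustrative, not tied to any particular implementation): the dictionary grows while the input is scanned, and each emitted codeword stands for the longest phrase already present in the dictionary.

def lzw_encode(data: str):
    dictionary = {chr(i): i for i in range(256)}  # start with all single bytes
    current, codes = "", []
    for ch in data:
        if current + ch in dictionary:
            current += ch                               # extend the known phrase
        else:
            codes.append(dictionary[current])           # emit code for known phrase
            dictionary[current + ch] = len(dictionary)  # learn the new phrase
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes

print(lzw_encode("ababababab"))  # [97, 98, 256, 258, 257, 98]: 6 codes for 10 symbols

Repetitive input builds ever longer dictionary phrases, which is why such methods work well on data with recurring patterns.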


Prediction methods have recently gained some popularity; examples are the JPEG-LS and lossless JPEG2000 algorithms.

Lossless compression is bounded by a limitation shown by information and coding theory: the average length of a codeword cannot be smaller than the entropy (expressed in bits) of the information source. The closer a compression technique comes to this limit, the better the compression ratio that can be achieved, and no lossless compression method can go beyond this limit. The basic concepts of information theory are explained below.

Information is a term that actually has no precise mathematical definition in information theory; it should be understood in the colloquial way and treated as indefinable. Information should not be confused with data (data build information) or with a message (transmitted information). Although there is no definition, it is possible to measure information. The amount of information is calculated with the following equation:

$$I(u_i) = \log_{bs} \frac{1}{p_i}$$

where $p_i$ is the probability that the symbol $u_i$ will occur in the source of information. This equation measures the information related to the occurrence of a single symbol in a probabilistic source of information. The unit of this information measure depends on the base $bs$ of the logarithm: when $bs = 2$ the unit is the bit, when $bs = 3$ the trit, when $bs = e$ (natural logarithm) the nat, and the last unit – the hartley – is used when $bs = 10$.

Entropy is a different measure of information – it describes the amount of information carried by a stream of symbols. According to Shannon's definition, the entropy is the average amount of information $I(u_i)$ over all symbols $u_i$ that build the stream. So when the data $U = u_1, u_2, \ldots, u_U$ constitute the information, the entropy can be calculated from:

$$H(U) = \sum_{i=1}^{U} p(u_i) \cdot I(u_i) = \sum_{i=1}^{U} p(u_i) \cdot \log_{bs} \frac{1}{p(u_i)} = -\sum_{i=1}^{U} p(u_i) \cdot \log_{bs} p(u_i)$$

The above formulas are correct only when the emission of a symbol by the source is independent of past symbols – i.e. when the source is memoryless. Other types of sources, e.g. sources with memory or finite-state machine sources like Markov sources, require changes in these formulas.
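A minimal sketch of these two measures with bs = 2 (illustrative): the entropy is estimated from symbol frequencies, treating the stream as a memoryless source.

from collections import Counter
from math import log2

def information_bits(p: float) -> float:
    return log2(1.0 / p)  # I(u) for a symbol of probability p

def entropy_bits(stream: str) -> float:
    counts = Counter(stream)
    n = len(stream)
    # H = -sum p(u) * log2 p(u), averaged over the whole stream
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(information_bits(0.25))    # 2.0 bits for a symbol of probability 1/4
print(entropy_bits("aaaabbcd"))  # 1.75: no lossless code can average fewer bits/symbol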

2.1.2. Lossy Compression

The limitation of the effectiveness of lossless compression techniques brought about demand for a different approach to compression, one that would give better compression ratios. Better effectiveness can be achieved only by giving up the reversible character of the encoding process. Lossy compression methods reduce the information of the image to be encoded down to some level that is acceptable in a particular application field. This means that, apart from the characteristics of a compression method known from lossless techniques – compression ratio and time needed for encoding and decoding – one more occurs in lossy methods: the distortion rate. By distortion rate, one should understand the distance between the original image and the image reconstructed in the decoding process.
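The distortion rate requires a concrete distance measure between the two images. A common choice (used here purely for illustration; other measures exist) is the mean squared error over all pixels:

def mean_squared_error(original, reconstructed) -> float:
    n, total = 0, 0.0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            n += 1
    return total / n

print(mean_squared_error([[0, 255]], [[2, 250]]))  # (4 + 25) / 2 = 14.5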


Figure 2.1. General scheme for lossy compression.

In lossy compression algorithms, two obligatory phases can be distinguished: quantization and lossless compression. This means that quantization is the key issue for lossy methods. Before quantization, one more phase can be found – decomposition – which is optional but very frequently used, because it allows one to create more effective quantization algorithms.

The goal of the decomposition is to build a representation of the original data that enables more effective quantization and encoding phases. The basic way to achieve this goal is to reduce the length of the representation compared to the original data. Although the decomposition phase is optional, it exists in every practical implementation of lossy compression. Before quantization proceeds, decomposition reduces the redundancy and correlation of symbols (pixel values) in the stream to be encoded. A combination of decomposition with simple quantization results in very good effectiveness with much lower complexity and encoding/decoding time.

There are many different ways to perform the decomposition; the most popular are:
• frequency transforms,
• wavelet transforms,
• fractal transforms.

The quantization reduces the number of symbols of the alphabet that will be used by the intermediary representation of the stream to be encoded. This means that the information carried by the image is partially lost in this phase. Compression methods often allow adjusting the level of information loss – the lower the entropy, the shorter the encoded stream. Thus, the quantization is the most important phase in all practical realizations of lossy compression, because it determines the compression ratio, the quality of the recovered image and the amount of information lost during encoding.

The quantization in lossy compression techniques can be compared to the digitization of an analog image, where a set of continuous values is replaced by a representation with a limited number of discrete quantization levels. When it comes to compression, the image is already represented by a set of discrete values, but this set is replaced by a smaller one that best keeps the information of the original image.


The dequantization is inseparably bound up with approximation, because the exact reconstruction of all symbols encoded with lossy compression is impossible.

Two types of quantization are used in lossy compression methods – scalar quantization and vector quantization. The difference between the two lies in the elementary unit of symbols being processed. In scalar quantization, this unit is a single symbol, while in vector quantization it consists of some number of successive symbols – a vector of symbols. Both methods can employ regular or irregular interval lengths.

Figure 2.2. Regular scalar quantization.
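As a concrete illustration of regular scalar quantization, the following sketch (Python/NumPy; the helper names are this document's illustration) maps every value to the index of a fixed-width interval and, on dequantization, approximates it by the interval midpoint:

    import numpy as np

    def quantize(x, step):
        # Regular scalar quantization: replace each value with the index of
        # its interval; this is where information is irreversibly lost.
        return np.floor(x / step).astype(int)

    def dequantize(q, step):
        # Dequantization can only approximate: here by interval midpoints.
        return (q + 0.5) * step

    pixels = np.array([0, 13, 14, 100, 200, 255])
    q = quantize(pixels, step=16)        # 256 gray levels -> 16 levels
    print(dequantize(q, step=16))        # [  8.   8.   8. 104. 200. 248.]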

The quantization can be executed in an adaptive manner; the adaptation can go forward or backward. In forward adaptation, the input stream is divided into pieces with similar statistical characteristics, e.g. variance, and a quantizer is built separately for each piece. This results in better quantization of the entire input stream at the cost of greater computational complexity and of enlarging the encoded stream with an attached description of the quantizer.

The backward method of adaptive quantization builds the quantizer based on data already processed during the quantization. This method does not require attaching any additional information about the quantization to the encoded stream.

The last phase of lossy compression methods is de facto a complete lossless compression method, to which the output of the quantization is passed as the input stream to be encoded. A large variety of lossless methods is used in different lossy compression methods. Any type of lossless method can be used here, but it must be chosen with respect to the decomposition and quantization techniques.

Any phase of the above-described scheme can be static or adaptive. An adaptive version usually leads to increased effectiveness at the cost of higher algorithmic complexity.

As was mentioned at the beginning of this section, the compression ratio in lossy techniques is not limited by the entropy of the original stream. The entropy of the encoded stream can be reduced if a higher compression ratio is required. Nevertheless, decreased entropy entails higher distortion. Rate-distortion theory, outlined below, is very helpful here.


Figure 2.3. Compression system model in rate-distortion theory.

Rate-distortion theory answers the question: what is the minimal entropy of the encoded stream that suffices to reconstruct the original image without exceeding a given distortion level? The notation used to explain the theory is introduced in figure 2.3, where the bit rate is marked with R. The theory allows determining the boundaries of the compression ratio in lossy compression methods. According to rate-distortion theory, the bit rate BR (average bit length per symbol) is related to the distortion by the following dependency:

BR(D_{max}) = \min_{D(X, \hat{X}) \le D_{max}} I_m(X, \hat{X})

I_m in the above equation denotes the mutual information – the average information that the random variables (here X and X̂) convey about each other:

I_m(X, \hat{X}) = H(X) - H(X|\hat{X}) = H(\hat{X}) - H(\hat{X}|X)

= \sum_{x_i} \sum_{\hat{x}_i} f_{X,\hat{X}}(x_i, \hat{x}_i) \cdot \log \frac{f_{X,\hat{X}}(x_i, \hat{x}_i)}{f_X(x_i) \cdot f_{\hat{X}}(\hat{x}_i)} = \sum_{x_i} \sum_{\hat{x}_i} f_X(x_i) \cdot f_{\hat{X}|X}(\hat{x}_i|x_i) \cdot \log \frac{f_{\hat{X}|X}(\hat{x}_i|x_i)}{f_{\hat{X}}(\hat{x}_i)}

The random variable X describes the original data set and X̂ represents the reconstructed data set. f_X(x_i) is the occurrence probability of a given symbol. f_{X̂|X}(x̂_i|x_i) is the conditional probability that a given symbol will occur in the source X̂ under the condition that some symbol has occurred in the source X. The values f_X(x_i) are defined by the statistics of the information source, whereas the values f_{X̂|X}(x̂_i|x_i) characterize the compression method. The mutual information has the following properties:

0 \le I_m(X; \hat{X})

I_m(X; \hat{X}) \le H(X)

I_m(X; \hat{X}) \le H(\hat{X})

The distortion per symbol can be measured with the Hamming distance or another measure, e.g. d(x_i, x̂_i) = (x_i − x̂_i)^2 or d(x_i, x̂_i) = |x_i − x̂_i|. Independently of the chosen measure, the distortion d has the following properties:

d(x_i, \hat{x}_i) \ge 0

d(x_i, \hat{x}_i) = 0 \quad \text{when } x_i = \hat{x}_i


The value D expresses the average distortion for an image and is given by the equation:

D(X, \hat{X}) = E\left\{ d(X, \hat{X}) \right\} = \sum_{x_i} \sum_{\hat{x}_i} f_{X,\hat{X}}(x_i, \hat{x}_i) \cdot d(x_i, \hat{x}_i)

The formulas presented above state that, under the criterion that the average distortion must not be greater than a given value D_max, the minimal bit rate is equal to the greatest lower bound of the average mutual information. To find such a compression method, characterized by f_{X̂|X}(x̂_i|x_i), one has to minimize the amount of information about the random variable X carried by the random variable X̂ for a distortion level D not greater than D_max. The relationship between bit rate and distortion level is visualized in figure 2.4.

Figure 2.4. The relationship between bit rate and distortion in lossy compression.

There are many ways to measure the quality of a reconstructed image obtained with a given compression method. Probably the two most popular measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR), defined by the following formulas:

MSE = \frac{1}{M \cdot N} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left[ X(m,n) - \hat{X}(m,n) \right]^2

PSNR = 10 \cdot \log_{10} \frac{\max_X^2}{MSE}

where X is the original image, X̂ the reconstructed image, M the number of pixels in a row, N the number of pixels in a column, and max_X = 2^{bitd} − 1 the maximal possible pixel value of the image X (bitd – bit depth). PSNR is expressed in decibels (dB).
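Both measures translate directly into code; the following sketch (Python/NumPy, illustrative names) implements the two formulas for an image of bit depth bitd:

    import numpy as np

    def mse(X, X_hat):
        return np.mean((X.astype(float) - X_hat.astype(float)) ** 2)

    def psnr(X, X_hat, bitd=8):
        max_x = 2 ** bitd - 1                  # maximal possible pixel value
        return 10 * np.log10(max_x ** 2 / mse(X, X_hat))

    # Example: an 8-bit image against a noisy reconstruction.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 256, size=(64, 64))
    X_hat = np.clip(X + rng.normal(0, 5, X.shape), 0, 255)
    print(psnr(X, X_hat))                      # roughly 34 dB for sigma = 5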

2.2. Fractal Compression

Fractal compression methods, which belong to the lossy methods, distinguish themselves from other techniques by a very innovative theory. To some extent, fractal compression diverges from the basic scheme of lossy compression methods described above.


(a) The picture of Lenna. (b) Self-similarity in Lenna's picture.

Figure 2.5. Self-similarity in real images (from einstein.informatik.uni-oldenburg.de/rechnernetze/fraktal.htm, 19.01.2008).

The most important part of this theory is that parts of an image are approximated by different parts of the same image (the image is self-similar). This assumption makes it possible to treat the image as a fractal. According to B. Mandelbrot [Man83], the "father of fractals", a fractal is

A rough or fragmented geometric shape that can be subdivided in parts, each of which is (at least approximately) a reduced-size copy of the whole.

A fractal is a geometric figure with infinite resolution and some characteristic features. The first of them is the already mentioned self-similarity. Another one is the fact that fractals are described with a simple recursive definition and, at the same time, cannot be described in the language of traditional Euclidean geometry – they are too complex. As a consequence of their self-similarity, fractals are scale independent – a change of size causes the generation of new details. Fractals have plenty of other very interesting properties; nevertheless, they are not necessary for understanding fractal compression theory and will not be explained here.

that is very similar to the image to be compressed. The distance between the imagegenerated from this description and the original image shows how large informationloss is. Although fractal compression is based on an assumption that the image can betreated a fractal, there are some divergence from above-presented fragments of fractaltheory. In fractal compression self-similarity of the image is loosen – it is assumed thatparts of the image are similar to other parts and not to whole image.All other properties of fractals remain valid for an image encoded with a fractal

compression method. The image can be generated in any size, smaller or larger thanthe original. Quality of reconstructed image will be the same in all sizes, and edgesalways will have same sharpness. The number of details can be adjusted by changingthe number of iterations for the recursive description of the image.


Fractal theory says that the recursive description of a complex shape shall be simple. Any photographic-like image is very complex, so if such an image can be described as a fractal then a great compression ratio shall be achieved.

The fractal description of an image consists of a system of affine transformations. This system is called the fractal operator and has to be convergent.

2.2.1. Decoding

Iterated Function System

Fractal compression is based on IFS (Iterated Function Systems) – one of many ways to draw fractals. An IFS uses contractive affine transformations.

By a transformation, one should understand an operation that changes the position of points belonging to the image. If the space of digital images is marked with F and a metric with d, then the pair (F, d) constitutes a complete metric space. Nonempty compact subsets of F are the points of the space. In this space, a transformation means a function w : F → F.

A transformation w is contractive when it satisfies the Lipschitz condition, i.e. there is a real number 0 < λ < 1 such that for any x, y ∈ F, d(w(x), w(y)) ≤ λ d(x, y), where d(x, y) denotes the distance between points x and y.

A transformation is affine when it preserves certain properties of the geometric objects exposed to it. The constraints that make a transformation affine are:
• preservation of collinearity – lines are transformed into lines; the images of points (three or more) that lie on a line are also collinear,
• preservation of the ratios of distances between collinear points – if points p1, p2, p3 are collinear then

\frac{d(p_2, p_1)}{d(p_3, p_2)} = \frac{d(w(p_2), w(p_1))}{d(w(p_3), w(p_2))}

Affine transformations are combinations of three basic transformations:
• shear (enables rotation and reflection),
• translation (movement of a shape),
• scaling/dilation (changing the size of a shape).
A single transformation may be described with the following equation:

w_i \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_i & b_i \\ c_i & d_i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}

The coefficients a, d determine the dilation and the coefficients b, c the shear; e and f specify the translation. The variables x, y are the coordinates of the point (pixel) currently being transformed.

To generate a fractal, several transformations are needed. These transformations form the fractal operator W, often called Hutchinson's operator:

W = \bigcup_{i=1}^{W} w_i

An Iterated Function System is defined by a complete metric space (F, d) and an operator W.


Figure 2.6. Generation of the Sierpiński triangle. The first four iterations and the attractor.

The Banach fixed point theorem guarantees that, in the complete metric space F, the operator W has a fixed point A, called the attractor: W(A) = A. The attractor can be reached from any starting image through iterations of W; the images produced in the iterations are successive approximations of the attractor.

A = \lim_{i \to \infty} W^{\circ i}(X_0), \quad \text{where } W^{\circ i} = W \circ W^{\circ (i-1)} \text{ and } X_0 \in F

Thus the fixed point of a fractal described with an IFS can be found through a recursive algorithm. An arbitrary image X_0 (X_0 ∈ F) is put on the input and processed with a given number of iterations. In each iteration, the whole output image from the previous iteration is subjected to all transformations in the operator (Deterministic IFS):

X_r = W(X_{r-1}) = \bigcup_{i=1}^{W} w_i(X_{r-1})

where X_r means the image produced in iteration r, or the initial image when r = 0.

There is a second version of this algorithm in which a starting point x_0 (x_0 ∈ X_0) is picked at the beginning. In each iteration, a randomly chosen transformation is applied to the point from the previous iteration (Random IFS).

In figure 2.6, several iterations of the deterministic IFS are shown. In each picture, the dashed square contains the image that will be found on the input in the next iteration. The squares with solid lines represent the transformations – the image from the previous iteration is rescaled and moved to fit each square.


The Sierpiński triangle is described with only three transformations:

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix}

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0.25 \\ 0.5 \end{bmatrix}

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0.5 \\ 0 \end{bmatrix}

The first transformation corresponds to the bottom left square, the second to the top square, and the last to the bottom right one.

Iterated Function Systems allow constructing very interesting fractals, e.g. the Koch curve, the Heighway dragon curve, the Cantor set, the Sierpiński triangle, and the Menger sponge. Some fractals that can be drawn with an IFS imitate nature quite well, e.g. the Barnsley fern. The Barnsley fern (see figure 2.7) is described by an operator with four transformations:

curve, Heighway dragon curve, Cantor set, Sierpinski triangle, and Menger sponge.Some fractals, which can be drawn with IFS, quite well imitate nature, e.g. Barnsleyfern. The Barnsley fern (see figure 2.7) is described by an operator with four transfor-mations:

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0.85 & 0.04 \\ -0.04 & 0.85 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 1.6 \end{bmatrix}

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -0.15 & 0.28 \\ 0.26 & 0.24 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0.44 \end{bmatrix}

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0.2 & -0.26 \\ 0.23 & 0.22 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 1.6 \end{bmatrix}

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0.16 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix}

Figure 2.7. Barnsley fern
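The fern can be drawn with the Random IFS algorithm described above. In the sketch below (Python/NumPy), the four maps are the ones just listed; the selection probabilities are an assumption of this illustration (the classic weights, which strongly favor the first map), since the text gives only the coefficients:

    import numpy as np

    # Each entry: (matrix A, translation t, assumed selection probability).
    maps = [
        (np.array([[0.85, 0.04], [-0.04, 0.85]]), np.array([0.0, 1.6]), 0.85),
        (np.array([[-0.15, 0.28], [0.26, 0.24]]), np.array([0.0, 0.44]), 0.07),
        (np.array([[0.2, -0.26], [0.23, 0.22]]), np.array([0.0, 1.6]), 0.07),
        (np.array([[0.0, 0.0], [0.0, 0.16]]), np.array([0.0, 0.0]), 0.01),
    ]

    def random_ifs(n_points=50_000, seed=1):
        # Random IFS (chaos game): apply a randomly chosen transformation
        # to the point produced in the previous iteration.
        rng = np.random.default_rng(seed)
        probs = [p for _, _, p in maps]
        pts = np.empty((n_points, 2))
        point = np.zeros(2)                    # arbitrary starting point x0
        for k in range(n_points):
            A, t, _ = maps[rng.choice(len(maps), p=probs)]
            point = A @ point + t
            pts[k] = point
        return pts                             # approximates the attractor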


Partitioned Iterated Function System

Fractal compression uses PIFS (Partitioned Iterated Function Systems), a modified version of IFS. In IFS, one could specify the operator by the number of affine transformations and the set of coefficients of Hutchinson's operator. In PIFS, the operator includes two additional coefficients for each transformation, which determine the contrast and brightness of the images generated by the transformations. The most important difference between IFS and PIFS is that in IFS all transformations take the whole image from the previous iteration on input, while in PIFS it is possible to specify what part of the image should be processed – transformations can take different parts of the image on input. These two additional features give enough power to decode grayscale images from a description consisting of the fractal operator alone.

The fragment of the space that is put on the input of a transformation is called a domain. Each transformation in PIFS has its own domain D_i and transforms it into a range R_i. The equivalent of Hutchinson's matrix in IFS is in PIFS the following system:

w_i \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_i & b_i & 0 \\ c_i & d_i & 0 \\ 0 & 0 & s_i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \\ o_i \end{bmatrix}

In PIFS, the variable z is the brightness function over a given domain (for each pair x, y there is exactly one brightness value):

z = f(x, y)

The two new coefficients in PIFS are introduced to operate on the z variable: s_i specifies the contrast and o_i the brightness.
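As an illustration, one PIFS transformation can be applied to a grayscale image as in the sketch below (Python/NumPy; the names are hypothetical). The 2b × 2b domain is spatially contracted to b × b by pixel averaging, and s and o act on the z values of the resulting block; one decoding iteration applies every transformation of the operator in this way to the image from the previous iteration:

    import numpy as np

    def apply_pifs_transform(prev, out, dom_xy, rng_xy, b, s, o):
        # Contract the 2b x 2b domain block to b x b by averaging 2x2 groups.
        dx, dy = dom_xy
        D = prev[dy:dy + 2 * b, dx:dx + 2 * b].astype(float)
        C = 0.25 * (D[0::2, 0::2] + D[1::2, 0::2] + D[0::2, 1::2] + D[1::2, 1::2])
        # Contrast/brightness adjustment of z = f(x, y) on the range block.
        rx, ry = rng_xy
        out[ry:ry + b, rx:rx + b] = s * C + o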

2.2.2. Encoding

As was already mentioned, the fractal code of an image contains a fractal operator. PIFS solves the problem of decompressing an image, but the compression is related to the inverse problem – the problem of finding the operator for a given attractor.

The first solution to the inverse problem was developed by Michael F. Barnsley.

The basis of his method is the collage theorem. The inverse problem is solved here approximately – the theorem states that one should concentrate on finding an operator W that generates an attractor A close to the given attractor X (i.e. to the image to be encoded):

X \approx A \approx W(A) = w_1(A) \cup w_2(A) \cup \ldots \cup w_W(A)

where X is the image to be encoded, W is the operator and A the attractor of W.

Thus the goal is to find a fractal operator consisting of transformations w_i that represents an approximation of the given image. The theorem gives more specific information about the distance between the original image and the attractor generated from the found IFS:

\delta(X, A) \le \frac{\delta(W(X), X)}{1 - s}


where δ is the distance measure and s is the contractivity factor of W, 0 < s < 1.

According to this equation, the closer the collage W(X) (the first-order approximation of the fixed point) is to the original image X, the better the found IFS is – the attractor A is closer to the original image X. So during the encoding process one can focus on minimizing the distance δ(W(X), X), and this will result in minimizing the distortion δ(X, A), which is the goal of fractal compression. The quantitative distance measure δ(W(X), X) is called the collage error. The computational complexity of fractal compression is significantly reduced by minimizing the collage error instead of the distance between the original image and the attractor. However, this solution does not give optimal results.

The distance δ(X, A) between the original image and the attractor is also influenced by the contractivity factor – the smaller s is, the closer the images are to each other. However, minimizing s also has other effects: the smaller s is, the larger the fractal operator becomes – more transformations are needed to encode the image.

Thus, one has to find all ranges and domains and to specify the transformations. The distances between all ranges R_i ∈ R and the corresponding domain blocks D_i give the collage error δ(W(X), X), so they determine the accuracy of the compression. The information loss during encoding can therefore be reduced by pairing closer ranges and domains into transformations. The process of finding a proper range and domain is computationally very complex, so the computing time is long, and improving the quality of the encoding extends the process even more.

The first fully automatic method for fractal compression was presented by Jacquin [Jac93]. The key problem is to find a set of non-overlapping ranges R_i that covers the whole image to be encoded; each range must be related to a domain. The distance between R_i and the corresponding D_i has to be minimal – there should be no other domain that is closer to R_i. A draft of the encoding algorithm may look like this:

1. divide the image into overlapping domains D = {D^1, D^2, . . . , D^m} and disjoint ranges R = {R_1, R_2, . . . , R_n}
   // the size of each range is b × b and the size of each domain is 2b × 2b
2. for each range R_i ∈ R:
   a. set w_i := NULL, D_i := NULL, j := 1
   b. for each domain D^j ∈ D:
      i. compare R_i with 8 transformations of D^j
         // transformations: rotation of D^j by 0, 90, 180, 270 degrees and rotation of the reflection of D^j by 0, 90, 180, 270 degrees
      ii. determine the parameters of the transformation w_i^j that gives the minimal distance between R_i and w_i^j(D^j)
      iii. calculate δ(w_i^j(D^j), R_i) – the distance between w_i^j(D^j) and R_i
      iv. if w_i = NULL or δ(w_i(D_i), R_i) > δ(w_i^j(D^j), R_i)
          // i.e. if ∀ 0 < k < j : δ(w_i^k(D^k), R_i) > δ(w_i^j(D^j), R_i)
          then w_i := w_i^j, D_i := D^j
   c. add w_i to the fractal code

The above-presented algorithm became the basis for many fractal compression methods; all other methods can be treated as improvements of Jacquin's method.


The result of fractal encoding is the fractal code, which consists only of the parameters of the fractal operator's transformations.
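A direct, deliberately naive transcription of this draft into Python/NumPy is sketched below. The helper names are illustrative; the contrast and brightness coefficients are fitted by least squares (cf. section 3.3.3) and the distance is the mean squared error:

    import numpy as np

    def contract(D):
        # Average 2x2 pixel groups: a 2b x 2b domain -> b x b codebook block.
        return 0.25 * (D[0::2, 0::2] + D[1::2, 0::2] + D[0::2, 1::2] + D[1::2, 1::2])

    def isometries(block):
        # The 8 canonical symmetry operations of a square block (section 3.3.2).
        for k in range(4):
            yield np.rot90(block, k)       # rotations
            yield np.rot90(block.T, k)     # reflections

    def encode(image, b=8, dom_step=8):
        # For every b x b range block, search all domain positions and all
        # isometries for the closest contracted 2b x 2b domain (brute force).
        H, W = image.shape
        code = []
        for ry in range(0, H, b):
            for rx in range(0, W, b):
                R = image[ry:ry + b, rx:rx + b].astype(float)
                best = None
                for dy in range(0, H - 2 * b + 1, dom_step):
                    for dx in range(0, W - 2 * b + 1, dom_step):
                        D = image[dy:dy + 2 * b, dx:dx + 2 * b].astype(float)
                        for e, C in enumerate(isometries(contract(D))):
                            if C.std() == 0:           # flat block: offset only
                                s, o = 0.0, R.mean()
                            else:                      # least-squares s and o
                                s, o = np.polyfit(C.ravel(), R.ravel(), 1)
                            err = np.mean((s * C + o - R) ** 2)
                            if best is None or err < best[0]:
                                best = (err, dx, dy, e, s, o)
                # A practical encoder would also clamp s to keep the operator
                # contractive; that refinement is omitted in this sketch.
                code.append((rx, ry) + best[1:])       # one transformation per range
        return code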

2.3. Fractal Magnification

Figure 2.8. Fractal magnifier block diagram

Fractal magnification (or resolution improvement) is simply the process of encoding and decoding an image with partitioned iterated function systems. The transformations that build the fractal code describe relations between different parts of the image; no information about the size or resolution of the image is stored. Thus, the fractal code is independent of the resolution of the original image. At the same time, the fractal operator stored within the fractal code leads to an attractor that is only an approximation of the original image but has continuous tone. This means that the image can be decoded to any resolution – higher or lower than the original. Resolution improvement is here equivalent to fractal magnification: a display device has a fixed pixel size, so a higher resolution means that the image is displayed on a larger number of pixels and the physical dimension of the displayed image is larger than the original's.

Figure 2.9. Fractal magnification process

When an image is magnified, the new details are generated during the decompression. Thus, there is no problem with the values of the pixels that do not exist in the original image. Image interpolation, the most popular technique for zooming images, has to calculate the values of the pixels inserted between the original image's pixels. There are different interpolation methods, e.g. nearest neighbor, linear or bicubic interpolation.


These classical methods of image enlargement are inseparably bound up with some image distortions. For example, image pixelization (large pixels) may appear, i.e. the borders between groups of pixels that represent one pixel of the original image become clearly visible.


Chapter 3

Fractal Compression Methods

All fractal compression methods originate from the same ancestor, which was briefly described in the previous chapter. Because of this, there is an appreciable number of similarities between the methods. Thus, the different fractal methods will not each be described from beginning to end, since much of the content would be repeated several times. Instead, the differences between the compression methods will be discussed. One has to keep in mind that many different fractal methods, elaborated by different authors, may implement some elements in the same manner. The elements of the fractal compression algorithm that vary among the methods are grouped into several categories; each section in this chapter corresponds to one such category.

3.1. Partitioning Methods

The partitioning scheme used to demarcate the range blocks is one of the most crucial elements of a fractal compression method. The fidelity and quality of the reconstructed image, the length and structure of the fractal code, the shape of the transformations used to map domains onto ranges and their descriptions in the fractal code, the compression ratio, the encoding time and all other important characteristics of the compression method are somehow influenced by the choice of the partitioning method.

For example, only when uniform partitioning is used is there no need to attach any information about the partition to the fractal code – only transformation coefficients are stored. At the same time, there are partitioning methods that consume even about 44% of the fractal code to describe the partition [Har00]. Of course, there are plenty of methods in between – e.g. quadtree partitioning takes about 3.5% of the total code size to define the partition [Har00]. Surprisingly, this "additional" information does not have a negative effect on the rate-distortion performance – of the three methods mentioned, the best results are given by the one that needs the largest space to specify the partition, and the weakest is uniform partitioning. This is because the partitioning scheme also has an impact on the number of transformations – Hartenstein's method produces only a few but large range regions, which cannot be achieved with the two remaining methods.


The partitioning methods, including the three mentioned above, are described in more detail in the following subsections, together with notes on their performance.

3.1.1. Uniform Partitioning

The uniform partitioning method was presented in section 2.2.2, but it is not the only option in fractal compression – actually, it is the most basic solution. Uniform partitioning is image-independent because the ranges and domains have a fixed size; usually the size of a range is 8 × 8 (which means that the size of each domain is 16 × 16).

This partitioning method has some serious drawbacks. Firstly, details smaller than the size of the range may be found in the image. Such details will be lost during encoding, because it would be hard to find a domain with exactly the same details. Of course, some domain will be found for each range, but another problem occurs here – there is no certainty that the distance between these two squares will be really small. The size of the ranges can be adjusted to minimize the problem of matching ranges and domains. However, the smaller the ranges are, the worse the compression ratio becomes, because transformations have to be found for a larger number of ranges. At the same time, some parts of the image could be covered with larger ranges while keeping the loss of information at an acceptable level. This would result in a lower number of transformations, and thus a better compression ratio.

3.1.2. Overlapped Range Blocks

This method, a modification of the partitioning into squares, was created by Polidori and Dugelay [Pol01]. The method is very similar to uniform partitioning – all ranges have the same size b × b and domains 2b × 2b. The difference is that the ranges are not disjoint but mutually overlap by half of their size. This means that all pixels belong to more than one range – pixels close to the edge of the image belong to two ranges and the remaining pixels lie within four ranges. The partitions are encoded independently, so decoding gives up to four different values for each pixel; the final pixel value is calculated from these four approximations.

This method gives much better results than pure "squares", e.g. it effectively reduces the block effect. However, the method also has shortcomings. It is much more time-consuming – the image is actually encoded four times and decoded four times during each encoding-decoding process. In addition, the fractal code representing the image is almost four times longer. At the same time, the risk of losing small details is not eliminated.

3.1.3. Hierarchical Approaches

Hierarchical approaches to image partitioning constitute the first class of image-adaptive techniques. The decomposition of the image depends on its content – parts of the image with condensed details are divided into smaller ranges, and flat regions into large ones. This feature makes it possible to overcome the limitations of the fixed-size (uniform) partitioning scheme. There are two types of hierarchical approaches: top-down and bottom-up.


In the top-down approaches, the whole image is treated as a single range at the beginning of the encoding (or it is divided into large uniform partitions). If it is not possible to find a domain that is close enough (by the error criterion) to a range, then the range is split into several ranges (the number depends on the method).

The bottom-up approaches start with an image divided into small ranges that assure a low level of information loss. In a later phase of partitioning, neighboring ranges that are close enough to each other are merged; thanks to that, the final ranges can have different sizes.

Quadtree Partitioning

The quadtree partitioning presented by Yuval Fisher [Fis92c] was the first hierarchical approach to partitioning. All ranges here have the shape of a square.

In this method, the set D of domains contains all square ranges with side sizes 8, 12, 16, 24, 32, 48 and 64. Domains situated slantwise can also be admitted in order to improve the quality of the encoded image. In the top-down algorithm, the whole image is divided into fixed-size (32 × 32 pixels) ranges at the beginning.

Then, for each range, the algorithm tries to find a domain (larger than the range) that gives a collage error smaller than some preliminarily set threshold. If this attempt ends in failure for some ranges, then each such range is divided into four. For all newly created ranges the procedure is repeated, i.e. fitting domains are searched for the ranges and, if necessary, the uncovered ranges are broken down. The encoding ends when no ranges remain uncovered or the size of the ranges reaches a given threshold. In the second case, the smallest ranges are paired with domains that do not meet the collage error requirement but are closest to the corresponding ranges.

Figure 3.1. Quadtree partitioning of a range. Four iterations.
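The top-down splitting loop itself is compact, as the following sketch shows (Python; covered is a stand-in predicate for "a domain with a sufficiently small collage error was found" – the domain search itself is omitted here):

    import numpy as np

    def quadtree_ranges(image, x, y, size, covered, min_size=4, out=None):
        # Top-down splitting: keep the range if a matching domain exists,
        # otherwise break it into its four quadrants and recurse.
        if out is None:
            out = []
        if covered(image, x, y, size) or size <= min_size:
            out.append((x, y, size))
        else:
            half = size // 2
            for qx, qy in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
                quadtree_ranges(image, qx, qy, half, covered, min_size, out)
        return out

    # Stand-in criterion: treat low-variance blocks as "covered".
    covered = lambda img, x, y, s: img[y:y + s, x:x + s].var() < 25.0
    # Start from the initial fixed-size 32x32 ranges:
    # for each (x, y) on a 32-pixel grid: quadtree_ranges(img, x, y, 32, covered)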

Besides allowing the slantwise domains, there are also other improvements to the method. The adaptivity of the division can be increased by introducing unions of the quadrants created during the division of a range.

The main drawback of the quadtree partitioning method is that all ranges are divided in the same way, independently of their content. The size of the ranges and the structure of the partitioning are adaptive with respect to the whole image, but the act of breaking down a single range always produces the same output – the four quadrants of the input range. The partitioning would be better fitted to the content of the image if the partitioning process were adaptive also at the stage of drawing the borders of future regions during range block division. Theoretically, this improvement would result in larger range blocks, i.e. in a smaller number of transformations.


Horizontal-Vertical Partitioning

In the horizontal-vertical (HV) partitioning method [Fis95c], the shape of a range can be not only a square but any rectangle, because a range (when there is no domain close enough to it) is divided into two rectangles instead of four squares. The frontier between the two rectangles is established at the most significant horizontal or vertical edge. Thus, this method is an answer to the disadvantages of quadtree partitioning – it tries to find the best division of a range into two new ranges by a horizontal or vertical cut.

The image is partitioned in this manner from the beginning, i.e. there is no initial phase in which the image is divided into uniform partitions (as in quadtree partitioning). The algorithm also includes mechanisms preventing the degeneration of rectangles.

The algorithm uses two formulas (v_m and h_n) that determine the direction and the position of the cut:

v_m = \frac{\min(m, width(R_i) - 1 - m)}{width(R_i)} \cdot \left( \sum_{n=0}^{height(R_i)-1} r_{m,n} - \sum_{n=0}^{height(R_i)-1} r_{m+1,n} \right)

h_n = \frac{\min(n, height(R_i) - 1 - n)}{height(R_i)} \cdot \left( \sum_{m=0}^{width(R_i)-1} r_{m,n} - \sum_{m=0}^{width(R_i)-1} r_{m,n+1} \right)

where width(R_i) × height(R_i) is the dimension of the range block R_i and 1 ≤ m < width(R_i), 1 ≤ n < height(R_i).

The second factors of these formulas, (Σ_n r_{m,n} − Σ_n r_{m+1,n}) and (Σ_m r_{m,n} − Σ_m r_{m,n+1}), give the difference of pixel intensity between adjacent columns (m and m+1) and rows (n and n+1). The maximal values of these differences point out the most distinctive horizontal and vertical lines.

The first factors, min(m, width(R_i) − 1 − m)/width(R_i) and min(n, height(R_i) − 1 − n)/height(R_i), ensure that the rectangles created by splitting the range block will not be too narrow – the closer a possible cutting line is to the middle of the range block, the more privileged it is.

At this point, there are two alternative lines along which the split can be done – one vertical and one horizontal.

The HV partitioning allows cutting along only one of them:
• if max(h_0, h_1, . . . , h_{height(R_i)−1}) ≥ max(v_0, v_1, . . . , v_{width(R_i)−1}) then the range block is partitioned horizontally,
• otherwise, the range block is partitioned vertically.
In other words, the more distinctive cutting line is chosen from the two alternatives.

The increased adaptivity is paid for dearly with increased time complexity (due to the variety of range shapes and additional computations) and a longer description of the partitions. However, these additional costs pay off – the rate-distortion performance is significantly improved compared to the quadtree partitioning method. This superiority is caused by better adaptivity and larger range block sizes (i.e. a lower number of range blocks).
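The cut-selection rule can be transcribed almost literally, as below (Python/NumPy sketch; r_{m,n} is indexed as R[n, m], and the signed differences from the formulas are maximized exactly as written – whether to take absolute values instead is an implementation detail):

    import numpy as np

    def hv_split(R):
        # Choose the HV cut for range block R using the biased differences
        # v_m and h_n defined above.
        h_, w_ = R.shape
        col = R.sum(axis=0)            # column sums over n, one per m
        row = R.sum(axis=1)            # row sums over m, one per n
        m = np.arange(w_ - 1)
        n = np.arange(h_ - 1)
        v = np.minimum(m, w_ - 1 - m) / w_ * (col[:-1] - col[1:])
        h = np.minimum(n, h_ - 1 - n) / h_ * (row[:-1] - row[1:])
        if h.max() >= v.max():         # the more distinctive line wins
            n_cut = int(h.argmax()) + 1
            return ('horizontal', n_cut)   # R[:n_cut, :] and R[n_cut:, :]
        m_cut = int(v.argmax()) + 1
        return ('vertical', m_cut)         # R[:, :m_cut] and R[:, m_cut:]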

Triangular Partitioning

The next partitioning method [Fis92a] is based on triangles. In the first step, the rectangular image is divided into two triangles along one of its diagonals. At this point, the recursive algorithm begins.


Figure 3.2. Horizontal-vertical partitioning of a range. Four iterations.

Each triangle for which no suitable domain can be found is divided into four triangular ranges. The borders between these triangles are drawn between three points that lie on the three different sides of the range to be divided. The points that define the borders can be chosen freely, in order to optimize the division and minimize the depth of the tree representing the partitioning, i.e. the number of transformations.

triangular range is divided along a line from a vertex of this triangle to a point on theopposite side [Nov93].The triangular partitioning has several advantages over HV partitioning. First of

them is the fact that distortions caused by not ideal matching of the ranges and domainare less noticeable. The second very significant advantage is possibility of occurrenceof rotation angles within the transformations other than multiple of right-angle. Thisis because the triangular ranges can have any orientation and rectangular ranges (HV,quadtree, fixed size partitioning) can lie only horizontally or vertically. The largestadvantage of triangular partitioning is reduction of the block effect, which can beobserved in uniform partitioning.Nevertheless, this partitioning scheme has also some heavy drawbacks. The com-

parison of a domain block with a range block is hampered because of the difficultieswith interpolation of the domain block when the pixels from these two blocks cannot bemapped one-to-one. This problem occurs in all partitioning schemes that are not basedon right-angled blocks and is the reason why the right-angled methods are superior[Woh99].

Polygonal Partitioning

Polygonal partitioning is very similar to horizontal-vertical partitioning but is more adaptive to the image. It was invented by Xiaolin Wu and Chengfu Yao [Wu91], but Reusens was the one who applied it to fractal image compression [Reu94]. In this method, a range can be divided horizontally, vertically (as in HV) or along a line inclined by 45 or 135 degrees.

Another method to obtain polygonal blocks is the modified Delaunay triangulation method – in its merging phase, not only triangles but also quadrilaterals can be created [Dav95]. However, this method belongs to the second group of partitioning schemes – the split-and-merge approaches.


3.1.4. Split-and-Merge Approaches

The hierarchical approaches perform the partitioning while the pairs of ranges and domains are being found. The split-and-merge approaches divide the image into partitions before the search for transformations is started. The partitioning process consists here of two phases. The first phase, the splitting, yields a fine uniform partitioning or a partitioning with varying density of ranges in different parts of the image. The second phase, the merging, combines neighboring ranges with similar mean gray levels.

Delaunay Triangulation

Delaunay triangulation was adapted to fractal coding by Davoine and Chassery [Dav94, Dav96]. In this method, the partitioning results in a set of non-overlapping triangles that cover the whole image. The splitting phase starts by dividing the image into regular, fixed-size triangles. This triangulation is represented by regularly distributed points, which are the triangles' vertices. Then the triangles are investigated, and if any triangle is not homogeneous in the sense of a variance or gradient criterion, a point is added at the barycenter of the triangle. The splitting is recursively repeated until all triangles are homogeneous or the non-homogeneous triangles are smaller than a given threshold. Before each iteration, the triangulation must be recalculated based on the set of points.

The merging removes certain vertices, and by this action the triangles are combined. A vertex is removed if all triangles to which it belongs have similar mean gray levels. Each single change of the set of vertices entails recomputing the triangulation before the following actions are performed.

Delaunay triangulation has the same main advantages as triangular hierarchical partitioning, related to the unconstrained orientation of triangles. Additionally, the number of transformations determined with Delaunay triangulation is lower than in the hierarchical approaches.

The triangles can be merged not only into larger triangles but also into quadrilaterals [Dav95]. This increases the compression ratio because the number of transformations is smaller in such a case. When the basic Delaunay partitioning and the enhanced scheme result in similar compression ratios, the quality of the reconstructed image is better in the quadrilateral approach.

Irregular Regions

The methods that produce irregularly shaped range regions realize the splitting simply by utilizing existing simple partitioning methods. Uniform partitions were employed in the first algorithm based on irregular regions, created by Thomas and Deravi [Tho95], as well as in the work of other researchers [Sau96d, Tan96, Ruh97]. Quadtree partitioning was introduced to irregular partitioning by Chang [Cha97, Cha00]; Ochotta and Saupe also used this scheme [Och04].

The small squares from the first phase are merged to form larger squares or irregular range blocks. This partitioning scheme adapts very well to the content of the image being encoded. However, there are problems with a concise description of the regions' boundaries. There are two main approaches to this issue: chain codes and region edge maps.


Figure 3.3. Region edge maps and context modeling: (a) cells with attached symbols; (b) context for the symbol X; (c) region edge map example.

Chain coding describes the path that follows the boundaries. To specify this path, a starting point and a sequence of symbols representing steps (simple actions: go straight, turn left, turn right) must be stored in the fractal code. The length of a step is equal to the side length of a region block in uniform partitioning, or to the side length of the smallest region block in the quadtree. The most basic version of chain coding encodes each closed region boundary into one path with a specified starting position. The performance of such an approach leaves much to be desired, because redundant information is present – almost every boundary is shared by two regions.

used in splitting phase then the grid is equal to these partitions. If quadtree partitioningwas used then the cells in the grid have the same size as the smallest ranges – any range(of quadtree partitioning) can be either a union of cells or a single cell. Each singlecell is provided with one of four symbols that indicate whether (and where) there is arange boundary at the edge of the cell; the symbol is stored in two bits. There are onlyfour instances considered:1. no range boundary2. boundary on the North edge3. boundary on the West edge4. boundary on the North and on the West edgesThe region edge maps can be efficiently encoded with an adaptive arithmetic coding

and context modeling. The context is build of four cells processed (encoded or decoded)before the current cell – these are the neighbors in the West, North West, Northand North East directions. There can be 256 different combinations of symbols inthe context; some of these combinations indicate which symbols cannot occur in thecurrently processed cell. For example, when the symbol 1 or 2 is attached to the cellto the North and the symbol in the cell to the West is 1 or 3, then the current symbolcannot be 2 or 3 either. This fact allows shortening the fractal code.
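The following sketch derives the per-cell symbols of a region edge map (Python/NumPy; the 0–3 encoding of the four instances and the labels array, which assigns a region id to every grid cell, are this illustration's assumptions):

    import numpy as np

    NONE, NORTH, WEST, NORTH_WEST = 0, 1, 2, 3     # the four 2-bit symbols

    def region_edge_map(labels):
        # A boundary runs along an edge where the neighboring cell in that
        # direction belongs to a different region.
        H, W = labels.shape
        north = np.zeros((H, W), bool)
        west = np.zeros((H, W), bool)
        north[1:, :] = labels[1:, :] != labels[:-1, :]
        west[:, 1:] = labels[:, 1:] != labels[:, :-1]
        sym = np.full((H, W), NONE, dtype=np.uint8)
        sym[north & ~west] = NORTH
        sym[~north & west] = WEST
        sym[north & west] = NORTH_WEST
        return sym                                 # two bits per cell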


Figure 3.4. Performance of compression methods with irregular partitions: (a) chain code [Sau96d]; (b) region edge maps [Har00]; (c) quadtree-based region edge maps [Och04].

The irregular partitions guarantee good encoding results. Such partitioning schemes are highly adaptive to the image content, and since they are right-angled they are devoid of the drawbacks of triangular partitioning. The experiments (see figure 3.4) show that they outperform any other partitioning method. However, there is still disagreement about which method is superior, which will be explained in the last section of this chapter.

3.2. Domain Pools and Virtual Codebooks

The two terms – the domain pool and the codebook – are very closely connected with each other. In the literature, they are often used interchangeably, but here, in the context of fractal coding, by a domain pool the author means a set of domains (a subset of all possible domains in the image) that is used while searching for a matching domain for a range. The codebook blocks correspond to the domain blocks, but their size is the same as the size of the range. The set of all codebook blocks is called the virtual codebook. The codebook in fractal compression is virtual because it is not needed at decoding (it is not stored in the fractal code) – it is used only during the encoding phase. Summarizing, the codebook denotes the set of codebook blocks, which are contracted (downfiltered) domain blocks from the domain search pool.

The length and contents of the domain pool (codebook) are crucial for the efficiency of the encoding process. If the domain pool is larger, more bits are required for the representation of the selected domain in the code. At the same time, a larger domain pool entails a longer search for a domain for each range, which results in a much extended encoding time. However, a larger domain pool also has a positive effect – it helps to achieve higher fidelity.

Two main approaches to the domain search can be observed in different encoding methods. The first one, called global codebook search, provides the same domain pool (codebook) for all ranges of the image, though there may be various domain pools and codebooks for different classes of ranges. Local domain pool search, the second approach, makes the codebook dependent on the position of the range.


3.2.1. Global Codebooks

This solution to the domain search is based on the assumption that a range and a domain can be paired into a transformation even if they lie in completely different parts of the image. This assumption is confirmed by [Fis92c, Fis95a, Fri94], where the authors state that no neighborhood of a range can be determined within which the best domain for the range lies.

An example of a global codebook can be seen in section 2.2.2. In the example fractal encoding algorithm, each domain block of the image is considered during the search for matching domains and ranges. Because the algorithm employs uniform partitions, the domain pool consists of blocks of the same size. The interval between the corresponding borders of neighboring domain blocks is equal to one pixel vertically or horizontally. This solution is computationally very complex due to the large number of blocks within the codebook. The time cost is very high, but this procedure gives optimal information loss, because the best match between a range and a domain will always be found.

In order to reduce the time cost, larger intervals between the blocks appended to the domain pool are introduced. The literature gives two typical interval values: equal to the domain-block width or to half of the domain-block width. This simple move significantly decreases the number of domains in the pool and thereby speeds up the domain search. The main rule is that the larger the domain pool, the better the achieved fidelity, but at a higher time cost. So reducing the size of the domain pool gives a shorter search time (and shorter encoding time), but more information is lost (the errors between paired ranges and domains might be larger). Larger intervals between the domains in the pool also result in better convergence at the decoder (fewer iterations are required to decode the image).

A global codebook constructed as above can be used when the image is segmented into uniform range blocks or with quadtree partitioning. It can also be used with HV partitioning, but a domain pool containing ranges (larger than the currently processed range), or blocks created by the partitioning mechanism (used also for determining range blocks), is more often used. These two last methods of constructing global domain pools can also be used with other adaptive partitioning schemes.

In the quadtree scheme there is not one global domain pool but several – for each class of ranges (all ranges within one class have the same size), a separate domain pool (and codebook) is provided that contains domains twice as large as the ranges within the class.
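The effect of the interval on the pool size is easy to quantify. A sketch of global domain pool construction (Python; illustrative names) for a 512 × 512 image with 16 × 16 domains:

    def global_domain_pool(width, height, d, step):
        # Top-left positions of d x d domain blocks; step is the interval
        # between neighboring domains (1 pixel for the exhaustive pool,
        # d or d // 2 for the typical reduced pools).
        return [(x, y)
                for y in range(0, height - d + 1, step)
                for x in range(0, width - d + 1, step)]

    print(len(global_domain_pool(512, 512, 16, 1)))    # 247009 domains
    print(len(global_domain_pool(512, 512, 16, 8)))    # 3969 domains
    print(len(global_domain_pool(512, 512, 16, 16)))   # 1024 domains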

3.2.2. Local Codebooks

A number of studies [Jac93, Woo95, Hur93] have proven that the probability density of the spatial distances between ranges and their matching domains has a distinct peak at zero. This means that it is much more likely for a range to be paired with a domain that is close to it than with a distant one. The literature gives several ways in which advantage of this fact can be taken.


Figure 3.5. Probability density function of block offset: (a) from [Hur93]; (b) from [Woo95].

Restricted Search Area

In fact, the probability that a distant domain will be judged a matching one is so small that the search can be restricted to spatially close domain blocks only. The remaining part of the search algorithm remains unaffected. [Jac93]

Spiral Search

In this approach, the search order is modified – the codebook blocks that are more likely to provide a good match for the currently processed range block are tested first. Therefore, the search is performed along a spiral-shaped path, which begins in the codebook block directly above the range block and gradually recedes from the range. The search area can be restricted here by defining the maximal number of codebook blocks that shall be tested for each range block – the length of the path. [Bar94b]

Figure 3.6. Spiral search (from [Bar94b])

Page 47: Thesis-Fractal Compression

Chapter 3. Fractal Compression Methods 38

It can be noticed that the density of the domain blocks tested during the spiralsearch is higher at the beginning of the path (close to the range block).
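A square spiral of candidate offsets can be generated as in the sketch below (Python; whether the offsets are counted in pixels or in block widths, and the exact starting direction, vary between implementations and are assumptions here):

    def spiral_offsets(max_tests):
        # Candidate positions relative to the range block, ordered along a
        # square spiral so that near offsets are tested first; max_tests
        # bounds the length of the search path.
        x = y = 0
        dx, dy = 0, -1
        for _ in range(max_tests):
            yield (x, y)
            # turn at the spiral's corners
            if x == y or (x < 0 and x == -y) or (x > 0 and x == 1 - y):
                dx, dy = -dy, dx
            x, y = x + dx, y + dy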

Mask

Another way to determine a small domain pool is to put a mask over the image, centered at the currently processed range block. The mask indicates the locations of the domain blocks that should be included in the domain pool. These locations are denser near the center of the mask, and their density decreases with the distance from the center. [Hur93]

Solutions Without Domain Search

There are several ways to eliminate the time-consuming domain search. The first of them pairs a range and a domain when the position of the domain block fulfills some conditions. For example, P. Wakefield [Wak97] proposes to pair domains with ranges in such a manner that the range block lies within the domain block and the dominant edge is in the same relative position in both blocks. Other solutions force the matching domain to be in a fixed relative position to the range [Mon92] or restrict the domain pool to a very small set of domains neighboring the range [Mon93a].

Because this class of fractal methods eliminates one of the most time-consuming phases, the encoding is accelerated very significantly. At the same time, the search-free methods give the best rate-distortion results [Woo95]. However, in medical imaging the information carried by the image is much more important than the achieved compression ratio, and the search-free methods lose details through imprecise matching of domains and ranges. Without any doubt, however, it can be said that local codebooks outperform global ones, which was proved in [Hur93], where the signal-to-noise ratio was only 0.3 dB lower for the search with a mask than for a full search, while the domain pool contained only 12% of the domains from the global pool.

3.3. Classes of Transformations

As was already said, the transformations determined during encoding have to be affine and contractive. However, this restriction is very weak, and further limitations have to be introduced in order to provide full automation of the encoding process. Thus the search for the transformations that will constitute the fractal code is performed only within a limited class of affine transformations. The choice of this class influences the effectiveness and fidelity of the algorithm and the convergence properties of the fractal operator. The importance of selecting the right class of transformations cannot be overrated, since it is crucial for both processes of compression – encoding and decoding.

A transformation can usually be decomposed into three separate transformations that are carried out one after another. Therefore, a single elemental block transformation τ_i (from the domain block D_i to the range block R_i) is a composition of three transformations:

\tau_i = \tau_i^I \circ \tau_e^S \circ \tau^C

After the transformation τ_i is applied to the domain block D_i, the resulting pixels may be copied into the range block R_i. Thus, the transformation τ_i is the key part of the affine transformation w_i, which maps the domain block D_i onto the range block R_i.


In order to transform a domain block into an appropriate range block, the domain block is first spatially contracted (transformation τ^C); the product of this phase is a codebook block. The order of pixels within the codebook block is then deterministically changed by τ_e^S, i.e. the block undergoes one of the symmetry operations, such as rotation or reflection. The symmetry operation is taken from a fixed pool; e denotes the index of the used operation. The last component transformation, τ_i^I, is an intensity transform, which adjusts the brightness of the codebook block.

The contraction transformation is usually the same for all domains. However, the symmetry operation is not known before the search for matching pairs of domains and ranges. The same holds for the intensity transformation – when a domain and a range are compared (during the search), this transform is defined in such a way that the error between them is minimized.

All domains D^k (0 ≤ k < D, where D is the length of the domain pool) from the pool are transformed by τ^C, which gives the codebook of blocks C^k. The codebook can be expanded thanks to the symmetry operations – every block of the codebook is transformed by all symmetry operations and the products of these operations are included in the codebook. Theoretically, this step should allow a better match between a codebook block and a range block (when searching for a domain block fitting a range block). One has to keep in mind that the codebook is virtual, i.e. the codebook blocks are not stored in four copies that differ from each other only by the rotation angle – there is a single copy of a codebook block that is rotated during the search.

Then the real search is performed. For a range block R_i, every codebook block C^k is checked – the coefficients of the intensity transformation (those minimizing the distance between the codebook block and the range block) are calculated, i.e. the transformation τ_{ik} is determined. From all of the τ_{ik} (and, at the same time, from all of the codebook blocks), the one is picked that gives the minimal error between the range block and the product of the transformation – the chosen transformation becomes τ_i.

The description of the contraction transformation τ^C can be hard-coded into the program – the transformation is the same for all domains/ranges and the same for the encoder and the decoder. But information about τ_e^S and τ_i^I has to be attached to the fractal code. In particular, the symmetry operation must be pointed out and the coefficients of the intensity transformations must be stored for every range block.

3.3.1. Spatial Contraction

The spatial contraction of the domain is not necessary for the process of fractal compression. The transformation must be contractive, but the metrics used to assure the contraction are usually not influenced by the spatial dimension [Dav96, Fis92c]. A sufficient constraint is that a domain block and a range block paired into one transformation cannot be equal. Nevertheless, spatial contraction is commonly used in almost all fractal compression methods. It was introduced by Jacquin [Jac90b] and carries over directly from the first fractal compression algorithm, where the spatial size of the square domain blocks was twice as large as the size of the range blocks.

Using the same contraction ratio as Jacquin also became a custom – the spatial contraction usually reduces the dimensions of a domain block by two. However, it is possible to adjust this number in order to achieve a desired behavior of the encoder or decoder.


A contraction ratio higher than 2 : 1 decreases the number of iterations needed to reconstruct the image from the fractal code (fractal operator) [Bea90]. It is even possible to adjust the contractivity in such a way that the decoding is made in a single iteration [Fis95a]. A contraction ratio smaller than 2 : 1 entails higher error propagation during decoding, but it also has a positive effect – it allows better approximations of range blocks with codebook blocks [Bar94b].

In the original work of Jacquin [Jac90b], the domain block was contracted by averaging four neighboring pixel values into one. According to this, when the width and the height of the codebook block are equal to h and the contraction is made by a factor of 2, a pixel value of a codebook block C_i can be calculated from the following formula:

C_i(m, n) = \frac{D_i(2m, 2n) + D_i(2m+1, 2n) + D_i(2m, 2n+1) + D_i(2m+1, 2n+1)}{4}

for all m, n ∈ {0, . . . , h − 1}. This formula can be easily generalized to any size of the codebook block.

Contraction by neighboring pixel averaging is still very popular, but other solutions can also be employed here. Barthel and Voye introduced an anti-aliasing filter, which allowed better coding results to be obtained [Bar94b]. Instead of averaging neighboring pixels, the excess pixels can simply be removed. This solution slightly speeds up encoding but has a negative influence on the accuracy [Fis92c, Fis95a].
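The generalization mentioned above is compact in NumPy; the sketch below averages factor × factor neighborhoods, so factor = 2 reproduces Jacquin's formula:

    import numpy as np

    def contract(D, factor=2):
        # Spatial contraction by neighborhood averaging for any integer ratio.
        h, w = D.shape[0] // factor, D.shape[1] // factor
        D = D[:h * factor, :w * factor].astype(float)
        return D.reshape(h, factor, w, factor).mean(axis=(1, 3))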

3.3.2. Symmetry Operations

The symmetry operations, also called isometries, operate on the pixels of a block without changing their values; they change the positions of the pixels within the block in a deterministic way. For a square block, there are eight canonical isometries [Jac90b] (see the sketch after the list):
1. identity
2. orthogonal reflection about the mid-vertical axis of the block
3. orthogonal reflection about the mid-horizontal axis of the block
4. orthogonal reflection about the first diagonal (m = n) of the block
5. orthogonal reflection about the second diagonal of the block
6. rotation around the center of the block, through +90°
7. rotation around the center of the block, through +180°
8. rotation around the center of the block, through −90°
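The eight operations map directly onto array manipulations (Python/NumPy; matching the rotation signs to np.rot90's counterclockwise convention is this illustration's choice):

    import numpy as np

    def canonical_isometries(block):
        # The eight canonical isometries of a square block, in the order above.
        return [
            block,                    # 1. identity
            np.fliplr(block),         # 2. reflection about mid-vertical axis
            np.flipud(block),         # 3. reflection about mid-horizontal axis
            block.T,                  # 4. reflection about first diagonal
            np.rot90(block, 2).T,     # 5. reflection about second diagonal
            np.rot90(block, 1),       # 6. rotation through +90 degrees
            np.rot90(block, 2),       # 7. rotation through +180 degrees
            np.rot90(block, 3),       # 8. rotation through -90 degrees
        ]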

The isometries significantly enlarge the size of the domain pool, so they should take effect in a better fidelity of the reconstructed image. According to a number of researchers, all isometries are used with the same frequency during encoding [Fri94, Mon94b], which suggests that they are useful and fulfill their purpose. At the same time, other authors argue that the isometries are dispensable and have no positive effect [Jac93, Lu97, Mon94b]. Probably different design choices not directly related to the isometries are the main cause of this contradiction [Woh99].

However, an overwhelming agreement can be observed in the literature that the use of the isometries results in a weaker rate-distortion relation [Mon94b, Kao91, Sau96a, Woo94]. Besides, other affine transformations can be used in place of the isometries [Lu97].


3.3.3. Block Intensity

The last component transformation also operates on pixel values, but it changes the luminance of pixels instead of their positions. Once again, the most basic intensity transformation was introduced already by Jacquin. It is linear and operates on one codebook block (after application of the symmetry operations) and one block of unit components:

$$C'_i = s_i C_i + o_i \mathbf{1}$$

The $s_i$ and $o_i$ denote the scaling and offset coefficients, respectively. These coefficients are calculated by the encoder when the best approximation $R_i \approx s_k C_k + o_k \mathbf{1}$ is found ($0 \le k < |\mathbf{C}|$, where $|\mathbf{C}|$ is the length of the codebook).

Although the linear intensity transformation can still be found in many present-day fractal compression methods, other transformations appear in the literature. According to their authors, these new approaches improve the fidelity of the compression by enabling a better approximation of a range block by a codebook block.
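The optimal $(s_i, o_i)$ pair for one codebook/range block pair follows from a least-squares fit; a minimal sketch, assuming grayscale blocks of equal size stored as NumPy arrays, is:

```python
import numpy as np

def fit_intensity(codebook_block: np.ndarray, range_block: np.ndarray):
    """Least-squares fit of the linear intensity transform R ~ s*C + o*1
    for one codebook/range block pair; returns (s, o, collage error)."""
    c = codebook_block.ravel().astype(float)
    r = range_block.ravel().astype(float)
    n = c.size
    denom = n * np.dot(c, c) - c.sum() ** 2
    s = 0.0 if denom == 0.0 else (n * np.dot(c, r) - c.sum() * r.sum()) / denom
    o = (r.sum() - s * c.sum()) / n
    err = float(np.sum((s * c + o - r) ** 2))   # squared collage error
    return s, o, err
```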

Orthogonalization

Øien [Øie94b] modified the intensity transform by introducing an orthogonal projection prior to scaling: the dc component is subtracted from the codebook block. The dc denotes the mean pixel value of the codebook block:

$$dc = \frac{C_i^{(1)} + \dots + C_i^{(|C_i|)}}{|C_i|}$$

where $|C_i|$ is the number of pixels in the codebook block and $C_i^{(k)}$ is the value of its $k$-th pixel. The intensity transform in this case can be described by the following formula:

$$C'_i = s_i \left( C_i - \frac{\langle C_i, \mathbf{1} \rangle}{\|\mathbf{1}\|^2}\, \mathbf{1} \right) + o_i \mathbf{1}$$

Here $\langle C_i, \mathbf{1} \rangle$ is the inner product of the codebook block and the block of fixed intensity, and $\|\mathbf{1}\|$ is the norm derived from an appropriate inner product space, i.e. $L^2$ here.

This transformation yields a block that is orthogonal to the block of unit coefficients $\mathbf{1}$ and gives several advantages. First of all, the $s_i$ and $o_i$ coefficients are decorrelated. When a special choice of domain pool is made (each domain is a union of range blocks in quadtree partitioning, and the contraction is based on pixel averaging), the decoding is accelerated – the decoder is guaranteed to converge in a finite number of iterations. The number of iterations is independent of the $s_i$ and $o_i$ coefficients; only the sizes of the domains and ranges influence it. [Øie93]

Multiple Fixed Blocks

The topic of multiple fixed blocks was raised by Øien, Lepsøy and Ramstad [Øie91]and continued by Monro [Mon93c, Mon93b] and many other researchers. The mainidea is based on replacing the single fixed block 1 with multiple fixed blocks Vh:

$$C'_i = s_i C_i + \sum_h o_{ih} V_h$$

Page 51: Thesis-Fractal Compression

Chapter 3. Fractal Compression Methods 42

Multiple Codebook Blocks

Another approach uses several codebook blocks that are independently scaled:

$$C'_i = \sum_h s_{ih} C_{ih} + o_i \mathbf{1}$$

It is also possible to merge the multiple fixed blocks approach with the multiple codebook blocks approach. In this case, domains that are not spatially contracted can also be used [GA94b, GA94a, GA96, GA93, Vin95]. A linear combination of multiple domain blocks and multiple fixed blocks was used in [GA96] and resulted in a very good rate-distortion relation – at a bit rate of 0.43 bpp, the peak signal-to-noise ratio reached 34.5 dB.

Polynomials

Another approach to the intensity transformation [Mon93a, Mon94b, Mon94a] abandons the linear form and uses higher order polynomials. When the transformation is a second order polynomial, an additional component is added to Jacquin's basic transformation – the codebook block in quadratic form:

$$C'_i = \left[ c' \,\middle|\; s_{i2} c^2 + s_{i1} c + o_i \mathbf{1} \right]$$

where $c$ symbolizes a matrix coefficient of $C_i$ and $c'$ a coefficient of $C'_i$. Third order polynomials require extending the transformation with one more component – the codebook block in cubic form:

$$C'_i = \left[ c' \,\middle|\; s_{i3} c^3 + s_{i2} c^2 + s_{i1} c + o_i \mathbf{1} \right]$$

When the basic linear transformation is used, a single pixel of the codebook block undergoes the following intensity transformation:

$$\tau^I_i(z) = s_i z + o_i$$

The application of the polynomials modifies the shape of the fractal operator (compare section 2.2.1) [Kwi01]. Here the operator takes the following form:

$$w_i \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_i & b_i & 0 \\ c_i & d_i & 0 \\ 0 & 0 & \tau^I_i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \\ 0 \end{bmatrix}$$

The intensity transformation $\tau^I_i$ looks as follows:

• order 2 polynomials:

$$\tau^I_i(z) = s_{i2} z^2 + s_{i1} z + o_i$$

• order 3 polynomials:

$$\tau^I_i(z) = s_{i3} z^3 + s_{i2} z^2 + s_{i1} z + o_i$$

Of course, even higher order polynomials can be applied, but this results in a worse compression ratio because more parameters have to be encoded. On the other hand, the higher the order of the polynomial, the better the fidelity that can be achieved [Lin97]. The use of second order polynomials turns out to be the best when it comes to the rate-distortion relation [Woo95].


3.4. Quantization

Quantization occurs in encoding as well as in decoding. During the encoding, the scaling and offset coefficients have to be quantized. The domain positions, the description of the used symmetry operations and any partition description relevant to the adaptivity of the segmentation are represented by discrete values from the beginning.

3.4.1. Quantization During Encoding

Most often, uniform quantization is used. Nevertheless, the distributions of the scaling and offset coefficients in general have a strongly non-uniform character. The application of a uniform quantization method therefore entails inefficiency, and entropy coding of the quantized coefficients can be very useful for eliminating it.

Figure 3.7. Distributions of the scaling coefficients $s_{i2} \cdot 10^3$ and $s_{i1}$ and of the offset coefficient $o_i$ for the second order polynomial intensity transformation (from [Zha98]).

The coefficients are stored on various numbers of bits in the solutions of different researchers. The bit allocation for the scaling coefficient takes values from 2 ([Jac93]) to 5 ([Øie94a]) and for the offset coefficient from 6 ([Jac93]) to 8 ([Øie94a]). The combination of 5-bit quantization of the scaling coefficient $s_i$ and 7-bit quantization of the offset coefficient $o_i$ was found to be optimal [Fis95b].

Besides uniform quantizers, logarithmic and pdf-optimized quantizers were also investigated. Logarithmic quantization did not turn out to be better in the


context of fractal compression [Fis95a]. The pdf-optimized quantization shrank the bit allocation for the parameters of a single domain block to 5–6 bits at a small cost in fidelity [Øie94a].

The quantization of the coefficients can be done directly before adding them to the fractal code. However, many algorithms, especially those that pay special attention to fidelity, quantize the coefficients before computing the error between a range block and a transformed domain block. This solution slows down the encoder because not only the final coefficients but all of the scaling and offset coefficients (calculated for every domain from the pool during the search) are quantized. But the quantization operation may influence the error value between a range and a domain (after contraction and isometries); thus, not necessarily the same domain will be indicated as the closest to a given range when the blocks are described by real coefficients as after applying quantization. Quantization before the error computation also ensures that both the encoder and the decoder use the same coefficient values.

The scaling and offset coefficients in transformations without orthogonalization are correlated [Hur94, Bar93], which can be harnessed in two ways. The scaling and offset coefficients can be vector quantized together [Bar94b, Bar93, Har97], or the offset can be predicted from the scaling – linear prediction was used in [Hur94].
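A minimal sketch of uniform quantization of a single coefficient is given below; the value ranges $[-1, 1]$ for the scaling and $[0, 255]$ for the offset are illustrative assumptions, not values prescribed by the cited works.

```python
def quantize_uniform(value, lo, hi, bits):
    """Uniformly quantize value in [lo, hi] to an index on `bits` bits;
    returns both the index and the dequantized (reconstruction) value."""
    levels = (1 << bits) - 1
    v = min(max(value, lo), hi)                 # clamp to the valid range
    index = round((v - lo) / (hi - lo) * levels)
    dequantized = lo + index * (hi - lo) / levels
    return index, dequantized

# e.g. the 5-bit scaling / 7-bit offset combination reported as optimal
s_idx, s_q = quantize_uniform(0.73, -1.0, 1.0, 5)
o_idx, o_q = quantize_uniform(41.2, 0.0, 255.0, 7)
```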

3.4.2. Quantization During Decoding

Quantization also occurs during decoding. Each iteration of the algorithm produces an image that is an approximation of the fixed point of the IFS. In the original approach, the images created in successive iterations were stored as raster images, i.e. the pixel values were quantized. However, the brightness of the fixed point's pixels takes real, not discrete, values, and the error caused by quantization in this solution propagates into the results of the following iterations. This may cause difficulties with reaching the correct brightness values of some pixels.

This problem can be minimized by introducing matrices of real numbers to represent the images created in successive iterations. This solution is called Accurate Decoding with Single-time Quantization and guarantees that the quantization is performed only once – when the matrix from the last iteration is converted to a raster image. [Kwi01]

3.5. Decoding Approaches

The fractal code contains the quantized coefficients of the fractal operator. The decompression is actually the process of computing the fixed point described by this operator. The fractal operator is independent of the size of the original image, so the decoding may produce a reconstructed image of any size – the image may be zoomed in or zoomed out compared to the original one.

The basic decoding algorithm is based on PIFS and was already explained in section 2.2.1. One of the advantages of fractal compression is fast decoding – usually it takes fewer than 10 iterations. However, some alternative approaches have been introduced that improve the speed or accuracy of the process.
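The following sketch illustrates the baseline iterative decoding under the assumptions of a 2 : 1 contraction by pixel averaging and no isometries; the Transform record and its fields are hypothetical, not the thesis's data structures, and the range blocks are assumed to tile the whole image.

```python
import numpy as np
from collections import namedtuple

# hypothetical transform record: range position/size, domain position, s, o
Transform = namedtuple("Transform", "rx ry w h dx dy s o")

def decode(transforms, shape, iterations=10):
    """Baseline PIFS decoding: apply the whole fractal operator
    repeatedly; the iterates converge towards its fixed point."""
    image = np.zeros(shape)                  # any starting image converges
    for _ in range(iterations):
        out = np.empty_like(image)
        for t in transforms:
            d = image[t.dy:t.dy + 2 * t.h, t.dx:t.dx + 2 * t.w]
            c = (d[0::2, 0::2] + d[1::2, 0::2]
                 + d[0::2, 1::2] + d[1::2, 1::2]) / 4.0   # 2:1 contraction
            out[t.ry:t.ry + t.h, t.rx:t.rx + t.w] = t.s * c + t.o
        image = out
    return image
```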


3.5.1. Pixel Chaining

This method can be utilized only when the spatial contraction is based on subsampling. In such a situation, each pixel of a range block is associated with one reference pixel in the domain block – the range and the domain are paired by a transformation. The reference pixel lies not only in the area of the domain block but also in the area of some other range block, so another reference pixel is associated with it in turn. In this way, a chain of associated pixels is created.

The pixel chain can be used in two manners. The first uses it to track the path of influence in order to find a pixel with the wanted value. The second executes a part of the chain long enough to achieve an acceptable pixel value. [Fis95a, Lu97]

3.5.2. Successive Correction Decoding

The basic decoding algorithm uses for each iteration a temporary image in which the changes are made by the transformations. This means that the image that provides the virtual codebook in the current iteration remains unchanged by the transformations, and the range blocks are written to the temporary image.

The successive correction method is inspired by the Gauss-Seidel correction scheme. The basis of the successive correction algorithm is giving up the temporary images – the transformations operate on a single image. The domain blocks covering currently decoded range blocks are immediately updated, i.e. the change made by one transformation is visible to transformations executed after it within the same iteration.

The main advantage of this technique is increased decoding speed. A further speed improvement can be made by ordering the transformations. Domains are spread over the image with varying density; transformations that have their domains in the areas of the highest domain concentration are executed first in each iteration. [Kan96, Ham97b]

3.5.3. Hierarchical Decoding

The first stage of hierarchical decoding is actually nothing other than the baseline decoding algorithm. The only difference is that the image is reconstructed at a very low resolution – the size of the range blocks is reduced to only one pixel. This low-resolution image is treated as a basis for finding the fixed point at any other resolution with a deterministic algorithm (similarly to wavelet coding – the transformations from domains to ranges are treated as consecutive resolution approximations in the Haar wavelet basis). Because vectors of lower dimensions are processed during the IFS reconstruction, there are considerable computational savings compared to the standard decoding scheme. [Bah93, Bah95, Mon95]

3.5.4. Decoding with Orthogonalization

This approach was already mentioned in section 3.3.3. It requires some changes to the encoding process, i.e. all domain blocks from the pool have to consist of a union of range blocks and the intensity transform has to utilize orthogonalization. These restrictions result in meaningful benefits: a computationally simple decoding algorithm based on a pyramid structure, with a decoding length independent of the transformation coef-


ficients (it depends only on the domain and range sizes) and at least as fast as in the basic scheme. [Øie93]

3.6. Post-processing

Every fractal compression method is based on blocks and, because of this, block artifacts are always present in the reconstructed image. In order to reduce these undesired artifacts, the reconstructed image can be post-processed: the block boundaries are subjected to smoothing. [Fis95a, Lu97]

There are several ways to reduce the block artifacts. The first one is simply the right choice of partitioning method – overlapped ranges give very good results, and the blocks are also less noticeable when a highly adaptive partitioning method is used. A simple method that uses a lowpass filter can also be employed; however, the results are not satisfying [Fis95c]. Other, more complex estimation-based methods give better performance [Zak92, Ste93].

There are also post-processing methods that depend on the partition scheme used in the compression and aim for the best overall performance, taking the human visual system into consideration. The Laplacian pyramidal filtering presented in [Lin97] is an example of such a method.

3.7. Discussion

This chapter has presented the diversity of issues connected with building a fractal compression method and, at the same time, the large diversity of the methods. Although the basis of fractal compression remains the same in all implementations, there is still notable latitude in constructing a fractal compression method, because there are no standards for it – only a general idea of how to utilize fractal theory for image compression. This freedom can be problematic because there is not always agreement on which solutions for particular elements of a fractal compression method yield the best effects. The confusion is amplified by the fact that each design decision influences the performance of the other design elements.

The choice of partitioning scheme can serve as an example – there is disagreement in the literature about which one is the best. Some researchers regard the simple quadtree scheme as superior to polygonal and HV partitions [Reu94]. Others, at the same time, show that HV partitioning gives better results than the quadtree [Fis94, Har00, Ruh97]. However, most researchers agree that irregular regions give better results in the rate-distortion sense than the quadtree scheme [Och04, Sau96d, Cha97, Ruh97, Har00, Bre98]. The comparison of HV with irregular schemes does not show as large a superiority of the methods based on irregular regions [Har00], especially for the methods utilizing quadtree partitioning in the split phase [Cha97, Cha00, Och04]; the two approaches may even yield very similar rate-distortion performance [Ruh97]. One can notice that for small compression ratios, for which the best fidelity can be obtained, HV partitioning results in a slightly better peak signal-to-noise ratio. However, the irregular-shaped approaches allow encoding


an image to the same PSNR faster. A remarkable observation is that none of the partitioning schemes that are not based on right-angled regions matches the performance of the above-mentioned methods [Woh99].

The effectiveness of fractal compression can be improved by merging it with transform coding or wavelets. Nevertheless, such hybrid methods are not discussed in this document.


Chapter 4

Accelerating Encoding

The main drawback of fractal image compression is its computational complexity and the long encoding time resulting from it. The most time-consuming part of the encoding scheme is the search through the domain block pool for the best possible match for a given range block. The time complexity of the encoding is O(n), i.e. the time spent on each search is linear in the number of domains (n) in the pool.

Researchers have undertaken many attempts to accelerate the encoding process. The proposed solutions can be divided into two groups: complexity reduction techniques and parallelization of the implementation. This chapter presents short descriptions and explanations of the most successful acceleration techniques.

4.1. Codebook Reduction

The reduction of the codebook/domain pool size is the simplest and most obvious speed-up technique. The first way to achieve this is to utilize a local codebook instead of the global codebook or the full search (compare with section 3.2). There are also other techniques that decrease the size of the domain pool independently of its type.

The domain pool can contain domains that are very close to other domains (in the error measure sense). Eliminating such domains allows the size of the domain pool to be reduced significantly without loss of fidelity – when the distance between the domains is below a certain level, then after contraction they will become the same (or almost the same) codebook blocks. The method utilizes the invariant representation, which will be explained below. [Sig97]

A block with low variance cannot be changed into a block with higher variance by any transformation that is considered in fractal coding; however, uniform or low-variance blocks can be generated with a fixed block or an absorbent transformation [Jac90a]. Thus, a range block has to be paired with a domain block of higher variance to create a contractive affine transformation, and there is no need for keeping the low-variance domains in the pool. Awareness of this fact allowed Saupe to create a domain pool reduction technique that excludes from the search a fraction of the domain pool equal to 1 − α. The size of the fraction was adjusted with the parameter α ∈ (0, 1]


in order to investigate the impact of the pool reduction on the image fidelity, computation time and compression ratio. The results for encoding with Fisher's quadtree partitioning scheme are as follows: the computation time is directly proportional to the parameter α, and the reduction of the domain pool with this method does not negatively influence the fidelity (even for low values of α, e.g. α = 0.15) – it can even slightly improve it. [Sau96b]
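A sketch of this variance-based reduction could look as follows (the representation of the pool as a list of pixel matrices is an assumption of the sketch):

```python
import numpy as np

def reduce_pool(domains, alpha):
    """Keep only the fraction `alpha` of domain blocks with the highest
    intensity variance; the remaining 1 - alpha fraction is excluded
    from the search."""
    variances = np.array([d.var() for d in domains])
    keep = max(1, int(round(alpha * len(domains))))
    order = np.argsort(variances)[::-1]        # descending variance
    return [domains[i] for i in order[:keep]]
```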

4.2. Invariant Representation and Invariant Features

During the search for matching domains and ranges, the domain blocks and range blocks cannot be compared directly. The distance is measured between the range block and the transformed domain block – after contraction, isometries (not in all algorithms) and the intensity transform. Thus the problem is not only to find the correct domain block for the range block but also to find the transformation parameters that minimize the distance between the blocks.

Invariant representations/features of blocks were introduced to fractal compression in order to ease the distance measure by enabling direct comparison of the domain block and the range block.

The original version of invariant features, proposed by Novak, assigns a 4-dimensional feature vector to each block [Nov93]. The components of the vector are invariant moments, which are defined from the gray level distribution of the block. One vector fully suffices for one domain because it is insensitive to any geometric transformation (isometry) of the domain block. The feature vector does, however, depend on the luminance of the block. To solve this problem, a normalization procedure (with respect to mean and variance) was introduced.

There are several drawbacks to this method. There is no (and there cannot be) a theory that would support the argument that closeness of the feature vectors ensures closeness of the range and domain blocks in the error measure sense. Another problem is that blocks with negative intensity are not considered at all, which can have a negative effect on the fidelity. Nearest neighbor search is impossible in this approach without logarithmic rescaling because the components of the vectors take values of various orders of magnitude. [Sau96c]

The method elaborated by Novak was originally designed for triangular partitioning. Frigaard adapted this work to quadtree partitioning, but he removed the normalization from the method because, according to him, it can decrease the quality of the encoding. [Fri95] Other approaches were presented by Gotting, Ibenthal and Grigat [Got95, Got97] as well as by Popescu and Yan [Pop93].

Generally, the invariant representation techniques differ from invariant features in that they cannot be invariant to the block isometries. However, they are invariant to the block intensity. The basic approach is based on an orthogonal projection of the block onto the orthogonal complement of the space spanned by the fixed block coefficients, followed by a normalization. Another approach utilizes the DCT (after applying the transform, the dc coefficient is zeroed), followed by normalization. [Bea90, Sau95b, Woh95]


A great advantage of the invariant representation, besides the search time reduction, is the possibility of adapting the distance measure to the properties of the human visual system. [Bea90, Bar94a]

4.3. Nearest Neighbor Search

This method of accelerating the search reduces the range-domain block matching problem to a nearest neighbor search problem. The time complexity is reduced from O(n) to O(log n).

Prior to the actual search, a preprocessing stage is performed in which the set of codebook blocks to be searched is arranged in an appropriate data structure – tree structures are usually used. The nearest neighbor search utilizes the invariant representations of blocks, for which a function is provided that gives the Euclidean distance between the projections of a domain block and a range block. The minimization of this distance is intended to be equivalent to the minimization of the error between the domain and the range. Thus, the search returns the codebook blocks that are closest to the currently considered range block in terms of the Euclidean distance.

Many algorithms were developed to determine the neighborhood of the range block – existing techniques [Ary93, Sam90] have been applied to fractal compression [Kom95, Sau95b, Sau95a], and new ones have been especially designed for it [BE95, Cas95].
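As an illustration, the preprocessing stage can be sketched with a kd-tree built over one common invariant representation (zero mean, unit norm); the use of SciPy's cKDTree and same-sized blocks are assumptions of this sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_search_tree(codebook_blocks):
    """Project each (same-sized) codebook block to an invariant
    representation - zero mean, unit norm - and index the vectors in a
    kd-tree for logarithmic-time nearest neighbor queries."""
    vectors = []
    for c in codebook_blocks:
        v = c.ravel().astype(float)
        v -= v.mean()                          # invariant to the offset
        norm = np.linalg.norm(v)
        vectors.append(v / norm if norm > 0 else v)
    return cKDTree(np.array(vectors))

# query: project the range block the same way, then
# distance, index = tree.query(projected_range_vector)
```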

4.4. Block Classification

Both types of blocks, domain and range, have features that can be used to classify them. Each block belongs to exactly one class. This allows limiting the search process to only a part of the domain pool – the domain has to be in the same class as the range. Although the time complexity is still linear, the classification of blocks reduces the factor of proportionality in the O(n) complexity. The literature provides many different classification schemes, which can be divided into the three groups discussed in the following subsections.

4.4.1. Classification by Geometric Features

Block classification appeared already in Jacquin's work [Jac89, Jac90a, Jac92], where the classification designed for vector quantization by Ramamurthi and Gersho [Ram86] was adapted for the purpose of fractal compression. In this scheme, the domains are divided according to their block geometry into four classes: shade blocks, simple and mixed edge blocks, and midrange blocks. The shade block class contains only blocks with very low intensity variance. The edge block classes contain blocks where strong changes of intensity are observed. A block is “midrange” when it has considerable variance but no pronounced edge.

Since the shade blocks can be replaced with a fixed block or an absorbent transformation imposed on a block with higher variance, the whole class of shade blocks does not have to be searched when matching a domain with a range (compare with section 4.1). Thus, the block classification method can be combined with domain pool reduction. [Jac90a, Jac90b]


The main drawback of this classification scheme is its weak performance for large blocks or blocks with weak edges or strongly contrasted textures. Such blocks are often classified incorrectly due to inaccurate block analysis, which results in artifacts in the reconstructed images [Jac90a]. The speed-up is also not very impressive since there are only four classes.

4.4.2. Classification by Intensity and Variance

The classification technique elaborated by Jacobs, Fisher and Boss works on square blocks subdivided into four quadrants. The upper left quadrant is marked B1, the upper right B2, the lower left B3, and the lower right B4. For each quadrant, the average pixel intensity $\bar{B}_j$ and the variance $V_j$ are computed (j = 1, . . . , 4). With symmetry operations, the block can be transformed in such a way that the quadrants are ordered in one of three ways:
• $\bar{B}_1 \ge \bar{B}_2 \ge \bar{B}_3 \ge \bar{B}_4$
• $\bar{B}_1 \ge \bar{B}_2 \ge \bar{B}_4 \ge \bar{B}_3$
• $\bar{B}_1 \ge \bar{B}_4 \ge \bar{B}_2 \ge \bar{B}_3$
No other order has to be considered because one (and only one) of the above orders can always be attributed to any block. Thus, based on the three orderings of the quadrants according to average intensity, the three major classes are defined.

When a block is classified into one of the major classes, its subclass is computed. The subclasses are defined by the order of the quadrant variances $V_j$, with the constraint that further symmetry operations are not allowed. Since there are four quadrants, for each major class there are 24 subclasses.
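A sketch of the major-class computation is given below; it simply tries the eight isometries until the quadrant means fall into one of the three canonical orderings (ties between means are assumed to be broken arbitrarily).

```python
import numpy as np

# canonical quadrant orderings (indices of B1..B4 sorted by decreasing
# mean intensity) that define the three major classes
CANONICAL = [(0, 1, 2, 3), (0, 1, 3, 2), (0, 3, 1, 2)]

def quadrant_means(block):
    h, w = block.shape
    return np.array([block[:h//2, :w//2].mean(),   # B1: upper left
                     block[:h//2, w//2:].mean(),   # B2: upper right
                     block[h//2:, :w//2].mean(),   # B3: lower left
                     block[h//2:, w//2:].mean()])  # B4: lower right

def classify(block):
    """Return (major_class, isometry_id): the symmetry operation that
    brings the quadrant means into a canonical ordering."""
    candidates = [block, np.fliplr(block), np.flipud(block), block.T,
                  np.rot90(block, 2).T, np.rot90(block, -1),
                  np.rot90(block, 2), np.rot90(block, 1)]
    for iso_id, b in enumerate(candidates):
        order = tuple(np.argsort(quadrant_means(b))[::-1])
        if order in CANONICAL:
            return CANONICAL.index(order), iso_id
    raise AssertionError("one canonical ordering always exists")
```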

have the same order of the quadrants (when scaling coefficient si > 0) or the oppositeorder (when si < 0). Thus during the search two subclasses of domain blocks have tobe searched in order to find the best match for a range block.This method has one weakness – the impossibility to extend the search to neigh-

boring classes. This would be very helpful when a domain that yields an error smallerthat the admissible distance between a range and a domain cannot be found in eitherof the two subclasses.The second classification scheme based on the intensity and variance, proposed by

Caso, Obrador and Kuo [Cas95], is devoid of this drawback. In this scheme, the majorclasses remain as they were in the scheme of Boss, Fisher and Jacobs. However, thesubclasses are based on strongly quantized vectors of variances. Each vector producesa class of domains and it is possible to point out neighboring classes.

4.4.3. Archetype Classification

This classification scheme was elaborated by Boss and Jacobs [Bos95]. The method defines the classes during a preliminary learning phase, in which several training images are examined. From a set of codebook blocks, one block is singled out for the role of archetype. This privileged block (Da) is the one that best approximates the other blocks in the set:

$$D_a = \arg\min_{D_p} \sum_{j \neq p} \min_{s_i, o_i} \left\| D_j - (s_i D_p + o_i \mathbf{1}) \right\|$$


An iteration of the learning process starts from an arbitrary classification of the blocks belonging to the training images. The authors of the scheme used the Boss, Fisher and Jacobs method based on intensity and variance for this purpose. For each of these preliminary classes, an archetype is computed. When all archetypes are known, a new set of classes is defined – blocks are moved from their prior classes to the classes whose archetypes cover them best. The archetype computation and reclassification are repeated in a loop as long as any change in the classification occurs in an iteration.

The final set of archetypes can be used during the encoding of any number of images. When an image is being compressed, all domain blocks are classified using the set of archetypes – for each block, the archetype that covers it best is found (just after the image segmentation and domain pool definition).

Experiments carried out by Boss and Jacobs [Bos95] show that a given range block can be covered very well by a domain block that is classified into the same class. This solution guarantees both the acceleration of the encoding process and the fidelity. The archetype classification is much slower than a conventional classification scheme for low quality image encoding since it is much more complex. However, when high fidelity has to be assured, the archetype classification turns out to be faster.

4.5. Block Clustering

The clustering method is similar to classification. Here, blocks are clustered around centers, which can be computed adaptively from the image to be encoded or from a set of training images. The use of a set of training images reduces time costs because the computational cost of clustering is not incurred during encoding. During the search, first the optimum cluster center is located and then the optimum domain within this cluster. The classes, i.e. the sets of blocks grouped around the cluster centers, are defined by the clustering algorithm.

The clusters define disjoint subsets of domains. The subsets are defined by the cluster centers, which at the same time are their representatives. The distances between blocks within one cluster should be smaller than the distances to blocks from other clusters. A criterion function is used to measure the quality of the clustering.

A range block is encoded in two phases. First, the closest cluster center is found and, after that, the range block is compared to the domain blocks within the cluster of the found center. This scheme is very similar to searching within one class. Instead of performing the search through only one cluster, the clusters of the centers neighboring the closest center can also be searched. This increases the fidelity, but at the cost of time.

Several clustering methods have been introduced. They can be classified into three groups: those based on the generalized Lloyd algorithm, on the pairwise nearest neighbor algorithm, and on self-organizing maps. Øien, Lepsøy and Ramstad presented an efficient clustering method based on the Lloyd algorithm [Øie92, Øie93, Lep95]. An example of this class of clustering methods was also proposed by Davoine, Antonini, Chassery and Barlaud in [Dav96]. The nearest neighbor approach can be found in the work of Wein and Blake [Wei96] and Saupe [Sau95c]. Self-organizing maps were utilized already in the first approach to adaptive clustering for fractal compression [Bog92], but with unsatisfying results. Hamzaoui combined self-organizing maps with Fisher's classification


scheme (based on intensity and variance) and Saupe's nearest neighbor algorithm [Ham97a].

4.6. Excluding Impossible Matches

The aim of block classification or clustering is to group the most likely matches for the currently considered range block. This method is based on the opposite principle – the domain blocks that cannot be matched are excluded. The literature gives two ways in which this can be achieved.

The first utilizes inner products with a fixed set of vectors. The range and the domain are independently compared with a certain unit vector, and the domain block can provide a good approximation of the range block only when the results of these comparisons are similar. These comparisons provide a lower bound on the distance during the domain-range comparison, which allows many of the domains to be eliminated from the precise distance measurement. [Bed92]

The second solution is based on the distribution of energy within the image blocks, which is treated as a feature of the blocks. These features are used to detect distance inequalities. [Cas95]

4.7. Tree Structured Search

To facilitate the search, the blocks can be organized in a tree structure. An example of tree structured search is the solution proposed by Caso, Obrador and Kuo [Cas95]. The tree is built by a recursive algorithm as follows (a sketch in code follows the steps):

1. Choose the size sb of the buckets of domain blocks and assign the whole domain pool to the container PD.
2. Choose two random domain blocks – they become the parent blocks.
3. Assign each of the remaining domain blocks to the parent block that is closest to it (in the error sense). At this point, the initial set PD is divided into two subsets PD1 and PD2.
4. If the size of the subset PD1 is greater than sb, perform steps 2–4 for this subset (PD1 takes the role of PD in that recursion). Repeat this step for the subset PD2.
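A minimal recursive sketch of this construction, with a user-supplied block distance function and a guard against degenerate splits (the guard is an addition of this sketch), is:

```python
import random

def build_tree(pool, bucket_size, dist):
    """Recursively split the domain pool into a binary tree whose leaves
    are buckets of at most bucket_size blocks; dist(a, b) is the block
    distance measure used during encoding."""
    if len(pool) <= bucket_size:
        return {"bucket": pool}                # leaf: a bucket of domains
    p1, p2 = random.sample(pool, 2)            # step 2: two random parents
    left = [d for d in pool if dist(d, p1) <= dist(d, p2)]
    right = [d for d in pool if dist(d, p1) > dist(d, p2)]
    if not left or not right:                  # duplicate-block edge case
        return {"bucket": pool}
    return {"parents": (p1, p2),               # steps 3-4: partition, recurse
            "left": build_tree(left, bucket_size, dist),
            "right": build_tree(right, bucket_size, dist)}
```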

During the search, a range is compared to the domain blocks starting from the root. At each level of the binary tree, the parent node that is closer to the range block is chosen. The path of the range finishes when a bucket leaf is encountered – the closest domain block is chosen from this bucket or from the path leading to it, and this finishes the search process.

The fidelity of this method can be improved by extending the search to some nearby buckets. The authors provide a numerical test based on an angle criterion to determine the neighboring buckets.

There are plenty of other acceleration techniques that use a tree structure, e.g. the majority of nearest neighbor search techniques or multiresolution search.


4.8. Multiresolution Search

The method described in this section has many names: pyramidal search, multigrid method, multiresolution analysis, multilevel approach and hierarchical representation [Duf92, Ros84, Ram76, Tan96]. It was independently adapted to fractal image compression by Lin and Venetsanopoulos [Lin95a, Lin95b], Dekking [Dek95a, Dek95b] and Kawamata, Nagahisa and Higuchi [MK94]. The multiresolution approach is based on a tree structure where each level represents the blocks at a different resolution. The root of the tree contains the domains at the coarsest resolution. Successive levels contain copies of the same blocks at a resolution twice as large as the prior level (i.e. the block field is four times larger).

The search begins at the root of the tree; at each level, only those blocks are considered that are connected with the best match from the previous level, i.e. which are versions of the block found on the prior level but at a finer resolution. At each level of the pyramid, the range block is downsampled to the same resolution as the domain blocks. The computational cost, which is proportional to the product of the number of domains to be searched and the number of pixels in each block, is significantly reduced by this method.

Other methods related to the multiresolution approach were proposed by Caso, Obrador and Kuo [Cas95] and Bani-Eqbal [BE95].

4.9. Reduction of Time Needed for Distance Calculation

An important part of the encoding time is consumed by the calculation of the distance between the range and the codebook block; thus, improvements in this area are desirable.

The search time may be decreased by computing a partial distance, a technique known from vector quantization. In this method, an invariant representation is constructed from the Hadamard transform coefficients (the Haar transform was used for this purpose in [Cas95]) in zigzag scan order. The transform shifts most of the variance to the initial elements of the vector. [Bea90]

The most time-consuming part of the distance calculation is the computation of the inner product between the codebook and range blocks. The effectiveness of these computations may be significantly improved by calculating the convolution (cross-correlation) of a particular range block with the whole downfiltered image (in the frequency domain) – this yields the inner products between the range and all codebook blocks at once. The method takes advantage of the fact that the domain blocks (and thus also the codebook blocks) overlap in the image, and it does not bring about any trade-off between fidelity and speedup. Many acceleration methods may result in a suboptimal choice of the match between domains and ranges, but here the codebook block with minimal collage error is matched, i.e. no additional loss of information is caused.

Typically, the computations of the optimal scaling and offset coefficients, $s_i$ and $o_i$ respectively, are carried out from the formulas:

$$s_i^j = \frac{|C_j| \langle C_j, R_i \rangle - \langle C_j, \mathbf{1} \rangle \langle R_i, \mathbf{1} \rangle}{|C_j| \langle C_j, C_j \rangle - \langle C_j, \mathbf{1} \rangle^2}, \qquad o_i^j = \frac{\langle R_i, \mathbf{1} \rangle - s_i^j \langle C_j, \mathbf{1} \rangle}{|C_j|}$$


where $|C_j|$ is the number of pixels in a codebook block. The search proceeds in the following order (compare with the algorithm in section 2.2.2):

1. compute $\langle C_k, C_k \rangle$ and $\langle C_k, \mathbf{1} \rangle$ for all codebook blocks $C_k \in \mathbf{C}$
2. for each range $R_i \in \mathbf{R}$:
   a. compute $\langle R_i, R_i \rangle$ and $\langle R_i, \mathbf{1} \rangle$
   b. for all codebook blocks $C_j \in \mathbf{C}$:
      i. compute $\langle C_j, R_i \rangle$
      ii. compute the coefficients $s_i^j$ and $o_i^j$
      iii. compute the distance $d(\tau_i^{Ij}(C_j), R_i)$, where $\tau_i^{Ij}(C_j) = s_i^j C_j + o_i^j \mathbf{1}$ is the intensity transformation

It can be observed in this algorithm that the only values that have to be computed for one pair of codebook block and range block are $\langle C_j, R_i \rangle$, $\langle C_j, C_j \rangle$, $\langle C_j, \mathbf{1} \rangle$, $\langle R_i, R_i \rangle$ and $\langle R_i, \mathbf{1} \rangle$, because the coefficients $s_i^j$, $o_i^j$ and the distance measure $d$ are functions of these values. Of the five values, the computation of the inner products of the domains and ranges consumes the most time since it is performed in the innermost loop.

These computations are moved to a more outer loop by the application of the method based on fast convolution. The search process is modified in the following manner:

1. compute $\langle C_k, C_k \rangle$ and $\langle C_k, \mathbf{1} \rangle$ for all codebook blocks $C_k \in \mathbf{C}$
2. for each range $R_i \in \mathbf{R}$:
   a. compute $\langle R_i, R_i \rangle$ and $\langle R_i, \mathbf{1} \rangle$
   b. compute the convolution of the range $R_i$ with the downfiltered image
   c. for all codebook blocks $C_j \in \mathbf{C}$:
      i. compute the coefficients $s_i^j$ and $o_i^j$
      ii. compute the distance $d(\tau_i^{Ij}(C_j), R_i)$, where $\tau_i^{Ij}(C_j) = s_i^j C_j + o_i^j \mathbf{1}$ is the intensity transformation

The computations of the products $\langle C_k, C_k \rangle$ and $\langle C_k, \mathbf{1} \rangle$ can also be accelerated by the convolution technique.
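The frequency-domain step can be sketched as a single cross-correlation; using SciPy's fftconvolve for this purpose is an assumption of the sketch, not the thesis's implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def inner_products_with_all_domains(contracted_image, range_block):
    """Compute <C_j, R_i> for one range block against the codebook blocks
    at every position of the spatially contracted (downfiltered) image in
    one pass, as a cross-correlation evaluated in the frequency domain."""
    kernel = range_block[::-1, ::-1]   # flip: convolution -> correlation
    return fftconvolve(contracted_image, kernel, mode="valid")

# products[y, x] is the inner product of the range block with the
# codebook block whose top left corner lies at (x, y)
```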

4.10. Parallelization

The speed-up from reducing the computational complexity gives significant results, but it can be strengthened by parallelizing the coding algorithm. This is especially simple and promising since each range block is encoded independently – the domain searches are also independent. The literature [Lin97, Kwi01] gives several possible parallel implementations (a sketch of the second option follows the list):

• The process of choosing the best transformation parameters is done in parallel for parts of the transformation area or for each transformation.
• All range blocks are encoded in parallel – each range block is encoded in a separate thread/process.
• The domain block searches for each range block are implemented in parallel.

These options are not exclusive – for example, they can be combined in a nested parallel pattern: encoding different range blocks and searching the domain blocks for each range block are both parallelized.
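A skeleton of the second option, using Python's process pool (an implementation choice of this sketch, not of the thesis), could look as follows:

```python
from concurrent.futures import ProcessPoolExecutor

def encode_range(range_block):
    # search the domain pool for the best transformation of this range
    # block (contraction, optional isometries, intensity fit, distance);
    # the body is omitted here - each call is fully independent
    ...

def encode_image(range_blocks, workers=8):
    """Encode every range block in a separate process; the searches do
    not interact, so no synchronization is needed."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_range, range_blocks))
```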


Chapter 5

Proposed Method

The review of fractal compression methods presented in chapter 3 reveals two approaches to partitioning that are regarded by researchers as the best in the rate-distortion sense. The first one is based on hierarchical horizontal-vertical partitioning and the second one utilizes irregular regions.

Figure 5.1. Comparison of fractal compression methods based on irregular-shaped regions and HV partitioning: (a) performance of the compression methods according to the literature; (b) performance of the compression methods at compression ratios acceptable for medical images.

Figure 5.1, which presents the performance of the best fractal compression methods using each of these partitioning schemes, clearly shows that irregular regions are superior to HV partitions at higher compression ratios. However, since the image fidelity has to be preserved in medical imaging, the highest compression ratios will never be used, and the HV scheme turns out to be better at low or very low compression ratios.


Although the method with irregular partitions built from the product of a quadtree-based split phase [Och04] is far ahead of all other methods, there is no proof that it will give equally good results at lower compression ratios. Since the increase of PSNR with reduction of the compression ratio is smaller for the irregular regions, it is assumed that the horizontal-vertical approach is superior at low compression ratios.

Thus, horizontal-vertical partitioning was found to be the best option for medical images. This approach is implemented and tested. Nevertheless, it is also considered whether this hierarchical partitioning method can be adapted for use as the first stage (splitting) in the construction of irregular regions. The utilization of the HV method should give further improvement compared to irregular regions based on quadtree partitioning.

5.1. The Encoding Algorithm Outline

The design of the elaborated algorithm is typical for hierarchical partitioning methods. The outline of the algorithm is given in the block diagram in figure 5.2.

The presented algorithm always finds the codebook block that best matches the current range block. However, another approach is also considered, in which the search is interrupted after finding the first codebook block that yields an error fulfilling the tolerance criterion.

5.2. Splitting the Blocks

The method briefly described in section 3.1.3 is the basis of the block division algorithm in the proposed compression method. However, some improvements are introduced. In the following description, the symbols m and n denote the index of the column and row, respectively. The size of the block to be divided is width(Ri) × height(Ri).

The original algorithm was able to detect the most significant edge only when the row or column with the higher mean intensity came first while traversing the image. Two columns produce a negative value of vm when the mean intensity of the pixels in column m is smaller than the mean intensity in column m + 1. Because the maximal value of vm indicates the edge that will be used for division, the most significant edge may be missed. An analogous situation takes place while determining the maximal hn. The original formulas are:

$$v_m = \frac{\min(m,\, \mathrm{width}(R_i) - 1 - m)}{\mathrm{width}(R_i)} \cdot \left( \sum_{n}^{\mathrm{height}(R_i)-1} r_{m,n} - \sum_{n}^{\mathrm{height}(R_i)-1} r_{m+1,n} \right)$$

$$h_n = \frac{\min(n,\, \mathrm{height}(R_i) - 1 - n)}{\mathrm{height}(R_i)} \cdot \left( \sum_{m}^{\mathrm{width}(R_i)-1} r_{m,n} - \sum_{m}^{\mathrm{width}(R_i)-1} r_{m,n+1} \right)$$


Figure 5.2. Hierarchical fractal encoding. Block diagram.


To ensure that the actual most significant edge will become the cutting line, theformulas used to calculate vm and hn are slightly modified:

$$v_m = \frac{\min(m,\, \mathrm{width}(R_i) - 1 - m)}{\mathrm{width}(R_i)} \cdot \left| \sum_{n}^{\mathrm{height}(R_i)-1} r_{m,n} - \sum_{n}^{\mathrm{height}(R_i)-1} r_{m+1,n} \right|$$

$$h_n = \frac{\min(n,\, \mathrm{height}(R_i) - 1 - n)}{\mathrm{height}(R_i)} \cdot \left| \sum_{m}^{\mathrm{width}(R_i)-1} r_{m,n} - \sum_{m}^{\mathrm{width}(R_i)-1} r_{m,n+1} \right|$$

The first factor in each of the above formulas prevents the creation of very thin or very flat rectangles. It will be called "Fisher's rectangle degradation prevention mechanism", and it utilizes the function $g_{Fisher}$, which is applied to the possible cutting line locations (indexes of the columns/rows):

$$g_{Fisher}(x) = \frac{\min(x,\, x_{max} - x)}{x_{max}}$$

where $x \in \langle 0, x_{max} \rangle$ and $x$ is an integer. Thus the formulas for $v_m$ and $h_n$ can be presented in the following form:

$$v_m = g_{Fisher}(m) \cdot E^v_{ED}(m), \qquad h_n = g_{Fisher}(n) \cdot E^h_{ED}(n)$$

where the functions $E^v_{ED}$ and $E^h_{ED}$ are used to detect the most significant vertical and horizontal line:

$$E^v_{ED}(m) = \left| \sum_{n}^{\mathrm{height}(R_i)-1} r_{m,n} - \sum_{n}^{\mathrm{height}(R_i)-1} r_{m+1,n} \right|$$

$$E^h_{ED}(n) = \left| \sum_{m}^{\mathrm{width}(R_i)-1} r_{m,n} - \sum_{m}^{\mathrm{width}(R_i)-1} r_{m,n+1} \right|$$

a great influence on the product of these formulas. Because of this, the search forthe most significant edge may be misled. Because of such mistake, a very significantedge might be found in the middle of a block that cannot be divided (because of thesize of the block). In order to inspect if this feature of the original mechanism hasa negative influence on the resulting image segmentation and compression quality, asimple alternate version of the above formulas were elaborated, which minimize theinfluence of the mechanism on the outcome of block splitting. This alternate formulasuse a binary function for the degradation prevention. The function eliminates cuttinglines location that cannot produce two rectangles meeting the range size thresholdsizet. Any other locations are given same weights:

$$g_{flat}(x) = \begin{cases} 0 & \text{when } (x < size_t) \lor (x_{max} - x < size_t) \\ 1 & \text{otherwise} \end{cases}$$

The functions $E^v_{ED}$ and $E^h_{ED}$ remain unchanged. The "flat" rectangle degradation prevention mechanism gives the following $v_m$ and $h_n$ functions:


$$v_m = \begin{cases} 0 & \text{when } (m < size_t) \lor (\mathrm{width}(R_i) - 1 - m < size_t) \\ \left| \sum_n r_{m,n} - \sum_n r_{m+1,n} \right| & \text{otherwise} \end{cases}$$

$$h_n = \begin{cases} 0 & \text{when } (n < size_t) \lor (\mathrm{height}(R_i) - 1 - n < size_t) \\ \left| \sum_m r_{m,n} - \sum_m r_{m,n+1} \right| & \text{otherwise} \end{cases}$$
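As an illustration, the following sketch scores all vertical cutting lines of a block with the edge-detection criterion combined with the flat degradation prevention mechanism (a zero maximum means no admissible cut exists):

```python
import numpy as np

def best_vertical_cut(block: np.ndarray, size_t: int):
    """Score every vertical cutting line m of a range block with
    v_m = g_flat(m) * E_ED(m); returns (m, score)."""
    col_sums = block.sum(axis=0)                  # sums over the rows n
    e = np.abs(col_sums[:-1] - col_sums[1:])      # E_ED(m), m = 0..w-2
    m = np.arange(e.size)
    x_max = block.shape[1] - 1                    # width(R_i) - 1
    g = ((m >= size_t) & (x_max - m >= size_t)).astype(float)
    v = g * e
    return int(v.argmax()), float(v.max())
```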

In the work devoted to optimal hierarchical partitions [Sau98], a different criterion is used to determine the line along which the block shall be divided. The split is performed in such a way that the sum of the square errors between the resulting blocks and blocks consisting of pixels equal to the average intensity of the appropriate block is minimized. Thus, the cutting line minimizes the sum of the intensity variances within the two new blocks. The formulas for the errors are:

$$E^v_{VM}(m) = \sum_{i=0}^{m} \sum_{j=0}^{\mathrm{height}(R_i)-1} (r_{i,j} - dc_{left}(m))^2 + \sum_{i=m+1}^{\mathrm{width}(R_i)-1} \sum_{j=0}^{\mathrm{height}(R_i)-1} (r_{i,j} - dc_{right}(m))^2$$

$$E^h_{VM}(n) = \sum_{i=0}^{\mathrm{width}(R_i)-1} \sum_{j=0}^{n} (r_{i,j} - dc_{top}(n))^2 + \sum_{i=0}^{\mathrm{width}(R_i)-1} \sum_{j=n+1}^{\mathrm{height}(R_i)-1} (r_{i,j} - dc_{bottom}(n))^2$$

The minimal values of these formulas yield the best vertical and horizontal cutting lines. If $\min(v_0, v_1, \dots, v_{\mathrm{width}(R_i)-1}) \le \min(h_0, h_1, \dots, h_{\mathrm{height}(R_i)-1})$, then the division is done along the vertical line, otherwise along the horizontal one.

Also, a different rectangle degradation prevention mechanism is used. First of all, the resulting rectangles cannot have fewer than 2 pixels in width or height. Besides, the sum of square errors is multiplied by the following function:

$$g_{Saupe}(x) = 0.4 \left[ \left( \frac{2x}{x_{max}} - 1 \right)^2 + 1 \right]$$

where $x = m$ and $x_{max} = \mathrm{width}(R_i) - 1$, or $x = n$ and $x_{max} = \mathrm{height}(R_i) - 1$, depending on whether $v_m$ or $h_n$ is being calculated. The formulas for $v_m$ and $h_n$ in this approach are as follows:

$$v_m = g_{Saupe}(m) \cdot E^v_{VM}, \qquad h_n = g_{Saupe}(n) \cdot E^h_{VM}$$

The $v_m$ and $h_n$ values are not maximized as in Fisher's method but minimized.

Altogether, there are six possible methods of dividing blocks. These methods are created by combining different elements of the above-described approaches:

• Edge detection with Fisher's rectangle degradation prevention mechanism:

$$v_m = g_{Fisher}(m) \cdot E^v_{ED}, \qquad h_n = g_{Fisher}(n) \cdot E^h_{ED}$$

• Variance minimization with Saupe's rectangle degradation prevention mechanism:

$$v_m = g_{Saupe}(m) \cdot E^v_{VM}, \qquad h_n = g_{Saupe}(n) \cdot E^h_{VM}$$

• Edge detection with Saupe's rectangle degradation prevention mechanism:

$$v_m = g^*_{Saupe}(m) \cdot E^v_{ED}, \qquad h_n = g^*_{Saupe}(n) \cdot E^h_{ED}$$

• Variance minimization with Fisher's rectangle degradation prevention mechanism:

$$v_m = g^*_{Fisher}(m) \cdot E^v_{VM}, \qquad h_n = g^*_{Fisher}(n) \cdot E^h_{VM}$$

• Edge detection with the flat rectangle degradation prevention mechanism:

$$v_m = g_{flat}(m) \cdot E^v_{ED}, \qquad h_n = g_{flat}(n) \cdot E^h_{ED}$$

• Variance minimization with the flat rectangle degradation prevention mechanism:

$$v_m = g_{flat}(m) \cdot E^v_{VM}, \qquad h_n = g_{flat}(n) \cdot E^h_{VM}$$

When Fisher's rectangle degradation prevention mechanism is used with Saupe's splitting technique (variance minimization), or Saupe's mechanism is used with the splitting technique based on Fisher's work (edge detection), the mathematical function used to prevent rectangle degradation has to be changed. The new function ($g^*$) is based on the original one ($g$): $g^*(x) = \max(g) - g(x)$, where $\max(g)$ denotes the maximum of the function $g$. Thus, Saupe's rectangle degradation prevention mechanism is changed to:

$$g^*_{Saupe}(x) = 0.8 - 0.4 \left[ \left( \frac{2x}{x_{max}} - 1 \right)^2 + 1 \right]$$

and Fisher’s mechanism to

$$g^*_{Fisher}(x) = \frac{1}{2} - \frac{\min(x,\, x_{max} - x)}{x_{max}}$$

This change is necessary because the two splitting techniques use different methods to calculate the cutting line and to decide whether the horizontal or the vertical line shall be used.

One has to keep in mind that when the range block division is made along the most significant horizontal/vertical edge (the splitting method based on Fisher's work), the maximal value of all the $v_m$ and $h_n$ values points out the cutting line position ($m \in \langle 0, \mathrm{width}(R_i) - 1 \rangle$, $n \in \langle 0, \mathrm{height}(R_i) - 1 \rangle$). When the division should minimize the variance in the produced blocks (Saupe's approach), the minimal value of all the $v_m$ and $h_n$ values determines the position of the cutting line.

Independently of the approach used to split blocks and of the rectangle degradation prevention mechanism, the blocks are divided only if no codebook block can be found for the current range. Such a situation may be caused by the size of the range block – the range block has to be at least CF times smaller than the image being encoded (CF denotes the contraction factor used by the spatial contraction transformation τC). However, the most frequent reason for the necessity of dividing a range is that the best match for the range does not meet the error threshold. The error threshold (also called the distance tolerance criterion) is set by the user before starting the encoding.


The output of this phase may be twofold. If there exists no division of the range block that produces two blocks with sides larger than or equal to the range size threshold sizet (preset by the user), the range block is sent to be stored in the fractal code. The second possible output is a pair of range blocks divided by the mechanism in force for the current encoding process. If the splitting ends with success, the divided range block Ri is omitted from further processing and the two new ranges Ri1, Ri2 are added to the queue of ranges to be encoded.

5.3. Codebook

In the proposed encoding method, each codebook block is described by its position in the spatially contracted image, its block size and the identifier of the symmetry operation applied to it. During the encoding, the codebook blocks are not stored and processed as matrices of pixels, only their descriptions, which allow the proper pixels to be loaded from the image when they are needed.

For clarity, the term "sub-codebook" is introduced; it shall be understood as the subset of the codebook's blocks that contains only those codebook blocks that will be compared with a given range block during the search. All same-sized range blocks are compared with codebook blocks from the same sub-codebook. In fractal compression based on uniform partitioning, there would be only one sub-codebook, and it would contain all codebook blocks. The HV partitioning method is characterized by a high diversity of range block shapes, which results in a great number of sub-codebooks. The more sub-codebooks there are, the larger the codebook is – more time is needed to build it and more memory is needed to store it.

Two different approaches to the codebook are proposed and investigated.

5.3.1. On-the-fly Codebook

The first type, called the "on-the-fly codebook" or the "light codebook", is only a piece of functionality that takes as input the position of the last considered codebook block and produces the next codebook block. If symmetry operations are allowed, the identifier of the symmetry operation applied to the last codebook block also has to be passed.

The on-the-fly codebook creates the codebook block descriptions during the search process (when they are needed) and automatically disposes of the descriptions if the codebook blocks cannot be matched with the currently considered range block. This means that no collection of codebook blocks exists and that the sub-codebooks are created independently for each range block, even if there are range blocks that require exactly the same sub-codebooks (this takes place when two range blocks have exactly the same dimensions and the algorithm pairs each range block with the best matching codebook block).

If the algorithm is not forced to find the best match for each range block, the building process of a sub-codebook for a concrete range block ends when the first matching codebook block is found. After the creation of a codebook block, its inner products are calculated ($\langle C, C \rangle$ and $\langle C, \mathbf{1} \rangle$, necessary to calculate the coefficients of the


intensity transformation). The independence in building sub-codebooks results in repetition of these calculations.
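The on-the-fly behavior maps naturally onto a generator; the sketch below is a hypothetical illustration in which resuming the generator plays the role of passing in the last considered block.

```python
def light_codebook(contracted_w, contracted_h, block_w, block_h,
                   step=1, symmetry_ops=1):
    """On-the-fly ("light") codebook: yield codebook block descriptions
    (position and symmetry operation id) one at a time, so no collection
    of blocks is ever kept in memory."""
    for sym in range(symmetry_ops):
        for y in range(0, contracted_h - block_h + 1, step):
            for x in range(0, contracted_w - block_w + 1, step):
                yield (x, y, sym)   # pixels are loaded from the image later
```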

5.3.2. Solid Codebook

The "solid codebook" also provides a codebook block when it is needed during the search, in exactly the same manner as the light codebook. However, the solid codebook is also a container for the codebook blocks, which is filled up in a preliminary phase. For all codebook blocks that will be compared with more than one range block, the inner products are computed only once because all created codebook block descriptions are stored for further use.

The drawback of this codebook is the time cost caused by the preliminary phase. Another drawback is the fact that blocks that cannot be included in any sub-codebook might be added to the solid codebook, which increases the time cost and memory use. Even if a codebook block falls into a sub-codebook, it is not guaranteed to be needed – for instance, when the user does not wish to always find the best match and all range blocks of the same size are paired with a codebook block before all other blocks within the same sub-codebook are tested.

These drawbacks do not occur in the light codebook, where only those codebook block descriptions that are needed are created, but some of them have to be constructed (with inner product calculations) from scratch more than once.

The most time-consuming part of creating the codebook block descriptions is the calculation of the inner products. In order to avoid calculating inner products for codebook blocks that will never be used during the encoding, it is possible to postpone these calculations until the first access to the codebook block.

5.3.3. Hybrid Solutions

In order to balance the advantages and disadvantages of the two presented approaches to codebooks, a hybrid codebook is introduced. Such a codebook combines the advantages of the two above-presented approaches while downplaying their drawbacks.

Instead of creating a solid codebook filled with blocks, only a framework of the codebook is created in the preliminary phase. The solid codebook is filled during the encoding using the light codebook. Simply, before searching for a codebook block matching a given range block, the algorithm checks whether range blocks of the same size (i.e. range blocks that use the same sub-codebook) have already been encoded. If there were such range blocks, then successive codebook blocks are taken from the solid codebook, otherwise from the light codebook.

When the light codebook is in use, all codebook blocks provided by it are automatically inserted into the solid codebook. If the search is to be interrupted after finding the first codebook block that fulfills the error tolerance condition (according to the user's will), then all untested codebook blocks also have to be defined by the light codebook in order to make the solid codebook complete.

The larger a range block is, the more possible proportions of its side lengths exist. Each such proportion generates a separate sub-codebook. At the same time, the larger a codebook block is, the smaller the number of same-sized codebook blocks that can be packed in the image (the offset between adjacent codebook blocks is the same regardless of their size)


– thus, most probably, the larger a codebook block is, the less numerous its sub-codebook will be. This observation is the basis of the next way of combining the solid and light codebooks – all range blocks smaller than some user-given size use the solid codebook (which is created only up to this preset size), and all larger range blocks are bound to the light codebook.

5.4. Symmetry Operations

The implemented compression algorthm can optionally use isometries while thecodebook is created. The isometries are realized by permuting the pixels of the block.The top left corner of the block remains in the same location after any of the symmetrytransformation because by this vertex the location of the codebook block in the imageis described. There are eight possible symmetry operations (presented in figure 5.4):• identity: τS1 (ci,j) = ci,j• orthogonal reflection about mid-horizontal axis: τS2 (ci,j) = ci,height−1−j• orthogonal reflection about mid-vertical axis: τS3 (ci,j) = cwidth−1−i,j• orthogonal reflection about first diagnoal: τS4 (ci,j) = cj,i• orthogonal reflection about second diagonal: τS5 (ci,j) = cheight−1−j,width−1−i• rotation around point (max(width, height)/2,max(width, height)/2) by 90 deg:τS6 (ci,j) = cj,width−1−i

• rotation around the point (width/2, height/2) by 180 degrees: τ_{S7}(c_{i,j}) = c_{width−1−i, height−1−j}

• rotation around the point (min(width, height)/2, min(width, height)/2) by −90 degrees: τ_{S8}(c_{i,j}) = c_{height−1−j, i}

The symmetry operations are used in the above order, i.e. when all codebook blocks

transformed with the symmetry operation indexed l in the above list have been considered, the contracted domain blocks are subjected to the (l + 1)-th symmetry operation and the resulting blocks are added to the solid codebook or tested one after another (light codebook).

The isometries enlarge the codebook (and each sub-codebook) eight times; thus,

it should be possible to find a better match for each range block because there are more options. However, a higher number of codebook blocks to consider also results in a longer encoding time. Because the benefit of isometries is disputed in the literature (see 3.3.2), it will be checked experimentally whether their use brings any gain. A sketch of the pixel permutations is given below.
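For concreteness, the permutations can be implemented as a single routine; the following Java sketch is illustrative only (applyIsometry is a hypothetical helper, blocks are assumed stored as int[width][height], and the two rotation cases may differ from the formulas above by the inverse orientation convention):

    // Applies symmetry operation op (1..8) to block c; returns a new block.
    // For operations that swap the axes (S4, S5, S6, S8) the resulting block
    // has transposed dimensions.
    static int[][] applyIsometry(int[][] c, int op) {
        int w = c.length, h = c[0].length;
        boolean swap = (op == 4 || op == 5 || op == 6 || op == 8);
        int[][] out = swap ? new int[h][w] : new int[w][h];
        for (int i = 0; i < w; i++) {
            for (int j = 0; j < h; j++) {
                switch (op) {
                    case 1: out[i][j] = c[i][j]; break;                 // identity
                    case 2: out[i][h - 1 - j] = c[i][j]; break;         // horizontal axis
                    case 3: out[w - 1 - i][j] = c[i][j]; break;         // vertical axis
                    case 4: out[j][i] = c[i][j]; break;                 // first diagonal
                    case 5: out[h - 1 - j][w - 1 - i] = c[i][j]; break; // second diagonal
                    case 6: out[j][w - 1 - i] = c[i][j]; break;         // rotation (S6)
                    case 7: out[w - 1 - i][h - 1 - j] = c[i][j]; break; // 180 degrees
                    case 8: out[h - 1 - j][i] = c[i][j]; break;         // rotation (S8)
                    default: throw new IllegalArgumentException("op");
                }
            }
        }
        return out;
    }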

5.5. Constructing the Fractal Code

5.5.1. Standard Approach

The method of constructing the binary code for the fractal operator determined by HV-partition-based fractal compression is similar in both Fisher's [Fis92b] and Saupe's [Sau98] methods. The code consists of information about the partition and information about the transformations. The partition information is an overhead brought about by the HV partitioning method; other schemes, e.g. those based on uniform squares or quadtrees, do not require this type of information to be attached to the binary representation of the fractal operator. Here

Figure 5.3. Isometries for a rectangle block: (a) identity; (b) orthogonal reflection about the horizontal axis; (c) orthogonal reflection about the vertical axis; (d) orthogonal reflection about the first diagonal; (e) orthogonal reflection about the second diagonal; (f) rotation by 90 degrees; (g) rotation by 180 degrees; (h) rotation by −90 degrees.

the description of the whole tree of range blocks is stored. The binary tree is created during encoding: when a block is divided, it automatically becomes the parent node of the newly created blocks. When the encoding is finished, the range blocks that constitute the fractal operator are the leaves of this tree.

The proposed algorithm for this standard approach is as follows. The fractal code is

constructed by traversing the tree. Each node finds its reflection in the code, but different information is stored for inner nodes and for leaves. The description of an inner node R_IN consumes L bits:

L = 1 + ⌈log2(width(R_IN) − 1 − size_t)⌉   when R_IN is divided vertically
L = 1 + ⌈log2(height(R_IN) − 1 − size_t)⌉  when R_IN is divided horizontally

One bit stores whether the block is divided vertically or horizontally, and the remaining bits contain the position of the cutting line. The number of bits for the line position varies: the smaller the range, the fewer bits are needed. The number of bits can be reduced further by using the knowledge of the minimal block size to eliminate impossible positions of the line; the range size threshold needs to be stored only once for all ranges (in the file header).
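For illustration, the bit cost of an inner node can be computed directly from this formula (a hypothetical Java helper, not part of the thesis code):

    // Bits needed to describe an inner node of the HV partition tree:
    // 1 bit for the split direction plus the cutting-line position, with the
    // positions made impossible by the minimal block size excluded.
    static int innerNodeBits(int width, int height, int sizeT, boolean verticalSplit) {
        int positions = (verticalSplit ? width : height) - 1 - sizeT;
        return 1 + (int) Math.ceil(Math.log(positions) / Math.log(2));
    }

For example, a node of width 64 with size_t = 2 divided vertically gives 1 + ⌈log2 61⌉ = 7 bits.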


The leaf ranges produce code that describes the transformation they are part of. The description of a transformation consists of the position of the range block (implied by the descriptions of the inner nodes), the position of the domain block, and the coefficients of the intensity transform. If isometries are allowed, an identifier of the symmetry operation also has to be added to this description (whether or not isometries are used is stored in the file header). If the scaling coefficient of the intensity transformation is equal to zero, then the codebook block location is not stored in the fractal code: the block is replaced by a fixed (uniform) block at the decoder without any loss of quality.

The position of the domain block is normally stored on

⌈log2[(M − 1 − size_t) · (N − 1 − size_t)]⌉ bits. However, the author of this dissertation introduces the location of the codebook block into the fractal code instead of specifying the location of the domain block. The change has almost no effect on the fractal compression algorithm and the file format, but it allows saving 2 · log2(CF) bits per transformation, where CF is the contraction factor of the τ_C transformation.

Codebook blocks cannot be placed at an arbitrary point of the image:

their top left vertex must be in accordance with the domain offset set by the user. This means that the coordinates of each domain block are divisible by the domain offset, and the coordinates of each codebook block are divisible by domainOffset/CF. This remark becomes the foundation of the next proposed improvement: the coordinates of any codebook block are translated to the coordinate system with a unit size equal to domainOffset/CF. This step allows saving log2(domainOffset/CF) bits per codebook block. Since for some transformations, where the scaling coefficient of the intensity transform is zero, no codebook block location has to be stored at all, this amounts to almost one bit saved per transformation.

The quantized scaling coefficient of the intensity transform is stored on 5 bits

and the offset on 7 bits, values reported as optimal in [Fis92a]. It is worth mentioning that in the optimized hierarchical method of Saupe et al. [Sau98] only 6 bits were used for the offset.

For the symmetry operation indicator it is enough to spend 3 bits, because there

are only 8 possible isometries. As already mentioned, it is doubtful whether the use of isometries has any positive effect. Thus, encoding without symmetry operations will also be examined; it should save not only these three bits per transformation but also time, because of the smaller codebook. There is one more way to save some disk space.

5.5.2. New Approach

The above-presented manner of constructing the fractal code has proved itself by giving rate-distortion performance that placed HV-partition-based fractal compression methods among the best. However, medical images form such a characteristic class of digital images that one can attempt to build a file format that adapts best to the features of this class.

The Idea

The fidelity of medical images has to be preserved at a very high level. At the same time, medical images contain natural objects, so it is hard to find flat regions in the image. The fidelity of such images can be assured only by an adequately large number of transformations, which results in rather small range blocks, because the size of the original images is restricted. This assumption is the basis of the proposed approach to encoding range blocks into binary code.

The original approach presented in the previous section stores the whole partition tree.

This becomes a disadvantage when the tree has many levels: the smaller the range blocks, the higher the tree and, consequently, the larger the information overhead. The elaborated method attempts to describe only the leaves of the tree in an efficient way, in order to avoid the superfluous descriptions of the non-leaf nodes.

As stated in the introduction to this chapter, irregular partitioning performs much better than the uniform/quadtree partitioning method, and the application of adaptive quadtree partitioning to the splitting phase gave a further improvement. Thus, one may consider creating a fractal compression method with irregular partitions based on HV partitioning instead of the quadtree approach. Because the HV scheme is clearly superior to the quadtree scheme, it may be expected that irregular partitioning based on HV ranges will outperform the irregular-region method based on the quadtree scheme. The standard approach, which stores the entire partitioning tree, is inapplicable here, because a fractal compression method based on irregular regions modifies the partitioning of the image by merging some range blocks, so the final shape of the range blocks cannot be stored by writing the locations of the cutting lines in the inner nodes. The proposed representation of the fractal operator, in contrast, can easily be adapted to irregular regions (as explained in section 5.5.4).

Bit Allocation per Transformation

The simplest way to eliminate the inner nodes from the fractal code is to describe each range block by storing its location and dimensions. However, this would be highly inefficient, because the location would take ⌈log2[(M − 1 − size_t) · (N − 1 − size_t)]⌉ bits (M × N is the image size and size_t the range size threshold). Moreover, the size of the range would take the same amount of space. For example, after encoding a 512 × 512 image, a single range would take 2 · 18 bits for the location and dimensions of the range, another 18 bits for the location of the domain block, 12 bits for the intensity transformation coefficients and 3 bits for the isometries. All this together gives 69 bits per single transformation. According to [Sau98], 10000 ranges result in a PSNR of 39.1, which is a satisfactory result. With the above-mentioned method of range block description, this number of ranges gives 2.632 bits per pixel and a compression ratio of 3.04. The original approach to storing the transformations gives here a compression ratio of 6.47, 32.4 bits per range and 1.236 bits per pixel [Sau98]. This means that the simple solution is not a good one: it performs twice as poorly as the tested safe solution.

Thus, the goal of the new approach is to store the location and size of the range

on a smaller number of bits. This can be achieved by translating the position of a range block into another coordinate system. One way to do this is to store the position of the range relative to the previous range. The distance between two ranges is minimal, or close to minimal, when:
• the blocks have one common border, or


• the blocks have one common vertex.
This solution has several drawbacks. First of all, it is a very weak improvement, because the height and width of the ranges are still constrained only by the size of the image and the factor CF of the contraction transformation, i.e. they can reach values as high as M/CF and N/CF respectively. Thus, if the contraction factor CF is equal to 2, only one bit can be saved per transformation. Another problem is the ordering of the transformations before storing them to the file: it can happen that two ranges encoded one after another adjoin opposite borders of the image. This, however, can be solved by allowing wrapping around the image. The last drawback of this solution is that the space needed for the size of the range is not reduced at all. Overall, this attempt gives a very weak improvement and remains inferior to the original approach.

Figure 5.4. The structure of the fractal code describing a single transformation depending on the position of the range block with respect to the underlying cell: (a) whole range block in one cell; (b) bottom right vertex in the neighboring cell to the East; (c) bottom right vertex in the neighboring cell to the South; (d) bottom right vertex in the neighboring cell to the South-East; (e) bottom right vertex in a distant cell.

The author proposes instead to place a grid onto the image, which is much more promising. All cells in the grid have the same size. In order to utilize the bit allocation to the maximal extent, the width and height of a cell should be a power of two.

The location of each range is translated to the coordinate system with the

point (0, 0) in the upper left corner of the cell that encloses the upper left vertex of the range block. This translation can be performed without any major cost. All ranges of one cell are grouped together in the fractal code, in order to avoid placing the coordinates of the cell (in the image's coordinate system) before the coordinates of every range block (in the proper cell's coordinate system).

The order of traversing the cells can be fixed at the encoder and at the decoder,

and at the beginning of each group of ranges lying in the same cell, the number of ranges in the group is placed in the fractal code. These two pieces of information, the traversal order and the number of ranges in each cell, are enough to find the location of each cell, and thus also to retranslate the coordinates of each range block back to the image's coordinate system. Since the bit allocation per transformation can be fully determined by the decoder, no other information is necessary.

The solution can give a considerable disk space saving, which is investigated

on the example where a 512 × 512 image is encoded into 10000 ranges. The average number of pixels in a range is then about 26. When a grid with cell size 32 × 32 is put on the image, almost 40 ranges on average can be packed into one cell. The location of a range block is then stored on 5 bits instead of 9 bits, but the number of ranges within the cell has to be stored for each cell (i.e. once per about 40 ranges). This number can be stored on ⌈log2(32²/size_t²)⌉ bits. If the size threshold is 2, then 8 bits are needed to store the number of ranges within one cell, which gives a cost of 0.2 bits per range block. Compared to the simple solution presented before, 3.8 bits are saved per transformation. This is still not enough, and further improvements must be made.
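The per-cell overhead above can be verified directly (all numbers from the example):

\[
\left\lceil \log_2 \frac{32^2}{size_t^2} \right\rceil
= \left\lceil \log_2 \frac{1024}{4} \right\rceil
= \lceil \log_2 256 \rceil = 8 \text{ bits per cell},
\qquad
\frac{8\ \text{bits}}{\approx 40\ \text{ranges per cell}} \approx 0.2\ \text{bits per range.}
\]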

When there is a large number of ranges, most of them are rather small, so there is a good chance that an entire range block will fit into one cell. This gives an opportunity for efficient storing of the range sizes. Since there will always be ranges lying on the edges of the cells, a single bit is introduced to indicate whether the bottom right vertex of the range lies in the same cell as the upper left one. If this bit is set to 1, the coordinates of the bottom right vertex are translated to the coordinate system of the cell of the upper left vertex; otherwise, the location of the second vertex is given in the coordinate system of the image. Thus, instead of storing the width and the height of the range block, the author suggests storing the coordinates of two opposite vertices, with utilization of the grid put on the image.

The next step made by the author to optimize the fractal code length is based on the

observation that there can be two reasons why a range block does not fit into a cell. The first reason is that the width or the height of the block is larger than the cell. Such ranges can always appear, and the absolute coordinates of the second vertex have to be stored in these cases.

However, the second reason why the bottom right vertex may lie on the other side of

the cell border than the upper left vertex can be used to shorten the fractal code. This situation takes place when the upper left vertex lies too close to the cell border, and it can happen even with range blocks of the minimal allowed size. To eliminate the large overhead of storing the second vertex location, a neighborhood of the cells is introduced.

Figure 5.5. Grid put on an image. The currently processed cell and the neighboring cells are marked with labeled triangles. The order of traversing the grid is shown by the arrows in the background.

Instead of the absolute location of the second vertex, the neighboring cell in which the vertex lies is indicated, together with the location of the vertex in the coordinate system of this neighboring cell.

When ranges with width and height smaller than the width/height of the cells are

taken into consideration, there are only three neighboring cells in which the second vertex can lie. Vertices of blocks whose width or height exceeds that of the cell, but is smaller than the corresponding cell dimension multiplied by two, can also lie in these cells; in this situation it can additionally happen that the bottom right vertex lies beyond the borders of the neighboring cells. If a range has width or height larger than twice the width/height of the cells, the second vertex will always lie beyond the borders of the neighboring cells. However, such situations are expected to occur rarely in accurate compression of medical images.

To determine whether the second vertex lies in a neighboring cell, and to appoint that

cell, two bits are needed. Three of the four possible values of these two bits indicate the neighboring cell; in our case the neighboring cells are the ones to the East, to the South, and to the South-East. The fourth value indicates that the bottom right vertex does not lie in any of the neighboring cells. For a range block with width or height larger than the side of the cells, an oversized representation is forced by the solutions described here; the size of the cells should be automatically adjusted in order to make such situations rare. A sketch of this indicator encoding is given below.
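A minimal Java sketch of the two-bit indicator follows; the constant names and the encode helper are illustrative, and the indicator is only emitted when the same-cell bit is 0:

    // Two-bit indicator for the cell containing the bottom right vertex.
    // Three values select a neighboring cell; the fourth marks a distant
    // cell whose identifier follows in the code (Huffman-coded, see below).
    final class VertexCellIndicator {
        static final int EAST       = 0;
        static final int SOUTH      = 1;
        static final int SOUTH_EAST = 2;
        static final int DISTANT    = 3;

        // dx, dy: grid offsets of the vertex's cell from the current cell.
        static int encode(int dx, int dy) {
            if (dx == 1 && dy == 0) return EAST;
            if (dx == 0 && dy == 1) return SOUTH;
            if (dx == 1 && dy == 1) return SOUTH_EAST;
            return DISTANT; // cell identifier must be appended separately
        }
    }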

Theoretically, it makes no difference whether the location of a second vertex lying in a distant cell is expressed in terms of the image coordinate system, or whether the identifier of the cell in which it lies is given first, followed by the coordinates in the system of that cell. However, the second solution creates new possibilities for further improvements. The number of cells is relatively small compared to the number of pixels in the image. Thus, the cells can be treated as letters of an alphabet and encoded with a variable-length code. Some of the cells will obviously be referenced more often, because the distribution and size of the range blocks is not uniform when a natural image is compressed with adaptive fractal compression. Thus, there is a great chance to shorten the code length. Adaptive Huffman coding seems to be the right tool for this purpose; the elaborated adaptive Huffman coding method is presented in the next section.

The processes of saving transformations to the file and reading them from the file begin

in the bottom right cell. In this cell there cannot be any ranges that do not lie entirely within the cell, because the right and bottom borders of the cell coincide with the borders of the image, and the ranges lying on the North and West borders are included in the neighboring cells instead of this one. While the ranges of this cell are processed, no Huffman tree exists yet. When the processing moves to the next cell, the bottom right cell becomes the first node of the tree. The cells are traversed in rows from right to left, and after a row is finished the processing moves to the row above. After saving/reading all ranges of a given cell, the cell (as an alphabet symbol) is added to the Huffman tree. The more symbols there are in the tree, the longer the codes of some of the cells become; however, the most frequently referenced cells are assigned the shortest codes.

Adaptive Huffman Coding of Cell Identifiers

Huffman coding is a lossless entropy coding method that gives the best results among symbol-by-symbol coding methods. A symbol-by-symbol method assigns code words to successive input symbols in accordance with their order in the input stream. The code words have variable length and are in one-to-one relation with the input symbols. Better lossless compression results can be achieved with only one method, arithmetic coding, which is also an entropy coding technique. However, this method cannot be applied in connection with fractal coding, because it carries a much higher computational cost, which is already too high in the fractal method.

The static version of Huffman coding calculates the occurrence frequencies of

the symbols and then builds the Huffman tree based on that knowledge. The more times a symbol occurred, the closer it lies to the root of the Huffman tree; the number of occurrences becomes the weight of the symbol. The code words are created simply by traversing the tree from the root to the leaf with the currently considered symbol: at each node of the path a binary literal is added to the code word, 0 if the node is the left child of its parent and 1 otherwise.

The static algorithm has two basic drawbacks. The first one is that the frequency

table must be attached to the code. The second problem is the performance of the algorithm when the occurrence frequencies are similar: the compression ratio is positive only when the input symbols have oversized binary representations in the input stream. The adaptive version of the Huffman coding algorithm solves both problems; here the Huffman tree is rebuilt after encoding/decoding each input symbol.

However, the adaptive version brings about some new problems. The first problem

is how to encode a symbol that occurs for the first time in the input stream.

Figure 5.6. Example of a Huffman tree used for constructing the binary representation of the fractal operator. Nodes are marked with weights and with indexes in the ordered list.

In such a situation, the Huffman tree contains only a subset of the input stream alphabet, namely the symbols that occurred in the already encoded part of the input stream. There are two ways to solve this. The first one is to add a leaf to the tree that represents any not-yet-transmitted symbol (NYT). When the encoder encodes some symbol for the first time, it stores the code of the NYT leaf followed by the binary representation of the new symbol (which requires log2|A| bits, where A denotes the alphabet of the input stream and |A| its size). The second solution is to add all input symbols to the tree with weight 1 before the encoding/decoding starts.

The proposed Huffman method differs from the basic algorithms; it is designed

especially for coding the cell identifiers in the fractal code of the format described above. Here the nodes can have weights either in the range (0, 1) or equal to a positive integer.

New symbols are added to the tree as soon as there is a chance that they will be

used. The order of traversing the cells while storing the transformations to the file ensures that no cell is referenced in the description of the lower right vertex of a range block before all range blocks having their upper left vertex in that cell have been stored. A new node receives the weight 1 · 10⁻⁵. Thanks to this, the root of any subtree whose leaves all hold not-yet-transmitted symbols always has a weight smaller than one.

minimal allowable cell size is 2×2. This gives the maximum possible number of symbols(possible cells’ coordinates) equal to 1024. A binary tree of 10 levels height (height =log2(leafNo) in a full binary tree) is required when all symbols are not-yet-transmitted(none cell contains a second vertex of any range block). Because the weight of each innernode is equal to the sum of the weights of its children (which are equal in describedsituation) and the number of nodes is twice smaller as on the lower level, the sum ofweights of nodes on each level is the same. Then the weight of the root of the tree

Page 82: Thesis-Fractal Compression

Chapter 5. Proposed Method 73

(with only not-yet-transmitted) symbols will be weight(root) = levelNo · leafNo ·weight(leaf) = 10 · 1024 · 10−5 = 0.1024.The weight of a root of any subtree where all leaves contain not-yet-transmitted

The weight of the root of any subtree whose leaves all contain not-yet-transmitted symbols cannot be larger than the calculated value; thus, this root will be on the same or a lower level than the leaf with the symbol that occurred least frequently but did occur at least once. This fact is very important for the effectiveness of the algorithm, because all symbols besides the least frequent one will have optimal codes, just as if the codes had been created by the static Huffman coding algorithm (applied to the already encoded part of the input stream, with all already transmitted symbols as the alphabet).

The simplest way to implement adaptive Huffman coding is to rebuild the whole

Huffman tree (in exactly the same manner as it is built in the static version) after processing each successive symbol occurrence. This is rather ineffective when there is a large number of symbols in the tree, because a single weight increment may propagate to only a limited part of the tree, or may even cause no changes at all. This is why a tree updating algorithm is introduced that involves only the smallest subtree that has to be transformed in order to preserve the properties of the Huffman tree: on each level the weights of the nodes have to grow from left to right, and they have to grow from the leaves to the root.

The tree updating algorithm uses a list of all nodes, ordered as shown

in figure 5.6. When a symbol occurs on the input, the weight of the corresponding node is incremented (if the weight was previously an integer) or set to 1 (if the weight was smaller than one); this node is denoted node1. Then it is checked whether there are list elements with higher indexes but with weights smaller than that of the updated node. If such nodes exist, node1 is swapped with the node that has the highest index among those whose weight is smaller than the new weight of node1; this node is denoted node2.

However, this simple swap can be performed only if there is no subtree

that contains both of the nodes to be swapped. The opposite situation occurs when node2 is an ancestor of node1, and it requires a more complicated transformation. If node1 is the left child of node2, or a descendant of the left child, then node1 is swapped with the right child of node2. If node1 is the right child of node2, no tree update is needed. If node1 is a descendant of the right child of node2, then the two children of node2 are swapped with one another, after which node1 (which is now a descendant of the left child) is swapped with the right child of node2.

When the final location of node1 is established, the whole algorithm

described above is repeated for its new parent, because its weight must also be increased. The processing is repeated recursively until the root is reached.

If node1 (holding the symbol that has just occurred) had a weight equal to or larger

than one before the increment, then node2 has the same weight, so the weight of the previous parent of node1 remains unchanged. But if node1 contains a symbol that occurs for the first time, then node2 can have any weight smaller than 1 (only values of the form k · 10⁻⁵ are possible). This means that the subtree containing all not-yet-transmitted symbols has to be updated with the same algorithm, starting with node2 (at its new location) and ending at the node closest to the root with weight smaller than one. A sketch of the update procedure is given below.
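The swap logic can be sketched as follows; this is a heavily simplified Java illustration, the helpers over the ordered node list are stubbed, and the separate handling of the fractional-weight subtree described above is omitted:

    // Simplified sketch of the adaptive tree update after a symbol occurrence.
    class HuffmanNode {
        double weight;              // k * 1e-5 if not yet transmitted, else integer
        HuffmanNode parent, left, right;
    }

    class AdaptiveHuffmanTree {
        void onSymbolOccurred(HuffmanNode node1) {
            node1.weight = (node1.weight < 1.0) ? 1.0 : node1.weight + 1.0;
            HuffmanNode current = node1;
            while (current != null) {
                HuffmanNode node2 = highestIndexLighterThan(current);
                if (node2 != null && node2 != current) {
                    if (!isAncestor(node2, current)) {
                        swap(current, node2);             // plain list/tree swap
                    } else if (isUnderLeftChild(current, node2)) {
                        swap(current, node2.right);
                    } else if (current != node2.right) {
                        swap(node2.left, node2.right);    // current moves under
                        swap(current, node2.right);       // the left child first
                    }                                      // right child: no-op
                }
                current = current.parent;                 // weights grow upwards
                if (current != null) current.weight += 1.0;
            }
        }

        // Helpers over the ordered node list; implementations omitted.
        private HuffmanNode highestIndexLighterThan(HuffmanNode n) { return null; }
        private boolean isAncestor(HuffmanNode a, HuffmanNode d) { return false; }
        private boolean isUnderLeftChild(HuffmanNode d, HuffmanNode a) { return false; }
        private void swap(HuffmanNode a, HuffmanNode b) { }
    }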


5.5.3. Choosing the Parameters

The proposed method of constructing the binary code can have varying effectiveness depending on the imposed size of the cells. The cell size is therefore chosen by the algorithm for each fractal operator to be stored, by estimating the number of bits per transformation through the calculation explained below.

Any transformation, independently of the location of the second vertex, always consumes a number of bits resulting from the following components (W denotes the total number of transformations in the fractal code):

• the location of the top left corner of the range:

L1 = ⌈log2(width(cell) · height(cell))⌉ bits

• a single bit determining whether the second vertex lies in the same cell as the first vertex:

L2 = 1 bit

• the intensity transformation coefficients:

L3 = 5 + 7 bits

• the location of the top left corner of the matching codebook block in the appropriate cell:

L4 = ⌈log2(⌈M/CF⌉ · ⌈N/CF⌉)⌉ bits

• the descriptions of the cells (the number of ranges in each cell), whose cost is allocated equally to all ranges:

L5 = (1/W) · ⌈log2(⌈width(cell)/size_t⌉ · ⌈height(cell)/size_t⌉ + 1)⌉ · (M/width(cell)) · (N/height(cell)) bits

• the header of the file, which contains the information required to properly read the fractal operator from the file: L6 = Σ_{x=a..e} L6x, where:

◦ the original image size:

L6a = (1/W) · log2(2048 · 2048) = 22/W bits

◦ the range size threshold (the maximal allowed range size threshold is 64):

L6b = (1/W) · log2(64) = 6/W bits

◦ the size of the cells used to create the fractal code (the maximal allowed image size is 2048 × 2048):

L6c = (1/W) · log2(2048 · 2048) = 22/W bits

◦ the contraction factor used for fractal compression (the maximal allowed CF is 3, the minimal allowed CF is 1, and CF is always an integer):

L6d = (1/W) · ⌈log2(3)⌉ = 2/W bits

◦ the numbers of bits used to store the scaling and offset coefficients of the intensity transformations; the number of scaling bits may take values from the range ⟨2, 10⟩ and the number of offset bits from the range ⟨2, 8⟩:

L6e = (1/W) · (⌈log2(8)⌉ + ⌈log2(6)⌉) = 6/W bits

So, the base bit cost of a single transformation is:

LB = Σ_{x=1..6} Lx bits
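A sketch of this base-cost computation for a candidate cell size (a hypothetical Java helper; W, the number of transformations, is assumed to be known or estimated):

    // Base bit cost LB per transformation, following the component formulas
    // above. log2ceil(x) = ceil(log2(x)).
    static double baseBits(int cellW, int cellH, int M, int N,
                           int sizeT, int cf, double w) {
        double l1 = log2ceil((long) cellW * cellH);                  // range corner
        double l2 = 1;                                               // same-cell bit
        double l3 = 5 + 7;                                           // intensity coeffs
        double l4 = log2ceil(ceilDiv(M, cf) * ceilDiv(N, cf));       // codebook corner
        double cells = ((double) M / cellW) * ((double) N / cellH);
        double l5 = log2ceil(ceilDiv(cellW, sizeT) * ceilDiv(cellH, sizeT) + 1)
                    * cells / w;                                     // per-cell counts
        double l6 = (22 + 6 + 22 + 2 + 6) / w;                       // file header
        return l1 + l2 + l3 + l4 + l5 + l6;
    }

    static long ceilDiv(int a, int b) { return (a + b - 1L) / b; }

    static double log2ceil(long x) {
        return Math.ceil(Math.log(x) / Math.log(2));
    }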

The ranges whose bottom right vertex lies in a different cell than their upper left vertex carry additional costs. These costs relate only to a limited subset of the transformations; the size of this subset depends on the size of the cells and on the sizes of the ranges. If the second vertex is placed in one of the neighboring cells, the overhead is relatively small: only 2 bits. However, when it lies in some more distant cell, the overhead is equal to these two bits plus the bit length of the Huffman code identifying the cell in which the vertex lies. These additional costs are produced by ranges falling into three classes depending on their size:
• ranges with width smaller than the width of the cells and height smaller than the height of the cells require 2 additional bits when the range is placed closer to the bottom border of the cell than its height, or closer to the right border of the cell than its width; such a range never requires more bits than this.

• ranges with width larger than the cell width or height larger than the cell height, but with neither dimension larger than the corresponding cell dimension multiplied by two. This class of ranges always requires additional bits to describe the location of the second vertex:
◦ if the x coordinate of the upper left vertex of the range is smaller than 2 · width(cell) − width(Ri), and the y coordinate is smaller than 2 · height(cell) − height(Ri), then the second vertex lies in a neighboring cell and only 2 additional bits are needed;
◦ if at least one of the above conditions is not fulfilled, then 2 bits plus the length of the Huffman code for the cell identifier are added to the base bit length.

• the last situation takes place when the width of the range is larger than the width of two cells, or its height is larger than double the height of the cells. In this case the identifier of the cell in which the second vertex lies is always attached, so the bit length of the transformation is extended by 2 bits plus the length of the Huffman code.

The exact number of bits per transformation can be obtained only by examining

each range block independently and summing up the additional bits required by the blocks. However, it is very important to be able to estimate an approximation of this value, because the size of the cells is chosen adaptively for each single image. For all powers of two larger than or equal to two and smaller than the width and the height of the image, the number of bits per transformation is computed approximately in order to find the best cell size. This approximate value is calculated from the average size of the ranges and from all possible localizations of the range blocks that add extra bits to the transformation description. Simply, among all pixels (i.e. all possible range block localizations), those are counted that yield two additional bits, and those that increase the bit length by 2 plus the length of the Huffman code for the cell identifier. The numbers of such localizations, divided by the total number of pixels, give the expected percentages of range blocks requiring 2 additional bits and 2 bits plus the Huffman code length, respectively.

Thus, if the average range block dimensions are smaller than the currently considered cell size, then the percentage of range blocks whose second vertex lies outside the boundaries of the neighboring cells is assumed to be zero (p_distant = 0), while the percentage of ranges requiring the two additional bits that indicate the neighboring cell in which the second vertex lies is positive:

p_neighboring = 1 − [(width(cell) − avgWidth(R)) · (height(cell) − avgHeight(R))] / [width(cell) · height(cell)]

p_distant = 0

If at least one of the average block dimensions turns out to be larger than the corresponding cell dimension, but both range block dimensions are still smaller than double the cell size, then both situations can occur: some range block localizations increase the bit length of the transformation representation by two, and the remaining localizations enlarge it by 2 plus the bit length of the Huffman code:

p_neighboring = [min(2 · width(cell) − avgWidth(R), width(cell)) · min(2 · height(cell) − avgHeight(R), height(cell))] / [width(cell) · height(cell)]

p_distant = 1 − p_neighboring

If both the average width and the average height of the range blocks are larger than the width and height of the cell respectively, then every potential localization of a range block requires putting the cell identifier before the localization of the second vertex in the code of the transformation:

p_neighboring = 0

p_distant = 1

The estimation is performed by calculating L for all pairs of values that can become the width or the height of the cells:

L = LB + 2 · p_neighboring + [2 + log2((M/width(cell)) · (N/height(cell)))] · p_distant bits


The width and the height of the cell do not necessarily have to be equal to one another. Allowing different values of the width and the height gives a better adaption of the cell shape to the range blocks, which minimizes the number of bits per transformation. Note that in this estimation fixed-length cell identifiers are used instead of the variable-length Huffman codes.

The result of this estimation will certainly differ from the actual number of bits per

transformation, because the simplification is large. When the average range block is smaller than the cell, this does not mean that there cannot be ranges spanning several cells that significantly increase the number of bits per transformation. The same goes for the last situation, where it is assumed that all ranges are stored on the maximal possible number of bits; the actual number of bits per transformation will surely be lower than the estimated one. The measure will perform especially poorly when the image contains large flat regions and, at the same time, many regions with a high density of details; in this case the real result may differ considerably from the estimate. However, in the most typical situations the formulas should model quite well how the actual number of bits per transformation reacts to changes of the cell size. A sketch of the resulting cell-size selection is given below.
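Under the same assumptions, the adaptive choice of cell dimensions reduces to a search over the power-of-two candidates; the following Java sketch reuses the baseBits helper above, and all names are illustrative:

    // Chooses the cell width/height pair minimizing the estimated bits per
    // transformation L = LB + 2*pn + (2 + log2(#cells)) * pd.
    static int[] chooseCellSize(int M, int N, int sizeT, int cf, double w,
                                double avgW, double avgH) {
        double bestL = Double.MAX_VALUE;
        int[] best = { 2, 2 };
        for (int cw = 2; cw < M; cw *= 2) {
            for (int ch = 2; ch < N; ch *= 2) {
                double pn, pd;
                if (avgW <= cw && avgH <= ch) {                 // fits in one cell
                    pn = 1.0 - ((cw - avgW) * (ch - avgH)) / ((double) cw * ch);
                    pd = 0.0;
                } else if (avgW <= 2 * cw && avgH <= 2 * ch) {  // up to double size
                    pn = Math.min(2 * cw - avgW, cw) * Math.min(2 * ch - avgH, ch)
                         / ((double) cw * ch);
                    pd = 1.0 - pn;
                } else {                                        // larger than 2 cells
                    pn = 0.0;
                    pd = 1.0;
                }
                double cells = ((double) M / cw) * ((double) N / ch);
                double l = baseBits(cw, ch, M, N, sizeT, cf, w)
                           + 2 * pn
                           + (2 + Math.log(cells) / Math.log(2)) * pd;
                if (l < bestL) { bestL = l; best = new int[] { cw, ch }; }
            }
        }
        return best;
    }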

5.5.4. Adaptation to Irregular Regions Coding

A fractal coding method with irregular partitioning based on the HV scheme in the splitting phase would result in regions whose sides can have any length. When the uniform/quadtree scheme is employed, the range regions are not so elastic, because the side lengths of the irregular regions are integer multiples of the smallest possible side length of a range block created in the splitting phase. Because of this, none of the methods of describing the region shape devised for the existing irregular-region-based coding methods can be used (see 3.1.4). The only way to store the shape of a region created by HV partitioning is to store the coordinates of each vertex of the range.

The method of constructing the binary representation of the fractal operator

created with the HV scheme is nothing else but a method for efficiently storing the vertices of rectangular blocks, so the adaptation to irregular regions is not extensive. First, the binary representation of each region has to contain the number of its vertices whose coordinates are stored in the fractal code. Then, starting from the top left vertex, every second vertex is stored in the binary code. The coordinates of an omitted vertex can be calculated from the coordinates of its neighboring vertices. There are only two possibilities: the calculated vertex lies above the diagonal connecting the previous vertex with the next one (its x coordinate is the same as in the previous vertex and its y coordinate the same as in the next vertex), or it lies below this diagonal (its y coordinate is the same as in the previous vertex and its x coordinate the same as in the next vertex). The description of every vertex besides the top left one must therefore contain a single bit indicating where the previous (omitted) vertex lies with respect to the diagonal. A sketch of the reconstruction is given below.
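A minimal Java sketch of the vertex reconstruction at the decoder (names are illustrative):

    // Reconstructs the omitted vertex between two stored vertices of an
    // irregular region. aboveDiagonal is the single bit stored with the
    // following vertex: true = the omitted vertex lies above the diagonal.
    static int[] omittedVertex(int[] prev, int[] next, boolean aboveDiagonal) {
        return aboveDiagonal
            ? new int[] { prev[0], next[1] }   // x from previous, y from next
            : new int[] { next[0], prev[1] };  // x from next, y from previous
    }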

describing the fractal transformations with a binary representation. The number ofbits per a single transformation will be significantly larger than in the HV codingmethod. However, the compression method based on irregular regions is expected to


produce a smaller number of ranges, so the overall number of bits per pixel may be lower than in the HV coding method.

5.6. Time Cost Reduction

5.6.1. Variance-based Acceleration

The proposed method of accelerating the fractal encoding is related to block classification (section 4.4), codebook reduction (section 4.5) and excluding impossible matches (section 4.6). The work of He, Yang and Huang [CH04] is the basis of the proposed method. There are only two classes:
• shade blocks
• ordinary blocks
The classification is based on the intensity variance of the blocks. If the

variance value is smaller than the value given by the user, the range block is approximated by a fixed block and no search for a matching codebook block is made. The "ordinary" blocks are the ones for which codebook blocks have to be found.

Lee and Lee [Lee98] state that the distance between a range and a codebook block

is very closely connected to the distance between the variances of the blocks: a smaller variance distance between two blocks should result in a smaller error between them. The variance is measured with the following equation:

σ(B_i) = (1/n) · Σ_{j=0}^{n−1} b_j² − ((1/n) · Σ_{j=0}^{n−1} b_j)²

where b_0, …, b_{n−1} are the pixel intensities of the block B_i and n is the number of pixels in B_i.

Before the encoding process is started, the user sets the variance value Δσ that will

delimit a subset of the codebook to be considered during the search. The subset contains different codebook blocks depending on the range block currently being encoded.

All codebook blocks whose variance is larger than the variance of the range block

plus the user-given variance value are omitted in the search: they are treated as blocks that cannot yield an error smaller than the tolerance criterion.

Because the scaling coefficient of the intensity transformation is restricted to values

from the range ⟨0, 1⟩, all codebook blocks whose variance is smaller than that of the current range block can also be removed from consideration.

Thus, for each range block a subset of the codebook is dynamically created. The subset

contains all codebook blocks that can potentially be matched with the range block, and no other codebook block is considered for that range block. The content of the codebook subset is:

C_{R_i} = {C_j ∈ C : σ(R_i) ≤ σ(C_j) ≤ σ(R_i) + Δσ} ⊆ C

where C is the entire codebook and C_{R_i} is the subset of codebook blocks that are compared with the range block R_i during the search.
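A sketch of this dynamic filtering in Java (candidateBlocks is a hypothetical helper; a variance() accessor on the codebook block type is assumed):

    // Returns the codebook blocks worth testing against a range block:
    // blocks with variance below sigma(R) cannot be matched because the
    // scaling coefficient is restricted to <0, 1>, and blocks with variance
    // above sigma(R) + deltaSigma are assumed to exceed the error tolerance.
    static java.util.List<CodebookBlock> candidateBlocks(
            java.util.List<CodebookBlock> codebook,
            double rangeVariance, double deltaSigma) {
        java.util.List<CodebookBlock> subset = new java.util.ArrayList<CodebookBlock>();
        for (CodebookBlock c : codebook) {
            double v = c.variance();
            if (v >= rangeVariance && v <= rangeVariance + deltaSigma) {
                subset.add(c);
            }
        }
        return subset;
    }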


5.6.2. Parallelization

The encoding process can be accelerated by parallelization. Parallel processing is very promising because the affine transformations that compose the fractal code can be found independently. Parallelization is also possible in the proposed encoding algorithm, where the user can fix the number of threads to be used for the encoding.

Before the encoding starts, a pool of threads is created with the given number of

threads. The threads share a queue of range blocks awaiting encoding; the first range block, covering the whole image, is added to this queue by the main thread. Each thread takes one range block from the head of the queue and performs the entire encoding step for it: it determines the codebook for the range block, finds the best matching codebook block, and computes the intensity transformation parameters and the distance between the range and codebook blocks. If the error between the range block and the best matching codebook block does not fulfill the tolerance criterion, the range block is split within the same thread and the newly created range blocks are added to the common queue. If the error between the range block and the transformed codebook block meets the tolerance criterion, the range block (with the description of its affine transformation) is added to the thread's internal data structure.

All threads end their life when the queue of range blocks to encode is empty and

there is no active thread that can produce new range blocks; the main thread monitors the state of the threads and the number of elements in the queue, and interrupts the threads when the encoding is finished. The last action the threads perform before termination is sending the structure with the encoded range blocks to the main thread.

In the program there are only two instances of the image: the first instance is

used by all range blocks and the second one is contracted and used by the codebook blocks. Thus, the threads share not only the queue of uncovered range blocks but, indirectly, also the images.

The algorithm of the main thread is as follows:

1 declare and initialize the queue and the array
2 create a range with location (0,0) and size equal to the image size and add it to the queue

3 create the given number of threads that perform the proper encoding and launch them

4 wait for a signal from any thread
5 if the queue is empty and all threads are in the 'idle' state then go to the next step, otherwise go to the previous step

6 get the arrays of encoded ranges and merge them into a fractal operator

7 store the binary representation of the fractal operator to the file

The threads running in parallel perform the following algorithm:

1 wait until the queue in the main thread is not empty
2 set the state of this thread to 'processing'
3 get the first element from the queue


4 find the best matching codebook block, the intensity transform and the error

5 if the error fulfills the tolerance criterion then add the range block with all transformation parameters to the inner array of range blocks, otherwise split the range block and add the new range blocks to the queue in the main thread

6 set the state of this thread to ’idle’
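The scheme can be sketched with standard java.util.concurrent primitives; Range, Transform and encodeStep are hypothetical placeholders, and the thesis implementation manages its own thread pool and 'idle' states, which this sketch approximates with a shared pending counter:

    import java.util.*;
    import java.util.concurrent.*;
    import java.util.concurrent.atomic.*;

    // Work-queue parallel encoder sketch: threads pull uncovered ranges from
    // a shared queue; a split pushes child ranges back, a match collects a
    // transformation. Encoding ends when no range remains unresolved.
    class ParallelEncoder {
        private final BlockingQueue<Range> queue = new LinkedBlockingQueue<Range>();
        private final Queue<Transform> results = new ConcurrentLinkedQueue<Transform>();
        private final AtomicInteger pending = new AtomicInteger();

        List<Transform> encode(Range wholeImage, int threads) throws InterruptedException {
            pending.set(1);
            queue.add(wholeImage);
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            for (int t = 0; t < threads; t++) {
                pool.execute(() -> {
                    while (pending.get() > 0) {
                        Range r;
                        try { r = queue.poll(10, TimeUnit.MILLISECONDS); }
                        catch (InterruptedException e) { return; }
                        if (r == null) continue;
                        List<Range> children = new ArrayList<Range>();
                        Transform tr = encodeStep(r, children); // search or split
                        if (tr != null) {
                            results.add(tr);                    // range covered
                            pending.decrementAndGet();
                        } else {
                            queue.addAll(children);             // range was split
                            pending.addAndGet(children.size() - 1);
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            return new ArrayList<Transform>(results);
        }

        // Finds a matching codebook block for r or splits it; hypothetical.
        private Transform encodeStep(Range r, List<Range> childrenOut) { return null; }
    }

    class Range { /* location and size fields omitted */ }
    class Transform { /* affine transformation parameters omitted */ }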

5.7. Summary of the Proposed Compression Method

All design issues were approached in several ways in order to find the optimal solutions.

The proposed method uses Horizontal-Vertical partitioning [Fis95c]. The rea-

sons for this choice are the very good results of this approach and the lack of any other approach (in the literature) that gives better results at low compression ratios.

However, several ways of dividing range blocks during encoding are considered

(for the case when a matching codebook block cannot be found). The works of Fisher [Fis95c] and of Saupe, Ruhl, Hamzaoui, Grandi and Marini [Sau98] are the basis for the following division methods:
• edge detection with Fisher's rectangle degradation prevention mechanism
• variance minimization with Saupe's rectangle degradation prevention mechanism
• edge detection with Saupe's rectangle degradation prevention mechanism
• variance minimization with Fisher's rectangle degradation prevention mechanism
• edge detection with the flat rectangle degradation mechanism
• variance minimization with the flat rectangle degradation mechanism
The splitting technique using edge detection is taken from [Fis95c] and im-

proved. The approach based on variance minimization is the same as in [Sau98]. Fisher's and Saupe's rectangle degradation mechanisms are exactly the same as in the original works. The flat mechanism is a new approach that allows checking whether the rectangle degradation prevention mechanisms have a positive effect on the image quality.

The codebook in fractal compression is "virtual", i.e. there is no need to store the

codebook blocks in a separate data structure, because they can be taken from the image that is being encoded. Nevertheless, some additional information about the codebook blocks has to be kept in memory during the encoding: the location of a block, its size and its inner product values (which require time-consuming calculations). Depending on how this additional information is created and stored in memory, several codebook types are distinguished:
• on-the-fly codebook (light codebook): the additional information is created every time a codebook block is accessed (during encoding) and is not stored for further processing; the application stores only the position of the next codebook block.

• solid codebook: the additional information about every possible codebook block is created in a preliminary phase, independently of whether the codebook block will be accessed at all during the encoding. This information is stored for the entire encoding process. The calculations of the codebook blocks' inner products and


variances can be made during this preliminary phase, or they can be performed when the blocks are accessed for the first time.

• hybrid codebooks: two types that attempt to merge the advantages of the two above-listed codebook types:
◦ a solid codebook filled during the encoding with the light codebook;
◦ a solid codebook used for range blocks smaller than some size and the light codebook for all other range blocks.

The realization of the isometries was proposed by the author in section 5.4. The use of symmetry operations is optional, which allows investigating whether there is any profit from the isometries and thus resolving the doubts about this topic (presented in section 3.3.2).

There are two methods of creating the fractal code. The first one is very similar to

the methods presented in [Fis95c] and [Sau98]. The difference is that instead of storing the locations of the domain blocks, the locations of the corresponding codebook blocks are stored. This saves a single bit per transformation, but the full search (the maximal size of the domain pool, i.e. a domain offset equal to 1) is no longer possible. This approach stores the entire partitioning tree in the file. The author proposes a second approach, which stores only the leaves of this tree, i.e. only the range blocks that were matched with codebook blocks. This approach can be used not only in the fractal compression method based on HV partitioning, but also in a method based on irregular regions created with the use of HV partitioning.

Two methods for accelerating the encoding are proposed. The first one uses the

blocks' intensity variance. Range blocks with low variance can be treated exactly like flat range blocks: no search for a matching codebook block is required, because the range block can be well approximated by a block of fixed intensity. The other aspect of the variance-based acceleration eliminates from the search all codebook blocks whose variance is lower than the variance of the currently encoded range block; codebook blocks with variance much larger than that of the range block are eliminated as well.

The second acceleration technique parallelizes the encoding. The proposed paral-

lelization scheme creates a pool of threads that encode range blocks independently. Each thread encodes one range block at a time. The threads share the following data: the image, the contracted copy of the image and the queue of range blocks to be encoded.


Chapter 6

Experimental Results

The application WoWa Fractal Coder implements the proposed fractal compression method and provides an easy to use graphical user interface. More information about the application can be found in appendix C.

The whole application is written in Java (JDK 1.6). The GUI is created with the Swing

library; the look and feel of the application is built with the help of the Substance library¹.

The implementation allows numerous alternative versions of the encoding algo-

rithm; the version can be picked by configuring the application before starting the encoding process.

The encoder settings that significantly influence the algorithm are:
• the method used to divide blocks:
◦ a block is divided along the most significant horizontal or vertical edge
◦ a block is divided along the line that gives the minimal sum of the intensity variances of the resultant blocks

• the rectangle degradation prevention mechanism:
◦ Fisher's mechanism
◦ Saupe's mechanism
◦ the flat mechanism

• whether or not the codebook is extended by allowing isometries
• the contraction factor CF of the contraction transformation τ_C
• whether the decoding is performed through Single-Time Quantization or by the standard algorithm

• the type of codebook to be used
There are also a number of settings that can be enabled in order to accelerate the

encoding process:
• whether the algorithm finds the best matching codebook block for every considered range block, or any codebook block that fulfills the error tolerance criterion is enough

• whether the search for transformations is performed sequentially, range after range, or in parallel by enabling multithreading

• whether the variance acceleration is used:

1. more information at project website: https://substance.dev.java.net


◦ the variance value below which blocks are treated as shade blocks (approximated by a fixed codebook block)

◦ codebook search space restriction: the maximal variance difference between the range block and the codebook blocks considered during the search for a matching codebook block

• whether or not the solid codebook is filled on the fly (from the light codebook)

Figure 6.1. An example medical image and its fractally encoded version: (a) original 532 × 434 image; (b) fractally encoded and decoded image, PSNR = 40.01, CR = 8.92.

The influence of the settings on the encoding process and its output was examined, in particular their influence on the fidelity, the compression ratio and the encoding time. Doppler ultrasonography images were used for the tests; the test image is presented in figure 6.1(a). It is a monochrome image with 8 bits per pixel value and a size of 532 × 434. All tests were performed on the same PC with an Intel Core 2 CPU at 2 GHz and 1 GB of RAM.

6.1. Block Splitting

According to the performed experiments, the choice of the block splitting technique and of the rectangle degradation prevention mechanism has a large effect on the rate-distortion performance of the algorithm. The performance of the six options described in section 5.2 is presented in figure 6.2.

Figure 6.2. The performance of the proposed compression algorithm depending on the block splitting mechanism and the rectangle degradation prevention mechanism: (a) rate-distortion functions; (b) encoding time and image fidelity relationships. ED denotes the splitting mechanism based on edge detection and VM the mechanism based on variance minimization.

Page 94: Thesis-Fractal Compression

Chapter 6. Experimental Results 85

The best results are given by the modified Fisher's approach, i.e. the splitting technique based on the most significant edge in the block intensity, combined with the original Fisher's rectangle degradation prevention mechanism.

However, two other configurations give a lower noise level at very

low compression ratios. Both of them are based on edge detection. With Saupe's rectangle degradation mechanism, this approach is superior to edge detection with Fisher's mechanism at compression ratios lower than 3.5; with the flat rectangle degradation mechanism, it is better at compression ratios lower than 2.3.

Compression ratios lower than 2.3, or even lower than 3.5, can be obtained with lossless

image coding methods. This means that in this range of compression ratios no lossy compression method will ever be used, because the same amount of disk space can be saved without losing any information.

However, when the image is encoded in order to be decoded at a higher resolution

(fractal magnification), the compression ratio is insignificant. The edge detection splitting technique with Saupe's rectangle degradation mechanism seems appropriate for this purpose, because it achieves the highest PSNR, equal to 52.07 dB.

A surprisingly low rate-distortion performance can be observed when Saupe's block

splitting technique is used: it is never higher than that of the approach based on edge detection combined with Fisher's rectangle degradation prevention mechanism.

Besides the fidelity of the compressed image, the encoding time is also important.

Here, all splitting methods based on the variance minimization approach are the slowest ones.

Among the splitting methods based on edge detection, the use of Saupe's rectangle

degradation mechanism results in the best encoding time. To obtain a PSNR of 40 dB it needs about 12 minutes, while the approach based on edge detection with Fisher's rectangle degradation prevention mechanism requires 18 minutes and 20 seconds, and the approach based on variance minimization with Saupe's mechanism consumes over 23 minutes. This comparison once again suggests that the combination of the edge detection approach with Saupe's rectangle degradation mechanism can be very useful for image magnification, where the user surely does not want to spend too much time zooming an image.

At compression ratios from 4 to 13, this combination is characterized by the second

best rate-distortion performance. Thus, thanks to the advantage of the shortest encoding time, the combination of the block splitting mechanism based on edge detection and Saupe's rectangle degradation mechanism can be a reasonable choice also when compression is performed in order to reduce the file size.

The flat rectangle degradation prevention mechanism was introduced in order to

check whether any such mechanism gives benefits. Both Saupe's and Fisher's mechanisms favor cutting line localizations that are closer to the middle of the block to be divided. It turns out that the application of these mechanisms decreases the number of transformations, i.e. it increases the compression ratio. In both splitting approaches, the flat rectangle degradation mechanism gave the worst results.

These experimental results lead to the following important conclusions:


• the edge detection technique for finding the cutting line outperforms the block variance-minimization approach

• the use of a rectangle degradation prevention mechanism has positive effects on all aspects of fractal compression performance

When the Fisher’s mechanism is used then better rate-distortion relationship can beobserved but when Saupe’s mechanism is chosen then the encoding lasts shorter.

6.2. Number of Bits for Scaling and Offset Coefficients

A low number of bits for the intensity transform coefficients results in a lower number of bits per transformation. However, the coefficients can then take only a few values, which can make it difficult to find quantized coefficient values that produce an error between range and codebook blocks smaller than the given threshold (distance tolerance criterion). Thus, a low number of bits for the coefficients may increase the number of transformations in the fractal code; and the more transformations there are, the longer it takes to pair the range blocks with codebook blocks. In order to find the number of bits that balances the transformation code length against the number of transformations (and the encoding time), a new factor is proposed, which also takes the PSNR into account:

F(b) = PSNR(b) · CR(b) / EncodingTime(b)

where b denotes the number of bits used to store the scaling/offset coefficient in the code of a single transformation.

The maximal value of the function F(b) indicates the optimal number of bits, balancing the three most important factors in fractal compression: noise level, compression ratio and encoding time. Of course, the factor F shall be calculated for every possible number of bits for the coefficient, with no other settings changed.
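As an illustration, the selection could be automated as in the sketch below (the class and its inputs are hypothetical, not part of the WoWa coder): it simply picks the b that maximizes F(b) from a series of measured test runs.

/**
 * Minimal sketch: picks the number of coefficient bits b that maximizes
 * F(b) = PSNR(b) * CR(b) / EncodingTime(b). The arrays hold measurements
 * from test encodings, indexed by b - minBits. Names are illustrative only.
 */
public final class BitAllocationTuner {
    public static int optimalBits(double[] psnr, double[] cr,
                                  double[] encodingTimeSec, int minBits) {
        int best = minBits;
        double bestF = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < psnr.length; i++) {
            double f = psnr[i] * cr[i] / encodingTimeSec[i];  // F(b)
            if (f > bestF) {
                bestF = f;
                best = minBits + i;
            }
        }
        return best;
    }
}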

Figure 6.3. Test of the bit allocation for the intensity transformation coefficients: (a) scaling coefficient; (b) offset coefficient.


The test was repeated for different values of the distance tolerance criterion in order to check whether the optimal bit allocation remains the same for different compression ratios. The considered tolerance criterion values were:
• errort = 0.05
  ◦ when the bit allocation of the scaling coefficient is investigated: CR ∈ (1.29, 1.6)
  ◦ when the optimal bit allocation for the offset coefficient is searched: CR ∈ (1.43, 1.7)
• errort = 1
  ◦ scaling – CR ∈ (2.32, 2.39)
  ◦ offset – CR ∈ (2.17, 2.6)
• errort = 5
  ◦ scaling – CR ∈ (5.02, 5.24)
  ◦ offset – CR ∈ (4.25, 5.62)
• errort = 50
  ◦ scaling – CR ∈ (24, 26.2)
  ◦ offset – CR ∈ (21, 26.7)

The normalized value of the factor is presented in figure 6.3. For very low compression ratios, the optimal numbers are three bits for scaling and two for offset. But this case is exceptional, because for higher compression ratios the optimal number of scaling bits falls into the range ⟨4, 6⟩, and the number of bits indicated by Fisher as optimal ([Fis95b]) always gives at least the second best result. The optimal number of offset bits is in the range ⟨6, 8⟩, which also confirms that 7 can be treated as the golden mean.

6.3. Coding the Transformations

In section 5.5, two different approaches to the construction of the fractal code were presented. The first one stores the whole partitioning tree (information about all range blocks – those that have a matching codebook block as well as those that had to be divided), while the second one tries to efficiently store only the leaves of this tree (the range blocks that have a bound transformation). Their performance was measured and is presented in figure 6.4.

The standard approach (whole partitioning tree) turns out to be much more effective when the partitioning and search processes produce a small number of transformations – e.g. when the error tolerance criterion is not too restrictive. Although the new approach always produces a longer fractal code, when high fidelity is preserved (which is reflected in many small range blocks) the difference between the two approaches is rather small – about two bits per transformation.

This means that the new approach can give very good results when employed in a fractal compression method based on irregular regions created from HV-partitions. The first phase of the irregular partitioning, where the image is divided into many blocks with low variance (the splitting phase), is actually very similar to pure hierarchical partitioning with maximal fidelity. In such circumstances, the new approach to storing the transformations gives the best outcomes.


Figure 6.4. Performance of the two approaches to coding the transformations.

6.4. Acceleration

The encoding process in the above tests took at least 18 minutes (the algorithm variant based on edge detection with Fisher's range degradation prevention mechanism) or 11.5 minutes (edge detection with Saupe's range degradation prevention mechanism) when the PSNR is slightly higher than 40 dB.

The encoding can be accelerated in several ways:
• reducing the virtual codebook size,
• quitting the search process (the search for a matching codebook block for a given range block) after encountering the first codebook block that fulfills the error tolerance criterion,
• choosing a faster approach to filling and storing the virtual codebook,
• enabling variance-based acceleration:


◦ treating range blocks with low intensity variance as shade blocks (approximated with fixed codebook blocks),

◦ performing the search only on a subset of the codebook – the codebook blocks are selected by setting a maximal distance between the variances of the range block and the codebook blocks.

6.4.1. Codebook Size Reduction

The first way to reduce the codebook size is to forgo the symmetry operations. The encoding time of a single range block is eight times longer when the isometries are enabled. When the domain offset is equal to 2, it takes over 30 hours to encode the test image (532 × 434). Such an encoding time is unacceptable, especially because the increase in PSNR is rather marginal. Nevertheless, all tests mentioned here were performed with isometries.

A larger offset between domain block locations also reduces the codebook size. This operation shortens the encoding time very significantly. The fidelity of the image and the encoding time at higher domain offset values are presented in figure 6.5.

Figure 6.5. Influence of the codebook size on the PSNR and the encoding time: (a) domain offset – PSNR relationship; (b) domain offset – encoding time relationship.

For the domain offset equal to the smallest possible value, i.e. 2, when the codebook is the largest, the PSNR reaches its highest value for the variant of the algorithm based on edge detection and Fisher's rectangle degradation mechanism. The PSNR is equal to 50.67 dB, but the time cost is high – about 50 minutes. Increasing the domain offset from 2 to 4 reduces the codebook size four times and consequently results in a four times shorter encoding time (12 minutes 10 seconds). Further increasing the domain offset (from 2 to 6) reduces the encoding time to 5 minutes 32 seconds (an almost 9-fold improvement). A domain offset equal to 8 results in an encoding time of 3:09 (an almost 16-fold improvement).

These very significant time savings do not have serious consequences for the image fidelity. When the domain offset is 4, the PSNR is only 1.2 dB (2.5%) lower compared to the results of encoding with a domain offset of 2. Setting the domain offset to 6 reduces the PSNR by 2.4 dB (4.7%), and setting it to 8 gives a PSNR lower by 3.3 dB (6.6%). In all of these examples, the PSNR stays at a very high level – over 47 dB.

An interesting example is also the domain offset value of 12, which allows encoding the test image with a PSNR of 46.1 dB in only 1 minute 20 seconds.

The codebook size is not the only factor that influences the image fidelity – the content of the codebook is equally important. This is the reason why the PSNR does not decrease monotonically with increasing domain offset.
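The quadratic dependence of the codebook size on the domain offset can be seen from a simple count of candidate domain block positions; the sketch below uses illustrative names and is not taken from the WoWa coder.

/**
 * Counts top-left corners of domain blocks for a given domain offset.
 * The count, and hence the codebook size, shrinks roughly quadratically
 * as the offset grows (offset 4 vs. 2 gives about 4 times fewer blocks).
 */
public final class DomainGrid {
    public static int countDomainBlocks(int imageWidth, int imageHeight,
                                        int domainSize, int domainOffset) {
        int count = 0;
        for (int y = 0; y + domainSize <= imageHeight; y += domainOffset) {
            for (int x = 0; x + domainSize <= imageWidth; x += domainOffset) {
                count++;  // one candidate domain block at (x, y)
            }
        }
        return count;
    }
}

For the 532 × 434 test image and a fixed domain size, doubling the offset from 2 to 4 divides this count by about four, which matches the roughly fourfold reduction of the encoding time reported above.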

Figure 6.6. Best match searching versus first match searching: (a) distortion–time relationship; (b) rate–distortion relationship.

6.4.2. Breaking the Search

In the original approach, the error tolerance does not have any influence on the size of the subcodebook (the subset of the codebook containing all codebook blocks that are compared with a given range block). In this situation, a higher error tolerance value can only reduce the encoding time indirectly, by reducing the number of range blocks.

However, when the search is interrupted after finding the first codebook block from the subcodebook that yields an error smaller than the error tolerance (FM – First Match), the search process for a single range block is also shortened at error tolerance values higher than 0.

But when the search is broken in this way, not all range blocks are paired with the best matching codebook blocks. The gain in encoding time and the cost in PSNR can be seen in figure 6.6.

Breaking the search definitely allows obtaining a compressed image with the same PSNR in a shorter time – 40 dB can be achieved in almost two minutes (10.7%) less time. However, it has a negative influence on the rate-distortion characteristic of the algorithm. For example, when the best match is searched, the resulting image yields 41.3 dB at compression ratio 7.4:1, while the "first match" option gives only 39.6 dB at the same compression ratio. This cost in image fidelity cannot be ignored.
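The difference between the two strategies comes down to an early exit in the search loop, as in the following sketch; Block, Match and dist() are hypothetical stand-ins for the coder's actual types and error function.

import java.util.List;

/** Sketch: best-match versus first-match (FM) codebook search. */
final class SearchSketch {
    static final class Block { /* pixel data omitted */ }
    static final class Match {
        final Block codebookBlock; final double error;
        Match(Block b, double e) { codebookBlock = b; error = e; }
    }

    /** Stand-in for the range/codebook error after the optimal intensity transform. */
    static double dist(Block range, Block codebook) { return 0.0; }

    static Match search(Block range, List<Block> subcodebook,
                        double errorTolerance, boolean firstMatch) {
        Match best = null;
        for (Block cb : subcodebook) {
            double d = dist(range, cb);
            if (best == null || d < best.error) best = new Match(cb, d);
            // FM variant: accept the first block that already meets the tolerance.
            if (firstMatch && d <= errorTolerance) break;
        }
        return best;
    }
}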

Figure 6.7. Performance of different codebook types.


6.4.3. Codebook Optimization

The discussion of different approaches to the codebook was presented in section 5.3.

The shortest encoding time can be achieved with the hybrid solution, which uses the light codebook to fill the solid codebook during encoding. When the second hybrid approach is applied on top of such a codebook, i.e. the solid codebook contains only the blocks smaller than a given size and the light codebook provides the larger blocks, the results are equally good.

The solid codebook with inner products calculated in the preliminary phase gives very weak results, even when its use is restricted to the smallest codebook blocks. This confirms that a large set of codebook blocks that can only potentially be used during encoding is unnecessary, because the partitioning process does not produce any range blocks of the same size as these codebook blocks.

However, it is profitable to remember the inner products of the codebook blocks that were compared with a range block. The inner product calculations contribute remarkably to the time cost of fractal encoding. Even when the pure solid codebook is used, but the inner products are postponed until the first access to a codebook block, the acceleration is impressive.
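A minimal sketch of this lazy evaluation, with hypothetical names (the actual WoWa classes may differ): the sums needed by the least-squares fit of the intensity coefficients are computed on the first access to a codebook block and cached for all later comparisons.

final class CodebookEntry {
    private final double[] pixels;
    private boolean productsReady = false;
    private double sum;         // sum of pixels, needed for the offset coefficient
    private double sumSquares;  // inner product of the block with itself

    CodebookEntry(double[] pixels) { this.pixels = pixels; }

    /** Computed once, on the first comparison with a range block. */
    double getSumSquares() {
        if (!productsReady) {
            for (double p : pixels) { sum += p; sumSquares += p * p; }
            productsReady = true;
        }
        return sumSquares;
    }

    double getSum() { getSumSquares(); return sum; }
}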

6.4.4. Variance-based Acceleration

The acceleration method described in section 5.6.1 gives outstanding results. Figures 6.8 and 6.9 present the percentage of encoding time that can be saved thanks to variance-based acceleration. However, a degradation in image fidelity can also be observed – it is presented as the percentage of the PSNR value calculated for the same image encoded with the same settings but without variance-based acceleration.

Figure 6.8. Influence of the classification into "shade" and "ordinary" blocks on the encoding time, compression ratio and PSNR.

Figure 6.8 presents the results of encoding the test image with the error tolerance set to 0, but at higher values of error tolerance (higher compression ratios, lower image fidelity) the dependences between the presented values remain the same. The higher the limit on the shade block variance, the higher the reduction in PSNR. However, it is always exceeded by the savings in encoding time. It can be noticed that the use of shade blocks also benefits the compression ratio. When a range block approximated with a fixed block is stored in the fractal code, there is no need to store the information about the codebook block (location, isometry) – several bits are saved per each range block that becomes a shade block.

Even better results are given by eliminating the codebook blocks with a variance lower than that of the currently processed range block, or with a variance that is too high.

Figure 6.9. Performance of the codebook block elimination based on the blocks' variance difference.

A reduction of the PSNR by 1 dB results in an almost two times shorter encoding time. The encoding time can be shortened to 2/3 of the original time by setting a very large variance distance between range and codebook blocks – for the test image this value was 3000 and caused a negligible loss of fidelity of 0.3 dB of PSNR.

Too restrictive a selection of the codebook blocks to be compared with the range blocks may cause problems with finding any codebook block that fulfills the conditions (variance-based selection and range block–codebook block error). If such a situation occurs, some range blocks will not be encoded, and if those range blocks are too small to be divided, not the whole image will be covered with transformations – the image fidelity will fall drastically. This is visible in figure 6.9 – a variance selection criterion in the range ⟨30, 60⟩ results in one range block that cannot be encoded, and lower values of the criterion give many more such range blocks.
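Both variance-based shortcuts reduce to two cheap tests performed before the expensive block comparison; the sketch below uses illustrative names and thresholds, not the coder's actual ones.

final class VarianceFilter {
    /** Low-variance ranges are approximated with a fixed (flat) block –
     *  no codebook search is needed for them at all. */
    static boolean isShadeBlock(double rangeVariance, double shadeLimit) {
        return rangeVariance < shadeLimit;
    }

    /** Skip codebook blocks whose variance is too far from the range's one;
     *  too small a limit may leave some ranges without any admissible block. */
    static boolean worthComparing(double rangeVariance, double codebookVariance,
                                  double maxVarianceDistance) {
        return Math.abs(rangeVariance - codebookVariance) <= maxVarianceDistance;
    }
}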

6.4.5. Parallelization

The tests were performed on a machine with a dual-core processor. Thus, it is possible to check what the speed-up is when the encoding is performed with two threads. Theoretically, the speed-up should be equal to the number of processors. At a PSNR of 40 dB the encoding time is decreased exactly 1.95 times, so the actual speed-up is very close to ideal.
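Since the range blocks can be encoded independently of one another, the parallelization can be sketched with a fixed thread pool; this is a minimal illustration, not the coder's actual code.

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ParallelEncoderSketch {
    /** Each task encodes one range block (search for a matching codebook block). */
    static void encodeAll(List<Runnable> rangeBlockTasks, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Runnable task : rangeBlockTasks) {
            pool.execute(task);
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);  // wait for all range blocks
    }
}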


6.4.6. Spiral Search

According to the literature, the order of traversing the codebook blocks has a significant impact on the encoding time (section 3.2.2). A reduction of the encoding time through the spiral search is possible only when the probability of finding a matching codebook block grows as the spatial distance between the codebook block and the range block decreases.

In order to verify whether the spiral search can save some encoding time also when a medical image is compressed, a histogram was created of the spatial distances between range and codebook blocks that were paired into a single transformation. Although the histogram was created from the test image, it looks similar for other Doppler USG images.

Figure 6.10. Histogram of spatial distances between matching codebook blocks and range blocks.

The histogram shows that only in the closest neighborhood of the range block, when the spatial distance is smaller than or equal to 2 pixels, is there a small increase in the likelihood of matching the range block and codebook block. The most probable outcome, however, is to find a matching codebook block that is about 209 pixels away from the range block (the size of the image is 532 × 434). The probability is concentrated around this maximum and falls off in a roughly polynomial (trinomial) shape.

These observations allow stating that the improvement in encoding time from the spiral search would be insignificant or non-existent. Some improvement might only be achieved by eliminating the codebook blocks that are located too far away – during the encoding of the test image, no range block was paired with a codebook block lying farther than 648 pixels. A good value for the spatial range–codebook block distance limit would in this case be 550 or 500, because only a very small percentage of the found transformations require distances in that range. This fact confirms that the Restricted Search Area may have a positive influence on the encoding time, but the area has to cover most of the image in order to preserve high image fidelity.
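A Restricted Search Area then amounts to one more cheap test before a codebook block is compared with a range block; a sketch under the assumption that block positions are given by their top-left corners:

final class RestrictedSearchArea {
    /** True if the codebook (domain) block lies within the allowed spatial
     *  distance of the range block – e.g. 500-550 pixels for the test image. */
    static boolean withinSearchArea(int rangeX, int rangeY,
                                    int domainX, int domainY, double maxDistance) {
        double dx = rangeX - domainX;
        double dy = rangeY - domainY;
        return Math.sqrt(dx * dx + dy * dy) <= maxDistance;
    }
}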

6.5. Comparison with JPEG

The JPEG/DCT compression is lossy, like the fractal compression. It is a good reference point for any lossy compression method because it is included in the DICOM (Digital Imaging and Communications in Medicine) standard.


The JPEG standard was tested on five images, including the test image from the previous sections. The other images can be seen in appendix A. The comparison of the two methods is shown in figure 6.11, where average results for the five images are presented.

USG images are accurate enough when the compression ratio obtained with the JPEG algorithm is not higher than 9:1 (see section 1.2). Other sources indicate that the PSNR of a reconstructed medical image shall be higher than 40 dB [Mul03]. The two requirements coincide, because at compression ratio 9:1 the average PSNR obtained with the JPEG algorithm is equal to 40 dB. The proposed fractal compression algorithm gives slightly worse results – a PSNR of 40 dB is attainable at a compression ratio of about 7.5:1, while a compression ratio of 9:1 yields a PSNR of 38.75 dB.

Figure 6.11. Comparison of the proposed fractal compression method and JPEG according to different objective measures: (a) Peak Signal to Noise Ratio; (b) Mean Squared Error; (c) Image Fidelity; (d) Mean Absolute Error.

In order to gain full knowledge of the fractal compression performance in comparison to the JPEG standard, the fidelity of the decompressed images was measured not only with the PSNR and MSE measures but also with the following objective measures:


• Image Fidelity (IF)

$$\mathrm{IF} = 1 - \frac{\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left[X(m,n)-\hat{X}(m,n)\right]^2}{\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left[X(m,n)\right]^2}$$

• Mean Absolute Error (MAE)

$$\mathrm{MAE} = \frac{1}{M \cdot N}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left|X(m,n)-\hat{X}(m,n)\right|$$

All objective measures indicate that the proposed fractal compression method outperforms the JPEG algorithm only at high compression ratios. According to the PSNR measure, fractal compression is better at compression ratios higher than 18:1; the analysis of MAE gives the same conclusion. However, the results of IF and MSE show that fractal compression is better at compression ratios higher than 14:1 and 15:1, respectively.

Thus, the proposed fractal compression method is superior to the JPEG algorithm at compression ratios higher than 18:1 and inferior at compression ratios lower than 14:1. For both compression methods, compression ratios between 14:1 and 18:1 are associated with similar fidelity of the reconstructed image.

According to the literature [Bel02, Che98], the encoding time of the fractal method is much longer (at least several dozen times) than that of the JPEG algorithm. However, detailed experimental measurements of this are not needed in order to achieve the goal of the thesis.
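For reference, the four objective measures used in this comparison can be computed as in the following sketch (x is the original image, y the reconstructed one, maxX the maximal pixel value, e.g. 255 for 8-bit images; names are illustrative).

final class Measures {
    static double mse(double[][] x, double[][] y) {
        double s = 0; int m = x.length, n = x[0].length;
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) { double d = x[i][j] - y[i][j]; s += d * d; }
        return s / (m * n);
    }

    static double psnr(double[][] x, double[][] y, double maxX) {
        return 10.0 * Math.log10(maxX * maxX / mse(x, y));
    }

    static double imageFidelity(double[][] x, double[][] y) {
        double num = 0, den = 0;
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < x[0].length; j++) {
                double d = x[i][j] - y[i][j];
                num += d * d;
                den += x[i][j] * x[i][j];
            }
        return 1.0 - num / den;
    }

    static double mae(double[][] x, double[][] y) {
        double s = 0; int m = x.length, n = x[0].length;
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) s += Math.abs(x[i][j] - y[i][j]);
        return s / (m * n);
    }
}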

6.6. Magnification

The fidelity of a fractally magnified image is measured indirectly, because there is no objective error measure that allows comparing images of different sizes. Because of this, when the magnified image X̃ is created, it is encoded (with the same method and settings) and decoded back to the dimensions of the original image X – the resulting image will be denoted X̄. Instead of measuring the PSNR between X and X̃, the distance is measured between X and X̄.

$$X_{A\times B}\ \xrightarrow{\ \text{zoom in}\ \eta\ \text{times}\ }\ \tilde{X}_{\eta A\times\eta B}\ \xrightarrow{\ \text{zoom out}\ \eta\ \text{times}\ }\ \bar{X}_{A\times B}$$
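In code, the measurement pipeline could look like the sketch below, where all types and the encode/decode/psnr routines are stand-ins for the coder's actual ones.

interface Image { int width(); int height(); }

interface Coder {
    Object encode(Image img);
    Image decode(Object fractalCode, int targetWidth, int targetHeight);
    double psnr(Image a, Image b);
}

final class MagnificationTest {
    static double fidelity(Coder coder, Image original, int eta) {
        // Decode the fractal code at eta times the original resolution.
        Image magnified = coder.decode(coder.encode(original),
                original.width() * eta, original.height() * eta);
        // Re-encode the magnified image and decode it back to the original
        // size, so that PSNR is computed between images of equal dimensions.
        Image demagnified = coder.decode(coder.encode(magnified),
                original.width(), original.height());
        return coder.psnr(original, demagnified);
    }
}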

Analogous calculations are made for bilinear and bicubic interpolation. Figure 6.12 presents the results of the first magnification test, which was performed on six images of different sizes. All six images are parts of the test image 6.1(a). Each image was magnified two, four and eight times. The encoding settings may vary between images and magnification factors. The fractal magnification outperforms both interpolation methods. Only at a zoom factor of four is the quality of the image magnified with bicubic interpolation close to the fractally zoomed image. The largest difference can be observed at two-times magnification – close to 1 dB.

The results of the above experiment are affected by an imperfection of the method used to measure the fidelity of magnified images. During magnification (X → X̃) as well as during demagnification (X̃ → X̄), the range blocks may have size 2 × 2 or larger. This setting of the minimal range block size gives the best image fidelity but also causes problems during demagnification. When the image X̃ is demagnified, the sizes of all range blocks are reduced as many times as the image is demagnified. A 2 × 2 range block is degraded to a single pixel when the image is demagnified two times, but range blocks of the same size are rescaled to zero pixels when the image is demagnified four or more times. Thus, not the whole fractal operator is used during decoding. The transformations with the smallest range blocks (with at least one side smaller than half the demagnification ratio) do not have any influence on the reconstructed image. Although the results of four- and eight-times fractal magnification (presented in figure 6.12) are lowered by this imperfection of the comparison method, they are still better than the results of the interpolation methods.

Figure 6.12. Comparison of fractal magnification with interpolation methods for different magnification ratios.

The second magnification test was performed on a larger set of images. The images have various sizes and were created by cutting out parts of five different ultrasonograms. The magnification methods were tested on forty images created in this manner; average values of the results are presented in table 6.1 and in figure 6.14. An additional measure was used – the maximal error (ME), which returns the maximal difference between corresponding pixels of images X and X̄. In the figures as well as in the table, the results of the fractal magnification are confronted only with the better interpolation-based magnification method – bicubic interpolation.

The goal of this test was to produce experimental results independent of the characteristics of a single image, but also to find a configuration of fractal magnification settings that gives results close to optimal for any ultrasonography image.

The results of the second test confirm the superiority of fractal magnification over the interpolation methods. All objective measures besides the maximal error (ME) indicate that the fidelity of fractally magnified images is higher than the fidelity of images magnified through bicubic interpolation. The maximal error is higher for fractal magnification because the distribution of pixel errors differs between the two methods.


Table 6.1. Comparison of fractal magnification and bicubic interpolation

method                PSNR   MSE    IF      ME     MAE
fractal (optimal TC)  37.77  12.17  0.9982  42.00  1.78
fractal (TC = 1)      37.58  12.68  0.9982  42.23  1.82
bicubic               36.58  16.69  0.9978  22.4   2.57

Although the overall error caused by fractal magnification is lower than the one caused by bicubic interpolation, rare pixels can be observed where the errors are higher. Interpolation decreases the brightness of the image during magnification, while fractal magnification most often increases it.

Based on the experimental results obtained in the second magnification test, the dependence of the image fidelity on the image size was also investigated (figure 6.13). When the image is smaller, the virtual codebook contains fewer blocks and it is harder to find a good match between range blocks and codebook blocks. Thus, the larger an image is, the better the quality of the magnified image can be. The quality is significantly lower when 32 × 32 images are magnified.

Figure 6.13. Fractal magnification and interpolation methods for different image sizes.

Nevertheless, the decrease in image fidelity caused by a too small original image is smaller when the fractal method is used compared to bicubic interpolation. The image fidelity is lowered in bicubic interpolation because an incomplete context of the pixel is used when calculating pixel intensities. This concerns only pixels close to the border of the image, but when the image is smaller, the relative number of such pixels is higher.

The fidelity, quality and usefulness of images magnified with the fractal method are higher than with bicubic interpolation according to a subjective assessment made in cooperation with physicians. The visibility and fidelity of small details, the readability of edges in the image and the lack of blurring are the main reasons for the superiority of fractal magnification. The quality of the fractally magnified image is lowered by the block effect, but it does not cause problems with reading the image. The block effect can be reduced by introducing postprocessing (see section 3.6).


Figure 6.14. Comparison of the proposed fractal compression/magnification method and interpolation methods according to different objective measures: (a) Peak Signal to Noise Ratio; (b) Mean Squared Error; (c) Image Fidelity; (d) Mean Absolute Error.


Figure 6.15. Fractal magnification and bicubic interpolation examples: (a) original 128 × 128 image; (b) fractally magnified image (PSNR = 36.94 dB); (c) magnification through bicubic interpolation (PSNR = 35.94 dB).


Conclusions

Research Question 1: Is it possible, and how, to minimize the drawbacks of the fractal compression method to a satisfying level in order to apply this method to medical imaging?

The two main drawbacks of fractal image compression that have to be eliminated in order to make its use in medical imaging possible are:
1. too large a loss of information,
2. a very long encoding time.

Any irreversible compression method has to face the first problem, because medical images have to be very accurate. The satisfying level of distortion for medical images was mentioned in section 6.5. The experiments proved that the proposed fractal compression method can reach this level (PSNR = 40 dB). Furthermore, an image of this quality is compressed at a compression ratio of about 7.5:1. Thus, the use of fractal compression reduces the file size about twice as efficiently as the lossless compression methods (whose compression ratios are not larger than 4:1). At compression ratios from 4:1 to 9:1, the PSNR varies between 44.2 dB and 40 dB. The accuracy is this high because the proper fractal compression method is used; the choice of that method would not have been possible without the meticulous review of fractal compression methods presented in this thesis.

The second problem can be solved in a variety of ways that were described in the thesis. Several of them were implemented, and some new propositions were made to optimize the codebook operations. All of these efforts gave satisfactory results. It is possible to encode an average ultrasonography image in a few seconds or several dozen seconds, and a 256 × 256 image in about 5 seconds, while the required image fidelity is preserved (2 GHz processor, 1 GB RAM). The main means of reducing the encoding time of hierarchical horizontal-vertical fractal encoding to a reasonable value can be summarized in the following points:
• establish the codebook on the fly – this prevents creating codebook blocks (and computing inner products) that will not be used by any range block,
• reduce the codebook size to the minimal size that still gives the desired fidelity,
• parallelize the encoding if there is more than one processor,


• use variance-based acceleration:
  ◦ exclude from the search the codebook blocks with an intensity variance much larger than the variance of the range block,
  ◦ treat range blocks with low variance as flat blocks.

It is worth mentioning that the proposed fractal coder has been written in Java and compiled to byte code, which is interpreted by the Java Virtual Machine. An improvement in encoding time could be achieved by compiling the program to platform-dependent machine code. Native machine code can be created from Java source files with the gcj compiler (the GNU Compiler for the Java Programming Language); however, it does not support Java 1.6 at the moment. Another way to get machine code is to use one of the Just-in-Time compilers, e.g. the IBM JIT Compiler, which re-compiles the byte code to native code. Java programs, even when compiled to machine code, are slower than programs written in C/C++. Thus, the use of C/C++ would result in a further reduction of the encoding time.

Research Question 2: Which fractal compression method suits medical images best and gives the best results?

The survey of the literature showed two fractal compression methods that are superior to other approaches: the irregular regions approach is unmatched at higher compression ratios, and the Horizontal-Vertical approach gives the best results at low compression ratios.

Because of the specific character of medical images (they require very high fidelity), the HV method was chosen – only the lower compression ratios can be used, because the higher ones would result in too high a distortion level.

However, a new approach is also considered in the thesis, based on the two algorithms mentioned above. A method based on irregular regions would construct the range regions by using HV partitioning. In the opinion of the author, this combination could give better performance in the rate-distortion sense than any purely hierarchical or irregular regions-based approach.

Research Question 3: Does the fractal compression preserve image quality better or worse than other irreversible (information lossy) compression methods?

According to objective measures, the elaborated fractal compression method turns out to be slightly weaker than the JPEG algorithm when the image is encoded with the fidelity required for medical images. Because wavelet coding outperforms JPEG, fractal compression also gives worse results than wavelet compression. Fractal compression gives better results (in the rate-distortion sense) than JPEG at compression ratios higher than 18:1. When the compression ratio falls between 14:1 and 18:1, the objective measures do not give an unambiguous indication of which method, fractal or JPEG, produces an image closer to the original. Nevertheless,



some types of medical images can be compressed (with distortions at a satisfying level) at compression ratios where fractal compression gives better results – see section 1.2.

The second approach to fractal compression – irregular regions based on HV partitioning – might give better rate-distortion results than JPEG.

Moreover, fractal compression is inseparably bound up with fractal magnification – a very useful feature that cannot be found in any other compression method. If fractal compression is treated not only as a method of compressing images but also as a method of improving the presentation quality of the images, then there is no other method that can be compared against it.

Research Question 4: Can the results of fractal magnification be better than the results of traditional magnification methods?

In the author's opinion, the performed experiments leave no doubt – fractal magnification is superior to both bilinear and bicubic interpolation. Although an image zoomed with fractal compression has a visible block effect, the sharpness of the image is much higher and the details are better visible than in images magnified through interpolation. Sharpness is one of the most important factors in medical images. When the image is out of focus, some small but very important details, like a fracture of a bone in an X-ray image, may become invisible. In addition, measurements are much easier and more reliable when the image is sharp enough, because the edges of tissues (or other orientation points) can be better localized. For example, the measurement of intima-media thickness is used in detecting and monitoring aortic atherosclerosis.

The objective measurement (Peak Signal to Noise Ratio) also unambiguously points at fractal magnification as the method that better magnifies medical images. The measurement is performed by calculating the PSNR between the original image and an image created by zooming the magnified image back out to the original image's size.

Future Work

The tests performed to evaluate the accuracy of the proposed fractal compression method utilized only objective measures. These measures are very helpful in comparing different compression methods, but they do not reflect the usefulness of the evaluated compression to the persons who read the images. Because of this, it is highly advisable to perform experiments with human observers in order to evaluate the quality of the compressed images. Preferably, the tests should involve specialists who in their day-to-day work establish diagnoses based on medical images. Such tests would make it possible to adjust the proposed fractal compression method to the Human Visual System (HVS).

The proposed fractal compression algorithm was tested only on Doppler ultrasonography images. Nevertheless, different types of medical images tolerate different distortions and amounts of lost information. In order to gain more general knowledge about the suitability of the method for medical imaging, other types of medical images shall also be subjected to the experiments.

The selection of the partitioning method was based on the assumption that the existing fractal compression methods based on irregular regions perform worse than Horizontal-Vertical partitioning at low compression ratios. This assumption could not be confirmed due to the lack of such a comparison of the two methods in the literature. In the future, the best method based on irregular regions (utilizing quadtree partitioning in the split phase) shall also be submitted to the tests that were made for the HV partitioning.

and the encoding time may be achieved by merging the best two fractal methods –based on irregular regions and based on HV-partitioning. The way to do this is toutilize the HV-partitioning as the first step (splitting phase) to create the irregularregions. The literature provides solutions how to merge the partitions to create theirregular regions, i.e. the neighboring blocks with similar variance and average pixelintensity may be united into a single irregular range block. This thesis provides thesolution to the only problem that cannot be solved with help of the literature becausesuch compression method was not considered by any researcher. The question is how tostore the transformations’ descriptions in the fractal code and the proposed solution isto use the format presented in this thesis (section 5.5.4). However, this new compressionmethod with irregular regions still needs to be implemented and tested.


Appendix A

Sample Images

All images presented here have been encoded with the fractal method based on most-significant edge detection with Fisher's mechanism for rectangle degradation prevention. The minimal range block size has been set to 2 × 2, the domain offset to 2, and the search is not interrupted by the first found codebook block that fulfills the tolerance criterion. The presented images were encoded with various tolerance criteria.

Images #1 – #5 are complete ultrasonograms; these images were reconstructed at their original sizes. In order to fit the presented images onto pages without rescaling, they are rotated by 90 degrees. Images #6 and #7 are only parts of ultrasonograms – they were reconstructed at larger than original sizes.


Figure A.1. Original image #1.


Figure A.2. Reconstructed image #1: errort = 6, PSNR = 42.53 dB, CR = 6.26.


Figure A.3. Differential image #1: errort = 6.


Figure A.4. Partitioning of image #1: errort = 6.


Figure A.5. Reconstructed image #1: errort = 11, PSNR = 40.02 dB, CR = 8.92.


Figure A.6. Differential image #1: errort = 11.


Figure A.7. Partitioning of image #1: errort = 11.


Figure A.8. Reconstructed image #1: errort = 22, PSNR = 36.94 dB, CR = 14.11.


Figure A.9. Differential image #1: errort = 22.


Figure A.10. Partitioning of image #1: errort = 22.


Figure A.11. Reconstructed image #1: errort = 100, PSNR = 30.01 dB, CR = 52.22.


Figure A.12. Differential image #1: errort = 100.


Figure A.13. Partitioning of image #1: errort = 100.


Figure A.14. Original image #2.


Figure A.15. Reconstructed image #2: errort = 12, PSNR = 40.32 dB, CR = 9.68.


Figure A.16. Differential image #2: errort = 12.


Figure A.17. Partitioning of image #2: errort = 12.


Figure A.18. Original image #3.


Figure A.19. Reconstructed image #3: errort = 10, PSNR = 40.43 dB, CR = 6.69.


Figure A.20. Differential image #3: errort = 10.


Figure A.21. Partitioning of image #3: errort = 10.


Figure A.22. Original image #4.


Figure A.23. Reconstructed image #4: errort = 12, PSNR = 40.37 dB, CR = 5.64.


Figure A.24. Differential image #4: errort = 12.


Figure A.25. Partitioning of image #4: errort = 12.


Figure A.26. Original image #5.


Figure A.27. Reconstructed image #5: errort = 10, PSNR = 40.12 dB, CR = 5.36.


Figure A.28. Differential image #5: errort = 10.


Figure A.29. Partitioning of image #5: errort = 10.


Figure A.30. Original image #6, size: 192 × 192.

Figure A.31. Partitioning of image #6: errort = 1.


Figure A.32. Magnified image #6, size: 384 × 384, errort = 1, PSNR = 41.36 dB.


Figure A.33. Original image #7, size: 96 × 96.

Figure A.34. Partitioning of image #7: errort = 1.3.


Figure A.35. Magnified image #7, size: 384 × 384, errort = 1.3, PSNR = 41.5 dB.


Appendix B

Glossary

1

Fixed block – all pixels have the same value.

A

Attractor – fixed point of the operator W .

$$A = \lim_{i\to\infty} W^{\circ i}(f_0)$$

avgHeight(R)

Average height in pixels of all range blocks that constitute the fractal operator.

avgWidth(R)

Average width in pixels of all range blocks that constitute the fractal operator.

b

Width and height of range blocks when uniform partitioning is used.

Bi

Rectangle block of pixels. May denote a range block, codebook block or domain block.

|Bi|

Number of pixels in the block Bi.


Bcomp

Number of bits in the compressed image (length in bits of the compressed representation).

Borg

Number of bits in the original image (before compression).

bitd

Bit Depth – number of bits used to store a single pixel of the original image.

bs

Base of the logarithm in the I(ui) formula. When bs = 2, the unit of I(ui) is the bit; when bs = 3, the unit is the trit; when bs = e (natural logarithm), the unit is the nat; and the last unit, the Hartley, is used when bs = 10.

BR

Bit Rate – the average number of bits in the compressed representation of the data per single element (symbol) in the original set of data.

BR = Bcomp/Borg

C

Virtual codebook – set of all spatially contracted domain blocks from D.

|C|

Length of the codebook C. |C| = |D|

Ci

Codebook block – spatially contracted domain block Di.

|Ci|

Number of pixels in the codebook block Ci.

CRi

Subset of the codebook C with the blocks that shall be compared with the range block Ri during the search for a transformation.


CF

Spatial contraction factor used in transformation τC .

CP

Compression Percentage.

CP = (1− 1/CR) · 100%

CR

Compression Ratio – ability of the compression method to reduce the amount of disk space needed to store the data.

CR = Borg/Bcomp

D

Domain pool – set of all domain blocks utilized during encoding.

|D|

Length of the domain pool D. |D| = |C|

D(X, X̂)

Average distortion between images X and X̂.

$$D(X,\hat{X}) = E\left\{d(X,\hat{X})\right\} = \sum_{x_i}\sum_{\hat{x}_i} f_{X,\hat{X}}(x_i,\hat{x}_i)\cdot d(x_i,\hat{x}_i)$$

Di

Domain block.

d(xi, x̂i)

Distortion per symbol.

dc

Mean intensity of a block.

dcbottom(n)

Average value of block’s pixels that lie in rows with indexes higher than n.


dcleft(m)

Average value of block’s pixels that lie in columns with indexes not higher than m.

dcright(m)

Average value of block's pixels that lie in columns with indexes higher than m.

dctop(n)

Average value of block’s pixels that lie in rows with indexes not higher than n.

domainOffset

Offset in pixels between two spatially closest domain blocks.

δ(X, X), δ(X,A)

Distance between two images / attractors.

EhED(n)

Significance of the horizontal edge between the nth and (n + 1)th row.

EvED(m)

Significance of the vertical edge between the mth and (m + 1)th column.

EhVM(n)

Sum of the intensity variances of the two blocks that can be created by cutting the range block Ri horizontally between the nth and (n + 1)th row.

$$E^{h}_{VM}(n) = \sum_{i=0}^{M-1}\sum_{j=0}^{n}\left(r_{i,j}-dc_{top}(n)\right)^2 + \sum_{i=0}^{M-1}\sum_{j=n+1}^{N-1}\left(r_{i,j}-dc_{bottom}(n)\right)^2$$

EvVM(m)

Sum of the intensity variances of the two blocks that can be created by cutting the range block Ri vertically between the mth and (m + 1)th column.

$$E^{v}_{VM}(m) = \sum_{i=0}^{m}\sum_{j=0}^{N-1}\left(r_{i,j}-dc_{left}(m)\right)^2 + \sum_{i=m+1}^{M-1}\sum_{j=0}^{N-1}\left(r_{i,j}-dc_{right}(m)\right)^2$$

errort

Error threshold, distance tolerance – maximal allowed distance between paired range and domain blocks.


fX(xi)

Occurrence probability of a determined symbol xi in stream X.

fX̂|X(x̂i, xi)

Conditional probability that the symbol x̂i occurs in source X̂ given that the symbol xi occurs in source X.

gFisher

Fisher’s mechanism for rectangle degradation prevention [Fis95c].

$$g_{Fisher}(x) = \frac{\min(x,\ x_{max} - x)}{x_{max}}$$

g∗Fisher

Fisher’s mechanism of rectangle degradation prevention; adapted to block splittingmethod based on variance minimization.

$$g^{*}_{Fisher}(x) = \frac{x_{max}/2 - \min(x,\ x_{max} - x)}{x_{max}}$$

gflat

Simple mechanism for rectangle degradation prevention.

$$g_{flat}(x) = \begin{cases} 0 & \text{when } (x < t) \vee (x_{max} - x < t) \\ 1 & \text{otherwise} \end{cases}$$

gSaupe

Saupe’s mechanism for rectangle degradation prevention [Sau98].

$$g_{Saupe}(x) = 0.4\left[\left(\frac{2}{x_{max}}x - 1\right)^{2} + 1\right]$$

g∗Saupe

Saupe’s mechanism of rectangle degradation prevention; adapted to block splittingmethod based on most-significant edge detection.

$$g^{*}_{Saupe}(x) = 0.8 - 0.4\left[\left(\frac{2}{x_{max}}x - 1\right)^{2} + 1\right]$$


H(U)

Entropy – the amount of information specified by a stream U of symbols.

$$H(U) = \sum_{i=1}^{|U|} p(u_i)\, I(u_i) = \sum_{i=1}^{|U|} p(u_i)\, \log_{bs}\frac{1}{p(u_i)} = -\sum_{i=1}^{|U|} p(u_i)\, \log_{bs} p(u_i)$$

hn

Numerical value that characterizes the potential range block cutting line between rows n and n + 1. The values {hn : 0 ≤ n < N} are used as the weights of all potential horizontal cutting lines.

height(Bi)

Number of pixels in a column of block Bi.

Im

Mutual Information – the average information that the random variables (X and X̂) convey about each other.

$$I_m(X,\hat{X}) = H(X) - H(X|\hat{X}) = H(\hat{X}) - H(\hat{X}|X)$$

I(ui)

Amount of information carried by the symbol ui. The unit depends on the bs value.

$$I(u_i) = \log_{bs}(1/p_i)$$

IF

Image Fidelity – an objective measure of distortions.

$$\mathrm{IF} = 1 - \frac{\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left[X(m,n)-\hat{X}(m,n)\right]^2}{\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left[X(m,n)\right]^2}$$

leafNo

Number of leaves in the Huffman tree. In the thesis, leafNo = |W|.

levelNo

Number of levels in Huffman tree.


M

Number of columns in the original image.

m

Index of a column. 0 ≤ m < M

MAE

Mean Absolute Error – an objective measure of distortions.

$$\mathrm{MAE} = \frac{1}{M\cdot N}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left|X(m,n)-\hat{X}(m,n)\right|$$

maxX

Maximal possible pixel value of the image X. maxX = 2^bitd − 1

ME

Maximal Error – the maximal difference between corresponding pixels of X and X̂.

MSE

Mean Squared Error – an objective measure of distortions.

$$\mathrm{MSE} = \frac{1}{M\cdot N}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}\left[X(m,n)-\hat{X}(m,n)\right]^2$$

N

Number of rows in original image.

n

Index of a row. 0 ≤ n < N

oi

Offset coefficient of the intensity transform τ I .

pi

Probability that the symbol ui will occur in the input sequence.


PSNR

Peak Signal to Noise Ratio – an objective measure of distortions, expressed indecibels (dB).

$$\mathrm{PSNR} = 10\cdot\log_{10}\!\left(\frac{max_X^2}{\mathrm{MSE}}\right)$$

R

Set of all range blocks in the image.

Ri

Range block.

Ri(m,n), Di(m,n), Ci(m,n)

Pixel value in the mth column and nth row of the block Ri, Di or Ci.

RIN

Inner node of the partitioning tree – a range block divided horizontally or vertically. Its representation in the fractal code requires L bits.

rm,n

Pixel value in the mth column and nth row of the range block Ri.

s

Contractivity factor of fractal operator W . 0 < s < 1

si

Scaling coefficient of the intensity transform τ Ii

sizet

Range size threshold – minimal size of the range blocks, set before encoding.

σ(Bi)

Variance of the block Bi.

τi(Di)

Transformation applied to the pixels of Di before they are mapped to the pixels of Ri. Part of transformation wi. τi = τIi ∘ τSe ∘ τC


τ Ii (Bi)

Intensity transformation of the pixels within domain block Bi.

τIi(Bi) = si · Bi + oi · 1

τC(Bi)

Spatial contraction of a domain block.

τSe (Bi)

Symmetry operation with index e applied to the domain block Bi.

U

Sequence of symbols. U = u1, u2, ..., u|U|

ui

Symbol in the input sequence U on the ith position.

vm

Numerical value that characterizes the potential range block cutting line between columns m and m + 1. The values {vm : 0 ≤ m < M} are used as the weights of all potential vertical cutting lines.

W

Fractal Operator.

$$W = \bigcup_{i=1}^{|W|} w_i$$

|W|

Number of transformations in the fractal operator W.

wi

A contractive affine transformation that is part of the fractal operator W .

width(Bi)

Number of pixels in a row of block Bi.


X

Stream to be encoded with a lossy compression method; the original image. X = x1, x2, ..., x|X|

X̃

Magnified image.

X̄

Image X̃ demagnified to the size of the original image X.

X̂

Stream reconstructed from lossy compression; the reconstructed image. X̂ = x̂1, x̂2, ..., x̂|X|

X(m,n)

Intensity value of the pixel in the mth column and nth row of the image X.

X0

Initial image submitted to fractal decoding.

Xr

Image produced in iteration r of fractal decoding.

$$f^{r} = W(f^{r-1})$$

xi

Symbol on the ith position in the input stream X that is encoded with lossy compression.

x̂i

Symbol on the ith position in the stream X̂ reconstructed from lossy compression.

xmax

Maximal possible value of the variable x.


Appendix C

Application Description andInstructions for Use

The WoWa Fractal Coder software was created in order to practically test the suitability of fractal compression methods for medical imaging.

C.1. How to Run the Program

WoWa Fractal Coder requires Java 1.6 (also referred to as Java 6.0). The JRE (Java Runtime Environment) version of Java is enough to run the program. The newest version of the JRE can be downloaded from Sun's webpage (http://www.sun.com).

When Java 1.6 is installed, the zip file with the application has to be extracted into an empty folder. In this folder, an executable jar and a folder named lib shall appear.

The application can be run by double-clicking the wowaCoder.jar file or by typing the following in the command line:

java -jar "wowaCoder.jar"

Because the application is very memory consuming, it is advised to increase the Java Heap Space. This can be done by specifying the -Xmx parameter, e.g.:

java -jar -Xmx250m wowaCoder.jar

C.2. Common Interface Elements and Functionalities

The software provides two fundamental views: the WoWa Fractal Encoder and the WoWa Fractal Decoder; the user can switch between them through the View menu in the menu bar. Both the encoder and the decoder window have almost the same menu bar, tool bar and status bar.

C.2.1. Menu Bar

In the menu bar, the following menus can be found:


Figure C.1. The WoWa Fractal Coder user interface: (a) the WoWa Fractal Encoder window; (b) the WoWa Fractal Decoder window.


• File
  ◦ Load – displays an open-file dialog; in the encoder window the user picks the image to be encoded, and in the decoder window the file with the fractal operator describing an encoded image
  ◦ Save – saves the content of the log of the currently visible window, as well as the result of the last encoding (encoder window) or decoding (decoder window) process
  ◦ Close – exits the application
• View – makes it possible to switch between the encoder and the decoder view, which can also be done by pressing the F2 key. In the encoder window there is one more menu item, which displays the Image Comparator window.
• Options – gives access to dialog windows for changing the encoding and decoding settings as well as the application preferences

C.2.2. Tool Bar

The tool bars have three common buttons:
• Load – has the same effect as the Load item in the File menu
• Start – begins the processing proper to the currently visible window: encoding or decoding
• Stop – if the encoding/decoding process is currently running, this button replaces the Start button and allows the user to interrupt the process

Besides these three buttons, both the encoder and the decoder have three more buttons. In the encoder these are:
• a Save button, which allows saving the last produced fractal operator
• a toggle button that shows/hides the selected area in the Original tab
• a button which resets the selection to the default (the whole original image)

The decoder in these places has the following buttons:
• a button that loads the last generated fractal operator directly from the encoder instead of loading it from a file
• a button that gives the user the possibility to set an initial image that will be passed to the decoding algorithm. The first iteration takes the domain blocks from the initial image; the reconstructed image size and proportions are also the same as in the initial image. If no initial image is loaded, a plain black image becomes the initial image. The size of an initial image created in this way is equal to the original size of the encoded image multiplied by the magnification factor (decoding settings); the proportions in this case are the same as in the original image.
• a button that clears the decoder window – the images in the Decompressed tab and the Initial tab are erased, and all tabs created during the encoding are deleted.

C.2.3. Status Bar

The status bars in the encoder and decoder windows have three common elements:
• the path of the loaded file – the image to be encoded, or the file with the fractal operator describing an encoded image
• the description of the current program status – if encoding or decoding is currently executing, the corresponding information appears in the status bar
• a progress bar that shows the progress of the process indicated by the label to its left (see the point above)

In the encoder, between the file path and the current process name, the location and size of the selected area of the loaded image are shown. In this place in the decoder window, the original size of the image encoded into the loaded fractal operator and the size to which the image is decompressed are shown.

C.2.4. Pop-up Menus

Any image presented in the application can be zoomed in, zoomed out or saved to a file. All these actions can be accessed through a pop-up menu that is displayed after right-clicking on an image. When the cursor is over the pop-up menu, information about the current zoom of the image is displayed in the top left corner of the image.

Right-clicking on the content of a Log tab displays a pop-up menu with only one item, which allows saving the log to a *.txt file.

C.3. WoWa Fractal Encoder

The main part of the Encoder window is occupied by a tabbed panel. There are several tabs:
• "Original" – here the original image is displayed, and the area of the image that shall be compressed can be defined.
• "Reconstructed" – after encoding, the image reconstructed from the fractal code created during the encoding process can be found here (only if the Reconstruct the image after encoding option is enabled in the application preferences).
• "Comparison" – contains the selected area of the original image converted to the grayscale color model, the reconstructed image and the differential image. Histograms of pixel intensities in the original and the reconstructed image are also presented here.
• "Partitions" – contains two images that reflect the partitioning of the original image into range blocks.
• "Transformations" – holds an interactive image based on the original image, which allows tracking which domains are mapped to which range blocks and how the domain blocks are transformed.
• "Log" – the actions taken by the user are written to the log with an attached timestamp. The summary of the encoding process is also presented here.

The tabs with more complex content are described in more detail in the following subsections.

C.3.1. Original Tab

In this tab the user can not only familiarize himself with the image loaded for encoding, but also limit the area of the image that shall be passed to the encoder. The area can be delimited in two ways:


• by left-clicking, the upper left corner of the selection is set; by holding the left mouse button and moving the cursor, the selection size is determined. This method can be used when the selection is not visible.
• when the selection is shown, by adjusting its size and location through moving the displayed selection borders, or by drag-and-drop performed on the selection rectangle.

The selection rectangle can be shown/hidden by clicking the show/hide selection toggle button in the tool bar. The rightmost button in the tool bar sets the selection to the whole loaded image, which is also the default selection.

C.3.2. Comparison Tab

In this tab several images are displayed; their number depends on the application settings.

The selected area of the original image, which was passed to the fractal encoder, is always displayed. If the original image is in color, the selected area is converted to grayscale by calculating the luminance of each pixel with the function f(x, y) = 0.3R + 0.59G + 0.11B (where R, G, B are the red, green and blue components of the original pixel value), which is used in the YUV and L*a*b color models.

If the encoded image is automatically reconstructed, the reconstructed image is also displayed. The image is decoded in accordance with the settings that are also used by the WoWa Fractal Decoder window. One of these settings is the zoom factor.

If the reconstructed image is zoomed in or zoomed out, another image is displayed next to the reconstructed one. This image is created by encoding the reconstructed image with the same algorithm and settings and decoding it to the size of the selected area in the original image.

A differential image can also be visible (only if it is enabled in the application preferences). If the zoom parameter is not equal to one, the differential image reflects the differences between this second reconstructed image and the selected area of the original image.

Both histograms in this tab, as well as the distance histogram in the Transformations tab, are presented as bar charts. The user can display an exact value of the histogram by left-clicking on the chart. A double left-click on the chart erases all previously requested messages about histogram values from the chart.
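
For illustration, the grayscale conversion can be sketched as below. This is only a minimal sketch of the stated formula; the class and method names are hypothetical and do not come from the WoWa Fractal Coder source.

    import java.awt.image.BufferedImage;

    // Illustrative sketch only; names are hypothetical, not taken from
    // the WoWa Fractal Coder source.
    public final class Luminance {

        // Converts an RGB image to grayscale with f(x, y) = 0.3R + 0.59G + 0.11B.
        public static BufferedImage toGrayscale(BufferedImage src) {
            BufferedImage dst = new BufferedImage(
                    src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
            for (int y = 0; y < src.getHeight(); y++) {
                for (int x = 0; x < src.getWidth(); x++) {
                    int rgb = src.getRGB(x, y);
                    int r = (rgb >> 16) & 0xFF;
                    int g = (rgb >> 8) & 0xFF;
                    int b = rgb & 0xFF;
                    int lum = (int) Math.round(0.3 * r + 0.59 * g + 0.11 * b);
                    dst.setRGB(x, y, (lum << 16) | (lum << 8) | lum);
                }
            }
            return dst;
        }
    }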

C.3.3. Partitions Tab

The partitioning of the image into ranges is presented in two images. The image on the right shows a net created from the boundaries of all range blocks, displayed with the original image in the background. The image on the left does not show the original image; instead, additional information about the range blocks and the search process is displayed there. The range blocks filled with blue color were paired with codebook blocks provided by the solid codebook. The yellow color marks the range blocks that were not paired with any codebook block – they are approximated with a uniform block.

C.3.4. Transformations Tab

As already mentioned, an interactive image based on the original image can be found here. When the cursor is over any pixel of the image, a yellow rectangle is drawn around the range block to which the pixel belongs. Another, darker yellow rectangle surrounds the domain block mapped to that range block. After left-clicking on the image, the currently displayed range and domain blocks are marked permanently (the color of the rectangles is changed to green). When the mouse leaves the image, the transformation represented by the green rectangles becomes the selected one, which makes it possible to zoom in and take a closer look at the images at the bottom of the tab.

The bottom part of the tab is occupied by a chain of images that describe the currently selected transformation, starting from the domain block, through the codebook block, and finishing with an intensity transformed codebook block. There are two more images in this chain: the content of the range block placed on the original image (with which the intensity transformed codebook block is compared during the search for matching domain and range blocks) and the content of the range block placed on the reconstructed image.

Next to the interactive original image, a histogram can be found that shows the frequency of occurrences of spatial distances between paired range and domain blocks. The distances between the upper left corners of these blocks are taken into consideration. This histogram can be very helpful in determining whether spiral search may have any positive effect on the encoding time.
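
Such a histogram can be computed as in the sketch below; the Transformation record and the choice of the Euclidean distance, rounded to an integer bin, are assumptions made here for illustration (Java 16+).

    import java.util.Map;
    import java.util.TreeMap;

    // Hypothetical sketch; the Transformation record and the Euclidean
    // distance bins are assumptions, not the thesis implementation.
    final class DistanceHistogram {

        record Transformation(int rangeX, int rangeY, int domainX, int domainY) {}

        // Counts how often each (rounded) corner-to-corner distance occurs.
        static Map<Integer, Integer> build(Iterable<Transformation> transformations) {
            Map<Integer, Integer> histogram = new TreeMap<>();
            for (Transformation t : transformations) {
                int dx = t.domainX() - t.rangeX();
                int dy = t.domainY() - t.rangeY();
                int bin = (int) Math.round(Math.hypot(dx, dy));
                histogram.merge(bin, 1, Integer::sum);
            }
            return histogram;
        }
    }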

C.3.5. Log Tab

In the log, besides annotating events like starting the encoder window, loading an image or starting/finishing encoding, summaries of the encoding process are also presented. Such a summary provides the user with the following information:
• description of the original image:
◦ the file path
◦ the size of the loaded image
◦ the position of the upper left vertex of the image area selected to be encoded
◦ the size of the selected area

• in-force decoding settings:
◦ the zoom factor
◦ the number of decoding iterations
◦ the decoding method used (single-time quantization vs. standard decoding)

• in-force encoding settings:
◦ the range size threshold
◦ the contraction factor of the contraction transformation
◦ the offset between neighboring domains in the domain pool
◦ the method used to split range blocks into two
◦ the maximal error tolerated when comparing a codebook block with a range block

• time consumption:
◦ duration of the encoding
◦ duration of the decoding

◦ the number of search threads active during encoding (influences only the encoding time)

• characteristics of the fractal operator:
◦ the number of transformations
◦ the average, largest and smallest range block spatial sizes
◦ the average spatial distance between range blocks and domain blocks

• quality of the recovered image expressed by calculated distance measures:
◦ image fidelity
◦ average difference
◦ maximal difference
◦ mean squared error
◦ peak mean squared error
◦ signal to noise ratio
◦ peak signal to noise ratio
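
As an illustration, two of the listed measures can be computed as in the minimal sketch below, assuming 8-bit grayscale images stored as int[width][height] arrays (an assumption made here); the remaining measures follow the same pattern.

    // Illustrative sketch, assuming 8-bit grayscale images stored as
    // int[width][height]; not the thesis implementation.
    final class QualityMeasures {

        static double meanSquaredError(int[][] original, int[][] reconstructed) {
            int w = original.length, h = original[0].length;
            long sum = 0;
            for (int x = 0; x < w; x++) {
                for (int y = 0; y < h; y++) {
                    long d = original[x][y] - reconstructed[x][y];
                    sum += d * d;
                }
            }
            return (double) sum / ((long) w * h);
        }

        // Peak signal to noise ratio in dB for a peak intensity of 255.
        static double peakSignalToNoiseRatio(int[][] original, int[][] reconstructed) {
            double mse = meanSquaredError(original, reconstructed);
            return 10.0 * Math.log10(255.0 * 255.0 / mse);
        }
    }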

C.3.6. Image Comparator

The window allows calculating the error between any two images with the same measures and algorithms that are used to assess the quality of the reconstructed image after encoding. The interface is very simple – there are two buttons at the top, which allow loading the two images to be compared. When both images are loaded, a summary appears at the bottom of the window. The Image Comparator is used when there is a need to assess the quality of compression or magnification methods other than those implemented in the WoWa Fractal Coder.

C.4. WoWa Fractal Decoder

Also in the decoder window, a tabbed panel takes the central place. Initially there are three tabs:
• “Decompressed” – after decoding, it contains the image reconstructed from the fractal code loaded earlier

• “Log” – all user actions are noted and marked with a timestamp

• “Initial” – shows the image (if set) that will be the starting point of the decoding algorithm – it is used as the base image of the first decoding iteration

During decoding, tabs named with successive integers starting from 0 are created; the products of successive decoding iterations are presented in these tabs. The number of these tabs is equal to the number of iterations set in the decoding settings. The product of each iteration becomes the base image for the next iteration – all domain blocks are taken from this image.
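
The iteration scheme can be summarized with the following sketch; the FractalOperator interface is a hypothetical stand-in for the loaded fractal code, not the actual data structure of the application.

    // Minimal sketch of the iterative decoding loop; FractalOperator is a
    // hypothetical stand-in for the loaded fractal code, and apply() maps
    // every contracted, intensity transformed domain block of the base
    // image onto its range block in the result.
    final class Decoder {

        interface FractalOperator {
            double[][] apply(double[][] base); // one decoding iteration
        }

        static double[][] decode(FractalOperator code, double[][] initial, int iterations) {
            double[][] image = initial; // the "Initial" tab image, or any starting image
            for (int i = 0; i < iterations; i++) {
                image = code.apply(image); // the product becomes the next base image
            }
            return image;
        }
    }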

C.5. Settings

The user can define multiple parameters that have an influence on the way the application works and on the encoding and decoding algorithms.

C.5.1. Application Preferences

The Application Preferences define the behavior of different elements of the application:
• Zoom step – how many times an image shall be zoomed in / out after pressing the zoom in / zoom out buttons in the pop-up menu of the image

• Minimal size of the image selection – a too small selected area of the original image (in the encoder window) may cause problems with resizing/relocating the selection

• a check box that lets the user decide whether the image shall be automatically decoded after encoding. If the option is unchecked, the quality of any image reconstructed from code created by the software can be assessed only through the Image Comparator.

• a check box that indicates whether the differential image shall be created after automatic decoding (see above). The pixel values may be equal to the difference between corresponding pixel values in the original and reconstructed images, or this difference may be multiplied by a factor set by the user (a small sketch follows this list).

• the user can choose between two file formats (see section 5.5). If the new approach is chosen, the cell coordinates can be stored with or without the use of Huffman coding.
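
A minimal sketch of the differential image computation, assuming 8-bit grayscale images stored as int[][] arrays and a clamping of the multiplied difference to the valid intensity range (both assumptions made here):

    // Illustrative sketch of the differential image option; the array
    // representation and the clamping to 0..255 are assumptions.
    final class DifferentialImage {

        static int[][] compute(int[][] original, int[][] reconstructed, int factor) {
            int w = original.length, h = original[0].length;
            int[][] diff = new int[w][h];
            for (int x = 0; x < w; x++) {
                for (int y = 0; y < h; y++) {
                    int d = Math.abs(original[x][y] - reconstructed[x][y]) * factor;
                    diff[x][y] = Math.min(255, d); // clamp to the 8-bit range
                }
            }
            return diff;
        }
    }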

C.5.2. Encoder Settings

The encoding algorithm depends heavily on the settings made by the user. The parameters that can be defined by the user are:
• the range block size threshold
• the contraction factor of the contraction transformation
• the offset between neighboring domain blocks in the domain pool – it cannot be smaller than the contraction factor

• the mechanism used to split the range blocks into two during encoding. Two mechanisms are available: edge detection and variance minimization (see chapter 5.2).

• the rectangle degradation prevention mechanism used while splitting each range block. There are three options: Fisher’s function, Saupe’s function and a flat function that has no influence on the splitting mechanism besides eliminating the cutting lines that would produce range blocks smaller than the range block size threshold.

• the distance tolerance criterion – the maximal allowed error between the paired range block and codebook block. The value from the text box is interpreted as the square of the maximal allowed root mean squared error.

• whether the symmetry operations shall be used during the encoding

• whether the search shall be interrupted after finding the first codebook block that yields an error smaller than the tolerance criterion for the given range block, or whether the best match shall always be found

• whether the codebook shall be determined only on the fly (while the search is performed) or whether the part of the codebook with the smallest codebook blocks shall be fixed in a preliminary phase (this part is called the “solid codebook”). While the solid codebook is built, the inner products 〈Cj, 1〉 and 〈Cj, Cj〉 of each codebook block are calculated, as well as the average pixel intensity within the block and, if necessary (see the next bullet), the variance of the pixel intensity. This computation is performed only once when the codebook block is found in the solid codebook, but it has to be repeated each time the block is provided by the “on the fly” codebook. The user can also set the maximal size of codebook blocks that will be added to the solid codebook: any codebook block with both sides not longer than this value will be looked up only in the solid codebook. The solid codebook remains in memory over the whole encoding process, while the “on the fly” codebook is only a piece of functionality and no data structures have to be stored.

• a variance-based acceleration technique can be activated for the encoding process. If this option is turned on, additional information about the blocks is required by the encoding algorithm – the variance of pixel values. The user has two ways of using this information to speed up the encoding (see the sketch after this list):

◦ setting the variance of the shade blocks – all range blocks with variance not larger than the value given by the user will be treated as shade blocks, i.e. no search for a matching codebook block will be performed; instead, the range block will be approximated with a fixed block

◦ setting the maximal allowed distance between codebook block and range block in terms of the variance value – during the search performed to find the codebook block for a given range block, only those codebook blocks are considered whose variance is not larger than the range block’s variance plus the user-given value and not smaller than the range block’s variance. All other codebook blocks are omitted during the search. This acceleration mechanism can be applied to the whole codebook or only to one part of it (the solid codebook or the “on the fly” codebook).

• how many threads should perform the search for matching range and codebook blocks during encoding
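
The two variance-based options mentioned above can be sketched as follows; the Block record and its accessors are illustrative assumptions, not the data structures of the WoWa Fractal Coder (Java 16+).

    import java.util.List;

    // Hedged sketch of the two variance-based accelerations; the Block
    // record is an illustrative assumption.
    final class VarianceAcceleration {

        record Block(double variance) {}

        // Range blocks with variance not larger than the threshold are
        // treated as shade blocks: no search is performed and the block
        // is approximated with a uniform block.
        static boolean isShadeBlock(Block range, double shadeVarianceThreshold) {
            return range.variance() <= shadeVarianceThreshold;
        }

        // Keeps only codebook blocks whose variance lies between the range
        // block's variance and that variance plus the user-given distance.
        static List<Block> filterCodebook(List<Block> codebook, Block range,
                                          double maxVarianceDistance) {
            return codebook.stream()
                    .filter(c -> c.variance() >= range.variance()
                              && c.variance() <= range.variance() + maxVarianceDistance)
                    .toList();
        }
    }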

C.5.3. Decoder Settings

Most of the decoder settings are used not only during decoding started in the decoder window but also during decoding performed automatically after encoding. The following options can be set:
• the number of decoding iterations

• whether the products of successive decoding iterations shall be displayed in tabs of the decoder window

• whether the decoding should use matrices of doubles to store the products of the iterations (decoding with single-time quantization) or matrices of integers (the standard approach)

• whether the reconstructed image should be zoomed in or out compared to the original image size, and by what factor

Appendix D

Source Code and Executable Files

The source code and executable files are attached to the thesis on a CD (hard copy of the thesis) or packed into the same zip file as the pdf document (electronic version).

The source code of the WoWa Fractal Coder is compressed to a zip file placed in the folder src, and the executables (also compressed to a zip file) are in the folder app. The archive with the executable files also contains several sample medical images.

List of Figures

1.1 Examples of x-ray images.
1.2 Example of a Computerized Tomography image (from [Gon02]).
1.3 Examples of gamma-ray images (from [Gon02]).
1.4 Examples of magnetic resonance images (from normartmark.blox.pl, 21.09.2007).
1.5 Example of USG image.

2.1 General scheme for lossy compression.
2.2 Regular scalar quantization.
2.3 Compression system model in rate-distortion theory.
2.4 The relationship between bit rate and distortion in lossy compression.
2.5 Self-similarity in real images (from einstein.informatik.uni-oldenburg.de/rechnernetze/fraktal.htm, 19.01.2008).
2.6 Generation of the Sierpinski triangle. First four iterations and the attractor.
2.7 Barnsley fern.
2.8 Fractal magnifier block diagram.
2.9 Fractal magnification process.

3.1 Quadtree partitioning of a range. Four iterations.
3.2 Horizontal-vertical partitioning of a range. Four iterations.
3.3 Region edge maps and context modeling.
3.4 Performance of compression methods with irregular partitions.
3.5 Probability density function of block offset.
3.6 Spiral search (from [Bar94b]).
3.7 Distributions of scaling and offset coefficients (second order polynomial intensity transformation) (from [Zha98]).

5.1 Comparison of fractal compression methods based on irregular-shaped regions and HV partitioning.
5.2 Hierarchical fractal encoding. Block diagram.
5.3 Isometries for a rectangle block.
5.4 The structure of the fractal code describing a single transformation, depending on the position of the range block with respect to the position of the underlying cell.

5.5 Grid placed on an image. The currently processed cell and the neighboring cells are marked with labeled triangles. The order of traversing the grid is shown by the arrows in the background.
5.6 Example of a Huffman tree used for constructing the binary representation of the fractal operator. Nodes marked with weights and indexes in the ordered list.

6.1 An example medical image and its fractally encoded version.
6.2 The proposed compression algorithm performance depending on the block splitting mechanism and rectangle degradation prevention mechanism. ED denotes the splitting mechanism based on edge detection and VM the mechanism based on variance minimization.

6.3 Test of the bit allocation for the intensity transformation coefficients.
6.4 Test of the bit allocation for the intensity transformation coefficients.
6.5 Influence of the codebook size on the PSNR and the encoding time.
6.6 Best match searching versus first match searching.
6.7 Performance of different codebook types.
6.8 Influence of the classification into ”shade” and ”ordinary” blocks on the encoding time, compression ratio and PSNR.
6.9 Performance of the codebook block elimination based on the blocks’ variance difference.
6.10 Histogram of spatial distances between matching codebook blocks and range blocks.
6.11 Comparison of the proposed fractal compression method and JPEG according to different objective measures.
6.12 Comparison of fractal magnification with interpolation methods for different magnification ratios.
6.13 Fractal magnification and interpolation methods for different image sizes.
6.14 Comparison of the proposed fractal compression/magnification method and interpolation methods according to different objective measures.
6.15 Fractal magnification and bicubic interpolation examples.

A.1 Original image #1.
A.2 Reconstructed image #1: error_t = 6, PSNR = 42.53 dB, CR = 6.26.
A.3 Differential image #1: error_t = 6.
A.4 Partitioning of image #1: error_t = 6.
A.5 Reconstructed image #1: error_t = 11, PSNR = 40.02 dB, CR = 8.92.
A.6 Differential image #1: error_t = 11.
A.7 Partitioning of image #1: error_t = 11.
A.8 Reconstructed image #1: error_t = 22, PSNR = 36.94 dB, CR = 14.11.
A.9 Differential image #1: error_t = 22.
A.10 Partitioning of image #1: error_t = 22.
A.11 Reconstructed image #1: error_t = 100, PSNR = 30.01 dB, CR = 52.22.
A.12 Differential image #1: error_t = 100.
A.13 Partitioning of image #1: error_t = 100.
A.14 Original image #2.
A.15 Reconstructed image #2: error_t = 12, PSNR = 40.32 dB, CR = 9.68.
A.16 Differential image #2: error_t = 12.
A.17 Partitioning of image #2: error_t = 12.
A.18 Original image #3.
A.19 Reconstructed image #3: error_t = 10, PSNR = 40.43 dB, CR = 6.69.

A.20 Differential image #3: error_t = 10.
A.21 Partitioning of image #3: error_t = 10.
A.22 Original image #4.
A.23 Reconstructed image #4: error_t = 12, PSNR = 40.37 dB, CR = 5.64.
A.24 Differential image #4: error_t = 12.
A.25 Partitioning of image #4: error_t = 12.
A.26 Original image #5.
A.27 Reconstructed image #5: error_t = 10, PSNR = 40.12 dB, CR = 5.36.
A.28 Differential image #5: error_t = 10.
A.29 Partitioning of image #5: error_t = 10.
A.30 Original image #6, size: 192×192.
A.31 Partitioning of image #6: error_t = 1.
A.32 Magnified image #6, size: 384×384, error_t = 1, PSNR = 41.36 dB.
A.33 Original image #7, size: 96×96.
A.34 Partitioning of image #7: error_t = 1.3.
A.35 Magnified image #7, size: 384×384, error_t = 1.3, PSNR = 41.5 dB.

C.1 The WoWa Fractal Coder user interface.

Bibliography

[AL99] Anderson-Lemieux, A., Knoll, E. Digital image resolution: what it means and how it can work for you. In Proceedings of the 1999 IEEE International Professional Communication Conference (IPCC 99): Communication Jazz: Improvising the New International Communication Culture, pp. 231–236, 1999.

[Ary93] Arya, S., Mount, D. M. Algorithms for fast vector quantization. In Proceedings of the IEEE Data Compression Conference DCC’93 (J. A. Storer, M. Cohn, eds.), pp. 381–390, Snowbird, UT, USA, 1993.

[Bah93] Baharav, Z., Malah, D., Karnin, E. Hierarchical interpretation of fractal image coding and its applications to fast decoding. In Intl. Conf. on Digital Signal Processing, Cyprus, 1993.

[Bah95] Baharav, Z., Malah, D., Karnin, E. Hierarchical interpretation of fractal image coding and its applications. In Fisher [Fis95a].

[Bar93] Barthel, K. U., Voye, T., Noll, P. Improved fractal image coding. In Proceedings of the International Picture Coding Symposium PCS’93, p. 1.5, Lausanne, 1993.

[Bar94a] Barthel, K. U., Schuttemeyer, J., Voye, T., Noll, P. A new image coding technique unifying fractal and transform coding. In Proceedings of the IEEE International Conference on Image Processing ICIP-94, vol. III, pp. 112–116, Austin, Texas, 1994.

[Bar94b] Barthel, K. U., Voye, T. Adaptive fractal image coding in the frequency domain. In Proceedings of the International Workshop on Image Processing, vol. XLV, pp. 33–38, Budapest, Hungary, 1994.

[BE95] Bani-Eqbal, B. Speeding up fractal image compression. In Proceedings of the IS&T/SPIE 1995 Symposium on Electronic Imaging: Science & Technology, vol. 2418: Still-Image Compression, pp. 67–74, 1995.

[Bea90] Beaumont, J. M. Advances in block based fractal coding of still pictures. In Proceedings IEE Colloquium: The application of fractal techniques in image processing, pp. 3.1–3.6, 1990.

[Bed92] Bedford, T., Dekking, F. M., Keane, M. S. Fractal image coding techniques and contraction operators. Nieuw Archief voor Wiskunde (Groningen), vol. 10(3), pp. 185–218, 1992. ISSN 0028-9825.

[Bel02] Belloulata, K., Konrad, J. Fractal image compression with region-based functionality. IEEE Transactions on Image Processing, vol. 11(4), pp. 351–362, 2002.

[Bog92] Bogdan, A., Meadows, H. E. Kohonen neural network for image coding based on iteration transformation theory. In Proceedings of SPIE Neural and Stochastic Methods in Image and Signal Processing, vol. 1766, pp. 425–436, 1992.

[Bos95] Boss, R. D., Jacobs, E. W. Archetype classification in an iterated transformation image compression algorithm. In Fisher [Fis95a], pp. 79–90.

[Bre98] Breazu, M., Toderean, G. Region-based fractal image compression using deterministic search. In Proc. ICIP-98 IEEE International Conference on Image Processing, Chicago, 1998.

[Cas95] Caso, G., Obrador, P., Kuo, C.-C. J. Fast methods for fractal image encoding. In Proceedings of the IS&T/SPIE Visual Communications and Image Processing 1995 Symposium on Electronic Imaging: Science & Technology (L. T. Wu, ed.), vol. 2501, pp. 583–594, Taipei, Taiwan, 1995.

[CH04] He, C., Huang, X., Yang, S. X. Variance-based accelerating scheme for fractal image encoding. Electronics Letters, vol. 40(2), pp. 115–116, 2004.

[Cha97] Chang, Y.-C., Shyu, B.-K., Wang, J.-S. Region-based fractal image compression with quadtree segmentation. In ICASSP ’97: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 4, p. 3125, IEEE Computer Society, Washington, DC, USA, 1997. ISBN 0-8186-7919-0.

[Cha00] Chang, Y.-C., Shyu, B.-K., Wang, J.-S. Region-based fractal compression for still image. In Proc. WSCG’2000, The 8th International Conference in Central Europe on Computer Graphics, Visualization and Interactive Digital Media, Plzen, Czech Republic, 2000.

[Che98] Chen, C.-C. On the selection of image compression algorithms. In Proc. International Conference on Pattern Recognition (ICPR), vol. 02, pp. 1500–1504, 1998. ISSN 1051-4651.

[Dal92] Dallwitz, M. J. An introduction to computer images. TDWG Newsletter, vol. 7, pp. 1–3, 1992.

[Dav94] Davoine, F., Chassery, J.-M. Adaptive Delaunay triangulation for attractor image coding. In Proc. of the 12th International Conference on Pattern Recognition (ICPR), pp. 801–803, Jerusalem, 1994.

[Dav95] Davoine, F., Svensson, J., Chassery, J.-M. A mixed triangular and quadrilateral partition for fractal image coding. In Proceedings of the IEEE International Conference on Image Processing ICIP-95, vol. III, pp. 284–287, IEEE Computer Society, Washington, D.C., 1995. ISBN 0-8186-7310-9.

[Dav96] Davoine, F., Antonini, M., Chassery, J.-M., Barlaud, M. Fractal image compression based on Delaunay triangulation and vector quantization. IEEE Transactions on Image Processing, vol. 5(2), pp. 338–346, 1996.

[Dek95a] Dekking, F. M. Fractal image coding: some mathematical remarks on its limits and its prospects. Tech. rep. DUT-TWI-95-95, Faculty of Technical Mathematics and Informatics, Delft University of Technology, Delft, The Netherlands, 1995.

[Dek95b] Dekking, F. M. An inequality for pairs of martingales and its applications to fractal image coding. Tech. rep. DUT-TWI-95-10, Faculty of Technical Mathematics and Informatics, Delft University of Technology, 1995.

[Deo03] Deorowicz, S. Universal lossless data compression algorithms, 2003. Doctor of Philosophy Dissertation, Silesian University of Technology.

[Duf92] Dufaux, F., Kunt, M. Multigrid block matching motion estimation with an adaptive local mesh refinement. In Proceedings of SPIE Visual Communications and Image Processing 1992, Lecture Notes in Computer Science, vol. 1818, pp. 97–109, IEEE, 1992.

[Fis92a] Fisher, Y. Fractal image compression. Tech. rep. 12, Department of Mathematics, Technion Israel Institute of Technology, 1992. SIGGRAPH ’92 course notes.

[Fis92b] Fisher, Y. Fractal image compression. In Chaos and fractals: new frontiers of science (H.-O. Peitgen, H. Jurgens, D. Saupe, eds.). Springer-Verlag, New York, 1992.

[Fis92c] Fisher, Y., Jacobs, E. W., Boss, R. D. Fractal image compression using iterated transforms. In Image and text compression (J. A. Storer, ed.), chap. 2, pp. 35–61. Kluwer Academic Publishers Group, Norwell, MA, USA, and Dordrecht, The Netherlands, 1992. ISBN 0-7923-9243-4.

[Fis94] Fisher, Y., Shen, T. P., Rogovin, D. A comparison of fractal methods with discrete cosine transform (DCT) and wavelets. Proceedings of the SPIE — The International Society for Optical Engineering, vol. 2304-16, pp. 132–143, 1994. ISSN 0277-786X.

[Fis95a] Fisher, Y., ed. Fractal image compression: theory and application. Springer-Verlag, Berlin / Heidelberg / London, 1995. ISBN 0-387-94211-4.

[Fis95b] Fisher, Y. Fractal image compression with quadtrees. In Fractal image compression: theory and application [Fis95a], pp. 55–77.

[Fis95c] Fisher, Y., Menlove, S. Fractal encoding with HV partitions. In Fisher [Fis95a], pp. 119–126.

[Fri94] Frigaard, C., Gade, J., Hemmingsen, T., Sand, T. Image compression based on a fractal theory. Internal Report S701, Institute for Electronic Systems, Aalborg University, Aalborg, Denmark, 1994.

[Fri95] Frigaard, C. Fast fractal 2D/3D image compression. Manuscript, Institute for Electronic Systems, Aalborg University, 1995.

[GA93] Gharavi-Alkhansari, M., Huang, T. S. A fractal-based image block-coding algorithm. In Proceedings of ICASSP-1993 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 345–348, 1993.

[GA94a] Gharavi-Alkhansari, M., Huang, T. S. Fractal based techniques for a generalized image coding method. In Proc. ICIP-94 IEEE International Conference on Image Processing, vol. III, pp. 122–126, Austin, Texas, 1994.

[GA94b] Gharavi-Alkhansari, M., Huang, T. S. Generalized image coding using fractal-based methods. In Proceedings of the International Picture Coding Symposium PCS’94, pp. 440–443, Sacramento, California, 1994.

[GA96] Gharavi-Alkhansari, M., Huang, T. S. Fractal image coding using rate-distortion optimized matching pursuit. In Proceedings of SPIE Visual Communications and Image Processing 1996 (R. Ansari, M. J. Smith, eds.), vol. 2727, pp. 1386–1393, Orlando, Florida, 1996.

[Gon02] Gonzalez, R. C., Woods, R. E. Digital image processing. Prentice-Hall, New Jersey, 2nd edn, 2002. ISBN 0-201-18075-8.

[Got95] Gotting, D., Ibenthal, A., Grigat, R.-R. Fractal image coding and magnification using invariant features. In NATO ASI Conference on Fractal Image Encoding and Analysis, Trondheim, Norway, 1995.

[Got97] Gotting, D., Ibenthal, A., Grigat, R.-R. Fractal image coding and magnification using invariant features. Fractals, vol. 5 (Supplementary Issue), pp. 65–74, 1997.

[Ham97a] Hamzaoui, R. Codebook clustering by self-organizing maps for fractal image compression. Fractals, vol. 5 (Supplementary Issue), pp. 27–38, 1997.

[Ham97b] Hamzaoui, R. Ordered decoding algorithm for fractal image compression. In Proceedings of the International Picture Coding Symposium PCS’97, pp. 91–95, Berlin, Germany, 1997.

[Har97] Hartenstein, H., Saupe, D., Barthel, K.-U. VQ-encoding of luminance parameters in fractal coding schemes. In Proceedings ICASSP-97 (IEEE International Conference on Acoustics, Speech and Signal Processing), vol. 4, pp. 2701–2704, Munich, Germany, 1997.

[Har00] Hartenstein, H., Ruhl, M., Saupe, D. Region-based fractal image compression. IEEE Transactions on Image Processing, vol. 9(7), pp. 1171–1184, 2000.

[Hur93] Hurtgen, B., Stiller, C. Fast hierarchical codebook search for fractal coding of still images. Proceedings of the SPIE — The International Society for Optical Engineering, vol. 1977, pp. 397–408, 1993.

[Hur94] Hurtgen, B., Mols, P., Simon, S. F. Fractal transform coding of color images. In Proceedings of SPIE Visual Communications and Image Processing (A. K. Katsaggelos, ed.), vol. 2308, pp. 1683–1691, 1994.

[Jac89] Jacquin, A. E. Image coding based on a fractal theory of iterated contractive Markov operators, part II: construction of fractal codes for digital images. Tech. rep. 91389-017, Georgia Institute of Technology, 1989.

[Jac90a] Jacquin, A. E. Fractal image coding based on a theory of iterated contractive image transformations. Proceedings of the SPIE — The International Society for Optical Engineering, vol. 1360, pp. 227–239, 1990. ISSN 0277-786X.

[Jac90b] Jacquin, A. E. A novel fractal block-coding technique for digital images. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP-1990, vol. 4, pp. 2225–2228, 1990.

[Jac92] Jacquin, A. E. Image coding based on a fractal theory of iterated contractive image transformations. IEEE Transactions on Image Processing, vol. 1(1), pp. 18–30, 1992.

[Jac93] Jacquin, A. E. Fractal image coding: a review. Proceedings of the IEEE, vol. 81(10), pp. 1451–1465, 1993. ISSN 0018-9219.

[Kan96] Kang, H.-S., Kim, S.-D. Fractal decoding algorithm for fast convergence. Optical Engineering, vol. 35(11), pp. 3191–3198, 1996.

[Kao91] Kaouri, H. Fractal coding of still images. In IEE 6th International Conference on Digital Processing of Signals in Communications, pp. 235–239, 1991.

[Kof06] Koff, D. A., Shulman, H. An overview of digital compression of medical images: can we use lossy image compression in radiology? Canadian Association of Radiologists Journal, vol. 57, pp. 211–217, 2006.

[Kom95] Kominek, J. Algorithm for fast fractal image compression. In Digital video compression: algorithms and technologies (A. A. Rodriguez, R. J. Safranek, E. J. Delp, eds.), vol. 2419, pp. 296–305, San Jose, CA, USA, 1995.

[Kwi01] Kwiatkowski, J., Kwiatkowska, W., Kawa, K., Kania, P. Using fractal coding in medical image magnification. In PPAM (R. Wyrzykowski, J. Dongarra, M. Paprzycki, J. Wasniewski, eds.), Lecture Notes in Computer Science, vol. 2328, pp. 517–525, Springer, 2001. ISBN 3-540-43792-4.

[Lee98] Lee, C. K., Lee, W. K. Fast fractal image block coding based on local variances. IEEE Transactions on Image Processing, vol. 7(6), pp. 888–891, 1998.

[Lep95] Lepsøy, S., Øien, G. E. Fast attractor image encoding by adaptive codebook clustering. In Fisher [Fis95a], pp. 177–197.

[Lin95a] Lin, H., Venetsanopoulos, A. N. Fast fractal image coding using pyramids. In Proceedings of the 8th International Conference on Image Analysis and Processing ICIAP ’95 (C. Braccini, L. D. Floriani, G. Vernazza, eds.), Lecture Notes in Computer Science, vol. 974, pp. 649–654, Springer-Verlag, San Remo, Italy, 1995. ISBN 3-540-60298-4.

[Lin95b] Lin, H., Venetsanopoulos, A. N. A pyramid algorithm for fast fractal image compression. In Proceedings of the 1995 International Conference on Image Processing ICIP ’95, vol. 3, pp. 596–599, IEEE Computer Society, Washington, DC, USA, 1995. ISBN 0-8186-7310-9.

[Lin97] Lin, H. Fractal image compression using pyramids, 1997. Doctor of Philosophy Dissertation, University of Toronto.

[Lu97] Lu, N. Fractal imaging. Academic Press, 1997.

[Man83] Mandelbrot, B. B. The fractal geometry of nature. W. H. Freeman and Co., New York, 1983.

[MK94] Kawamata, M., Nagahisa, M., Higuchi, T. Multi-resolution tree search for iterated transformation theory-based coding. Proceedings of the IEEE International Conference on Image Processing ICIP-94, vol. III, pp. 137–141, 1994.

[Mon92] Monro, D. M., Dudbridge, F. Fractal approximation of image blocks. In Proceedings of ICASSP-1992 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 3, pp. 485–488, 1992.

[Mon93a] Monro, D. M. Class of fractal transforms. Electronics Letters, vol. 29(4), pp. 362–363, 1993.

[Mon93b] Monro, D. M. Generalized fractal transforms: complexity issues. In Proceedings DCC’93 Data Compression Conference (J. A. Storer, M. Cohn, eds.), pp. 254–261, IEEE Comp. Soc. Press, 1993. ISBN 0-8186-3392-1.

[Mon93c] Monro, D. M. A hybrid fractal transform. In Proceedings of ICASSP-1993 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 169–172, 1993. ISSN 0013-5194.

[Mon94a] Monro, D. M., Woolley, S. J. Fractal image compression without searching. In Proceedings of ICASSP-1994 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 557–560, Adelaide, 1994.

[Mon94b] Monro, D. M., Woolley, S. J. Rate/distortion in fractal compression: order of transform and block symmetries. In Proc. ISSIPNN’94 Intern. Symp. on Speech, Image Processing and Neural Networks, vol. 1, pp. 168–171, Hong Kong, 1994.

[Mon95] Monro, D. M., Dudbridge, F. Rendering algorithms for deterministic fractals. IEEE Computer Graphics and Applications, vol. 15(1), pp. 32–41, 1995. ISSN 0272-1716.

[Mul03] Mulopulos, G. P., Hernandez, A. A., Gasztonyi, L. S. Peak signal to noise ratio performance comparison of JPEG and JPEG 2000 for various medical image modalities. In 20th Symposium on Computer Applications in Radiology, Boston, Massachusetts, 2003.

[Nov93] Novak, M. Attractor coding of images. In Proceedings of the International Picture Coding Symposium PCS’93, Lausanne, 1993.

[Och04] Ochotta, T., Saupe, D. Edge-based partition coding for fractal image compression. The Arabian Journal for Science and Engineering, Special Issue on Fractal and Wavelet Methods, vol. 29(2C), 2004.

[Oh03] Oh, T. H., Besar, R. JPEG2000 and JPEG: image quality measures of compressed medical images. In 4th National Conference on Telecommunication Technology (NCTT2003), vol. 001, pp. 31–35, Shah Alam, Malaysia, 2003.

[Øie91] Øien, G. E., Lepsøy, S., Ramstad, T. A. An inner product space approach to image coding by contractive transformations. In Proceedings of ICASSP-1991 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2773–2776, 1991.

[Øie92] Øien, G. E., Lepsøy, S., Ramstad, T. Reducing the complexity of a fractal-based image coder. In Proceedings of the Vth European Signal Processing Conference EUSIPCO’92, pp. 1353–1356, 1992.

[Øie93] Øien, G. E. L2 optimal attractor image coding with fast decoder convergence, 1993. Doctor of Philosophy Dissertation, The Norwegian Institute of Technology.

[Øie94a] Øien, G. E. Parameter quantization in fractal image coding. In Proc. ICIP-94 IEEE International Conference on Image Processing, vol. III, pp. 142–146, Austin, Texas, 1994.

[Øie94b] Øien, G. E., Baharav, Z., Lepsøy, S., Karnin, E., Malah, D. A new improved collage theorem with applications to multiresolution fractal image coding. In Proceedings ICASSP-94 (IEEE International Conference on Acoustics, Speech and Signal Processing), vol. 5, pp. 565–568, Adelaide, Australia, 1994.

[Pin04] Pinhas, A., Greenspan, H. A continuous and probabilistic framework for medical image representation and categorization. Proceedings of SPIE Medical Imaging, vol. 5371, pp. 230–238, 2004.

[Pol01] Polidori, E., Dugelay, J.-L. Zooming using iterated function systems. Fractals, vol. 5 (Supplementary Issue 1), pp. 111–123, 2001.

[Pop93] Popescu, D. C., Yan, H. MR image compression using iterated function systems. Magnetic Resonance Imaging, vol. 11, pp. 727–732, 1993.

[Prz02] Przelaskowski, A. Kompresja danych [Data compression]. http://www.ire.pw.edu.pl/~arturp/Dydaktyka/koda/skrypt.html, 2002. Internet textbook.

[Ram76] Ramapriyan, H. K. A multilevel approach to sequential detection of pictorial features. IEEE Transactions on Computers, vol. 25(1), pp. 66–78, 1976.

[Ram86] Ramamurthi, B., Gersho, A. Classified vector quantization of images. IEEE Transactions on Communications, vol. 34, pp. 1105–1115, 1986.

[Reu94] Reusens, E. Partitioning complexity issue for iterated function systems based image coding. In Proceedings of the VIIth European Signal Processing Conference EUSIPCO’94, vol. I, pp. 171–174, Edinburgh, 1994.

[Ros84] Rosenfeld, A., ed. Multiresolution image processing and analysis. Springer-Verlag, 1984.

[Ruh97] Ruhl, M., Hartenstein, H., Saupe, D. Adaptive partitionings for fractal image compression. In Proceedings ICIP-97 (IEEE International Conference on Image Processing), vol. II, pp. 310–313, Santa Barbara, CA, USA, 1997.

[Sam90] Samet, H. The design and analysis of spatial data structures. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1990. ISBN 0-201-50255-0.

[Sau95a] Saupe, D. Accelerating fractal image compression by multi-dimensional nearest neighbor search. In Proceedings of the IEEE Data Compression Conference DCC’95 (J. A. Storer, M. Cohn, eds.), pp. 222–231, Snowbird, UT, USA, 1995.

[Sau95b] Saupe, D. Fractal image compression via nearest neighbor search. In NATO ASI on Fractal Image Encoding and Analysis, Trondheim, Norway, 1995.

[Sau95c] Saupe, D. From classification to multi-dimensional keys. In Fisher [Fis95a], pp. 302–305.

[Sau96a] Saupe, D. The futility of square isometries in fractal image compression. In Proc. ICIP-96 IEEE International Conference on Image Processing, vol. I, pp. 161–164, Lausanne, 1996.

[Sau96b] Saupe, D. Lean domain pools for fractal image compression. In Proceedings of IS&T/SPIE (R. L. Stevenson, A. I. Drukarev, T. R. Gardos, eds.), vol. 2669: 1996 Symposium on Electronic Imaging: Science & Technology - Still Image Compression II, pp. 150–157, San Jose, California, USA, 1996.

[Sau96c] Saupe, D., Hamzaoui, R., Hartenstein, H. Fractal image compression – an introductory overview. In Fractal Models for Image Synthesis, Compression and Analysis, ACM SIGGRAPH’96 Course Notes (D. Saupe, J. Hart, eds.), vol. 27, New Orleans, Louisiana, USA, 1996.

[Sau96d] Saupe, D., Ruhl, M. Evolutionary fractal image compression. In Proc. ICIP-96 IEEE International Conference on Image Processing, vol. I, pp. 129–132, Lausanne, Switzerland, 1996.

[Sau98] Saupe, D., Ruhl, M., Hamzaoui, R., Grandi, L., Marini, D. Optimal hierarchical partitions for fractal image compression. In Proc. ICIP-98 IEEE International Conference on Image Processing, Chicago, 1998.

[Sig97] Signes, J. Geometrical interpretation of IFS based image coding. Fractals, vol. 5 (Supplementary Issue), pp. 133–143, 1997.

[Sta04] Starosolski, R. Przeglad metod bezstratnej kompresji obrazów medycznych [An overview of lossless compression methods for medical images]. Studia Informatica, vol. 25(2), pp. 49–66, 2004.

[Ste93] Stevenson, R. Reduction of coding artifacts in transform image coding. Proceedings of the International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 401–404, 1993.

[Tan96] Tanimoto, M., Ohyama, H., Kimoto, T. A new fractal image coding scheme employing blocks of variable shapes. In Proceedings ICIP-96 (IEEE International Conference on Image Processing), vol. 1, pp. 137–140, Lausanne, Switzerland, 1996.

[Tat92] Tate, S. R. Lossless compression of region edge maps. Tech. rep. DUKE–TR–1992–09, Duke University, Durham, NC, USA, 1992.

[Tho95] Thomas, L., Deravi, F. Region-based fractal image compression using heuristic search. IEEE Transactions on Image Processing, vol. 4(6), pp. 832–838, 1995. ISSN 1057-7149.

[Vin95] Vines, G. Orthogonal basis IFS. In Fisher [Fis95a].

[Wak97] Wakefield, P., Bethel, D., Monro, D. M. Hybrid image compression with implicit fractal terms. In ICASSP ’97: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 4, p. 2933, IEEE Computer Society, Washington, DC, USA, 1997. ISBN 0-8186-7919-0.

[Wei96] Wein, C. J., Blake, I. F. On the performance of fractal compression with clustering. IEEE Transactions on Image Processing, vol. 5(3), pp. 522–526, 1996.

[Wei98] Weisfield, R. L. Amorphous silicon TFT x-ray image sensors. IEEE IEDM’98 Technical Digest, pp. 21–24, 1998.

[Woh95] Wohlberg, B., de Jager, G. Fast image domain fractal compression by DCT domain block matching. Electronics Letters, vol. 31(11), pp. 869–870, 1995.

[Woh99] Wohlberg, B., de Jager, G. A review of the fractal image coding literature. IEEE Transactions on Image Processing, vol. 8(12), pp. 1716–1729, 1999.

[Woo94] Woolley, S. J., Monro, D. M. Rate-distortion performance of fractal transforms for image compression. Fractals, vol. 2(3), pp. 395–398, 1994. ISSN 0218-348X.

[Woo95] Woolley, S. J., Monro, D. M. Optimum parameters for hybrid fractal image coding. In Proceedings of ICASSP-1995 IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, 1995.

[Wu91] Wu, X., Yao, C. Image coding by adaptive tree-structured segmentation. IEEE Transactions on Information Theory, pp. 73–82, 1991.

[Zak92] Zakhor, A. Iterative procedures for reduction of blocking effects in transform image coding. IEEE Transactions on Circuits and Systems for Video Technology, vol. 2(1), pp. 91–95, 1992.

[Zha98] Zhao, Y., Yuan, B. A new affine transformation: its theory and application to image coding. IEEE Transactions on Circuits and Systems for Video Technology, vol. 8(3), pp. 269–274, 1998.