Work Group CALYPOD: graphiCs imAge anaLYsis from Printed Old Document
http://calypod.free.fr calypod@ml.free.fr
Thierry Brouard, Mathieu Delalandre, Nicholas Journet and Frédéric Nicolier
NaviDoMass Meeting, Tuesday, 6th November 2007
Paris V University, Paris, France



General Presentation (1/2)

• Research Work Group
A group of researchers from different laboratories, teams and projects, working on a common, specific research topic.

• Specific topic of research
Automatic processing of the graphical parts of old printed books (segmentation, pre-processing, matching, OCR, retrieval, …)

• Objectives
1. To develop and maintain a website that collects and centralizes information (web links, bibliographic references, papers …)
2. To connect, through mailings and meetings, everyone working on this topic in both the human and computer sciences, and to strengthen collaborations
3. To develop “real-life” applications (AGORA, DMOS, ..) for the end-user partners in the human sciences (CESR, ..)

[Figure: page of an old printed book with its parts labelled: ornamental letter, headline, figure]


General Presentation (2/2)

Calendar (2006 to 2007)
11th June: 1st Meeting (Paris), starting date of the Calypod Group
GDR-ISIS “Jeune Chercheur” application “SILCIL”
5th July: opening of http://calypod.free.fr
13th July: 2nd Calypod Meeting (La Rochelle)
13th November: 3rd Calypod Meeting (Tours)
Break period …
6th November: Calypod talk at NaviDoMass Meeting (Paris)

[Pie chart: legend BVH, ANITTA, IAnaDoc, Not linked, NaviDoMass, Madonne; slices 6%, 16%, 17%, 17%, 27%, 17%]

Calypod People (17)

Busson Sébastien, Baudrier Etienne, Nicolier Frédéric, Landré Jérôme, Delalandre Mathieu, Karatzas Dimosthenis, Lladós Josep, Nicolas Stéphane, Ramos Oriol, Petitjean Caroline, Journet Nicholas, Salmon Jean-Pierre, Coustaty Mickael, Brouard Thierry, Ogier Jean-Marc, Ramel Jean-Yves, Sidere Nicolas

Status         Share (%)   People
Engineer        5.88        1
PhD Student    17.65        3
Post-Doc       29.42        5
Lecturer       29.40        5
Professor      17.65        3
Total                      17


Research Project (1/2)

Research problem: Multi-Criterion Retrieval of Ornamental Letters

Retrieval criteria
Color (black, white)
Size (small, large)
Background (almost empty, rich graphics)
Letter (e.g. c)
Topic (e.g. vegetal)
Pattern (e.g. cross)
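As an illustration of what a multi-criterion query could look like, here is a minimal sketch. The record type `OrnamentalLetter`, its field names and the function `retrieve` are assumptions made for this example; the slides do not describe the actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrnamentalLetter:
    """Hypothetical metadata record for one ornamental letter image."""
    letter: str       # e.g. "c"
    color: str        # "black" or "white"
    size: str         # "small" or "large"
    background: str   # "almost empty" or "rich graphics"
    topic: str        # e.g. "vegetal"
    pattern: str      # e.g. "cross"


def retrieve(collection, **criteria):
    """Return the records that satisfy every criterion given as field=value."""
    return [
        item for item in collection
        if all(getattr(item, field) == value for field, value in criteria.items())
    ]


collection = [
    OrnamentalLetter("c", "black", "small", "almost empty", "vegetal", "cross"),
    OrnamentalLetter("c", "white", "large", "rich graphics", "vegetal", "cross"),
    OrnamentalLetter("a", "black", "large", "rich graphics", "animal", "none"),
]

# Combine several criteria in a single query.
hits = retrieve(collection, letter="c", color="white", topic="vegetal")
```

Criteria are combined with a logical AND here; any other combination rule would need a different matching function.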


Research Project (2/2)

[Diagram: project overview with the blocks Image Pre-Processing, Printing Retrieval, Style Retrieval, OLR (Ornamental Letter Recognition) and Performance Evaluation; label shown: L (90%)]


Image Pre-Processing

Problems addressed: Offset, Skewing, Degradation

Overview
• Translation: SPOMF (Symmetric Phase-Only Matched Filter), a correlation-based method
• Rotation: SPOMF on the polar form of the images
• Scale: SPOMF + Mellin transform

Approach
[Thévenaz98] A Pyramid Approach to Subpixel Registration Based on Intensity, IEEE Trans. Image Processing
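The translation case can be sketched as follows. This is a minimal illustration of phase-only correlation, assuming a purely circular shift and a tiny image; the function names `dft2` and `spomf_shift` are invented for the example, and the naive DFT stands in for the FFT that a real implementation would use. It is not the group's code.

```python
import cmath
import random


def dft2(img, inverse=False):
    """Naive 2-D DFT of a list-of-lists image (only suitable for tiny examples)."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = 2 * cmath.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(sign * 1j * angle)
            out[u][v] = acc / (h * w) if inverse else acc
    return out


def spomf_shift(reference, moved, eps=1e-12):
    """Estimate the circular shift (dy, dx) that maps `reference` onto `moved`."""
    h, w = len(reference), len(reference[0])
    f_ref, f_mov = dft2(reference), dft2(moved)
    cross = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            c = f_ref[u][v].conjugate() * f_mov[u][v]
            cross[u][v] = c / (abs(c) + eps)  # keep the phase, drop the magnitude
    corr = dft2(cross, inverse=True)
    # The correlation surface peaks at the shift, modulo the image size.
    _, dy, dx = max((corr[y][x].real, y, x) for y in range(h) for x in range(w))
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx


rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
# Shift the image by 2 rows down and 3 columns left (with wrap-around).
moved = [[image[(y - 2) % 8][(x + 3) % 8] for x in range(8)] for y in range(8)]
estimated = spomf_shift(image, moved)
```

Rotation and scale reduce to the same peak search once the images are resampled on a polar or log-polar grid, which is not shown here.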


Printing Retrieval (1/2)

(1) Historians are interested in tracking wood plugs as a tool for dating old books.

[Figure: plugs exchanged or copied between printing houses, e.g. Vascosan 1555 and Marnef 1576; legend: printing house, plug, exchange, copy; date ranges shown: 1531-1548, 1511-1542, 1555-1578, 1497-1507]

(2) Most of the images are copyrighted, so a system must retrieve them in real time in order to allow cross queries between the databases.

[Figure: queries sent to several databases (DB), each returning ranked results r1, r2, r3]
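A cross query over several databases could be organised as in the sketch below. It assumes that each database exposes a search function returning `(distance, image_id)` pairs already sorted by increasing distance; the function `cross_query` and the database names are hypothetical.

```python
import heapq


def cross_query(query, databases, top_k=3):
    """Send one query to every database and merge the ranked answers.

    `databases` maps a database name to a search function that returns
    (distance, image_id) pairs sorted by increasing distance.
    """
    ranked_lists = []
    for name, search in databases.items():
        ranked_lists.append(
            [(dist, name, image_id) for dist, image_id in search(query)]
        )
    # heapq.merge keeps the global order because every input list is sorted.
    return list(heapq.merge(*ranked_lists))[:top_k]


# Two stand-in databases returning fixed ranked results.
databases = {
    "db_a": lambda q: [(0.1, "a1"), (0.5, "a2")],
    "db_b": lambda q: [(0.2, "b1"), (0.3, "b2")],
}
merged = cross_query("query image", databases, top_k=3)
```

Only identifiers and distances travel between the databases in this sketch; the images themselves stay where they are stored.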


Printing Retrieval (2/2)

Our key ideas

(1) To use a Run-Length Encoding (RLE) of the image

(2) To use operators at different levels (from fastest to most accurate)
Level 1: image sizes
Level 2: image density
Level 3: RLE comparison

[Figure: a query passing through the 1st level, then the 2nd level; axes: Speed, Depth]
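The three levels can be read as a cascade in which cheap tests reject most candidates before the costly comparison runs. The sketch below assumes binary images stored as lists of rows of 0/1 values; the tolerances, the function names and the simple row-by-row test used at level 3 are assumptions for illustration only.

```python
def rle_row(row):
    """Encode one binary row as [value, run_length] pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return runs


def density(img):
    """Fraction of pixels set to 1."""
    return sum(map(sum, img)) / (len(img) * len(img[0]))


def cascade_compare(query, candidate, density_tol=0.05, rle_tol=0.05):
    """Return 0 on a match, otherwise the level (1, 2 or 3) that rejected."""
    # Level 1: image sizes (cheapest test).
    if (len(query), len(query[0])) != (len(candidate), len(candidate[0])):
        return 1
    # Level 2: image density.
    if abs(density(query) - density(candidate)) > density_tol:
        return 2
    # Level 3: RLE comparison, here the fraction of rows whose runs differ.
    differing = sum(rle_row(a) != rle_row(b) for a, b in zip(query, candidate))
    if differing / len(query) > rle_tol:
        return 3
    return 0


a = [[0, 1, 1, 0], [1, 1, 0, 0]]
other_size = [[0, 1, 1, 0]]
other_density = [[1, 1, 1, 1], [1, 1, 1, 1]]
other_runs = [[1, 1, 0, 0], [0, 1, 1, 0]]
levels = [cascade_compare(a, c) for c in (a, other_size, other_density, other_runs)]
```

Ordering the tests from fastest to most accurate means the expensive level is only reached by candidates that already agree in size and density.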

[Chart: compression rate per dropcap; y-axis: compression rate, from 0.7 to 1; x-axis: dropcap; values marked: 0.75, 0.88, 0.95]

[Figure: RLE comparison of line (y) of image 1 with line (y+dy) of image 2; x1 and x2 mark the run boundaries in image 1 and image 2, and the resulting segments are numbered 1 to 7]

while x1 < x2: handle image 1
while x2 < x1: handle image 2
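The line-by-line RLE comparison can be sketched as a two-pointer walk over the runs of both lines, advancing whichever line has the nearer run boundary. The encoding as `(value, length)` pairs and the function names are assumptions; the slide only names the boundaries x1 and x2.

```python
def rle_encode(row):
    """Encode a binary row as (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1] = (pixel, runs[-1][1] + 1)
        else:
            runs.append((pixel, 1))
    return runs


def rle_difference(runs1, runs2):
    """Count differing pixels between two RLE lines of equal total length."""
    i = j = 0                              # current run in image 1 / image 2
    pos = diff = 0                         # left edge of current segment, result
    x1 = runs1[0][1] if runs1 else 0       # right edge of current run, image 1
    x2 = runs2[0][1] if runs2 else 0       # right edge of current run, image 2
    while i < len(runs1) and j < len(runs2):
        end = min(x1, x2)                  # segment where both values are constant
        if runs1[i][0] != runs2[j][0]:
            diff += end - pos
        pos = end
        if x1 == end:                      # run of image 1 exhausted: handle image 1
            i += 1
            if i < len(runs1):
                x1 += runs1[i][1]
        if x2 == end:                      # run of image 2 exhausted: handle image 2
            j += 1
            if j < len(runs2):
                x2 += runs2[j][1]
    return diff


line_y = rle_encode([0, 0, 1, 1, 1, 0, 0, 0])      # line (y) of image 1
line_y_dy = rle_encode([0, 1, 1, 1, 0, 0, 0, 1])   # line (y+dy) of image 2
n_diff = rle_difference(line_y, line_y_dy)
```

The cost depends on the number of runs rather than the number of pixels, which is why the comparison benefits from a good compression rate.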


Style Retrieval (1/3)

• 2 steps
– 1) Cluster the ornamental letters according to their styles
– 2) Apply letter recognition algorithms according to the cluster (black or white letter, background specificity, …)

Preprocessing
- Binarization
- Resizing

Features Extraction
- FFT, DCT, [Radon] coefficients
- Zernike moments
- Threshold Adj. Stats.
- [Haralick, QMF]

Model Training
- SVM
• N-fold cross-validation
• Evaluation of the best model on a test database
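The N-fold cross-validation step can be sketched as below. To keep the example dependency-free, a nearest-centroid classifier is swapped in for the SVM named on the slide; the feature vectors are made-up numbers rather than FFT coefficients, and all function names are assumptions.

```python
def n_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; fold k tests every n_folds-th sample."""
    for k in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == k]
        train = [i for i in range(n_samples) if i % n_folds != k]
        yield train, test


def fit_centroids(features, labels):
    """Stand-in for SVM training: compute one mean vector per class."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for d, value in enumerate(vec):
            acc[d] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(lab):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[lab]))
    return min(centroids, key=sq_dist)


def cross_validate(features, labels, n_folds=5):
    """Mean accuracy over the folds."""
    scores = []
    for train, test in n_fold_indices(len(features), n_folds):
        model = fit_centroids([features[i] for i in train],
                              [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)


# Toy, well-separated feature vectors for two style clusters C1 and C2.
features = [[0.1 * i, 0.0] for i in range(5)] + [[10 + 0.1 * i, 1.0] for i in range(5)]
labels = ["C1"] * 5 + ["C2"] * 5
mean_accuracy = cross_validate(features, labels, n_folds=5)
```

Once a configuration has been selected this way, it is retrained on all training data and scored once on a separate test database, as the last bullet describes.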


Style Retrieval (2/3)

Graphical style retrieval (homogeneous vs. textured), classes C1 and C2

Test samples (FFT, 100 coefs.)
Per-class results: 87.5% (47/54) and 91.0% (375/412)
Overall: 89.25% (420/466)


Style Retrieval (3/3)

Letter color retrieval (black vs. white), classes C1 and C2

Test samples (FFT, 100 coefs.)
Per-class results: 90.6% (145/160) and 95.6% (153/160)
Overall: 93.1% (298/320)


Ornamental Letter Recognition (1/2)

[Figure: processing chain with the stages "Letter segmentation" and "Character recognition", illustrated on the letter A]


Ornamental Letter Recognition (2/2)



Performance Evaluation (1/1)

[Diagram labels: Base; Our retrieval engine (control, display, retrieve); Metadata; driven metadata acquisition; Metadata file (twice); To produce: Bench1, Bench2, Bench2]

Without retrieval
With retrieval: faster, fewer errors


Conclusion

• Website
35 references
20 weblinks
4 test databases
1 wiki
Visits: August 144, September 196, October 334

• Human Network
17 people from computer and human sciences, still in progress (BCU Lausanne, ….)
http://calypod.free.fr calypod@ml.free.fr
Meetings, 3 invited talks

• Research Works
A common research project is under way; joint publications are expected in the first half of 2008