12
A Low-Cost FPGA-based Embedded Fingerprint Verification and Matching System Maitane Barrenechea 1 , Jon Altuna 2 , and Miguel San Miguel 2 1 Signal Theory and Communications Group, Department of Electronics University of Mondrag´ on, Arrasate-Mondrag´ on, Spain [email protected] 2 {jaltuna, msanmiguel}@eps.mondragon.edu Abstract The development of a fingerprint verification system on a low-cost embedded platform is an open issue in nowadays biometrics. Our paper describes a low-cost fingerprint minutiae extraction and matching system based on a Spartan3 family FPGA with an embedded Leon2 open core processor. The proposed system architecture incorporates a Floating Point Unit and a Discrete Fourier Transform coprocessor to accelerate the minutiae extraction process. The whole verification algorithm is based on the NFIS version 2 open source software developed by the National Institute of Standards and Technology (NIST). The results on execution time reduction and FPGA occupation for different system configurations show that the proposed architecture improves substantially the performance of the baseline system architecture. 1 Introduction In nowadays society identity verification is becoming a crucial issue in several business sectors such as access or border control. Due to this fact, a new field known as biometrics has emerged, which uses some unique physiological or behavioural characteristics, not shared by any other individual, to positively identify a person. Examples of physical characteristics include fingerprints, facial patterns and hand measurements, eye retinas and irises, while examples of mostly behavioural characteristics include signature, gait and typing patterns. Fingerprint based verification is one of the most used biometric systems nowadays due to its easiness of acquisition and high distinctiveness, persistence and acceptance by the public [1]. This paper describes the design flow of a low-cost embedded system for fingerprint verification. The proposed system consists of a 32-bit Sparc Leon2 processor, a fingerprint image sensor, signal processing hardware acceleration and a floating point unit. It is worth noting that the minutiae extraction module is open to work with any fingerprint sensor, in which case a change in the fingerprint image capture driver would be enough.

[IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

Embed Size (px)

Citation preview

Page 1: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

A Low-Cost FPGA-based Embedded

Fingerprint Verification and Matching System

Maitane Barrenechea1, Jon Altuna2, and Miguel San Miguel2

1Signal Theory and Communications Group, Department of Electronics

University of Mondragon, Arrasate-Mondragon, Spain

[email protected]

2{jaltuna, msanmiguel}@eps.mondragon.edu

Abstract — The development of a fingerprint verification system on a low-cost

embedded platform is an open issue in nowadays biometrics. Our paper describes

a low-cost fingerprint minutiae extraction and matching system based on a Spartan3

family FPGA with an embedded Leon2 open core processor. The proposed system

architecture incorporates a Floating Point Unit and a Discrete Fourier Transform

coprocessor to accelerate the minutiae extraction process. The whole verification

algorithm is based on the NFIS version 2 open source software developed by the

National Institute of Standards and Technology (NIST). The results on execution time

reduction and FPGA occupation for different system configurations show that the

proposed architecture improves substantially the performance of the baseline system

architecture.

1 Introduction

In nowadays society identity verification is becoming a crucial issue in several business

sectors such as access or border control. Due to this fact, a new field known as biometrics

has emerged, which uses some unique physiological or behavioural characteristics, not

shared by any other individual, to positively identify a person. Examples of physical

characteristics include fingerprints, facial patterns and hand measurements, eye retinas

and irises, while examples of mostly behavioural characteristics include signature, gait

and typing patterns.

Fingerprint based verification is one of the most used biometric systems nowadays due

to its easiness of acquisition and high distinctiveness, persistence and acceptance by the

public [1].

This paper describes the design flow of a low-cost embedded system for fingerprint

verification. The proposed system consists of a 32-bit Sparc Leon2 processor, a fingerprint

image sensor, signal processing hardware acceleration and a floating point unit. It is worth

noting that the minutiae extraction module is open to work with any fingerprint sensor, in

which case a change in the fingerprint image capture driver would be enough.

Page 2: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL

Similar FPGA-based fingerprint verification systems have been developed and proposed

in the literature [2][3][4]. From the system architecture point of view, the Thumbpod

project [2] is probably the most important reference due to its similarities to the platform

presented in this paper. Both systems have been built upon the Leon2 soft processor,

although for our system implementation we have selected a low-cost Spartan3 family

FPGA as the core of the design, while a more expensive Virtex II device was chosen in

the Thumbpod project. Moreover, a hardware coprocessor for floating point management

(FPU) enabled our system to accurately perform high speed floating point operations in

contrast with the fixed-point refinement required in the Thumbpod project.

Regarding software development issues, both projects are based on the NFIS (NIST

Fingerprint Image Software) open source software. Nevertheless, the verification system

proposed in this paper has its roots in the enhanced version 2 (NFIS2) of this algorithm

and uses the specified input fingerprint image format (500 dpi and 256 greyscale images)

for its optimum performance. On the other hand, the Thumbpod project algorithm uses

low quality images (3 bits per pixel) as an input pattern to execute the NFIS version 1

minutiae extraction flow. The proposed system is also open to most fingerprint sensors

in contrast to other platforms which have been customized for a specific fingerprint sen-

sor. As for the matching algorithm is concerned, the BOZORTH3 algorithm has been

implemented.

The paper is organized as follows: In Section 2, the software architecture for the minu-

tiae extraction and matching algorithm targeted to a Leon2 based platform is described.

In section 3, the proposed HW architecture of the design is explained, emphasizing in

the floating point operation acceleration achieved by means of a FPU and a DFT co-

processing engine. In Section 4, some results on speed enhancement for the best of the

three proposed system configurations are shown and finally, in Section 5 some concluding

remarks will be drawn.

2 Software Architecture

The fingerprint authentication algorithms developed for the target system are based on

some routines of the NFIS2 collection from the National Institute of Standards and Tech-

nology (NIST). Specifically, custom versions of the MINDTCT and BOZORTH3 pack-

ages have been developed for the minutiae acquisition and matching respectively.

It is worth noting that the algorithm and the implemented software parameters have

been designed and set for an optimum performance with 256 greyscale and 500 ppi-

scanned images. These input image characteristics match perfectly with those of the

image provided by the most relevant fingerprint sensors, such as the Fujitsu MBF200,

which was the chosen sensor for the research that has been carried out.

2.1 Software Implementation on a Leon2 Platform

The original software was designed and tested to be run on a Linux operating System and

compiled with gcc. A bare-C cross-compiler and a GRMON debug monitor from Gaisler

Research have been used in this project. Dynamically allocated data arrays have been

used for intermediate result storing depending on the application’s requirements.

On the other hand, although the original software only accepts input image files in

Page 3: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM

ANSI/NIST, WSQ, JPEGB, JPEGL and IHEAD formats, the selected fingerprint sensors

provide the fingerprint image in RAW format, for which the algorithm has been modified

and adapted so that only RAW format image files are accepted.

2.2 Minutiae Extraction Algorithm

The MINDTCT software has been designed in a modular fashion, and as a result of that,

each step in the algorithm is mainly executed in a subroutine. Only those strictly nec-

essary modules for XYT formatted (position, direction and quality) output minutiae list

generation have been taken from the original algorithm. The functional steps executed in

the minutiae extraction algorithm are shown in “Figure 1”.

Figure 1: Functional steps of the minutiae extraction algorithm

In the image map generation phase degraded fingerprint areas prone to give as a result

erroneous minutiae are identified. To this end, unreliable image zones are detected based

on the following criteria:

Low contrast: Marks low contrast areas in the image which mainly correspond with the

background of the image or smudges in the fingerprint (Low Contrast Map, LCM)

(“Figure 2”).

Low ridge flow: Identifies those image areas where the dominant ridge flow could not

be determined initially (Low Flow Map, LFM) (“Figure 2”).

High curvature: Flags high curvature areas in the image, such as the fingerprint core or

possible delta regions (High Curve Map, HCM) (“Figure 2”).

Page 4: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL

Low Contrast Map Low Flow Map High Curve Map

Figure 2: Generated Image Maps

The computed Low Contrast Map, Low Flow Map and High Curvature Map are shown

in “Figure 2”.

As a combination of these three features a quality map is derived. This map assigns one

of the possible five quality levels to each of the blocks in the image (“Figure 3”). The

quality ranking is sorted as follows:

0: Poor quality

1: Fair quality

2: Good quality

3: Very good quality

4: Excellent quality

In this phase of the algorithm one of the fundamental maps for the minutiae extraction

process is also derived: The directional ridge flow map. For the acquisition of this map,

the original image is divided into 8x8 size pixel blocks. For each of the image blocks a

24x24 pixel sized window is defined, conformed by the block itself and other surrounding

pixels. The window is rotated incrementally in the 16 orientations defined in the algorithm

(each of them 11.25◦ apart) and a DFT is executed at each position. In every orientation,

the pixels along each rotated row of the window are summed up to form 16 vectors of

row sums. Each one of these vectors is then convolved with four waveforms of different

frequency. The spatial frequency of each waveform discretely represents the width of

different ridges and valleys, in such a way that 12, 6, 3 and 1.5 pixel widths are covered

in the algorithm. To determine the dominant ridge flow within a block, the resonance

coefficient obtained from the convolution is evaluated. The result for this module of the

algorithm is shown in “Figure 4”.

Once the image maps have been acquired, it is necessary to binarize the fingerprint

image so that the minutiae can be extracted. To carry out this process, the previously

computed directional ridge flow map is used to determine the binary value assigned to

each pixel. After the binarization (“Figure 4”), the minutiae detection module analyzes

the binarized image looking for candidate minutiae (ridge ending or bifurcation). How-

ever, not all the ridge patterns selected after this procedure correspond to true minutiae,

Page 5: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM

Figure 3: Computed quality map

therefore, a false minutiae removing process should be carried out. Even after the false

minutiae removing process, false minutiae may potentially remain in the candidate list.

To counteract this fact, a reliability measure is assigned to each minutia based on the qual-

ity map and other pixel intensity statistics. The resulting minutiae for the template image

are shown in “Figure 4”.

2.3 Matching Algorithm

The BOZORTH3 matching algorithm, included in the second distribution of NFIS, com-

putes a match score that reflects the similarity degree between a fingerprint’s minutiae

and a template minutiae set, both of them in XYT format. One of the most remarkable

features of this algorithm is its invariance to both rotation and translation.

The first step in the algorithm flow shown in “Figure 5” is to construct a comparison

table for each one of the input minutiae sets. Relative measures between a minutia and

the rest of the minutiae in the same fingerprint are computed and stored in a comparison

table. This is what provides the algorithm’s translation and rotation invariance.

The next step is to look for compatible entries between the two tables. The results

of this analysis are stored in a new compatibility table which consists of a list of com-

patibility associations between two possible corresponding minutiae. Each one of these

associations represents single links in the compatibility graph.

In the final phase of the matching software flow, the compatibility graph is traversed and

clusters are created by linking table entries. Once the traversals are complete, compatible

clusters are combined and a match score is computed by accumulation of the linked table

entries across the combined clusters. Generally, a match score greater than 40 indicates

that both fingerprint minutiae and template minutiae belong to the same finger, and so, to

the same individual.

Page 6: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL

Direction Map Binarized Image Final Minutiae

Figure 4: Resulting images of the minutiae extraction process

3 HW Architecture

3.1 Initial System Architecture

The initial HW architecture is composed of a 50 MHz fixed-point Leon2 soft-processor

with 8 KB of cache memory for data and instructions, all of this embedded in a GR-

XC3S1500 board.This processor has been chosen for this application not only because of

its high performance and usability, but also due to the fact that it can be obtained under

LGPL license. According to a report on synthesizable CPU cores [5], where Leon2, Mi-

croBlaze and OpenRISC 1200 where tested under three different hardware configurations

and three different benchmarks, Leon2 yielded the best performance per clock cycle for

all the benchmarks and configurations. Moreover, in the opinion of the authors of the

mentioned report, Leon2 is the processor with the highest usability among the tested

CPU cores. A reason for this may be the VHDL code availability and the TCL/Tk based

configuration tool, which facilitates the design of a custom Leon2 based system. The fin-

gerprint image acquisition is performed by means of an IP (Intellectual Property) module

connected to the MBF200 fingerprint sensor. This module is attached to the APB bus

(AMBA Peripheral Bus) and provides the processor with the input fingerprint image as

shown in “Figure 6”. The operation of the sensor is controlled by means of three On-chip

registers (control, data and status) generated in the address range allocated for the APB

bridge.

3.1.1 Running the application on the initial system

The original minutiae extraction and matching algorithm was implemented using floating-

point notation while the Leon2 soft-processor is fixed-point. In order to run the program

on the target platform the ’-msoft-float’ option must be set in the compiler options. This

option forces all floating-point operations to be done in software with integer arithmetic.

The execution of the algorithm is successful as for the matching results is concerned but

not in terms of execution time. The required computation time for the minutiae extraction

is 157 seconds while the matching process for a one-to-one comparison is carried out in

51 seconds. Even when the results for the extracted minutiae and match score are correct,

Page 7: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM

Figure 5: Functional steps of the matching algorithm

the program execution delay is unacceptable for a biometric verification system.

The excessive execution time is mainly due to the MINDTCT algorithm, and thus, the

analysis of the reduction of the time required for minutiae extraction becomes one of the

main objectives of this paper.

In the MINDTCT module a great amount of floating-point data is used. The emulation

of this data format introduces a serious delay in the execution of the program. To ac-

celerate this process, a floating point unit (FPU) is required. In the following section the

effects of this core concerning execution time reduction and floating-point data processing

are analyzed.

3.2 FPU Tests

The Leon2 processor provides an interface to different FPUs. The floating-point units

from Gaisler Research (GRFPU) and Sun Microsystems (Meiko), as well as the incom-

plete LTH FPU [6] are compatible with this interface. For the acceleration of the floating-

point processes the GRFPU has been used eventually for its high-performance and com-

pliance with the IEEE-754 standard.

The insertion of a FPU in the embedded system leads to a considerable increase in the

amount of logic inside the FPGA. This is the reason why a reduction in the processor

clock and/or a cutback in the cache memory amount will be required. The effect of

the attachment of a floating point unit has been analyzed for the following three system

configurations:

– 31 MHz and 8KB cache memory.

– 37 MHz and 8KB cache memory.

– 40 MHz and 4KB cache memory.

Page 8: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL

Figure 6: Proposed system architecture

The FPGA utilization is almost complete and very similar for the three analized hard-

ware configurations. The device utilization summary for the three cases is shown in “Ta-

ble 1”.

31 MHz and 37 MHz and 40 MHz and

8 KB cache 8 KB cache 4 KB cache

Number of external IOBs 36% 36% 36%

Number of LOCed External IOBs 97% 97% 97%

Number of MUL18X18s 53% 53% 53%

Number of RAMB16s 56% 68% 56%

Number of slices 99% 99% 99%

Number of SLICEMs 1% 1% 1%

Number of BUFGMUXs 37% 37% 37%

Number of DCMs 50% 50% 50%

Table 1: Device utilization for different system configurations

The floating-point behaviour has been assessed by means of the Stanford and Paranoia

benchmarks. The first one measures the execution time in milliseconds for each one of

the ten small programs included in the algorithm. Only two of these modules make use of

numbers in floating-point format: FFT and Mm. On the other hand, Paranoia is a program

to test the compliance with the IEEE-754 floating-point standard [5].

It is important to point out that the compilation of these programs must be carried out

without the ’-msoft-float’ compiler option set for those system configurations where the

floating-point core is inserted.

Page 9: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM

3.2.1 Stanford Benchmark

The outcome of the execution of this program for different system configurations is shown

in “Table 2”. The study has been carried out for the following embedded designs:

A: 50 MHz clock frequency with 8KB cache memory (No FPU).

B: 31 MHz clock frequency with 8KB cache memory and FPU.

C: 37 MHz clock frequency with 8KB cache memory and FPU.

D: 40 MHz clock frequency with 4KB cache memory and FPU.

It is worth mentioning the considerable execution time reduction achieved in those pro-

gram modules where floating-point operations where carried out. The decrease in the

execution time for the matrix multiplication program (Mm) is of 95% in the best case

(37 MHz and 8KB cache memory) and 91.6% in the worst case (31 MHz and 8KB cache

memory). The results for the FFT analysis are similar: A reduction of 95.3% of the com-

putation time has been achieved in the best case (40 MHz and 4KB cache memory) and

of 92.22% in the worst case (31 MHz and 8KB cache memory). The execution time for

integer operations does not show remarkable variations.

A B C D

Perm 34 50 33 34

Towers 50 83 67 50

Queens 33 50 33 33

Intmm 166 133 100 116

Mm 1000 84 50 67

Puzzle 317 450 350 350

Quick 50 50 33 33

Bubble 50 50 50 50

Tree 233 334 266 250

FFT 1067 83 67 50

Table 2: Stanford benchmark results for different system configurations

3.2.2 Paranoia Benchmark

The outcome of the Paranoia benchmark has been identical in the three proposed con-

figurations. The arithmetic diagnosed is satisfactory though a little underflow flaw has

been found:

Paranoia version 1.1 [cygnus]

. . .

FLAW: X = 3.05947655544740190e-308

is not equal to Z = 2.22507385850720138e-308 .

yet X - Z yields 0.00000000000000000e+00 .

This is correct as long as the underflow is notified.

Page 10: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL

3.3 Introducing the GRFPU in the design

Tests on IEEE-754 compliancy drew positive results for the Gaisler Research floating-

point core, and therefore this module was included in the hardware design. Results

for computation time have improved substantially after the FPU insertion, mainly in the

execution time required for the MINDTCT process completion. A 94.14% time reduction

has been achieved for the case of 40MHz and 4KB cache memory configuration. The

execution time for the matching algorithm had a slight improvement.

The timing results for the different system configurations are shown in “Figure 7”. Note

that the analized system configurations are the same as those in section 3.2.1.

Figure 7: Execution Time Results for different Leon2-based Configurations

3.4 HW Speed Enhancement

Even if the computation time for the minutiae extraction algorithm has been reduced in

a 94.14%, the program completion delay is yet excessive for its implementation in a real

commercial system.

Several timing analyses show that the 75% of the computation time is occupied by

the low contrast map, direction map and low flow map generation process. The 92% of

the time dedicated to image map computation is needed to generate the directional ridge

Page 11: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM

flow map, as shown in “Figure 8”. This process is accelerated by means of a hardware

accelerator that computes the required DFT calculations for the accomplishment of the

algorithm.

Figure 8: Profiling of the execution time for: a.- MINDTCT algorithm b.- LCM, LFM

and DM modules

4 Conclusions

This paper describes the implementation of a fingerprint minutiae extraction and matching

algorithm running on a Spartan3 based system with an embedded Leon2 soft-processor.

The original application developed by NIST has been modified and ported to the target

platform. Several tests have been carried out to analyze the performance of the software

algorithms with different Leon2 and GRFPU configurations.

After the insertion of a floating-point unit, the results on execution time of the algorithm

have been reduced in a 94.14% for a 40MHz and 4KB cache memory configuration.

Acknowledgments

This work was supported by the BIOSEG PROFIT Project funded by the Spanish Ministry of

Science and Technology. The authors would also like to thank Jiri Gaisler and Richard Pender for

their support in setting up the system architecture.

References

[1] A.K. Jain. Biometric recognition: How do I know who you are?. In Signal Processing and Commu-

nications Applications Conference on, 2004. Proceedings of the IEEE 12th, April 2004.

[2] S. Yang, K. Sakiyama and I. Verbauwhede. A Compact and Efficient Fingerprint Verification System

for Secure Embedded Devices. In ”Signals, Systems and Computers, 2003. Conference Record of the

Thirty-Seventh Asilomar Conference”, pages 2058-2062, Pacific Grove, California, November 2003.

[3] A. Lindoso, L. Entrena and J. Izquierdo. FPGA-Based acceleration of fingerprint minutiae match-

ing. In ”III Southern Conference on Programmable Logic (SPL2007)”, Mar del Plata, Argentina,

February 2007.

Page 12: [IEEE 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems - Leganes, Spain (2007.06.21-2007.06.22)] 2007 Fifth Workshop on Intelligent Solutions in Embedded Systems -

M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL

[4] M. Lopez Garcıa and E. F. Canto Navarro. FPGA Implementation of a Ridge Extraction Fingerprint

Algorithm Based on a MicroBlaze and Hardware Coprocessor. In ”16th International Conference on

Field Programmable Logic and Applications (FPL 2006)”, Madrid, Spain, August 2006.

[5] Daniel Mattson and Marcus Christensson. Evaluation of synthesizable CPU cores, Master’s Thesis,

Chalmers University of technology, Gothenburg, Sweden, 2004.

[6] Martin Kasprzyk. Floating Point Unit, Digital IC Project 2001, January 2002.