Upload
miguel-san
View
216
Download
3
Embed Size (px)
Citation preview
A Low-Cost FPGA-based Embedded
Fingerprint Verification and Matching System
Maitane Barrenechea1, Jon Altuna2, and Miguel San Miguel2
1Signal Theory and Communications Group, Department of Electronics
University of Mondragon, Arrasate-Mondragon, Spain
2{jaltuna, msanmiguel}@eps.mondragon.edu
Abstract — The development of a fingerprint verification system on a low-cost
embedded platform is an open issue in nowadays biometrics. Our paper describes
a low-cost fingerprint minutiae extraction and matching system based on a Spartan3
family FPGA with an embedded Leon2 open core processor. The proposed system
architecture incorporates a Floating Point Unit and a Discrete Fourier Transform
coprocessor to accelerate the minutiae extraction process. The whole verification
algorithm is based on the NFIS version 2 open source software developed by the
National Institute of Standards and Technology (NIST). The results on execution time
reduction and FPGA occupation for different system configurations show that the
proposed architecture improves substantially the performance of the baseline system
architecture.
1 Introduction
In nowadays society identity verification is becoming a crucial issue in several business
sectors such as access or border control. Due to this fact, a new field known as biometrics
has emerged, which uses some unique physiological or behavioural characteristics, not
shared by any other individual, to positively identify a person. Examples of physical
characteristics include fingerprints, facial patterns and hand measurements, eye retinas
and irises, while examples of mostly behavioural characteristics include signature, gait
and typing patterns.
Fingerprint based verification is one of the most used biometric systems nowadays due
to its easiness of acquisition and high distinctiveness, persistence and acceptance by the
public [1].
This paper describes the design flow of a low-cost embedded system for fingerprint
verification. The proposed system consists of a 32-bit Sparc Leon2 processor, a fingerprint
image sensor, signal processing hardware acceleration and a floating point unit. It is worth
noting that the minutiae extraction module is open to work with any fingerprint sensor, in
which case a change in the fingerprint image capture driver would be enough.
M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL
Similar FPGA-based fingerprint verification systems have been developed and proposed
in the literature [2][3][4]. From the system architecture point of view, the Thumbpod
project [2] is probably the most important reference due to its similarities to the platform
presented in this paper. Both systems have been built upon the Leon2 soft processor,
although for our system implementation we have selected a low-cost Spartan3 family
FPGA as the core of the design, while a more expensive Virtex II device was chosen in
the Thumbpod project. Moreover, a hardware coprocessor for floating point management
(FPU) enabled our system to accurately perform high speed floating point operations in
contrast with the fixed-point refinement required in the Thumbpod project.
Regarding software development issues, both projects are based on the NFIS (NIST
Fingerprint Image Software) open source software. Nevertheless, the verification system
proposed in this paper has its roots in the enhanced version 2 (NFIS2) of this algorithm
and uses the specified input fingerprint image format (500 dpi and 256 greyscale images)
for its optimum performance. On the other hand, the Thumbpod project algorithm uses
low quality images (3 bits per pixel) as an input pattern to execute the NFIS version 1
minutiae extraction flow. The proposed system is also open to most fingerprint sensors
in contrast to other platforms which have been customized for a specific fingerprint sen-
sor. As for the matching algorithm is concerned, the BOZORTH3 algorithm has been
implemented.
The paper is organized as follows: In Section 2, the software architecture for the minu-
tiae extraction and matching algorithm targeted to a Leon2 based platform is described.
In section 3, the proposed HW architecture of the design is explained, emphasizing in
the floating point operation acceleration achieved by means of a FPU and a DFT co-
processing engine. In Section 4, some results on speed enhancement for the best of the
three proposed system configurations are shown and finally, in Section 5 some concluding
remarks will be drawn.
2 Software Architecture
The fingerprint authentication algorithms developed for the target system are based on
some routines of the NFIS2 collection from the National Institute of Standards and Tech-
nology (NIST). Specifically, custom versions of the MINDTCT and BOZORTH3 pack-
ages have been developed for the minutiae acquisition and matching respectively.
It is worth noting that the algorithm and the implemented software parameters have
been designed and set for an optimum performance with 256 greyscale and 500 ppi-
scanned images. These input image characteristics match perfectly with those of the
image provided by the most relevant fingerprint sensors, such as the Fujitsu MBF200,
which was the chosen sensor for the research that has been carried out.
2.1 Software Implementation on a Leon2 Platform
The original software was designed and tested to be run on a Linux operating System and
compiled with gcc. A bare-C cross-compiler and a GRMON debug monitor from Gaisler
Research have been used in this project. Dynamically allocated data arrays have been
used for intermediate result storing depending on the application’s requirements.
On the other hand, although the original software only accepts input image files in
A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM
ANSI/NIST, WSQ, JPEGB, JPEGL and IHEAD formats, the selected fingerprint sensors
provide the fingerprint image in RAW format, for which the algorithm has been modified
and adapted so that only RAW format image files are accepted.
2.2 Minutiae Extraction Algorithm
The MINDTCT software has been designed in a modular fashion, and as a result of that,
each step in the algorithm is mainly executed in a subroutine. Only those strictly nec-
essary modules for XYT formatted (position, direction and quality) output minutiae list
generation have been taken from the original algorithm. The functional steps executed in
the minutiae extraction algorithm are shown in “Figure 1”.
Figure 1: Functional steps of the minutiae extraction algorithm
In the image map generation phase degraded fingerprint areas prone to give as a result
erroneous minutiae are identified. To this end, unreliable image zones are detected based
on the following criteria:
Low contrast: Marks low contrast areas in the image which mainly correspond with the
background of the image or smudges in the fingerprint (Low Contrast Map, LCM)
(“Figure 2”).
Low ridge flow: Identifies those image areas where the dominant ridge flow could not
be determined initially (Low Flow Map, LFM) (“Figure 2”).
High curvature: Flags high curvature areas in the image, such as the fingerprint core or
possible delta regions (High Curve Map, HCM) (“Figure 2”).
M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL
Low Contrast Map Low Flow Map High Curve Map
Figure 2: Generated Image Maps
The computed Low Contrast Map, Low Flow Map and High Curvature Map are shown
in “Figure 2”.
As a combination of these three features a quality map is derived. This map assigns one
of the possible five quality levels to each of the blocks in the image (“Figure 3”). The
quality ranking is sorted as follows:
0: Poor quality
1: Fair quality
2: Good quality
3: Very good quality
4: Excellent quality
In this phase of the algorithm one of the fundamental maps for the minutiae extraction
process is also derived: The directional ridge flow map. For the acquisition of this map,
the original image is divided into 8x8 size pixel blocks. For each of the image blocks a
24x24 pixel sized window is defined, conformed by the block itself and other surrounding
pixels. The window is rotated incrementally in the 16 orientations defined in the algorithm
(each of them 11.25◦ apart) and a DFT is executed at each position. In every orientation,
the pixels along each rotated row of the window are summed up to form 16 vectors of
row sums. Each one of these vectors is then convolved with four waveforms of different
frequency. The spatial frequency of each waveform discretely represents the width of
different ridges and valleys, in such a way that 12, 6, 3 and 1.5 pixel widths are covered
in the algorithm. To determine the dominant ridge flow within a block, the resonance
coefficient obtained from the convolution is evaluated. The result for this module of the
algorithm is shown in “Figure 4”.
Once the image maps have been acquired, it is necessary to binarize the fingerprint
image so that the minutiae can be extracted. To carry out this process, the previously
computed directional ridge flow map is used to determine the binary value assigned to
each pixel. After the binarization (“Figure 4”), the minutiae detection module analyzes
the binarized image looking for candidate minutiae (ridge ending or bifurcation). How-
ever, not all the ridge patterns selected after this procedure correspond to true minutiae,
A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM
Figure 3: Computed quality map
therefore, a false minutiae removing process should be carried out. Even after the false
minutiae removing process, false minutiae may potentially remain in the candidate list.
To counteract this fact, a reliability measure is assigned to each minutia based on the qual-
ity map and other pixel intensity statistics. The resulting minutiae for the template image
are shown in “Figure 4”.
2.3 Matching Algorithm
The BOZORTH3 matching algorithm, included in the second distribution of NFIS, com-
putes a match score that reflects the similarity degree between a fingerprint’s minutiae
and a template minutiae set, both of them in XYT format. One of the most remarkable
features of this algorithm is its invariance to both rotation and translation.
The first step in the algorithm flow shown in “Figure 5” is to construct a comparison
table for each one of the input minutiae sets. Relative measures between a minutia and
the rest of the minutiae in the same fingerprint are computed and stored in a comparison
table. This is what provides the algorithm’s translation and rotation invariance.
The next step is to look for compatible entries between the two tables. The results
of this analysis are stored in a new compatibility table which consists of a list of com-
patibility associations between two possible corresponding minutiae. Each one of these
associations represents single links in the compatibility graph.
In the final phase of the matching software flow, the compatibility graph is traversed and
clusters are created by linking table entries. Once the traversals are complete, compatible
clusters are combined and a match score is computed by accumulation of the linked table
entries across the combined clusters. Generally, a match score greater than 40 indicates
that both fingerprint minutiae and template minutiae belong to the same finger, and so, to
the same individual.
M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL
Direction Map Binarized Image Final Minutiae
Figure 4: Resulting images of the minutiae extraction process
3 HW Architecture
3.1 Initial System Architecture
The initial HW architecture is composed of a 50 MHz fixed-point Leon2 soft-processor
with 8 KB of cache memory for data and instructions, all of this embedded in a GR-
XC3S1500 board.This processor has been chosen for this application not only because of
its high performance and usability, but also due to the fact that it can be obtained under
LGPL license. According to a report on synthesizable CPU cores [5], where Leon2, Mi-
croBlaze and OpenRISC 1200 where tested under three different hardware configurations
and three different benchmarks, Leon2 yielded the best performance per clock cycle for
all the benchmarks and configurations. Moreover, in the opinion of the authors of the
mentioned report, Leon2 is the processor with the highest usability among the tested
CPU cores. A reason for this may be the VHDL code availability and the TCL/Tk based
configuration tool, which facilitates the design of a custom Leon2 based system. The fin-
gerprint image acquisition is performed by means of an IP (Intellectual Property) module
connected to the MBF200 fingerprint sensor. This module is attached to the APB bus
(AMBA Peripheral Bus) and provides the processor with the input fingerprint image as
shown in “Figure 6”. The operation of the sensor is controlled by means of three On-chip
registers (control, data and status) generated in the address range allocated for the APB
bridge.
3.1.1 Running the application on the initial system
The original minutiae extraction and matching algorithm was implemented using floating-
point notation while the Leon2 soft-processor is fixed-point. In order to run the program
on the target platform the ’-msoft-float’ option must be set in the compiler options. This
option forces all floating-point operations to be done in software with integer arithmetic.
The execution of the algorithm is successful as for the matching results is concerned but
not in terms of execution time. The required computation time for the minutiae extraction
is 157 seconds while the matching process for a one-to-one comparison is carried out in
51 seconds. Even when the results for the extracted minutiae and match score are correct,
A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM
Figure 5: Functional steps of the matching algorithm
the program execution delay is unacceptable for a biometric verification system.
The excessive execution time is mainly due to the MINDTCT algorithm, and thus, the
analysis of the reduction of the time required for minutiae extraction becomes one of the
main objectives of this paper.
In the MINDTCT module a great amount of floating-point data is used. The emulation
of this data format introduces a serious delay in the execution of the program. To ac-
celerate this process, a floating point unit (FPU) is required. In the following section the
effects of this core concerning execution time reduction and floating-point data processing
are analyzed.
3.2 FPU Tests
The Leon2 processor provides an interface to different FPUs. The floating-point units
from Gaisler Research (GRFPU) and Sun Microsystems (Meiko), as well as the incom-
plete LTH FPU [6] are compatible with this interface. For the acceleration of the floating-
point processes the GRFPU has been used eventually for its high-performance and com-
pliance with the IEEE-754 standard.
The insertion of a FPU in the embedded system leads to a considerable increase in the
amount of logic inside the FPGA. This is the reason why a reduction in the processor
clock and/or a cutback in the cache memory amount will be required. The effect of
the attachment of a floating point unit has been analyzed for the following three system
configurations:
– 31 MHz and 8KB cache memory.
– 37 MHz and 8KB cache memory.
– 40 MHz and 4KB cache memory.
M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL
Figure 6: Proposed system architecture
The FPGA utilization is almost complete and very similar for the three analized hard-
ware configurations. The device utilization summary for the three cases is shown in “Ta-
ble 1”.
31 MHz and 37 MHz and 40 MHz and
8 KB cache 8 KB cache 4 KB cache
Number of external IOBs 36% 36% 36%
Number of LOCed External IOBs 97% 97% 97%
Number of MUL18X18s 53% 53% 53%
Number of RAMB16s 56% 68% 56%
Number of slices 99% 99% 99%
Number of SLICEMs 1% 1% 1%
Number of BUFGMUXs 37% 37% 37%
Number of DCMs 50% 50% 50%
Table 1: Device utilization for different system configurations
The floating-point behaviour has been assessed by means of the Stanford and Paranoia
benchmarks. The first one measures the execution time in milliseconds for each one of
the ten small programs included in the algorithm. Only two of these modules make use of
numbers in floating-point format: FFT and Mm. On the other hand, Paranoia is a program
to test the compliance with the IEEE-754 floating-point standard [5].
It is important to point out that the compilation of these programs must be carried out
without the ’-msoft-float’ compiler option set for those system configurations where the
floating-point core is inserted.
A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM
3.2.1 Stanford Benchmark
The outcome of the execution of this program for different system configurations is shown
in “Table 2”. The study has been carried out for the following embedded designs:
A: 50 MHz clock frequency with 8KB cache memory (No FPU).
B: 31 MHz clock frequency with 8KB cache memory and FPU.
C: 37 MHz clock frequency with 8KB cache memory and FPU.
D: 40 MHz clock frequency with 4KB cache memory and FPU.
It is worth mentioning the considerable execution time reduction achieved in those pro-
gram modules where floating-point operations where carried out. The decrease in the
execution time for the matrix multiplication program (Mm) is of 95% in the best case
(37 MHz and 8KB cache memory) and 91.6% in the worst case (31 MHz and 8KB cache
memory). The results for the FFT analysis are similar: A reduction of 95.3% of the com-
putation time has been achieved in the best case (40 MHz and 4KB cache memory) and
of 92.22% in the worst case (31 MHz and 8KB cache memory). The execution time for
integer operations does not show remarkable variations.
A B C D
Perm 34 50 33 34
Towers 50 83 67 50
Queens 33 50 33 33
Intmm 166 133 100 116
Mm 1000 84 50 67
Puzzle 317 450 350 350
Quick 50 50 33 33
Bubble 50 50 50 50
Tree 233 334 266 250
FFT 1067 83 67 50
Table 2: Stanford benchmark results for different system configurations
3.2.2 Paranoia Benchmark
The outcome of the Paranoia benchmark has been identical in the three proposed con-
figurations. The arithmetic diagnosed is satisfactory though a little underflow flaw has
been found:
Paranoia version 1.1 [cygnus]
. . .
FLAW: X = 3.05947655544740190e-308
is not equal to Z = 2.22507385850720138e-308 .
yet X - Z yields 0.00000000000000000e+00 .
This is correct as long as the underflow is notified.
M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL
3.3 Introducing the GRFPU in the design
Tests on IEEE-754 compliancy drew positive results for the Gaisler Research floating-
point core, and therefore this module was included in the hardware design. Results
for computation time have improved substantially after the FPU insertion, mainly in the
execution time required for the MINDTCT process completion. A 94.14% time reduction
has been achieved for the case of 40MHz and 4KB cache memory configuration. The
execution time for the matching algorithm had a slight improvement.
The timing results for the different system configurations are shown in “Figure 7”. Note
that the analized system configurations are the same as those in section 3.2.1.
Figure 7: Execution Time Results for different Leon2-based Configurations
3.4 HW Speed Enhancement
Even if the computation time for the minutiae extraction algorithm has been reduced in
a 94.14%, the program completion delay is yet excessive for its implementation in a real
commercial system.
Several timing analyses show that the 75% of the computation time is occupied by
the low contrast map, direction map and low flow map generation process. The 92% of
the time dedicated to image map computation is needed to generate the directional ridge
A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM
flow map, as shown in “Figure 8”. This process is accelerated by means of a hardware
accelerator that computes the required DFT calculations for the accomplishment of the
algorithm.
Figure 8: Profiling of the execution time for: a.- MINDTCT algorithm b.- LCM, LFM
and DM modules
4 Conclusions
This paper describes the implementation of a fingerprint minutiae extraction and matching
algorithm running on a Spartan3 based system with an embedded Leon2 soft-processor.
The original application developed by NIST has been modified and ported to the target
platform. Several tests have been carried out to analyze the performance of the software
algorithms with different Leon2 and GRFPU configurations.
After the insertion of a floating-point unit, the results on execution time of the algorithm
have been reduced in a 94.14% for a 40MHz and 4KB cache memory configuration.
Acknowledgments
This work was supported by the BIOSEG PROFIT Project funded by the Spanish Ministry of
Science and Technology. The authors would also like to thank Jiri Gaisler and Richard Pender for
their support in setting up the system architecture.
References
[1] A.K. Jain. Biometric recognition: How do I know who you are?. In Signal Processing and Commu-
nications Applications Conference on, 2004. Proceedings of the IEEE 12th, April 2004.
[2] S. Yang, K. Sakiyama and I. Verbauwhede. A Compact and Efficient Fingerprint Verification System
for Secure Embedded Devices. In ”Signals, Systems and Computers, 2003. Conference Record of the
Thirty-Seventh Asilomar Conference”, pages 2058-2062, Pacific Grove, California, November 2003.
[3] A. Lindoso, L. Entrena and J. Izquierdo. FPGA-Based acceleration of fingerprint minutiae match-
ing. In ”III Southern Conference on Programmable Logic (SPL2007)”, Mar del Plata, Argentina,
February 2007.
M. BARRENECHEA, J. ALTUNA, M. SAN MIGUEL
[4] M. Lopez Garcıa and E. F. Canto Navarro. FPGA Implementation of a Ridge Extraction Fingerprint
Algorithm Based on a MicroBlaze and Hardware Coprocessor. In ”16th International Conference on
Field Programmable Logic and Applications (FPL 2006)”, Madrid, Spain, August 2006.
[5] Daniel Mattson and Marcus Christensson. Evaluation of synthesizable CPU cores, Master’s Thesis,
Chalmers University of technology, Gothenburg, Sweden, 2004.
[6] Martin Kasprzyk. Floating Point Unit, Digital IC Project 2001, January 2002.