kevin-neal
DESCRIPTION
This was a presentation I created on some simple but extremely useful techniques that can be used when scanning documents to drastically improve your automatic data capture accuracy.

The first technique is Multistream, which simply means that the scanner can output two versions of the scanned document. Typically one is in color and one is in black and white. Why? You would want to save the color version of the image for retrieval purposes. In other words, the user would see an electronic version identical to the hard-copy document. The black-and-white version is used strictly for automatic data extraction, because the color is often unnecessary for OCR.

The second technique is Background Color Removal. Forms designed specifically for automatic data capture, such as the Health Care Financing Administration (HCFA) CMS-1500, UB-92 or UB-04, have a single, consistent shade of background color. Why? The form color is designed to make it obvious to the person completing the form exactly where characters and specific information are to be placed. For example, the Social Security Number field has an exact box for each of the nine digits in your SSN. This way the software knows exactly where to look for the SSN field and can accurately populate each of the nine digits. In forms processing, you don't care about the background color; you care about the information on the form. So you "drop out" the color and expose the data.

I've written about additional data capture tips, tricks and techniques here: http://www.aiim.org/community/blogs/expert/Demystifying-Forms-Processing-and-Data-Capture
Citation preview
Image Processing
Fujitsu Image Processing Software
A series of predefined templates for forms processing applications
Modify or create new templates with ease
Examples:
• Light color text
• OCR form
• Magazine
Generate B&W and Color images simultaneously
Simultaneous Color and B&W output
The Fujitsu fi-5950 scanner has the ability to create two images for each page.
Front Side Images
Back Side Images
Color images for viewing (people)
Bitonal images for processing (computers)
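The two-stream idea can be imitated in software to see what each output is for. The following is only a minimal sketch, not the scanner's own method: the fi-5950 produces both images itself, and the function names (`to_bitonal`, `multistream`), the fixed threshold of 128 and the Rec. 601 luminance weights are assumptions chosen for illustration. A page is modeled as rows of (R, G, B) tuples.

```python
def to_bitonal(pixels, threshold=128):
    """Convert rows of (R, G, B) tuples to a bitonal image (1 = black, 0 = white).

    Uses a fixed luminance threshold; this is an illustrative assumption,
    not the thresholding a real scanner applies.
    """
    bitonal = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Rec. 601 luminance approximation of perceived brightness.
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(1 if luma < threshold else 0)
        bitonal.append(out_row)
    return bitonal


def multistream(pixels, threshold=128):
    """Return (color_copy, bitonal_copy) for a single scanned page."""
    color = [list(row) for row in pixels]  # untouched copy kept for viewing
    return color, to_bitonal(pixels, threshold)


# Tiny 2x2 "page": white paper, dark gray ink, pale pink background, dark blue ink.
page = [
    [(255, 255, 255), (20, 20, 20)],
    [(250, 200, 200), (0, 0, 120)],
]
color, bitonal = multistream(page)
print(bitonal)
```

The color copy is what a person would retrieve and view; the bitonal copy is the smaller, simpler image a recognition engine would read.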
Simultaneous Color and B&W output
OMR
ICR
OCR
Terminology:
OMR = Optical Mark Recognition (Check Box)
ICR = Intelligent Character Recognition (Handprint)
OCR = Optical Character Recognition (Machine Print)
Forms Dropout Color Example
• Sample claim form with excellent form structure, including red constraint areas to be eliminated
Dynamic Color Dropout
ICR on handprint
OMR on check marks
Deskewed Image
Image Enhanced
• After Dynamic Color Dropout only the relevant form data is exposed for effective Forms Processing
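A simple fixed-channel version of color dropout shows why the red form structure disappears. This is a hedged sketch and not the Dynamic Color Dropout algorithm named on the slide, whose details are not given here. It assumes the form is printed in a known dropout color: viewed through its own channel, red ink looks about as bright as white paper, while dark handprint or machine print stays dark. The function name `dropout_to_bitonal` and the threshold of 128 are assumptions for illustration.

```python
CHANNEL_INDEX = {"red": 0, "green": 1, "blue": 2}


def dropout_to_bitonal(pixels, dropout="red", threshold=128):
    """Binarize rows of (R, G, B) tuples using only the dropout color's channel.

    Returns 1 for black (kept as data) and 0 for white (dropped out).
    Ink printed in the dropout color is bright in its own channel, so it
    falls above the threshold and vanishes along with the paper.
    """
    idx = CHANNEL_INDEX[dropout]
    return [[1 if px[idx] < threshold else 0 for px in row] for row in pixels]


# Tiny sample: red constraint boxes (230, 60, 60) around one dark ink pixel.
form = [
    [(255, 255, 255), (230, 60, 60), (255, 255, 255)],
    [(230, 60, 60), (25, 25, 30), (230, 60, 60)],
]
cleaned = dropout_to_bitonal(form, dropout="red")
print(cleaned)
```

Choosing the wrong channel (for example `dropout="green"` on this red form) leaves the constraint boxes black in the output, which is the clutter that hurts ICR and OMR accuracy.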
Color Documents
Dropout Color options
Color reduction (background washout)
Forms outline removal