James Coughlan, Ph.D. Towards A Real-Time System for Finding and Reading Signs for Visually Impaired Users


Portable and Mobile Systems in Assistive Technology - Towards A Real-Time System for Finding and Reading Signs for Visually Impaired Users - Coughlan, James (f)


Page 1: Towards A Real-Time System for Finding and Reading Signs for Visually Impaired Users

James Coughlan, Ph.D.

Towards A Real-Time System for Finding and Reading Signs for Visually Impaired Users


Informational signs

Signs are ubiquitous indoors and outdoors

Useful for wayfinding, finding shops and businesses, accessing a variety of services

But nearly all are inaccessible to blind and visually impaired persons!


OCR (Optical Character Recognition)

Originally developed for clear images of text documents, acquired by a flatbed scanner

Not equipped to find text in an image with lots of non-text clutter (buildings, trees, etc.)


Portable OCR for visually impaired users

Smartphone (Nokia N82) implementation: kReader Mobile, knfbReader Mobile (K–NFB Reading Technology, Inc.)


kReader Mobile limitation

Assumes text comprises all (or most) of image:
• “Get as close to the text as you can without cutting off any text, as it is displayed on the screen”
• “Distance from the target can greatly affect the text recognition quality. Most, but not all, documents should be approximately 10 inches from the Reader.”
(KNFB Mobile Reader User Guide)


Related work

Much research on computer vision algorithms for finding text in cluttered images

Very challenging problem

Even if text is correctly located in an image, many problems with OCR:
• non-standard fonts
• poor illumination
• curved surfaces, perspective distortion
• other forms of noise in images


Related work (continued)

Some smartphone apps find text, read it and translate it in real time


Related work (continued)

A small amount of work targeted specifically at finding and reading text for blind and visually impaired persons:

• C. Yi & Y. Tian, 2011
• “Smart Telescope” project from Blindsight Corporation (www.blindsight.com): find text regions and present enlarged text to low vision user


Our approach

• Design algorithm to rapidly find text on Android smartphone running in video mode (640 x 480 pixels)
• Perform on-board OCR (Tesseract)
• Read aloud (text-to-speech) immediately
• For speed, all processing is done on-board (no need for internet connection). Read aloud up to 1-2 frames per second.


System UI (user interface)

• Philosophy: text detection/reading errors are inevitable. To overcome them, have user obtain multiple readings of each text sign over time. Ignore spurious (unreproducible) readings, and come to consensus about true contents of each sign.
• If multiple text strings in one image, read aloud in “raster” order (from top to bottom, and along a line from left to right)
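The “raster” reading order can be illustrated with a minimal sketch. It is not the authors' implementation: the `TextBox` type, the `raster_order` function and the `line_tolerance` parameter are hypothetical names, and the rule for deciding that two boxes share a line (vertical centers within half a text height) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """A detected text string and its bounding box in image pixels (hypothetical type)."""
    text: str
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def raster_order(boxes, line_tolerance=0.5):
    """Order boxes top to bottom, and left to right within a line.

    Two boxes are treated as being on the same line when their vertical
    centers differ by at most line_tolerance times the taller box height.
    This grouping rule is an assumption made for illustration.
    """
    rows = []  # each row is a list of boxes judged to be on one text line
    for box in sorted(boxes, key=lambda b: b.y + b.h / 2):
        center = box.y + box.h / 2
        for row in rows:
            ref = row[0]
            ref_center = ref.y + ref.h / 2
            if abs(center - ref_center) <= line_tolerance * max(ref.h, box.h):
                row.append(box)
                break
        else:
            rows.append([box])
    # Rows are already ordered top to bottom; sort each one left to right.
    return [b for row in rows for b in sorted(row, key=lambda b: b.x)]


boxes = [
    TextBox("Room", 200, 12, 80, 20),
    TextBox("Exam", 50, 10, 80, 20),
    TextBox("150", 60, 100, 50, 20),
]
reading_order = [b.text for b in raster_order(boxes)]
```

With the sample boxes, “Exam” and “Room” share a line and are read left to right before “150” on the line below.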


Overview of algorithm


Big challenge: how to aim the smartphone camera?

If you are blind, you may have little idea where to aim the camera! (kReader Mobile User Guide has an entire section on “Learning to Aim Your Reader”)

Also, text is best read when it is horizontal, but many blind users have trouble holding camera horizontal


Help with aiming: UI features

1) Tilt detection function: allows user to vary pitch and yaw but forces roll to be zero. Issue vibration any time roll is far enough from zero.

Allows user to point in any compass direction, and to aim high or low depending on whether text is above or below shoulder height. Increases chances that text appears horizontal in image.
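One way such a roll check could work is sketched below, assuming the phone reports a gravity vector from its accelerometer in the usual Android device axes (x to the right of the screen, y toward the top of the screen, z out of the screen) and is held in portrait. The function names and the 10-degree limit are hypothetical; the slides do not state the threshold or the sensor processing used.

```python
import math

ROLL_LIMIT_DEG = 10.0  # hypothetical threshold, not taken from the slides


def roll_degrees(ax, ay):
    """Roll about the camera axis, from gravity components in the screen plane.

    With the phone upright in portrait, gravity lies along +y and the roll is 0.
    Pitching the camera up or down moves gravity between y and z, and yaw does
    not change the gravity reading, so neither affects this angle.
    """
    return math.degrees(math.atan2(ax, ay))


def should_vibrate(ax, ay, az, limit_deg=ROLL_LIMIT_DEG, min_in_plane=1.0):
    """Return True when roll is far enough from zero to warn the user."""
    # If the camera points almost straight up or down, gravity has almost no
    # component in the screen plane and roll is ill-defined: do not warn.
    if math.hypot(ax, ay) < min_in_plane:
        return False
    return abs(roll_degrees(ax, ay)) > limit_deg
```

For example, `should_vibrate(3.36, 9.22, 0.0)` corresponds to a roll of roughly 20 degrees and would trigger the vibration, while `should_vibrate(0.0, 6.9, 6.9)` is a pure pitch of about 45 degrees and would not.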


Help with aiming: UI features

2) Warning whenever text is close to being cut off: read aloud detected text in a low pitch.

Red box = camera field of view

“Smoking” (low pitch)


Help with aiming: UI features

Red box = camera field of view

“No smoking” (normal pitch)


Help with aiming: UI features

3) Warning whenever text is small: read text in a high pitch to signal user to approach text for clearer view

Red box = camera field of view

“No smoking” (high pitch)

NO SMOKING
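The two pitch cues (low pitch when text is close to being cut off, high pitch when text is small) could be combined along the following lines. This is only a sketch: the 640 x 480 frame size comes from the slides, but the function name, the edge margin, the minimum text height and the choice to give the cut-off warning priority over the small-text warning are all assumptions.

```python
FRAME_W, FRAME_H = 640, 480  # video frame size stated in the slides
EDGE_MARGIN_PX = 10          # hypothetical: how close to the border counts as "cut off"
MIN_TEXT_HEIGHT_PX = 20      # hypothetical: below this, text counts as "small"


def speech_pitch(x, y, w, h,
                 frame_w=FRAME_W, frame_h=FRAME_H,
                 margin=EDGE_MARGIN_PX, min_height=MIN_TEXT_HEIGHT_PX):
    """Pick the text-to-speech pitch for a text box (x, y = top-left corner)."""
    near_edge = (
        x <= margin
        or y <= margin
        or x + w >= frame_w - margin
        or y + h >= frame_h - margin
    )
    if near_edge:
        return "low"   # text may be cut off: user should re-aim the camera
    if h < min_height:
        return "high"  # text is small: user should move closer
    return "normal"
```

Under these assumed thresholds, a box touching the left border such as `speech_pitch(0, 200, 300, 60)` gives `"low"`, a small centered box such as `speech_pitch(300, 230, 60, 12)` gives `"high"`, and a large centered box gives `"normal"`.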


Experiments

Ten signs printed out and placed on two adjoining walls of conference room

Two blind volunteer subjects, out of reach of wall

Brief training session: purpose of experiment, how to hold and move camera


Subjects told to search for an unknown number of signs on the two walls, and to tell experimenter content of each sign detected


Experimental results

Subject 1:
• 6 signs reported perfectly correctly
• 2 signs completely missed
• 2 other signs reported with some errors:
- “Dr. Samuels” was detected as “Samuels” (audible to experimenter but not subject)
- “Meeting in Session” sign gave rise to the words “Meeting” and “section” (though they were not uttered together)


Experimental results

Subject 2:
• 3 signs reported perfectly correctly
• Typical errors:
- “Exam Room 150” was detected and read aloud correctly, but subject was unable to understand the word “exam”
- Reported “D L Samuels meeting in session” as a sign, which is an incorrect combination of two signs, “Dr. Samuels” (which the system misread as “Dr.”) and “Meeting in Session”


Discussion

System still very difficult to use!

False positives and false negatives (i.e., missed text) still a big problem → we are improving our text detection algorithm

Even when text is correctly detected, OCR still causes many errors

Slow processing speeds (plus camera motion blur) force user to pan camera very slowly


Discussion (continued)

UI planned in the future:
• Have user scan environment, sound an audio tone whenever text is detected
• Compute an image mosaic (panorama) of entire scene, to seamlessly read text strings that don’t fit inside a single image frame
• Cluster multiple text strings into distinct sign regions
• User will be able to hear text-to-speech repeated for any sign region


Discussion (continued)

Further in the future:

“Visual spam” is a big problem → task-driven search (“find me Dr. Smith’s office”)

Finding signs will always be difficult at times (even for people with normal vision) → integration with “indoor GPS” (i.e., localization indoors) to provide useful, location-specific information


Thanks to…

First author: Dr. Huiying Shen (Smith-Kettlewell)

Collaborators: Dr. Roberto Manduchi (UC Santa Cruz), Dr. Vidya Murali and Dr. Ender Tekin (Smith-Kettlewell)

Funding from NIH and NIDRR