M.A.Sc. Thesis Presentation
Automated Reading Assistance System Using Point-of-Gaze Estimation
Jeffrey J. Kang
Supervisor: Dr. Moshe Eizenman
Department of Electrical and Computer Engineering
Institute of Biomaterials and Biomedical Engineering
January 24, 2006
Introduction
Reading
• Visual examination of text
• Convert words to sounds to activate word recognition
We learn appropriate conversions through repetitive exposure to word-to-sound mappings
Insufficient reader skill or irregular spelling can lead to failed conversion: assistance is required
Objective: Develop an automated reading assistance system that automatically vocalizes unknown words in real-time on the reader’s behalf. The system should operate within a natural reading setting.
What We Need To Do: Step 1
1. Identify the word being read, in real-time
2. Detect when the word being read is an unknown word
3. Vocalize the unknown word
Identifying the Word Being Read
Identify the viewed word using point-of-gaze estimation
Point-of-gaze is:
• Where we are looking with the highest visual acuity region of the retina
• Intersection of the visual axes of both eyes within the 3D scene, or
• Intersection of the visual axis of one eye with a 2D plane
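The single-eye definition can be illustrated as a ray-plane intersection. This is a minimal sketch, not the thesis implementation: the function name `point_of_gaze`, the millimetre units and the example eye position are assumptions made for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def point_of_gaze(origin: Vec3, direction: Vec3,
                  plane_point: Vec3, plane_normal: Vec3) -> Optional[Vec3]:
    """Intersect the ray origin + s * direction (s >= 0) with a plane.

    origin and direction stand in for the visual axis of one eye
    (hypothetical inputs); the plane is the 2D scene object.
    """
    denom = dot(direction, plane_normal)
    if abs(denom) < 1e-12:
        return None  # visual axis is parallel to the plane
    diff = tuple(p - o for p, o in zip(plane_point, origin))
    s = dot(diff, plane_normal) / denom
    if s < 0:
        return None  # the plane lies behind the eye
    return tuple(o + s * d for o, d in zip(origin, direction))


# Hypothetical example: eye 600 mm in front of the plane Z = 0.
gaze = point_of_gaze((0.0, 0.0, 600.0), (0.1, -0.05, -1.0),
                     (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

The function returns `None` when no forward intersection exists, so a caller can skip samples where the reader looks away from the page.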
Point-of-Gaze Estimation Methodologies
1. Head-mounted
2. Remote (no head-worn components)
Head-mounted Point-of-Gaze Estimation
Based on the principle of tracking the pupil centre and corneal reflections to measure eye position
Point-of-gaze is estimated with respect to a coordinate system attached to the head
[Figure: head-mounted eye tracker. Labels: scene camera, eye camera, IR LEDs, hot mirror, corneal reflections, pupil centre]
Point-of-Gaze in Head Coordinate System
Point-of-gaze is measured in the head coordinate system, and placed on the scene camera image
Locating the Reading Object
The position of the reading object is determined by tracking markers
Mapping the Point-of-Gaze
Establish point correspondences from
• the estimated positions of the markers in the scene image
• the known positions of the markers on the reading object
Homographic mapping of point-of-gaze from scene camera image to reading object coordinate system
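The mapping step can be sketched as follows, assuming exactly four markers and a homography normalised so that its last entry is 1. The function names and the marker coordinates are hypothetical; with more than four markers a least-squares fit would be the usual choice, and the slides do not say which solver was used.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate marker configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography_from_markers(image_pts: Sequence[Point],
                            object_pts: Sequence[Point]) -> List[List[float]]:
    """Homography (h33 = 1) taking scene-image points to reading-object points."""
    a, b = [], []
    for (x, y), (u, v) in zip(image_pts, object_pts):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and likewise for v.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def map_point(h: List[List[float]], p: Point) -> Point:
    """Apply the homography to one point (e.g. the point-of-gaze)."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical marker positions: pixels in the scene image, mm on the card.
image_markers = [(120.0, 90.0), (520.0, 110.0), (500.0, 420.0), (90.0, 400.0)]
card_markers = [(0.0, 0.0), (200.0, 0.0), (200.0, 150.0), (0.0, 150.0)]
H = homography_from_markers(image_markers, card_markers)
gaze_on_card = map_point(H, (300.0, 250.0))
```

Because the homography is recomputed from the markers in each scene image, the mapped point stays in card coordinates even when the head or the card moves.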
Identify the Reading Object
Extract the barcode from the scene camera image to identify the reading object (e.g. page number)
Match barcode to database of reading objects to determine what text is being read
Identifying the Word Being Read
Using the mapped point-of-gaze, identify the word being read by table lookup
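One way to picture the table lookup is a list of word bounding boxes in reading-object coordinates. The `WordBox` layout, the `margin` tolerance and the example page are illustrative assumptions; the slides do not describe the table format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WordBox:
    """One table entry: a word and its bounding box on the page (mm)."""
    word: str
    left: float
    top: float
    right: float
    bottom: float


def word_at(gaze_x: float, gaze_y: float, boxes: Sequence[WordBox],
            margin: float = 1.0) -> Optional[str]:
    """Return the first word whose box, grown by margin, contains the gaze point."""
    for box in boxes:
        inside_x = box.left - margin <= gaze_x <= box.right + margin
        inside_y = box.top - margin <= gaze_y <= box.bottom + margin
        if inside_x and inside_y:
            return box.word
    return None


# Hypothetical page layout for one line of text.
page = [
    WordBox("the", 10.0, 20.0, 22.0, 26.0),
    WordBox("yacht", 25.0, 20.0, 45.0, 26.0),
]
viewed = word_at(30.0, 23.0, page)
```

The margin absorbs small mapping errors; a gaze point that falls between lines or off the page yields `None`.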
Sample Reading Video
Mapping Accuracy
[Figure: histogram of mapping error. x-axis: Mapping Error (mm), 0 to 6; y-axis: Proportion of Trials, 0 to 0.4]
Point-of-Gaze Estimation Methodologies
1. Head-mounted
2. Remote (no head-worn components)
Remote Point-of-Gaze Estimation
Point-of-gaze is estimated with respect to a fixed coordinate system
• C – centre of corneal curvature
• P – point-of-gaze
[Figure: remote system geometry. Labels: IR LEDs, eye camera, computer screen, 2D scene object, visual axis, C, P, coordinate system O with axes X, Y, Z]
Moving Reading Card
How can point-of-gaze be estimated with respect to a coordinate system attached to a moving reading object?
[Figure: assumed position and true position of the 2D scene object. Labels: visual axis, C, P, P', coordinate system O with axes X, Y, Z]
Estimate Motion
[Figure: assumed position and true position of the 2D scene object, related by the motion R, T between times t0 and t1. Labels: visual axis, C, P, P', coordinate system O with axes X, Y, Z]
Use a Scene Camera and Targets
[Figure: scene camera viewing the 2D scene object at times t0 and t1. Labels: Scene Camera, H0, assumed position and true position of the 2D scene object, visual axis, C, P, P', coordinate system O with axes X, Y, Z]
Calculate Two Homographies
[Figure: scene camera viewing the 2D scene object at times t0 and t1. Labels: Scene Camera, H1, assumed position and true position of the 2D scene object, visual axis, C, P, P', coordinate system O with axes X, Y, Z]
Decompose Homography Matrices
[Figure: scene camera viewing the 2D scene object at times t0 and t1. Labels: Scene Camera, R0, T0 and R1, T1, assumed position and true position of the 2D scene object, visual axis, C, P, P', coordinate system O with axes X, Y, Z]
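A common way to decompose a planar homography, shown here as a sketch rather than the method used in the thesis, assumes a calibrated pinhole scene camera with focal length `f` and principal point `(cx, cy)` and a target lying in the plane Z = 0 of its own coordinate system, so that H is proportional to K [r1 r2 T]. The function name and all numeric values are hypothetical.

```python
import math
from typing import List, Tuple

Vec = List[float]
Mat = List[List[float]]


def decompose_homography(h: Mat, f: float, cx: float, cy: float) -> Tuple[Mat, Vec]:
    """Recover rotation R and translation T from H ~ K [r1 r2 T].

    Assumes K = [[f, 0, cx], [0, f, cy], [0, 0, 1]] and noise-free input;
    with real data R would normally be re-orthogonalised afterwards.
    """
    # Columns of K^-1 H, using the closed-form inverse of K.
    cols = []
    for j in range(3):
        a, b, c = h[0][j], h[1][j], h[2][j]
        cols.append([(a - cx * c) / f, (b - cy * c) / f, c])
    # r1 must have unit length, which fixes the unknown scale of H.
    scale = 1.0 / math.sqrt(sum(v * v for v in cols[0]))
    if cols[2][2] * scale < 0:
        scale = -scale  # keep the target in front of the camera
    r1 = [scale * v for v in cols[0]]
    r2 = [scale * v for v in cols[1]]
    t = [scale * v for v in cols[2]]
    # Third column of R is the cross product r1 x r2.
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]
    rot = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return rot, t


# Synthetic check: build H from a known pose, then recover the pose.
theta = 0.2
c, s = math.cos(theta), math.sin(theta)
true_r = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
true_t = [10.0, -5.0, 500.0]
f, cx, cy = 800.0, 320.0, 240.0
true_cols = [[true_r[i][0] for i in range(3)],
             [true_r[i][1] for i in range(3)],
             true_t]
h = [[0.0] * 3 for _ in range(3)]
for j, (vx, vy, vz) in enumerate(true_cols):
    h[0][j] = 2.5 * (f * vx + cx * vz)  # arbitrary overall scale of 2.5
    h[1][j] = 2.5 * (f * vy + cy * vz)
    h[2][j] = 2.5 * vz
rot, trans = decompose_homography(h, f, cx, cy)
```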
Calculate Motion of 2D Scene Object
[Figure: scene camera viewing the 2D scene object at times t0 and t1; the motion R, T is obtained from R0, T0 and R1, T1. Labels: Scene Camera, assumed position and true position of the 2D scene object, visual axis, C, P, P', coordinate system O with axes X, Y, Z]
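The composition of the two poses can be sketched as below. The convention is an assumption: each pose maps object coordinates to scene-camera coordinates (X_cam = R_i X_obj + T_i) and the scene camera is stationary, which gives R = R1 R0^T and T = T1 - R T0. Helper names and pose values are hypothetical.

```python
import math
from typing import List, Tuple

Vec = List[float]
Mat = List[List[float]]


def transpose(m: Mat) -> Mat:
    return [[m[j][i] for j in range(3)] for i in range(3)]


def matmul(a: Mat, b: Mat) -> Mat:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m: Mat, v: Vec) -> Vec:
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def object_motion(r0: Mat, t0: Vec, r1: Mat, t1: Vec) -> Tuple[Mat, Vec]:
    """Motion (R, T) that carries the object from its pose at t0 to its pose at t1.

    A point at X0 (camera frame, time t0) moves to X1 = R X0 + T at time t1.
    """
    r = matmul(r1, transpose(r0))  # R = R1 R0^T (R0 is orthonormal)
    rt0 = matvec(r, t0)
    t = [t1[i] - rt0[i] for i in range(3)]  # T = T1 - R T0
    return r, t


def rot_z(angle: float) -> Mat:
    """Rotation about the Z axis, used only to build example poses."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical poses of the reading card at t0 and t1.
R0, T0 = rot_z(0.1), [5.0, 0.0, 400.0]
R1, T1 = rot_z(0.4), [25.0, -10.0, 420.0]
R, T = object_motion(R0, T0, R1, T1)
```

Applying (R, T) to the assumed position of the scene object gives its true position at t1, which is where the visual axis should be intersected.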
Point-of-Gaze Accuracy
[Figure: error in point-of-gaze versus noise level. x-axis: Noise Std. Dev. (pixels), 0 to 1.1; y-axis: Error in Point-of-Gaze (mm), 0 to 16]
What We Need To Do: Step 2
1. Identify the word being read, in real-time
2. Detect when the word being read is an unknown word
3. Vocalize the unknown word
Dual Route Reading Model
[Figure: dual route model from text to speech. Boxes: Orthographic Analysis, Orthographic Input Lexicon, Semantic System, Phonological Output Lexicon, Grapheme-Phoneme Rule System, Response Buffer. Routes: LEXICAL ROUTE, NON-LEXICAL ROUTE]
Coltheart, M. et al. (2001)
Lexical route: each word’s graphemes are processed in parallel
Non-lexical route: each word’s graphemes are individually converted into phonemes based on mapping rules
Detecting Unknown Words
For unknown words, the lexical route fails and the slower non-lexical route is used
Hypothesis: we can differentiate between known and unknown words by the duration of the processing time
Processing Time
[Figure: Gaze Duration (Subject P.L. - Aloud Reading). x-axis: Word Length (letters), 0 to 12; y-axis labelled Time (sec), ticks 0 to 4000; series: Normal Words, Difficult Words]
Setting a Threshold Curve
[Figure: Gaze Duration (Subject P.L. - Aloud Reading) with threshold curve. x-axis: Word Length (letters), 0 to 12; y-axis labelled Time (sec), ticks 0 to 4000; series: Normal Words, Difficult Words, Threshold]
Setting the Threshold
Threshold curve is a function of word length
Model processing time for known words (length k) as a Gaussian random variable $\mathcal{N}(\mu_k, \sigma_k^2)$
Estimate $\mu_k$, $\sigma_k^2$ from a short training set for each subject
Each point on the threshold curve is given by
$$T_k = \sqrt{2}\,\sigma_k\,\mathrm{erf}^{-1}(1 - 2\alpha) + \mu_k$$
• α is the constrained probability of false alarm
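The threshold for word length k is the (1 - α) quantile of the fitted Gaussian, which equals μk + √2 σk erf⁻¹(1 - 2α). A minimal sketch using Python's standard `statistics` module follows; the function name, the dictionary layout and the training durations are made up for illustration.

```python
from statistics import NormalDist, mean, stdev
from typing import Dict, List


def threshold_curve(durations_by_length: Dict[int, List[float]],
                    alpha: float) -> Dict[int, float]:
    """Per-word-length threshold T_k with false alarm probability alpha.

    durations_by_length maps word length k to gaze durations measured on
    known words (hypothetical training data; at least two samples per k,
    not all identical).
    """
    curve = {}
    for k, durations in durations_by_length.items():
        mu_k = mean(durations)
        sigma_k = stdev(durations)  # sample standard deviation
        # (1 - alpha) quantile = mu_k + sqrt(2) * sigma_k * erfinv(1 - 2 * alpha)
        curve[k] = NormalDist(mu_k, sigma_k).inv_cdf(1.0 - alpha)
    return curve


# Hypothetical training durations for known 4-letter and 8-letter words.
training = {
    4: [200.0, 220.0, 240.0, 260.0, 280.0],
    8: [350.0, 400.0, 450.0, 500.0, 550.0],
}
curve = threshold_curve(training, alpha=0.10)
```

A known word of length k then exceeds `curve[k]` with probability α under the Gaussian model, which is what constrains the false alarm rate.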
Experiment: Detecting Unknown Words
Remote point-of-gaze estimation system
• Reading material presented on computer screen
• Head position stabilized using a chinrest
Four subjects read from 40 passages of text
• 20 passages aloud and 20 passages silently
• Divided into a training set, used to “learn” $\mu_k$, $\sigma_k^2$ and set detection threshold curves, and a test set
Set false alarm probability α = 0.10
Evaluate detection performance
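Detection performance can be summarised by two rates: the fraction of unknown words flagged (detection rate) and the fraction of known words flagged (false alarm rate). The sketch below assumes each sample is a tuple of word length, gaze duration and a ground-truth label; the function name, thresholds and samples are invented for illustration and are not experimental data.

```python
from typing import Dict, Iterable, Tuple

Sample = Tuple[int, float, bool]  # (word length, gaze duration, is_unknown)


def detection_metrics(samples: Iterable[Sample],
                      curve: Dict[int, float]) -> Tuple[float, float]:
    """Return (detection rate, false alarm rate) for a threshold curve."""
    hits = unknown_total = 0
    false_alarms = known_total = 0
    for length, duration, is_unknown in samples:
        flagged = duration > curve[length]  # word is declared unknown
        if is_unknown:
            unknown_total += 1
            hits += flagged
        else:
            known_total += 1
            false_alarms += flagged
    detection_rate = hits / unknown_total if unknown_total else 0.0
    false_alarm_rate = false_alarms / known_total if known_total else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical thresholds and labelled samples.
example_curve = {4: 300.0, 8: 500.0}
example_samples = [
    (4, 250.0, False), (4, 320.0, False), (8, 450.0, False), (8, 480.0, False),
    (8, 900.0, True), (4, 280.0, True), (8, 700.0, True), (4, 400.0, True),
]
rates = detection_metrics(example_samples, example_curve)
```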
Experiment: Detecting Unknown Words
[Figure: Detection Performance. x-axis: Subject, 1 to 4; y-axis: Rate, 0 to 1; series: Training Set Detection Rate, Test Set Detection Rate, Training Set False Alarm Rate, Test Set False Alarm Rate]
Experiment: Natural Setting Reading Assistance
Natural reading pose
• Unrestricted head movement
• Reading material is hand-held
Head-mounted eye-tracker
• Identify viewed word in real-time
• Measure per-word processing time
Detecting unknown words
• Processing time threshold curves established in previous experiment
Assistance
• Detection of unknown word activates vocalization
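The steps above can be sketched as a small real-time loop: track which word the gaze is on, accumulate the time spent on it, and trigger vocalization once the threshold for that word length is exceeded. This is an illustrative simplification that treats processing time as uninterrupted dwell time on one word; the class name, the `vocalize` callback and the timings are assumptions.

```python
from typing import Callable, Dict, List, Optional


class ReadingAssistant:
    """Triggers vocalization when dwell time on a word exceeds its threshold."""

    def __init__(self, curve: Dict[int, float],
                 vocalize: Callable[[str], None]) -> None:
        self.curve = curve          # word length -> time threshold (seconds here)
        self.vocalize = vocalize    # hypothetical text-to-speech callback
        self.current: Optional[str] = None
        self.start = 0.0
        self.spoken = False

    def on_gaze_sample(self, word: Optional[str], timestamp: float) -> None:
        """Feed one identified word (or None when off-text) per gaze sample."""
        if word != self.current:
            # Gaze moved to a different word: restart the timer.
            self.current, self.start, self.spoken = word, timestamp, False
            return
        if word is None or self.spoken:
            return
        threshold = self.curve.get(len(word))
        if threshold is not None and timestamp - self.start > threshold:
            self.vocalize(word)
            self.spoken = True  # speak each visit at most once


# Hypothetical usage: collect the vocalized words in a list instead of speaking.
spoken_words: List[str] = []
assistant = ReadingAssistant({3: 0.5, 5: 0.8}, spoken_words.append)
for word, t in [("the", 0.0), ("the", 0.2), ("yacht", 0.3), ("yacht", 0.7),
                ("yacht", 1.2), ("yacht", 1.5), (None, 1.6)]:
    assistant.on_gaze_sample(word, t)
```

Consecutive identical words would need a position-based identifier rather than the word string; that detail is left out of the sketch.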
Experiment: Natural Setting Reading Assistance
Results
Point-of-gaze mapping method accommodated head and reading material movement without reducing detection performance
Subject   Detection Rate   False Alarm Rate
M.E.      0.94             0.10
P.L.      0.95             0.09
Conclusions
Developed methods to map point-of-gaze estimates to an object coordinate system attached to a moving 2D scene object (e.g. reading card)
• Head-mounted system
• Remote system
Developed method to detect when a reader encounters an unknown word
Demonstrated principle of operation for an automated reading assistance system
Future Work
Implement reading assistant using remote-gaze estimation methodology
Validate efficacy of system as a teaching tool for unskilled English readers, in collaboration with an audiologist
Evaluate other forms of assistive intervention
• e.g. translation, definition
Questions?