Replicating Semantic Connections Made by Visual Readers for a Scanning System for Nonvisual Readers

Kathy McCoy
(Debbie Yarrington)

Dept. of Computer and Information Sciences
University of Delaware
&
Consultant for National Institute on Disability and Rehabilitation Research (NIDRR)
US Department of Education

Goal

The goal of this system is to give nonvisual readers information similar to what visual readers get when skimming through a document in response to a question.

Motivation

Working with college students who were blind and visually impaired
◦ Students took significantly longer to find homework question answers within documents than their visual-reading counterparts

Current screenreaders have limited search ability.

Approach

Use an eye tracker to see what sighted people look at when they are skimming to answer a question
◦ Identify important paragraphs
Develop Natural Language Processing techniques to replicate the data
Work with people who are blind to develop appropriate interfaces using the results from above.

Part 1: Visual Skimming Data

Goal: To achieve an understanding of what information visual skimmers pay attention to when skimming through documents to answer questions

Procedure:
◦ Have visual readers skim through a document for a question answer while being tracked by an eye tracking system

Gathering Data

14 complex questions and accompanying documents
◦ 10 were 2 pages, 2 were 5 pages, and 2 were 8 pages or longer.
◦ Documents were text documents: no images, few subtitles and lists

Example Questions Considered

“What effect does China’s rising oil prices have on other sectors of its economy?”

“According to Piaget, what techniques do children use to adjust to their environment?”

“How do people catch the West Nile Virus?”

Gathering Data

Individuals skimmed for a question answer in a document while being tracked by an eye tracking system.
◦ 43 subjects each skimmed for answers to between 6 and 13 questions, for a total of 513 question-answer skimming results
Subjects then answered a multiple choice question

Eye Tracker Data: Tobii Eye Tracker

AOIs:
◦ We could define areas of interest (AOI) in the text document ahead of time
◦ We chose paragraphs, titles, subtitles, and the question as separate AOIs.
◦ We then counted the number of gaze points (gazes of over 100 ms duration) in each AOI

HotSpot and Duration File:
◦ The tracker gave us an image that showed “hot spots”, or locations and durations of where the eyes gazed
◦ A file with locations and durations of gaze points
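
The gaze-point count per AOI described above can be sketched roughly as follows. This is an illustration only: the `AOI` and `Gaze` record layouts, the rectangular AOI shape, and the function name are assumptions made for this example and do not reflect the Tobii export format. Only the 100 ms threshold comes from the slide.

```python
from collections import Counter
from typing import NamedTuple


class AOI(NamedTuple):
    """Hypothetical rectangular area of interest, in screen pixels."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


class Gaze(NamedTuple):
    """Hypothetical gaze record: screen position and duration."""
    x: float
    y: float
    duration_ms: float


MIN_GAZE_MS = 100  # the slide counts gazes of over 100 ms


def count_gaze_points(gazes, aois, min_ms=MIN_GAZE_MS):
    """Count gazes longer than min_ms that fall inside each AOI."""
    counts = Counter({aoi.name: 0 for aoi in aois})
    for g in gazes:
        if g.duration_ms <= min_ms:
            continue  # too short to count as a gaze point
        for aoi in aois:
            inside_x = aoi.left <= g.x < aoi.right
            inside_y = aoi.top <= g.y < aoi.bottom
            if inside_x and inside_y:
                counts[aoi.name] += 1
                break  # assume AOIs do not overlap
    return counts


aois = [AOI("question", 0, 0, 100, 20), AOI("paragraph 1", 0, 20, 100, 60)]
gazes = [Gaze(10, 10, 150), Gaze(10, 30, 90), Gaze(10, 30, 250),
         Gaze(10, 40, 101), Gaze(500, 500, 300)]
gaze_counts = count_gaze_points(gazes, aois)
```

Gazes that land outside every AOI, like the last one in the sample data, are simply ignored in this sketch.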

Example of Eye Tracking Results

Results Analysis:

We examined the AOIs most frequently focused on that did not have physical attributes that would explain why they attracted people’s gazes

The assumption is that these areas were focused on because of their connection to the question.

Subjects found the question answer

Example:

“How do people catch the West Nile Virus?”

The paragraph with the most gaze points for the most subjects was:

“In the United States, wild birds, especially crows and jays, are the main reservoir of West Nile virus, but the virus is actually spread by certain species of mosquitoes. Transmission happens when a mosquito bites a bird infected with the West Nile virus and the virus enters the mosquito's bloodstream. It circulates for a few days before settling in the salivary glands. Then the infected mosquito bites an animal or a human and the virus enters the host's bloodstream, where it may cause serious illness. The virus then probably multiplies and moves on to the brain, crossing the blood-brain barrier. Once the virus crosses that barrier and infects the brain or its linings, the brain tissue becomes inflamed and symptoms arise.”

Subjects focused on areas that have a semantic relationship with the question

E.g., with the question,

“Why was Monet’s work criticized by the public?”

the second most frequently focused on paragraph was:

“In 1874, Manet, Degas, Cezanne, Renoir, Pissarro, Sisley and Monet put together an exhibition, which resulted in a large financial loss for Monet and his friends and marked a return to financial insecurity for Monet. It was only through the help of Manet that Monet was able to remain in Argenteuil. In an attempt to recoup some of his losses, Monet tried to sell some of his paintings at the Hotel Drouot. This, too, was a failure. Despite the financial uncertainty, Monet’s paintings never became morose or even all that somber. Instead, Monet immersed himself in the task of perfecting a style which still had not been accepted by the world at large. Monet’s compositions from this time were extremely loosely structured, with color applied in strong, distinct strokes as if no reworking of the pigment had been attempted. This technique was calculated to suggest that the artist had indeed captured a spontaneous impression of nature.”

This paragraph does not contain the answer

Part 2:

► Next Step: Developing Natural Language Processing (NLP) techniques to automatically identify the areas of text that visual readers focus on, as determined in Part 1.

Process:

1. Generate keywords from question
2. Weight keywords based on inverse of # of paragraphs in which they occur in the document
3. Generate matching score for each paragraph
   • # of occurrences of each keyword x keyword’s weight
4. Rank paragraph’s likelihood of being related to the question based on matching score
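
Steps 2 to 4 of the process above can be sketched as follows, under a few stated assumptions: the keywords are already given (step 1 is not shown), matching is on exact lowercase tokens with no stemming, and the weight is taken as exactly 1 / (number of paragraphs containing the keyword). The function names and the tokenizer are hypothetical, chosen only for this illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple assumed tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_weights(keywords, paragraphs):
    """Step 2: weight = 1 / (number of paragraphs containing the keyword)."""
    token_sets = [set(tokenize(p)) for p in paragraphs]
    weights = {}
    for kw in keywords:
        n_paragraphs = sum(1 for tokens in token_sets if kw in tokens)
        weights[kw] = 1.0 / n_paragraphs if n_paragraphs else 0.0
    return weights


def rank_paragraphs_by_keywords(keywords, paragraphs):
    """Steps 3 and 4: score each paragraph, then sort best first."""
    weights = keyword_weights(keywords, paragraphs)
    scored = []
    for index, paragraph in enumerate(paragraphs):
        counts = Counter(tokenize(paragraph))
        # Matching score: occurrences of each keyword times its weight.
        score = sum(counts[kw] * w for kw, w in weights.items())
        scored.append((index, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


paragraphs = [
    "Mosquitoes spread the virus. Mosquitoes bite birds.",
    "The virus can cause illness.",
    "Monet painted water lilies.",
]
ranking = rank_paragraphs_by_keywords(["mosquitoes", "virus"], paragraphs)
```

In this toy input, "mosquitoes" occurs in one paragraph (weight 1.0) and "virus" in two (weight 0.5), so a keyword that is spread across the whole document contributes less to any single paragraph's score.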

What is it that we match?

Keyword Sets:

Directly using the words from the query did not work well; using words similar to the words in the query also did not work.

We needed to find a way to match the “loose semantic connections” found in the eye-tracking data.

Topically-Related Keywords

Our solution:
◦ Use the World Wide Web to form clusters of topically-related words
Intuition: to find loosely related words, we want to find words that are discussed “with” the words in the question
Use a Google search to identify places on the web where the question words are discussed, and take words from those areas.

Procedure: Cluster formation

1. Use content words from question as search engine (Google) query terms
2. Search returns ordered list of relevant URLs with accompanying snippets
3. Retrieve web page from URL
4. Locate snippet within web page (stripped of html)
5. Include 50 content words before snippet and 50 content words after snippet and call that a snippet phrase
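
Steps 4 and 5 above can be sketched like this. The sketch assumes the page text has already been retrieved and stripped of HTML, that the snippet appears in the page as a contiguous run of words, and that "content words" means words outside a stopword list. The tiny `STOPWORDS` set and all function names are placeholders invented for this example.

```python
import re

# Tiny illustrative stopword list; a real one would be much larger.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is",
                       "it", "by", "for", "on", "with"})


def words(text):
    """Lowercase word tokens; a deliberately simple assumed tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_sublist(haystack, needle):
    """Index of the first occurrence of needle inside haystack, or -1."""
    n = len(needle)
    if n == 0:
        return -1
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1


def snippet_phrase(page_text, snippet, window=50, stopwords=STOPWORDS):
    """Content words of the snippet plus `window` content words each side.

    Returns None when the snippet cannot be located in the page.
    """
    page = words(page_text)
    snip = words(snippet)
    start = find_sublist(page, snip)
    if start < 0:
        return None
    end = start + len(snip)
    content_before = [w for w in page[:start] if w not in stopwords]
    content_after = [w for w in page[end:] if w not in stopwords]
    inside = [w for w in snip if w not in stopwords]
    before = content_before[-window:] if window > 0 else []
    return before + inside + content_after[:window]


page = "the quick brown fox jumps over the lazy dog near the river bank"
phrase = snippet_phrase(page, "jumps over the lazy", window=2,
                        stopwords={"the", "over", "near"})
```

Search-engine snippets often join several fragments with ellipses, so a fuller version would need to match each fragment separately; this sketch does not attempt that.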

Procedure: Cluster formation II

6. Take the top 50 snippet phrases containing the most search terms
7. Generate a word cluster with those phrases
8. Add a global meaning weight so as to eliminate words that are very common (Global Indirect Document Frequency seeded from a large list of words)
9. Take top 25% of cluster and use it to rank sentences
10. Rank paragraphs by the sentences
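
Steps 6 to 10 leave several details open, so the following is only one possible reading. It assumes that the cluster weights a word by its count across the snippet phrases, that the global weight is an IDF-style logarithm computed from a separate table of document frequencies, that a sentence is scored by summing the weights of the cluster words it contains, and that a paragraph is ranked by its best sentence. None of these choices is confirmed by the slides, and every name below is hypothetical.

```python
import math
import re
from collections import Counter


def words(text):
    """Lowercase word tokens; a deliberately simple assumed tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def top_phrases(snippet_phrases, search_terms, k=50):
    """Step 6: keep the k phrases containing the most distinct search terms."""
    terms = set(search_terms)
    return sorted(snippet_phrases,
                  key=lambda phrase: -len(terms & set(phrase)))[:k]


def build_cluster(snippet_phrases, global_doc_freq, n_global_docs,
                  keep_fraction=0.25):
    """Steps 7 to 9: count words, damp common ones, keep the top fraction."""
    counts = Counter(w for phrase in snippet_phrases for w in phrase)
    weighted = {}
    for word, count in counts.items():
        doc_freq = global_doc_freq.get(word, 0)
        # Assumed IDF-style global weight: common words score near zero.
        weighted[word] = count * math.log((n_global_docs + 1) / (doc_freq + 1))
    ranked = sorted(weighted, key=lambda w: (-weighted[w], w))
    keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return {w: weighted[w] for w in ranked[:keep]}


def score_sentence(sentence, cluster):
    """Sum of cluster weights over the distinct words in the sentence."""
    return sum(cluster.get(w, 0.0) for w in set(words(sentence)))


def rank_paragraphs_by_sentences(paragraphs, cluster):
    """Step 10 (assumed): a paragraph scores as its best sentence does."""
    scored = []
    for index, paragraph in enumerate(paragraphs):
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
        best = max((score_sentence(s, cluster) for s in sentences), default=0.0)
        scored.append((index, best))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up inputs: three snippet phrases and a toy document-frequency table.
phrases = [["mosquito", "bite", "virus", "people"],
           ["mosquito", "bird", "virus", "people"],
           ["mosquito", "people", "water", "yard"]]
doc_freq = {"people": 900, "water": 500, "yard": 200, "bird": 100,
            "bite": 80, "virus": 50, "mosquito": 10}
cluster = build_cluster(top_phrases(phrases, ["mosquito", "virus"]),
                        doc_freq, n_global_docs=1000)
paragraph_ranking = rank_paragraphs_by_sentences(
    ["Birds sing in the yard. People like water.",
     "The virus is spread by a mosquito. People get sick."],
    cluster)
```

In the toy data, "people" is as frequent in the phrases as "mosquito" but is very common globally, so the global weight pushes it out of the top 25% of the cluster.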

Results:

Example Important Sentences: How do people catch the West Nile Virus?

1. west nile virus
2. it is spread by mosquitoes
3. transmission happens when a mosquito bites a bird infected with the west nile virus and the virus enters the mosquito bloodstream
4. most people infected with the west nile virus have no signs or symptoms
5. most people recover from west nile virus without treatment
6. to help control west nile virus eliminate standing water in your yard
7. about 20 percent of people develop a mild infection called west nile fever
8. some laboratory workers involved in west nile research have contracted the disease from infected animals
9. mosquitoes breed in pools of standing water
10. in rare cases it is possible for west nile virus to spread through other routes including
11. watch for sick or dying birds and report them to your local health department
12. west nile virus is common in areas such as africa west asia and the middle east
13. in the united states wild birds especially crows and jays are the main reservoir of west nile virus but the virus is actually spread by certain species of mosquitoes
14. your best bet for preventing the virus and other mosquito borne illnesses is to avoid exposure to mosquitoes and eliminate mosquito breeding sites
15. your overall risk of contracting west nile virus depends on these factors time of year
16. then the infected mosquito bites an animal or a human and the virus enters the host bloodstream where it may cause serious illness
17. even if you are infected your risk of developing a serious west nile virus related illness is extremely small

Results: Discounted Cumulative Gain with Eye Tracking Results
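
For readers unfamiliar with the measure, discounted cumulative gain (DCG) rewards a ranking for placing high-gain items near the top, dividing the gain at rank i by log2(i + 1). The sketch below uses that standard formula. Treating per-paragraph gaze counts as the gain values is an assumption made for illustration, since the slide does not say how gains were derived from the eye-tracking data, and the numbers are made up.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of gains listed in ranked order."""
    return sum(gain / math.log2(rank + 1)
               for rank, gain in enumerate(gains, start=1))


def ndcg(system_order, gain_by_paragraph):
    """DCG of the system ranking divided by the best achievable DCG."""
    gains = [gain_by_paragraph.get(p, 0.0) for p in system_order]
    ideal = sorted(gain_by_paragraph.values(), reverse=True)[:len(system_order)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg else 0.0


# Hypothetical gaze-point totals per paragraph (not data from the study).
gaze_gain = {"p1": 3, "p2": 2, "p3": 1}
perfect = ndcg(["p1", "p2", "p3"], gaze_gain)
reversed_score = ndcg(["p3", "p2", "p1"], gaze_gain)
```

A ranking that matches the order of the gaze data scores 1.0 after normalization, and any other order scores lower.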

Current Work on Skimming

Incorporating physical attributes in assessment of paragraphs

Developing a user interface in conjunction with potential users
◦ Important that it provide access to information like what a visual reader gets
◦ Read important sentences with indication of paragraph?
◦ Read word clusters with indication of paragraph?
◦ Allow user to stop and start skimming with a keypress

Thank You!

Questions?