Empowering users to access information in the Digital Library Corin Anderson University of Washington




Empowering users

• DLs provide information to users

• Tricky: Not all users will be programmers
  – Non-programmer Web surfer
  – 5th grade student
  – Your grandmother

• How to cater to non-programming masses?


The Digital Library

• Many specialized DLs exist today
  – Medicine, literature, etc.

• Eventually, all DLs will be integrated: The DL

• Until then, use the Web as approximation


DL research at the UW

• Improving web search using popularity

• Automatic question answering

• Extracting information by demonstration

• Adaptive web sites


Popularity-based web search

• Want authoritative pages
  www.dodgeviper.com vs. www.homepages.com/~joe/my-car.html

• Approximate authority by freq. of web visits
  – Data gathered from NCSA web proxies

• Rank query results based on popularity
  1,000 hits  www.dodgeviper.com
  3 hits      www.homepages.com/~joe/my-car.html
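A minimal sketch of the ranking step above, assuming visit counts have already been tallied from proxy logs. The function name `rank_by_popularity` and the dictionary layout are illustrative, not taken from the UW system; the counts are the two shown on the slide.

```python
def rank_by_popularity(results, visit_counts):
    """Order matching URLs by observed visit count, most visited first.

    URLs never seen in the logs count as zero visits. Python's sort is
    stable, so ties keep the search engine's original order.
    """
    return sorted(results, key=lambda url: visit_counts.get(url, 0), reverse=True)


# Counts from the slide; in practice they would be tallied from proxy logs.
visit_counts = {
    "www.dodgeviper.com": 1000,
    "www.homepages.com/~joe/my-car.html": 3,
}
results = ["www.homepages.com/~joe/my-car.html", "www.dodgeviper.com"]
ranked = rank_by_popularity(results, visit_counts)
print(ranked)
```

Raw hit counts are only a proxy for authority; a real system would likely combine them with the text-match score rather than sort on popularity alone.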


Automatic question answering

• Return answers, not web pages, to queries
  “What’s the tallest mountain in the world?”
  www.mountainweb.com/mountaineering/worldmtns.htm vs. “Mount Everest”

• Search web for pages that contain answers
  – “the tallest mountain is”

• Heuristics for yes/no, “which is” questions
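One way to picture the phrase-search idea above: rewrite the question as the start of a sentence that would state the answer, then look for that phrase in page text. This is a toy sketch, not the UW system; `question_to_phrase`, `extract_answer`, and the sample sentence are made up for illustration, and only the "What's the X?" question shape is handled.

```python
import re


def question_to_phrase(question):
    """Rewrite "What's the X?" or "What is the X?" as the answer prefix "the X is"."""
    m = re.match(r"what(?:'s|’s| is) (the .+?)\??$", question.strip(), re.IGNORECASE)
    return f"{m.group(1)} is" if m else None


def extract_answer(phrase, text):
    """Return the run of capitalized words that directly follows the phrase, if any."""
    # The phrase is matched case-insensitively; the answer must be capitalized.
    pattern = "(?i:" + re.escape(phrase) + r")\s+([A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"
    m = re.search(pattern, text)
    return m.group(1) if m else None


phrase = question_to_phrase("What’s the tallest mountain in the world?")
# Hypothetical page text standing in for a fetched search result.
page_text = "Everyone knows the tallest mountain in the world is Mount Everest, at 8,848 m."
answer = extract_answer(phrase, page_text)
print(phrase, "->", answer)
```

Treating "a run of capitalized words" as the answer is a crude assumption that happens to suit proper-noun answers; yes/no and "which is" questions would need the separate heuristics the slide mentions.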


Information extraction

• Info from DL usually used elsewhere
  – Query stock history to build a graph in Excel
  – Email a list of current movies to a friend

• Extracting info is tricky
  – Special file formats (XML, .csv) arcane
  – Custom built wrappers to select tuples
  – Building wrappers isn’t easy, either!

• Solution: demonstrate a wrapper


ICE-9 Wrapper generation

• User demonstrates extracting info

• ICE-9 learns generalized program

• Demonstrate on very few instances
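To make "learning a generalized program from very few demonstrations" concrete, here is a deliberately simple stand-in: a delimiter-based wrapper learned from two selections. It is not ICE-9's algorithm (the slides do not give it); the function names and the movie-listing HTML are invented for illustration.

```python
import os


def common_suffix(strings):
    """Longest string that every input ends with."""
    return os.path.commonprefix([s[::-1] for s in strings])[::-1]


def learn_wrapper(examples):
    """Learn (left, right) delimiters from (text, selected) demonstrations.

    left  = longest common suffix of the text before each selection
    right = longest common prefix of the text after each selection
    """
    lefts, rights = [], []
    for text, selected in examples:
        start = text.index(selected)
        lefts.append(text[:start])
        rights.append(text[start + len(selected):])
    left, right = common_suffix(lefts), os.path.commonprefix(rights)
    if not left or not right:
        raise ValueError("demonstrations share no usable delimiters")
    return left, right


def apply_wrapper(wrapper, text):
    """Extract the span between the learned delimiters in unseen text."""
    left, right = wrapper
    start = text.index(left) + len(left)
    return text[start:text.index(right, start)]


# Two demonstrations on made-up movie listings.
wrapper = learn_wrapper([
    ("<li><b>Gladiator</b> 7:30pm</li>", "Gladiator"),
    ("<li><b>Chicken Run</b> 5:15pm</li>", "Chicken Run"),
])
extracted = apply_wrapper(wrapper, "<li><b>Shaft</b> 9:45pm</li>")
print(wrapper, extracted)
```

Two examples suffice here only because the listings share identical markup; the sketch fails loudly when the demonstrations have no common delimiters rather than guessing.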


ICE-9 in action

• User demonstrated two instances

• ICE-9 steps through program correctly


ICE-9 – current work

• Collaborative demonstration
  – ICE-9 predicts each step, asks for confirmation
  – If can’t predict with confidence, just ask user

• Active learning
  – ICE-9 suggests which example the user should demonstrate
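The predict-or-ask loop above can be sketched as a vote among the candidate programs still consistent with what the user has shown. Everything here is an assumption for illustration: hypotheses are plain functions from the current row to the next row, each gets one equal vote, and the 0.8 threshold is arbitrary. ICE-9's actual representation is not described on these slides.

```python
from collections import Counter


def predict_step(hypotheses, state, threshold=0.8):
    """Vote among the surviving hypotheses on the next step.

    Returns (action, confidence) when the most popular action reaches the
    threshold, or (None, confidence) to signal "ask the user".
    """
    votes = Counter(h(state) for h in hypotheses)
    action, count = votes.most_common(1)[0]
    confidence = count / len(hypotheses)
    return (action if confidence >= threshold else None), confidence


def update(hypotheses, state, observed_action):
    """Keep only hypotheses that agree with what the user actually did."""
    return [h for h in hypotheses if h(state) == observed_action]


# Three hypothetical "which row comes next?" programs.
hypotheses = [lambda row: row + 1, lambda row: row + 2, lambda row: row * 2]

first_guess = predict_step(hypotheses, 1)      # 2 of 3 agree: below threshold, ask
hypotheses = update(hypotheses, 1, 2)          # user moves from row 1 to row 2
hypotheses = update(hypotheses, 2, 3)          # user moves from row 2 to row 3
final_guess = predict_step(hypotheses, 3)      # one hypothesis left: fully confident
print(first_guess, final_guess)
```

Each demonstration prunes the hypothesis set, so confidence rises as instances accumulate. An active learner could go one step further and request the example on which the surviving hypotheses disagree most.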


Adaptive web sites

• Different users have different goals
  – But traditional web sites treat everyone the same
  – Everyone sees the same start page, query page, etc.

• Personalized sites can be customized
  – Customization is manual, tedious

• Want a site to learn users’ interest
  – Based on observed behavior, similarity to others
  – Adapt to individuals accordingly


Adaptations – structural

• Add link, remove link

• Add page (index page synthesis)


Adaptations – presentational

• Highlight link, content


From users to adaptations

• Users are clustered to find related visitors

• Models are fit to clusters to predict behavior

• Adaptation space is searched for best changes
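The three steps above (cluster, model, search) can be sketched end to end on toy data. The choices below are assumptions made for illustration, not the method behind the slides: users are sets of visited pages, clustering is a greedy pass on Jaccard similarity, the model is per-page visit frequency, and the adaptation space contains only "add one link to the start page". User names and page sets are invented.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap between two page sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_users(visits, min_similarity=0.5):
    """Greedy single pass: join the first cluster whose seed user is similar enough."""
    clusters = []  # each cluster is a list of user names; the first is the seed
    for user, pages in visits.items():
        for cluster in clusters:
            if jaccard(pages, visits[cluster[0]]) >= min_similarity:
                cluster.append(user)
                break
        else:
            clusters.append([user])
    return clusters


def fit_model(cluster, visits):
    """Model a cluster as the fraction of its users who visited each page."""
    counts = Counter(page for user in cluster for page in visits[user])
    return {page: n / len(cluster) for page, n in counts.items()}


def best_link_to_add(model, existing_links):
    """Search the (tiny) adaptation space for the most useful new link."""
    candidates = {p: w for p, w in model.items() if p not in existing_links}
    return max(candidates, key=candidates.get) if candidates else None


visits = {
    "ann": {"/", "/edu/", "/edu/courses/", "/574"},
    "bob": {"/", "/edu/", "/574"},
    "cat": {"/", "/info/current/"},
}
clusters = cluster_users(visits)
model = fit_model(clusters[0], visits)
suggestion = best_link_to_add(model, existing_links={"/", "/edu/", "/info/current/"})
print(clusters, suggestion)
```

For the first cluster the suggestion is a direct link to `/574`, the page both members reach but the start page does not link to.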


AWS – current work

Building, clustering user models

• Hierarchical user clustering
  – Users are leaf nodes, related groups interior
  – Influence of parent nodes decreases with distance

• Selecting adaptations from models
  – Choosing structural changes
  – Defining, selecting presentational changes
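A small sketch of "influence of parent nodes decreases with distance", assuming a geometric decay: the node at distance d above the user gets weight `decay**d`. The decay rule, the `Node` class, the group names, and the probabilities are all hypothetical; the slides do not say how the weighting is actually done.

```python
class Node:
    """A user (leaf) or a group of related users (interior node)."""

    def __init__(self, name, page_probs, parent=None):
        self.name = name
        self.page_probs = page_probs  # page -> estimated visit probability
        self.parent = parent


def blended_interest(leaf, page, decay=0.5):
    """Weighted average of the estimates along the path from leaf to root.

    The node at distance d from the leaf gets weight decay**d, so nearer
    ancestors count for more than distant ones.
    """
    total = weight_sum = 0.0
    node, weight = leaf, 1.0
    while node is not None:
        total += weight * node.page_probs.get(page, 0.0)
        weight_sum += weight
        node, weight = node.parent, weight * decay
    return total / weight_sum


# Hypothetical three-level hierarchy.
everyone = Node("all visitors", {"/574": 0.1})
group = Node("course regulars", {"/574": 0.6}, parent=everyone)
user = Node("some-pc.cs", {"/574": 1.0}, parent=group)

interest = blended_interest(user, "/574")  # (1.0 + 0.5*0.6 + 0.25*0.1) / 1.75
print(round(interest, 3))
```

Blending along the path lets a new user with little history borrow estimates from the groups above, while a well-observed user is dominated by their own behavior.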


Summary

• Successful DLs cater to their users

• UW research concentrating on connecting users with information

• Look for us at IUI, KDD, ICML, AAAI, IJCAI, and elsewhere


ICE-9 in action

• ICE-9 learns from subsequent instances
  – Probabilities now 100%


ICE-9 – Version space algebra


Selecting adaptations

• Cluster models analyzed to determine interests

“The user has an interest in the page”

“The user visits the page by starting at the page – add a link between the two.”


User’s computer     Date and time of visit    Requested page     Referring page
some-pc.cs          22/Feb/2000 11:49:13      /                  -
some-pc.cs          22/Feb/2000 11:49:23      /edu/              http://www.cs/
some-pc.cs          22/Feb/2000 11:49:34      /edu/courses/      http://www.cs/edu/
some-other-pc.cs    22/Feb/2000 11:49:55      /                  -
some-pc.cs          22/Feb/2000 11:50:08      /574               http://www.cs/edu/courses/
some-other-pc.cs    22/Feb/2000 11:50:20      /info/current/     http://www.cs/
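The access-log rows above can be grouped into one trail per visiting computer, which is the raw material for user models. This sketch assumes whitespace-separated fields in the order shown on the slide; `parse_log` and `shortcut_candidates` are illustrative names, and flagging any trail of three or more clicks as a shortcut candidate is an arbitrary rule chosen for the example.

```python
from collections import defaultdict
from datetime import datetime

LOG = """\
some-pc.cs 22/Feb/2000 11:49:13 / -
some-pc.cs 22/Feb/2000 11:49:23 /edu/ http://www.cs/
some-pc.cs 22/Feb/2000 11:49:34 /edu/courses/ http://www.cs/edu/
some-other-pc.cs 22/Feb/2000 11:49:55 / -
some-pc.cs 22/Feb/2000 11:50:08 /574 http://www.cs/edu/courses/
some-other-pc.cs 22/Feb/2000 11:50:20 /info/current/ http://www.cs/
"""


def parse_log(text):
    """Group log rows by host as (timestamp, page, referrer) tuples."""
    trails = defaultdict(list)
    for line in text.strip().splitlines():
        host, date, time, page, referrer = line.split()
        when = datetime.strptime(f"{date} {time}", "%d/%b/%Y %H:%M:%S")
        trails[host].append((when, page, None if referrer == "-" else referrer))
    return trails


def shortcut_candidates(trails, min_clicks=3):
    """Pair the first and last page of every trail with at least min_clicks clicks."""
    candidates = []
    for visits in trails.values():
        pages = [page for _, page, _ in sorted(visits, key=lambda v: v[0])]
        if len(pages) - 1 >= min_clicks:
            candidates.append((pages[0], pages[-1]))
    return candidates


trails = parse_log(LOG)
print(shortcut_candidates(trails))
```

On the slide's data, `some-pc.cs` takes three clicks to get from `/` to `/574`, so that pair is the one candidate for a direct link.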
