Navigational Models
Markus Strohmaier, Denis Helic
Multimediale Informationssysteme II
Knowledge Management Institute, Markus Strohmaier 2012

Source: kti.tugraz.at/staff/denis/courses/mmis2/material/navigation_models.pdf

Overview

Next 3 lectures (held by me):
• Models of user navigation
• Search query log analysis
• Recommendation algorithms

The Memex (1945)

• A mechanized private library for individual use
• Mimics associative memory where users can
  – insert documents
  – navigate documents
  – retrieve documents
  – build trails through documents
• Operated and maintained individually
• But trails can be shared socially, e.g.
  (i) a user A can send a trail to user B
  (ii) user B modifies and shares it with user C
  (iii) user C uses the trail for navigation

[Bush 1945] V. Bush. As We May Think. Atlantic Monthly, 1945.

[Figure: The Memex [Bush 1945]. Users A, B, and C are connected by steps (i), (ii), and (iii); C’s interaction with documents is mediated by users A and B.]

Web based Retrieval: Challenges

The Web is self-organized
• No central authority (for the WWW) or main index
• Everyone can add (even edit) pages
• Pages disappear on a regular basis
  – A US study claimed that in 2 investigated tech. journals 50% of the cited links were inaccessible after four years.
• Lots of errors and falsehood, no quality control

Web based Retrieval: Challenges

The Web is hyperlinked
• Based on HTML Markup tags and URIs
• Pages are interconnected
  – Unidirectional links (in-link, out-link, self-link)
• Network structures emerge from the links
  – Link analysis is possible

The World Wide Web (1990-2000)

A user’s interaction with the web is mediated by (a few) editors and publishers

The World Wide Web Today (2010)

Interaction between individuals and computational systems is mediated by the aggregate behavior of massive numbers (millions) of users.

Information Foraging Theory

Slides based on
• Information Foraging Theory for Control Room Resilience, Ronald Laurids Boring, PhD
• Information Search (Shneiderman and Plaisant, Ch. 13) from http://wps.aw.com/aw_shneider_dtui_13

Cost of Knowledge, Search, Cognition, and Computers

Information systems (computers) and the “cost” of acquiring knowledge
– A first principle of information system design
– “Cognitive information ergonomics”
  • Efficiency/productivity gain/usability/…
– “Economics of cognition and the cognitive cost of knowledge”

There is (and has always been) a cost to acquire information/knowledge
– cost = user/worker time +, e.g., machine cost, db access charge, book

Many studies fail to document increased profit directly from the implementation of a (single) information system
– However, there is no doubt that worker productivity increased dramatically in the late 20th century
– Productivity was greatly enhanced by the pervasive use of electronic information systems (computers)

Informavores and Information Foraging

That the human quest for information is innate and adaptive is well known

Humans are informavores
– George Miller, 1983, “… magic number 7 ± 2”
– Organisms that hunger for information about the world and themselves

“A wealth of information creates a poverty of attention and a need to allocate it efficiently”
– Herb Simon, AI, Nobel prize, economics, cognition

Consider the analogy between acquiring knowledge and animals seeking food
– Pirolli, P. and S. Card (1995). Information Foraging in Information Access Environments, in CHI '95, p. 518
– Pirolli, P. (2007) ….. Book …..

Information overload vs. diet

Information Foraging Theory (IFT)

– Pirolli and Card, Xerox PARC
– “an approach to the analysis of human activities involving information access technologies”
– Derives from optimal foraging theory in biology and anthropology
  • Analyzes adaptive value of food-foraging strategies

Analyzes trade-offs between the value of information gained and the costs of performing an activity in human-computer interaction tasks
– Models and analysis techniques are needed to determine the value added by information access, manipulation, and presentation techniques

The real information system design problem is not how to collect more information, but how to optimize the user’s time
– Increase relevant information gained per unit time expended

IFT provides a relatively “formal” (quantitative) account

IFT – Time Scales

Considers “adaptiveness of human-system designs in the context of the information ecologies in which tasks are performed”
– Ecology understood as a system, here of information

Time scales of information seeking and sense making activities
– Cognitive band (~100 ms – 10 s)
– Rational band (minutes to hours)
– Social band (days to months)

We have seen much of the cognitive band; now the others

Time scales of analysis

Time scale (s)   Psychological domain
10-1000          Problem solving, decision making
1-100            Visual search, motor behavior
0.1-1            Visual attention, perceptual judgment

User Interface Domain
[Figure: search-result entry for “Pete Pirolli's Home Page” (www.parc.xerox.com/istl/members/pirolli/pirolli.html), page updated December 18, 2000.]

IFT – An Ecological Perspective

Time scales of information seeking and sense making activities
– Cognitive band (~100 ms – 10 s)
– Rational band (minutes to hours)
– Social band (days to months)

As the time scale increases, there is less regard for how internal processing accomplishes the linking of actions to goals

Assumes behavior is governed by “rational principles and shaped by constraints and affordances of the task environment”

An ecological perspective, i.e., behavior is “adaptive” in that it accomplishes some goal

IFT – Metaphor and Quantitative

Information Foraging Theory
– The name is both a metaphor and a straightforward use of biological “optimal foraging theory”

Metaphor:
– Animals adapt behavior and structure through evolution
  • (humans don’t have to wait that long!)
– Animals adapt to increase their rate of energy intake, etc.
  • To do this they evolve different methods
  • E.g., wolves hunt prey, spiders build webs and wait

And there are analogies to this
– E.g., hunting = active information seeking, waiting = information filtering
– Humans (and others) hunt in groups when the variance of food is high
  • Accept a lower expected mean to minimize the probability of days without food
– Also, on the social time scale, sharing of information

Optimal Foraging Theory - Biology

Developed in biology for understanding opportunities and forces of adaptation
– Elements of the theory can help in understanding existing human adaptations for gaining and making sense of information
– Also aids in task analysis for creating new interactive information system designs

Optimality models include
– Decision assumptions
  • Which of the problems faced by an agent are to be analyzed
  • E.g., whether to pursue a particular type of information (or prey) when encountered, how long to spend
– Currency assumptions
  • How choices are to be evaluated, e.g., information value (food value)
– Constraint assumptions
  • Limit and define relationships among decision and currency variables
  • E.g., from task structure, interface technology, user knowledge

Information Foraging Theory

Information foraging is usually a task embedded in the context of some other task
– Value and cost structure are defined in relation to the embedding task
– Value of external information may lie in improvements to the outcomes of the embedding task

Usually, the embedding task is some ill-structured problem
– Additional knowledge is needed to better define goals, available actions, heuristics, etc.
– E.g., choosing a graduate school, developing a business strategy

Using an optimality model does not imply that human behavior is classically rational
– I.e., that humans have perfect information and infinite computational resources
– Rather, humans exhibit bounded rationality, or make choices based on satisficing

IFT – Information Patch Model

Information patch model, from optimal foraging theory

Rate of currency intake, R = U / (Ts + Th)
– U = net amount of currency gained
– Ts = time spent searching
– Th = time spent exploiting

Net currency gain, U = Uf - Cf
– Uf = overall currency intake (gross amount foraged)
– Cf = currency expended in foraging

Average rate of currency intake, u = Uf / (λTs)
– Assumes information workers/foragers/consumers encounter information as a linear function of time
– Total n items encountered = λTs, where λ is the rate of encounter with items
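As a rough illustration, the quantities defined on this slide can be computed directly. The sketch below is in Python; the function names and the sample numbers are hypothetical and do not come from the slides.

```python
def net_gain(gross_gain, foraging_cost):
    """U = Uf - Cf: currency gained minus currency expended in foraging."""
    return gross_gain - foraging_cost


def rate_of_intake(gross_gain, foraging_cost, search_time, handling_time):
    """R = U / (Ts + Th): net currency gained per unit of total time."""
    return net_gain(gross_gain, foraging_cost) / (search_time + handling_time)


def items_encountered(encounter_rate, search_time):
    """n = lambda * Ts, assuming items are encountered linearly in time."""
    return encounter_rate * search_time


def intake_per_item(gross_gain, encounter_rate, search_time):
    """u = Uf / (lambda * Ts): gross intake averaged over encountered items."""
    return gross_gain / items_encountered(encounter_rate, search_time)


# Hypothetical session: 120 units of information found, 20 units of effort,
# 30 minutes searching, 20 minutes exploiting, one item every 2 minutes.
R = rate_of_intake(gross_gain=120, foraging_cost=20, search_time=30, handling_time=20)
u = intake_per_item(gross_gain=120, encounter_rate=0.5, search_time=30)
print(R, u)
```

Keeping U, R, and u as separate functions mirrors the slide's definitions, so each formula can be checked on its own.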

IFT – Information Patch Model

Average cost of handling items, h: [equation not preserved]

Let s = search cost per unit time; then total cost of search = sTs

Then, substituting into the equation for R, rate of currency intake: [equation not preserved]

So R can be expressed in terms of
– Average rate of currency intake, u
– Search cost per unit time, s
– Cost of handling items, h
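One consistent way to carry out this substitution is sketched below. It rests on two assumptions that the slide does not state explicitly: h is measured as handling time per item, so that Th = hλTs, and the only currency expended is the search cost, so that Cf = sTs. Under those assumptions the result takes the familiar form used in optimal foraging theory.

```latex
\begin{align}
U_f &= u\,\lambda T_s, \qquad C_f = s\,T_s, \qquad T_h = h\,\lambda T_s \\
R &= \frac{U}{T_s + T_h}
   = \frac{U_f - C_f}{T_s + T_h}
   = \frac{u\,\lambda T_s - s\,T_s}{T_s + h\,\lambda T_s}
   = \frac{\lambda u - s}{1 + \lambda h}
\end{align}
```

The search time Ts cancels, so under these assumptions R depends only on λ, u, s, and h.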

Optimal Foraging Time in a Patch

• gi(t), cumulative gain function
  – Amount of information gained in time t
  – gA(t) = random order of encounter
    • Increase in information equal for all elements
    • Hence, constant slope
  – gB(t) and gC(t) = ordered by relevancy
    • “Relevant” items, those with higher information content, encountered earlier
    • Hence, highest rate of information increase earlier, and rate decreases

[Figure: cumulative gain curves. Labels: λp, rate of encounter with relevant items; x-axis, travel time between patches; RB and RC, rate of return; tC and tB, optimal foraging time.]
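The idea of an optimal foraging time can be sketched numerically. The Python example below assumes the rate of return for staying t time units in a patch is g(t) / (travel time + t), and uses a hypothetical diminishing gain function g(t) = 1 - exp(-t) as a stand-in for a relevance-ordered curve such as gB(t). Neither the formula nor the gain function is taken from the slide, and the optimum is found by a plain grid search.

```python
import math


def rate_of_return(gain, time_in_patch, travel_time):
    """Information gained per unit time, counting the travel between patches."""
    return gain(time_in_patch) / (travel_time + time_in_patch)


def optimal_patch_time(gain, travel_time, t_max=20.0, steps=20000):
    """Grid search for the time in patch that maximizes the rate of return."""
    best_t, best_rate = 0.0, 0.0
    for i in range(1, steps + 1):
        t = t_max * i / steps
        rate = rate_of_return(gain, t, travel_time)
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t, best_rate


def diminishing_gain(t):
    """Hypothetical relevance-ordered patch: most of the gain comes early."""
    return 1.0 - math.exp(-t)


t_near, rate_near = optimal_patch_time(diminishing_gain, travel_time=1.0)
t_far, rate_far = optimal_patch_time(diminishing_gain, travel_time=4.0)
print(round(t_near, 2), round(t_far, 2))
```

In this sketch, longer travel between patches makes it worthwhile to stay longer in each patch, while the achievable rate of return drops.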

IFT - Cost of Knowledge

Foraging efficiency
– Animals minimize energy expenditure to get the required gain in sustenance
– Humans minimize effort to get the necessary gain in information

Again, foraging for food has much in common with seeking information
– Like edible plants in the wild, useful information items are often grouped together, but separated by long distances in an “information wasteland”

Also, information “scent”
– Like the scent of food, information in the current environment that will assist in finding more information clusters

Activities are analyzed according to the value gained and the cost incurred
– Resource costs
  • Expenditures of time and cognitive effort incurred
– Opportunity costs
  • Benefits that could be gained by engaging in other activities
  • E.g., if not gaining information about visualization, could be gaining information about software design

IFT

Information processing systems evolve so as to maximize the gain of valuable information per unit cost
– Sensory systems (vision, hearing)
– Information access (card catalogs, offices)

$\text{maximize} \left( \dfrac{\text{information value}}{\text{cost of interaction}} \right)$

Navigating Networks

How can we model user navigation on networks?

Experiment [Milgram]

Goal
• Define a single target person and a group of starting persons
• Generate an acquaintance chain from each starter to the target

Experimental Set Up
• Each starter received a document and was asked to begin moving it by mail toward the target
• Information about the target: name, address, occupation, company, college, year of graduation, wife’s name and hometown
• Information about relationship (friend/acquaintance) [Granovetter 1973]

Constraints
• The starter group was only allowed to send the document to people they know
• Starters were urged to choose the next recipient so as to advance the progress of the document toward the target

Stanley Milgram (1933-1984)

Introduction

The simplest way of formulating the small-world problem is: Starting with any two people in the world, what is the likelihood that they will know each other? A somewhat more sophisticated formulation, however, takes account of the fact that while person X and Z may not know each other directly, they may share a mutual acquaintance - that is, a person who knows both of them. One can then think of an acquaintance chain with X knowing Y and Y knowing Z. Moreover, one can imagine circumstances in which X is linked to Z not by a single link, but by a series of links, X-A-B-C-D…Y-Z. That is to say, person X knows person A who in turn knows person B, who knows C… who knows Y, who knows Z.

[Milgram 1967, according to http://www.ils.unc.edu/dpr/port/socialnetworking/theory_paper.html#2]

An Experimental Study of the Small World Problem [Travers and Milgram 1969]

A social network experiment tailored towards
• Demonstrating
• Defining
• And measuring
inter-connectedness in a large society (USA)

A test of the modern idea of “six degrees of separation”, which states that every person on earth is connected to any other person through a chain of acquaintances not longer than 6

Results I

• How many of the starters would be able to establish contact with the target?
  – 64 out of 296 reached the target (about 22%)
• How many intermediaries would be required to link starters with the target?
  – Well, that depends: the overall mean was 5.2 links
  – Through hometown: 6.1 links
  – Through business: 4.6 links
  – Boston group faster than Nebraska groups
  – Nebraska stockholders not faster than Nebraska random
• What form would the distribution of chain lengths take?

Results III

• Common paths
• Also see: Gladwell’s “Law of the few”

Follow up work (2008)

http://arxiv.org/PS_cache/arxiv/pdf/0803/0803.0939v1.pdf
– Horvitz and Leskovec study 2008
– 30 billion conversations among 240 million people of Microsoft Messenger
– Communication graph with 180 million nodes and 1.3 billion undirected edges
– Largest social network constructed and analyzed to date (2008)

Decentralized Search

Goal: Navigate from START to TARGET using local and background knowledge only

[Figure: a (tag-tag) network with start, target, and candidate nodes, shown next to the background knowledge (a tag hierarchy) taken from Folksonomy 1, ..., Folksonomy n. The shortest path to the target is marked.]
– Shortest path with global knowledge: pGK = 3
– Shortest path found with local knowledge: pLK = 4
– Δ = pLK - pGK

Idea: use folksonomies as background knowledge. Then the performance of decentralized search depends on the suitability of folksonomies. In other words, we can evaluate the suitability of folksonomies for decentralized search through simulations.

J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999).
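A minimal sketch of such a simulation, in Python, may help make the idea concrete. It assumes a greedy agent that, at each step, moves to the unvisited neighbor closest to the target in the tag hierarchy. This selection rule, the tag names, and all function names are illustrative assumptions, not the setup used in the lecture.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) links."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path_length(adj, start, target):
    """p_GK: breadth-first search over the whole network (global knowledge)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target not reachable


def hierarchy_distance(parent, a, b):
    """Edges between a and b in a rooted tree given as child -> parent."""
    depth = {}
    node, d = a, 0
    while node is not None:  # record every ancestor of a with its distance
        depth[node] = d
        node, d = parent[node], d + 1
    node, d = b, 0
    while node not in depth:  # climb from b to the first common ancestor
        node, d = parent[node], d + 1
    return d + depth[node]


def decentralized_search(adj, parent, start, target, max_hops):
    """p_LK: greedy walk using only neighbors (local) and the hierarchy (background)."""
    current, visited, hops = start, {start}, 0
    while current != target:
        candidates = sorted(n for n in adj[current] if n not in visited)
        if hops == max_hops or not candidates:
            return None  # gave up: hop budget used up or dead end
        current = min(candidates, key=lambda n: hierarchy_distance(parent, n, target))
        visited.add(current)
        hops += 1
    return hops


# Hypothetical tag hierarchy (background knowledge) and tag-tag network.
parent = {"media": None, "photo": "media", "music": "media",
          "nikon": "photo", "canon": "photo", "jazz": "music"}
adj = build_adjacency([("nikon", "photo"), ("photo", "media"), ("media", "music"),
                       ("music", "jazz"), ("nikon", "canon"), ("canon", "jazz")])

p_gk = shortest_path_length(adj, "nikon", "jazz")
p_lk = decentralized_search(adj, parent, "nikon", "jazz", max_hops=10)
stretch = p_lk - p_gk
print(p_gk, p_lk, stretch)
```

In this toy network the agent follows the hierarchy upward and misses the shortcut through "canon", so the path found with local knowledge is longer than the global shortest path.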

Evaluating Hierarchical Structures in Networks

How can we measure the efficiency of hierarchical structures for navigation?

The World Wide Web (1990-2000)

How efficient is this as a navigational aid?

Construction of hierarchies from unstructured tagging data

From tag centrality to tag generality [Heymann and Garcia-Molina 2006]:
– high tag centrality: more abstract
– low tag centrality: more specific

Other existing folksonomy algorithms: k-means, affinity propagation, …
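The step from centrality to generality can be illustrated with a simplified Python sketch. It is not the published algorithm: it assumes degree in the tag co-occurrence network as the centrality measure, raw co-occurrence counts as similarity, and a hypothetical `min_similarity` threshold below which a tag is attached to an artificial root.

```python
ROOT = "ROOT"  # artificial root node, an assumption of this sketch


def build_tag_hierarchy(cooccurrence, min_similarity=1):
    """Return child -> parent, placing more central tags closer to the root.

    cooccurrence maps each tag to a dict of neighboring tags and counts.
    """
    # Most central (highest degree) tags first; ties broken alphabetically.
    order = sorted(cooccurrence, key=lambda t: (-len(cooccurrence[t]), t))
    parent = {}
    for tag in order:
        best, best_sim = ROOT, 0
        for placed in parent:  # only tags already in the hierarchy
            sim = cooccurrence[tag].get(placed, 0)
            if sim > best_sim:
                best, best_sim = placed, sim
        # Attach under the most similar placed tag, or under the root.
        parent[tag] = best if best_sim >= min_similarity else ROOT
    return parent


# Hypothetical co-occurrence counts between tags.
cooccurrence = {
    "photo": {"nikon": 5, "canon": 4, "travel": 2, "lens": 1},
    "nikon": {"photo": 5, "canon": 1, "lens": 3},
    "canon": {"photo": 4, "nikon": 1},
    "lens": {"nikon": 3, "photo": 1},
    "travel": {"photo": 2},
}
hierarchy = build_tag_hierarchy(cooccurrence)
print(hierarchy)
```

Because tags are inserted in order of decreasing centrality, a tag can only become the child of a tag that is at least as central, which is how high centrality translates into "more abstract" in this sketch.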

Decentralized Search: Evaluation Framework

[Figure: evaluation framework with Folksonomy 1, ..., Folksonomy n, Simulation, and Click-Data as components.]
– Performance Evaluation: which folksonomy performs best on a given navigational task
– Explanatory Evaluation: which folksonomy explains actual user behavior best

Success Rates Across Different Folksonomies

[Figure: success rate versus max hops n on the flickr dataset, for a random folksonomy, k-means / affinity propagation, and tag generality approaches.]

• All approaches outperform a random folksonomy
• Tag generality approaches outperform k-means / affinity propagation

Success rate: the number of times an agent is successful in finding a path using a particular folksonomy as background knowledge

Max hops n: the maximal number of steps an agent is allowed to perform before stopping (a tunable parameter, e.g., an agent only follows n links)
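The two definitions above can be combined in a small Python sketch. It assumes the success rate is reported as a fraction of all simulated searches, and the list of hop counts is made up for illustration; `None` marks a search in which the agent never reached the target.

```python
def success_rate(hops_needed, max_hops):
    """Fraction of searches that reach the target within max_hops steps."""
    successes = sum(1 for h in hops_needed if h is not None and h <= max_hops)
    return successes / len(hops_needed)


# Hypothetical outcomes of five simulated searches with one folksonomy.
hops_needed = [2, 4, None, 7, 3]

for n in (2, 4, 10):
    print(n, success_rate(hops_needed, max_hops=n))
```

Raising the hop budget n can only keep the success rate the same or increase it, which is why n is a natural tunable parameter when comparing folksonomies.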

Success Rates Across Different Datasets

Holds for all datasets (to different extents)

But how efficient are those folksonomies during search?

Efficiency: how often does an agent not find the global shortest path, but some other path that is longer?

Stretch Δ = pLK - pGK

[Figure: shortest paths found with local knowledge; panel shown: Bibsonomy, K-Means.]
– Finds shortest possible path: Δ = 0
– Finds a path that is 1 hop longer: Δ = 1
– Finds no path: Δ = infinite

Tag generality approaches (d+e) find much shorter paths!

Holds for all datasets (to different extents)
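How a stretch distribution could be tabulated is sketched below in Python. The pairs of path lengths are invented for illustration, and a failed search is represented by `None` and mapped to an infinite stretch, following the convention on this slide.

```python
import math
from collections import Counter


def stretch(p_lk, p_gk):
    """Delta = p_LK - p_GK, or infinity when no path was found."""
    return math.inf if p_lk is None else p_lk - p_gk


def stretch_distribution(results):
    """Count how often each stretch value occurs over (p_LK, p_GK) pairs."""
    return Counter(stretch(p_lk, p_gk) for p_lk, p_gk in results)


# Hypothetical (p_LK, p_GK) pairs from four simulated searches.
results = [(3, 3), (4, 3), (None, 3), (5, 4)]
distribution = stretch_distribution(results)
print(distribution)
```

A folksonomy that supports navigation well would concentrate most of its mass at Δ = 0 and Δ = 1 in such a tabulation.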

Summary

• Decentralized search (Dsearch) as a natural model of user navigation on the web

•  Emergence of dynamic, user-generated links reduces control

•  Empirical studies and new algorithms are needed to recover important system properties

End of Presentation