Information Search (Shneiderman and Plaisant, Ch. 13) from http://wps.aw.com/aw_shneider_dtui_13


Page 1: Information Search (Shneiderman and Plaisant, Ch. 13)

Information Search (Shneiderman and Plaisant, Ch. 13)

from http://wps.aw.com/aw_shneider_dtui_13

Page 2: Information Search (Shneiderman and Plaisant, Ch. 13)

Overview

• Introduction
  – “Information search should be a joyous experience”

• Searching in Textual Documents

• Multimedia Document Searches

• Advanced Filtering and Search Interfaces

• Information Foraging
  – Some trees …

Page 3: Information Search (Shneiderman and Plaisant, Ch. 13)

Information Search

• Critical need to access information, as part of any task
  – always has been, always will be (ahbawb)
  – Cultural change, if not evolution, due to amount of information accessible by individual
• “Information overload” – ahbawb
  – What’s new is ubiquity due to massive e-access
• Old school “information retrieval” and “end user searching”
  – Gurus and cost
• Genuinely new …
  – Interest, due to market/user size
    • E.g., search engines can be profitable
  – Tools, e.g., visualization, due to Moore’s law

Page 4: Information Search (Shneiderman and Plaisant, Ch. 13)

Information Search - Words

• Old school
  – Information retrieval, database management
  – Bibliographic document systems, structured relational db – attributes
• New school
  – Information gathering, seeking, filtering, sensemaking, visual analytics
• CS focus
  – Data mining, data warehouses, data marts
• Toward future ends such as
  – Knowledge networks, semantic webs, …
• Range of search elements increases
  – Cf. Hearst, November 2011 CACM paper, “collaborative search” (on web site)

Page 5: Information Search (Shneiderman and Plaisant, Ch. 13)

Search Terminology

• Shneiderman’s taxonomy

• Task objects
  • E.g., movies for rent, are stored in structured relational databases, textual document libraries, or multimedia document libraries
• Structured relational database
  • Relations and a schema to describe the relations
  • Relations have items (usually called tuples or records), and each item has multiple attributes (often called fields), which each have attribute values
• Textual document library
  • Set of collections (typically up to a few hundred collections per library)
  • Descriptive attributes or metadata about the library
    • E.g., name, location, owner
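The relational terms above can be made concrete with a small sketch. It uses Python's built-in sqlite3 module; the `movies` relation, its attributes, and the sample tuples are hypothetical and were chosen only to echo the "movies for rent" example.

```python
import sqlite3


def build_movie_db():
    """Create an in-memory database with one relation (table) named movies."""
    conn = sqlite3.connect(":memory:")
    # The schema describes the relation: each attribute (field) has a name and a type.
    conn.execute("CREATE TABLE movies (title TEXT, year INTEGER, genre TEXT)")
    # Each row is one item (tuple/record) holding one value per attribute.
    # All titles below are made up for illustration.
    conn.executemany(
        "INSERT INTO movies VALUES (?, ?, ?)",
        [
            ("Night Train", 1998, "thriller"),
            ("Paper Moonlight", 2005, "drama"),
            ("Last Signal", 2010, "thriller"),
        ],
    )
    return conn


def find_movies(conn, genre, min_year):
    """Return titles whose attribute values satisfy a simple Boolean (AND) clause."""
    rows = conn.execute(
        "SELECT title FROM movies WHERE genre = ? AND year >= ? ORDER BY title",
        (genre, min_year),
    ).fetchall()
    return [row[0] for row in rows]
```

A call such as `find_movies(build_movie_db(), "thriller", 2000)` selects tuples by attribute value, which is the structured-database style of search that the later slides contrast with textual and multimedia search.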

Page 6: Information Search (Shneiderman and Plaisant, Ch. 13)

Search Terminology, 2

• Task actions are decomposed into browsing or searching

• Examples of task actions:

- Specific fact finding (known-item search)
  • Find the e-mail address of the President of the United States
- Extended fact finding
  • What other books are by the author of “Jurassic Park”?
- Exploration of availability
  • Is there new work on voice recognition in the ACM digital library?
- Open-ended browsing and problem analysis
  • Is there new research on fibromyalgia that might help my patient?

Page 7: Information Search (Shneiderman and Plaisant, Ch. 13)

Search Terminology, 3

• Once users have clarified their information needs, the first step towards satisfying those needs is deciding where to search

• Supplemental finding aids, e.g., tables of contents or indexes, can help users to clarify and pursue their information needs

• Additional preview and overview surrogates for items and collections can be created to facilitate browsing

Page 8: Information Search (Shneiderman and Plaisant, Ch. 13)

Searching Textual Documents

• As noted, recent dramatic changes

• Historically, Boolean clause search and SQL

• Other methods include:
  - Natural language queries
  - Form fill-in
  - Query by example (QBE)

• Evidence shows that users perform better and have higher satisfaction when they can view and control the search

Page 9: Information Search (Shneiderman and Plaisant, Ch. 13)

Ex., Library of Congress

• Aids to find bills, etc

• “Multiple paths to information items”

• (had a look, just for fun)

– Not bad


Page 13: Information Search (Shneiderman and Plaisant, Ch. 13)

Searching in Textual Documents and Database Querying

Page 14: Information Search (Shneiderman and Plaisant, Ch. 13)

Searching in Textual Documents and Database Querying, 2

A search for “user interface” powered by Endeca (http://www.lib.ncsu.edu) returns 144 results grouped into 10 pages. The menu at the upper right allows users to sort results by relevance or by date, while on the left a summary of the results organized by Subject, Genre, or Format provides an overview of the results and facilitates further refinement of the search.

Page 15: Information Search (Shneiderman and Plaisant, Ch. 13)

Framework for Textual Search

• Recall, task delineation for interface design
• Shneiderman suggests stages to consider in textual search
• Overview below, detail, next slide:

• Formulation: expressing the search

• Initiation of action: launching the search

• Review of results: reading messages and outcomes

• Refinement: formulating the next step

• Use: compiling or disseminating insight

Page 16: Information Search (Shneiderman and Plaisant, Ch. 13)

5 Stages of Textual Search - Detail

Page 17: Information Search (Shneiderman and Plaisant, Ch. 13)

Address auto-fill-in with Visual Cues (Shneiderman)

Page 18: Information Search (Shneiderman and Plaisant, Ch. 13)

Multimedia Document Searches

• “Multimedia” (non-textual) search is hard
• Quickly evolving area
• Interface issues essentially undefined
  • “Hum that tune”, “what did he/she/it look like”
• Types:
  • Image search
  • Map search
  • Design or diagram search
  • Sound search
  • Video search
  • Animation search

Page 19: Information Search (Shneiderman and Plaisant, Ch. 13)

Image Search

• Finding photos with images such as the Statue of Liberty is a challenge

• Query-by-Image-Content (QBIC) is difficult
  • Search by profile (shape of lady), distinctive features (torch), colors (green copper)
• Simple drawing tools to build templates or profiles to search with
• More success is attainable by searching restricted collections
  • Search a vase collection
  • Find a vase with a long neck by drawing a profile of it
• Critical searches such as fingerprint matching require a minimum of 20 distinct features
• For small collections, effective browsing and lightweight annotation are important

Page 20: Information Search (Shneiderman and Plaisant, Ch. 13)

Map Search

• On-line maps are plentiful
• Search by latitude/longitude is the structured-database solution
• Today's maps allow use of structured aspects and multiple layers
  – City, state, and site searches
  – Flight information searches
  – Weather information searches
  – Mapquest, Google Maps, etc.

• Mobile devices can allow “here” as a point of reference

Page 21: Information Search (Shneiderman and Plaisant, Ch. 13)

Other Multimedia Searches

• Design/Diagram Searches
  – Some computer-assisted design packages support search of designs
  – Allows searches of diagrams, blueprints, newspapers, etc., e.g., search for a red circle in a blue square or a piston in an engine
  – Document-structure recognition for searching newspapers
• Sound Search
• Video Search
  – Provide an overview
  – Segmentation into scenes and frames
  – Support multiple search methods
• Animation Search
  – Possible to search for specific animations like a spinning globe
  – Search for moving text on a black background

Page 22: Information Search (Shneiderman and Plaisant, Ch. 13)

Image Search

• Sketch or image to start

Page 23: Information Search (Shneiderman and Plaisant, Ch. 13)

Advanced Filtering & Search Interfaces

• Wide range of interface strategies and styles

• Filtering with complex Boolean queries
• Automatic filtering
• Dynamic queries
• Faceted metadata search
• Query by example
• Implicit search
• Collaborative filtering
• Multilingual searches
• Visual field specification

Page 24: Information Search (Shneiderman and Plaisant, Ch. 13)

Advanced Filtering and Search Interface Examples, 1

• Alternatives to form fill-in query interfaces:

• Filtering with complex Boolean queries
  • Problem with informal English, e.g., use of ‘and’ and ‘or’
  • Venn diagrams, decision tables, etc., have not worked for complex queries
• Dynamic Queries
  • “Direct manipulation” queries
  • Use sliders and other related controls to adjust the query
  • Get immediate (less than 100 msec) feedback with data
  • Dynamic HomeFinder and Blue Nile and (sort of) Realtor.com
  • Hard to update fast with large databases

Page 25: Information Search (Shneiderman and Plaisant, Ch. 13)

Dynamic Queries

• Diamond price, rating indicated using sliders, etc.
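A dynamic query can be thought of as re-running a cheap range filter every time a slider moves. The sketch below is a minimal illustration under that assumption; the `Diamond` record, its fields, and the catalog values are hypothetical and do not describe how Blue Nile or Dynamic HomeFinder is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diamond:
    carat: float
    price: int       # hypothetical price in dollars
    cut_rating: int  # hypothetical rating, 1 (poor) to 5 (ideal)


# Made-up catalog used only for illustration.
CATALOG = [
    Diamond(0.5, 1200, 3),
    Diamond(1.0, 4800, 4),
    Diamond(1.2, 6500, 5),
    Diamond(0.7, 2100, 2),
]


def dynamic_query(items, price_range, min_cut_rating):
    """Return the items that fall inside the current slider settings.

    price_range stands in for a double-ended slider (low, high);
    min_cut_rating stands in for a single-ended slider.
    """
    low, high = price_range
    return [
        d for d in items
        if low <= d.price <= high and d.cut_rating >= min_cut_rating
    ]
```

In an interface, each slider event would call `dynamic_query` again and redraw the result set. Keeping that loop under roughly 100 msec is what becomes hard with large databases, since a linear scan like this one would need indexing or precomputation.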

Page 26: Information Search (Shneiderman and Plaisant, Ch. 13)

Advanced Filtering and Search Interface Examples, 2

• Alternatives to form fill-in query interfaces:


• Query previews give users an overview of the data and its distribution, so that they can eliminate undesired items
• Faceted metadata search
  • Integrates category browsing with keyword searching
  • Flamenco

Page 27: Information Search (Shneiderman and Plaisant, Ch. 13)

Faceted Metadata

• Facets include media, location, date, themes
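Faceted metadata search combines a keyword match with per-facet counts that users can click to narrow the result set. The following sketch assumes a tiny in-memory collection; the items, the two facets shown (`media` and `location`), and the function names are hypothetical and are not taken from Flamenco.

```python
from collections import Counter

# Made-up collection; each item carries metadata values for two facets.
ITEMS = [
    {"title": "Harbor at dawn", "media": "photo", "location": "Lisbon"},
    {"title": "Harbor market", "media": "painting", "location": "Lisbon"},
    {"title": "Old harbor map", "media": "map", "location": "Genoa"},
    {"title": "Mountain road", "media": "photo", "location": "Genoa"},
]

FACETS = ("media", "location")


def faceted_search(items, keyword="", selected=None):
    """Return (hits, counts) for a keyword plus the currently selected facet values.

    selected maps a facet name to the value the user clicked, e.g. {"location": "Lisbon"}.
    counts maps each facet to a Counter of values among the hits, which the
    interface can display as an overview next to the result list.
    """
    selected = selected or {}
    hits = [
        item for item in items
        if keyword.lower() in item["title"].lower()
        and all(item[facet] == value for facet, value in selected.items())
    ]
    counts = {facet: Counter(item[facet] for item in hits) for facet in FACETS}
    return hits, counts
```

Calling `faceted_search(ITEMS, "harbor")` and then `faceted_search(ITEMS, "harbor", {"location": "Lisbon"})` mimics a user typing a keyword and then clicking a category, which is the integration of browsing and searching that the slide describes.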

Page 28: Information Search (Shneiderman and Plaisant, Ch. 13)

Advanced Filtering and Search Interface Examples, 3

• Collaborative Filtering
  – Groups of users combine evaluations to help in finding items in a large database
  – User "votes" and info used for rating the item of interest
    • E.g., a user who rates six restaurants highly is given a list of restaurants also rated highly by those who agree the six are good
• Multilingual searches
  – Current systems provide rudimentary translation searches
  – Prototypes of systems with specific dictionaries and more sophisticated translation
• Visual searches
  – Specialized visual representations of possible values, e.g., dates on a calendar or seats on a plane
  – On a map the location may be more important than the name
  – Implicit initiation and immediate feedback
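The collaborative filtering idea in the restaurant example can be sketched as follows: find other users who agree with the target user's highly rated items, then recommend what those users also rated highly. This is a deliberately simple set-overlap scheme chosen for illustration; the function name, the `min_overlap` parameter, and the data are hypothetical, and production recommenders use more elaborate similarity measures.

```python
from collections import Counter


def recommend(liked, others, min_overlap=1.0):
    """Recommend items liked by users who agree with the target user.

    liked:       set of items the target user rated highly
    others:      dict mapping user name -> set of items that user rated highly
    min_overlap: fraction of the target's liked items another user must share
                 to count as agreeing (1.0 means they agree on all of them)
    """
    if not liked:
        return []
    votes = Counter()
    for their_likes in others.values():
        overlap = len(liked & their_likes) / len(liked)
        if overlap >= min_overlap:
            # Each agreeing user "votes" for their other highly rated items.
            votes.update(their_likes - liked)
    # Most votes first; ties broken alphabetically for a stable order.
    return [item for item, _ in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))]
```

With `min_overlap=1.0` only users who agree on every liked item contribute, which matches the "those who agree the six are good" wording. Lowering it trades precision for coverage.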

Page 29: Information Search (Shneiderman and Plaisant, Ch. 13)

Tree Map of Products

(Shneiderman)

Using The Hive Group’s treemap (http://www.hivegroup.com/), users can review all waterproof binoculars in the catalog of Amazon.com products and browse the items in the list, grouped by manufacturer. Each box corresponds to a pair of binoculars, and the size of the box is proportional to its price. Green boxes are best-sellers. Users can filter the results using the dynamic query sliders on the right. Here all the binoculars with less than three user reviews have been filtered out, leaving only 61 binoculars to consider.

Page 31: Information Search (Shneiderman and Plaisant, Ch. 13)

Cost of Knowledge, Search, Cognition, and Computers

• Information systems (computers) and “cost” of acquiring knowledge
  – A first principle of information system design
  – “Cognitive information ergonomics”
• Efficiency/productivity gain/usability/…
  – “Economics of cognition and the cognitive cost of knowledge”
• There is (and has always been) a cost to acquire information / knowledge
  – cost = user/worker time + other costs, e.g., machine cost, db access charge, book
• Many studies fail to document increased profit directly from implementation of a (single) information system
  – However, no doubt that worker productivity in the late 20th century dramatically increased
  – Productivity greatly enhanced by pervasive use of electronic information systems (computers)

Page 32: Information Search (Shneiderman and Plaisant, Ch. 13)

Informavores and Information Foraging

• That human quest for information is innate and adaptive is well known

• Humans are informavores
  – George Miller, 1983, “… magic number 7 ± 2”
  – Organisms that hunger for information about the world and themselves

• “A wealth of information creates a poverty of attention and a need to allocate it efficiently”

– Herb Simon, AI, Nobel prize, economics, cognition

• Consider the analogy between acquiring knowledge and how animals seek food

– Pirolli, P. and S. Card (1995). Information Foraging in Information Access Environments, in CHI '95, pp. 51–58

– Pirolli, P. (2007) ….. Book …..

Page 33: Information Search (Shneiderman and Plaisant, Ch. 13)

Information Foraging Theory (IFT)

• Information Foraging Theory (IFT)
  – Pirolli and Card – Xerox PARC
  – “an approach to the analysis of human activities involving information access technologies”
  – Derives from optimal foraging theory in biology and anthropology
    • Analyzes adaptive value of food-foraging strategies

• Analyzes trade-offs in value of information gained against the costs of performing activity in human-computer interaction tasks

– And need models and analysis techniques to determine value added by information access, manipulation, and presentation techniques

• Real information system design problem is not how to collect more information, but how to optimize user’s time

– Increase relevant information gained per unit time expended

• IFT provides a relatively “formal” (quantitative) account

Page 34: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT – Time Scales

• Considers “adaptiveness of human-system designs in the context of the information ecologies in which tasks are performed”

– Ecology, as system, here, information

• Time scales of information seeking and sense making activities:

– Cognitive band (~100 ms – 10 s)
– Rational band (minutes to hours)
– Social band (days to months)

• Have seen much of cognitive, now others

Page 35: Information Search (Shneiderman and Plaisant, Ch. 13)

Time scales of analysis

• Time scale (s) and psychological domain:
  – 10–1000: problem solving, decision making
  – 1–100: visual search, motor behavior
  – 0.1–1: visual attention, perceptual judgment

[Figure: user interface domain, illustrated by a web search result for Pete Pirolli's home page]

Page 36: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT – An Ecological Perspective

• Time scales of information seeking and sense making activities
  – Cognitive band (~100 ms – 10 s)
  – Rational band (minutes to hours)
  – Social band (days to months)

• As time scale increases, less regard for how internal processing accomplishes linking of actions to goals

• Assumes behavior governed by “rational principles and shaped by constraints and affordances of the task environment”

• An ecological perspective, i.e., that behavior is “adaptive” in that it accomplishes some goal

Page 37: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT – Metaphor and Quantitative

• Information Foraging Theory
  – The name is both a metaphor and a straightforward use of biological “optimal foraging theory”
• Metaphor:
  – Animals adapt behavior and structure through evolution
    • (humans don’t have to wait that long!)
  – Animals adapt to increase their rate of energy intake, etc.
    • To do this they evolve different methods
    • E.g., wolf hunts prey, spiders build webs and wait
• And there are analogies to this
  – E.g., hunting = active information seeking, waiting = information filtering
  – Humans (and others) hunt in groups when variance of food is high
    • Accept lower expected mean to minimize probability of days without food
  – Also, on social time scale, sharing of information

Page 38: Information Search (Shneiderman and Plaisant, Ch. 13)

Optimal Foraging Theory - Biology

• Developed in biology for understanding opportunities and forces of adaptation

– P&C use elements of the theory to help in understanding existing human adaptations for gaining and making sense of information

– Also, aid in task analysis for creating new interactive information system designs

• Optimality models include:
  – Decision assumptions
    • Which of the problems faced by an agent are to be analyzed
      – E.g., whether to pursue a particular type of information (or prey) when encountered, how long to spend
  – Currency assumptions
    • How choices are to be evaluated, e.g., information value (food value)
  – Constraint assumptions
    • Limit and define relationships among decision and currency variables
      – E.g., from task structure, interface technology, user knowledge

Page 39: Information Search (Shneiderman and Plaisant, Ch. 13)

Information Foraging Theory

• Information foraging usually a task embedded in context of some other task

  – Value and cost structure defined in relation to the embedding task
  – Value of external information may be in improvements to outcomes of embedding task
• Usually, embedding task is some ill-structured problem
  – Additional knowledge is needed to better define goals, available actions, heuristics, etc.
  – E.g., choosing a graduate school, developing business strategy
• Use of an optimality model does not imply that human behavior is classically rational
  – I.e., that humans have perfect information and infinite computational resources
  – Rather, humans exhibit bounded rationality, or make choices based on satisficing

Page 40: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT – Information Patch Model

• Information patch model – from optimal foraging theory

• Rate of currency intake, R = U / (Ts + Th)
  – U = net amount of currency gained
  – Ts = time spent searching
  – Th = time spent exploiting
• Net currency gain, U = Uf - Cf
  – Uf = overall currency intake (gross amount foraged)
  – Cf = currency expended in foraging
• Average rate of currency intake u = Uf / Ts
  – Assume information workers/foragers/consumers encounter information as a linear function of time
  – Total n items encountered = λTs, where λ is the rate of encounter with items
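The definitions on this slide amount to a few lines of arithmetic. The sketch below simply transcribes them; the function and parameter names are my own, and the numbers in the usage note are invented units of "currency" and time.

```python
def net_gain(gross_intake, foraging_cost):
    """U = Uf - Cf: gross amount foraged minus currency expended in foraging."""
    return gross_intake - foraging_cost


def rate_of_gain(gross_intake, foraging_cost, search_time, handling_time):
    """R = U / (Ts + Th): net gain per unit of total time spent."""
    total_time = search_time + handling_time
    if total_time <= 0:
        raise ValueError("total foraging time must be positive")
    return net_gain(gross_intake, foraging_cost) / total_time


def items_encountered(encounter_rate, search_time):
    """n = lambda * Ts, assuming items are encountered linearly in search time."""
    return encounter_rate * search_time
```

For instance, foraging 50 units at a cost of 10 over 6 time units of searching and 4 of exploiting gives `rate_of_gain(50, 10, 6, 4)`, that is (50 - 10) / (6 + 4).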

Page 41: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT – Information Patch Model

• Average cost of handling items:

• Let s = search cost per unit time, then total cost of search = sTs

• Then, substituting in equation for R, rate of currency intake:

• So, can express R in terms of – Average rate of currency intake, u– Search cost per unit time, s– Cost of handling items, h

quickly …
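The equations on this slide did not survive extraction. The following is a hedged reconstruction from the definitions given for the patch model, not the slide's own derivation. It assumes that search cost is the only currency expended, so $C_f = sT_s$; that $h$ is the average handling time per item, so $T_h = \lambda T_s h$; and that $\lambda$ is the rate of encounter with items.

```latex
\begin{align*}
U_f &= u\,T_s && \text{(from } u = U_f / T_s\text{)} \\
C_f &= s\,T_s && \text{(assumption: search is the only cost)} \\
T_h &= \lambda\,T_s\,h && \text{(assumption: } h \text{ is handling time per item)} \\
R &= \frac{U_f - C_f}{T_s + T_h}
   = \frac{u\,T_s - s\,T_s}{T_s + \lambda\,T_s\,h}
   = \frac{u - s}{1 + \lambda h}
\end{align*}
```

Under these assumptions $T_s$ cancels, so $R$ depends only on $u$, $s$, $h$, and the encounter rate $\lambda$. If $u$ is instead read as the average gain per item, then $U_f = \lambda T_s u$ and the numerator becomes $\lambda u - s$.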

Page 42: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT – Information Patch Model

• And so forth …

Page 43: Information Search (Shneiderman and Plaisant, Ch. 13)

An Example: Scatter Gather

• Hierarchical clustering of documents

• Users see “overview” of document clusters

• Allows user to navigate through clusters and overviews

Page 44: Information Search (Shneiderman and Plaisant, Ch. 13)

Scatter/Gather Task

[Figure: Scatter/Gather window showing clusters labeled Law, World News, AI, CS, Medicine, Nat. Lang., Robots, Expert Sys, Planning, and Bayes. Nets, alongside a Display Titles window]

Page 45: Information Search (Shneiderman and Plaisant, Ch. 13)

Optimal Foraging Time in a Patch

• gi(t), cumulative gain function
  – Amt of information gained in time t
  – gA(t) = random order of encounter
    • Increase in information equal for all elements
    • Hence, constant slope
  – gB(t) and gC(t) = ordered by relevancy
    • “Relevant” items, those with higher information content, encountered earlier
    • Hence, highest rate of information increase earlier, and rate decreases
• p, rate of encounter with relevant items
• x-axis, travel time between patches
• RB and RC = rate of return
• tC and tB optimal foraging time
  – Foraging longer in the “patch” not optimal

[Figure: information gained (y-axis) plotted against time (x-axis)]
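The optimal time in a patch can be explored numerically by maximizing the rate of return g(t) / (travel time + t). The sketch below uses a plain grid search. The two gain functions are hypothetical stand-ins for the curves on the slide (a constant-slope gain like gA and a diminishing-returns gain like gB or gC), and their coefficients are arbitrary.

```python
import math


def rate_of_return(gain, t_patch, t_travel):
    """Information gained per unit time, counting travel time to the patch."""
    return gain(t_patch) / (t_travel + t_patch)


def optimal_patch_time(gain, t_travel, t_max, steps=10_000):
    """Grid-search the patch time in (0, t_max] that maximizes the rate of return."""
    best_t, best_r = 0.0, 0.0
    for i in range(1, steps + 1):
        t = t_max * i / steps
        r = rate_of_return(gain, t, t_travel)
        if r > best_r:
            best_t, best_r = t, r
    return best_t, best_r


def linear_gain(t):
    """Constant slope: items encountered in random order (hypothetical coefficient)."""
    return 1.0 * t


def diminishing_gain(t):
    """Relevance-ordered items: steep at first, then flattening (hypothetical curve)."""
    return 10.0 * (1.0 - math.exp(-0.5 * t))
```

With the linear gain the rate keeps rising, so the search returns the end of the interval. With the diminishing gain it finds an interior optimum, past which staying in the patch lowers the overall rate, and a longer travel time pushes that optimum later.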

Page 46: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT - Cost of Knowledge

• Foraging Efficiency
  – Animals minimize energy expenditure to get required gain in sustenance
  – Humans minimize effort to get necessary gain in information
• Again, foraging for food has much in common with seeking information
  – Like edible plants in wild, useful information items often grouped together, but separated by long distances in an “information wasteland”
• Also, information “scent”
  – Like scent of food, information in current environment that will assist in finding more information clusters
• Activities analyzed according to value gained and the cost incurred
  – Resource costs
    • Expenditures of time and cognitive effort incurred
  – Opportunity costs
    • Benefits that could be gained in engaging in other activities
    • “Cost of lost opportunity”
      – E.g., if not gaining information about algorithms (or messing with registration system), could be gaining information about software design

Page 47: Information Search (Shneiderman and Plaisant, Ch. 13)

IFT

• Information processing systems evolve so as to maximize the gain of valuable information per unit cost
  – Sensory systems (vision, hearing)
  – Information access (card catalogs, offices)

maximize (information value / cost of interaction)

Page 48: Information Search (Shneiderman and Plaisant, Ch. 13)

End
