
Boolean, bibliometrics, and beyond LIS 670 donna Bair-Mundy Part 1



Our roadmap

Boolean

Fuzzy sets

Boolean exercises

Bibliometrics

Boolean

Boolean algebra

• Developed by George Boole, an English mathematician, circa 1850

• Set theory

• Boolean logic is binary

• Widely used in electronic design

• Widely used in information retrieval systems

Two ways of defining a set

Enumeration (listing the elements)

A = {1, 2, 3, 4, 5}

Specification of a distinguishing property all elements of the set have in common

B = {x | x is a prime number}
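Both styles of definition map onto Python's built-in set type. The sketch below is only an illustration: a program cannot list every prime, so the hypothetical helper `primes_below` restricts B to primes under an arbitrary bound (20 here), which is an assumption and not part of the slide.

```python
# Enumeration: list the elements explicitly.
A = {1, 2, 3, 4, 5}


def is_prime(x: int) -> bool:
    """Return True if x is a prime number (simple trial division)."""
    if x < 2:
        return False
    return all(x % d != 0 for d in range(2, int(x ** 0.5) + 1))


def primes_below(limit: int) -> set:
    """Specification: keep every x below `limit` that has the property."""
    return {x for x in range(limit) if is_prime(x)}


# Bounded stand-in for B = {x | x is a prime number}.
B = primes_below(20)

print(sorted(A))
print(sorted(B))
```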

Set operators (1)

Given sets
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

Union - produces a set containing all members of both operand sets

A ∪ C = {1, 2, 3, 4, 5, 6, 7, 8}

(Set ∪ Set = Resultant set)

Set operators (2)

Given sets
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

Intersection - produces a set containing members in the first set that also occur in the second set

A ∩ B = {1, 3, 5}

Set operators (3)

Given set A = {1, 2, 3, 4, 5}

Complement - produces a set containing all members of the universal set that are not members of the operand set

If D is the universal set of all positive integers, then:

Ā = {6, 7, 8, …}
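The three operators can be tried directly with Python sets. This is a minimal sketch: the universal set D of all positive integers is infinite, so the code assumes a finite stand-in (the integers 1 through 10) purely for illustration.

```python
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

union_ac = A | C            # union: all members of both operand sets
intersection_ab = A & B     # intersection: members common to both sets

# Assumption: a finite universal set stands in for "all positive integers".
D = set(range(1, 11))
complement_a = D - A        # complement: members of D that are not in A

print(sorted(union_ac))
print(sorted(intersection_ab))
print(sorted(complement_a))
```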

Boolean operators

Boolean   Set theory      Symbol   Algebraic symbol
OR        Union           ∪        +
AND       Intersection    ∩        *
NOT       Complement      ¯        -

Words and symbols to denote set operators.

Algebraic operations on sets

A * (B + C) = A * B + A * C

Given sets
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

A AND (B OR C)
= {1, 2, 3, 4, 5} AND {1, 3, 5, 6, 7, 8}
= {1, 3, 5}

(A AND B) OR (A AND C)
= {1, 3, 5} OR { } (the empty set)
= {1, 3, 5}

Both sides yield the same set.
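The distributive identity can be checked mechanically. The short sketch below uses Python's `&` for AND (the algebraic `*`) and `|` for OR (the algebraic `+`) on the same three sets.

```python
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

left = A & (B | C)           # A * (B + C)
right = (A & B) | (A & C)    # A * B + A * C

print(sorted(left))
print(sorted(right))
print(left == right)
```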

Venn diagrams

John Venn, Charles Dodgson

[Figure: Venn diagram with Set 1, Set 2, and Set 3]

Venn diagram - OR

Poodles OR Retrievers yields all documents about either poodles or retrievers

[Figure: Venn diagram of Poodles and Retrievers]

Venn diagram - AND

Poodles AND Retrievers yields all documents that deal with both poodles and retrievers

[Figure: Venn diagram of Poodles and Retrievers]

Venn diagram - NOT

Poodles NOT Retrievers yields documents about poodles but not about retrievers

[Figure: Venn diagram of Poodles and Retrievers]

Venn diagram - Exclusive OR

Poodles XOR Retrievers yields all documents that deal with either poodles or retrievers but not both

[Figure: Venn diagram of Poodles and Retrievers]
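The four Venn-diagram cases correspond to four Python set operators. The document ID numbers below are hypothetical, invented only to make the regions of the diagram concrete.

```python
# Hypothetical document IDs indexed under each term.
poodles = {1, 2, 3, 4}
retrievers = {3, 4, 5, 6}

either = poodles | retrievers          # Poodles OR Retrievers
both = poodles & retrievers            # Poodles AND Retrievers
only_poodles = poodles - retrievers    # Poodles NOT Retrievers
one_not_both = poodles ^ retrievers    # Poodles XOR Retrievers

print(sorted(either))
print(sorted(both))
print(sorted(only_poodles))
print(sorted(one_not_both))
```

XOR is the OR result with the AND result taken out, so `one_not_both` equals `either - both`.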

Rules of precedence

dogs OR cats AND fleas

will be read as

dogs OR (cats AND fleas)

Union (OR)

Intersection (AND)

Complementation (NOT)

[Figure: Venn diagram of Dogs, Cats, and Fleas]

Specifying order of performance

(dogs OR cats) AND fleas

[Figure: Venn diagram of Dogs, Cats, and Fleas]

Boolean searching: advantages

Ideally suited for inverted file indexes - each index entry and set of pointers constitutes a set

Allows user to broaden (using OR) or narrow (using AND, NOT) searches

Cats      1, 3, 7, 9, 13
Dogs      2, 5, 6, 15
Fleas     6, 7, 9, 17
Gnus      19, 27
Guppies   4, 14, 18
Hamsters  22, 25, 31
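Because each index entry is already a set of pointers, a Boolean search reduces to set operations on the posting lists. The sketch below stores the index above in a Python dictionary (an assumed representation, not a description of any particular retrieval system) and evaluates the two dogs, cats, and fleas queries from the precedence slides.

```python
# Inverted index: each term maps to the set of record numbers it points to.
index = {
    "cats": {1, 3, 7, 9, 13},
    "dogs": {2, 5, 6, 15},
    "fleas": {6, 7, 9, 17},
    "gnus": {19, 27},
    "guppies": {4, 14, 18},
    "hamsters": {22, 25, 31},
}

dogs, cats, fleas = index["dogs"], index["cats"], index["fleas"]

# dogs OR cats AND fleas: in Python, & binds tighter than |,
# so this is read as dogs OR (cats AND fleas).
default_reading = dogs | cats & fleas

# (dogs OR cats) AND fleas: parentheses force the OR to run first.
explicit_grouping = (dogs | cats) & fleas

print(sorted(default_reading))
print(sorted(explicit_grouping))
```

The two results differ: the first keeps every dogs record, while the second keeps only records that also mention fleas.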

Boolean Exercises

The Scenario – part I

You are the librarian at the Happy Tastebuds Vegetarian School of Culinary Arts.

Chef Kweezee is planning the menus for this week’s demonstrations. He comes to the reference desk and asks you to search the recipe database for him.

The Scenario – part II

The search command for this database is FIND followed by key words. The system accommodates Boolean operators and allows parentheses. It does not accommodate phrase searching.

The Scenario – part III

To impress the chef, who stays to watch you search, you formulate a single search statement for each menu.

Sample Boolean exercise

Sample record
Cuisine: Mexican
Title: Enchilada
Ingredients: Corn tortillas, tomato sauce, chili peppers, beans, onions, garlic, cilantro…

Menu
Mexican Cuisine
Enchilada
Refried beans

Search statement
FIND mexican AND (enchilada OR (refried AND beans))
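To see why the sample record satisfies the search statement, the query can be evaluated by hand or with a few lines of code. This sketch assumes the database matches single key words case-insensitively anywhere in the record; the slides do not say how the real system matches, and the helper `has` is hypothetical.

```python
import re

# The sample record; the ingredient list is truncated in the source.
record = {
    "Cuisine": "Mexican",
    "Title": "Enchilada",
    "Ingredients": "Corn tortillas, tomato sauce, chili peppers, beans, "
                   "onions, garlic, cilantro…",
}

# Set of lowercased key words found anywhere in the record.
words = set(re.findall(r"[a-z]+", " ".join(record.values()).lower()))


def has(term: str) -> bool:
    """True if the key word occurs in the record."""
    return term.lower() in words


# FIND mexican AND (enchilada OR (refried AND beans))
matches = has("mexican") and (
    has("enchilada") or (has("refried") and has("beans"))
)
print(matches)
```

The visible part of the record contains "beans" but not "refried", so the inner AND fails; the record is still retrieved because "enchilada" satisfies the OR.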

Exercise 1

Menu
Mexican Cuisine
Mexican casserole
Tostada

FIND (Casserole OR Tostada) AND Mexican

Exercise 1 Venn Diagram

Menu
Mexican Cuisine
Mexican casserole
Tostada

FIND (Casserole OR Tostada) AND Mexican

[Figure: Venn diagram of Casserole, Tostada, and Mexican]

Exercise 2

Menu
Italian Cuisine
Pasta with grilled artichoke hearts
Baked garlic

FIND

Exercise 3

Menu
Greek Cuisine
Vegetarian moussaka
Greek salad featuring kalamata olives

FIND

Exercise 4

Menu
Chinese Cuisine
Hot and sour soup
Fried eggplant
Tofu and broccoli dish

FIND

Exercise 5

Menu
Indian Cuisine
Eggplant curry
Samosa
Raita
Tamarind sauce

FIND

Boolean searching: disadvantages (1)

• Counterintuitive
– AND retrieves fewer items

• Two-valued logic - items meet criteria or they do not
– Good for computers
– Does not reflect user relevancy determinations

Boolean searching: disadvantages (2)

Research topic: Digital music libraries

Documents

Current research on digital music libraries

Introduction to digital libraries

Information architecture in the digital environment

Libraries of ancient Babylonia

Fuzzy sets

Binary versus fuzzy sets

Ri = any record
Q = user query

Binary set: S(Ri × Q) → {0, 1}

Test each record against query. Yes or no: retrieved or not.

Retrieval set for query Q is all records Ri such that S(Ri × Q) = 1

Fuzzy set: S(Ri × Q) → [0, 1]

Brackets indicate range.

S expresses not whether or not R is in the set but the degree of strength of the association of R with the set.

Fuzzy set

Scale: 1 (highly relevant) to 0 (non-relevant)

FIND Agni Vedic fire ritual

Analysis of the Agni Vedic fire ritual (1)

Structural analysis of a Vedic fire ritual (0.75)

Analysis of a fire ritual of India (0.5)

Characteristics of Agni (0.25)
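The contrast between binary and fuzzy membership can be sketched in a few lines. The slide does not state how its grades were computed; the code assumes one simple rule, the share of query words present in the title, which happens to reproduce the four values shown. The binary version is likewise an assumption: it admits a record only when every query word is present.

```python
import re

QUERY = ["agni", "vedic", "fire", "ritual"]

TITLES = [
    "Analysis of the Agni Vedic fire ritual",
    "Structural analysis of a Vedic fire ritual",
    "Analysis of a fire ritual of India",
    "Characteristics of Agni",
]


def tokens(text):
    """Lowercased set of words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def fuzzy_s(title):
    """Grade in [0, 1]: share of query words present in the title."""
    present = tokens(title)
    return sum(1 for word in QUERY if word in present) / len(QUERY)


def binary_s(title):
    """Grade in {0, 1}: 1 only if every query word is present."""
    return 1 if fuzzy_s(title) == 1.0 else 0


for title in TITLES:
    print(binary_s(title), fuzzy_s(title), title)

# Binary retrieval set: all records whose grade equals 1.
retrieval_set = [t for t in TITLES if binary_s(t) == 1]
```

Under the binary rule only the first title is retrieved; the fuzzy rule keeps all four and orders them by strength of association.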

Implementing fuzzy sets (1)

User enters list of words

FIRE, RITUAL, SACRIFICE

Retrieval system examines each record or document in the database

Computes score by number of query words that appear in the document

System presents ordered list of documents, along with their scores

Implementing fuzzy sets (2)

Rank Title

100% The fire sacrifice ritual of early Vedic period India

66% Fire and sacrifice in proto-Indo-European society

33% How to build a fire the Girl Scout way

FIRE, RITUAL, SACRIFICE
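The ranked list above can be reproduced with a short script. This is a sketch under stated assumptions: words are matched exactly and case-insensitively, the score is the percentage of query words found in the title, and percentages are truncated to whole numbers as on the slide. The function name `rank_by_presence` is hypothetical.

```python
import re


def rank_by_presence(query_words, titles):
    """Score each title by the share of query words it contains, best first."""
    query = [word.lower() for word in query_words]
    ranked = []
    for title in titles:
        words = set(re.findall(r"[a-z]+", title.lower()))
        hits = sum(1 for word in query if word in words)
        # Integer division truncates, e.g. 2 of 3 words gives 66.
        ranked.append((100 * hits // len(query), title))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


titles = [
    "How to build a fire the Girl Scout way",
    "The fire sacrifice ritual of early Vedic period India",
    "Fire and sacrifice in proto-Indo-European society",
]

ranked = rank_by_presence(["FIRE", "RITUAL", "SACRIFICE"], titles)
for percent, title in ranked:
    print(f"{percent}% {title}")
```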

Implementing fuzzy sets (3)

User enters list of words

FIRE, RITUAL, SACRIFICE

Retrieval system examines each document or record in the database, computing score for that item by adding 1 for each time any of the words on the user's list appears in the document or record

System presents ordered list of documents, along with their scores

Implementing fuzzy sets (4)

Rank Document

The fire sacrifice ritual of early Vedic period India. The fire sacrifice ritual is one of many sacrifice rituals observed as being performed…

Fire and sacrifice in proto-Indo-European society. Discussion of the role of fire…

How to build a fire the Girl Scout way. Demonstrates fire building…

FIRE, RITUAL, SACRIFICE
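Counting every occurrence, instead of only presence, changes the scores. The sketch below applies that rule to the three excerpts exactly as they appear here (each is truncated in the source, so the counts cover only the visible text). It assumes exact word matching with no stemming, so "rituals" is not counted as "ritual"; `occurrence_score` is a hypothetical name.

```python
import re
from collections import Counter


def occurrence_score(query_words, text):
    """Add 1 for each time any query word appears in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return sum(counts[word.lower()] for word in query_words)


QUERY = ["FIRE", "RITUAL", "SACRIFICE"]

documents = [
    "The fire sacrifice ritual of early Vedic period India. The fire "
    "sacrifice ritual is one of many sacrifice rituals observed as being "
    "performed…",
    "Fire and sacrifice in proto-Indo-European society. Discussion of the "
    "role of fire…",
    "How to build a fire the Girl Scout way. Demonstrates fire building…",
]

scores = [occurrence_score(QUERY, doc) for doc in documents]
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(score, doc[:50])
```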

Fuzzy sets in Voyager (1)-(5)

Agni, vedic

Field-weighting terms

Title (weight 5): Winning-induced euphoria in tiddlywinks players

Descriptors (weight 5): Euphoria; Tiddlywinks

Abstract (weight 2): The authors studied the brain waves of 175 tiddlywinks winners and found euphoria induced by winning lasted an average of 3 hours.

Text (weight 1): Researchers have long held that tiddlywinks, unlike other sports, do not induce a significant affective…

Terms weighted by fields in which they occur
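Field weighting can be sketched as a weighted occurrence count. Two things here are assumptions and not stated in the slides: the weights are read as Title 5, Descriptors 5, Abstract 2, Text 1, and a term's score is taken to be the sum over fields of weight times number of occurrences. The function name `weighted_score` is hypothetical.

```python
import re

# Assumed reading of the slide's weights (not confirmed by the source).
FIELD_WEIGHTS = {"Title": 5, "Descriptors": 5, "Abstract": 2, "Text": 1}

record = {
    "Title": "Winning-induced euphoria in tiddlywinks players",
    "Descriptors": "Euphoria; Tiddlywinks",
    "Abstract": "The authors studied the brain waves of 175 tiddlywinks "
                "winners and found euphoria induced by winning lasted an "
                "average of 3 hours.",
    "Text": "Researchers have long held that tiddlywinks, unlike other "
            "sports, do not induce a significant affective…",
}


def weighted_score(term, rec):
    """Sum of (field weight x occurrences of the term in that field)."""
    term = term.lower()
    total = 0
    for field, text in rec.items():
        occurrences = re.findall(r"[a-z]+", text.lower()).count(term)
        total += FIELD_WEIGHTS[field] * occurrences
    return total


print(weighted_score("euphoria", record))
print(weighted_score("tiddlywinks", record))
```

Under these assumptions a match in the title or descriptors counts five times as much as a match in the full text, so records that are chiefly about a term outrank records that merely mention it.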

User-weighting terms

Terms weighted at time of search by user

* Weighted term