Chapter 5: Accessing Information Resources

Learning Objectives

Compare and contrast the surface Web and the deep Web.

Differentiate among the types of search tools and how they work.

Summarize the different methods used to access information contained in the deep Web.

Explain how to develop a search question and strategy.

Learning Objectives

Utilize search queries, keywords, and Boolean search operators.

Utilize different search techniques.

Describe how to use advanced search engine pages.

Describe how to evaluate information for accuracy and validity.

Explain plagiarism, intellectual property, fair use, and proper citation techniques.

Chapter Focus

Information Resources on the Web

Search Engines

Subject Directories

The Deep Web

Defining a Search Question

Formulating Search Queries

Search Logic and Syntax Conventions

Evaluating and Using Internet Resources

Information Resources on the Web

Size of the Internet

According to a 2003 study by UC Berkeley

92,017 terabytes, excluding e-mail and instant messaging

Average Web page

50 kilobytes
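
To give the two figures above a sense of scale, the following sketch works out how many average-size pages would add up to the total. It assumes decimal units (1 terabyte = 10^12 bytes, 1 kilobyte = 10^3 bytes) and treats the whole volume as if it were average-size pages, which the slide does not claim; the function name is illustrative.

```python
# Figures from the slide; decimal (SI) units are an assumption.
TOTAL_TERABYTES = 92_017
AVERAGE_PAGE_KILOBYTES = 50

BYTES_PER_TERABYTE = 10**12
BYTES_PER_KILOBYTE = 10**3


def equivalent_pages(total_tb, page_kb):
    """Return how many pages of the given size fit in the given volume."""
    total_bytes = total_tb * BYTES_PER_TERABYTE
    page_bytes = page_kb * BYTES_PER_KILOBYTE
    return total_bytes // page_bytes


pages = equivalent_pages(TOTAL_TERABYTES, AVERAGE_PAGE_KILOBYTES)
print(f"{pages:,} average-size pages")
```

Under these assumptions the total corresponds to roughly 1.8 trillion average-size pages.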

Information Resources on the Web

Surface Web

The portion of the Web that search engines and subject directories can index

Deep Web

Searchable databases that generate dynamic Web pages, non-HTML files, sites that require passwords or registration, archives, library catalogs, and information located behind firewalls

Information Resources on the Web: Review

What is the difference between the deep Web and the surface Web?

Which contains more information, the deep Web or the surface Web?

Why do you think this is true?

Search Engines

Search engines

Web sites that use software tools to index the contents of the Web so that information can be located and retrieved

Query

Consists of one or more keywords, important or significant words likely to be found in the information sought

Search hits

Information relevant to the query is presented by the search engine as a series of listings

Search Engines

Search Engine Search Text Boxes and Search Command Buttons (figure callouts: search text box and search command button on the Lycos, AltaVista, Yahoo!, and Google basic search pages)

Search Engines

Google Search Results Page (figure callout: sponsored links)

Search Engines

Google Cache Feature

Click a Cached link to view the most recently indexed version of a page

Some search engines, such as Google and Yahoo! Search, have a cache feature that enables users to view cached, or saved, Web pages in their databases

Clicking a Cached hyperlink displays the most recently indexed version of a page if the search engine cannot display the current version for any reason

Search Engines

Search Engine Information Gathering and Storage

A search engine typically searches its own database containing information that was gathered using robotic programs known as spiders

An indexer program sorts the words contained in or related to the Web page and organizes them in a database
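
The indexing step described above can be sketched as a small inverted index: a table that maps each word to the pages containing it. This is a minimal illustration, not how any particular search engine works; the page addresses and text are hypothetical stand-ins for what a spider would have gathered.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


# Hypothetical pages standing in for content a spider has fetched.
pages = {
    "example.org/a": "Deep Web databases generate dynamic pages",
    "example.org/b": "Search engines index the surface Web",
}
index = build_index(pages)
print(sorted(index["web"]))
```

A query is then answered from the index rather than by rereading every page, which is why the engine searches its own database and not the live Web.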

Search Engines

Search Results

Most search engines rank the hits on a results page, listing the most relevant results first

Almost all search engines ignore meta tag keywords when ranking pages to prevent Web page creators from manipulating page rankings

A particular query entered in different search engines will almost always produce different results

Search Engines

Google Search Results

AltaVista Search Results

Search Engines

Keywords can be difficult to find in a Web page, as they may be located deep within the page or displayed throughout a document

Several methods can help the user find relevant text on a Web page

Search within a Web page by pressing Ctrl+F to open the Find dialog box

Type the keyword in the Find what text box and then click the Find Next button to highlight the first instance of the keyword on the page; click it again to find any subsequent occurrences of the keyword
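
The Find Next behavior can be sketched in a few lines: locate the first occurrence of the keyword, then keep searching from just past each match. The function name and sample text are illustrative, and case-insensitive matching is an assumption about the Find dialog's default.

```python
def find_all(page_text, keyword):
    """Return the start offset of each match, ignoring case (an assumption)."""
    haystack = page_text.lower()
    needle = keyword.lower()
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Resume one character past the last match, like clicking Find Next.
        start = haystack.find(needle, start + 1)
    return positions


text = "Cache pages help. A cache stores a saved copy."
print(find_all(text, "cache"))
```

Each offset in the returned list corresponds to one click of Find Next.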

Search Engines

Viewing a cached version of a Web document (from a search engine that offers this feature) can help the user find keywords in the document

Search Engines

Specialized Search Engines

Subject-Specific Search Engines

Narrow the focus to a single subject or field

Meta Search Engines

Submit a query to more than one search engine

Some cluster or group results by topic

Internal Search Engines

Restrict searches to the contents of the site
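
A meta search engine's core step, sending one query to several engines and combining the hits, can be sketched as follows. The two engine functions are hypothetical stand-ins that return fixed lists; a real meta search engine would contact actual search services.

```python
def engine_a(query):
    # Hypothetical stand-in for one search engine's results.
    return ["example.org/deep-web", "example.org/spiders"]


def engine_b(query):
    # Hypothetical stand-in for a second search engine's results.
    return ["example.org/spiders", "example.org/directories"]


def meta_search(query, engines):
    """Send the query to each engine and record which engines returned each hit."""
    found_by = {}
    for name, search in engines.items():
        for url in search(query):
            found_by.setdefault(url, []).append(name)
    return found_by


results = meta_search("deep web", {"A": engine_a, "B": engine_b})
for url, names in results.items():
    print(url, "found by", ", ".join(names))
```

Recording the source engine for each hit mirrors how meta search result pages indicate where a result was found.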

Search Engines

Health On the Net Foundation Web Page

Search Engines

Dogpile Meta Search Engine Results (figure callout: indicates on which search engine this result was found)

Search Engines

Clustered Search Results (figure callout: search result clusters)

Search Engines: Review

How do search engines index information?

How are page results ranked?

What are clustered search results?

What are meta search engines?

Subject Directories

Contain links to Web sites and pages organized in hierarchically arranged subject categories

Each subject directory uses its own system for subject categorization

It can be difficult to know how a topic might be classified

Often contain annotations written by subject experts that provide a capsule description of the information

Typically index only a Web site’s home page rather than all the pages contained in the site

A search tool combining a search engine and a subject directory is known as a hybrid search engine

Subject Directories

Google Directory Home Page

Subject Directories

Yahoo! Directory Movie Review Subcategory

Subject Directories

About.com Subject Expert

Subject Directories: Review

How do subject directories differ from search engines?

What are subject experts and what do they do?

What is a hybrid search engine?

The Deep Web

Contains Web resources that lie below the surface Web

Remains hidden because it resides in searchable databases that present several obstacles to search engine spiders

Often requires registration and logon

Accessing the deep Web is known as drilling down

Not all deep Web material is accessible

The Deep Web

CompletePlanet Deep Web Directory

The Deep Web: Review

What prevents search engines and subject directories from indexing the deep Web?

What is drilling down?

What kinds of deep Web information may be inaccessible?

Defining a Search Question

First step

Clearly define the search question in order to focus your search

Start with a general search and then become more specific

A subject directory is often the best starting point

Then you can gradually narrow the focus

Most subject directories include search engines that can search within the directory or the entire Web

Defining a Search Question

Search Question Flow Chart

Defining a Search Question: Review

What is the difference between a specific search question and a general search question?

How can formulating a search question help determine the type of search tool that should be used?

What is the difference between a search question and a search query?

Formulating Search Queries

Syntax Conventions

General rules that determine how a search engine processes keywords

Keyword Queries

Enable the search engine to find information relevant to the search question

Most search engines ignore certain words known as stop or filter words

the, in, for, to, #, &, and so on

To determine what keywords to use in a search query, users should try to imagine the keywords likely to appear in an answer to the question being posed
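
The effect of stop words on a keyword query can be sketched as a simple filter. The stop list below contains only the examples named on the slide; the lists real search engines use are longer and are not specified here, and the function name is illustrative.

```python
# Stop words taken from the slide's examples; real lists are longer.
STOP_WORDS = {"the", "in", "for", "to", "#", "&"}


def strip_stop_words(query):
    """Return the query's words, lowercased, with stop words removed."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


print(strip_stop_words("the best hotels in Paris for families"))
```

Only the words that survive the filter take part in the search, which is why a query should be built from the significant words expected in the answer.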

Formulating Search Queries

Phrase Queries

Involve visualizing phrases likely to appear on a Web page containing the desired information

Almost all search engines will search for the exact combination and order of words enclosed in paired quotation marks, called a phrase search

A phrase query can be combined with a keyword search by including keywords outside of the phrase quotation marks
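
Combining a phrase with ordinary keywords can be sketched as follows: quoted text must appear in exactly that order, while words outside the quotes only need to appear somewhere on the page. This is a simplified illustration with hypothetical names; the phrase check is a plain substring test, so it would also match a phrase that ends partway through a longer word.

```python
import re


def parse_query(query):
    """Split a query into quoted phrases and the keywords outside the quotes."""
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    return phrases, remainder.split()


def matches(page_text, query):
    """True if every phrase occurs in order and every keyword occurs as a word."""
    text = page_text.lower()
    phrases, keywords = parse_query(query.lower())
    words = set(re.findall(r"[a-z0-9]+", text))
    return all(p in text for p in phrases) and all(k in words for k in keywords)


page = "The deep Web holds library catalogs and other searchable databases."
print(matches(page, '"deep web" catalogs'))
print(matches(page, '"web deep" catalogs'))
```

The first query matches the sample page; the second does not, because the quoted words appear in a different order on the page.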

Formulating Search Queries

Refining Keyword Queries

Using a single keyword for a keyword query will result in too many search result hits, many of which will have nothing to do with the user’s original question

Using too many keywords will reduce the number of hits, which may cause the search engine to ignore valuable information

Formulating Search Queries

Using Multiple Keywords to Refine a Search

Formulating Search Queries: Review

What is a keyword?

What are stop words?

What are two different types of search queries?

Search Logic and Syntax Conventions

Boolean Logic

A type of algebraic logic that employs expressions using operators

Boolean expressions produce a true/false result

Boolean operators

AND tells the search engine to return hits containing both words

OR returns hits for pages containing at least one of the two words

NOT excludes words from search query results

Nesting combines more than one operator to build more complex queries
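
The three operators map directly onto set operations, which makes their effect easy to see. In this sketch each keyword points to the set of pages that contain it; the keywords and page labels are hypothetical.

```python
# Hypothetical index: each keyword maps to the set of pages containing it.
index = {
    "jaguar": {"p1", "p2", "p3"},
    "car": {"p2", "p4"},
    "cat": {"p3", "p5"},
}

both = index["jaguar"] & index["car"]       # jaguar AND car
either = index["car"] | index["cat"]        # car OR cat
without = index["jaguar"] - index["car"]    # jaguar NOT car

# Nesting: the parentheses are evaluated first, then combined with AND.
nested = index["jaguar"] & (index["car"] | index["cat"])  # jaguar AND (car OR cat)

for label, hits in [("AND", both), ("OR", either), ("NOT", without), ("nested", nested)]:
    print(label, sorted(hits))
```

AND narrows the result, OR widens it, and NOT removes pages, so the nested query returns only jaguar pages that also mention a car or a cat.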

Search Logic and Syntax Conventions

Search Engine Comparison Table Web Page

Search Logic and Syntax Conventions

Results from Boolean Operators

Search Logic and Syntax Conventions

Syntax Conventions

Paired quotation marks

Indicate a phrase

Plus sign (+)

Used before a stop word to ensure it is not ignored

Minus sign (-)

Used before a word to exclude it

Boolean operators

Should be capitalized so they will not be mistaken for stop words
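
The conventions above can be sketched as a small query parser that sorts each part of a query into the role the syntax gives it. The function and category names are illustrative, and individual search engines may apply these conventions differently.

```python
import re

# A token is either a quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')
OPERATORS = {"AND", "OR", "NOT"}


def parse(query):
    """Classify each token of a query by the syntax conventions."""
    parsed = {"phrases": [], "required": [], "excluded": [], "operators": [], "keywords": []}
    for phrase, word in TOKEN.findall(query):
        if phrase:
            parsed["phrases"].append(phrase)
        elif word in OPERATORS:  # only capitalized forms count as operators
            parsed["operators"].append(word)
        elif word.startswith("+") and len(word) > 1:
            parsed["required"].append(word[1:])
        elif word.startswith("-") and len(word) > 1:
            parsed["excluded"].append(word[1:])
        else:
            parsed["keywords"].append(word.lower())
    return parsed


print(parse('"star wars" +the AND cast -toys'))
```

Because only the capitalized forms are treated as operators, a lowercase "and" falls through to the keyword list, where a stop word filter could discard it.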

Search Logic and Syntax Conventions

Case Sensitivity

Major search engines ignore capitalization

Stemming

Refers to the ability of some search engines to search for root words or partial forms of keywords as well as the keywords themselves
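
Case folding and stemming can be sketched with a deliberately naive suffix stripper. The suffix list and the minimum root length of three letters are assumptions made for illustration; the stemmers search engines actually use are considerably more careful.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # illustrative only, checked in this order


def stem(word):
    """Lowercase the word and strip one common suffix, if a root of 3+ letters remains."""
    word = word.lower()  # capitalization is ignored
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print([stem(w) for w in ["Searching", "searched", "searches", "search"]])
```

All four sample words reduce to the same root, so a query for any one of them could match pages containing the others.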

Search Logic and Syntax Conventions

Advanced Search Options

Word Filter Search Options

Enable the user to include or exclude words to create complex searches

Field Search Options

Allow the user to specify the fields that will be searched in a query

Search Logic and Syntax Conventions

Advanced Search Options

Media and File Format Search Options

Allow users to specify media type or file format

Domain/Site Restriction Options

Enable users to restrict a search to a top-level domain or exclude a domain or site from a search
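
Domain and site restriction amounts to filtering hits by host name. The sketch below uses Python's standard `urllib.parse` module; the function name, its parameters, and the sample addresses are hypothetical.

```python
from urllib.parse import urlparse


def restrict(urls, include_tld=None, exclude_site=None):
    """Keep hits in the given top-level domain and drop hits from the excluded site."""
    kept = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if include_tld and not host.endswith("." + include_tld):
            continue
        if exclude_site and (host == exclude_site or host.endswith("." + exclude_site)):
            continue
        kept.append(url)
    return kept


hits = [
    "https://www.example.edu/library",
    "https://shop.example.com/books",
    "https://archive.example.org/catalog",
    "https://news.example.edu/today",
]
print(restrict(hits, include_tld="edu"))
print(restrict(hits, exclude_site="example.com"))
```

The first call keeps only the two .edu hits; the second keeps everything except the hit from example.com.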

Search Logic and Syntax Conventions

Google, Yahoo!, and AltaVista Basic Search Pages (figure callouts: media choices and Advanced Search hyperlink on each page)

Search Logic and Syntax Conventions

Google Advanced Search Field Options

Search Logic and Syntax Conventions

Advanced Search Options

Date Search Options

Enable the user to specify when a document was last updated

Language Options

Enable the user to tap foreign language resources

Numeric Range

Allows the user to specify a number range

Offensive Content Blocking

Allows the user to specify a level of protection against offensive content

Search Logic and Syntax Conventions

Last Update Search Options

Search Logic and Syntax Conventions: Review

What are the common Boolean search operators and how do they work?

What does nesting Boolean operators do?

What are some common advanced search engine features?

Evaluating and Using Internet Resources

Evaluating Internet Resources

The unique nature of the Web requires the use of additional evaluation techniques specific to this new form of communication

Plagiarism

Involves representing someone else’s words, writing, or findings as your own, and is a form of theft

Evaluating and Using Internet Resources

Intellectual Property

Refers to creative ideas and expressions afforded specific legal protection

Includes copyrights and trademarks

Proper Citation

The citation methods used for material found on the Internet differ from those used for traditional print material

Evaluating and Using Internet Resources

Web Bibliographic Citations

Evaluating and Using Internet Resources: Review

The Wayback Machine

The Wayback Machine Web site contains a database of Web pages going back to 1996

Can be used to help find information when a dead link is encountered

In addition to locating missing pages, it can track the evolution of a Web page or site

Web site owners can request that sites or pages not be made available
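
When a dead link is encountered, an archived copy can be requested by date. The sketch below only builds the address and does not contact the archive. It assumes the archive's commonly used address pattern of a date stamp followed by the original address; whether a capture exists for a given page and date is not guaranteed, and the function name is illustrative.

```python
from datetime import date

# Assumed address prefix for archived captures.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url, when):
    """Build an address asking for the capture nearest the given date."""
    stamp = when.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


print(wayback_url("http://example.com/", date(2006, 1, 15)))
```

Requesting the same page with different dates is one way to follow how a page or site has changed over time.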

Evaluating and Using Internet Resources: Review

What are some of the methods that can be used to evaluate information found on the Internet?

What is plagiarism?

How does fair use relate to copyrighted material?