Searching the Deep Web
LEMA, February 2011


Deep Web Video

Image from express.howstuffworks.com, 14 Feb 11

Surface Web: accessible via general-purpose search engines such as Google and Yahoo!

Deep Web: Not accessible via typical search engines; primarily databases

25%

75%

AKA visible vs. invisible web

1 trillion+ pages

500 trillion+ pages!!

The “deep web” contains …
Databases which use dynamic or temporary links
  Often ?, &, CGI, other elements in the URL
Websites which aren’t indexed, by design or because there are no links to them
Deep web sites
  Google limits the amount of a web site it indexes, an unpublished factor in its secret algorithm
  At one point, only 110K
Formats that aren’t currently supported
  Google now shows results for .pdf, .doc, .ppt
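
The URL clues listed above (?, &, CGI) can be turned into a rough check. This is only a sketch of the idea: the function name `looks_dynamic` and the example.org addresses are made up for illustration, and real search engines decide what to index in ways this does not capture.

```python
from urllib.parse import urlsplit


def looks_dynamic(url: str) -> bool:
    """Rough heuristic (an assumption, not a search-engine rule):
    True if the URL shows the clues named on the slide."""
    parts = urlsplit(url)
    has_query = bool(parts.query)          # text after "?" (parameters joined by "&")
    has_cgi = "cgi" in parts.path.lower()  # e.g. a /cgi-bin/ script
    return has_query or has_cgi


# Hypothetical addresses, used only to show the two cases.
print(looks_dynamic("http://example.org/cgi-bin/search?term=metabolism&page=2"))
print(looks_dynamic("http://example.org/about.html"))
```

The first address has both a query string and a CGI path, so it is flagged; the second is a plain static page and is not.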

Boundary between surface and deep web always in flux as search engines incorporate more of the deep web at the same time more is being added to the deep web

Deep Web: Why important?
Studies show that students’ searching habits are fairly ingrained by college
  Use Google for everything
  Only look at the 1st page of results
  Assume trustworthiness of web sites

Rich source of in-depth material not accessible through a typical Google search

Expose students now to richer and more authoritative resources.

Students need to understand ….
The best results are NOT in the top 10
Everything’s NOT on the web
Google does NOT search the whole web
Everything’s NOT free
Everything’s NOT trustworthy
Searching/Research is NOT always easy

How can we help our students be better searchers?
Introduce them to the idea that Google isn’t everything & why
Reinforce the idea of evaluating resources
Make them better “surface” searchers
  Many information needs can be met with the surface web
  Easy yet “advanced” Google searching techniques
Better alternatives to the “surface web” & how to effectively search these alternatives
  Databases!
  Familiarity with “deep” sites on a particular topic
    Example: Primary materials available at Library of Congress
    Example: Legislative info at thomas.loc.gov
  Familiarity with portals and directories

Three simple techniques to being a better Google searcher ….
Phrase searching
  “xxx xxxx”
Searching the title of web pages
  intitle:xxx or intitle:“xxx xxxx”
  Example: intitle:“climate change”
  Example: intitle:unicorn
Specifying a site
  site:.xxx or site:xxx.com
  Example: egypt site:washingtonpost.com
  Example: “climate change” site:.gov

NOTE:
1. No space after colon
2. Lowercase commands
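
The three techniques above can be combined in one query. As a minimal sketch, the helper below assembles a query string that follows both notes (no space after the colon, lowercase commands). The function `build_query` and its parameter names are invented for this illustration; they are not part of Google.

```python
def build_query(keywords="", phrase=None, intitle=None, site=None):
    """Assemble a search string from plain keywords, an exact phrase,
    an intitle: restriction, and a site: restriction (all optional)."""
    parts = []
    if keywords:
        parts.append(keywords)
    if phrase:
        # Phrase searching: wrap the words in double quotes.
        parts.append(f'"{phrase}"')
    if intitle:
        # Multi-word titles are quoted; no space follows the colon.
        value = f'"{intitle}"' if " " in intitle else intitle
        parts.append(f"intitle:{value}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


print(build_query(keywords="egypt", site="washingtonpost.com"))
# egypt site:washingtonpost.com
print(build_query(phrase="climate change", site=".gov"))
# "climate change" site:.gov
print(build_query(intitle="climate change"))
# intitle:"climate change"
```

The printed strings are what a student would type into the search box.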

Let’s try a site: search ….
Look for a Washington Post article on the B-52s

Now let’s try a phrase search…
First, try Howard Morris as a simple keyword search. How many hits?
Now try it as a phrase: “Howard Morris”. How many hits?

Now let’s try an intitle: search
First, just search for “climate change”. How many hits?

An intitle: search
Now try searching for “climate change” in the title of the web page. How many hits?

Searching the Deep Web
LVHS Library Web Page: Deep Web link on the left
Google search for your topic and add keyword database
  Ex: Plane crashes database
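
The "topic plus database" tip can also be written as a link. The sketch below appends the word database to a topic and encodes it as a Google search address; the function name `database_search_url` is made up for this example, and the address assumes Google's ordinary `q=` search parameter.

```python
from urllib.parse import urlencode


def database_search_url(topic: str) -> str:
    """Build a Google search link for '<topic> database'."""
    query = f"{topic} database"
    # urlencode turns spaces into "+" so the query is safe inside a URL.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(database_search_url("plane crashes"))
# https://www.google.com/search?q=plane+crashes+database
```

Pasting the printed link into a browser gives the same results as typing plane crashes database into the search box.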

The Deep Web: A Comparison
Using Google, search on the term metabolism
Open a separate tab, go to www.science.gov and search metabolism again
Looking at the top ten results of each, which provided generally “better” information?
How difficult/easy is it to pursue your search in related fields?

Directories/Portals of Interest
Ipl2
  January 2010
  Merger of Internet Public Library and Librarians’ Internet Index
  Librarians and Information Science Professionals
  Hosted by Drexel University’s College of Information Science & Technology
Infomine
  University-level scholarly resources
  Librarian built and maintained
  University of California
Virtual Private Library

Other Resources
LVHS Library Web Page: Deep Web link on the left
Going Beyond Google: The Invisible Web in Learning and Teaching by Jane Devine and Francine Egger-Sider, 2009
  Not as up-to-date as web resources, but
  Very focused on teaching

Any questions?