22

Using internet search engines and library catalogs to locate

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

UCI

Sustento del uso justo de materiales protegidos por

derechosde autor para fines educativos

El siguiente material ha sido reproducido, con fines estríctamente didácticos e ilustrativos de los

temas en cuestion, se utilizan en el campus virtual de la Universidad para la Cooperación

Internacional – UCI - para ser usados exclusivamente para la función docente y el estudio

privado de los estudiantes en el curso Tecnología y Manejo de información perteneciente al

programa académico Maestría en Inocuidad de Alimentos.

La UCI desea dejar constancia de su estricto respeto a las legislaciones relacionadas con la

propiedad intelectual. Todo material digital disponible para un curso y sus estudiantes tiene fines

educativos y de investigación. No media en el uso de estos materiales fines de lucro, se entiende

como casos especiales para fines educativos a distancia y en lugares donde no atenta contra la

normal explotación de la obra y no afecta los intereses legítimos de ningún actor .

La UCI hace un USO JUSTO del material, sustentado en las excepciones a las leyes de

derechos de autor establecidas en las siguientes normativas:

a- Legislación costarricense: Ley sobre Derechos de Autor y Derechos Conexos,

No.6683 de 14 de octubre de 1982 - artículo 73, la Ley sobre Procedimientos de

Observancia de los Derechos de Propiedad Intelectual, No. 8039 – artículo 58,

permiten el copiado parcial de obras para la ilustración educativa.

b- Legislación Mexicana; Ley Federal de Derechos de Autor; artículo 147.

c- Legislación de Estados Unidos de América: En referencia al uso justo, menciona:

"está consagrado en el artículo 106 de la ley de derecho de autor de los Estados

Unidos (U.S,Copyright - Act) y establece un uso libre y gratuito de las obras para fines

de crítica, comentarios y noticias, reportajes y docencia (lo que incluye la realización

de copias para su uso en clase)."

d- Legislación Canadiense: Ley de derechos de autor C-11– Referidos a Excepciones

para Educación a Distancia.

e- OMPI: En el marco de la legislación internacional, según la Organización Mundial de

Propiedad Intelectual lo previsto por los tratados internacionales sobre esta materia.

El artículo 10(2) del Convenio de Berna, permite a los países miembros establecer

limitaciones o excepciones respecto a la posibilidad de utilizar lícitamente las obras

literarias o artísticas a título de ilustración de la enseñanza, por medio de

publicaciones, emisiones de radio o grabaciones sonoras o visuales.

Además y por indicación de la UCI, los estudiantes del campus virtual tienen el deber de cumplir

con lo que establezca la legislación correspondiente en materia de derechos de autor, en su país

de residencia.

Finalmente, reiteramos que en UCI no lucramos con las obras de terceros, somos estrictos con

respecto al plagio, y no restringimos de ninguna manera el que nuestros estudiantes, académicos

e investigadores accedan comercialmente o adquieran los documentos disponibles en el mercado

editorial. sea directamente los documentos, o por medio de bases de datos científicas, pagando

ellos mismos los costos asociados a dichos accesos.

Toxicology 157 (2001) 121–139

Using internet search engines and library catalogs to locatetoxicology information

Laura Dassler Wukovitz *Pollution Pre6ention Information Clearinghouse, 7407, United States En6ironmental Protection Agency, Ariel Rios Building,

1200 Pennsyl6ania A6enue, Washington, DC 20460, USA

Abstract

The increasing importance of the Internet demands that toxicologists become aquainted with its resources. To findinformation, researchers must be able to effectively use Internet search engines, directories, subject-oriented websites,and library catalogs. The article will explain these resources, explore their benefits and weaknesses, and identify skillsthat help the researcher to improve search results and critically evaluate sources for their relevancy, validity, accuracy,and timeliness. © 2001 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Toxicology; Information resources; Internet; Databases; Online systems

www.elsevier.com/locate/toxicol

1. Introduction

The Internet is a powerful workhorse that canbe harnessed to assist in the harvesting of awealth of information on virtually any topic, in-cluding toxicology. However, like the farmerlocked in a perennial battle with parasites andweeds, the researcher using the Internet requirespersistence, creativity, judgement, and the abilityto spot the symptoms of a diseased crop. Whilethere is often no substitute for a thorough searchusing traditional bibliographic databases, the in-creasing importance of the Internet to scientificcommunications demands that toxicologists be-

come acquainted with its resources. There areseveral classes of tools that can be used to helpidentify relevant information on the Internet, suchas search engines, directories, and subject-orientedwebsites. The Internet can also be used as a wayto access books and library resources on toxicol-ogy, since most libraries now have online cata-logs. Studies have shown that the use of severalbibliographic databases, instead of only one ortwo, provides the most comprehensive set of re-search results (Gehanno et al., 1998; Ludl et al.,1996). This also holds true for the Internet. Thisarticle will explain the essential structure of thedifferent resources, explore the benefits and weak-nesses of each, and identify skills that help theresearcher to improve search results and criticallyevaluate sources for their relevancy, validity, ac-curacy, and timeliness.

* Tel.: +1-202-2601758.E-mail address: [email protected] (L. Dassler

Wukovitz).

0300-483X/01/$ - see front matter © 2001 Elsevier Science Ireland Ltd. All rights reserved.

PII: S0300-483X(00)00343-7

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139122

2. Pitfalls and promise of the internet

The Internet is a vast and ever-increasing com-puter network that is dramatically changing theway information is distributed and shared, and ithas become an important vehicle for scholarlycommunications (Lawrence et al., 1999b). Unfor-tunately, due to the structure and nature of theInternet itself, it can be difficult to locate informa-tion relevant to a given topic. Unlike biblio-graphic databases and library catalogs, whichindex a specific set of publications using a definedstructure and a controlled list of subject headingsor keywords, the Internet has no definableboundaries and no standard method of organiza-tion. If you need to know whether a particularjournal is indexed in Medline, you simply look atthe master list of journal titles, and you have youranswer, usually very quickly. There is no corre-sponding master list of titles for the Internet, andfinding the answer to a basic question can some-times take an unexpected quantity of time andenergy.

Locating information on the Internet is compli-cated by the fact that it is constantly in a state offlux. This continuous change can help or hinderthe researcher. Digital information has severaladvantages over traditional printed information— electronic data can be easily and quickly up-dated, and hypertext allows for interactive fea-tures, such as linking among related sources.News about the latest breakthroughs can befound on the Internet before they appear in printsources (DeWoskin, 1998). However, the dynamicnature of digital information also poses chal-lenges. Electronic data is relatively easy to updateas new developments occur, but frequent changesto sites often result in information being moved,deleted, or reorganized. As a result, search utilitieswill sometimes return ‘dead links’, or pages thatno longer exist, in response to a query. Existingsites will also sometimes contain links to extinctpages or sites. In addition to keeping track ofexisting information, new sites are constantly be-ing added to the Internet. In February 2000, thenumber of indexable webpages on the WorldWide Web (WWW) was estimated to be over onebillion. This number counts only pages that are

accessible to search tools, and does not include allthe information contained within online databases(Inktomi, 2000). Due to its size and variability, noexisting search utility is capable of searching theentire WWW, and even the largest of the searchtools index only about 35% of the WWW (Sulli-van, 2000).

These barriers to finding information may seemformidable, but they are not insurmountable. Thetechnology used to search and index the Internetis constantly being improved, and the number ofgeneral and subject-specialized search tools isgrowing. In addition, librarians and informationprofessionals are attempting to organize and clas-sify sites based on their content.

There are three basic mechanisms for findingmaterials on the Internet. They are search engines,directories, and subject-oriented websites. Thereare fundamental differences between these threegeneral categories, and each will yield a differentset of results. When discussing these resourcescollectively, this paper will refer to them as searchtools or search utilities. It is important to remem-ber that all of these tools are surrogates — youare not literally searching the entire Internet eachtime you enter a query. Even if this task werepossible, it would still likely be an inefficient,time-consuming method of finding information.We would not read every book in the library tofind out information on the occupational healthrisks for chemical factory workers. It would bebetter to search the library catalog to identifypotentially useful materials, and then read onlythose relevant items. The librarians have built thecatalog by creating an electronic record (or even aprinted card) that describes the content of everybook in the collection. The record gives key infor-mation about each book in a short, easy-to-useformat. Searching this catalog allows you toquickly identify all the books in the library onoccupational exposure without having to siftthrough thousands of volumes of books. Internetsearch utilities have been built to provide a similarfunction.

Because the Internet is decentralized, there iscurrently no method for keeping track of all theinformation on every computer connected to theInternet. Instead, we must rely on intermediary

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 123

indexes that have been compiled by humans orcomputers. Unfortunately, the Internet is not alibrary, and there is not one catalog, but dozens,and none contain exactly the same information.In a library, before a book is added to the shelves,it is given a call number to identify its location,and a record in the catalog so that it can be foundagain. When information is added to the Internet,it has a unique location, but no record in thesearch tools. The owner of a website must eithersubmit the new pages for addition to search utili-ties, or wait for them to find the data on theirown. Search utilities are currently only able tocover roughly a third of the material on theInternet because they must find and index infor-mation after it has been made public — a hugetask, given the number of pages on the Internet.A frustrating consequence of this situation is that,with every search, there is always a possibility thatsignificant information could be missing fromyour results because it has never been indexed byany search tools (Lawrence et al., 1999a). Anotherreality the Internet researcher must contend withis that most search tools, on average, containabout 40% original content. (Notess, 2000) As aresult, there will usually be a significant amountof duplication between the results from differentsearch tools. Using several search tools will maxi-mize the number of unique pages found; it willalso increase the number of duplicate pages.

3. Improving the odds

3.1. E6aluate the question

In order to retrieve the best information, theresearcher will need to learn to use an assortmentof search tools to the best advantage. This processbegins before any searching starts. The prospec-tive Internet searcher should first evaluate theinformation that is needed. What, exactly, is thequestion? What are the desired results? A searchfor chemical data will require different techniquesand resources than one for upcoming professionalconferences, or a list of recently published books.Consult the help files of each search tool beforebeginning a search — these files are often over-

looked, and they contain vital information aboutthe mechanics of how your search will be treated.Spending some preparation time beforehand fre-quently shortens the amount of time a searchtakes and improves the quality of the results.

3.2. Define the scope

Once the question has been clearly identified,the scope of the query should be defined. Howdoes it fit into the larger framework of informa-tion? Is it a narrow specialty within a widersubject, or an overarching survey of availabledata? Answering these questions will help clarifyhow broad the search should be. A search for thediagram of a molecular structure can often besatisfied by a simple query in a website, likeChemFinder, while a comprehensive search for allsuspected toxicological effects will require search-ing for several terms in multiple databases. Acomprehensive search should leave out very little,but it is also more likely to include irrelevant andduplicate items. For example, a chemical may bementioned, but not in the desired context.

3.3. Create a list of rele6ant terms

In addition, a list of possible synonyms, abbre-viations, and keywords should be made. Are dis-tinctive names, words, or phrases commonly usedfor the topic? The list should be as inclusive aspossible, because excluding an important syn-onym from a search can mean valuable informa-tion is missed. Variant word endings, such assingular and plural forms, must also be consid-ered. Not all of the terms from the list willnecessarily be used in every search utility, but it isgood planning and will help ensure that relevantterms are not inadvertently overlooked. For ex-ample, a search for information about Methylethyl ketone that includes only one of its chemicalnames could miss items that refer to it by its othercommon synonyms, butanone or MEK. Includingall the possible terms for a substance or conceptwill yield the greatest number of results. Thedesired thoroughness of the results will dictatehow extensive the list of synonyms needs to beand how many resources will need to be con-

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139124

sulted. To retrieve a simple fact, like molecularweight or structure, one form of the chemicalname may be sufficient. Other searches will re-quire more complex combinations of terms.

4. Constructing a search statement

4.1. Boolean logic

Although each search utility has its own re-quirements, most allow words to be combinedinto a search statement using Boolean logic.Boolean logic uses the mathematical operators,‘and’, ‘or’ and ‘not’ to create very specific or verybroad searches. Using ‘and’ between your termswill yield a narrower set of results, because itrequires that more than one term be present in theitem. Using ‘or’ will broaden the results, becauseonly one of several possible terms needs to bepresent. ‘Not’, as the word implies, commands thecomputer to exclude any items containing thatword. Care should be used when using ‘not’,because there is always a danger that relevantitems may be excluded because they mention theundesired word in an unrelated context. In data-bases like Medline, that include a field for lan-guage, ‘not’ can often be used to eliminate itemsin other languages. On the Internet, it can alsosometimes be useful for eliminating items that usethe same term for an entirely different topic. Insome search tools, ‘+ ’ and ‘− ’ can be usedinstead of ‘and’ and ‘not.’

A good demonstration of the ways Booleanlogic can be used to formulate a search is whenseeking for information on heavy metals, a termthat has two entirely different meanings. Bysearching for ‘heavy metals OR heavy metal NOTmusic’, the search results should logically includeitems that mention either the singular or pluralforms of the phrase and exclude items on heavymetal rock bands. However, in practice, this isless than perfect, as not all heavy metal rockbands actually use the word ‘music’ in their web-pages. A search using AltaVista found 303 493websites on heavy metal, 129 856 on heavy metals,404 879 that mention either of the two phrases,and 262 177 that mention both phrases but do not

use the word music. Unless a comprehensivesearch for all items on all types of heavy metals isrequired, and an unlimited amount of time to siftthrough irrelevant items is available, this queryneeds refinement. ‘Heavy metals’ as a topic is toobroad, and the set of results is simply too large. Abetter strategy would be to reevaluate the ques-tion, adding more terms in order to make thesearch statement more specific. Using the Al-taVista advanced search, the query ‘heavy metalsOR heavy metal AND remediation AND NOTmusic’ retrieves 13 847 items. This is a significantimprovement, but is still too many pages for eventhe most thorough researcher to read. By addingthe term ‘pit lakes’ to the above search query, theresults are reduced to 20 websites, some of whichreally do discuss the remediation of heavy metalsin the pit lakes of mining operations. By thispoint, excluding the term ‘music’ from the resultshas become unnecessary, and it would be better toremove it from the search statement altogether.Although there are exceptions, using ‘not’ as away to exclude irrelevant items is usually a signthat the search statement itself is too general andneeds to be reexamined.

4.2. Stemming and truncation

Stemming refers to a process that some searchtools use when searching for terms. It automati-cally reduces the term to its root word, or stem,and then searches for common word endings. Forexample, if one of your search terms is ‘react’,Infoseek will search for ‘reacts’ ‘reacted’,‘reacting’ and other common endings. This sim-plifies searching, because including a list of allvariant forms in the search statement is not neces-sary.

Truncation is similar to stemming, and in thatit will search for word variants. The major differ-ence between the two is that truncation is con-trolled by the researcher, not by the computer. Itis usually indicated using ‘*’ at the point in theword where truncation is required. The lettersbefore the asterix must be present in the word, butany ending is permissible. For example, searchingfor ‘toxic*’ in AltaVista would return websitescontaining any form of the word toxic, including

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 125

toxics, toxicology, toxicologist, toxicological, andso on. Unfortunately, not all tools support stem-ming and truncation, and, sometimes, it will benecessary to use ‘or’ to format a query that in-cludes all variant word endings. Read each tool’shelp files to be sure.

4.3. Phrase searching

Many search utilities also allow terms to bedesignated as phrases, meaning that several wordsmust appear together in the page. This is usuallyaccomplished by enclosing the words in quotationmarks. For example, searching for ‘methyl chlo-ride’ as a phrase should only find items mention-ing that substance, and not items containingmethyl and chloride as unrelated terms, such asmethyl bromide and potassium chloride. This fea-ture can also be used when searching for propernames.

4.4. Proximity searching

An additional feature offered by a few searchtools is proximity searching, or looking for termsthat are close together in the page. For example,searching for ‘methyl chloride NEAR inhalation’in AltaVista will return items in which the twoterms are separated by no more than ten words.The assumption is that terms which are closertogether are more likely to be related, making thesearch more accurate.

4.5. Field limiting

Some search tools allow searching for terms ina specific area of the item, such as in the title orthe URL. Using this feature can be valuable whenlooking for very specific documents, the home-pages of organizations, or other items where thetitle or name is known. It can also be an indicatorof relevancy; a webpage with ‘mercury poisoning’in the title is likely to contain a gooddeal of information on the topic. This feature canbe useful when seeking a general overview of atopic, or when trying to limit the number ofresults.

4.6. Site characteristics

A few search tools offer options that allowsearching for specific webpage characteristics, likethe date, language, or domain name. This can beuseful for eliminating outdated items or materialsin languages other than your own. Searchingbased on the domain name looks at the URL ofthe item, which indicates the type of organizationpublishing the site. For example, URLs ending in‘.edu’ belongs to educational institutions, ‘.gov’indicates government websites, ‘.org’ is for organi-zations or associations, and ‘.com’ is used forcommercial companies.

4.7. Nesting

Nesting is another powerful feature in somesearch tools. Nesting allows you to build morecomplex search statements by enclosing terms inparenthesis. This helps to prevent ambiguity inthe search statement, and instructs the computeron how the terms should be grouped together. Intools where nesting is supported, the items inparenthesis are searched first, and then the otheroperations in the search statement are carried out.The more complex your search, the more valuablenesting becomes, and it can be a good way tosearch for all of the synonyms or forms of a word.For example, searching for ‘(inhalation OR in-haled OR inhaling) AND human AND (methylchloride OR chloromethane ORmonochloromethane)’ should return items dealingwith the human inhalation of methyl chloride,regardless of which variation of the name is used.If there were no parenthesis in the above state-ment, there is a danger that the computer couldmisinterpret which terms must appear together.Inhaling, human, and methyl chloride might all betreated as requirements — defeating the purposeof including the synonyms, and increasing thenumber of irrelevant items in the results.

Again, the individual help files for each searchutility will list the searching options that are andare not available, and the proper syntax thatshould be used to get the best results. These helpfiles provide vital clues about how search querieswill be treated, and using this information will

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139126

improve search efficiency. The goal is to maximizerelevant items while minimizing the number ofuseless websites in the list of results. This can bedone by combining keywords using Boolean logic,and other advanced options, into search state-ments that maximize the potential of each data-base and retrieve the most relevant items.

5. Search engines

Once the question has been evaluated, a list ofkeywords created, and preliminary search strategydeveloped, the actual searching can begin. Onecommon starting place is with search engines.Search engines are large databases of websitesthat have been compiled automatically by com-puters, with no attempt at organization or classifi-cation by topic.

Because the Internet is a telecommunicationsnetwork that allows many individual computersystems to link to each other, it can appear to bea seamless set of resources. This is an illusion.Each separate site is independently operated andmaintained, and hyperlinks are used to allowvisitors to jump from the originating site to otherrelated sites. Search engines work by exploitingthese links. Search engines compile their data-bases of webpages by using automated programs,frequently referred to as ‘robots’ or ‘spiders’, tobrowse the Internet. Each robot begins with onewebsite as its starting point, and follows everylink that leads from that site to other sites, addingpages to the database as it travels through sites.The process is then repeated at every new siteencountered. Because these robots are capable oftraversing through websites at a very fast rate,and many of them are sent out to different areasof the Internet, they can create a large database ofwebsites in a relatively short period of time. Thelarge size of search engines often makes themgood for locating very specific items, since thelarger the index the better the odds that thedesired item will be present.

There are a few limitations to this approach,however. By relying on links in order to reachnew sites, bias in the starting sites can affect thecomposition of the database. Also, new sites may

not be found by the robots because they often donot have as many links leading to them. Anotherdrawback is that the content of some sites, espe-cially databases, cannot be searched by robots.This is of particular concern for scientific re-searchers, because scientific data is frequently pre-sented in database format. It is easy to becomeoverwhelmed by the sheer amount of data con-tained in a large search engine. The savvy Internetresearcher uses combinations of words andphrases to help keep the results list manageableand relevant. Unfortunately, even this will notalways prevent the occasional useless websitefrom slipping into a results list. This is partly dueto the technologies used by search engines, andpartly due to website designers who have learnedto exploit the traits of search engines in order toincrease the visibility of their sites. When a queryis entered into a search engine, it searches itsdatabase of webpages for matches, and presentsthe results in a list ranked by relevance. This is anambiguous process, because a search engine’s ideaof relevance may be quite different from that ofthe searcher. It is further complicated by the factthat each search engine uses different algorithmsfor finding and ranking sites (Chakrabarti et al.,1999). In Google, for example, the results areranked by the number of sites that link to them.This is based on the theory that more authorita-tive sites will be linked to more often, in the sameway that important researchers are cited morefrequently by other authors. By contrast, the Al-taVista advanced search screen allows the user tospecify which terms the results should be sortedby. Others use a combination of techniques, in-cluding the number of times the terms are used inthe item, if all the terms are present, and theirposition within the text Table 1.

Some of the major search engines include:AltaVista, http://www.altavista.com.Excite, http://www.excite.comFast, http://www.alltheweb.com.Go (Infoseek), http://www.go.com/WebDir/.Google, http://www.google.com/.Hotbot, http://www.hotbot.com.Lycos, http://www.lycos.com.Northern Light, http://www.northernlight.com.

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 127

Tab

le1

Com

pari

son

ofse

arch

engi

netr

aits

Boo

lean

logi

c(o

rT

runc

atio

n/st

emm

Size

(in

mill

ions

)D

ocum

ent

trai

tsSe

arch

engi

neF

ield

limit

ing

Ran

king

base

don

Phr

ase

sear

chin

gan

dne

stin

git

seq

uiva

lent

)in

g(d

omai

n,da

te,

lang

uage

,et

c.)

Yes

Yes

Use

rsp

ecifi

edY

esA

nd,

or,

and

not

Alt

aVis

taY

es35

0P

age

cont

ent,

link

And

,or

,an

dno

tY

esY

esE

xcit

e25

0N

oN

opo

pula

rity

y34

0Y

esP

age

cont

ent

Yes

And

,no

tY

esF

ast

No

Go

(Inf

osee

k)P

age

cont

ent

50P

hras

eon

lyA

nd,

not,

and

not

Yes

Stem

min

gon

lyY

esP

age

cont

ent,

link

512

Phr

ase

only

No

No

And

No

Goo

gle

popu

lari

ty11

0Y

esP

age

cont

ent

Phr

ase

only

And

,or

,no

tY

esH

otB

otSt

emm

ing

only

Inkt

omia

Ove

r1

billi

on50

Yes

Pag

eco

nten

t;re

sult

sA

nd,

orY

esL

ycos

Phr

ase

only

No

orga

nize

dby

type

,e.

g.w

ebsi

tes

and

new

s

Pag

eco

nten

t,lin

kY

esA

nd,

or,

not

Yes

260

Nor

ther

nlig

htY

esY

espo

pula

rity

aIn

ktom

ipr

ovid

esth

eir

sear

chen

gine

and

dire

ctor

yto

othe

rco

mpa

nies

and

sear

chse

rvic

es,

such

asY

ahoo

!.B

ecau

seth

eda

taba

ses

are

sear

chab

leus

ing

diff

eren

tto

ols,

the

sear

chte

chni

ques

will

vary

(But

ler,

2000

;B

arke

r,20

00In

ktom

i,20

00;

Not

ess,

2000

;Su

lliva

n,20

00).

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139128

6. Meta-search engines

As was previously noted, most search enginescover only about 35% of the materials availableon the Internet, and even then there is overlapbetween the content of the individual tools. Usingmore than one tool will maximize the chances offinding the information that is sought. A special-ized type of search engine has been developed toaddress this need.

Meta-search engines are tools which allow theuser to simultaneously query several search toolsat once. However, they only spend a short time ateach database, and may only retrieve some of therelevant items in each one. In addition, becauseeach search tool has its own individual featuresand syntax requirements, searches submittedthrough a meta-search engine may not produceconsistent or relevant results. This makes meta-search engines more appropriate for testing key-words to see if they are retrieving the type ofinformation required, or for very simple searcheswhen time is crucial. They can provide a veryquick survey of what various search engines anddirectories have to offer, and give an overall ideaof what search tools may be most useful for thegiven topic. In some cases, especially for complexsearches, it may sometimes be better to querysearch engines individually to make the best useof search techniques to retrieve relevant items.

Examples of meta-search engines.Chubbahttp://www.chubba.comsearches AltaVista, Kanoodle, Infoseek,GoTo.com and Lycos.Copernichttp://www.copernic.com/a downloadable metasearch tool that queries 80search engines and eliminates duplicates.Dogpilehttp://www.dogpile.com/simultaneously queries three search engines at atime, and continues to search in sets of threeuntil at least ten resources are found. Usershave the option of searching the next sets ofthree even if more than ten items are found.Ixquickhttp://www.ixquick.com/

searches 14 other search tools, and the user hasthe option of selecting which will be queried.Advanced search techniques are translated tothe proper syntax for the search engines withcomparable search features. Duplicates are re-moved from results list.Metacrawlerhttp://www.metacrawler.com/index.htmlsimultaneously queries 15 search engines, andsupports some advanced search techniques. Al-lows results to be sorted by relevance, type ofsite, or by source of the citation.Search.com (SavvySearch)http://savvy.search.com/simultaneously searches several search enginesand returns results organized by type, e.g. web-pages, directories, and headlines. Offers asearch-within-search-results option to narrowresults list.MonsterCrawlerhttp://www.monstercrawler.com/searches seven search engines at once. Ad-vanced search at http://datamonster.com/ al-lows the user to select which of the 12 searchengines to query.

7. Directories

Directories are Internet-search tools that aremade up of resources that have been compiled,organized, and in some cases even annotated byhuman editors. They are frequently arranged intohierarchical categories, with websites on similartopics grouped together for easy browsing. Thiselement of oversight and classification helps avoidthe problem of site designers who overload pageswith keywords or use other devices in order toincrease the site’s visibility. Directories are usuallymore selective about what sites are included, andcan, therefore, be good resources for locatinggeneral overviews and authoritative sites. Brows-ing through the categories can also be a good wayto quickly review popular resources, or view a listof related organizations.

As Internet-searching technologies improve, itis likely that the distinctions between search en-gines and directories will begin to blur. In some

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 129

cases, this process has already begun. For exam-ple, the search engine, Google has added subject-browsing capabilities, and directories often alsosearch through robot-compiled databases to sup-plement their resources.

The following are some useful general directo-ries of Internet resources. They are not designedwith any one subject in mind, and include re-sources on virtually any topic as well as scientificand toxicological information.

Argus Clearinghouse, http://www.clearinghouse.net/. It indexes and ratessubject guides that identify, describe, and evalu-ate Internet-based information. It does not leaddirectly to sources and search full-text. It isgood for basic information, as sources are gen-erally high-quality, but searching on broadterms, like ‘chemicals’ produces better resultsthan using narrower concepts.Britannica’s Internet guide, http://www.britannica.com/, searches websites,magazines, books, and the Encyclopædia Bri-tannica completely. Websites are ranked by thesite editors and higher-ranked results are pre-sented first.Galaxy, http://www.galaxy.com/, is an ad-vanced search option which allows more so-phisticated queries. Toxicology information islocated at http://www.galaxy.com/galaxy/Medicine/Health-occupations/ Pharmacology/Toxicology.html.Infomine, http://infomine.ucr.edu/, specializesin scholarly resources, compiled by academiclibrarians. Resources are assigned Library ofCongress Subject Headings. It offers browsingby subject, keyword and title. Search featuresinclude truncation, phrases, field limits, andBoolean operators.Librarian’s index to the Internet, http://lii.org/,a searchable or browsable subject directory ofInternet resources evaluated and annotated bylibrarians. It allows Boolean logic, stemming,field limits and restrictions by category. Resultsare sorted by these categories — ‘Best of’,directories, databases and subject-specific re-sources. Science information is located at http://lii.org/search/file/science and health/medicineinformation (including toxicology) is at http://lii.org/search/file/health.

Open Directory Project Clearinghouse, http://dmoz.org//, a no-frills site containing a wealthof annotated sites organized by volunteer edi-tors. It aims to create the largest directory ofInternet sites on virtually all subjects, and isused as a source of data for other directoriesand search engines, including Yahoo! andGoogle. Search for specific terms or browse bysubject. Subjects are arranged hierarchically. Itclaims to index 1,908,874 sites and have 27 020editors. Science topics are listed at http://dmoz.org/Science/.World Wide Web virtual library, http://www.vlib.org/, compiled by volunteers, each re-sponsible for a portion of the directory. Due toits decentralized nature, currency and appear-ance are inconsistent. Science information islocated at http://www.vlib.org/Science.html.Yahoo!, http://www.yahoo.com/, allows phrasesearching, Boolean logic, and some other ad-vanced search syntax. Science topics are listedat http://dir.yahoo.com/Science/ and toxicologyinformation is located at http://dir.yahoo.com/Health/Medicine/Toxicology/.

8. Subject-specialized search engines, directories,databases, and websites

Scientific information is not always well-servedby the large, general search engines and directo-ries. As a reaction to this problem, subject-specificsearch tools have been developed. They usuallyoffer a combination of search engine and direc-tory features, and index only websites within theirarea of expertise. For example, Medical WorldSearch uses a database that includes only pagesfrom selected websites. This makes it a muchsmaller search engine, but it also means thatsearches retrieve very specific information, with-out including as many irrelevant links. This spe-cificity can sometimes also be a disadvantagebecause their scope of resources is not as broad,but for many searches they will provide the de-sired information with less frustration or informa-tion overload.

The universe of subject-oriented websites ismuch broader and less clear-cut than other search

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139130

tools. These sites are created specifically for Inter-net users looking for a particular type of informa-tion. They are more likely to contain their owndatabases, such as ChemFinder’s chemical prop-erties database, or ExToxNet’s Pesticide Informa-tion Profiles. In many cases, going directly to thesite is the only way to search through the infor-mation contained in these databases.

Like all other Internet-search tools, subject-based sites each offers its own individual searchtechniques which will vary from site to site. Some,like Toxilinks, have a very narrow scope (in thiscase, forensic toxicology) and offer only browsingas a way to access links. Others, like MedWeb-Plus, offer sophisticated subject directories linkingnarrower, broader, and related topics, or ad-vanced search features like those available on theOpen Directory Project. As with any other web-site, it is important to consider the source andorientation of these sites. Some, like MedWeb-Plus, ChemFinder, and SciCentral, are commer-cial ventures that make their money by providingadvertising space or selling products and services.This does not mean the information is less valu-able, but the researcher should be alert for poten-tial bias in any site.

The advantage of these sites is that they areable to tailor their information to a particularsubset of users who are interested in a specifictopic. Subject-oriented websites are more likely tocontain actual data, rather than simply lists oflinks (that sometimes lead to yet more links).

The following is a partial list of subject-special-ized sites providing access to toxicology informa-tion. Some are search engines or directories thathave index-only scientific sites, and others arecollections of data presented by an individual ororganization.

Agency for Toxic Substances and Disease Reg-istry, http://www.atsdr.cdc.gov/, includes theToxFaqs Fact Sheets, the HazDat Database,and other information related to hazardouschemicals and diseases. There is a browseablesite index and also a search engine for the site.BioBot, http://www.nbii.gov/search/biobot/search.html, designed to improve access to bio-logical information, allows users to search thenational Biological Information Infrastructure

website as well as retrieve biology informationon the Internet through other search tools, suchas AltaVista, Yahoo!, and BioLinks. It permitsBoolean and phrase searching, and users canopt to see all results, on only the best or fastestthree resources.Biocrawler, http://www.biocrawler.com/, is acombination search engine and directory of lifesciences information. It includes informationsheets on the resources listed in the directorythat lists keywords, the subject category the sitehas been classified in, and a list of pages thatcite the resource.Biocrawler List of Life Science Databases,http://www.biocrawler.com/cgi-bin/direc-tory?vkid=1, is a subcategory of theBiocrawler website that provides access to anarray of commercial and non-commercialdatabases.Bioethics Resources on the Web, http://www.georgetown.edu/research/nrcbl/scopenotes/sn38.htm, gives access to bioethicsand genetics information from the NationalReference Center for Bioethics Literature atGeorgetown University. It also links to directo-ries, journals, and other digital publications.Biolinks, http://www.biolinks.com/, is a combi-nation search engine and directory of scienceresources that allows the user to search all sites,or only those which have been indexed. It alsoincludes a browseable subject index, and exten-sive links to journal websites.BioMedNet Web Links, browses biomedicalsites by topic or use the search feature. Use ofthe site requires registration, and some areasrequire a fee or subscription. Sites are reviewed,annotated, and given a rating from one to fourstars, one being good, and four being indispens-able. It is one of the few sites that specificallylist resources by organism, such as mouse, rat,and human. The site also includes news aboutcurrent events, job postings, a bookstore, andlaboratory supplies, and a customizableinterface.ChemFinder, http://www.chemfinder.com/,provides a searchable database of chemicalstructures and properties, with links for addi-tional information on health effects, regulatory

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 131

oversight, or other topics when available. Thelist of indexed sites at http://www.chemfinder.com/siteslist.asp also containsa wealth of links to sites providing chemicalinformation. Although the chemical databaseand list of indexed sites are available to anyone,other areas of the ChemFinder site require asubscription or fee.Contaminant Hazard Reviews, http://www.pwrc.usgs.gov/new/chrback.htm, gives ac-cess to 34 detailed contaminant hazard reviewsmade available by the USGS Patuxent WildlifeResearch Center on common chemicals likemercury, lead, dioxin, and PCBs.Environmental Contaminants Encyclopedia,http://www1.nature.nps.gov/toxic/, an environ-mental toxicology encyclopedia from the Na-tional Park Service, includes entries for 118chemicals, and describes their effects on wildlifeand the environment.Extension Toxicology Network (ExToxNet),http://ace.ace.orst.edu:80/info/extoxnet/, fea-tures toxicology and pesticide information. Itincludes the University of California, Davis En-vironmental Toxicology Newsletter, pesticideinformation profiles, toxicology informationbriefs, and other information.The Genome Database, http://gdb-www.gdb.org/, provides access to the databaseof genetic information from the HumanGenome Project.Global Information Network on ChemicalsWeb Sites on Chemical and Safety Information,http://www.nihs.go.jp/GINC/webguide/in-dex.html, provides a browseable list of re-sources from the National Institute of HealthSciences, Japan. It features an extensive list ofinternational, government, and non-govern-ment organizations. There are links to chemicalsafety information; journals and publications;and a short list of other search tools and data-bases. Site is accessible in English or Japanese.Healthfinder, from the US department ofHealth and Human Services. Homepage, http://www.healthfinder.gov/. It includes informationon a wide range of topics of interest to con-sumers and health professionals. A short list ofthe available subjects is given on the home

page, but viewing or searching the full index isa better indication of the materials on the site.Links to other government sites are indicatedwith a flag, and annotations are provided in the‘Details’ links. The destination of links is notalways apparent from the title, forcing the visi-tor to view the details first or to simply followlinks wherever they lead. Two subject categoriesof interest to toxicologists are ‘Toxic Sub-stances’ and ‘Poisons’. The direct URLs tothese pages are given below.

Toxic substances, http://www.healthfinder.gov/Htmlgen/HFSrchFT.cfm?Keyword=866&ShowPg=ALL&Population=all&Resource=All.Poisons, http://www.healthfinder.gov/Html-gen/HFSrchFT.cfm?Keyword=669&ShowPg=ALL&Population=all&Resource=All.Healthfinder Links to Databases, http://www.healthfinder.gov/HTMLGen/HFKey-word.cfm?Keyword=DATABASE&ShowPg=ALL. An extensive list of databases on allaspects of health, from a variety of govern-ment and non-government sources. Links arearranged alphabetically.

Health Sciences Library System Internet Re-sources, University of Pittsburgh, http://www.hsls.pitt.edu/intres/, includes Internetguides and links of health resources organizedby topic. It is also available as an alphabeticallist.HealthWeb, http://healthweb.org/, a collabora-tive project by more than 20 health librariesprovides links to evaluated non-commercial,health-related Internet resources. It searches orbrowses the list of subjects. Due to thenature of the project, the actual pages ofsubject resources are hosted by different organi-zations, and the appearance differs in each.Toxicology links are located at http://www.medlib.iupui.edu/hw/tox/home.html.Insect Databases on the InsectWeb Server,http://insectweb.inhs.uiuc.edu/index.html, pro-vides access to several entomology databases,including insect pathogens.Karolinska Institute Medical Library, Sweden,http://www.mic.ki.se/Diseases/index.html, pro-

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139132

vides extensive list of medical, health, and sci-ence resources on the Internet. It includes awide array of topics, and resources organizedby MESH headings. It browses through nar-rower, broader, and related topics, or search forspecific terms.Librarian’s Index to the Internet, ToxicologyInformation, http://lii.org/search?query=Toxi-cology&subsearch = Toxicology&searchtype=subject, a searchable or browseable subject di-rectory of Internet resources evaluated and an-notated by librarians. Includes links todirectories, databases, and subject-specific re-sources. Not as extensive as other directories,but with an emphasis on food safety.Medical World Search, http://www.mwsearch.com/, offers full-text searchingof selected medical websites. Supports Booleanlogic, and uses a controlled vocabulary thesau-rus. It will also perform search in other sites,including Medline and AltaVista.MedlinePlus Poisoning, Toxicology, Environ-mental Health Topics, http://medline-plus.nlm.nih.gov/medlineplus/poisoningtoxicologyenvironmentalhealth.html, is a browseable listof topics related to poisoning, toxicology, andenvironmental health. Subjects include air pol-lution, lead poisoning, pesticides, poisoning,and others. Resources within each subject aregrouped by category, such as ‘generaloverviews’ and ‘children’.Medicinal and Poisonous Plant Databases,http://www.wam.umd.edu/�mct/Plants/, linksto databases of information on both medicinaland toxic plants.MedWeb, from the Emory Health SciencesCenter Library, http://www.medweb.emory.edu/MedWeb/default.htm, offers advancedsearching or browsing by subject. Toxicology islisted as a heading in the ‘Browse By Subject’,and the ‘Focus Farther’ option allows the visi-tor to view subtopics within toxicology, such asbiotechnology and endocrinology. Somesubtopics only have one or two resources, butothers contain many useful links.MedWebPlus, http://www.medwebplus.com/subject/Toxicology.html, searches for specificterms or browses by subject. It includes online

and print resources, organized using MESHSubject Headings. Sites are graded by availabil-ity, the highest grade being an ‘A’, with 90% ofrandom attempts to access the site successfully.Sites with less than 50% of successful connectsare usually not listed. Journals and organiza-tions/associations are well represented.Molecular Modeling Databases, http://cmm.info.nih.gov/modeling/databases.html,links to molecular structure databases, orga-nized by type of molecule, such as proteins,nucleic acids, and small molecules.National Biological Information InfrastructureMetadata Clearinghouse, http://www.nbii.gov/search/clearinghouse/, provides access to bio-logical data sets and information products frommany different sources inside and outside of thegovernment. The NBII website offers foursearch tools, they are the NBII website, theBioBot Search Engine, the NBII MetadataClearinghouse, and BioNews.National Center for Biotechnology Informa-tion, http://www.ncbi.nlm.nih.gov/, providesaccess to biomedical and genetic databases, aswell as PubMed.National Library for the Environment, http://www.cnie.org/. Presented by the NationalCouncil for Science and the Environment, thissite contains links to information on an arrayof environmental topics, including Congres-sional Research Service reports, national andinternational State of the Environment Reports,and an extensive list of researcher’s bookmarksorganized by type of source and topic.Open Directory Project, Toxicology Informa-tion, http://dmoz.org/Science/Biology/Toxicol-ogy/. Volunteer editors have created this list oftoxicology sites. The ‘see also’ links to otherareas of the directory provide useful additionalinformation.Pesticide Publications, Databases, Links, andOther Resources, from the Pesticide ActionNetwork of North America, http://www.panna.org/resources/resources.html, is anextensive list of reports, articles, guides, videos,databases and links to other resources related

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 133

to pesticides. Some anti-pesticide bias is evidentin links, but nevertheless does include manyvaluable resources. PANNA resources are listedfirst, followed by links to information producedby outside organizations.SciCentral, http://www.sciquest.com/cgi-bin/ncommerce3/ExecMacro/sci–index.d2w/report,features browseable subject index, links to con-ferences and associations, and current newsitems.Scorecard, http://www.scorecard.org/, containsdata on a variety of environmental issues, suchas air pollution, land contamination, and chem-ical releases. Although the site is provided bythe Environmental Defense group, the data isgenerally derived from government sources,such as the Environmental Protection Agency’sToxic Release Inventory.Society of Toxicology, http://www.toxicology.org/, provides news, events, and other profes-sional information for toxicologists.Toxicology Links from the UniversityMaastricht, Netherlands, http://www2.unimaas.nl/� farmaco/links/Links–Toxic/index.htm.Toxilinks, from the Society of Forensic Toxi-cologists, http://www.soft-tox.org/toxilinks/, fo-cuses on forensic toxicology. Links areorganized using pull-down menus based ontopic or organization. It includes links to re-lated government and academic sites, forensictoxicology journals, associations, and otherinformation.University of Iowa Hardin Library’s Directoryof Internet Health Sources, http://www.lib.uiowa.edu/hardin/md/. ‘We List theBest Sites that List the Sites’ is the motto onthis website. Don’t expect to find hard data onthis site Its purpose is to identify other largeand reliable sites than link to health-relatedinformation. Only sites with connection rates ofover 80% are listed.US Environmental Protection Agency, http://www.epa.gov/epahome/topics.html Browse forinformation by subject at this site, or use thesearch engine to find specific terms. The data-bases listed at http://www.epa.gov/epahome/comm.htm also allow searching for data bycommunity.

Virtual Library of Energy, Science and Tech-nology, http://www.osti.gov/. This site from theUS Department of Energy offers several collec-tions of resources, including the PubSciencedatabase of peer-reviewed literature, the DOEInformation Bridge of research and develop-ment reports, and the PrePrint Network ofprepublished information.Yahoo! Toxicology Resource Listing, http://dir.yahoo.com/Health/Medicine/Toxicology/,offers a short annotated list of toxicology web-sites, along with links to the following cate-gories within toxicology. Environmentaltoxicology, forensics, institutes, journals, orga-nizations, and schools, departments andprograms.

9. Evaluating and using sites

Once a search tool has presented a list ofpotentially relevant resources, the next step isdetermining which are useful, accurate, andtrustworthy. Sometimes this will be apparentfrom the title, but it may not always be obvious.This process of evaluating information on theInternet is not significantly different than thatused for gauging traditional print materials.The author’s credentials, the publisher’s author-ity, the age of the item, and the objectivityor bias evident in the writing all remain sig-nificant factors in judging information, regardlessof its format. Because anyone with acomputer and an Internet connection can postjust about anything on the Internet, authority is amuch larger issue than it is for journals andbooks, which commonly undergo a peer reviewprocess. Some Internet sites, especiallyonline journals, have instituted peer review pro-cesses, but unless a site clearly indicates that it hasbeen peer reviewed, it is safer to assumeotherwise.

This makes the question of who is presenting asite, their credibility, and the sources of theirinformation an extremely important issue. Whencritically evaluating a site, there are several keyitems to look for.

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139134

9.1. What is the origin of the information

It may not be the same as the website publisheror the page author. Is it cited? Are there anyexplanations for how the data was derived? Areany credentials given for the author or publisher?The site should leave no doubts as to the sourceof the data.

9.2. Who published the information

Look for an author or organization’s name onthe page; the authorship should be made clear andcontact information provided. Examining theURL can reveal a great deal of clues about a site.A page with a ‘.edu’ has been published on aneducational institution’s website. This does notnecessarily mean that the data is reliable, anymore than it means that information from a com-mercial website (.com) is unreliable. A ‘.gov’ sitehas been published by a government entity. If it isa long URL, try going back a directory level(indicated by the ‘/’ in a URL) for a perspectiveon the overall site.

9.3. Is the information current andwell-maintained

Look for a ‘last updated’ or ‘copyright’ date onthe page. Not all pages within the same site willhave the same date; look at each page’s date to besure it is still current. Are there any dead links onthe page? This may be a sign that the informationis outdated, forgotten, abandoned, or an indica-tion that the author is not perhaps as serious asone would wish.

9.4. What is the site’s purpose? Is there a bias orslant

Every organization or individual has goals anda purpose, and some are more straightforwardregarding their intentions than others are. Exam-ine the presentation and focus, read the missionstatement, and look for information that describesthe author’s aims and purposes. This will provideinsight regarding any bias that may or may not bepresent.

The page located at http://www.epa.gov/opptintr/exposure/docs/efast.htm can be used asan example to demonstrate these evaluation tech-niques. The page describes a downloadable soft-ware package, called the Exposure, FateAssessment Screening Tool (E-FAST), which pro-vides screening-level estimates of the concentra-tions of chemicals released to air, surface water,landfills, and from consumer products. At the topof the page the visitor immediately encounterstwo logos, one for the Environmental ProtectionAgency and another for the Office of PollutionPrevention and Toxics. The URL supports thefact that the page is indeed on the EPA website,since its domain is ‘epa.gov.’ The text of the pageindicates that some portions of the software haveundergone peer reviewing, and provides a name,telephone number, and address for more informa-tion. At the bottom of the page are links that tiethe page in with the rest of the website, and a lastrevision date. By going back two directory levels(to http://www.epa.gov/opptintr/exposure/) it isclear that E-FAST is part of a larger project todevelop exposure assessment tools and models.Going back yet another directory level at http://www.epa.gov/opptintr/, the office home page isencountered, where the whole organizationalstructure of the office can be explored, if desired.If a site leaves any of these questions unanswered,it may warrant further fact-checking, or searchingfor a more reliable source of information.

10. Citing internet resources

Once an Internet site has been located andevaluated, it may be used as part of a researchproject, an article, or other publication. As withprinted sources, different organizations have de-termined their own formats for Internet citations,and this process is still being sorted out. TheAmerican Psychological Association has pub-lished a guide to their recommended format athttp://www.apa.org/journals/webref.html, but notall organizations will require identical references.

Digital information is not fundamentally differ-ent from print resources: there is an author orpublisher, a title, a date, and a source. These

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 135

should all be essential parts of any reference. Thepurpose of a citation is to allow others to locatethe information again, and a scientist can reason-ably expect that a 1996 issue of Nature will al-ways have exactly the same words printed on page25. This is not necessarily true of information onthe Internet. Because data on the Internet fre-quently changes, it is important that the date thepage was viewed be included in the citation.

The general recommended APA format for anInternet document is — electronic reference for-mats recommended by the American Psychologi-cal Association (1999), Washington, DC,American Psychological Association. RetrievedNovember 20, 1999 from the World Wide Web,http://www.apa.org./journals/ webref.html.

For an online journal, the format would be —Jacobson, J.W., Mulick, J.A., & Schwartz, A.A.(1995). A history of facilitated communication:Science, pseudoscience, and antiscience: Scienceworking group on facilitated communication.American Psychologist, 50, 750–765. RetrievedJanuary 25, 1996 from the World Wide Web:http://www.apa.org/journals/jacobson.html.

11. Library catalogs

Despite the enormous growth of the Internet,libraries continue to play an integral role in thestorage and dissemination of information. Li-braries are valuable repositories of scientific data,both recent and old. This is especially importantbecause, while older data is frequently still rele-vant, it is not likely to be found anywhere on theInternet. Most libraries have online catalogs thatcan be searched directly through the Internet.This is useful for busy professionals who do nothave the time for wild-goose chases, but it can bea drawback. Librarians usually know their cata-logs and their collections very well. Do not hesi-tate to consult their expertise, even if you are notphysically in the library. Most will be willing tooffer advice for using the catalog, developingsearch strategies, and suggesting useful resourcesfor the topic.

As previously mentioned, library catalogs con-tain a record for every book in the collection. The

same basic search techniques that apply to theInternet are also relevant to library catalogs. Mostcatalogs allow searching by author, title, subject,or keyword, and limiting by various aspects, suchas date. Although the syntax may differ from onecatalog to another, the essential search mecha-nisms are very similar. A search for library mate-rials should be approached in fundamentally thesame way as an Internet search, by consideringthe question, scope of the desired results, and keyterms related to the topic.

One difference between library catalogs and theInternet is that library materials, unlike most web-pages, have been assigned descriptive terms by alibrarian. These descriptive terms are derivedfrom controlled vocabulary lists, or thesauri, thatare hierarchical sets of terms showing the relation-ships between broader, narrower, and related con-cepts. This allows searching a spectrum ofbroader and narrower levels of specificity. Thetwo most commonly used sets of controlled termsare the Library of Congress Subject Headings(LCSH) and the Medical Subject Headings(MESH) used by the National Library ofMedicine. Depending on the library in question,either set of headings may be encountered. How-ever, MESH headings are more likely to be usedin medical or health-related setting because theyhave been designed specifically for accurate anddetailed subject indexing of medical topics. Bothlists of terms are continually updated to incorpo-rate new topics and changing ideas.

Subject headings can be put to good use by thesearcher, because all materials on the same topicshould have similar subject headings. However,human inconsistencies and changes in the list ofheadings over time can result in similar itemsreceiving different headings. When using a librarycatalog, it is frequently helpful to look at thesubject heading that have been assigned to rele-vant items and use these to refine the search. Alsoremember that a library catalog is not a full-textsearch; only the item records are searched, andnot the full items themselves. When a librariancreates a record and adds it to the catalog, only afew subject headings are assigned. The generalrule of thumb is that about 20% of an item mustbe on a given topic in order for a subject heading

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139136

describing that topic will be assigned. An obviousdisadvantage to this approach is that a very rele-vant item will sometimes be missed if only a smallportion of the item is about the topic. To mini-mize this problem, it is better to start out with abroad search than a very narrow one. It is moreefficient to refine a comprehensive search than it isto expand a search that was very limited to startwith.

11.1. Library of congress subject headings

Library of Congress Subject Headings are con-structed of one, two, or three parts. These consistof a major heading or term, and one or twoadditional subdivisions that further categorize thetopic. For example, the subject heading ‘Air —Pollution — Health Aspects’ contains three sepa-rate terms combined to describe one concept, andeach one further narrows the definition. ‘Air’ isthe broadest term, and used alone is not veryhelpful, because it could include anything havingto do with air. Adding the subdivision ‘Pollution’makes it clear that items with this subject headingare about air pollution, not air flow, weather, orthe atmosphere. ‘Health Aspect’ indicates that theitem discusses the effects of air pollution onhealth.

This is just a sampling of some of the Libraryof Congress Subject Headings and Medical Sub-ject Headings that may be useful for toxicologyresearch. Many others exist, and they will varyfrom search to search. Use the keyword list devel-oped before searching as a starting point, andexamine the subject headings that have been as-signed to especially useful items in order to re-trieve other potentially relevant materials.

11.2. Library of congress subject headings relatedto toxicology

Behavioral Toxicology.Biological Monitoring.Chemical — Safety Measures (or other relevantsubdivision).Environmental Exposure.Environmental Health.

Environmental Monitoring.Environmentally Induced Diseases.Fetus — Effects of Chemical On (subdivisioncan be used with names of other organs, sys-tems, or organisms).Hazardous Substances.Health Risk Assessment.Industrial Hygiene.Industrial Toxicology.Lead — Physiological Effect (subdivision canbe used with other names of substances orchemicals).Medicine, Industrial.Mutagens.Neurotoxic Agents.Occupational Diseases (also search under thesubdivision ‘Diseases’ by occupational group).Occupational Mortality.Pesticide Residues.Poisons.Pollutants.Pollutants — Toxicology (subdivision can beused with other substances or chemicals).Pollution — Health Aspects.Printers — Diseases (subdivision can be usedwith other occupations or professions).Reproductive Toxicology.Risk Assessment.Solvents — Toxicology.Teratogenic Agents.Threshold Limit Values.Toxicity Tests.Toxicological Interactions (also search undersubdivision ‘Health Aspects’ or ‘Adverse Ef-fects’ by industry, process, or substance, e.g.‘Pesticides — Adverse Effects’).Toxicology.

11.3. Medical subject headings

A searchable and browseable list of the MESHterms is available at http://www.nlm.nih.gov/mesh/MBrowser.html and at http://www.ncbi.nlm.nih.gov:80/entrez/meshbrowser.cgi.Both of these two tools allow the user to searchand browse through the hierarchical structure, ortree, of the thesaurus in order to find broader,narrower, and related terms. The structure of

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 137

MESH headings is different from that of LCSHbecause the subject headings are not organized,grouped, or constructed in the same way.Whereas in LCSH an item will generally be as-signed about three subject headings, MESH al-lows as many headings as are necessary toadequately represent the item’s content. As canalso happen in Internet directories, concepts arenot always grouped together in the same way bytwo different resources. The groupings of subjectterms are often entirely different in MESH andLSCH, which will affect the way a search isrefined or broadened, depending on which tool isbeing used. The following is a partial list ofMESH headings of interest to toxicologists andother researchers (National Library of Medicine,2000).

11.4. Medical subject headings rele6ant totoxicology

Age Groups (subtopics include, Adult, Aged,Middle Age, Child, Adolescence, Child,Preschool, Infant, etc.).Air Pollutants (subtopics include types of pollu-tants, e.g. Air Pollutants, Environmental andAir Pollutants, Occupational).Body Burden.Carcinogenicity Tests.Conservation of Natural Resources.Cytotoxicity Tests, Immunologic.Disorders of Environmental Origin.Environment and Public Health.Environmental Exposure.Environmental Health.Environmental Illness.Environmental Medicine.Environmental Pollutants, Noxae, andPesticides.Environmental Pollution.Hazardous Substances.Hazardous Waste.Heavy Metal Poisoning, Nervous System.Industrial Waste.Inhibitory Concentration 50.Investigative Techniques.Irritants.

Lethal Dose 50.Maximum Tolerated Dose.Medical Waste.Mutagenicity Tests.Neurotoxicity Syndromes.No-Observed-Adverse-Effect Level.Occupational Diseases.Occupational Exposure.Occupational Health.Pesticides.Pharmacokinetics.Plants, Toxic (subtopics include individual spe-cies, e.g. Belladonna, Digitalis, and Hemlock).Poisoning (Subtopics include types of poison-ing, e.g. Lead Poisoning, Arsenic Poisoning orDrug Toxicity).Psychoses, Substance-Induced.Public Health.Radioactive Pollutants.Radioactive Waste.Risk Assessment.Risk Factors.Sewage.Soil Pollutants.Solvents.Substance-Related Disorders.Toxicity Tests.Toxicology.Toxins (subtopics include toxin types, e.g. Bac-terial Toxins, Cytotoxins or Neurotoxins).Waste Products.Water Pollutants.There are also common terms or subtopics

which may be combined with main terms using aslash, e.g. Waste Products/prevention and control.Not all of these subtopics are applicable to everyterm; so, using the MESH browser and examiningthe subject headings assigned to other relevantitems will provide useful information about addi-tional terms that could be used to refine a search.

administration and dosage.adverse effects.analysis.blood.chemical synthesis.chemistry.classification.economics.

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139138

history.immunology.isolation and purification.metabolism.pharmacokinetics.pharmacology.poisoning.prevention and control.radiation effects.standards.statistics and numerical data.toxicity.As a supplement to their printed collections of

journals and books, many libraries have alsoadded electronic resources. These can be subscrip-tion-based commercial database services, or mate-rials freely available on the Internet. The libraryshould not be overlooked as a resource for findingmaterials on the Internet. Some libraries havebegun adding records for documents available onthe Internet to their regular catalogs, and othersoffer access to databases, like Toxicology Ab-stracts, which indexes Internet materials. Manylibraries have also compiled ‘virtual libraries’ andsubject guides of Internet resources that the li-brarians have found to be useful and reliable. Forexample, the National Reference Center forBioethics Literature at Georgetown University of-fers the online guide ‘Bioethics Resources on theWeb’.

List of Useful Library websites.Environmental Protection Agency Online Li-brary System, http://cave.epa.gov.The catalog includes items in all of the EPAregional and laboratory libraries. It also pro-vides access to the catalog of EPA documentsavailable from the National Service Center forEnvironmental Publications, the EnvironmentalFinancing Information Network database, theNational Enforcement Training Institute data-base, and resources from the Subsurface Reme-diation Information Center. Search by title,keyword, author, report number, or call num-ber. Searches can be limited by library, andsophisticated searches can be formulated usingthe advanced search option (EnvironmentalProtection Agency, 2000).Library of Congress,

http://www.loc.govprovides several way to search the Library ofCongress catalogs. Visitors can search orbrowse by keyword, title, author or subjectusing several different interfaces. The commandsearching allows the most sophisticatedsearches. Extensive collection of current andhistorical resources. The website for the ScienceReading Room, which features science andtechnology materials, is located at http://lcweb.loc.gov/scitech/.National Agricultural Libraryhttp://www.nal.usda.gov/The Agricola system provides access to libraryholdings and the Journal Article Citation In-dex. Search or browse by keyword, author,title, subject, call number, or use the advancedsearch. Features a large collection of resourceson agriculture and allied disciplines, includinganimal, plant, and environmental sciences. TheNAL site also links to the Agriculture NetworkInformation Center (http://www.agnic.org/)that provides access to additional agriculture-related information.National Library of Medicinehttp://www.nlm.nih.gov/searches the library holdings or the biblio-graphic databases. LocatorPlus, at http://www.nlm.nih.gov/locatorplus/locatorplus.htmlis the main catalog of library materials. Thecatalog can be searched by keyword, title, au-thor, subject, or call number. Additional fea-tures include and advanced menu and akeyword combination feature that permitsBoolean operators, truncation, and nesting. Ex-tensive medical and health-related materials.For additional websites from medical and

health sciences libraries, the University of Iowa’sHardin Library offers links to library websites athttp://www.lib.uiowa.edu/hardin-www/hslibs.html.

12. Conclusion

There are any number of predictions about howthe Internet will revolutionize publishing, infor-mation sharing, and research, but it is safe to

L. Dassler Wuko6itz / Toxicology 157 (2001) 121–139 139

assume that, regardless of whether or not thesepredictions are accurate, the Internet will continueto be integral vehicle for scholarly communica-tions. The technology will no doubt evolve, but itmay be some time before search if tools are ableto catch up with the phenomenal growth of theInternet. In the meantime, researchers will do wellto familiarize themselves with the resources thatare available and the techniques to use themeffectively.

By preparing for a search, choosing the appro-priate resources, and capitalizing on their fea-tures, the scientist can improve the quality andquantity of information retrieved. As this paperhas explained, there are many techniques that canbe used to find the most relevant resources. Al-though the syntax and exact features may vary, aresearcher who masters these techniques canconfidently search almost any database, catalog,or other bibliographic tool.

References

American Psychological Association, 1999. Electronic refer-ence formats recommended by the American PsychologicalAssociation. Retrieved July 10, 2000 from the World WideWeb: http://www.apa.org./journals/webref.html.

Barker, J., 2000. Finding Information on the Internet: ATutorial. Retrieved June 16, 2000 from the World WideWeb: http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html.

Butler, D., 2000. Souped-up search engines. Nature 405, 112–115.

Chakrabarti, S., et al., 1999. Hypersearching the Web. Scien-tific American, June 1999.

DeWoskin, R.S., 1998. Toxicity summaries and data on theinternet: a baseline assessment on the transition from paperto electronic access. Toxicol. Lett. 95, 87.

Environmental Protection Agency, 2000. About the EPA On-line Library System (OLS). Retrieved June 16, 2000 fromthe World Wide Web: http://www.epa.gov/natlibra/aboutols.htm.

Gehanno, J.F., et al., 1998. Assessment of bibliographic data-bases performance in information retrieval for occupa-tional and environmental toxicology. Occup. Environ.Med. 55, 562–566.

Inktomi, 2000. Web Surpasses One Billion Documents. Re-trieved June 14, 2000 from the World Wide Web: http://www.inktomi.com/new/press/billion.html.

Lawrence, S., et al., 1999a. Accessibility of information on theWeb. Nature 400, 107–109.

Lawrence, S., et al., 1999b. Searching the web: general andscientific information access. IEEE Commun. 37 (1), 116.

Ludl, H., et al., 1996. Searching for information on toxicolog-ical data of chemical substances in selected bibliographicdatabases — selection of essential databases for toxicolog-ical researches. Chemosphere 32, 867–880.

National Library of Medicine, 2000. Fact Sheet: MedicalSubject Headings. Retrieved June 16, 2000 from the WorldWide Web: http://www.nlm.nih.gov/pubs/factsheets/mesh.html.

Notess, G.R, 2000. Database Overlap: Overlap of FiveSearches. Retrieved June 28, from the World Wide Web:http://www.searchengineshowdown.com/stats/over-lap.shtml.

Sullivan, D, 2000. Search Engine Sizes. Retrieved June 28,2000 from the World Wide Web: http://searchenginewatch.internet.com/reports/sizes.html.

.