Definition
A database is a collection of information organized so that a computer program can quickly retrieve desired pieces of data.
Database Fields
Pieces of information a user can access: author, title, journal name, abstract, descriptors, other
Database Fields
Fields may have attributes associated with them:
Numeric (e.g., accession number)
Textual (e.g., author name)
Database Records and Files
Record: a collection of fields that constitutes a complete set of information
File: a collection of records
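The field, record, and file hierarchy can be sketched in a few lines of Python. This is only an illustration: the `Record` class and its attribute names are assumptions made here, and the sample values come from record 101 used elsewhere in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """One record: a complete set of fields describing a single document."""
    accession_number: int                 # numeric field
    title: str                            # textual field
    abstract: str = ""                    # textual field
    descriptors: List[str] = field(default_factory=list)


# A "file" is simply a collection of records.
database_file = [
    Record(
        accession_number=101,
        title="The origins of Don Giovanni.",
        abstract="Discusses the history and sources Mozart used in his opera Don Giovanni.",
        descriptors=["Mozart", "Opera", "Historical Analysis"],
    )
]

# Retrieve one field from every record in the file.
titles = [record.title for record in database_file]
```

Here each attribute of `Record` plays the role of a field, and the list `database_file` plays the role of a file.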
Hypertext Database
The term "hypertext" was coined by Ted Nelson in the 1960s.
In a hypertext database, objects (text, pictures, music, and other media) are linked to each other.
Data Structure
A scheme for organizing related pieces of information.
Basic types of data structures: files, records, trees, tables
Trees
Data is organized in a hierarchical structure.
Each element, except a leaf, is attached to one or more elements directly beneath it.
Connections between elements -> branches
Elements at the bottom of a tree with no elements below them -> leaves
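A minimal sketch of a tree, assuming a hypothetical subject hierarchy that is not taken from the slides. The `Node` class and the two helper functions are names invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One element of the tree, with the elements directly beneath it."""
    name: str
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Return the names of all elements that have nothing below them."""
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def count_branches(node):
    """Count the parent-to-child connections in the whole tree."""
    return len(node.children) + sum(count_branches(c) for c in node.children)


# Hypothetical hierarchy: Music -> (Opera -> (Mozart, Verdi), Symphony)
tree = Node("Music", [
    Node("Opera", [Node("Mozart"), Node("Verdi")]),
    Node("Symphony"),
])
```

In this example `leaves(tree)` collects Mozart, Verdi, and Symphony, and `count_branches(tree)` counts four connections.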
Tables
Data is organized in rows and columns.
Example: Excel spreadsheet
Relational database management systems store data in the form of related tables.
The Aleph system is based on a relational database management system (Oracle).
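To make "related tables" concrete, here is a small sketch using SQLite through Python's standard `sqlite3` module. SQLite stands in for Oracle purely for convenience, and the table names, column names, and journal name are hypothetical.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Two tables related through journal_id.
conn.execute("CREATE TABLE journals (journal_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE articles ("
    "accession_no INTEGER PRIMARY KEY, "
    "title TEXT, "
    "journal_id INTEGER REFERENCES journals(journal_id))"
)

# Hypothetical journal name; the article title is the sample record from the slides.
conn.execute("INSERT INTO journals VALUES (1, 'Hypothetical Journal of Music')")
conn.execute("INSERT INTO articles VALUES (101, 'The origins of Don Giovanni', 1)")

# Join the two tables to rebuild one complete row of information.
rows = conn.execute(
    "SELECT a.accession_no, a.title, j.name "
    "FROM articles a JOIN journals j ON a.journal_id = j.journal_id"
).fetchall()
```

The journal name is stored once in `journals`, and each article row points to it by number, which is the basic idea behind relating tables.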
Dialog Database
Documents or surrogates are stored in a linear file.
The linear file is transformed into an inverted file.
Dialog Database Structure
Linear file: composed of document surrogates stored in the IR system in their full, native form.
Inverted file: composed of all words included in document surrogates, excluding stop words.
Linear File
Documents have to be searched in their entirety to locate the specific information needed.
Slow and inefficient
Inverted File
An index of the words in all documents is searched instead of the whole text of the documents themselves.
Faster and more efficient
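The difference between the two approaches can be sketched as follows. This is an illustration under simple assumptions, not Dialog's actual implementation: document 101 is the sample record from these slides, document 102 is a hypothetical second record, and all function names are invented here.

```python
documents = {
    101: "The origins of Don Giovanni. Discusses the history and sources "
         "Mozart used in his opera Don Giovanni.",
    102: "A hypothetical second record about relational databases.",
}


def words_of(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return [w.strip(".,").lower() for w in text.split()]


def linear_search(word, docs):
    """Linear file: read every document in full for every query."""
    word = word.lower()
    return [doc_no for doc_no, text in docs.items() if word in words_of(text)]


def build_word_index(docs):
    """Inverted file: map each word to the documents that contain it."""
    index = {}
    for doc_no, text in docs.items():
        for w in words_of(text):
            index.setdefault(w, set()).add(doc_no)
    return index


def inverted_search(word, index):
    """Look the word up directly instead of rereading the documents."""
    return sorted(index.get(word.lower(), set()))


# The index is built once and then reused for every query.
word_index = build_word_index(documents)
```

The linear search repeats its full scan for each query, while the inverted search pays the scanning cost once, when the index is built.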
Creation of Inverted File
A list of words in document surrogates is made.
Each word is numbered, including phrases and excluding stop words.
Words that are numbered are alphabetized (numbers precede letters).
Creation of Inverted File
Alphabetized entries are followed by the document number, the field (e.g., AB, DE), and the position of the entry in that field (e.g., 1 for the first word of the abstract).
Linear File: Example
101
The origins of Don Giovanni.
Discusses the history and sources Mozart used in his opera Don Giovanni.
DE: Mozart, Opera, Historical Analysis.
Inverted File
Origins 101 Ti 2
Don 101 Ti 4
Giovanni 101 Ti 5
Discusses 101 Ab 1
History 101 Ab 3
Sources 101 Ab 5
Mozart 101 Ab 6
Used 101 Ab 7
…
Inverted File
Mozart 101 DE 1
Opera 101 DE 2
Historical 101 DE 3
Analysis 101 DE 4
Historical Analysis 101 DE 3,4
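The steps for creating an inverted file can be sketched in Python using record 101. This is a simplified illustration, not Dialog's real indexing code: the stop-word list is an assumption, the field codes TI, AB, and DE are labels chosen here, terms are lowercased for sorting, and all function names are hypothetical.

```python
# Assumed stop-word list; the slides do not say which words are excluded.
STOP_WORDS = {"the", "of", "and", "in", "his"}

# Sample record 101 from the slides.
records = {
    101: {
        "TI": "The origins of Don Giovanni.",
        "AB": "Discusses the history and sources Mozart used in his opera Don Giovanni.",
        "DE": "Mozart, Opera, Historical Analysis.",
    }
}


def tokenize(text):
    """Split on whitespace, strip simple punctuation, and lowercase."""
    words = (w.strip(".,;:").lower() for w in text.split())
    return [w for w in words if w]


def build_inverted_file(records):
    """Return alphabetized (term, document number, field, positions) entries."""
    entries = []
    for doc_no, fields in records.items():
        for field_code, text in fields.items():
            position = 0
            # Only the descriptor field is split into comma-separated phrases.
            chunks = text.split(",") if field_code == "DE" else [text]
            for chunk in chunks:
                chunk_positions = []
                for word in tokenize(chunk):
                    position += 1  # stop words still occupy a position
                    if word in STOP_WORDS:
                        continue
                    chunk_positions.append(position)
                    entries.append((word, doc_no, field_code, (position,)))
                # Multi-word descriptors are also indexed as a phrase.
                if field_code == "DE" and len(chunk_positions) > 1:
                    phrase = " ".join(tokenize(chunk))
                    entries.append((phrase, doc_no, field_code, tuple(chunk_positions)))
    # Plain string sorting already places digits before letters.
    return sorted(entries)


inverted_file = build_inverted_file(records)
```

Because stop words are skipped but still counted, "origins" keeps position 2 in the title, and the descriptor "Historical Analysis" is entered both as two words and as one phrase at positions 3 and 4.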
Indexing
Words (keywords)
Every important word in a document is indexed.
"Information systems", for example, is indexed as two separate words and as a phrase:
Information
Systems
Information systems
Record Structure
The Dialog record structure shows every field followed by the information for that field.
Fields and structure vary among databases.
Demo of a Dialog record structure
Internet Protocols
HTTP
Telnet
File Transfer Protocol (FTP)
Secure FTP (SSH)
Web-based FTP (Volspace)
E-mail Protocols
SMTP (Simple Mail Transfer Protocol)
Used to send e-mail between servers and between a client's machine and a server over the Internet
E-mail is retrieved by using a POP, IMAP, or text-based e-mail client
POP (Post Office Protocol)
IMAP (Internet Message Access Protocol)
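As a rough illustration of the sending side, the sketch below builds a message and shows how it could be handed to an SMTP server with Python's standard `smtplib` and `email` modules. The addresses, the host name, and both function names are placeholders invented for this example, and `send_via_smtp` is defined but not called because it needs a reachable mail server.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble an e-mail message with the usual headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_smtp(msg, host, port=25):
    """Hand the message to an SMTP server (requires a reachable server)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


# Placeholder addresses; nothing is sent here.
message = build_message(
    "alice@example.org", "bob@example.org", "Test", "Hello from SMTP."
)
# send_via_smtp(message, "mail.example.org")  # hypothetical host, left commented out
```

Retrieving the delivered message would be a separate step using POP or IMAP, for which Python provides the `poplib` and `imaplib` modules.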