Presentations
Document Management Systems & OCR
– Market Overview
– Algorithm Introduction
Video on Demand
– Real Media
– Technology
Authoring Systems
– Macromedia Products
Presentations
Content Management
– Functionality
– Market Overview
– Opencms
Application Server
– Functionality
– Market Overview
Presentations
VRML
– Syntax Introduction
– Exercise
SMIL: Multimedia Synchronisation
– Syntax Introduction
– Exercise
Information System Classification
Expert Systems
Transaction Processing Systems
Office Automation Systems
Management/Executive Information Systems
Geographic Information Systems
Information Retrieval Systems
Expert Systems
Problem Solving
Artificial Intelligence
Replace an Expert
Multiple operational Implementations
Often Implemented using Prolog
Transaction Processing Systems
Records Events of interest to an organization
Supports the operational level of the business
High data volume
TPS applications
Manufacturing and Production
Sales and Marketing
Finance and Accounting
Human Resources
Management/Executive Information Systems
Analysis of TPS data
Higher Level Reports
Drill Down to detailed Information possible
Information Retrieval System
Manages Documents = Records of Information
Presents relevant Documents on a Query
Information Retrieval System Examples
POTS directory assistance
Library Catalog
World Wide Web Search Engine
Information Retrieval
Deals with the
– Representation of
– Storage of
– Organization of
– Access to
Information items
History
Early Example: Book's Table of Contents
Indices in libraries
Only recently automatic indexing
The Web
– Easy & cheap access
– Variety of sources
– Freedom of Publication, Interactivity
Data Retrieval vs Information Retrieval
Data Retrieval
– Exact match
– Looks for matching items
– Complete Query
– Data with well defined structure and semantics
Information Retrieval
– Best match
– Looks for Relevant Items
– Incomplete Query
– Natural Language Documents
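The contrast between exact match and best match can be illustrated with a small sketch. The records, documents, and the word-overlap score below are invented for illustration; real IR systems use more refined ranking functions.

```python
# Data retrieval: the query is complete and a record either matches or not.
records = [
    {"id": 1, "author": "AuthorA", "year": 1999},
    {"id": 2, "author": "AuthorB", "year": 2001},
]

def exact_match(records, **conditions):
    """Return only the records that satisfy every condition exactly."""
    return [r for r in records
            if all(r.get(key) == value for key, value in conditions.items())]

# Information retrieval: the query is incomplete, documents are ranked.
documents = {
    "d1": "video on demand streaming technology",
    "d2": "content management systems overview",
    "d3": "streaming video with real media",
}

def best_match(documents, query):
    """Rank documents by the number of query words they contain (toy score)."""
    query_words = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_words & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]
```

Note that `best_match` still returns documents that contain only some of the query words, whereas `exact_match` returns nothing as soon as one condition fails.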
Information Retrieval and the Web
IR originally: Text Indexing and Searching
The Web is a highly heterogeneous System with no common data model
Navigation is inefficient
Information Retrieval promises to structure information and ease fulfilling information needs
Usage: Information Retrieval
User has Information need
User translates this need into a machine-understandable Query
System retrieves relevant Information
Logical Views of a Document
Full text
Set of Index Terms
– Specified by human expert
– Text Operations: Elimination of Stopwords, Stemming, Compression
Intermediate Logical Views
Structure Recognition
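As a rough illustration of how text operations reduce the full text to a set of index terms, the sketch below removes stopwords and applies a deliberately crude suffix stripper. The stopword list and suffix list are assumptions made for this example; this is not the Porter stemmer or any standard algorithm.

```python
# Tiny stopword list, assumed for illustration only.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}
# Toy suffix list; a real system would use a proper stemming algorithm.
SUFFIXES = ("ing", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Lowercase, drop punctuation and stopwords, stem, and deduplicate."""
    words = [w.strip(".,;:!?()").lower() for w in text.split()]
    terms = {naive_stem(w) for w in words if w and w not in STOPWORDS}
    return sorted(terms)
```

The resulting terms are not always dictionary words (for example "libraries" becomes "librari"), which is acceptable as long as queries pass through the same operations.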
Retrieval Process
Figure: retrieval process with the components User Interface, Text Operations, Text Database, DB Manager Module, Index, Searching, Ranking
Operational Modes
Ad Hoc
– Fixed Database, changing Queries
Filtering
– Fixed Queries, changing Database
– User Profiles
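The filtering mode can be sketched as follows: the queries (user profiles) stay fixed while new documents arrive. The profile names and keywords are hypothetical, and matching on a single shared word is a simplification.

```python
# Fixed queries: each user profile is a set of keywords (hypothetical data).
profiles = {
    "alice": {"video", "streaming"},
    "bob": {"ocr", "documents"},
}

def route(document_text, profiles):
    """Return the users whose profile shares at least one word with the document."""
    words = set(document_text.lower().split())
    return sorted(user for user, terms in profiles.items() if terms & words)
```

In ad hoc mode the roles are reversed: the document collection stays fixed and each new query is evaluated against it.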
Linear List
Unsorted list of documents
Easy addition of files
Traversal required for a search
1        2        3        4        5        6        7
AuthorD  AuthorE  AuthorA  AuthorF  AuthorB  AuthorG  AuthorC
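A minimal sketch of searching a linear list: every entry may have to be inspected, while adding a document is a simple append. The order of the authors below is an assumption reconstructed from the slide's figure.

```python
# Unsorted list; positions are counted from 1 as in the figure (order assumed).
linear_list = ["AuthorD", "AuthorE", "AuthorA", "AuthorF",
               "AuthorB", "AuthorG", "AuthorC"]

def linear_search(entries, author):
    """Traverse the list from the start; return the 1-based position or None."""
    for position, entry in enumerate(entries, start=1):
        if entry == author:
            return position
    return None

def add_document(entries, author):
    """Adding is cheap: the new entry simply goes to the end."""
    entries.append(author)
```

In the worst case the search inspects all entries, so the cost grows linearly with the size of the collection.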
Sequentially Ordered File
Sorted by the values of a Key
Addition of documents more involved
Binary search possible
1        2        3        4        5        6        7
AuthorA  AuthorB  AuthorC  AuthorD  AuthorE  AuthorF  AuthorG
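The trade-off of a sequentially ordered file can be sketched with Python's standard `bisect` module: lookups use binary search, but each insertion has to find the right slot and shift later entries. The function names are chosen for this example.

```python
import bisect

# File kept sorted by its key (here the author name).
sorted_file = ["AuthorA", "AuthorB", "AuthorC", "AuthorD",
               "AuthorE", "AuthorF", "AuthorG"]

def binary_search(entries, key):
    """Return the 1-based position of key, or None if it is absent."""
    i = bisect.bisect_left(entries, key)
    if i < len(entries) and entries[i] == key:
        return i + 1
    return None

def add_document(entries, key):
    """Insert while keeping the order; later entries are shifted."""
    bisect.insort(entries, key)
```

Binary search needs about log2(n) comparisons, compared with up to n for the linear list.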
Inverted Indices
An index of all the words in the texts
Vocabulary
– Different Words in the text
– Little Space required after Text Operations
Occurrences
– Positions
– More Space required, ~30-40% of text size
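A minimal sketch of an inverted index with a vocabulary and occurrence lists, assuming that positions are character offsets and that words are runs of letters or digits. No stopword removal or stemming is applied here.

```python
import re
from collections import defaultdict

def build_inverted_index(text):
    """Map each distinct word (vocabulary) to the offsets where it occurs."""
    index = defaultdict(list)
    for match in re.finditer(r"\w+", text.lower()):
        index[match.group()].append(match.start())
    return dict(index)

index = build_inverted_index("to be or not to be")
vocabulary = sorted(index)
```

The vocabulary grows slowly with the text, while the occurrence lists hold one entry per word in the text, which is why they account for most of the space.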
Inverted Indices
Block Addressing
– Smaller Pointers
– References in one block are collapsed
– Online Search required for exact positions
– Fixed Size Blocks or Natural Cuts
Fully Inverted Indices
– For less readily accessible collections if exact position is required
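Block addressing can be sketched as follows, assuming natural cuts (each string in the list is one block). The index stores only block numbers, so several occurrences inside one block collapse into a single reference, and exact positions require scanning the referenced blocks.

```python
import re

def build_block_index(blocks):
    """Map each word to the numbers of the blocks that contain it."""
    index = {}
    for block_no, block_text in enumerate(blocks):
        # set() collapses repeated occurrences within one block.
        for word in set(re.findall(r"\w+", block_text.lower())):
            index.setdefault(word, []).append(block_no)
    return index

def exact_positions(blocks, index, word):
    """Online search: scan only the referenced blocks for exact offsets."""
    hits = []
    for block_no in index.get(word, []):
        for match in re.finditer(r"\w+", blocks[block_no].lower()):
            if match.group() == word:
                hits.append((block_no, match.start()))
    return hits

blocks = ["to be or not to be", "that is the question", "to sleep"]
block_index = build_block_index(blocks)
```

The pointers are small because there are far fewer blocks than character positions, at the price of the extra scan at query time.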
Boolean Model
Pro
– Easy to understand
– Precise Semantics of a query
Contra
– Binary Decision
– Difficult for users
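The binary decision of the Boolean model can be illustrated with set operations on posting sets. The document collection below is invented for this sketch, and only a single query shape (term AND NOT term) is wrapped in a helper.

```python
# Hypothetical collection: document id -> text.
DOCS = {
    1: "video on demand technology",
    2: "content management market overview",
    3: "application server market overview",
}

def postings(term, docs):
    """Set of document ids whose text contains the term."""
    return {doc_id for doc_id, text in docs.items()
            if term in text.lower().split()}

def and_not(include, exclude, docs):
    """Evaluate the Boolean query: include AND NOT exclude."""
    return postings(include, docs) - postings(exclude, docs)
```

A document is either in the result set or not; there is no ranking and no partial match, which is the "Binary Decision" drawback listed above.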