
The National University of Lesotho

Department of Mathematics and Computer Science

COMPUTER SCIENCE PROJECT

CS4403

2011/12, ACADEMIC YEAR.

Mathematics and Computer Science Digital Library System

MACSDL

June 25th 2012.

Compiled & submitted by:

1. Mosola, N.N – 200800142

2. Koali, M.S – 200800572

3. Senatsi, K.V – 200800535

Supervisor: Mr. L.Poulo


MATHEMATICS AND COMPUTER SCIENCE DIGITAL LIBRARY SYSTEM


ABSTRACT:

The world is evolving rapidly and the pace of technological change keeps accelerating. In an information age, free access to information is in high demand: people share what they have and obtain what they lack. With knowledge being shared so widely among individuals, that information needs to be managed. The Faculty of Science and Technology is one of the faculties of the National University of Lesotho (NUL), and it comprises several departments.

The Department of Mathematics and Computer Science (MACS) seeks an information management system in which large pools of information can be stored, accessed freely and managed adequately. A web-based MACS digital library (DL) system is the answer. MACS DL will manage shared digital information objects for students and lecturers to enhance learning at NUL. The system will advance the use of Information and Communication Technology (ICT) on the NUL campus, where both students and lecturers need information on a daily basis. Information retrieval will be at the heart of the system, backed by a repository of digital objects.

The system will be developed following the prototyping process model and the software life cycle. This document discusses the relevant steps taken while the system is under development.

ACKNOWLEDGEMENTS

Thanks to the Mathematics and Computer Science department at NUL. The project was indeed an eye-opener; many great lessons were learned from it, and they are surely ones to build on in the future. Working together on this project has made us a unit, and we hope to work together again. We were a great team, and with you a new 'Computing' era is born.

Big and ongoing thanks go to our supervisor and thesis advisor, Mr. Lebeko Poulo, for introducing us to the fascinating subject of 'Digital Libraries (DLs) and Information Retrieval (IR)'. Even in this four-credit-hour course, we learned more about DLs and IR systems than we could have learned in a lifetime in any other field of study. Thank you for giving us a chance to work on such a fascinating and challenging project! Our warm regards go to the student union in the MACS department; this project would not have been possible without their willingness to spare time for us during the requirements elicitation, testing and evaluation phases of this endeavor.


CHAPTER ONE Introduction

Building a digital library (DL) is inevitably an expensive and resource-intensive undertaking. Before embarking on such a project, it is important to consider some basic principles underlying the design, implementation and maintenance of a DL. The principles applied in building this project, hereinafter called MACS DL, do not apply to this endeavor alone; they are essential to building the large digital libraries we know today. Good examples are the ACM Digital Library (http://portal.acm.org/dl), the New Zealand Digital Library (http://www.nzdl.org/cgi-bin/library) and the National Science Digital Library (http://nsdl.org/), to mention but a few.

A digital library is “a focused collection of digital objects, including text, video, and audio, along with methods for access and retrieval, and for selection, organization, and maintenance of the collection.” See Witten, Ian and David Bainbridge (2002), How to Build a Digital Library, Morgan Kaufmann, p. 6.

Brief background of DLs

As the need to make information and resources globally accessible arose, digital library systems (DLS) were born. Their importance grew beyond that of traditional libraries because they digitally preserve collections of valuable resources and information on the Web for educational and research purposes. The basic idea was to create a web-based, easily accessible collection of digital information whose organization and management would be automated to address the inefficiency of traditional libraries. MACS DL is no exception: the principles used in its development follow the same route.

What is MACS DL?

MACS DL is an educational portal built for use at the National University of Lesotho (NUL), under the department of Mathematics and Computer Science (MACS) in the Faculty of Science and Technology (FOST), to enhance the mode of course delivery and provide facilities to academics in this faculty. MACS DL provides this NUL community with services such as file sharing, document browsing, searching of textual materials, storage of an unlimited number of digital objects on the server for current and future use, and information retrieval (IR).

Motivation

The higher education industry in Lesotho is experiencing an unprecedented growth rate. This

trend is largely a result of new enabling technologies that have facilitated the virtual delivery of

academic programs. This has in turn led to libraries becoming key success factors in the virtual

academic environment.

As students at NUL, we have observed that the well-known Thomas Mofolo library on the NUL premises is not adequately equipped to serve students, researchers and NUL staff in general. With that in mind, we aim to promote, support, manage and disseminate high-quality research, development and innovation in information, library and related fields.

Project aims

We aim to encourage and facilitate the development of information strategies in higher education communities such as NUL. The main reason for building a digital library system is to give multiple users around the NUL campus unlimited, free and remote access to information.

Problem Statement

The National University of Lesotho (NUL) has a vision to be a leading African university. In the Faculty of Science and Technology (FOST), the department of Mathematics and Computer Science (MACS) has a vision to facilitate learning and to give both students and lecturers a better way of managing and conducting their academic work. Currently, MACS does not have an academic portal that manages textual digital objects to enhance learning. Students and lecturers rely on internet search engines such as Google and Yahoo for any academic material they need. The MACS department requires a more direct digital library that encourages file sharing, for easy access to the materials used in the department.

Proposed solution

The proposed solution is a well-managed digital library system that will serve as a repository of rich, high-demand information contributed by students and lecturers in the MACS department. Our hope is that this will increase the availability of student research for scholars, empower lecturers and students to conduct research, and advance digital library technology worldwide. MACS DL shall be a repository that archives textual objects for current and future reference, to enhance learning at NUL and provide free access to information. The fundamental reason for building a digital library for the MACS department at NUL is the belief that it will deliver information better than was possible in the past.

Why a [MACS] DL?

The advantages of DLs include, but are not limited to, the following:

DLs bring the library closer to users: Information is more easily accessible, which increases its usage. This is very different from a traditional library, like Thomas Mofolo, which users must physically visit.

Searching and browsing capabilities: Computer systems are better than manual methods for finding information. DLs offer efficient and advanced search, information retrieval and browsing techniques that let users search for the information they need and browse the results with relative ease.

Information sharing: Placing digital information on a network makes it available to everyone. A MACS DL maintained on the NUL site will be a vast improvement over the expensive physical duplication of little-used material, which is sometimes inaccessible without travelling to the location where it is stored, such as the Thomas Mofolo library.

Availability of information: MACS DL's doors will never close; its collections can be used when library buildings (i.e. the Thomas Mofolo library) are closed. Materials are never checked out, misplaced, or stolen!

Project Plan

The MACS DL system is subdivided into two major parts, namely:

The DL: This is by and large the more important of the two. The DL is a focused collection of digital objects, organized and maintained in a proper manner. It will contain a pool of electronic versions of books and journals.

Search engine: This will handle information retrieval (IR) and file indexing (FI).

The plan is to have a successfully implemented DL with an incorporated search engine that enables users of the DL to retrieve the information they require. Upon successful completion of these two parts, the system will be deemed to have met the users' requirements, discussed later in this document.


CHAPTER TWO

System Requirements Specification (SRS)

System functions and purpose:

The system is a managed digital library portal for higher education and research purposes

to be used at NUL under FOST, in the MACS department by both students and lecturers.

MACS DL manages textual collections. The system allows intended users to upload

materials to the server, browse through the collection, sort the collection, search for any

material on the server, and download material from the collection.

Hardware and Software requirements:

1. Hardware:

Computers with a minimum secondary disk space of 20GB and primary memory

of 128MB.

2. Software:

Apache Tomcat web server, MySQL database server and a Java Integrated Development Environment (IDE).

Performance specification:

Through the choice of data structures and algorithm design, each module/function of the system has been optimized so as not to burden the processor with prolonged processing.

User interface (UI):

The system will provide interactive interfaces that are easy to learn and use, so that it can be used effectively and efficiently. The system UI follows the basic principles and design guidelines of human-computer interaction.

System data:

Any data captured by the system, e.g. users' information, is stored in relational database schemas with normalized objects to conform to data integrity rules and consistency. The system provides tight information security measures so that only users with credentials can access the system data.

System design constraints:

Imposed on the system design, MACS DL only manages textual digital objects. The

system uses English language only, bearing in mind that the intended users are academics

and can actually understand the language.


Requirements Engineering (RE)

Requirements engineering establishes a solid base for the design and construction of any system. Without it, the resulting software would have a higher probability of not meeting users' needs. To build elegant software that actually solves users' problems and meets the SRS described earlier, the developers conducted an extensive study around the NUL campus to gather views from the targeted/potential users. The following RE steps were followed:

Inception: This is where the scope and nature of the system were defined.

1. Scope: A web-based digital library system that manages textual objects.

2. Nature: The system is an academic portal helping students and researchers to share materials and search for any texts on the server.

Elicitation: This step helps to define what is actually required. Interviews with students and lecturers in the NUL MACS department were conducted to help the developers elicit the users' requirements, identify the problem properly and propose elements of the solution. The following diagrams were used to elicit the users' requirements.

Figure 2.1 Use Case scenarios

| Use Case Number | Use Case Name | Use Case Description |
|---|---|---|
| 1 | Browsing | Accessing subsets of data by categorical classification, e.g. browse by author name, alphabetical order, title, date, etc. |
| 2 | Searching | Indexing, information retrieval and querying |
| 3 | Annotate | Adding commentary, generalization and reviews |
| 4 | Upload/Submission | Adding new digital objects to the DLS |
| 5 | Download | Saving a digital object to a local storage media |


Fig 2.2 Use case diagram

[Diagram: the CLIENT actor is linked to the use cases SEARCH, BROWSE, SUBMIT, ANNOTATE and DOWNLOAD.]

Elaboration: The basic requirements gathered were refined and modified to suit the design of the system under development; as a result, an analysis model was produced.

Negotiation: The priorities of the system were clarified. For example, as one of the priorities, the system must be able to upload documents to the server, enable information retrieval and download material from the server. During this step, different approaches to solving the identified problem were proposed, and a preliminary set of solution requirements was negotiated amongst the developers.

Specification: From the elaboration and negotiation steps, a detailed specification of the system was developed, as enough information had been gathered.

Validation: Working iteratively, with prototyping as the process model, the developers visited users of the system frequently to make sure that what was being developed conformed to what the users required.

Management: Throughout the project's life cycle, changes to the initially gathered requirements were managed well, as the prototyping model allows iteration of the steps performed.


As a result of the above RE steps, prototypes were built as an end product.

To translate users' needs into technical requirements, the development team used Quality Function Deployment (QFD), which emphasizes what is valuable to the users, and identified the following three types of requirements:

Types of Requirements identified

1. Normal requirements: These were stated by the users during the interviews conducted, and gave the developers an understanding of what should be developed.

2. Expected requirements: These were not explicitly mentioned by the users but were identified by the developers, e.g. ease of searching.

3. Exciting requirements: These were identified by the developers and go beyond the users' expectations.

Requirements gathering Techniques used

A number of techniques were used to gather requirements from the target population. The

following were of great importance in the requirements gathering phase:

Stratified sampling: A small group of students in the MACS department was chosen to represent the entire MACS student union. In the communication and planning phase of the prototyping model followed by the developers, ten (10) students were sampled.

Observing users: Sampled students were observed as they carried out their daily activities, using internet search engines such as Google and Yahoo to find the material they required. The sampled students were also observed as they used the MACS DL search engine.

Interviews: Face-to-face interviews were conducted with the sampled students. Initially, a pilot study was carried out to make certain that the methods proposed by the developers were viable and that, in the long run, the solution would be appreciated. As the developers needed concrete answers, and proof for future reference that the interviews had been conducted, a video was recorded of students using the MACS DL prototype. This video is in the possession of the developers and shall be made available to the supervisor.


CHAPTER THREE Project Risk Analysis and Management

SWOT analysis was used to determine the strengths, weaknesses, opportunities and threats of

this project. The following table depicts the outcomes of this extensive risk assessment.

Table 3.1 SWOT analysis

| | Strengths | Weaknesses | Opportunities | Threats |
|---|---|---|---|---|
| 1. | Skilled project team members in programming web based IR systems | Apache Tomcat web server storage | New technology | Existing DLs and search engines, such as Google scholar, 4Shared, etc |
| 2. | Availability of required resources | Not enough metadata can be found in digital objects | Integration with external search engines, such as Google | Web resources |
| 3. | New technology facilitating information sharing | Computer illiterate end users | Search engine development | Information overload |

A thorough study in assessing the risks related to embarking on a project of this nature was

conducted by the developers and the above table shows the results of risk analysis using SWOT

analysis.

Other software engineering methods of risk assessment, management and mitigation were employed to analyze the uncertainties that could put the project at risk. These involved:

Identifying technical risks for the MACS DL project

Identifying technology risks for the MACS DL project

Identifying staff risks for the MACS DL project

From the above, the development team came up with the following risk analysis table:


A scale of 1 to 5 was used to estimate the impact of the risk on the project, and the following

mappings were concluded:

1 = catastrophic, 2 = critical, 3 = marginal, 4 = negligible, 5 = low

Table 3.2 Risk Table

| Risk | Management | Mitigation | Monitoring | Impact |
|---|---|---|---|---|
| Supervisor not readily available on campus | Use groupware systems to contact supervisor | This risk was inevitable (i.e. unavoidable) | Fortnightly progress reports must be sent to the supervisor | 2 |
| Ambiguous project scope | Refine project scope | Request clear definition of scope | Brain-storming sessions | 1 |
| Developers not familiar with technology used | Seek related sources from supervisor and specialists | Quickly learn how to use the technologies required | Project progress report | 3 |
| Requirements change | Elicit requirements, build prototypes | Iteratively refine the requirements to track changes | Perform requirements engineering | 3 |
| Project team member drops out of the project | Ensure timely and consistent check-ups on team members | Meet with team members regularly to discuss the project | Measure effectiveness of mitigation, e.g. ensure that every member is doing some work on the project | 5 |

The risks identified were then assessed using the methods described above. In a round-robin fashion, the developers each assigned an impact value to every risk until agreement was reached; the agreed values are shown in the tables above.

A project is at risk whenever an event is identified that could dent its schedule. The schedule of this project was affected by some of the risks identified above; for example, in the requirements elicitation (RE) phase, numerous iterations over the identified requirements were unavoidable.


CHAPTER FOUR Object Oriented Analysis Design (OOA)

1. Class Responsibility Collaborator (CRC) Modeling

CRC modeling provides a simple means of identifying and organizing classes. NB: CRC

modeling is not an official part of Unified Modeling Language, but a collection of index cards

that represent classes. Using this modeling, developers were able to identify potential classes that

they could use as the building blocks of the MACS DL system.

Benefits identified

Portability: No computers are needed as CRC can be used anywhere, during the

brainstorming sessions.

Tangible: They allow participants to experience at firsthand how the system will work.

Limited size: Index cards can only hold a limited amount of information compared to

class diagrams. This enforces a high-level analysis.

Fig. 4.1 Class Responsibility Collaborator (CRC) Cards

Class Name: Searcher
Class Type: Internal entity
Responsibilities: Generates Query; Filters Query; Locates query; Retrieves Results
Collaborators: Inverted File, Retriever, Ranker

Class Name: Inverted File
Class Type: External Entity
Responsibilities: Insert Data into an Index; Maintains Index
Collaborators: Searcher


Class Name: Retriever
Class Type: External Entity
Responsibilities: Displays ranked results
Collaborators: Ranker, Searcher

Class Name: Ranker
Class Type: External Entity
Responsibilities: Ranks data
Collaborators: Retriever, Searcher

Class Name: Browser
Class Type: External Entity
Responsibilities: Retrieving data from links
Collaborators: Searcher, Retriever

Class Name: Downloader
Class Type: External Entity
Responsibilities: Copies data to local storage
Collaborators: Searcher, Retriever


Class Name: Digital Library
Class Type: External Entity
Responsibilities: Processes a given request
Collaborators: Searcher, Browser, Downloader, Up-loader

Class Name: Up-loader
Class Type: External Entity
Responsibilities: Copies digital objects from local storage into the collection
Collaborators: (none listed)

2. Class Diagrams

The system consists of the following classes, depicted in class diagrams.

Fig 4.2 Class diagrams

SEARCHER
  -SearchQuery : string
  -Results : string
  +GenarateQuery() : string
  +FilterQuery() : string
  +LocateQuery() : void
  +RetrieveResults() : string

Inverted_File
  -Query : string
  -Size : int
  +InsertData() : void
  +Maintains() : void

RETRIEVER
  -Data : string
  +RankResults() : void
  +DisplayResults() : void

Ranker
  -Data : string
  +RankData() : void

BROWSER
  -Link : string
  +BrowseLink() : string

DOWNLOADER
  -File : string
  +DownloadFile() : void

UPLOADER
  -Data : string
  +CopyFile() : void

DIGITAL_LIBRAY
  -Request : string
  +ProcessRequest() : void

3. Data flow diagram (DFD)

DFDs show how data is captured as input, transformed by the processes and output as results to the users.


Fig 4.3 DFD

[Data flow diagram. External entity: CLIENT. Processes: BROWSE, SEARCH, SUBMIT, ANNOTATE, DOWNLOAD, INDEX, RETRIEVE. Data store: DATABASE SERVER. Labeled flows: a request for a digital object; query/results; a digital object; new digital object; indexed object; comment(s) on digital object; feedback; download request.]

CHAPTER FIVE Data Structures and Algorithms


The following data structures were used in the development of the DL. The developers studied a range of candidate data structures and judged the following to be the best fit:

Hash Table: This data structure stores all index terms. Each hash table location references the posting list for a specific index term.

Why Hash Table?

A hash table provides efficient searching: finding the posting list for an index term takes O(1) time on average.

Posting list (implemented using a linked list)

This is a linked list in which each node encapsulates a term frequency (the number of times the term appears in the document) and a document id (the document filename). A new node is added to the list every time a document containing the term is indexed. This operation runs in O(1) time.
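The two structures described above can be sketched in Java as follows. This is a minimal illustration, not the MACS DL source: the class and method names (InvertedIndexSketch, Posting, addOccurrence, postingsFor) are hypothetical, and the sketch assumes documents are indexed one at a time, so the node for the current document is always the last node of a posting list.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

final class InvertedIndexSketch {
    /** One posting-list node: a document id and the term's frequency in that document. */
    static final class Posting {
        final String documentId;
        int termFrequency = 1;
        Posting(String documentId) { this.documentId = documentId; }
    }

    // Hash table from index term to its posting list (hypothetical layout).
    private final Map<String, LinkedList<Posting>> table = new HashMap<>();

    /**
     * Records one occurrence of term in documentId. Assumes documents are indexed
     * one at a time, so the current document is always the last node of the list.
     */
    void addOccurrence(String term, String documentId) {
        LinkedList<Posting> postings = table.computeIfAbsent(term, t -> new LinkedList<>());
        Posting last = postings.peekLast();
        if (last != null && last.documentId.equals(documentId)) {
            last.termFrequency++;                       // same document again: bump the frequency
        } else {
            postings.addLast(new Posting(documentId));  // O(1) append of a new node
        }
    }

    /** Average O(1) hash lookup; returns an empty list for an unknown term. */
    LinkedList<Posting> postingsFor(String term) {
        return table.getOrDefault(term, new LinkedList<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addOccurrence("library", "notes.pdf");
        index.addOccurrence("library", "notes.pdf");
        index.addOccurrence("library", "thesis.pdf");
        for (Posting p : index.postingsFor("library")) {
            System.out.println(p.documentId + ";" + p.termFrequency);
        }
    }
}
```

Appending at the tail keeps insertion constant-time, while the hash lookup keeps the cost of finding a term's posting list independent of the number of distinct terms.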

Algorithms:

Indexing

0. Read the index object from disk.
0.1. Extract the entire text from the given document.
0.2. Break the text into tokens/terms.
0.3. Filter the stop-words from the terms.
0.4. Stem each term, applying the stemming process.
0.5. For each stemmed version of the term:
  Begin
  0.5.1. If the term does not exist in the hash table
    0.5.1.1. Store the term in the hash table.
    0.5.1.2. Create a corresponding posting list for the term, with a first node for this document.
  0.5.2. Else
    0.5.2.1. Add a node to the term's posting list.
  End
0.6. Save the index object to disk.

Searching and Ranking

1. Read the index object from disk.
1.1. Break the user query into tokens/terms.
1.2. Stem each term, applying the stemming process.
1.3. For each stemmed term:
  Begin
  1.3.1. If the term exists in the index
    1.3.1.1. Retrieve its posting list and compute its weight in relation to the query vector Q and all document vectors (D1…Dn), where n is the number of documents in the collection.
  1.3.2. Else
    1.3.2.1. The term weight is zero.
  End
1.4. For each document:
  Begin
  1.4.1. Compute the score (using the ranking formula).
  End
1.5. Sort the documents according to their score.
1.6. Return the sorted documents (the document with the highest score is the most relevant to the given query).
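The searching and ranking steps can be illustrated with the sketch below. The ranking formula is not given in this part of the document, so the sketch assumes a common choice: each query term contributes tf × (1 + ln(N/df)) to a document's score, where tf is the term frequency in the document, df is the number of documents containing the term and N is the collection size. This is not necessarily the formula used by MACS DL. For brevity, posting lists are represented as maps from document id to term frequency, and the name RankingSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RankingSketch {
    /**
     * Scores documents against the (already stemmed) query terms.
     * index maps term -> (document id -> term frequency).
     * Assumed weight: tf * (1 + ln(N / df)); not necessarily the MACS DL formula.
     */
    static List<Map.Entry<String, Double>> rank(
            Map<String, Map<String, Integer>> index,
            List<String> queryTerms,
            int totalDocuments) {
        Map<String, Double> scores = new HashMap<>();
        for (String term : queryTerms) {
            Map<String, Integer> postings = index.get(term);
            if (postings == null || postings.isEmpty()) {
                continue; // unknown term: its weight is zero
            }
            double idf = 1.0 + Math.log((double) totalDocuments / postings.size());
            for (Map.Entry<String, Integer> posting : postings.entrySet()) {
                scores.merge(posting.getKey(), posting.getValue() * idf, Double::sum);
            }
        }
        // Highest score first: the most relevant document leads the list.
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(scores.entrySet());
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return ranked;
    }
}
```

Only documents that appear in at least one posting list receive a score, so the cost of a query depends on the lengths of the posting lists touched rather than on the size of the whole collection.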

System Design and Engineering

This section discusses how the system was engineered. By this point the developers had the requirements from the RE phase, the classes to implement from the OOA phase, and the data structures and algorithms to use; what remained was to design and engineer the system to meet those requirements.

Indexing Documents

Overview:

Searching, indexing and ranking techniques are at the core of the implementation of this work. This chapter discusses the efficiency of the algorithms used for searching, indexing and ranking documents. Indexing extracts terms from a document when it is uploaded to the server, to indicate what the document is about or to summarize its content. The extracted terms are placed in an inverted index (inverted file) data structure. Searching means posing a query and awaiting results from the digital library (DL) system. Information retrieval is the process of identifying the most relevant information that satisfies the given search query.

The point of using an index is to increase the speed and efficiency of searches over the document collection. Without an index, every search would have to scan the collection sequentially, so search time would grow with the size of the collection. An inverted index contains two parts: an index of terms, generally called the term index, which stores the distinct terms found in the document collection; and, for each term, a posting list, which is simply a list of the documents that contain the term. When documents are submitted to the DL system, punctuation is removed, all terms are converted to lower case, and stop words are removed. Stop words are terms with little information content, e.g. conjunctions. This strategy is discussed in depth later in this document.
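The preprocessing just described (removing punctuation, lower-casing, dropping stop words) might look like the following sketch. The stop-word list here is a small hypothetical one, since the list used by MACS DL is not given in this document, and the name TokenizerSketch is likewise an assumption. The sketch also treats any character other than a basic Latin letter or digit as a separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class TokenizerSketch {
    // Hypothetical stop-word list; the list used by MACS DL is not given in this document.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "of", "or", "the"));

    /** Lower-cases the text, strips punctuation and removes stop words. */
    static List<String> terms(String text) {
        List<String> result = new ArrayList<>();
        // Splitting on runs of non-alphanumeric characters discards punctuation and whitespace.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [library, index]
        System.out.println(terms("The Library, and the Index!"));
    }
}
```

In a full pipeline the resulting terms would then be passed to the stemmer before being stored in the inverted index.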

Key terms: Information retrieval (IR), Inverted Index (II), ranking, stop words, stemming, term weight, posting list.

Suppose there are two documents, D1 and D2. D1 has the following contents: "Mathematics and Computer Science department", whilst D2 contains: "Department of Social Science". Table 5.1 shows the resulting inverted file.


Table 5.1 Inverted file structure analogy

| Term | Document; Term frequency |
|---|---|
| Mathematics | 1;1 |
| Computer | 1;1 |
| Science | 1;1, 2;1 |
| Department | 1;1, 2;1 |
| Social | 2;1 |

Inverted Index architecture

Fig 5.2 Inverted Index architecture.

Indexing documents

A document is uploaded through an interface to add it to the collection. The Index Builder class is instantiated with the document name. The document is then indexed using the indexDocument method, which has the Text Extractor instance extract the text from the document, break it into tokens and filter out the stop words. The Stemmer instance stems the tokens. The Inverted Index class is then instantiated to store the stemmed terms in a hash table, and a posting list is created for each term. Together, these steps build the inverted index.

[Fig 5.2 diagram labels: Index Builder, Document, Text Extractor, Stop words list, Stemmer, Inverted Index (Hash Table, Posting List); flows labeled "tokens" and "stemmed tokens".]


Posting List(s) [implemented as linked lists]

A posting list indicates, for a given term, which documents contain the term. Typically, a linked list is used to store the entries in a posting list. This is because in most retrieval operations a user enters a query and all documents that contain the query terms are obtained. This is done by hashing on the term in the index and finding the associated posting list. Once the posting list is obtained, a simple scan of the linked list yields all the documents that satisfy the query.

Index Builder

The index builder drives the indexing process. It loops through all the document objects and calls the indexDocument method to add each document to the inverted index. Once all the documents have been processed, the writeIndextoDisk method stores the invertedIndex object on disk. This file is read every time a new document is uploaded, to check for duplicates. Serialization was used for this (the objects involved are made SERIALIZABLE), so that each time the program runs the inverted index is read from disk; its data is therefore available at runtime and is saved back to the same inverted index file.
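Saving and reloading an index through Java's standard object serialization could be sketched as below. This is an illustration under assumptions rather than the project's code: the class name IndexStoreSketch and the methods save and load are hypothetical, and the index is represented as a HashMap of HashMaps (term to document id to term frequency), both of which already implement java.io.Serializable.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

final class IndexStoreSketch {
    /** Writes the whole index object to the given file, replacing any earlier copy. */
    static void save(HashMap<String, HashMap<String, Integer>> index, Path file)
            throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(index);
        }
    }

    /** Reads the index back; returns an empty index if nothing has been saved yet. */
    @SuppressWarnings("unchecked")
    static HashMap<String, HashMap<String, Integer>> load(Path file)
            throws IOException, ClassNotFoundException {
        if (!Files.exists(file) || Files.size(file) == 0) {
            return new HashMap<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, HashMap<String, Integer>>) in.readObject();
        }
    }
}
```

A typical cycle would call load at start-up, add the postings for a newly uploaded document, and then call save so that the next run sees the updated index.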

Applying stemming process c.f. Porter’s stemming algorithm

Stemming refers to reducing all forms of a term to a canonical version. For example, studying, studies, and studied all map to study. Stemming reduces words by stripping off suffixes, converting them to neutral stems that are devoid of tense, number and, in some languages, case and gender information. This relaxes the match between query terms and words in the documents so that, for example, libraries is deemed equivalent to library. Stemming is not appropriate for all queries, particularly those involving names and other very specific words.

Ideally, this process avoids mapping words with different roots to the same term. Porter's stemming algorithm has been used to provide this service in the MACS DL system.

Below is a description of Porter‟s stemming algorithm, which can be found on the following

URL:http://snowball.tartus.org/text/introduction.html,

http://snowball.tartus.org/algorithms/lovins/stemmer.html.

Porter’s stemming algorithm defines five successively applied steps of word transformation. Each step consists of a set of rules of the form <condition> <suffix> → <new suffix>. For example, the rule (m > 0) EED → EE means “if the part of the word before the ending EED contains at least one vowel-consonant sequence (that is, its measure m is greater than zero), change the ending to EE”. Words such as “agreed” therefore become “agree”, while “feed” remains unchanged: the stem “f” has m = 0, so the condition is not satisfied.
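To make the (m > 0) EED → EE rule concrete, the following Java sketch computes the measure m of a stem and applies this single rule. It illustrates one rule only and is not the StemText class listed in Appendix B; the class and method names are hypothetical, and input words are assumed to be lowercase.

```java
// Sketch of one Porter rule: (m > 0) EED -> EE.
class PorterEedSketch {

    // A letter is a consonant unless it is a, e, i, o, u,
    // or a 'y' that is preceded by a consonant.
    static boolean isConsonant(String w, int i) {
        char c = w.charAt(i);
        switch (c) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                return false;
            case 'y':
                return i == 0 || !isConsonant(w, i - 1);
            default:
                return true;
        }
    }

    // The measure m counts vowel-to-consonant transitions,
    // i.e. the number of VC sequences in the form [C](VC)^m[V].
    static int measure(String stem) {
        int m = 0;
        boolean previousWasVowel = false;
        for (int i = 0; i < stem.length(); i++) {
            boolean consonant = isConsonant(stem, i);
            if (consonant && previousWasVowel) {
                m++;
            }
            previousWasVowel = !consonant;
        }
        return m;
    }

    // Replace the ending "eed" by "ee" only when the stem has m > 0.
    static String applyEedRule(String word) {
        if (word.endsWith("eed")) {
            String stem = word.substring(0, word.length() - 3);
            if (measure(stem) > 0) {
                return stem + "ee";
            }
        }
        return word;
    }
}
```

For “agreed” the stem is “agr”, which contains one vowel-consonant sequence (“ag”), so the rule fires; for “feed” the stem “f” contains none, so the word is returned unchanged.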

The algorithm is very concise, having only about sixty (60) rules, and is very readable for a programmer. It is also very efficient in terms of computational complexity compared to other affix-removal and statistical stemming algorithms such as N-gram stemming and Hidden Markov Model (HMM) stemming, to mention but a few. HMM algorithms are nevertheless beneficial in fields such as machine translation and natural language processing, where numerous languages form the data set.

A flaw identified with classical stemmers such as Porter’s stemming algorithm is that they often conflate words with similar syntax but completely different semantics. For example, “news” and “new” are both stemmed to “new”, although they belong to two quite different categories.

Dr. Porter not only published the standard implementation of his work, written in the C and Java programming languages, but also developed a complete stemmer framework called Snowball. This framework provides a stemmer definition script language and a translator to ANSI C and Java. The main purpose was to enable programmers to develop their own stemmers for other character sets or languages. Currently there are implementations for Romance, Germanic, Uralic and Scandinavian languages, as well as English, Russian and Turkish, on the websites given.

We chose Porter’s stemming algorithm because of its efficiency in dealing with English-language corpora, and it helped pave the way for developing MACS DL.

Applying Stop words removal

Stop words make up a large fraction of the text in most documents. Eliminating such words from consideration speeds up processing, saves a large amount of disk space in indexes, and does not damage retrieval effectiveness. A list of words filtered out during automatic indexing because they make poor index terms is called a stop word list or a negative dictionary. These are words such as a, and, on, in, the and about. In MACS DL, words such as articles, prepositions and conjunctions are removed from the documents before indexing. The following screenshot depicts an inverted index object after indexing two documents; the output of the indexing module was as follows:
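As a rough illustration of the stop-word filtering described in this section, the Java sketch below tokenizes a text and drops stop words. It is not the project's code: the class name StopWordFilter is hypothetical and the short stop word list is illustrative only (the project reads its list from a file whose contents are not given here).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical filter that keeps only tokens worth indexing.
class StopWordFilter {

    // Illustrative stop word list; a real list is much longer.
    private static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "and", "on", "in", "the", "about", "to", "of"));

    // Lowercase the text, split on non-alphanumeric characters,
    // and discard empty tokens and stop words.
    static List<String> indexTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }
}
```

Storing the list in a hash set makes each membership check constant-time, so filtering adds little cost to indexing.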

CHAPTER SIX Searching, Browsing, Ranking and Information Retrieval (IR)

IR aims to retrieve large amounts of data as fast as possible from different kinds of information stored in more than one form, be it visual, audio or textual. The user retrieves information by posing a query, and the information retrieval module retrieves all the information that satisfies the query. This is in contrast to a database system, where an exact answer is retrieved from a database object that matches a query expressed as a select statement. IR systems do not retrieve a definite answer, but produce a ranking of documents that seem to contain information relevant to the query given to the system. This ranking relies on the index built during the indexing process, which was covered earlier in this document. The MACS DL information retrieval mechanism has been engineered to produce only the results that best match the provided query, filtering out unwanted results.

Methodology

Several different types of IR mechanisms exist, but the MACS DL system employs a method called Inverted File indexing. It was chosen because it is a well-organized index structure for text query evaluation, and the system was developed to be used on textual digital objects.

IR systems high level architecture

The general scheme in Figure 6.1 shows the essential structure of a classical IR system. In the first phase, preprocessing, the raw documents of the corpus are processed into tokenized documents and then indexed as a list of postings per term. In the second phase the user gives a query to represent his or her “information need”. The query is then transformed into a system query, and its relevant documents are retrieved from the index. The retrieved documents are ranked according to their relevance to the query and returned to the user through a user interface, discussed later in this document.

Figure 6.1: Classical IR system architecture

Term Weighting

This text retrieval module, like the rest, has been designed around a comparison of content identifiers attached both to the stored texts and to the user’s information queries. A formal representation of the term vectors is obtained by including in each vector all possible content terms allowed in the system and adding term weight assignments to distinguish between terms. If wk represents the weight of term k in document D or query Q, and t terms in all are available for content representation, the term vectors for document D and query Q can be written as:

D = (t0, w0; t1, w1; ...; tn, wn) and Q = (q0, w0; q1, w1; ...; qr, wr).
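The report does not state which weighting scheme produces the weights wk. One widely used choice is tf-idf weighting, where a term's weight grows with its frequency in the document and shrinks with the number of documents that contain it. The Java sketch below assumes that scheme; the class name TfIdfSketch and the exact formula (tf multiplied by the natural logarithm of N/df) are assumptions made for illustration, not a description of the MACS DL implementation.

```java
// Hypothetical tf-idf weighting: w = tf * ln(N / df).
class TfIdfSketch {

    // termFrequency: occurrences of the term in the document.
    // documentFrequency: number of documents that contain the term.
    // numDocuments: total number of documents in the collection.
    static double weight(int termFrequency, int documentFrequency, int numDocuments) {
        if (termFrequency == 0 || documentFrequency == 0) {
            return 0.0; // the term is absent, so it carries no weight
        }
        return termFrequency * Math.log((double) numDocuments / documentFrequency);
    }

    // Build the weight vector of one document over a fixed vocabulary.
    static double[] vector(int[] termFrequencies, int[] documentFrequencies, int numDocuments) {
        double[] weights = new double[termFrequencies.length];
        for (int k = 0; k < weights.length; k++) {
            weights[k] = weight(termFrequencies[k], documentFrequencies[k], numDocuments);
        }
        return weights;
    }
}
```

Under this scheme a term that occurs in every document receives weight zero, since ln(N/N) = 0, which reflects that such a term does not help to distinguish documents.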

Searching process

Searching is the most important part of the DL system, since information is retrieved through the search process. The search returns results according to their relevance to the query provided. Finally, the related documents are displayed on an output interface as links on a web page.

The following screenshot shows the result of searching, after three documents were indexed

correctly.

Document 1 – a document on digital libraries

Document 2 – a document on digital libraries and information retrieval

Document 3 – a document on distributed databases.

Query = Introduction to digital libraries

The computations of the term weights, term frequency, in relation to an uploaded document gave

the following output:

Ranking retrieved documents

Ranking uses a similarity measure to order the output triggered by a query, starting from the items most likely to satisfy the query, so that the most likely relevant documents are displayed first. To rank a document retrieved by a query, the similarity between the two has to be calculated. The formula below is used to measure the similarity between query and item.

Ranking is done in two phases, these are:

Coarse grain ranking – Documents are sorted depending on the frequency of the

query tokens. The document that contains all query terms will be ranked first.

Fine grain ranking – Depends upon weights of terms. In this phase, the similarity

function is calculated between document and query.
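The two phases above can be sketched in Java as follows. The similarity function used here is cosine similarity, a common choice in the vector space model; it is an assumption for this example and not necessarily the formula used in MACS DL. The class name RankingSketch and the reading of coarse grain ranking as "number of distinct query terms matched" are likewise illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical two-phase ranking over term weight vectors.
class RankingSketch {

    // Fine grain score: cosine of the angle between document and query vectors.
    static double cosine(double[] doc, double[] query) {
        double dot = 0.0, docNorm = 0.0, queryNorm = 0.0;
        for (int k = 0; k < doc.length; k++) {
            dot += doc[k] * query[k];
            docNorm += doc[k] * doc[k];
            queryNorm += query[k] * query[k];
        }
        if (docNorm == 0.0 || queryNorm == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(docNorm) * Math.sqrt(queryNorm));
    }

    // Coarse grain score: how many query terms occur in the document.
    static int matchedTerms(double[] doc, double[] query) {
        int count = 0;
        for (int k = 0; k < doc.length; k++) {
            if (query[k] > 0 && doc[k] > 0) {
                count++;
            }
        }
        return count;
    }

    // Returns the indices of matching documents, best match first.
    static List<Integer> rank(double[][] docs, double[] query) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            if (matchedTerms(docs[i], query) > 0) {
                order.add(i);
            }
        }
        order.sort(Comparator
                .comparingInt((Integer i) -> matchedTerms(docs[i], query)).reversed()
                .thenComparing(Comparator
                        .comparingDouble((Integer i) -> cosine(docs[i], query)).reversed()));
        return order;
    }
}
```

Documents that share no term with the query are left out entirely; among the rest, the coarse count decides first and the similarity score breaks ties.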

This module sorts the retrieved documents based on their relevance to the query posed, using the

following formula:

The following screenshot depicts the result of a query, with the results ranked according to their

relevance to the query posted.

In ranking, an artificial measure is used to gauge the similarity of each document to the query,

and a fixed number of the closest matching documents are returned as answers.

Metadata browsing

Browsing is often described as the other side of the coin from searching, but really the two are at

opposite ends of a spectrum. Searching is purposeful, whereas browsing tends to be casual.

Terms such as random, informal, unsystematic, and without design are used to capture the unplanned nature of browsing and, often, the lack of a specific goal. Searching implies that you know what you’re looking for, whereas browsing implies that you’ll know it when you see it.

The metadata provided with the documents in a collection can support different browsing activities. Information collections that are entirely devoid of metadata can still be searched, which is one of the real strengths of full-text searching, but they cannot be browsed in any meaningful way unless some additional data is present. The structure that is implicit in metadata is the key to providing browsing facilities. Here are some examples of browsing:

Lists: The simplest structure is an ordered list. It can be alphabetical, in ascending or descending order.

Dates: An automatically generated selector gives a choice of years, months and dates that can be used to browse metadata.

Name: Offers users the flexibility of browsing collections using authors’ names. For example, Deitel.

Title: Users can browse collections using the titles of the documents in a pool of collections. For example, Advanced Java Programming.
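The list, name and title browsing styles above amount to sorting or filtering documents by a metadata field. The Java sketch below shows one way this could be done; the classes DocumentMetadata and MetadataBrowser are hypothetical and do not come from the MACS DL source.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical metadata record for one document in the collection.
class DocumentMetadata {
    final String title;
    final String author;

    DocumentMetadata(String title, String author) {
        this.title = title;
        this.author = author;
    }
}

// Hypothetical browsing helper built on document metadata.
class MetadataBrowser {

    // Alphabetical list of documents by title, ascending or descending.
    static List<DocumentMetadata> byTitle(List<DocumentMetadata> docs, boolean ascending) {
        Comparator<DocumentMetadata> order = Comparator.comparing(
                (DocumentMetadata d) -> d.title, String.CASE_INSENSITIVE_ORDER);
        List<DocumentMetadata> sorted = new ArrayList<>(docs);
        sorted.sort(ascending ? order : order.reversed());
        return sorted;
    }

    // All documents whose author name matches, ignoring case.
    static List<DocumentMetadata> byAuthor(List<DocumentMetadata> docs, String author) {
        List<DocumentMetadata> matches = new ArrayList<>();
        for (DocumentMetadata d : docs) {
            if (d.author.equalsIgnoreCase(author)) {
                matches.add(d);
            }
        }
        return matches;
    }
}
```

Sorting a copy rather than the original list leaves the collection's stored order untouched, so several browsing views can be produced from the same data.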

CHAPTER SEVEN User Interface (UI) Design

A user interface describes how users of the system interact with it. Human Computer Interaction

(HCI) basics and principles have been employed in developing the MACS DL user interfaces, to

enable users to have a seamless interaction with the system. Common interface styles that were

used are:

Menus

Forms

Principles of UI design

Consistency: The system is expected to be consistent. MACS DL achieved consistency in

the choice of colors used. The system has consistent interfaces and styles.

Learnability: The system should be easy to learn. MACS DL is very easy to use, providing labels and the information needed to guide users on how best to utilize it.

Informative feedback: The system should provide informative feedback to users after an operation is performed. MACS DL adheres to this principle: each time a query is posed and results are displayed, the system provides users with feedback.

Provide error prevention and handling: The system must have mechanisms to prevent

users from committing errors and if any, be able to handle them. MACS DL is no

exception as it prevents errors and system crashes.

Off-load the short-term memory: The system should reduce the number of steps users have to remember when carrying out an operation. MACS DL was designed to have interfaces with links and proper labels that make operations easy for users to remember.

Provide short-cuts for users: The system provides hyperlinks as a form of shortcuts to

navigate web pages.

System dialogue yielding closure: The system informs its users about its current state at

each instance. For example, after posing a query, the system retrieves the results with a

message that reads “RESULTS MATCHING THE QUERY” to yield closure of the IR

operation.

Provide internal locus of control: The system allows users to be in control of it. Every

operation the system performs is triggered by users. For example, documents retrieved

are only downloaded when clicked.

Technologies used in the UI development

JavaServer Pages (JSP) and servlets to make the system web based

JavaScript

eXtensible Markup Language (XML) to support file formats

Hypertext Markup Language (HTML)

Cascading Style Sheets (CSS) to provide presentable documents with minimal effort

eXtensible Stylesheet Language (XSL) for supporting XML and HTML that are XML compliant

CHAPTER EIGHT System Testing and Evaluation

The system was frequently tested for errors after completing each module. Testing is the process

of exercising a program with the specific intent of finding errors prior to delivery to the end user.

The system was thoroughly tested mainly to show the following:

Errors

Requirements conformance

Performance

Quality

Who did the testing?

The developing team did most of the testing while independent testers were also invited to test

the system.

Testing strategies

Unit test

Integration test

White box test

Validation test

System test

Regression test

The following table depicts some of the modules and criteria used in the testing phase.

Table 8.1 Test results

Test case | Test strategy | Description | Results
GUI functionality | Unit test | Testing actions performed when buttons and controls are clicked | PASS
Code snippets | Integration test | Integrating modules to form a complete system | PASS
System performance | White box test | Accessing the system concurrently | PASS
System functionality | Integration and unit test | Integrating system modules and testing each of them for functionality | PASS
Database connectivity | Integration test | Integrating third-party software, Oracle 10g database server | PASS
Human Computer Interaction | Unit test | Each module was tested for interaction with users | PASS
Textual objects | Validation test | Testing if the material uploaded is text or not | PASS
Error handling | Regression test | Testing system errors | PASS

System Evaluation

The system was evaluated to establish whether users accept it as a usable tool. Direct observation and pilot study evaluation techniques were used to find out the users’ views during this phase.

Direct observation: The developers directly observed some sampled users while they evaluated the system. Users had to perform all the operations that are implemented in the MACS DL system and evaluate the results.

Pilot study: A small group of users was asked a set of questions regarding the system. The pilot study was conducted using a questionnaire, and users provided their evaluation heuristics. Some of the questions asked were: Is the system usable? Is the system useful?

Evaluation results

The results obtained from the system evaluation phase were used to enhance the system’s functionality to make it more effective and efficient. The results were collected to guide the developers and also users on how to improve the system and how to best use it, respectively.

The following is an in-depth analysis of the results obtained from the evaluation phase:

Direct observation of users:

We investigated the factors that influence the perceived ease of use and usefulness of digital libraries among NUL students. Data were collected from undergraduate students at NUL, who formed the population sample. Using a stratified sampling method, questionnaires were handed to students at locations around the NUL campus such as the Thomas Mofolo library and classrooms.

Evaluation results and analysis

Out of one hundred and fifty (150) questionnaires that were distributed, only sixty-nine (69) were returned, giving a response rate of 46%. Based on the study, 60% of the respondents were Computer Science and Engineering students, 20% were from social sciences and 10% from humanities.

Table 8.2

A scale of 1 to 4, ranked as follows, was used to grade the scores:

1 = Best, 2 = Good, 3 = Not sure, 4 = Bad

Item evaluated | Score | Answer
HCI (Usability) | 1 | Best
HCI (Functionality) | 1 | Best
Project functionality | 1 | Best

System training

There will be no need to train users, as the system is usable and easy to learn. Furthermore, the MACS DL system behaves like the existing web-based digital library systems in use today, which NUL MACS department students are already accustomed to using.

Conclusions and future prospects

MACS DL system was a success, making it an exciting endeavor that served as an eye opener to

the developers in their academic career as plenty of new computing concepts were learned during

the execution of this project. The system is ready for deployment and use in an organization as

huge as NUL.

This system covers the major parts of a search engine implementation, such as stop-word removal, stemming, automatic indexing and searching. To make this system a complete search engine, other parts such as clustering and thesaurus expansion could be added. The system could also be extended to other digital object collections such as videos and images. At present the system takes a long time to upload large documents; in the future, new implementation strategies could be employed to make this faster.

References

1. Arms, W. Digital Libraries. MIT Press, Cambridge, MA, 2000.

2. Alexa T.M and Marie E.G, Principles for Digital Library Development. Accessed on September 10th, 2011, from http://www.lhncbc.nlm.nih.gov/dlb/pubs/200105_cacm_mccray.pdf

3. Bin Li, The History of Digital Libraries. Accessed on September 12th, 2011, from http://www.ils.unc.edu/~lib/digital-library.html

4. Gerard Salton and Christopher Buckley, Term-Weighting Approaches in Automatic Text Retrieval, Cambridge, 2000.

5. William B. Frakes and Ricardo Baeza-Yates, Information Retrieval: Data Structures & Algorithms, 88-94.

6. Witten et al., How to Build a Digital Library, Morgan Kaufmann Publishers.

APPENDIX A Acronyms

A1. MACS – Mathematics and Computer Science

A2. NUL – National University of Lesotho

A3. FOST – Faculty of Science and Technology

A4. IR – Information Retrieval

A5. FI – File indexing

A6. II – Inverted Index

A7. UI – User Interface

A8. CRC – Class Responsibility Collaborator

A9. DFD – Data Flow Diagram

A11. RE – Requirements Engineering

A12. SRS – System Requirements Specification

APPENDIX B Programs

//Cascading Style sheet

#header1

{

height:200px;

padding-top: 20px;

padding: 0px 0px 0px 0px;

width: 900px;

background-repeat:no-repeat;

background-position:top;

padding-bottom: 3px;

}

#logos

{

font-family: Arial,sans-serif;

color:#FFFFFF;

font-size:18px;

font-style:italic;

padding: 15px 0px 0px 135px;

background:url(images/buka.jpg) left top no-repeat;

height: 200px;

}

*

{

border: 0;

margin: 0;

}

#uploader-button

{

font-family: Arial, Helvetica, sans-serif;

font-size: 12px;

font-weight:normal;

color: #ffffff;

width: 60px;

height: 21px;

background: url(images/read.gif);

background-repeat:no-repeat;

background-position:left top;

border: none;

float:right;

}

img

{

border: 0px;

}

body{

font: 12px Arial, Helvetica, sans-serif;

color: #000000;

background: url(images/body_bg.jpg) top repeat-x #FFFFFF;

line-height: 20px;

}

#bg{

background: url(images/bg.jpg) center top no-repeat;

}

/* search */

#search

{

float:right;

padding-right:45px;

padding-top:1px

}

#search form

{

margin: 0;

}

#search fieldset

{

margin: 0;

padding: 0;

border: none;

}

#search input

{

float: left;

font: 11px Georgia, "Times New Roman", Times, serif;

}

#search-text

{

width: 230px;

height: 19px;

padding-top: 4px;

padding-left: 10px;

padding-right: 12px;

border: none;

background: url(images/search.png);

background-repeat:no-repeat;

background-position:left top;

color: #000000;

}

#search-submit

{

width: 40px;

height: 23px;

background: url(images/search2.png);

background-repeat:no-repeat;

background-position:left top;

border: none;

}

/*MENU*/

/*MENU*/

#menu

{

width:650px;

height:55px;

}

#menu ul

{

list-style:none;

padding-left:0px;

}

#menu li

{

display:inline;

}

#menu ul li a

{

font-family: Arial,sans-serif;

font-size: 18px;

font-weight:normal;

color: #008ae8;

float: left;

width: 85px;

height: 30px;

display: block;

text-align: left;

text-decoration: none;

padding-top: 5px;

padding-left:40px;

background: url(images/menu_bg.png);

background-repeat:no-repeat;

background-position:10px 5px;

}

#menu a:hover

{

width: 85px;

height: 35px;

color: #093285;

text-decoration: none;

background: url(images/menu_hov.png);

background-repeat:no-repeat;

background-position:10px 5px;

}

#left_part

{

width: 100px;

float:left;

padding: 0px 0px 0px 0px;

}

.main_top

{

background: url(images/main_top.png) no-repeat top;

height: 15px;

}

.main_bot

{

background: url(images/main_bot.png) no-repeat top;

height: 15px;

width:750px;

padding-bottom: 10px;

}

.main_bg1

{

background: url(images/main_bg.png);

padding-left: 8px;

color: black;

padding-right: 7px;

font-family: Tahoma;

}

/*main page*/

#main

{

width: 900px;

margin: 0px auto;

background:url(images/main.jpg) right top no-repeat;

}

#main2

{

width: 750px;

height: 400px;

margin-left: 8px;

clear:both;

/*background: url(images/left_bg.jpg);*/

background-repeat:repeat-y;

background-position:left;

}

#header {

width:900px;

height: 100px;

}

#logo {

padding: 0px 0px 0px 0px;

height: 113px;

}

#logo H2 {

font-family: Arial, Helvetica, sans-serif;

color:#000000;

font-size:18px;

font-style:italic;

}

#logo a {

text-decoration: none;

text-transform: lowercase;

font-style: italic;

font-size: 16px;

color: #000000;

}

#logo H2 a{

font-size: 12px;

font-family: Arial, Helvetica, sans-serif;

font-weight:100;

}

/* buttons */

#buttons

{

text-align:center;

height: 30px;

margin: 0px auto;

padding: 0px 0px 0px 0px;

background: url(images/buttons.png);

width: 600px;

}

#buttons a

{

font-family: Georgia, "Times New Roman", Times, serif;

font-size: 18px;

display: block;

float: left;

text-decoration: none;

color: #0059FF;

text-align: center;

padding-top: 0px;

font-weight:100;

width: 170px;

}

#buttons .but:hover {

text-decoration:underline;

}

.top { height:334px;

padding-top: 10px;

padding-left: 10px;

background:url(images/top.jpg) left top no-repeat;

}

.top_bot {

background: url(images/top_bot.jpg) left top no-repeat;

height: 28px}

#content

{

width: 876px;

margin: 0px auto;

background: #E6F6FF;

padding: 0px 12px 5px 12px;

line-height: 22px;

background-repeat:repeat-y;

text-align: left;

background-position:left;

}

#content_razd {

background: url(images/content_razd.gif) 586px repeat-y ;

}

#content_top {

width: 900px;

background: url(images/content_top.png) 0px top no-repeat ;

height: 10px;

}

#content_bot {

width: 900px;

background: url(images/content_bot.png) 0px bottom no-repeat ;

height: 9px;

}

.float_l {

float:left;}

.col {

width: 265px;

float:left;

padding: 0px 0px 0px 0px;}

.col_razd {

background:url(images/col_text.gif) center repeat-y;

height: 124px;

width: 40px;

float:left;

margin-top: 35px;

}

h1 {

padding: 0px 0px 5px 0px;

font-family: Georgia, "Times New Roman", Times, serif;

font-size: 16px;

font-weight: bold;

color:#051B93;}

#left{

width: 558px;

float: left;

color:#000000;

margin-left: 0px;

}

.text{

padding: 0px 0px 15px 0px;

}

.img_l { float:left;

margin: 6px 15px 40px 0px;

}

.img_r { float: right;

margin: 9px 10px 3px 10px;

}

.span_cont { color: #07249F;

font-size:12px;

font-weight:bold;

}

#content H2{

font-family: Georgia, "Times New Roman", Times, serif;

font-size:16px;

font-weight: bold;

color: #07249F;

text-align: left;

padding: 5px 0px 5px 0px;

}

.read_r{

text-align: right;

padding: 0px 8px 0px 0px;

background: url(images/read.gif) right 3px no-repeat;

}

.razd_g {

background: url(images/razd_g.gif) 0px 2px repeat-x;

height: 5px;

}

.read_r a {

font-size:12px;

color: #ffffff;

text-decoration: none;

padding-right: 9px;

}

.next {

width: 100%;

text-align: right;

padding: 0px 0px 0px 0px;}

.next a{

color:#FFFFFF;

text-decoration: none; }

.next a:hover {

text-decoration: underline; }

.more {

text-align:right;}

.more a {

color: #009FFF;

text-decoration:none;

}

#right{

float: right;

width: 270px;

}

.span_dat {

color: #002380;

text-decoration: underline;}

#bottom {

background: #E6F6FF;

margin: 0px auto;

color:#000000;

padding: 0px 0px 0px 15px;

}

#b_col1 {

width: 220px;

float: left;

margin-left: 0px;

}

#b_col2 {

width: 180px;

float: left;

margin-left: 57px;

}

#b_col3 {

width: 160px;

float: left;

margin-left: 20px;

text-align: left;

}

#b_col4 {

width: 184px;

float: left;

margin-left: 35px;

text-align: left;

}

.a_icons {

color:#FF0000;

text-decoration:none;}

.a_icons:hover {

text-decoration: underline;}

#bottom ul {

list-style:none;

padding: 0px 0px 0px 0px;}

#bottom li {

padding: 8px 0px 0px 0px;

}

#bottom ul a:hover {

text-decoration:underline;

}

#bottom ul a {

color:#000000;

text-decoration:none;

font-weight: 100;}

.fu_i {

padding: 0px 14px 0px 0px;

vertical-align: middle ;

}

#b_col2 ul {

list-style:none;

padding: 0px 0px 0px 0px;}

#b_col2 li {

padding: 4px 0px 0px 18px;

background: url(images/fish2.gif) 0px 11px no-repeat;}

#b_col2 a {

color:#FFFFFF;

}

#footer{

font-size: 11px;

color: #000000;

text-align: center;

padding: 20px 0px 0px 0px;

height: 60px;

text-align: center;

margin: 0px auto;

}

#footer a{

color: #000000;

font-size: 11px;

text-decoration: none;

}

#footer a:hover{

color: #000000;

font-size: 11px;

text-decoration: underline;

}

/* ------------------------------------------------------------------------

DO NOT CHANGE THE FOLLOWING

------------------------------------------------------------------------- */

div.pp_overlay {background: #000;display: none;left: 0;position: absolute;top: 0;width: 100%;z-index: 9500;}

div.pp_pic_holder {display: none;position: absolute;width: 100px;z-index: 10000;}

//Java source code for Index class, index.java

package InvertedIndex;

import InvertedIndex.Index.PostingListNode;

import java.io.Serializable;

import java.util.ArrayList;

import java.util.Hashtable;

public class Index implements Serializable {

public class documentVector implements Serializable {

public String docId;

public double score;

public ArrayList docVector;

public documentVector() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public documentVector(String documentId) {

//compiled code

throw new RuntimeException("Compiled Code");

}

}

public class PostingList implements Serializable {

public PostingListNode first;

public int documentFrequency;

public PostingList() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void Add(PostingListNode Node) {

//compiled code

throw new RuntimeException("Compiled Code");

}

}

public class PostingListNode implements Serializable {

public String documentId;

public int docReference;

public int termFrequency;

public PostingListNode next;

public PostingListNode() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public PostingListNode(String docId, int tf, int docRef) {

//compiled code

throw new RuntimeException("Compiled Code");

}

}

private ArrayList PostingLists;

private int count;

private Hashtable<String, Integer> IndexTerms;

public int numOfdocuments;

public ArrayList docVectors;

public ArrayList queryVector;

private ArrayList queryterms;

public Hashtable<String, Integer> documents;

private String stopwordsPath;

public Index(String stopwords_Path) {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void addIndexTerm(String termId, String docId, int tf) {

throw new RuntimeException("Compiled Code");

}

public void Search(String query) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void getVectors() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void computeScores() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void sortDocuments() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public ArrayList RetrieveAnswer(String query) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

}

//Source code for class IndexBuilder

package InvertedIndex;

import java.util.ArrayList;

import java.util.Hashtable;

public class IndexBuilder {

public Index invertedIndex;

private String document;

private String response;

private Hashtable<String, Integer> termsfrequency;

public ArrayList QueryResults;

private TextExtractor Extractor;

public IndexBuilder(String stopwords_path) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public IndexBuilder(String docId, String stopwords_path) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private int frequency(ArrayList tokens, String term) {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void indexDocument() throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void SaveIndexToDisk(String path) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void ReadIndexFromDisk(String path) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void AnswerQuery(String query) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public static void main(String[] args) throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

}

//Java source code for class StemText

package InvertedIndex;

import java.util.ArrayList;

class StemText {

private char[] b;

private int i;

private int i_end;

private int j;

private int k;

private static final int INC = 50;

public StemText() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void add(char ch) {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void add(char[] w, int wLen) {

//compiled code

throw new RuntimeException("Compiled Code");

}

public String toString() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public int getResultLength() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public char[] getResultBuffer() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final boolean cons(int i) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final int m() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final boolean vowelinstem() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final boolean doublec(int j) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final boolean cvc(int i) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final boolean ends(String s) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void setto(String s) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void r(String s) {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void step1() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void step2() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void step3() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void step4() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void step5() {

//compiled code

throw new RuntimeException("Compiled Code");

}

private final void step6() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void stem() {

//compiled code

throw new RuntimeException("Compiled Code");

}

public ArrayList stemIndexTerms(ArrayList textTokens) {

//compiled code

throw new RuntimeException("Compiled Code");

}

}
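The member names of StemText (b, i_end, j, k, cons, m, vowelinstem, doublec, cvc, step1 to step6) match those of the widely distributed Java reference implementation of the Porter stemming algorithm, so the class appears to be a Porter stemmer. Because the bodies are elided, the following sketch only illustrates what the helpers cons and m compute in that algorithm: whether a letter counts as a consonant, and the "measure" of a word (the number of vowel-to-consonant transitions). The class and method names here are hypothetical and operate on a String rather than on the char buffer used above.

```java
// Hypothetical sketch of two Porter-stemmer helpers.
class PorterMeasureSketch {

    // A letter is a consonant unless it is a, e, i, o, u,
    // or a 'y' that follows a consonant.
    static boolean isConsonant(String w, int i) {
        switch (w.charAt(i)) {
            case 'a':
            case 'e':
            case 'i':
            case 'o':
            case 'u':
                return false;
            case 'y':
                return i == 0 || !isConsonant(w, i - 1);
            default:
                return true;
        }
    }

    // Counts how many times a vowel run is followed by a consonant,
    // i.e. the m in the pattern [C](VC)^m[V].
    static int measure(String w) {
        int m = 0;
        boolean previousWasVowel = false;
        for (int i = 0; i < w.length(); i++) {
            boolean consonant = isConsonant(w, i);
            if (consonant && previousWasVowel) {
                m++;
            }
            previousWasVowel = !consonant;
        }
        return m;
    }
}
```

The stemming steps use this measure to decide whether a suffix may be removed: a rule typically fires only when the remaining stem has a measure above some threshold.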

//Source code for class StopWords

package InvertedIndex;

import java.io.BufferedReader;

import java.io.IOException;

import java.util.Hashtable;

public class StopWords {

public Hashtable<String, Integer> stopWords;

private BufferedReader stopWordsFile;

private int count;

public StopWords(String path) throws IOException {

//compiled code

throw new RuntimeException("Compiled Code");

}

}
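The StopWords constructor is elided, but its fields (a Hashtable from String to Integer, a BufferedReader and a counter) suggest that it reads a word list from the given path into the table. The sketch below shows one way this could look. It assumes one stop word per line and stores the order in which each word was read as the value; both are assumptions, as are the class name StopWordsSketch and the method load.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Hashtable;

// Hypothetical sketch of loading a stop-word list into a Hashtable.
class StopWordsSketch {

    // Reads one word per line (assumed layout), skipping blank lines and
    // duplicates, and maps each word to the order in which it was read.
    static Hashtable<String, Integer> load(BufferedReader in) throws IOException {
        Hashtable<String, Integer> words = new Hashtable<>();
        int count = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String word = line.trim().toLowerCase();
            if (!word.isEmpty() && !words.containsKey(word)) {
                words.put(word, count++);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the stop-word file in this example.
        BufferedReader in = new BufferedReader(new StringReader("the\nand\n\nOf\nthe\n"));
        Hashtable<String, Integer> stop = load(in);
        // Membership in the table is what a tokenizer would check per token.
        System.out.println(stop.containsKey("of"));
    }
}
```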

//Source code for class TextExtractor

package InvertedIndex;

import java.io.File;

import java.io.IOException;

import java.util.ArrayList;

import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.SAXException;

public class TextExtractor {

private File file;

private String filename;

public String textFromFile;

public ArrayList Tokens;

private String stopwordsPath;

public TextExtractor(String stopwords_Path) {

//compiled code

throw new RuntimeException("Compiled Code");

}

public TextExtractor(String Filename, String stopwords_Path) {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void ExtractText() throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

public void indexTerms() throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

private void pdfFile() throws Exception {

//compiled code

throw new RuntimeException("Compiled Code");

}

private void docxFile() throws IOException, ParserConfigurationException, SAXException {

//compiled code

throw new RuntimeException("Compiled Code");

}

private void pptFile() throws IOException {

//compiled code

throw new RuntimeException("Compiled Code");

}

private void txtFile() throws IOException {

//compiled code

throw new RuntimeException("Compiled Code");

}

}
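TextExtractor declares one private method per supported file type (pdfFile, docxFile, pptFile, txtFile), which suggests that ExtractText chooses among them according to the file extension. The elided body is not available, so the following is a sketch of such a dispatch under that assumption; the enum Kind and the method kindOf are hypothetical names.

```java
import java.util.Locale;

// Hypothetical sketch: decide which extraction routine a file needs.
class ExtractorDispatchSketch {

    enum Kind { PDF, DOCX, PPT, TXT, UNSUPPORTED }

    // Looks only at the text after the last dot, ignoring letter case.
    static Kind kindOf(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return Kind.UNSUPPORTED; // no extension at all
        }
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "pdf":
                return Kind.PDF;
            case "docx":
                return Kind.DOCX;
            case "ppt":
                return Kind.PPT;
            case "txt":
                return Kind.TXT;
            default:
                return Kind.UNSUPPORTED;
        }
    }
}
```

Returning an explicit UNSUPPORTED value lets the caller report an unknown format instead of failing inside one of the format-specific routines.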

//Source code for

//Java source code creating a home page interface

<%@page contentType="text/html" pageEncoding="UTF-8"%>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"

"http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>Mathematics & Computer Science Digital Library System</title>

<meta name="keywords" content="" />

<meta name="description" content="" />

<script type="text/javascript" src="lib/jquery-1.3.2.min.js"></script>

<script type="text/javascript" src="lib/jquery.tools.js"></script>

<script type="text/javascript" src="lib/jquery.custom.js"></script>

<link href="styles.css" rel="stylesheet" type="text/css" />

<link href="style.css" rel="stylesheet" type="text/css" />

</head>

<script language="JAVASCRIPT" type="TEXT/JAVASCRIPT">

function confirmMessage()

{

//show an alert box telling the user that the file was uploaded

{

alert("File successfully uploaded to server");

}

}

$(document).ready(function()

{

var passfield = document.getElementById('password_field_id');

//this page may not contain a password field, so guard against a null lookup

if (passfield) { passfield.type = 'text'; }

});

function focusCheckDefaultValue(field, type, defaultValue)

{

if (field.value == defaultValue)

{

field.value = '';

}

if (type == 'pass')

{

field.type = 'password';

}

}

function blurCheckDefaultValue(field, type, defaultValue)

{

if (field.value == '')

{

field.value = defaultValue;

}

if (type == 'pass' && field.value == defaultValue)

{

field.type = 'text';

}

else if (type == 'pass' && field.value != defaultValue)

{

field.type = 'password';

}

}

</script>

<body>

<div id="bg">

<div id="main">

<div id="content">

<div class="navi"></div>

<div id ="header1">

<div id="menu">

<ul>

<!--create button links-->

<li id="button1"><a href="macsdl.jsp" title="">Home</a></li>

<li id="button2"><a href="ByAuthor.jsp" title="">Browse</a></li>

<li id="button2"><a href="#" title="">Contacts</a></li>

</ul>

</div>

<div id ="logos"></div>

<div id="search">

<form method="get" action="searchResults.jsp">

<fieldset>

<input type="text" name="search" id="search-text" size="25" value ="Search"

onFocus="javascript:focusCheckDefaultValue(this, '', 'Search');"

onBlur="javascript:blurCheckDefaultValue(this, '', 'Search');"

>

<input type="submit" id="search-submit" value="" />

</fieldset>

</form>

</div>

</div>

<br/><br/>

<div align="left">

<img src="images/img11.jpg" class="img_l" align="left" alt="" /><br/><br/>

<span class="span_cont">About MACS DL </span><br />

MACS DL is an educational portal for higher learning that offers a large pool of books, journals and other resources, everything you

ever needed.

</div>

<form enctype="multipart/form-data" action="uploadfile.jsp" method="post" >

<br/><br/><br/>

<center>

<table border="2">

<tr>

<td colspan="2">

<p align ="center"><b>Upload and share your files with the NUL community</b></p>

</td>

</tr>

<tr>

<td>

<b>Choose a file to upload:</b>

</td>

<td>

<input name="inputfile" type="file">

</td>

</tr>

<tr>

<td colspan="2">

<p align="right"><input type="submit" id ="uploader-button" value="UPLOAD"

onclick="confirmMessage()"></p>

</td>

</tr>

</table>

</center>

</form>

<div class="razd_g"></div><br />

<div class="col">

<h1>Add to the MACS DL</h1>

<img src="images/col_img1.jpg" class="img_l" alt="" />Add your objects and share them with the NUL community by uploading your

files<br/>to the server, then download the material you need most!

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">Browse by date</h1>

<img src="images/col_img2.jpg" class="img_l" alt="" />Browse the collection by date, specify the date and browse freely.

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">SEARCH MACS DL</h1>

<img src="images/col_img3.jpg" class="img_l" alt="" />Type any query in the above search text field and click the search

button. Get the results instantly!

</div>

<div style="clear: both"></div>

<div style="height:15px; width: 100%"></div>

<div class="razd_g"></div>

<div style="clear: both"></div>

</div>

<div id="content_bot"></div>

<!-- content ends -->

<div style="height:15px; width: 100%"></div>

<!-- bottom end -->

<!-- footer begins -->

<div id="footer">

<p>Copyright 2012</p><p>Design by

<a href="http://www.nul.ls" title="MACS DL">Mosola Napo N</a>

<!--End of notice --></p><!-- end of copyright notice-->

</div>

<!-- footer ends -->

</div>

</div>

</body>

</html>

//Java Source code for Uploading files

<%@page contentType="text/html" pageEncoding="UTF-8"%>

<%@page language="java"%>

<%@page import="InvertedIndex.*"%>

<%@page import ="java.io.File,java.io.FileInputStream,java.io.InputStream"%>

<%@page import="java.io.*,java.util.*, javax.servlet.*" %>

<%@page import="javax.servlet.http.*,javax.servlet.ServletException"%>

<%@page import="org.apache.commons.fileupload.*" %>

<%@page import="org.apache.commons.fileupload.disk.*"%>

<%@page import="org.apache.commons.fileupload.servlet.*" %>

<%@page import="org.apache.commons.io.output.*" %>

<%

//

//

//Upload document to the server.

File file ;

// Verify the content type

String contentType = request.getContentType();

if (contentType != null && contentType.indexOf("multipart/form-data") >= 0)

{

DiskFileItemFactory factory = new DiskFileItemFactory();

String Path="C:/Users/KELVIN/Documents/NetBeansProjects/DigitalLibrarySearch/documents/";

factory.setRepository(new File(Path));

String filename=null;

// Create a new file upload handler

ServletFileUpload upload = new ServletFileUpload(factory);

try

{

// Parse the request to get file items.

List fileItems = upload.parseRequest(request);

// Process the uploaded file items

Iterator i = fileItems.iterator();

while ( i.hasNext () )

{

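FileItem fi = (FileItem)i.next();

// Skip ordinary form fields; only file parts are stored.

if (fi.isFormField()) { continue; }

// Keep only the base name, since some browsers submit the full client path.

filename=new File(fi.getName()).getName();

file=new File(Path+filename);

fi.write( file ) ;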

%>

You have successfully uploaded the file by the name of:<br>

<%=filename%>

<%

}

}catch(Exception ex) {

%>

<%=ex%>

<%

}%>

<%@page import="org.apache.tika.metadata.Metadata"%>

<%@page import="org.apache.tika.parser.AutoDetectParser"%>

<%@page import="org.apache.tika.sax.BodyContentHandler"%>

<%@page import="java.sql.*"%>

<%

try

{

Connection conn=null;

// ResultSet results=null;

Statement stat;

//Class.forName("com.mysql.jdbc.Driver");

//conn=DriverManager.getConnection("jdbc:mysql://localhost:3306/dl",

// "root",

// "admin");

Class.forName("oracle.jdbc.driver.OracleDriver");

conn=DriverManager.getConnection

("jdbc:oracle:thin:dl/admin@localhost:1521/XE");

String resourceLocation = Path+filename;

File file2 = new File(resourceLocation);

InputStream input = new FileInputStream(file2);

Metadata metadata = new Metadata();

BodyContentHandler handler = new BodyContentHandler();

AutoDetectParser parser = new AutoDetectParser();

parser.parse(input, handler, metadata);

// Release the file handle once the metadata has been extracted.

input.close();

String Author= metadata.get(Metadata.AUTHOR);

String Title=metadata.get(Metadata.TITLE);

String last_modified=metadata.get(Metadata.LAST_MODIFIED);

%>

Author:

<%=Author %><b></b>Title:

<%=Title %><b></b>Last_Modified:

<%=last_modified %>

<%

if(Author!=null&&Title!=null&&last_modified!=null)

{

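// Bind the metadata as parameters so that quotes in a title cannot break the SQL.

PreparedStatement insert=conn.prepareStatement

("insert into browse values(?,?,?,?)");

insert.setString(1,Author.toLowerCase());

insert.setString(2,Title.toLowerCase());

insert.setString(3,last_modified.toLowerCase());

insert.setString(4,filename);

int count=insert.executeUpdate();

insert.close();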

}

}

catch(SQLException exc)

{

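// Record the failure in the server log instead of silently ignoring it.

application.log("Could not store metadata for "+filename, exc);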

}%>

<%

//

//Index the uploaded document

IndexBuilder index = new IndexBuilder(Path+filename,Path+"stopwords.txt");

index.ReadIndexFromDisk(Path+"invertedIndex.object");

index.indexDocument();

index.SaveIndexToDisk(Path+"invertedIndex.object");

%>

<%

}else

{

%>

No document uploaded!

<%

}

%>

<meta http-equiv="refresh" content="0; URL=macsdl.jsp">

<meta name="keywords" content="automatic redirection">
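The upload page builds the target path by appending the submitted file name to the documents folder. A submitted name can contain directory parts (a full client path, or "../" segments), so it is worth reducing it to a bare file name first. The helper below is a minimal sketch of that idea; the class UploadNameSketch and the method safeName are hypothetical and are not part of the project's code.

```java
// Hypothetical helper: reduce a submitted file name to its last path segment.
class UploadNameSketch {

    static String safeName(String submitted) {
        if (submitted == null) {
            throw new IllegalArgumentException("no file name submitted");
        }
        // Cut after the last separator of either style, so that Windows and
        // Unix style client paths are both handled on any server.
        int cut = Math.max(submitted.lastIndexOf('/'), submitted.lastIndexOf('\\'));
        String base = submitted.substring(cut + 1).trim();
        if (base.isEmpty() || base.equals(".") || base.equals("..")) {
            throw new IllegalArgumentException("not a usable file name: " + submitted);
        }
        return base;
    }

    public static void main(String[] args) {
        // The documents folder would be prepended to this result.
        System.out.println(safeName("C:\\fakepath\\report.pdf"));
    }
}
```

Rejecting an empty result, rather than storing a file with no name, keeps the later indexing step from being called on the documents folder itself.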

//Java server page for Browsing: Browse by Author

<%@page contentType="text/html" pageEncoding="UTF-8"%>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"

"http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>Mathematics & Computer Science Digital Library System</title>

<meta name="keywords" content="" />

<meta name="description" content="" />

<script type="text/javascript" src="lib/jquery-1.3.2.min.js"></script>

<script type="text/javascript" src="lib/jquery.tools.js"></script>

<script type="text/javascript" src="lib/jquery.custom.js"></script>

<link href="styles.css" rel="stylesheet" type="text/css" />

<link href="style.css" rel="stylesheet" type="text/css" />

</head>

<script language="JAVASCRIPT" type="TEXT/JAVASCRIPT">

function confirmMessage()

{

//show an alert box telling the user that the file was uploaded

{

alert("File successfully uploaded to server");

}

}

$(document).ready(function()

{

var passfield = document.getElementById('password_field_id');

//this page may not contain a password field, so guard against a null lookup

if (passfield) { passfield.type = 'text'; }

});

function focusCheckDefaultValue(field, type, defaultValue)

{

if (field.value == defaultValue)

{

field.value = '';

}

if (type == 'pass')

{

field.type = 'password';

}

}

function blurCheckDefaultValue(field, type, defaultValue)

{

if (field.value == '')

{

field.value = defaultValue;

}

if (type == 'pass' && field.value == defaultValue)

{

field.type = 'text';

}

else if (type == 'pass' && field.value != defaultValue)

{

field.type = 'password';

}

}

</script>

<body>

<div id="bg">

<div id="main">

<div id="content">

<div class="navi"></div> <!-- automatically creates the navigation points depending on the number of items -->

<div id ="header1">

<div id="menu">

<ul>

<li id="button1"><a href="macsdl.jsp" title="">Home</a></li>

<li id="button2"><a href="ByAuthor.jsp" title="">Browse</a></li>

<li id="button2"><a href="#" title="">Contacts</a></li>

</ul>

</div>

<div id ="logos"></div>

<div align="center">

<br/>

<center>

<a href="Browse.jsp" >Browse by author</a><br/><br/>

<a href="BrowseByTittle.jsp" >Browse by title</a><br/><br/>

<a href="BrowsebyDate.jsp" >Browse by date</a><br/><br/>

</center>

</div>

</div>

<br/><br/><br/><br/><br/><br/><br/><br/><br/><br/><br/><br/>

<div class="razd_g"></div><br />

<div class="col">

<h1>Add to the MACS DL</h1>

<img src="images/col_img1.jpg" class="img_l" alt="" />Add your objects and share them with the NUL community by uploading your

files<br/>to the server, then download the material you need most!

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">Browse by date</h1>

<img src="images/col_img2.jpg" class="img_l" alt="" />Browse the collection by date, specify the date and browse freely.

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">SEARCH MACS DL</h1>

<img src="images/col_img3.jpg" class="img_l" alt="" />Type any query in the above search text field and click the search

button. Get the results instantly!

</div>

<div style="clear: both"></div>

<div style="height:15px; width: 100%"></div>

<div class="razd_g"></div>

<div style="clear: both"></div>

</div>

<div id="content_bot"></div>

<!-- content ends -->

<div style="height:15px; width: 100%"></div>

<!-- bottom end -->

<!-- footer begins -->

<div id="footer">

<p>Copyright 2012</p><p>Design by

<a href="http://www.nul.ls" title="MACS DL">Mosola Napo N</a>

<!--End of notice --></p><!-- end of copyright notice-->

</div>

<!-- footer ends -->

</div>

</div>

</body>

</html>

//Java server Page for Browsing: Browse by Title

<%@page contentType="text/html" pageEncoding="UTF-8"%>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"

"http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>Mathematics & Computer Science Digital Library System</title>

<meta name="keywords" content="" />

<meta name="description" content="" />

<script type="text/javascript" src="lib/jquery-1.3.2.min.js"></script>

<script type="text/javascript" src="lib/jquery.tools.js"></script>

<script type="text/javascript" src="lib/jquery.custom.js"></script>

<link href="styles.css" rel="stylesheet" type="text/css" />

</head>

<script language="JAVASCRIPT" type="TEXT/JAVASCRIPT">

function confirmMessage()

{

//show an alert box telling the user that the file was uploaded

{

alert("File successfully uploaded to server");

}

}

$(document).ready(function()

{

var passfield = document.getElementById('password_field_id');

//this page may not contain a password field, so guard against a null lookup

if (passfield) { passfield.type = 'text'; }

});

function focusCheckDefaultValue(field, type, defaultValue)

{

if (field.value == defaultValue)

{

field.value = '';

}

if (type == 'pass')

{

field.type = 'password';

}

}

function blurCheckDefaultValue(field, type, defaultValue)

{

if (field.value == '')

{

field.value = defaultValue;

}

if (type == 'pass' && field.value == defaultValue)

{

field.type = 'text';

}

else if (type == 'pass' && field.value != defaultValue)

{

field.type = 'password';

}

}

</script>

<body>

<div id="bg">

<div id="main">

<div id="content">

<div class="navi"></div> <!-- automatically creates the navigation points depending on the number of items -->

<div id ="header1">

<div id="menu">

<ul>

<li id="button1"><a href="macsdl.jsp" title="">Home</a></li>

<li id="button2"><a href="ByAuthor.jsp" title="">Browse</a></li>

<li id="button2"><a href="#" title="">Contacts</a></li>

</ul>

</div>

<div id ="logos"></div>

</div>

<br/><br/>

<div id="main">

<%@page import="java.io.*"%>

<%@page import="java.sql.*,java.util.*" %>

<%

String nam=request.getParameter("Name");

if(nam!=null)

{

Connection conn=null;

ResultSet results=null;

PreparedStatement stat;

Class.forName("oracle.jdbc.driver.OracleDriver");

conn=DriverManager.getConnection

("jdbc:oracle:thin:dl/admin@localhost:1521/XE");

// Bind the search term as a parameter so that user input cannot alter the SQL.

stat=conn.prepareStatement("Select reference from browse "+

"Where title Like ?");

stat.setString(1,"%"+nam.toLowerCase()+"%");

results = stat.executeQuery();

while (results.next()) {

String filename=results.getString("reference");

%>

<!--embed src="test.pdf" width="800px" height="110px"></embed--->

<!--a href="test.pdf">test</a-->

<br><br><br>

<center>

<h1>Browse Results:</h1>

<div id="main">

<div class="main_top"></div>

<div class="main_bg1">

<tr>

<td>

<a href="downloadfile.jsp?<%=filename%>"><h2><%=filename%> </h2>

</a><br><br>

</td>

</tr>

</div>

<div class="main_bot"></div>

</div>

</center>

<%

}

results.close();

// Also release the statement and the database connection.

stat.close();

conn.close();

}

else

{%>

<div id="search">

<b>Enter The Title Of The Book:</b>

<form method="get" action="BrowseByTittle.jsp">

<fieldset>

<input type="text" name="Name" id="search-text" size="25" value ="Title"

onFocus="javascript:focusCheckDefaultValue(this, '', 'Title');"

onBlur="javascript:blurCheckDefaultValue(this, '', 'Title');"

>

<input type="submit" id="search-submit" value="" />

</fieldset>

</form>

</div>

<%

}%>

<br/><br/><br/><br/><br/><br/>

<div class="razd_g"></div><br />

<div class="col">

<h1>Add to the MACS DL</h1>

<img src="images/col_img1.jpg" class="img_l" alt="" />Add your objects and share them with the NUL community by uploading your

files<br/>to the server, then download the material you need most!

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">Browse by date</h1>

<img src="images/col_img2.jpg" class="img_l" alt="" />Browse the collection by date, specify the date and browse freely.

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">SEARCH MACS DL</h1>

<img src="images/col_img3.jpg" class="img_l" alt="" />Type any query in the above search text field and click the search

button. Get the results instantly!

</div>

<div style="clear: both"></div>

<div style="height:15px; width: 100%"></div>

<div class="razd_g"></div>

<div style="clear: both"></div>

</div>

<div id="content_bot"></div>

<!-- content ends -->

<div style="height:15px; width: 100%"></div>

</div>

</div>

</div>

</body>

</html>
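The browse pages match the user's input against a column with a SQL Like pattern of the form %term%. When the term itself contains the wildcard characters % or _, these are interpreted as wildcards too, so a search for "50%" also matches titles that merely contain "50". The sketch below shows one way to build the pattern so that such characters are matched literally. The class LikePatternSketch and the method likePattern are hypothetical, and the backslash escape character is an assumption that has to be declared in the query with an ESCAPE clause.

```java
// Hypothetical helper: build a "contains" pattern for a SQL LIKE comparison.
class LikePatternSketch {

    // Lowercases the term (the pages store metadata in lowercase), escapes
    // the LIKE wildcards and the escape character itself with a backslash,
    // and wraps the result in % so that it matches anywhere in the column.
    static String likePattern(String term) {
        StringBuilder sb = new StringBuilder("%");
        for (char c : term.toLowerCase().toCharArray()) {
            if (c == '%' || c == '_' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('%').toString();
    }

    public static void main(String[] args) {
        // The pattern would be bound to a statement such as
        //   Select reference from browse Where title Like ? Escape '\'
        // using PreparedStatement.setString(1, pattern).
        System.out.println(likePattern("Java"));
    }
}
```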

//Java server page for Browsing: Browse by date

<%@page contentType="text/html" pageEncoding="UTF-8"%>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"

"http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>Mathematics & Computer Science Digital Library System</title>

<meta name="keywords" content="" />

<meta name="description" content="" />

<script type="text/javascript" src="lib/jquery-1.3.2.min.js"></script>

<script type="text/javascript" src="lib/jquery.tools.js"></script>

<script type="text/javascript" src="lib/jquery.custom.js"></script>

<link href="styles.css" rel="stylesheet" type="text/css" />

</head>

<script language="JAVASCRIPT" type="TEXT/JAVASCRIPT">

function confirmMessage()

{

//show an alert box telling the user that the file was uploaded

{

alert("File successfully uploaded to server");

}

}

$(document).ready(function()

{

var passfield = document.getElementById('password_field_id');

//this page may not contain a password field, so guard against a null lookup

if (passfield) { passfield.type = 'text'; }

});

function focusCheckDefaultValue(field, type, defaultValue)

{

if (field.value == defaultValue)

{

field.value = '';

}

if (type == 'pass')

{

field.type = 'password';

}

}

function blurCheckDefaultValue(field, type, defaultValue)

{

if (field.value == '')

{

field.value = defaultValue;

}

if (type == 'pass' && field.value == defaultValue)

{

field.type = 'text';

}

else if (type == 'pass' && field.value != defaultValue)

{

field.type = 'password';

}

}

</script>

<body>

<div id="bg">

<div id="main">

<div id="content">

<div class="navi"></div> <!-- automatically creates the navigation points depending on the number of items -->

<div id ="header1">

<div id="menu">

<ul>

<li id="button1"><a href="macsdl.jsp" title="">Home</a></li>

<li id="button2"><a href="ByAuthor.jsp" title="">Browse</a></li>

<li id="button2"><a href="#" title="">Contacts</a></li>

</ul>

</div>

<div id ="logos"></div>

</div>

<br/><br/>

<div id="main">

<%@page import="java.io.*"%>

<%@page import="java.sql.*,java.util.*" %>

<%

String nam=request.getParameter("Name");

if(nam!=null)

{

Connection conn=null;

ResultSet results=null;

PreparedStatement stat;

Class.forName("oracle.jdbc.driver.OracleDriver");

conn=DriverManager.getConnection

("jdbc:oracle:thin:dl/admin@localhost:1521/XE");

// Bind the search term as a parameter so that user input cannot alter the SQL.

stat=conn.prepareStatement("Select reference from browse "+

"Where date_modified Like ?");

stat.setString(1,"%"+nam.toLowerCase()+"%");

results = stat.executeQuery();

while (results.next()) {

String filename=results.getString("reference");

%>

<!--embed src="test.pdf" width="800px" height="110px"></embed--->

<!--a href="test.pdf">test</a-->

<br><br><br>

<center>

<h1>Browse Results:</h1>

<div id="main">

<div class="main_top"></div>

<div class="main_bg1">

<tr>

<td>

<a href="downloadfile.jsp?<%=filename%>"><h2><%=filename%> </h2>

</a><br><br>

</td>

</tr>

</div>

<div class="main_bot"></div>

</div>

</center>

<%

}

results.close();

// Also release the statement and the database connection.

stat.close();

conn.close();

}

else

{%>

<div id="search">

<b>Enter Year Of Publication:</b>

<form method="get" action="BrowsebyDate.jsp">

<fieldset>

<input type="text" name="Name" id="search-text" size="25" value ="Year"

onFocus="javascript:focusCheckDefaultValue(this, '', 'Year');"

onBlur="javascript:blurCheckDefaultValue(this, '', 'Year');"

>

<input type="submit" id="search-submit" value="" />

</fieldset>

</form>

</div>

<%

}%>

<br/><br/><br/><br/><br/><br/>

<div class="razd_g"></div><br />

<div class="col">

<h1>Add to the MACS DL</h1>

<img src="images/col_img1.jpg" class="img_l" alt="" />Add your objects and share them with the NUL community by uploading your

files<br/>to the server, then download the material you need most!

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">Browse by date</h1>

<img src="images/col_img2.jpg" class="img_l" alt="" />Browse the collection by date, specify the date and browse freely.

</div>

<div class="col_razd"></div>

<div class="col">

<h1 class="tit">SEARCH MACS DL</h1>

<img src="images/col_img3.jpg" class="img_l" alt="" />Type any query in the above search text field and click the search

button. Get the results instantly!

</div>

<div style="clear: both"></div>

<div style="height:15px; width: 100%"></div>

<div class="razd_g"></div>

<div style="clear: both"></div>

</div>

<div id="content_bot"></div>

<!-- content ends -->

<div style="height:15px; width: 100%"></div>

</div>

</div>

</div>

</body>

</html>

//Java Source code for search results

<%@page contentType="text/html" pageEncoding="UTF-8"%>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"

"http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>Mathematics & Computer Science Digital Library System</title>

<meta name="keywords" content="" />

<meta name="description" content="" />

<script type="text/javascript" src="lib/jquery-1.3.2.min.js"></script>

<script type="text/javascript" src="lib/jquery.tools.js"></script>

<script type="text/javascript" src="lib/jquery.custom.js"></script>

<link href="styles.css" rel="stylesheet" type="text/css" />

</head>

<script language="JAVASCRIPT" type="TEXT/JAVASCRIPT">

function confirmMessage()

{

//show an alert box telling the user that the file was uploaded

{

alert("File successfully uploaded to server");

}

}

$(document).ready(function()

{

var passfield = document.getElementById('password_field_id');

//this page may not contain a password field, so guard against a null lookup

if (passfield) { passfield.type = 'text'; }

});

function focusCheckDefaultValue(field, type, defaultValue)

{

if (field.value == defaultValue)

{

field.value = '';

}

if (type == 'pass')

{

field.type = 'password';

}

}

function blurCheckDefaultValue(field, type, defaultValue)

{

if (field.value == '')

{

field.value = defaultValue;

}

if (type == 'pass' && field.value == defaultValue)

{

field.type = 'text';

}

else if (type == 'pass' && field.value != defaultValue)

{

field.type = 'password';

}

}

</script>

<body>

<div id="bg">

<div id="main">

<div id="content">

<div class="navi"></div>

<div id ="header1">

<div id="menu">

<ul>

<li id="button1"><a href="macsdl.jsp" title="">Home</a></li>

<li id="button2"><a href="ByAuthor.jsp" title="">Browse</a></li>

<li id="button2"><a href="#" title="">Contacts</a></li>

</ul>

</div>

<div id ="logos"></div>

<div id="search">

<form method="get" action="searchResults.jsp">

<fieldset>

<input type="text" name="search" id="search-text" size="25" value ="Search"

onFocus="javascript:focusCheckDefaultValue(this, '', 'Search');"

onBlur="javascript:blurCheckDefaultValue(this, '', 'Search');"

>

<input type="submit" id="search-submit" value="" />

</fieldset>

</form>

</div>

</div>

<br/><br/>

<center>

<br/><br/><br/>

<h1>Search Results related to the query</h1>

<div id="main2">

<div class="main_top"></div>

<div class="main_bg1">

<!--p style="line-height: 200%; margin-bottom: 3px" >First Name :</p-->

<tr>

<td>

<%@page import="InvertedIndex.*,java.io.*"%>

<%//Display the documents that match the search query

String Path=

"C:/Users/KELVIN/Documents/NetBeansProjects/DigitalLibrarySearch/documents/";

IndexBuilder invertedIndex =

new IndexBuilder(Path+"stopwords.txt");

invertedIndex.ReadIndexFromDisk(Path+"invertedIndex.object");

invertedIndex.AnswerQuery(request.getParameter("search"));

File filename;

for(int i=0;i<invertedIndex.QueryResults.size();i++)

{

filename=new File((String)invertedIndex.QueryResults.get(i));

String file=filename.getName();

%>

<h1>

<a href="downloadfile.jsp?<%=file%>">

<%=file%>

</a></h1>

<%

}

%>

</td>

</tr><br /><br/>

</div>

<div class="main_bot"></div>

</div>

</center>

<!-- content ends -->

</div>

</div>

</body>

</html>

//Source code for downloading a file from Server, downloadfile.jsp

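<%@page import="java.io.*"%>

<%

//The requested file name arrives as the raw query string.

String filename=request.getQueryString();

String Path="C:/Users/KELVIN/Documents/NetBeansProjects/DigitalLibrarySearch/documents/";

//Keep only the base name so that a request cannot reach outside the documents folder.

File file=new File(Path+new File(filename).getName());

BufferedInputStream reader=

new BufferedInputStream(new FileInputStream(file));

try

{

//servlet=response.getOutputStream();

response.setContentType("APPLICATION/OCTET-STREAM");

response.setHeader("Content-Disposition","attachment;filename="+file.getName());

//response.setContentLength((int)file.length());

//start to read file contents in bytes

int iterator=0;

//assign (=) the byte just read, then compare it with -1, which marks the end of the file

while((iterator=reader.read())!= -1)

out.write(iterator);

reader.close();

out.close();

}

//Errors were caught: record them in the server log instead of discarding them

catch(Exception error){ application.log("Download failed for "+filename, error); }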

%>
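The download page copies the file to the response one byte at a time. Copying through a byte array is the more common approach, because it needs far fewer read and write calls. The following is a minimal sketch of such a copy loop; the class StreamCopySketch, the method copy and the 8192-byte buffer size are illustrative choices, and in a servlet the target stream would typically be the one obtained from response.getOutputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copy a stream in blocks instead of byte by byte.
class StreamCopySketch {

    // Reads up to 8192 bytes at a time until read() reports end of stream
    // (-1), writes exactly the bytes that were read, and returns the total.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the file and the HTTP response.
        byte[] data = "example file contents".getBytes("UTF-8");
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied == data.length);
    }
}
```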