
BSc Thesis


MULTI KEY INDEXING FOR DISTRIBUTED DATABASE MANAGEMENT SYSTEM

BY

MD. SHAZZAD HOSAIN STUDENT NO: 9505025

For the partial fulfillment of B.Sc. Engineering Degree in Computer Science and Engineering

SUPERVISED BY MD. HUMAYUN KABIR

Assistant Professor &

ABDUL HAKIM NEWTON Lecturer

Department of Computer Science & Engineering, BUET

Department of Computer Science & Engineering Bangladesh University of Engineering & Technology, Dhaka – 1000, Bangladesh


Acknowledgement

I like to express my sincerest appreciation and profound gratitude to my supervisors

Md. Humayun Kabir, Assistant Professor and Abdul Hakim Newton, Lecturer,

Department of Computer Science and Engineering, BUET, for their supervision,

encouragement and guidance. Especially, Md. Humayun Kabir has keen interest in

distributed database, and his valuable suggestions and advice were the source of all

inspirations to me. I also like to convey gratitude to all my course teachers here. Their

teaching helps me a lot to start and complete this thesis work. Finally, I would like to

acknowledge the assistance and contribution of a large number of individuals and

express my gratefulness to them.

(Md. Shazzad Hosain )


Abstract

Complex access structures such as indexes are a major aspect of centralized databases. The support for these structures is an important part of database management systems (DBMSs). The reason for providing indexes is to obtain fast and efficient access to data. Most centralized database management systems use B-tree or other types of index structures such as bit vector, graph structure, or grid file indexes. But in distributed databases no index structure is used to obtain fast and efficient access to the data; therefore, efficient access is a major problem in distributed databases. We propose a distributed index model, a data structure based index comprising two types of index structures: a Global Index (GI) and a Local Index (LI). The GI is created and maintained by the distributed database component (DDB) and the LI is created and maintained by the local database component (DB) of a distributed database management system. Our proposed global index uses the techniques of bit vector, graph structure, and grid file organization. This distributed database index is implemented with a multi key indexing technique, which enables us to search for a record by more than one attribute value. A simulation program tested the proposed model and found satisfactory results.


Chapter 1

Introduction

1.1 An overview of distributed database system

The traditional database approach keeps all data centrally and then accesses them mostly in a client server model. But in a distributed database system, data are distributed geographically over sites. Let us clarify this feature through an example. Say there are three branches of a bank at different sites. There will be two types of transactions: local transactions and global transactions. A local transaction involves only accounts at the same site. But if money needs to be transferred from an account at one site to an account at another site, a global transaction occurs. In that case the program has to access data across sites, which needs much attention: transactions over the network, speed, efficient access, integrity, recovery, concurrency control, privacy, security, and many other things. All these tasks are done by DDBMSs (Distributed Database Management Systems). When a local operation happens, everything proceeds as usual; but when a global operation is needed, it is the DDBMS that determines which sites to access and how to perform the operation.

An important property of DDBMSs is whether they are homogeneous or heterogeneous. Homogeneity and heterogeneity can be considered at different levels in a distributed database: the hardware, the operating system, and the local DBMSs. However, the important level for us is that of the local DBMSs, because the communication software manages differences at the lower levels. Therefore, the term homogeneous DDBMS refers to a DDBMS with the same DBMS at each site, even if the computers and/or the operating systems are not the same. A heterogeneous DDBMS instead uses at least two different DBMSs. This adds to the complexity of homogeneous DDBMSs the problem of translating between the data models of the different local DBMSs. So if a global schema for a new system is to be designed top-down, a homogeneous system is a fine solution; but if different local DBMSs already exist and those systems have to be integrated, then a heterogeneous system will obviously emerge, and in that case the DDBMS has to cope with this heterogeneity.

1.2 Reference architecture of distributed database

In order to understand distributed database systems we have to study the reference architecture of distributed databases. Although it is not explicitly implemented in all distributed databases, we have to analyze and understand all the components of this architecture to have a better knowledge of distributed databases.

Fig 1.1: The reference architecture for distributed database.[8]

The reference model has two main parts

1. Site independent schemas

2. Site dependent schemas

(Figure 1.1 shows the global schema above the fragmentation schema and the allocation schema, which map down to the local mapping schemas, the DBMSs, and the local databases of site 1, site 2, and the other sites.)


1.2.1 Site independent schemas:

These schemas comprise the following parts:

• Global schema

• Fragmentation schema

• Allocation schema

1.2.1.1 Global schema:

This schema defines all the data that are contained in the distributed database as if

the database were not distributed at all. Therefore, this schema is defined exactly as the schema of a nondistributed database. However, the data model that is used for the definition of the global schema should be suitable for defining the mapping

to the other levels of the distributed database. For this purpose, the relational data

model will be used. Using this model, the global schema consists of the definition of a

set of global relations.

1.2.1.2 Fragmentation schema:

Each global relation can be split into several nonoverlapping portions that are

called fragments. The mapping between global relations and fragments is defined as

fragmentation schema. This mapping is one to many: i.e., several fragments

correspond to one global relation, but only one global relation corresponds to one

fragment. Fragments are indicated by a global relation name with an index (fragment

index); for example, Ri indicates the ith fragment of global relation R.

1.2.1.3 Allocation schema:

Fragments are logical portions of global relations that are physically located at one

or several sites of the network. The allocation schema defines at which sites a

fragment is located. All the fragments that correspond to the same global relation R

and are located at the same site j constitute the physical image of global relation at site

j. Therefore, there is a one-to-one mapping between a physical image and a pair (global relation name, site index); a physical image can thus be indicated by a global relation name and a site index. To distinguish

them from fragments, we will use a superscript; for example, Rj indicates the physical

image of the global relation R at site j.


An example of the relationship between the object types defined above is shown in fig 1.2. A global relation R is split into four fragments R1, R2, R3, and R4. These four fragments are allocated redundantly at the three sites of a computer network, thus building three physical images R1, R2, and R3.

Fig 1.2: Fragments and physical images for a global relation [8]

To complete the terminology, we will refer to a copy of a fragment at a given site and denote it using the global relation name and two indexes. For example, in fig 1.2, the notation R32 indicates the copy of fragment R2 that is located at site 3.

Finally, note that two physical images can be identical. In this case, we will say that one physical image is a copy of another. For example, in fig 1.2, R1 is a copy of R2.

1.2.2 Site dependent schemas:

These schemas comprise the following parts:

• Local mapping schema

• DBMS of the local site

• Local database at that site



1.2.2.1 Local mapping schema:

Since the top three levels are site independent, they do not depend on the data model of the local DBMS. At a lower level, it is necessary to map the physical

images to the objects that are manipulated by the local DBMS. This mapping is called

a local mapping schema and depends on the type of local DBMS; therefore, in a

heterogeneous system we have different types of local mappings at different sites.

This reference architecture provides a very general conceptual framework for

understanding distributed database systems. The three most important features that motivated the design of this architecture are

• Separation of data fragmentation and allocation.

• The control of redundancy.

• The independence from local DBMS.

• Separation of data fragmentation and allocation:

This separation allows us to distinguish two different levels of distribution

transparency, namely fragmentation transparency and location transparency.

Fragmentation transparency is the highest degree of transparency, whereas location transparency is a lower degree. The separation between the

concept of fragmentation and allocation is very convenient for the distributed

database because the determination of relevant portions of the data is thus

distinguished from the problem of optimal allocation.

• Explicit control of redundancy:

In fig 1.2 the two physical images R2 and R3 are overlapping; i.e., they contain

common data. The definition of disjoint fragments as building blocks of physical

images allows us to refer explicitly to this overlapping part: the replicated fragment

R2. As we shall see, the explicit control over redundancy is useful in several aspects

of distributed database management.

• Independence from local DBMS:

The feature of local mapping transparency allows us to build a distributed database system that is homogeneous or heterogeneous. In a homogeneous system, the site independent schemata can be defined using the same data model as the local DBMSs, while in a heterogeneous system the local mapping schemata help to coordinate the different kinds of DBMS.

Another kind of transparency that is strictly related to location transparency is

replication transparency. Replication transparency means that the user is unaware of

the replication of fragments.

1.2.3 Types of data fragmentation:

Data fragmentation is of two basic types: horizontal fragmentation and vertical fragmentation. More complex fragmentations can be obtained by combining these two types. In all types of fragmentation, a fragment

can be defined by an expression in a relational language that takes global relations as

operands and produces the fragment as result. There are some rules that must be

followed when defining fragments.

1.2.3.1 Completeness condition:

All the data of the global relation must be mapped into the fragments; i.e., it must

not happen that a data item that belongs to a global relation does not belong to any

fragment.

1.2.3.2 Reconstruction condition:

It must be always possible to reconstruct the global relation from those fragments.

This condition is necessary because a distributed database stores only the fragments at the different sites, and global relations have to be built through this reconstruction operation when necessary.

1.2.3.3 Disjointness condition:

It is convenient that fragments should be disjoint, so that the replication of data

can be controlled explicitly at the allocation level. However, this condition is satisfied

by horizontal fragmentation, while in vertical fragmentation it is sometimes violated.
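These three conditions can be checked mechanically for a horizontal fragmentation. The following sketch is not from the thesis (the relation and fragments are invented); it models a relation as a Python set of tuples:

```python
def completeness(relation, fragments):
    # Every tuple of the global relation must belong to some fragment.
    return relation <= set().union(*fragments)

def reconstruction(relation, fragments):
    # For horizontal fragmentation, reconstruction is the union of fragments.
    return set().union(*fragments) == relation

def disjointness(fragments):
    # No tuple may appear in two different fragments.
    seen = set()
    for frag in fragments:
        if seen & frag:
            return False
        seen |= frag
    return True

# Hypothetical relation of (supplier number, city) pairs, split by city.
R = {("S1", "SF"), ("S2", "LA"), ("S3", "SF")}
R1 = {t for t in R if t[1] == "SF"}
R2 = {t for t in R if t[1] == "LA"}
```

Here completeness and reconstruction coincide because, for horizontal fragmentation, the reconstruction operation is simply the union of the fragments.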

1.2.4 Horizontal fragmentation:

Horizontal fragmentation consists of partitioning the tupelos of a global relation

into subsets that is very much useful for the distributed database system. If data are

Page 12: BSc Thesis

fragmented by some common properties then those data can be stored geographically

in a convenient way. Here we can clarify this by an example. Let a global relation be

SUPPLIER (SNUM, NAME, CITY)

Here SUPPLIER contains the supplier number, the supplier name, and the city where the supplier lives. If all the suppliers come from either San Francisco (“SF”) or Los Angeles (“LA”), then the horizontal fragmentation can be defined in the following way:

SUPPLIER1 = SL CITY = “SF” SUPPLIER

SUPPLIER2 = SL CITY = “LA” SUPPLIER

The above fragmentation satisfies the completeness condition because “SF” and “LA” are the only possible values of the CITY attribute; otherwise, we would not know to which fragment the tuples with other CITY values belong.

Again, the reconstruction condition is easily verified, because it is always possible

to reconstruct the SUPPLIER global relation through the union operation:

SUPPLIER = SUPPLIER1 UN SUPPLIER2
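As an illustration only (the supplier tuples below are invented), this fragmentation and its reconstruction can be mimicked in a few lines of Python, with select() playing the role of the SL operator:

```python
def select(relation, predicate):
    # The SL operator: keep the tuples satisfying the predicate.
    return [t for t in relation if predicate(t)]

SUPPLIER = [
    {"SNUM": "S1", "NAME": "Smith", "CITY": "SF"},
    {"SNUM": "S2", "NAME": "Jones", "CITY": "LA"},
    {"SNUM": "S3", "NAME": "Blake", "CITY": "SF"},
]

SUPPLIER1 = select(SUPPLIER, lambda t: t["CITY"] == "SF")
SUPPLIER2 = select(SUPPLIER, lambda t: t["CITY"] == "LA")

# Reconstruction: the union of the two fragments gives back SUPPLIER.
reconstructed = SUPPLIER1 + SUPPLIER2
```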

1.2.5 Vertical fragmentation:

The vertical fragmentation can be obtained by the subdivision of its attributes into

global groups. It is useful when the subgroups have the same common geographical

properties. For example a global relation

EMPLOYEE (EMPNUM, SAL, TAX, MGRNUM, DEPTNUM)

A vertical fragmentation of this relation can be defined as

EMPLOYEE1 = PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMPLOYEE

EMPLOYEE2 = PJ EMPNUM, SAL, TAX EMPLOYEE

The fragmentation could, for instance, reflect an organization in which salaries

and taxes are managed separately. The reconstruction of relation EMPLOYEE can be

obtained as

EMPLOYEE = EMPLOYEE1 JOIN EMPNUM = EMPNUM EMPLOYEE2
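The same example can be sketched in Python (the sample tuples are invented), with project() playing the role of the PJ operator and the reconstruction performed by a join on the key EMPNUM:

```python
def project(relation, attrs):
    # The PJ operator: keep only the named attributes of each tuple.
    return [{a: t[a] for a in attrs} for t in relation]

def join(r, s, key):
    # Equi-join on a common key; merged tuples combine both attribute sets.
    return [{**tr, **ts} for tr in r for ts in s if tr[key] == ts[key]]

EMPLOYEE = [
    {"EMPNUM": 1, "NAME": "A", "SAL": 100, "TAX": 10, "MGRNUM": 9, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "B", "SAL": 200, "TAX": 20, "MGRNUM": 9, "DEPTNUM": 7},
]

EMPLOYEE1 = project(EMPLOYEE, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
EMPLOYEE2 = project(EMPLOYEE, ["EMPNUM", "SAL", "TAX"])

# Reconstruction: joining the two vertical fragments on EMPNUM.
reconstructed = join(EMPLOYEE1, EMPLOYEE2, "EMPNUM")
```

The key EMPNUM is replicated in both fragments, which is exactly why the disjointness condition may be violated by vertical fragmentation.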

Page 13: BSc Thesis

1.2.6 Mixed fragmentation:

The fragments obtained by the above fragmentation operations can themselves be fragmented further, applying the operations recursively. The global relation can finally be reconstructed by applying the reconstruction rules in inverse order. Consider, for example, the same global relation

EMPLOYEE (EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)

The following is a mixed fragmentation that is obtained by applying the vertical

fragmentation of the previous example, followed by a horizontal fragmentation on

DEPTNUM:

EMPLOYEE1 = SL DEPTNUM <= 10 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMPLOYEE

EMPLOYEE2 = SL 10 < DEPTNUM <= 20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMPLOYEE

EMPLOYEE3 = SL DEPTNUM > 20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMPLOYEE

EMPLOYEE4 = PJ EMPNUM, NAME, SAL, TAX EMPLOYEE

The reconstruction of relation EMPLOYEE is defined by the following expression:

EMPLOYEE = UN (EMPLOYEE1, EMPLOYEE2, EMPLOYEE3) JN EMPNUM = EMPNUM PJ EMPNUM, SAL, TAX EMPLOYEE4

Here the following conventions are used: JN = join, UN = union, SL = selection, PJ = projection.
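A small Python sketch of this mixed fragmentation (the sample tuples are invented) applies the horizontal split on DEPTNUM after the vertical projection:

```python
def project(rel, attrs):
    # The PJ operator.
    return [{a: t[a] for a in attrs} for t in rel]

def select(rel, pred):
    # The SL operator.
    return [t for t in rel if pred(t)]

EMPLOYEE = [
    {"EMPNUM": 1, "NAME": "A", "SAL": 100, "TAX": 10, "MGRNUM": 9, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "B", "SAL": 200, "TAX": 20, "MGRNUM": 9, "DEPTNUM": 15},
    {"EMPNUM": 3, "NAME": "C", "SAL": 300, "TAX": 30, "MGRNUM": 9, "DEPTNUM": 25},
]

# Vertical projection first, then the horizontal split on DEPTNUM.
V = project(EMPLOYEE, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
EMPLOYEE1 = select(V, lambda t: t["DEPTNUM"] <= 10)
EMPLOYEE2 = select(V, lambda t: 10 < t["DEPTNUM"] <= 20)
EMPLOYEE3 = select(V, lambda t: t["DEPTNUM"] > 20)
EMPLOYEE4 = project(EMPLOYEE, ["EMPNUM", "NAME", "SAL", "TAX"])
```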

Fig 1.3: The fragmentation tree of relation EMPLOYEE [8]

Mixed fragmentation can be conveniently represented by a fragmentation tree. In a fragmentation tree, the root corresponds to a global relation, the leaves correspond to the fragments, and the intermediate nodes correspond to the intermediate results of the fragment-defining expressions.



Chapter 2

Design of distributed database system

2.1 Objectives of the design of data distribution:

In the design of data distribution, we have to consider the following objectives:

2.1.1 Process locality:

In a distributed database system there are remote references as well as local references. Since a remote reference costs more than a local one, the main issue is to minimize remote references. Thus, in distributing data, to minimize remote references (or to maximize local references) we have to place data as close as possible to the applications which use them.

Designing the data distribution for maximum processing locality can be done by adding up the numbers of local and remote references corresponding to each candidate fragmentation and fragment allocation, and selecting the best solution among them.

2.1.2 Availability and reliability of distributed data:

A high degree of availability for read-only applications is achieved by storing multiple copies of the same information. The system must be able to switch to an alternative copy when the one that should be accessed under normal conditions is not available.

Reliability can also be achieved by storing multiple copies of data. If one copy is destroyed by events which destroy computers or storage, the data are still available from another copy. Since physical destruction can be caused by fire, earthquake, or sabotage, it is relevant to store the replicated copies in geographically dispersed locations.


2.1.3 Workload distribution:

Distributing the workload over the sites is an important feature of distributed computer systems. Workload distribution is done in order to take advantage of the different powers or utilizations of the computers at each site, and to maximize the degree of parallelism in the execution of applications.

2.1.4 Storage cost and availability:

Although the cost of data storage is not significant compared with the CPU, I/O, and transmission costs of applications, the limited availability of storage at each site must be considered.

2.2 Distributed database design:

There are two attractive approaches to designing a distributed database system: top-down and bottom-up.

In the top-down approach, we start by designing the global schema, and we proceed by designing the fragmentation of the database and then allocating the fragments to the sites, creating the physical images. This approach is the most attractive for systems which are developed from scratch, since it allows performing the design rationally.

While the top-down approach is attractive for a system that is to be installed for the first time, for an existing system it is not preferred. In that case existing databases are aggregated, which is a bottom-up approach. This approach is based on the integration of existing schemata into a single global schema. By integration we mean the merging of common data definitions and the resolution of conflicts among different representations given to the same data. In summary, the bottom-up design of a distributed database system requires:

1. The selection of a common database model for describing the global schema of the database.

2. The translation of each local schema into the common data model.

3. The integration of the local schemata into a common global schema.


2.2.1 The allocation of fragments:

The data allocation problem has been widely analyzed in the context of the ‘file allocation problem’. The easiest way to apply this work to the fragment allocation problem is to consider each fragment as a separate file; however, this approach is not convenient for the following reasons:

1. Fragments are not properly modeled as individual files, since in this way we do not take into account the fact that they have the same structure or behavior.

2. There are many more fragments than original global relations, and many analytic models cannot compute the solution of problems involving too many variables.

3. Modeling application behavior in file systems is very simple, while in distributed databases applications can make sophisticated use of data.

Some of these problems have not yet been solved satisfactorily; for instance, problem 3 is particularly difficult, since the correct approach would be to evaluate data distributions by measuring how optimized applications would behave with them. This, however, requires optimizing all the important applications for each possible data allocation; solving query optimization as a subproblem of data allocation is probably too hard, since it requires either an excessive amount of simulation or an excessively complex analytic computation.

2.2.2 General criteria for fragment allocation:

In determining the allocation of fragments, it is important to distinguish whether we design a nonredundant or a redundant final allocation. The simplest method for a nonredundant final allocation is a best-fit approach: a measure is associated with each possible allocation, and the site with the best measure is selected. Replication, however, introduces further complexity in the design, because:

1. The degree of replication of each fragment becomes a variable of the problem.

2. Modeling read applications is complicated by the fact that the applications can now select among several alternative sites for accessing fragments.

For determining the redundant allocation of fragments, either of the following methods can be used:

1. All beneficial sites: determine the set of all sites where the benefit of allocating one copy of the fragment is higher than the cost, and allocate a copy of the fragment to each element of this set.

2. Additional replication: first determine the solution of the nonreplicated problem, and then progressively introduce replicated copies starting from the most beneficial; the process terminates when no additional replication is beneficial.

2.2.3 Measure of costs and benefits of fragment allcation:

Here we will give some very simple formulas for evaluating the costs and benefits of the allocation of fragments of a global relation R. Let us first introduce some definitions:

• i is the fragment index

• j is the site index

• k is the application index

• fkj is the frequency of application k at site j

• rki is the number of retrieval references of application k to fragment i

• uki is the number of update references of application k to fragment i

• nki = rki + uki

2.2.3.1 Horizontal fragmentation:

1. Using the ‘best-fit’ approach for a nonreplicated allocation, we place Ri at the site

where the number of references to Ri is maximum. The number of local references

of Ri at site j is

Bij = ∑k fkj nki [8]

Ri is allocated at site j* such that Bij* is maximum.

2. Using the ‘all beneficial sites’ method for replicated allocation, we place Ri at all

sites j where the cost of retrieval references of applications is larger than the cost

of update references to Ri from applications at any other site. Bij is evaluated as

the difference:

Bij = ∑k fkj rki – C * ∑k ∑j’≠j fkj’ uki [8]

C is a constant which measures the ratio between the cost of an update access and a retrieval access; typically, update accesses are more expensive, since they require a larger number of control messages and local operations.


3. Using the ‘additional replication’ method for replicated allocation, we can measure the benefit of placing a new copy of Ri in terms of the increased reliability and availability of the system. The benefit does not grow proportionally to the degree of redundancy of Ri. Let di denote the degree of redundancy of Ri, and let Fi denote the benefit of having Ri fully replicated at each site. The following function β(di) was introduced to measure this benefit:

β(di) = (1 – 2^(1–di)) Fi [8]

Note that β(1) = 0, β(2) = Fi / 2, β(3) = 3 Fi / 4, and so on. Then we evaluate the benefit of introducing a new copy of Ri at site j by modifying the formula of case 2 as follows:

Bij = ∑k fkj rki – C * ∑k ∑j’≠j fkj’ uki + β(di) [8]

The formula takes into account the degree of replication.
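A numerical sketch of the three measures above (all frequencies and reference counts are invented, and n_ki is taken as r_ki + u_ki, the total number of references):

```python
f = {("k1", 1): 5, ("k1", 2): 1}   # f[k, j]: frequency of application k at site j
r = {("k1", "i1"): 3}              # r[k, i]: retrieval references of k to fragment i
u = {("k1", "i1"): 1}              # u[k, i]: update references of k to fragment i
sites, apps, C = [1, 2], ["k1"], 2.0  # C: update/retrieval cost ratio

def best_fit(i):
    # Nonreplicated case: B_ij = sum_k f_kj * n_ki; allocate at the best site.
    def B(j):
        return sum(f.get((k, j), 0) * (r.get((k, i), 0) + u.get((k, i), 0))
                   for k in apps)
    return max(sites, key=B)

def beneficial(i, j):
    # All-beneficial-sites case:
    # B_ij = sum_k f_kj r_ki - C * sum_k sum_{j' != j} f_kj' u_ki
    gain = sum(f.get((k, j), 0) * r.get((k, i), 0) for k in apps)
    cost = C * sum(f.get((k, jp), 0) * u.get((k, i), 0)
                   for k in apps for jp in sites if jp != j)
    return gain - cost

def beta(d, F):
    # Reliability benefit of redundancy degree d: (1 - 2**(1 - d)) * F
    return (1 - 2 ** (1 - d)) * F
```

With these numbers the fragment is placed at site 1 by the best-fit rule, and the all-beneficial-sites measure is positive only for site 1, so no redundant copy is placed at site 2.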


Chapter 3

Query optimization

3.1 Translation of global queries to fragment queries

So far we have seen that an access operation issued by an application can be

expressed as a query, which references global relations. The DDBMS has to transform

this query into simpler queries, which refer only to fragments. In general, there are

several different ways to transform a query over global relations (called global query)

into queries over fragments (called fragment queries). These different transformations

produce fragment queries, which are equivalent, in the sense that they produce the

same result and this is known as equivalent transformation.

3.1.1 Equivalence transformations for queries:

A relational query can be expressed using different languages; however here we

will use relational algebra and SQL for this purpose because it is possible to transform

most SQL queries into equivalent expressions of relational algebra and vice versa. We

can interpret an expression of relational algebra not only as the specification of the

semantics of a query but also as the specification of a sequence of operations. From

this viewpoint, two expressions with the same semantics can describe two different

sequences of operations. For example,

PJ NAME, DEPTNUM SL DEPTNUM = 15 EMP

and

SL DEPTNUM = 15 PJ NAME, DEPTNUM EMP

are equivalent expressions but define two different sequences of operations.
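This equivalence can be checked on sample data (the tuples are invented): applying selection then projection yields the same relation as projection then selection, because the projection retains the selection attribute DEPTNUM:

```python
def select(rel, pred):
    # The SL operator.
    return [t for t in rel if pred(t)]

def project(rel, attrs):
    # The PJ operator.
    return [{a: t[a] for a in attrs} for t in rel]

EMP = [
    {"NAME": "A", "DEPTNUM": 15, "SAL": 100},
    {"NAME": "B", "DEPTNUM": 20, "SAL": 200},
]

# PJ NAME, DEPTNUM SL DEPTNUM = 15 EMP
q1 = project(select(EMP, lambda t: t["DEPTNUM"] == 15), ["NAME", "DEPTNUM"])

# SL DEPTNUM = 15 PJ NAME, DEPTNUM EMP
q2 = select(project(EMP, ["NAME", "DEPTNUM"]), lambda t: t["DEPTNUM"] == 15)
```

Both sequences produce the single tuple for employee A, but the second applies the selection to a smaller intermediate relation of projected tuples.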

3.1.2 Operator tree of a query:

In order to have a more practical representation of queries, in which expression manipulation is easier to follow, the operator tree is introduced. Let us consider query Q1, which retrieves the supplier numbers of suppliers that have issued a supply order in the

North area of our company. Query Q1 corresponds to the following expression of the

relational algebra:


Q1 : PJ SNUM SL AREA = “North” (SUPPLY JN DEPTNUM = DEPTNUM DEPT)

The operator tree of the corresponding query is given below:

PJ SNUM

SL AREA = “North”

JN DEPTNUM = DEPTNUM

SUPPLY DEPT

Fig 3.1: An operator tree for query Q1 [8]

3.1.3 Equivalent transformations for the relational algebra:

Two relations are equivalent when their tuples represent the same mapping from

attribute names to values, even if the order of attributes is different. Equivalence

transformations can be given systematically for small expressions, i.e., expressions of

two or three operand relations. These transformations are classified into categories

according to the type of the operators involved. Let U and B denote unary and binary

algebraic operations, respectively. We have:

• Commutativity of unary operations

U1U2R ↔ U2U1R

• Commutativity of operands of binary operations:

R B S ↔ S B R

• Associativity of binary operations:

R B (S B T) ↔ (R B S) B T

• Idempotence of unary operations:

UR ↔ U1U2R

• Distributivity of unary operations with respect to binary operations:

U (R B S) → U (R) B U (S)

• Factorization of unary operations

U (R) B U (S) → U (R B S)


In nondistributed databases, general criteria have been given for applying

equivalence transformations for the purpose of simplifying the execution of queries:

Criterion 1. Use idempotence of selection and projection to generate appropriate

selections and projections for each operand relation.

Criterion 2. Push selections and projections down in the tree as far as possible.

These criteria descend from the consideration that binary operations, and

specifically joins, are the most expensive operations of database systems, and

therefore it is convenient to reduce the sizes of operands of binary operations before

performing them. In distributed databases, these criteria are even more important:

binary operations require the comparison of operands that could be allocated at

different sites. Transmission of data is one of the major components of the costs and

delays associated with query execution. Thus, reducing the size of operands of binary

operations is a major concern.

PJ SNUM

JN DEPTNUM = DEPTNUM

PJ SNUM, DEPTNUM          PJ DEPTNUM

SUPPLY                    SL AREA = “North”

                          DEPT

Fig 3.2: A modified operator tree for query Q1 [8]

Fig 3.2 shows a modified operator tree for query Q1, in which the following

transformations have been applied:

1. The selection is distributed with respect to the join; thus, the selection is

applied directly to the DEPT relation.


2. Two new projection operations are generated and are distributed with respect

to the join.

3.1.4 Operator graph and determination of common sub expressions

An important issue in applying transformations to a query expression is to discover its common sub expressions, i.e., sub expressions which appear more than once in the query; clearly, there is a saving if common sub expressions are evaluated only once. An example is given here to illustrate this method. Let Q2 give the names of employees who work in a department whose manager has number 373 but who do not earn more than $35,000. An expression for this query is:

Q2: PJ EMP.NAME ((EMP JN DEPTNUM = DEPTNUM SL MGRNUM = 373 DEPT) DF (SL SAL > 35000 EMP JN DEPTNUM = DEPTNUM SL MGRNUM = 373 DEPT))

The corresponding operator tree and the operator trees after improvement are given on the next page.


3.2 Transforming global queries into fragment queries

In this section we give a standard transformation, which maps an algebraic

expression over the global schema into an algebraic expression over the fragmentation

schema.

3.2.1 Canonical expression of a fragment query

Given an algebraic expression over the global schema, its canonical expression is

obtained by substituting, for each global relation name appearing in it, the algebraic

expression giving the reconstruction of global relations from fragments. In the same

way, we map an operator tree on the global schema to an operator tree on the

fragmentation schema by substituting for the leaves of the first tree the corresponding

expressions of the inverse of the fragmentation schema. The important fact is that the

leaves of the operator tree of the canonical expression are now fragments rather than

global relations.

Fig 3.4(a) shows the transformation of the operator tree of query Q1 represented in fig 3.2 into the operator tree of the canonical expression of Q1. In fig 3.4(a) the two circled subtrees substitute the global relations SUPPLY and DEPT of fig 3.2.


3.2.2 Simplifications of Horizontally Fragmented Relations

Let us consider, for simplicity, the join between two fragmented relations R and S. There are two distinct possibilities for joining them: the first one requires collecting all the fragments of R and S before performing the join; the second one consists of performing the joins between fragments and then collecting all the results into the same result relation. We refer to this second case as a “distributed join”. Neither of the above possibilities dominates the other. Very generally, we prefer the first solution if the conditions on fragments are highly selective, and the second solution if the join between fragments involves few pairs of fragments. The join graph helps determine which fragments need to be joined together. After building the join graph, simplification is made using the following criteria:

Criterion 1: In order to distribute joins which appear in the global query, unions (representing fragment collection) must be pushed up, beyond the joins that we want to distribute.

Criterion 2: Use the algebra of qualified relations to evaluate the qualification of the operands of joins; substitute the subtree, including the join and its operands, with the empty relation if the qualification of the result of the join is contradictory.

Now let us consider an example of a distributed join. We start from query Q3 that

requires the number SNUM of all suppliers having a supply order. The algebraic

expression of the query over the global schema is

Q3: PJ SNUM (SUPPLY NJN SUPPLIER)

Fig 3.5(A) shows the canonical form of the query. We recall that the fragmentation of

SUPPLY is derived from the fragmentation of SUPPLIER; i.e., each tuple of

SUPPLY is stored either in fragment SUPPLY1, if it refers to a supplier of San

Francisco, or in fragment SUPPLY2, if it refers to a supplier of Los Angeles.

Applying criterion 1, we push the two unions up beyond the join; thus, we generate

four joins between fragments. We then apply criterion 2, and we discover that two of

them are intrinsically empty because their qualification is contradictory. The empty

joins are those of SUPPLIER1 (in “SF”) with SUPPLY2 (of “LA” suppliers), and

likewise of SUPPLIER2 (in “LA”) with SUPPLY1 (of “SF” suppliers). Thus the

operator tree reduces to that of fig 3.5(b). Assuming that fragments with the same

index are placed at the same site (i.e., that data about SUPPLY are stored together


with data about SUPPLIERS), this operator tree corresponds to an efficient way of

evaluating the query, because each join is local to one site.

[Fig 3.5(A), canonical form of query Q3: PJ SNUM over the join of UN(SUPPLY1, SUPPLY2) with UN(SUPPLIER1, SUPPLIER2), where SUPPLY1 and SUPPLY2 carry the qualifications SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = “SF” and SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = “LA” respectively, and SUPPLIER1 and SUPPLIER2 the qualifications CITY = “SF” and CITY = “LA”.]

[Fig 3.5(B), distributed join for query Q3: UN of the two local sub trees PJ SNUM (SUPPLY1 JN SUPPLIER1) and PJ SNUM (SUPPLY2 JN SUPPLIER2).]

Fig 3.5: Simplification of joins between horizontally fragmented relations [8]
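The effect of the two criteria can be sketched in a few lines of Python. This is a hypothetical representation in which each fragment carries its qualification as a set of attribute/value predicates; the fragment names follow the Q3 example:

```python
# Sketch of criteria 1 and 2. Each fragment carries a qualification:
# attribute/value pairs that every tuple in the fragment satisfies.
supplier_frags = {"SUPPLIER1": {"CITY": "SF"}, "SUPPLIER2": {"CITY": "LA"}}
supply_frags = {"SUPPLY1": {"CITY": "SF"},   # derived from SF suppliers
                "SUPPLY2": {"CITY": "LA"}}   # derived from LA suppliers

def contradictory(q1, q2):
    """Criterion 2: a join is empty when the combined qualification
    demands two different values for the same attribute."""
    return any(attr in q2 and q2[attr] != val for attr, val in q1.items())

# Criterion 1: pushing the unions above the join turns one join of
# unions into a union of fragment-pair joins; keep the non-empty pairs.
distributed_join = [(s, r)
                    for s, qs in supply_frags.items()
                    for r, qr in supplier_frags.items()
                    if not contradictory(qs, qr)]
print(distributed_join)  # [('SUPPLY1', 'SUPPLIER1'), ('SUPPLY2', 'SUPPLIER2')]
```

Only the two same-city pairs survive, matching the reduction from four joins to two in fig 3.5(B).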


3.3 Using inference for further simplification

Here we want to give examples of how additional information could be used for

simplifying queries. Let us consider again query Q1 that requires the supplier number

of those suppliers having a supply order issued in the North area, for which we have

developed the operator tree in fig 3.4. Assume that the following knowledge is

available to the query optimizer:

1. The North area includes only departments 1 to 10.

2. Orders from departments 1 to 10 are all addressed to suppliers of San Francisco.

We use the above knowledge to “infer” contradictions that allow elimination of

sub-expressions.

a. From above, we can write the following implications:

AREA = “North” => NOT (10 < DEPTNUM <= 20)

AREA = “North” => NOT (DEPTNUM > 20)

Here we have to apply the selection to fragments DEPT1, DEPT2 and DEPT3 and

evaluate the qualification of the results. By virtue of the above implications, two of

them are contradictory. This allows us to eliminate the sub expressions for fragments

DEPT2 and DEPT3. Thus, the operator tree of fig 3.4(B) reduces to that of fig 3.6a.

b. We then apply criterion 1 for distributing the join; in principle, we would need to

join the sub tree including DEPT1 with both sub trees including SUPPLY1 and

SUPPLY2. But from 1 above, we know that:

AREA = “North” => DEPTNUM <= 10

and from 2 above we know that :

DEPTNUM <= 10 => NOT (SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = “LA”)

By applying criterion 2, it is possible to deduce that only the sub tree including

SUPPLY1 needs to be joined. The final operator tree for query Q1 is shown in fig 3.6(b).


[Fig 3.6(a): PJ SNUM over the join of UN(PJ SNUM, DEPTNUM SUPPLY1; PJ SNUM, DEPTNUM SUPPLY2) with PJ DEPTNUM (SL AREA = “North” DEPT1), where DEPT1 carries the qualification 1 <= DEPTNUM <= 10.]

[Fig 3.6(b): PJ SNUM over the join of PJ SNUM, DEPTNUM SUPPLY1 with PJ DEPTNUM (SL AREA = “North” DEPT1).]

Fig 3.6: Simplification of an operator tree using inference [8]


3.4 Simplification of vertically fragmented relations

We now turn to the simplification of vertically fragmented relations, which is dual

to the simplification of horizontally fragmented ones. The rationale behind this

simplification is to determine a proper subset of the fragments, which is sufficient for

answering the query, and then to eliminate all other fragments from the query

expression, as well as the joins, which are used in the inverse of the fragmentation

schema for reconstructing the global relations. In particular, if the fragments which are required reduce to only one fragment, there is no need of performing join operations.

We show the simplification with an example. Consider query Q4, which requires

names and salaries of employees. The query on the global schema is simply

Q4: PJ NAME, SAL EMP

The canonical operator tree of the expression is shown in fig 3.7(a). Recall that

EMP is first vertically partitioned into fragment EMP4 and a second fragment, which

is further partitioned horizontally into EMP1, EMP2 and EMP3. We notice that the

attributes of EMP4 include NAME and SAL, which are required by the query. Then it

is possible to answer the query using only the fragment EMP4, and the operator tree

can be simplified by disregarding the other fragments and the join operation. The final

operator tree for query Q4 is shown in fig 3.7(b).

[Fig 3.7(a), canonical form of query Q4: PJ NAME, SAL over the join on EMPNUM of fragment EMP4 with UN(EMP1, EMP2, EMP3), where EMP1, EMP2 and EMP3 carry the qualifications DEPTNUM <= 10, 10 < DEPTNUM <= 20 and DEPTNUM > 20 respectively.]

[Fig 3.7(b), simplified query: PJ NAME, SAL applied directly to EMP4.]

Fig 3.7: Simplification of vertically fragmented relations [8]
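The fragment-selection step can be sketched as follows. The attribute sets assigned to EMP1–EMP4 here are illustrative assumptions (the text only states that EMP4 includes NAME and SAL, and that every fragment keeps the key EMPNUM):

```python
# Sketch: choose the fragments needed to answer a query over a
# vertically fragmented relation. Attribute sets are assumed.
fragments = {
    "EMP4": {"EMPNUM", "NAME", "SAL"},
    "EMP1": {"EMPNUM", "MGRNUM", "DEPTNUM"},  # horizontal pieces of the
    "EMP2": {"EMPNUM", "MGRNUM", "DEPTNUM"},  # second vertical fragment
    "EMP3": {"EMPNUM", "MGRNUM", "DEPTNUM"},
}

def fragments_needed(required):
    """If one fragment covers all required attributes, use it alone and
    drop the reconstruction join; otherwise keep every fragment that
    contributes a required attribute."""
    for name, attrs in fragments.items():
        if required <= attrs:
            return [name]
    return [n for n, a in fragments.items() if a & required]

print(fragments_needed({"NAME", "SAL"}))  # ['EMP4'] -- join eliminated
```

For query Q4 the single fragment EMP4 suffices, which is exactly the simplification of fig 3.7(b).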


Chapter 4

Multi-key Processing

4.1 Introduction

For primary key indexing, each index entry identifies a unique record in the main

file. Many applications, however, require multi-key retrieval, that is, the retrieval from a file of all records having some combination of attribute values. For example, a college dean might want to generate a list of all students with

• U.S. citizenship

• Physics or math major

• GPA of at least 3.3

• Student identification number less than 150,000

There are likely to be many records in a file with the same value of a particular

secondary attribute. Indexes of various forms are one mechanism for finding them.

Normally a secondary index can produce a list of pointers to records having a particular

value of a secondary key.

However there is a price to be paid. Indexes take up space, and if the file is

changed frequently much time may be spent updating secondary indexes. Multi-key

queries involving ranges of attribute values are awkward to deal with using

conventional indexes. They might be better served by the grid file organization.

4.2 Threaded files

In a threaded file a pointer field is associated with each indexed secondary key

field. The value in the pointer field identifies the next record in the file with the same

value of the secondary key. Thus a number of threads run through the file. An entry

points to the first record having the attribute value and acts as a header of a linked list.

Consider a file in which records have k attributes and suppose that two of these attributes are threaded in the manner described. If the first of these attributes has N different values and the second has M different values, then the general form of the file and indexes would be as shown below.


Record   Attribute  Next     Attribute  Next     Attribute  Attribute       Attribute
Number   1          Pointer  2          Pointer  3          4          ...  k
1        -          -        -          -        -          -               -
2        -          -        -          -        -          -               -
3        -          -        -          -        -          -               -

Fig 4.1: General threaded file [21]

A part of a file of car records with one set of threads (for manufacturer) and part

of the corresponding index is shown below.

Manufacturer index
Ford: 1
VW: 2
BMW: 3
Audi: 11
Honda: 15

Record No  Manuf.  Next Manuf.  Model    Color  Next Color  License
1          Ford    4            Pinto    White  4           HORS4ME
2          VW      6            Bug      Red    9           SKIBNY
3          BMW     9            322I     Black  6           DADIOUI
4          Ford    5            Mustang  White  7           VALEGRL
5          Ford    7            Pinto    Blue   8           RATFACE
6          VW      8            Rabbit   Black  12          910VCD
7          Ford    10           Pinto    White  11          PACMAX
8          VW      12           Rabbit   Blue   10          BYE2NOW
9          BMW     16           320      Red    14          CMEGO
10         Ford    14           Mustang  Blue   17          DPGURI
11         Audi    ?            5000     White  16          OULSK
12         VW      13           Jetta    Black  18          LKWJE
13         VW      ?            Bug      Green  15          WEIOU
14         Ford    18           Mustang  Red    19          SD2332
15         Honda   ?            Civic    Green  20          SDF33
16         BMW     17           320      White  ?           SDF3
17         BMW     ?            322I     Blue   ?           SFSDF23
18         Ford    19           Tempo    Black  ?           TRE344
19         Ford    20           Pinto    Red    ?           34DRTER
20         Ford    ?            Mustang  Green  ?           LKJPE

Fig 4.2: Threaded file [21]
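Following a thread is a simple pointer chase: start at the record named by the index entry and follow the next pointers until the thread ends. A sketch over the manufacturer threads of fig 4.2 (only the Ford and VW records are included here):

```python
# Sketch of following one thread of the threaded car file of fig 4.2.
# Each record stores its manufacturer and a pointer (record number) to
# the next record with the same manufacturer; None ends the thread.
records = {
    1: ("Ford", 4), 4: ("Ford", 5), 5: ("Ford", 7), 7: ("Ford", 10),
    10: ("Ford", 14), 14: ("Ford", 18), 18: ("Ford", 19),
    19: ("Ford", 20), 20: ("Ford", None),
    2: ("VW", 6), 6: ("VW", 8), 8: ("VW", 12), 12: ("VW", 13),
    13: ("VW", None),
}
index = {"Ford": 1, "VW": 2}   # header: first record of each thread

def thread(value):
    """Collect all record numbers on the thread for one attribute value."""
    out, rec = [], index.get(value)
    while rec is not None:
        out.append(rec)
        rec = records[rec][1]
    return out

print(thread("Ford"))  # [1, 4, 5, 7, 10, 14, 18, 19, 20]
```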

4.3 Multi-lists

In the multi-list organization the threads in the main file have the same structure

as in the threaded file organization but the index entries are different. Instead of an

index entry pointing simply to the beginning of a thread, it now points to every kth record on the thread (for some value of k). In effect, we have a number of sublists of length k and there is a pointer in the index to each sublist. An index entry now has

two links, one to the entry for the next value of the attribute and a second to a list of

pointers to the main file. With this additional information in the index, performance of

merge operations can be sped up. Here we give an example of a multi-list index using fig 4.2, where the value of k is taken as 3.

Manufacturer index       Color index
Ford: 1 - 7 - 18         White: 1 - 11
VW: 2 - 12               Red: 2 - 19
BMW: 3 - 17              Black: 3 - 18
Audi: 11                 Blue: 5 - 17
Honda: 15                Green: 13

Fig 4.3: Multi-lists [21]
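Given complete threads, the multi-list index entries of fig 4.3 can be derived by keeping a pointer to every kth record on each thread; a sketch with k = 3, using the thread contents of fig 4.2:

```python
# Sketch: derive multi-list index entries from complete threads by
# keeping a pointer to every kth record (k = 3, as in fig 4.3).
threads = {
    "Ford":  [1, 4, 5, 7, 10, 14, 18, 19, 20],
    "VW":    [2, 6, 8, 12, 13],
    "BMW":   [3, 9, 16, 17],
    "Audi":  [11],
    "Honda": [15],
}

def multilist_index(threads, k):
    # each sublist of (at most) k records contributes one index pointer
    return {value: recs[::k] for value, recs in threads.items()}

print(multilist_index(threads, 3)["Ford"])  # [1, 7, 18]
```

With k = 1 every record gets an index pointer (the inverted file of the next section); with k = infinity only the head of each thread does (the plain threaded file).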


4.4 Inverted files

The threaded file described above represents one extreme of the multi-list organization

with k = ∞. The other extreme is when k = 1; in this case the index points to every

record with a particular attribute value. This type of multi-list organization is known

as inverted file. Virtually all the commercially available systems are based on inverted

file designs.

4.4.1 STAIRS: An Application of Inverted Files

IBM’s STAIRS (Storage And Information Retrieval System) is a powerful

document retrieval system. Users can retrieve documents based on their content. For

example, they can retrieve documents containing an arbitrary word, or those

satisfying a complex Boolean expression of words.

The STAIRS system has this capability because it indexes every word occurrence

in the text, in contrast to most document systems, which index a few selected

keywords. STAIRS can thus be classified as a full text document retrieval system.

Here we give a brief description of the file structures that make retrievals efficient and

show how queries are answered. The file structures we describe are simplified

versions of the actual STAIRS structures.

[Fig 4.4, STAIRS file hierarchy, from top to bottom: Matrix, Dictionary, Occurrence file, Index, Documents.]

Fig 4.4: STAIRS file hierarchy [21]

4.4.2 File structure:

The STAIRS system contains five levels of data structure files, as depicted in fig 4.4. The lowest level in the structure, the documents file, contains the machine-readable documents. The only change from the conventional representation of a

document is that each paragraph is tagged with a label such as TITLE, TEXT,


ABSTRACT, and so on. In addition, the document contains end-of-sentence codes

that the system can recognize.

The next level in the structure, the index, has one entry for each document. The

entry contains information such as a pointer to the document, protection codes, and

date of entry into the system. The three file levels above the index refer to a document

by a unique document number: the number of its entry in the index.

The occurrence file contains one record for each word occurrence in the document

collection. The information recorded for each word occurrence is:

• Document number

• Paragraph code

• Sentence number

• Position within sentence

The entries in the occurrence file are ordered so that all records for a particular

word are contiguous. Within this grouping, records are stored in the order of the four

fields listed above.

The dictionary contains an entry for each different word in the document

collection, including such common words as ‘the’, ‘of’, and ‘in’. Summary

information, such as the numbers of times the word occurs and the number of

different documents in which it occurs, is stored together with the word.

Large dictionaries in book form usually have a thumb index, which enables the user to find an alphabetical section rapidly. The matrix takes this one step further. It

has 26*27 entries, each of which identifies the start of a section of the dictionary for

words beginning with a particular pair of letters. In fact, the matrix eliminates the need to

store the first two letters of words in the dictionary. This key compression saves a

certain amount of space.
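One plausible reading of the 26*27 layout is 26 slots for the first letter times 27 for the second (the 26 letters plus one slot for words with no second letter); a sketch under that assumption:

```python
def matrix_slot(word):
    """Map the first two letters of a word to one of the 26*27 matrix
    entries: 26 first letters times (26 second letters + 1 for none).
    The exact layout is an assumption, not documented STAIRS behaviour."""
    first = ord(word[0].lower()) - ord("a")            # 0..25
    second = (ord(word[1].lower()) - ord("a") + 1      # 1..26
              if len(word) > 1 else 0)                 # 0 = no 2nd letter
    return first * 27 + second

# "macabre" and "mainframe" share the "ma" section, so the dictionary
# section found via this slot can omit the leading "ma" of each word
print(matrix_slot("macabre"), matrix_slot("mainframe"))  # 325 325
```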

Fig 4.5 represents a small part of the top three levels of an example file collection. We assume in this example that in the document collection, “macabre” is alphabetically the first word starting with the letters “ma”. Also, we assume that the word “mainframe” occurs a total of 109 times, in 20 different documents.


Chapter 5

Index Implementation

5.1 Introduction

With the multi-list and inverted file indexes there is a problem of maintaining

variable-length lists for each attribute value. This is one of the major problems of

multi-list indexing. Here we consider two alternatives to simple lists: bit vectors and a

general graph structure.

5.2 Bit vectors

A bit vector, in the context of indexes, is an array of two-valued objects having as many elements as there are records in the main file. Each element indicates whether or not the corresponding main file record has a particular attribute value. Fig 5.1 shows bit vectors for the Manufacturer attribute of the car file of fig 4.2.

Figure 5.1: Bit Vector Index [22]
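A sketch of the idea, packing each attribute value's bit vector into a Python integer so that a multi-key query becomes a single bitwise AND (record numbers follow the car file of fig 4.2):

```python
# Sketch: bit vectors as Python integers, one bit per main-file record
# (bit i set means record i+1 has the value); a multi-key query is a
# bitwise AND of the vectors for the requested values.
def vector(recs, n=20):
    bits = 0
    for r in recs:
        bits |= 1 << (r - 1)
    return bits

manuf_ford = vector([1, 4, 5, 7, 10, 14, 18, 19, 20])
color_green = vector([13, 15, 20])

hits = manuf_ford & color_green          # green Fords
print([i + 1 for i in range(20) if hits >> i & 1])  # [20]
```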

5.3 Graph structure

We can save a certain amount of space in a structure by combining those list

elements that point to the same main file record. Thus each main file record is

represented in an inverted directory file by a single node that is on as many lists as

there are indexed attributes. A node will have, for each indexed attribute, a pointer to

the next node representing a main file record with the same value of that attribute. It is


convenient if each node also contains a pointer back to the owner of the list, that is,

the index entry for the particular attribute value. Fig 5.2 shows what a node might look like. Fig 5.3 shows a small part of the graph structure for our car file. Only nodes for

records 1, 4 and 5 are shown.

[Fig 5.2, general graph node: a pointer to the main file record and, for each indexed attribute, a pointer to the owner (the index entry for that attribute value) and a pointer to the next node with the same value of that attribute.]

Fig 5.2: General graph node [22]


Fig 5.3 Partial Graph Structure [22]

To find green Fords we choose, arbitrarily, one of the two attributes, for example

manufacturer. We follow the manufacturer pointers from the first node pointed to by

“Ford”. At each node visited we check the owner of the color attribute to see if it is

green.

Since the nodes are distributed all over the file, we may have to access the entire

file to traverse the path. This takes a lot of time to find a specific query. Grid File

Organization can eliminate this problem.


5.3.1 Comparison of Bit Vectors and Graphs

Bit vectors may at first appear to have higher storage costs than graph structures.

However, suppose that the main file contains M records, that P different attributes are

indexed, and that the attributes have an average of N different values. The storage

required for the bit vectors is

N*M*P bits

Assuming that we are combining nodes as described above, the storage for the

comparable part of the graph structure (M nodes) is

M*(2*P+1) pointers, which equals (2*M*P)+M pointers

Roughly speaking, if the number of different values of an attribute is less than the

number of bits required to hold two pointers, then a set of bit vectors occupies less

space than the comparable graph structure. A hybrid system might be a preferred

compromise; bit vectors would be used for attributes with few different values and

lists would be used for attributes with many different values.
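The break-even point stated above can be checked numerically; the pointer size b is an assumption (here 32 bits):

```python
# Sketch of the space comparison: N values per attribute, M records,
# P indexed attributes, and an assumed pointer size of b bits.
def bit_vector_bits(N, M, P):
    return N * M * P                 # N vectors of M bits per attribute

def graph_bits(M, P, b):
    return (2 * M * P + M) * b       # M nodes of 2P+1 pointers each

# bit vectors win (roughly) while N < 2*b; with 32-bit pointers the
# crossover is at 64 distinct values per attribute
N, M, P, b = 50, 10_000, 4, 32
print(bit_vector_bits(N, M, P) < graph_bits(M, P, b))  # True: 50 < 64
```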

The biggest advantage of bit vectors is the speed with which simple set operations

can be performed on conventional hardware. Most computers have machine-level

instructions for performing logical operations on bit patterns. In contrast, list merging

is comparatively slow.

5.3.2 Index maintenance

Secondary indexes, like primary indexes, must reflect the contents of the main file

at all times. Although there is only one primary index, there may be several secondary

indexes. Thus maintenance of correct index entries may become a big overhead. Here

we consider how bit lists and graph structures compare in the amount of maintenance

required.

5.3.2.1 Updating

An inverted file must be updated in three cases: when we insert a new record into

the main file, when we delete a record, or when we change the value of a secondary

key of an existing record. In the case of an insertion or a deletion, all indexes for the

file have to be modified. In the case of a single field modification, one index at most

has to be updated.


To delete a record from the main file we could simply mark it deleted. The

alternative is to rewrite the file and omit the deleted record. The marking operation is

a logical rather than a physical deletion. Logical deletion is simpler. A disadvantage

of physical deletion is that records may change position in the file and thus require

changes in pointers to them. An advantage of physical deletion is the reduction in list

length; this makes subsequent traversals shorter. For some insertions we may be able

to reuse logically deleted records in the main file, in which case there is no problem in

updating a bit vector.

If we change the value of an attribute, the changes in the bit list are very small. We

simply clear a bit in one list and set it in another. When we modify an attribute in a

graph structure, on the other hand, we must release the node from one inverted

attribute list and assign it to another.

Both the insertion of a new record and a change in an attribute value might

introduce an attribute value not previously represented in the file. In the case of a bit

vector, we must create a new vector with exactly one bit set. In the case of the graph

structure, there will be a new list with exactly one node.

5.3.3 Reliability

Secondary storage tends to be more vulnerable than primary storage to data

corruption. Some of the data structures we have described above can be rendered

useless if a critical pointer is damaged. We therefore consider methods of making

the structures more robust, that is, less likely to be damaged irreparably. A general

technique might be to provide two or more paths to any record. A file maintenance

program can check the integrity of access paths. If a damaged path is detected, we can possibly use another route and repair it. This recovery principle suggests that a doubly-linked circular list is preferable to a singly-linked list.

5.4 Grid Files

Nievergelt, Hinterberger, and Sevcik describe a secondary key accessing

technique using grids that performs well on both stable and volatile files. We describe it briefly here.


5.4.1 Design Aims

The design aims of the grid file organization are fourfold:

1. Point queries. The processing of a completely specified query should require no

more then two disc accesses. A completely specified query (or point query) is one

in which a single value is specified for each key attribute. An example from the

car file is: find Manufacturer = Ford, Color = Black, License = 1GWN821, Model

= Bug

2. Range queries: Processing of range queries and partially specified queries should

be efficient. Two examples of such queries are: find Manufacturer = VW, Model

= Bug, D < license < X: find Manufacturer = Ford, Color = Green

3. Dynamic adaptation: The file structure should adapt smoothly to insertions and

deletions.

4. Symmetry: All key fields, whether primary or secondary, should be treated

equally.

5.4.2 Ideal Solution

Assume that records have k keys. Consider the k-dimensional space defined by

the k sets of attribute values. Modifying the bit vector idea given above, we can

conceive of a k-dimensional bit matrix in which dimension i has one position for each different value of attribute i. If a particular element

of the k-dimensional matrix is set to 1, this indicates that a record exists with the

corresponding set of k attribute values. If the bit is 0, then no such record exists. Note

that the matrix of bits is therefore a complete representation of the set of records.

This organization satisfies the four design aims above, although we have assumed

nothing about how it might be stored. Processing point queries involves examination

of a single element. Processing range queries involves processing all elements in a particular j-dimensional submatrix, j <= k. Insertions and deletions are carried out by

setting and clearing appropriate single elements of the matrix. All key fields are

treated equally.

The k-dimensional matrix, however, is an ideal rather than a practical file

organization. In practice the matrix would be far too large to store. If a file has records

with 4 keys and each attribute has 100 different values, the matrix will have


100,000,000 elements. The grid file organization that we discuss next is in some ways

an approximation of the matrix ideal.

5.4.3 Practical Grid File Implementation

In the grid file organization, partitioning the sets of attribute values reduces the size of the matrix. For example, we could partition the set of colors into four subsets. If we

regard attribute values as character strings, we might have

Color < F

F <= Color < K

K <= Color < Q

Q <= Color

Thus the color lemon, for example, falls into the third partition (K <= lemon < Q).

The partition points are held in linear scales. The set of k linear scales, one for each

attribute, defines a grid on the k-dimensional attribute space. The space is thus

divided into grid blocks. The number of grid blocks is much smaller than the number

of elements in the matrix. What we have lost, however, is the one-to-one

correspondence between elements of the grid/matrix and possible records.

In the grid file organization, records are stored in buckets. Buckets have a fixed

size, but there can be arbitrarily many of them in a file. The dynamic assignment of

buckets to grid blocks is maintained in the grid directory. The grid directory consists of the linear scales and a grid array, where each element contains a pointer to a bucket. Grid array elements which form a k-dimensional rectangle may point to the same bucket in the file, where records with the corresponding attribute values are stored.

We can represent the partitioning of the Color attribute described above by the

following linear scale

Color (F, K, Q)

Suppose that the other three attributes are partitioned similarly and that the linear

scales are

Manufacturer (G, R)

Model (C, H, N, T)

License (B, M, S)


Consider the record with

Manufacturer = Ford, Model = Pinto, Color = Blue, License = BBC1500

The partitions into which the attribute values fall are 1, 4, 1 and 2 respectively.

Therefore, the bucket pointed to by

Grid-array [1, 4, 1, 2]

is the only place where the record would be stored.
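The mapping from attribute values to partition numbers is a binary search over each linear scale; a sketch reproducing the worked example (partitions numbered from 1, as in the text):

```python
import bisect

# Sketch: each linear scale holds its partition points; a value's
# partition number is found by binary search over the scale.
scales = {
    "Manufacturer": ["G", "R"],
    "Model": ["C", "H", "N", "T"],
    "Color": ["F", "K", "Q"],
    "License": ["B", "M", "S"],
}

def partition(attr, value):
    return bisect.bisect_right(scales[attr], value) + 1

record = {"Manufacturer": "Ford", "Model": "Pinto",
          "Color": "Blue", "License": "BBC1500"}
print([partition(a, v) for a, v in record.items()])  # [1, 4, 1, 2]
```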

The grid array of pointers is normally so large that it must be held in secondary

memory. On the other hand, the linear scales would normally fit into main memory.

Next we consider how well the grid file organization achieves the four design aims of

point queries, range queries, dynamic adaptation, and symmetry.

5.4.4 Performance of Grid Files

Point queries: In response to a point query, each of the k specified attribute values

is first transformed into a grid index using the appropriate linear scale. The element of

the grid array selected by the set of k indexes can now be fetched from disc.

Nievergelt, Hinterberger, and Sevcik make a number of suggestions for implementing

the grid array. The calculations involved in mapping a rectangular k-dimensional

array onto linear memory are not complex. The address of a particular element is easy

to compute, and the element can be fetched in one access. A second disc access

fetches the bucket pointed to from the array element. Thus the first design aim is

achieved.
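The address calculation is ordinary row-major indexing; a sketch, where the partition counts per dimension are taken from the example scales and 1-based indexes follow the text:

```python
# Sketch: row-major mapping of k-dimensional grid-array indexes
# (1-based, as in the text) to a linear offset, given the number of
# partitions per dimension.
def linear_address(indexes, dims):
    addr = 0
    for i, d in zip(indexes, dims):
        addr = addr * d + (i - 1)
    return addr

# (G, R) -> 3 partitions; (C, H, N, T) -> 5; (F, K, Q) -> 4; (B, M, S) -> 4
dims = (3, 5, 4, 4)
print(linear_address((1, 4, 1, 2), dims))  # 49: fetched in one access
```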

5.4.4.1 Range queries

The second design aim is to answer range queries efficiently. To satisfy this aim it

must be possible to move efficiently along an arbitrary axis of the grid array. That is,

given the address of a particular element, it must be easy to compute the address of

the next or previous element in any of the k-dimensions. For example, to satisfy the

range query

Manufacturer = VW, Model = Bug, D < license < X

We need to process records in buckets pointed to by elements in the rectangle

Grid array [3, 1, 1…4, 2…4]


A linked list implementation of the matrix would satisfy this requirement. An

array implementation enabling direct access of an element given its indexes would

also be suitable.
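Enumerating the elements of such a rectangle is a Cartesian product over the per-dimension index ranges; a sketch for the range query above:

```python
from itertools import product

# Sketch: enumerate the grid-array elements in the rectangle
# [3, 1, 1..4, 2..4] for the range query above; each element names one
# bucket whose records must be examined.
rectangle = [(3, 3), (1, 1), (1, 4), (2, 4)]   # (lo, hi) per dimension
elements = list(product(*[range(lo, hi + 1) for lo, hi in rectangle]))
print(len(elements))   # 1 * 1 * 4 * 3 = 12 elements to visit
```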

5.4.4.2 Dynamic adaptation

The third design aim is that the organization should adapt smoothly to insertions

and deletions. Let us consider these in turn.

5.4.4.3 Insertions

If a record must be inserted into a bucket that is already full, then a new bucket is

allocated to the file and records are distributed between the two buckets. There are two cases to consider: when more than one pointer points to the full bucket, and when only one pointer points to it.

If the full bucket is pointed to from more than one element of the grid array, we do

not need to make changes to the partitioning. Records are distributed between the two

buckets according to current partitioning, and some of the pointers change from the

old to new bucket.

Suppose, for example, that the grid element representing records with

Color < F

G <= Manufacturer < R

H <= Model < N

License < B

And the grid element representing records with

Color < F

G <= Manufacturer < R

N <= Model < T

License < B

both point to the same bucket and that this bucket overflows. A new bucket is

allocated to the file, and one of the two elements of the grid array is changed to point

to it. Records in the overflowing bucket are distributed between the two buckets

according to whether the value of the model attribute is less than N.

If only one grid element points to a bucket, the grid must be refined. One of the

sub ranges represented by the bucket contents must be divided. A new partition point

is added to the appropriate linear scale. One bucket is assigned to each half of the


original grid element, and the records are distributed according to the new

partitioning.

Suppose that after further insertions there is overflow in the bucket pointed to by

the element representing records with

Color < F

G <= Manufacturer < R

N <= Model < T

License < B

Assume further that only one element points to this bucket. We therefore need to split one of the sub ranges. Choosing, arbitrarily, the Manufacturer dimension, we could insert a partition point at M. The corresponding linear scale is now

Manufacturer (G, M, R)

The number of elements in the grid array increases by 33% because the

manufacturer dimension now has four rather than three sub ranges. Most of the new

elements will point to an existing bucket. For example, the element representing

records with

F <= Color < K

G<=Manufacturer<M

Model < C

M <= License < S

will point to the same bucket as the element representing records with

F <= color < K

M <= Manufacturer < R

Model < C

M <= License < S

We need to allocate a new bucket to the file and distribute records from the overflowing one. The two buckets will now be pointed to by the array elements

representing

Color < F

G <= Manufacturer < M

N <= Model < T


License < B

And

Color < F

M <= Manufacturer < R

N <= Model < T

License < B

5.4.4.4 Deletions

To maintain reasonable storage utilization, two candidate buckets might be

merged if their combined number of records falls below some threshold. The records

would be moved into one of the buckets and pointers to the other reassigned to it. The

empty bucket would be de-allocated from the file. Note that not every pair of buckets

can be merged. Only elements that form a k-dimensional rectangle can point to a

particular bucket.

5.4.4.5 Symmetry

The symmetry of the matrix ideal is preserved in the grid organization. There is no

performance difference between primary and secondary indexing because all indexed

attributes are treated in the same way.


Chapter 6

PROPOSED DISTRIBUTED DATABASE INDEX

6.1 Introduction

In previous chapters we have discussed index techniques and their uses in the

databases. Indices are associated with the main data file. They facilitate the Database

Management Systems to access records in the data file faster and in a random fashion.

There are different types of index structures and algorithms. Each has some

advantages and disadvantages over the others. One structure may be suitable in one

context but may not be suitable in another context. For example B, B* and B+ trees

are often used for implementing dynamic index in primary key implementation. But

for secondary keys such as multi-key indexing the methods are Bit Vectors, Graph

Structures, and Grid File Organizations. We have discussed those methods

and their advantages and disadvantages in previous chapters. Now we will show how

we can improve searching in distributed database management systems.

6.2 Distributed Database

In recent years, distributed databases have become an important area of

information processing, and it is easy to foresee that their importance will rapidly

grow. There are both organizational and technological reasons for this trend.

Distributed databases eliminate many of the shortcomings of centralized databases

and fit more naturally in the decentralized structures of many organizations.

A distributed database is a collection of data, which belong logically to the same

system but are spread over the sites of a computer network.

For example, consider a bank that has three branches at different locations. At

each branch, a computer controls the teller terminals of the branch and the account

database of the branch. Each computer with its local account database at one branch

constitutes one site of the distributed database; a communication network connects

computers. During normal operations the applications, which are requested from the


terminals of a branch, need only to access the database of that branch. These

applications are called local applications. An example of a local

application is a debit or a credit application performed on an account stored at the

same branch at which the application is requested. Some applications are called global

applications or distributed applications. A typical global application is a transfer of

funds from an account of one branch to an account of another branch. This application

requires updating the databases at two different branches.

Therefore, a distributed database is a collection of data distributed over different

computers of a computer network. Each site of a network has autonomous processing

capability and can perform local applications. Each site also participates in the

execution of at least one global application, which requires accessing data at several

sites using a communication subsystem. Figure given below shows a typical

Distributed Database.


6.3 Finding records in a Distributed Database

At present, distributed databases are inefficient at locating records because they do not use any global index structure. For example, if we have a book data file in a distributed database, the single book data file is fragmented into several data files and these fragments are allocated to different sites of the distributed database. The fragment information is stored in the fragmentation schema, and the information about the allocation of fragments to sites is stored in the allocation schema. When a query searches for a book by a particular author, it is split into subqueries according to the fragmentation and allocation schemas: the fragmentation schema gives the number and names of the fragments of the data file, and the allocation schema gives the sites at which those fragments reside. Using this information, the Distributed Database Management System submits the subqueries to the sites and collects the results from them. Sites that do not contain the queried information return an empty set, so submitting subqueries to those sites is a waste of time.

If we knew beforehand which sites contain the required information and submitted subqueries only to those sites, searching would be faster and more efficient. We can achieve this goal by implementing a global index.

6.4 PROPOSED DISTRIBUTED INDEX

6.4.1 INTRODUCTION

The reason for providing indexes is to obtain fast and efficient access to data. Indexing is a data-structure-based technique for accessing records in a file. Multi-key indexes are often graph structured or organized as grid files. Although a local database has an index file for searching records efficiently, a distributed database has no such facility. Our aim is to organize the whole structure so that global queries can be executed more efficiently and faster.

6.4.2 Local Index Architecture

In local Database Management Systems the primary key index is implemented with B/B*/B+ trees. Multi-key (secondary key) indexes are implemented with Bit Vectors, Graph Structures, or Grid File Organization; of these, Graph Structures and Grid Files are the most widely used.

6.4.3 Comparison among Bit Vector, Graph Structure and Grid File Organization

6.4.3.1 Advantages and disadvantages of Bit vector

If we implement an index on one attribute, we need a two-dimensional bit vector as shown in chapter 5. This bit vector is efficient because of the speed with which simple set operations can be performed on it with conventional hardware. But if we want to implement a multi-list index with bit vectors, a three-dimensional bit vector is necessary, where the third dimension represents the fields on which the index is built. For example, suppose we want to make a multi-list index on the attributes Manufacturer, Model and Color of a car. Then the third dimension, or z-axis, of the bit vector represents the bit vectors of Manufacturer, Model and Color. Now, to answer the query Manufacturer = Ford, Model = Mustang, Color = Green, we have to access the corresponding bits of the bit vector and check that all three bits are 1. But it is not easy to access those three bits at the same time, and we face the same problem when we delete or update a record. Moreover, different attributes have different numbers of attribute values, so the bit vector is not uniform in each dimension, which also makes maintaining the bit vectors difficult. These problems are addressed by graph structures.

6.4.3.2 Advantages and disadvantages of Graph Structure

To eliminate the problems of the bit vector, a graph structure represents the record information concisely. As described in previous chapters, to answer a query we have to traverse the nodes along a single path of any one attribute. Since the nodes of that attribute may be scattered all over the index file, this may require checking nodes throughout the entire file, causing the index file to be accessed several times and making the process inefficient. These problems are solved in grid file organization, where a particular record can be found in only two disk accesses.


6.4.4 Distributed Index Architecture

6.4.4.1 Introduction

To provide efficient access to the data we propose a distributed index. The distributed index is also a data-structure-based index, comprising two types of index structures: the Global Index (GI) and the Local Index (LI). Figure 6.2 shows the proposed distributed structure.

Fig 6.2: Architecture of Distributed Index [8]

GI is created and maintained by the distributed database component (DDB) of the distributed database management system (DDBMS). LI is created and maintained by the local database management component (DB) of the DDBMS. Our study shows that Bit Vector, Graph Structure and Grid File Organization each have advantages in some ways and disadvantages in others. For this reason we chose an approach that combines the techniques of Bit Vector, Graph Structure and Grid File Organization for the GI. The LI has been implemented as a grid file. For every site there is a Local Index (LI), which is created, updated and used independently; like the other local database management components, the LI enjoys autonomy at each site. There is a single Global Index (GI) for a distributed index. The GI is created, updated and used based on the local indexes, and all the local indexes are perfectly mapped into the global index.

When a record is searched in a distributed database, the GI is used first to determine which LIs need to be used to find the data. After selecting the right LI, it is used to access records at the corresponding site. In this way, the distributed index ensures efficient access to the data in a distributed database. If a record is inserted, deleted or updated at a site, and this introduces a new combination of indexed field values or removes one completely, the record information is passed to all other sites. The other sites update their own global index according to the record information.

The proposed Global Index (GI) is a combination of Bit Vector, Graph Structure and Grid File. In a grid file the actual records of the database are stored in buckets, as described in previous chapters, and an index file is created to access those buckets. Introducing a Grid Dictionary accelerates this access: the Grid Dictionary consists of linear scales and a grid array, in which each element contains a pointer to a bucket. In a Graph Structure, on the other hand, all the records are stored concisely in the index file as graph nodes, where each node stores only two pointers per indexed attribute, the forward pointer and the back pointer, plus an original pointer to the record in the main file.

Our main goal is to optimize query submission so that no unnecessary subqueries are sent to database sites. Hence we want to know every possible record held at other sites, and this information can be kept in the form of graph nodes. For example, if there are M records across all sites but only N (N < M) distinct records, then we only have to keep information about the N distinct records. Let us clarify this with an example. We have a database of the Manufacturer, Model, Color and License of cars at two different sites, where a multi-key index is created on the Manufacturer, Model and Color fields. The site records are given below.

Records in site1:

Rec. no Manuf. Model Color License

1 Ford Pinto Green 23023234

2 VW Civic White 23424244

3 BMW Bug Red 43543535

4 Ford Mustang Black 65435435

5 BMW Mustang White 45645654

6 Honda Tempo Green 23432543

7 VW Civic White 54654645

8 BMW Bug Red 34543543

9 Ford Pinto Green 54654664

10 Honda Tempo Green 54654654


Here, there are 10 records but only 6 distinct records with respect to the three indexed fields, since records 1 & 9, 2 & 7, 3 & 8, and 6 & 10 have the same values in those fields.

Records at site2:

Record no Manuf. Model Color License

1 Ford Mustang White 23432424

2 VW Civic White 34655466

3 Ford Pinto Green 45654646

4 BMW Pinto Green 65765353

5 Ford Mustang White 32984983

6 Ford Pinto Green 56765756

7 Honda Tempo Red 54366547

8 VW Civic White 54765466

9 BMW Pinto Green 45645664

10 Honda Tempo Red 45654654

Similarly, here there are only 5 distinct records. Across both sites combined there are only 9 distinct records, and we have to keep information about these 9 records in our global index (GI) file.
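This bookkeeping can be sketched in a few lines of Python (an illustrative sketch, not the thesis implementation; the tuples transcribe the three indexed fields of the two site listings, and the GI is modeled as a mapping from each distinct record to its site bit vector):

```python
# Indexed fields (Manufacturer, Model, Color) of the records at each site.
site1 = [
    ("Ford", "Pinto", "Green"), ("VW", "Civic", "White"),
    ("BMW", "Bug", "Red"), ("Ford", "Mustang", "Black"),
    ("BMW", "Mustang", "White"), ("Honda", "Tempo", "Green"),
    ("VW", "Civic", "White"), ("BMW", "Bug", "Red"),
    ("Ford", "Pinto", "Green"), ("Honda", "Tempo", "Green"),
]
site2 = [
    ("Ford", "Mustang", "White"), ("VW", "Civic", "White"),
    ("Ford", "Pinto", "Green"), ("BMW", "Pinto", "Green"),
    ("Ford", "Mustang", "White"), ("Ford", "Pinto", "Green"),
    ("Honda", "Tempo", "Red"), ("VW", "Civic", "White"),
    ("BMW", "Pinto", "Green"), ("Honda", "Tempo", "Red"),
]

# Global index: one node per distinct record, with a site bit vector
# whose i-th bit says whether the record exists at site i+1.
gi = {}
for bit, records in enumerate([site1, site2]):
    for rec in records:
        bits = gi.setdefault(rec, [0, 0])
        bits[bit] = 1

print(len(set(site1)), len(set(site2)), len(gi))  # 6 5 9
print(gi[("Honda", "Tempo", "Green")])  # [1, 0] -> query only site 1
print(gi[("Ford", "Pinto", "Green")])   # [1, 1] -> query both sites
```

Only distinct combinations get a node, so the 20 stored records reduce to 9 global index entries.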

Now our aim is that, if we want to find a record with Manuf = Honda, Model = Tempo and Color = Green, we submit the query only to site 1, since site 2 has no such record; but if we want the record with Manuf = Ford, Model = Pinto and Color = Green, we submit the query to both sites.

To achieve this goal we store the 9 distinct records in the global index file as graph nodes, and the graph nodes are stored in the fashion of a Grid File Organization. Here each graph node consists of three back pointers and a bit vector, where the bit vector gives the site addresses of that specific record and replaces the original pointer to the record in the data file. A graph node looks like the following:

Bit vector | Back Ptr1 | Back Ptr2 | Back Ptr3


The corresponding graph nodes of the two records Manufacturer = Honda, Model = Tempo, Color = Green and Manufacturer = Ford, Model = Pinto, Color = Green are shown below: the first record exists only in site 1 (site bit vector 10), while the second exists in both sites (site bit vector 11).

In the global index file these nodes are stored just as the original records of a database are stored in a grid file. Thus, by partitioning the entire range of values of the individual fields into linear scales, we reach the bucket containing these nodes through the grid array and then find the desired records by searching linearly through the nodes. The whole process works like a Grid File Organization, as if the nodes were the original records. Let us make this concrete using the example of the two database sites above.

The different values of the three attributes are given in the following table

Manufacturer Model Color

BMW Bug Black

Ford Civic Green

Honda Mustang Red

VW Pinto White

Tempo

Now we can partition the three sets of attribute values into subsets. If we regard attribute values as character strings, we might have

Manufacturer < G, G <= Manufacturer

Model < K, K <= Model < R, R <= Model

Color < H, H <= Color

Thus the Color Black falls into the first partition (Black < H). Similarly the Model Pinto falls into the second partition (K <= Pinto < R) and the Manufacturer Ford falls into

the first partition (Ford < G). The partition points are held in linear scales. The set of k linear scales, one for each attribute, defines a grid on the k-dimensional attribute

space. The space is thus divided into grid blocks.

Now consider the fourth record of the site1 database, with

Manufacturer = Ford, Model = Mustang and Color = Black

The partitions into which these attribute values fall are 1, 2 and 1 respectively. Therefore the bucket pointed to by Grid-array[1, 2, 1] is the only place where the record can be stored.
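The partition lookup just described is a binary search over the partition points of each linear scale; it can be sketched with Python's standard `bisect` module (an illustrative sketch using the scale values from the text; the function name is ours):

```python
from bisect import bisect_right

# Linear scales: partition points for each indexed attribute
# (Manufacturer split at G; Model at K and R; Color at H).
scales = {
    "Manufacturer": ["G"],
    "Model": ["K", "R"],
    "Color": ["H"],
}

def partition(attr, value):
    """1-based partition number of `value` on the scale for `attr`.
    bisect_right matches the 'K <= Model < R' boundary convention."""
    return bisect_right(scales[attr], value) + 1

print(partition("Manufacturer", "Ford"),
      partition("Model", "Mustang"),
      partition("Color", "Black"))  # 1 2 1
```

A record with Manufacturer = Ford, Model = Mustang, Color = Black thus maps to grid cell [1, 2, 1], as in the text.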

The grid array of pointers is normally so large that it must be held in secondary memory; the linear scales, on the other hand, would normally fit into main memory. The grid array contains the pointers to the buckets where the records are stored, and it looks like figure 6.3.

Fig 6.3: Grid-Array pointing to buckets [15]

[Fig 6.3 contents: grid array positions 0–11 hold the entries [1, 1, 1] through [2, 3, 2], each pointing to one of Buckets 1–12.]


So the record with Manufacturer = Ford, Model = Mustang and Color = Black falls into bucket3, along with the other records that belong to the group of grid-array [1, 2, 1]. We can find the position of [1, 2, 1] in the grid array using the following formula: in general, for grid array element [i, j, k], if the linear scales have P, Q and R subsets respectively, the grid array index is (i-1)*Q*R + (j-1)*R + (k-1). In our example i = 1, j = 2, k = 1, P = 2, Q = 3 and R = 2, so the index value is (1-1)*3*2 + (2-1)*2 + (1-1) = 2.
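The linearization formula can be written as a small helper (an illustrative sketch; the function name is ours):

```python
def grid_index(i, j, k, Q, R):
    """Row-major linearization of 1-based grid cell [i, j, k],
    where Q and R are the partition counts of the 2nd and 3rd scales."""
    return (i - 1) * Q * R + (j - 1) * R + (k - 1)

print(grid_index(1, 2, 1, Q=3, R=2))  # 2, the worked example above
print(grid_index(2, 3, 2, Q=3, R=2))  # 11, the last of the 12 cells
```

Note that P, the partition count of the first scale, only bounds i; it does not appear in the index itself.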

After finding the bucket pointer in the grid array, we access the bucket from the file and search linearly through the entire bucket; the record can exist only in that bucket. While searching, we visit each node in the bucket and follow its back pointers to check the actual attribute values. The organization of Bucket 3 is given in figure 6.4.


Fig 6.4: Proposed Global Index File Organization

From the figure we see that the record with Manufacturer = Ford, Model = Mustang and Color = Black is found at the second position of the bucket. At that node the bit vector is '10', meaning the record exists in site1 but not in site2. Hence we submit the query for this record to site1 but not to site2.

[Fig 6.4 contents: Bucket 3 holds graph nodes with site bit vectors 11, 10, 01, 00, 00, 00; their back pointers index the attribute value lists Manufacturer (Ford, VW, BMW), Model (Pinto, Bug, Mustang, Tempo, Civic) and Color (White, Black, Green, Red).]


6.4.4.2 Distributed Global Index Searching (GI Searching)

Searching point query

When searching for a record with a given key value, we search the global index first to find the right local index. We start by computing the grid array index from the linear scales on which the field values are partitioned. From the grid array index we obtain the bucket pointer and access the corresponding bucket. Then we search for the record throughout the bucket with the help of the back pointers of the nodes. If the record is found, we take the site bit vector from the node and send the query to the sites in which the record actually exists.

Search Point Query in GI ( searchKey )
{
    get the grid array index using the linear scales
    find the bucket pointer from the grid array using the index
    access the bucket from the index file
    search for the desired record in the bucket
    if the record is found, get the bit vector of that record
    submit the query, according to the bit vector, to the sites where the record actually exists
}

When the subqueries are submitted to the local sites, the local sites search for the records using their own local indexes. The actual records are stored in a grid file, so each local index consists of a grid dictionary. Using this LI, the local database management system searches for the record and returns the result to the site that issued the original query. The search technique is the same as the GI search for a particular graph node in the global index file. The LI search technique is given below:

Search Point Query in LI ( searchKey )
{
    get the grid array index using the linear scales
    find the bucket pointer from the grid array using the index
    access the bucket from the main file
    search for the desired record in the bucket
    if the record is found, return it to the site that issued the query
}

Searching range query

Two examples of such queries are: find Manufacturer = VW, Model = Bug, D < License < X; and find Manufacturer = Ford, Color = Green.

Search Range Query in GI ( searchKey )
{
    get the range of grid array indexes using the linear scales
    find the bucket pointers from the grid array using the indexes
    access the buckets from the index file
    search for the desired records in the buckets
    if records are found, get their bit vectors
    OR the bit vectors together
    if all bits are one
        submit the query to all sites
    else
        submit the query only to the sites whose bits are one
}

Algorithm for searching in Global Index of a Distributed Database
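The final routing step of both algorithms — ORing the site bit vectors of the matching nodes and submitting subqueries only where a bit is set — can be sketched as follows (a minimal sketch with hypothetical names; bit vectors are represented as lists of 0/1):

```python
def sites_to_query(matching_bit_vectors, num_sites):
    """OR the site bit vectors of all matching GI nodes and return
    the (1-based) site numbers the subqueries should be sent to."""
    combined = [0] * num_sites
    for bits in matching_bit_vectors:
        combined = [a | b for a, b in zip(combined, bits)]
    return [s + 1 for s, bit in enumerate(combined) if bit]

# A range query matching two nodes in a 4-site system: one record
# stored only at site 1, one stored at sites 1 and 3.
print(sites_to_query([[1, 0, 0, 0], [1, 0, 1, 0]], 4))  # [1, 3]
```

Sites 2 and 4 never receive a subquery, which is exactly the saving the global index is designed to provide.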

6.4.4.3 Global Index Insertion (GI Insertion)

Suppose a record with Manufacturer = Ford, Model = Mustang, and Color = Black is inserted into the local database at site 2. Each time a record is inserted, it is checked whether it is a new combination of values of the indexed attributes. This record is new for site2, so site2 sends the values of this record to the other sites, such as site1. Site1 now locates the bucket where this record information should be stored. As shown above, this record falls into Grid Array [1, 2, 1], so it belongs in bucket 3. Bucket3 is then searched, and the record is found at position two in the bucket. All we do then is turn on the bit of that node for site2 and save the node back in the bucket; the site bit vector now reads 11.


Let us now try another example: we insert a record with Manufacturer = Audi, Model = Pinto, and Color = White. After inserting the record into site1, it is found to be a new combination of indexed attribute values, so its information is passed to the other sites to maintain the GI. Calculated as above, this record falls into Grid Array [1, 2, 2] (White lies in the second Color partition, H <= White), i.e. bucket4. That bucket is searched to see whether the record already exists; in this case it does not. So a new node is created, with its attribute pointers pointing to the original values, and its site bit vector is set to 10, as the record exists only in site1.

While creating a new node in a bucket, it may be found that the bucket is already full. In that case the bucket must be split into two buckets and its records distributed between them. The splitting process depends on two cases:

1. when only one grid array pointer points to the bucket
2. when more than one grid array pointer points to the bucket

Let us clarify the process with two examples. Before this, we assume that at present each element of the Grid Array ([1, 1, 1] to [2, 3, 2]) points to its own bucket.

Case 1

Suppose that while inserting a record with Manufacturer = BMW, Model = Pinto, and Color = Black, we find that the corresponding bucket is full. According to the linear scales this record falls into the bucket pointed to by Grid Array [1, 2, 1], i.e. the 3rd bucket, and only one pointer points to it. In that case one of the subranges represented by the bucket contents must be divided. Choosing, arbitrarily, the Manufacturer dimension, we could insert a partition point at C. The corresponding linear scale is now Manufacturer (C, G), i.e.

Manufacturer < C, C <= Manufacturer < G, G <= Manufacturer

Previously the number of Grid Array elements was 2*3*2 = 12, but now it is 3*3*2 = 18; the number of elements increases by 50%. Most of the new elements will point to an existing bucket. The new Grid Array structure is given below:


Fig 6.5: Insertion of record in (case 1)

Previously bucket1 was pointed to by only one pointer, Grid Array [1, 1, 1], but now it is pointed to by two pointers, Grid Array [1, 1, 1] and Grid Array [2, 1, 1]. The records that previously fell into the group

Manufacturer < G, Model < K, Color < H

now fall into two groups. The groups are


Manufacturer < C, Model < K, Color < H

and

C <= Manufacturer < G, Model < K, Color < H

But since the records of both groups were in bucket1, both pointers now point to bucket1. Similarly, both Grid Array [1, 1, 2] and Grid Array [2, 1, 2] point to bucket2, and so on.

We actually split bucket3, which was previously pointed to by Grid Array [1, 2, 1]. We allocate a new bucket, bucket13 (as there were 12 buckets previously), and distribute the records of bucket3 between bucket3 and bucket13 according to the following two groups:

1. Manufacturer < C, K <= Model < R, Color < H
2. C <= Manufacturer < G, K <= Model < R, Color < H

Suppose we keep group1 in bucket3 and group2 in bucket13. Then Grid Array [1, 2, 1] still points to bucket3 and Grid Array [2, 2, 1] now points to bucket13.

Case 2

Let us now insert a record that falls into bucket2, which is pointed to by two pointers, Grid Array [1, 1, 2] and Grid Array [2, 1, 2]. If we find that the bucket is full, we do not split the linear scales. Rather, we allocate a new bucket, bucket14, and distribute the records of bucket2 between the two buckets according to the following groups:

1. Manufacturer < C, Model < K, H <= Color
2. C <= Manufacturer < G, Model < K, H <= Color

If we keep group1 in bucket2 and group2 in bucket14, then Grid Array [1, 1, 2] still points to bucket2 and Grid Array [2, 1, 2] now points to the newly created bucket, bucket14. The related figure is drawn below for further clarification:


Fig 6.6: Insertion of record in (case 2)

[Fig 6.6 contents: the 18-element grid array, with Grid Array [2, 1, 2] now pointing to the newly created Bucket 14.]


Insert LI ()
{
    insert the record in the local database
    if the record is new
        send it to all other sites
        Insert GI ( inRec )
}

Insert GI ( inRec )
{
    Search GI ( inRec )
    if inRec exists in the global index
        simply turn on the bit of the corresponding site
    else
        insert inRec into the bucket
        if the bucket is full
            find whether the bucket is pointed to by one pointer or more than one
            if the bucket is pointed to by only one pointer
                randomly select one of the linear scales
                divide it at its middle
                create a bucket and distribute the records according to the new linear scales
                create a node in the corresponding bucket
                turn on the bit of the corresponding site
                save the node in the bucket
            else if the bucket is pointed to by more than one pointer
                create a new bucket
                make the necessary changes in the grid array pointers
                distribute the records according to the grid array pointers and their subdivisions
                create a graph node in the appropriate bucket
                turn on the bit of the corresponding site and save it
}

Algorithm for inserting records in Global Index of a Distributed Database
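The bit-maintenance core of Insert GI can be sketched as follows (a minimal sketch with hypothetical names, representing the GI as a mapping from a record's indexed values to its site bit vector; bucket location and splitting are omitted):

```python
def insert_gi(gi, rec, site, num_sites):
    """Record in the GI that `rec` (a tuple of indexed attribute
    values) now exists at `site` (1-based).  Creates the node on
    first sight, otherwise just turns the site bit on."""
    bits = gi.setdefault(rec, [0] * num_sites)
    bits[site - 1] = 1

# The example from the text: (Ford, Mustang, Black) already exists at
# site 1; inserting it at site 2 turns its bit vector from 10 into 11.
gi = {("Ford", "Mustang", "Black"): [1, 0]}
insert_gi(gi, ("Ford", "Mustang", "Black"), 2, 2)
print(gi[("Ford", "Mustang", "Black")])  # [1, 1]

# A brand-new combination gets a fresh node with only site 1's bit on.
insert_gi(gi, ("Audi", "Pinto", "White"), 1, 2)
print(gi[("Audi", "Pinto", "White")])    # [1, 0]
```

Both branches of the algorithm reduce to the same operation on the bit vector; the structural work lies entirely in the bucket handling, which this sketch leaves out.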


6.4.4.4. Global Index Deletion (GI Deletion)

When a record is deleted from a local database, it is checked whether any more such records remain in that database. If none remain, the information is passed to the other sites so they can delete the record information from their GIs. When a site receives record information to delete from its global index (GI), it searches for the corresponding node in the buckets; if it finds the node, it turns off the corresponding site bit in the node's site bit vector. It then checks whether all bits of the site bit vector are zero. If so, there is no such record at any site, and the node is deleted from the bucket.

To maintain reasonable storage utilization, two candidate buckets may be merged if their combined number of records falls below some threshold. The records are moved into one of the buckets and the pointers to the other are reassigned to it; the empty bucket is then de-allocated from the file. Note that not every pair of buckets can be merged: only grid array elements that form a k-dimensional rectangle can point to a particular bucket.

Delete LI ()
{
    delete the record from the local database
    if there is no such record any more
        call Delete GI ( outRec ) to delete the record information from all other sites
}

Delete GI ( outRec )
{
    Search GI ( outRec )
    if outRec is found in the bucket
        turn off the bit of the corresponding site
        if all site bits are zero
            delete the record from the bucket
            if the number of records in the bucket falls below some threshold
                move the records into one bucket
                re-assign the other pointers to it
}

Algorithm for deleting records in Global Index of a Distributed Database
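The bit-clearing core of Delete GI can be sketched in the same style as the insertion sketch (hypothetical names; the GI is again a mapping from indexed values to a site bit vector, and bucket merging is omitted):

```python
def delete_gi(gi, rec, site):
    """Turn off `site`'s bit for `rec`; drop the node entirely once
    no site holds the record any more."""
    bits = gi.get(rec)
    if bits is None:
        return
    bits[site - 1] = 0
    if not any(bits):
        del gi[rec]

gi = {("Ford", "Pinto", "Green"): [1, 1]}     # exists at both sites
delete_gi(gi, ("Ford", "Pinto", "Green"), 2)  # last copy at site 2 gone
print(gi[("Ford", "Pinto", "Green")])         # [1, 0]
delete_gi(gi, ("Ford", "Pinto", "Green"), 1)  # now gone everywhere
print(gi)                                     # {}
```

The node survives as long as any site bit is set; it is removed only when the bit vector becomes all zeros, mirroring the algorithm above.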

6.5 Performance evaluation

The Global Index structure combines the techniques of Bit Vector, Graph Structure and Grid File Organization. If one attribute has P distinct values, another has Q and the third has R, then we have to keep information about at most P*Q*R records, which is less than the total number of actual records at all sites. Previously, the Graph Structure used two pointers per attribute; now only one pointer is used, and the bit vector of site addresses takes the space of the original data file pointer. So the total space required for a graph node is almost half of what it was. Since the graph nodes, acting as records, are stored in the GI file in the fashion of a Grid File Organization, we achieve the goal of finding a record in two disk accesses for point queries. The organization also supports range queries and dynamic adaptation such as insertion, deletion and updating. Since only one bit per site records whether a record exists there, the index is compact and fast to maintain.

The performance of the global index depends on how the records of the original database are distributed over the sites. For example, if a record with Manufacturer = Ford, Model = Pinto and Color = Black exists at 2 or 3 sites out of 8, we gain considerable efficiency. But if the record exists at more than 4 sites, we gain no better performance, and if the record exists at all sites the global index causes some overhead: without the GI the record is searched only at the local sites, but now it is searched in the global index file as well as at all local sites where it exists. That is, if a record exists at fewer than 50% of the sites, the GI improves efficiency; how records are actually distributed over the sites is therefore a matter for further inquiry.

We simulated a program that assumes 8 sites and 100,000 distinct records with respect to the indexed attributes. Each site holds 50,000 records chosen at random, so 400,000 actual records are distributed over the 8 sites; one distinct record can thus exist at up to 4 sites, i.e. 50% of the sites. Under these conditions we examined 10,000 random point queries. The results show that before implementing the GI a query needs 8*N comparisons, where N is the number of records in a bucket, while after implementing the GI it needs 7*N comparisons: a performance gain of (8N - 7N)/8N = 12.5% in terms of comparisons. After implementing the GI, the network overhead is also halved.

The above result holds when the probability of a record being present at a site is 0.5. As this probability decreases, the performance gain increases linearly.


References

[1] M. Rio, J. Macedo, V. Freitas, "A Distributed Weighted Centroid-based Indexing System", in: Proceedings of the 8th European Networking Conference (JENC8), 1997, http://www.international.conf.jene8papaers322.ps
[2] G. M. Adel'son-Vel'skii and E. M. Landis, "An algorithm for the organization of information", Doklady Akademii Nauk SSSR, vol. 146, no. 2, pp. 263-266, 1962; English translation in Soviet Mathematics, vol. 3, no. 5, pp. 1259-1263, 1962.
[3] P. Deutsch, R. Schoultz, P. Faltstrom, C. Weider, "Architecture of the Whois++ Index Service", RFC 1835, August 1995, ftp://ftp.ripe.net/rfc/rfc1835.txt
[4] C. Weider, J. Fullton, S. Spero, "Architecture of the Whois++ Index Service", RFC 1913, February 1996, ftp://ftp.ripe.net/rfc1913.txt
[5] M. Bowman, D. Hardy, M. Schwartz, D. Wessels, "CIP Index Object Format for SOIF Objects" (RFC draft version 2), April 1997, ftp://ftp.ietf.org/internet-drafts/draft-ietf-find-cip-soif-02.txt
[6] P. Panotzki, "Complexity of the Common Indexing Protocol", September 1996, http://www.bunyip.com/reaserchpapers1996cip/cip.html
[7] Abraham Silberschatz, Henry F. Korth, S. Sudarshan, "Database System Concepts", Third Edition, The McGraw-Hill Companies, Inc., 1997.
[8] Stefano Ceri, Giuseppe Pelagatti, "Distributed Databases: Principles and Systems", McGraw-Hill Book Company, 1984.
[9] Peter D. Smith, G. Michael Barnes, "Files and Databases: An Introduction", Addison-Wesley Publishing Company.
[10] "Free Harvest Web Indexing Software Development", http://www.trdis.ed.ac.uk/harvest
[11] "InfoSeek Distributed Search Patent", 1997, http://software.infoseek.com/patents/dist.search
[12] M. Wahl, T. Howes, S. Kille, "Lightweight Directory Access Protocol (v3)", RFC 2251, December 1997, ftp://ftp.ripe.net/rfc/rfc2251.txt
[13] Michael Abbey, Michael J. Corey, "ORACLE 8: A Beginner's Guide", Tata McGraw-Hill Publishing Company Limited, 1997.
[14] George Koch, Kevin Loney, "ORACLE: The Complete Reference", Third Edition, Osborne McGraw-Hill, 1995.
[15] R. Bayer, E. McCreight, "Organization and Maintenance of Large Ordered Indexes", Acta Informatica, vol. 1, no. 3, pp. 173-189, 1972.
[16] P. Valkenburg, D. Beckett, M. Hamilton, S. Wilkinson, "Standards in the CHIC-Pilot Distributed Indexing Architecture", in: Proceedings of the TERENA Networking Conference 1998, Computer Networks and ISDN Systems special issue, http://www.terena.nl/libr/tech/chic-fr.html


[17] Stephen Wynkoop, "Special Edition Using Microsoft SQL Server 6.5", Prentice-Hall of India Private Limited, 1998.
[18] L. Gravano, K. Chang, H. Garcia-Molina, C. Lagoze, A. Paepcke, "Stanford Protocol Proposal for Internet Search and Retrieval", January 1997, http://www.db.stanford.edu/~gravano/starts.html
[19] "TERENA Task Force on Cooperative Hierarchical Indexing Coordination (TF-CHIC)", http://www.terena.nl/task/eits.html
[20] "The AltaVista Search Service", http://www.altavista.digital.com
[21] J. Allen, M. Mealling, "The Architecture of the Common Indexing Protocol (CIP)" (RFC draft version 1), 1997, ftp://ftp.ietf.org/internet-drafts/draft-ietf-find-cip-arch-01.txt
[22] D. E. Knuth, "The Art of Computer Programming, vol. 3: Sorting and Searching", Addison-Wesley, Reading, Mass., 1973.
[23] "The InfoSeek Search Service", http://www.infoseek.com
[24] "W3C's Distributed Indexing/Searching Workshop", May 1996, http://www.w3.org/Search/9605-Indexing-Workshop