
[IEEE Comput. Soc Sixth International Conference on Data Engineering - Los Angeles, CA, USA (5-9 Feb. 1990)] [1990] Proceedings. Sixth International Conference on Data Engineering


Parallelism in Database Production Systems

Jaideep Srivastava Kuo-Wei Hwang Jack S. Eddy Tan

Computer Science Department, University of Minnesota, Minneapolis, MN 55455

ABSTRACT

The need for large database production systems is being felt by both the database and AI communities. Achieving good performance will be a major problem, and we believe parallelism is a promising solution. In this paper we study the issues in parallelizing database production systems. We classify the various kinds of parallelism possible in production systems. Parallel execution of production systems leads to some subtle problems which, if not handled carefully, can alter the execution semantics. We identify the precise conditions that any parallel implementation of production systems must fulfill in order to be semantically consistent. We present a mechanism that guarantees the semantic consistency of parallel execution, and prove its correctness. It is based on a novel locking mechanism that provides more parallelism than conventional 2-phase locking. Finally, we discuss the various factors that can affect the actual speed-up of a database production system.

1. Introduction

There is motivation for building production systems that run in a database environment. First, expert system users are asking for knowledge sharing and knowledge persistence, features currently found in databases. Second, many new database applications, e.g., manufacturing and process control, need some rule-based reasoning, a principal feature of expert systems. Since production systems are commonly used to implement expert systems, the study of database production systems becomes very important.

Commercial DBMSs do not have the necessary mechanisms to support production systems. Relational views can be used as rules in a limited way [SHMU86], and recent work has developed powerful mechanisms to handle a larger class of rules [BANC86]. However, these are not sufficient to support full-fledged production systems like OPS5 [FORG81] and HEARSAY-II [HAYES]. Some recent work by Sellis and Raschid [SELL88, RASC88] has focused on issues in database production systems.

Performance is expected to be a major problem for these systems, and we believe parallel processing to be a promising solution to it. A number of efforts have been made to parallelize production system execution [GUPT86, OSHI88, ISHI85, RAMN85], with a focus on main-memory-resident data. Some ideas on the parallelization of database production systems have also been reported in [SELL88]. We classify the types of parallelism in production systems, and identify the necessary consistency conditions for a parallel implementation. A locking-based mechanism for consistency is described and proved to be correct. Various factors affecting parallelism are also identified.

This paper is organized as follows: Section 2 presents the production system model and surveys past work. Section 3 discusses semantic equivalence of executions. Section 4 presents a locking mechanism for semantic consistency. Section 5 illustrates, by means of examples, the various factors affecting the actual speed-up obtained, and Section 6 provides our conclusions.

CH2S40-7/0000/0121$01.00 © 1990 IEEE

2. The Production System Model

A production system consists of a set of rules (productions), a database (working memory), and an interpreter. Items in the working memory are called working memory elements (WMEs). A production has a condition part (LHS) and an action part (RHS). Thus, we have

<Production>: if <condition> then <action>.

The LHS is a conjunction of terms (condition elements), while the RHS is a set of operations to be performed when the LHS is true. The RHS can have the operations create, modify, and delete, which respectively add to, modify, and remove items from the database. The production system interpreter executes a three-phase production system cycle repeatedly until a termination condition occurs. The phases are:

match: The productions are matched against the database to determine the satisfied LHSs. This is the set of active productions (conflict set).

select: The dominant production of the conflict set is selected for execution based on some priority scheme. Various heuristics have been devised to decide the rule priorities [FORG81].¹ If the conflict set is empty or the system goal is achieved, the termination condition becomes true and the system halts.

execute: The RHS operations of the selected production are performed, which may cause changes to the database.
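The three-phase cycle can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `Production` record, the integer `priority` used for selection, and the counter example are hypothetical and are not part of the paper's model, which leaves the priority scheme open.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

WME = Tuple[str, int]            # a working memory element, e.g. ("count", 3)
WorkingMemory = Set[WME]


@dataclass
class Production:
    name: str
    priority: int                                 # hypothetical static priority
    condition: Callable[[WorkingMemory], bool]    # LHS
    action: Callable[[WorkingMemory], None]       # RHS: create/modify/delete WMEs


def match(rules: List[Production], wm: WorkingMemory) -> List[Production]:
    """Match phase: return the conflict set (productions whose LHS holds)."""
    return [r for r in rules if r.condition(wm)]


def select(conflict_set: List[Production]) -> Optional[Production]:
    """Select phase: pick the dominant production (assumed: highest priority)."""
    return max(conflict_set, key=lambda r: r.priority, default=None)


def run(rules: List[Production], wm: WorkingMemory, max_cycles: int = 100) -> List[str]:
    """Repeat match, select, execute until the conflict set is empty."""
    fired = []
    for _ in range(max_cycles):
        dominant = select(match(rules, wm))
        if dominant is None:       # termination: empty conflict set
            break
        dominant.action(wm)        # execute phase, may change the database
        fired.append(dominant.name)
    return fired


# Hypothetical rule set: count up to 3, then record completion.
def count_value(wm: WorkingMemory) -> int:
    return next(v for (k, v) in wm if k == "count")


def increment(wm: WorkingMemory) -> None:
    v = count_value(wm)
    wm.remove(("count", v))        # "modify" expressed as delete + create
    wm.add(("count", v + 1))


rules = [
    Production("inc", 1, lambda wm: count_value(wm) < 3, increment),
    Production("done", 0,
               lambda wm: count_value(wm) == 3 and ("done", 1) not in wm,
               lambda wm: wm.add(("done", 1))),
]
wm = {("count", 0)}
fired = run(rules, wm)
```

Here the match phase is recomputed from scratch in every cycle, which is the simplest possible choice; incremental matchers avoid exactly this cost.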

Production system execution can be sped up by exploiting parallelism in various ways. These can broadly be classified into two categories: user visible and user transparent parallelism.

User visible: The user is aware of parallelism opportunities, and makes full use of them. Example approaches are (1) dividing a task into non-interacting subtasks, (2) partitioning the database into classes of objects accessed by different tasks.

¹ However, it is important to note that the selection of the dominant production is not under user control. As shown later, this is important in defining the execution semantics of a production system.


User transparent: Effective utilization of parallelism is the system's responsibility. Example approaches are (1) intra-phase parallelism, i.e., execution of each phase in a parallel manner, (2) inter-phase parallelism, i.e., overlapped execution of different phases, and (3) taking advantage of potential sharing of computation among different subtasks.² Finally, tasks of different users can be done in parallel.

AI research has focused on main-memory-based, single-user expert systems, where the match phase is the bottleneck [FORG82]. A good solution is the Rete matching network [FORG82], which (1) does incremental condition evaluation by storing state from previous matches, and (2) allows sharing of common subexpressions among the LHSs of different productions. Subsequent research has developed parallel algorithms and specialized architectures for matching [GUPT86, MIRA84, RAMN86, SHAW81, STOL84]. Some attention has been focused on the parallelization of the execute phase, and interference³ among productions was identified as the main problem. Proposed approaches are based on pre-execution analysis to partition productions into non-conflicting classes [ISHI85, MIRA84, TENO85]. Problems with these approaches are that (1) static analysis cannot be exhaustive due to rapid state explosion, (2) interference usually depends on run-time values of variables, and (3) a partitioning created statically does not remain optimal during execution. Recently, [OSHI88] proposed a dynamic scheme, and reported the criteria for detecting interference.⁴ However, (1) the criteria are not necessary for non-interference, and are thus overly restrictive, and (2) no interference detection algorithm is reported.

Recent database research has focused on active databases [STON86, DAYA88], with a focus on providing some support for rule processing. These databases can monitor conditions and execute appropriate actions. Alerters and triggers are the outcome of this research. So far, the focus has been on condition monitoring, which is equivalent to the match phase. However, the execution phase will be a full-fledged database query and is likely to be time consuming, making its parallelization an attractive proposition.

Some recent work on database production systems [SELL88, RASC88] has focused on the match phase, and cond relations are proposed instead of the Rete network as the database matching algorithm. Parallelization of the execution phase is mentioned, but no detailed algorithms are provided.

3. Execution Semantics for Production Systems

Starting with an initial conflict set and an initial state of the database, a production system goes through a sequence of executions. Even though the exact sequence can almost never be determined a priori, there are some sequences that cannot occur. Given the initial system state,⁵ it is possible (conceptually) to determine the allowable sequences of state changes. Each allowable sequence is a valid sequence. Given an initial system state, the set of valid sequences defines the execution or operational semantics of the production system.

3.1. Single vs. Multiple Execution Threads

Normal operation of a production system, where only one production is executed at a time, is the single execution thread mechanism. The multiple execution thread mechanism is where many productions can be selected simultaneously, leading to concurrent RHS executions. This approach can be used to execute the productions either on a uni-processor or a multi-processor. As shown in section 5, the single thread mechanism is at least as efficient as the multiple thread mechanism on a uni-processor. However, the latter can lead to substantial savings on a multi-processor.

² The RETE [FORG82] and TREAT [MIRA84] pattern matching algorithms are examples of this approach.

³ Production $P_1$ interferes with production $P_2$ if the execution of $P_1$'s RHS can cause $P_2$'s LHS to become false.

⁴ Incidentally, these criteria are identical to those for detecting conflicting database operations [PAPA86].

3.2. Semantics of the Single Execution Thread Mechanism

We earlier defined the execution semantics as the set of valid execution sequences, given the initial system state. The execution semantics will, in general, not be the same for different execution mechanisms. Since the basic production system model is defined for a single execution thread,⁶ we use its execution semantics as the correctness criterion.

Execution of a production system is modeled as a progression through its state space. The system state consists of the conflict set and database contents, while a transition consists of (1) selecting some active production(s), (2) executing the selected production(s), and (3) matching the LHSs with the new database to update the conflict set. Execution sequences are represented by strings of the form $\ldots p_i p_j p_k \ldots$. Each state is uniquely associated with a string representing the sequence of productions executed to reach it, starting from the state $S_\epsilon$ (where $\epsilon$ is the null string). The state reached after executing the string $\sigma$, $S_\sigma$, is

$$S_\sigma = \langle P^A(\sigma); WM(\sigma) \rangle$$

where $P^A(\sigma)$ and $WM(\sigma)$ are the new conflict set and database state, respectively. Given an initial state, any execution sequence allowable by the single thread mechanism can be mapped to a unique root-originating path⁷ of a graph, as shown in figure 3.1. This graph depends on the initial state, $S_\epsilon$, and is called the execution graph. It can be constructed in a recursive manner by starting at the root, and adding to each leaf node, $S_\sigma$, the edges corresponding to the productions in the conflict set, $P^A(\sigma)$, as shown in figure 3.1.

Figure 3.1. Single Thread Execution Graph

⁵ The initial system state consists of the initial conflict set and the initial state of the database.

⁶ The OPS5 system [FORG79] is a well known example.

⁷ A root-originating path is one that originates at the root node of the graph.


Definition 3.1 (Execution Semantics of the Single Execution Thread Mechanism): The execution semantics of the single execution thread mechanism, $ES_{single}$, is defined to be the set of root-originating paths in its execution graph, and all prefixes of such paths.

Definition 3.2 (Semantic Consistency Condition): The execution semantics of an execution mechanism $M$, $ES_M$, is consistent with that of the single execution thread mechanism iff $ES_M \subseteq ES_{single}$.

This definition of execution semantics accepts any sequence of production executions which leads to the conflict set becoming empty, and prefixes of such sequences, provided the system is run on a uniprocessor. In the production system paradigm the conflict resolution phase may prefer some sequences over others. The OPS5 system, for example, uses conflict resolution policies such as LEX and MEA [FORG81]. These are heuristics that strongly favor some sequences over others. However, the important point to note is that they do not rule out any execution sequence entirely. Thus, the notion of execution semantics defined above captures exactly the set of sequences that could possibly happen. Our effort, henceforth, will be to ensure that the semantic consistency condition is met. Thus, once the basic guarantee of correctness is provided, heuristics such as LEX, MEA, and others can be incorporated as devices to favor some sequences over others. This is entirely compatible with our model and can be done easily.

Remark: It is possible to start from the state $S_\epsilon$, follow a path not shown in figure 3.1, and reach another state shown in the graph. This means that starting from the initial state, there is a way of mimicking this particular sequence of the single execution thread approach, which would never be allowed by it. In general this is unlikely to happen and depends on the application. Our definition of consistent execution semantics does not require the system to have any application-specific knowledge. The only guarantee a system implementing an execution approach has to provide is that the consistency condition above is fulfilled.

3.3. An Example of Execution Semantics

The execution of a production $P_i$ causes some productions to be added to or deleted from the conflict set. These are the add set ($A_i^a$) and delete set ($A_i^d$) of $P_i$. In general these will depend on $P_i$ and the current database state. However, for illustration we assume the dependence is only on $P_i$.

Consider the set of productions $\{P_1, P_2, P_3, P_4, P_5, P_6\}$ with the following add and delete sets:

A f = (P4); A?= ( P g 3 ) ; A$= [); A$= ( P i ] ; A;= [PSI; A:= ( P z P s l ; G= (PSI; a4”= ( P z ] ; A:= (1; A!= (PiJ’z); Y = { P d ; G= IP1941;

Let (P1,Pfl3,PS] be the initial conflict set. As shown in figure 3.2, the possible execution sequences are p p a , p,p.q@s, pzp3p95,

P ? P 9 9 @ s * PSlP@9@S. P?PlP@@S. P9lPsPs . PSSPS. and P s p 9 . 4 5 . Thus,

ES,.,I. = IPIP~PSIPLP~@S,PZP~P~PS.~SPS~PS. PSIP@@@S.

PSIP@@s. PPIP@s. PP@S. PSPS@d
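The recursive construction of the execution graph, and the consistency condition of Definition 3.2, can be sketched as follows. The add and delete sets used here are hypothetical and deliberately small; they are not the sets of the example above. The sketch also assumes, as the example does, that the sets depend only on the production fired, and that each production's delete set contains the production itself.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Seq = Tuple[str, ...]


def execution_semantics(initial: Iterable[str],
                        add: Dict[str, Set[str]],
                        delete: Dict[str, Set[str]],
                        max_len: int = 20) -> Set[Seq]:
    """Enumerate all root-originating paths of the execution graph.

    Every node visited is recorded, so all prefixes are included,
    starting with the empty string for the initial state.
    """
    sequences: Set[Seq] = set()

    def extend(prefix: Seq, conflict: FrozenSet[str]) -> None:
        sequences.add(prefix)
        if len(prefix) >= max_len:      # guard for rule sets that never halt
            return
        for p in sorted(conflict):      # one outgoing edge per active production
            nxt = (conflict - delete[p]) | add[p]
            extend(prefix + (p,), frozenset(nxt))

    extend((), frozenset(initial))
    return sequences


def is_consistent(es_m: Set[Seq], es_single: Set[Seq]) -> bool:
    """Definition 3.2: ES_M must be a subset of ES_single."""
    return es_m <= es_single


# Hypothetical rule set: firing P1 activates P3 and deactivates P2.
add = {"P1": {"P3"}, "P2": set(), "P3": set()}
delete = {"P1": {"P1", "P2"}, "P2": {"P2"}, "P3": {"P3"}}
es_single = execution_semantics({"P1", "P2"}, add, delete)
```

With these sets, a mechanism that commits `P2, P1, P3` is consistent, while one that commits `P1` and then `P2` is not, because `P1` removes `P2` from the conflict set.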

Figure 3.2. Execution Graph

4. Mechanisms to Ensure Semantic Consistency of Multiple Thread Execution

We distinguish two different approaches to parallel execution of productions, based on when conflict analysis is carried out.

4.1. Static Approach

The static approach is based on pre-execution analysis to identify sets of non-interfering productions, i.e., partitioning the productions into non-interfering groups. (Two productions are non-interfering if there is no read-write or write-write conflict between them.) The partitioning can be done either on the whole production set before running the production system, or on the set $P^A$ before the execute phase of every production cycle, or a combination of both [ISHI85].

Theorem 1: Multiple thread execution with the static approach fulfills the semantic consistency requirement defined in section 3.

Proof: Let $S' = \langle P^{A\prime}; WM' \rangle$ be the initial state of the production system, $P_s = \{P_1, P_2, \ldots, P_i\}$, $P_s \subset P^{A\prime}$, be the set of productions selected for parallel execution, and $S'' = \langle P^{A\prime\prime}; WM'' \rangle$ be the state reached after the parallel firings. Since $P_1, P_2, \ldots, P_i$ are non-interfering, they update non-overlapping parts of the working memory. Thus the state of $WM''$ will be identical to the resulting state of some serial update by the same productions. Similarly, the new set of active productions $P^{A\prime\prime}$ will be identical to the set generated by the same serial execution. Thus, $S'' = S_\sigma$, where $\sigma$ is a permutation of the productions in $P_s$, and $S_\sigma = \langle P^A(\sigma); WM(\sigma) \rangle$ is a state reached in the execution graph of figure 3.1 following the single thread execution of the string $\sigma$. Clearly, any subsequent system state reached as a result of further parallel firings will be identical to some state in the execution graph of figure 3.1. □
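A minimal sketch of the partitioning step is given below, assuming each production's read set and write set is known before execution. The `Access` record, the first-fit grouping, and the sample rules are assumptions made for illustration; first-fit is a simple heuristic and makes no claim to produce an optimal partition.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Access(NamedTuple):
    reads: FrozenSet[str]     # data items read by the production
    writes: FrozenSet[str]    # data items written by the production


def interferes(a: Access, b: Access) -> bool:
    """True if there is a read-write or write-write conflict between a and b."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def partition(rules: Dict[str, Access]) -> List[List[str]]:
    """First-fit grouping: put each production into the first group in which
    it interferes with no member; otherwise start a new group."""
    groups: List[List[str]] = []
    for name, acc in rules.items():
        for group in groups:
            if not any(interferes(acc, rules[other]) for other in group):
                group.append(name)
                break
        else:                       # no compatible group was found
            groups.append([name])
    return groups


# Hypothetical productions and the data items they touch.
rules = {
    "P1": Access(frozenset({"a"}), frozenset({"b"})),
    "P2": Access(frozenset({"c"}), frozenset({"d"})),
    "P3": Access(frozenset({"b"}), frozenset({"e"})),   # reads what P1 writes
    "P4": Access(frozenset({"a"}), frozenset({"d"})),   # writes what P2 writes
}
groups = partition(rules)
```

Productions in the same group can be fired together; different groups must be fired one after another.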

In general it is very difficult, if not impossible, to optimally partition the rules of a production system because of the state explosion problem. Further, optimality also depends on the state of WM, which cannot be determined a priori. The conventional approach is to use the information obtained from execution traces [ISHI85], but this may be very difficult to automate.

Another problem concerns the data objects in WM, which may be hierarchically structured. Two productions may interfere at the 'class' level when in fact each is referencing a different 'subclass' data item. Detecting such 'false' interference may again be cost prohibitive. And as pointed out in section 2, interference may depend on variables whose values are acquired at run time. In such cases the analyzer must behave conservatively, sacrificing parallelism, in order to prevent inconsistency.


Even if the analysis is restricted to productions in $P^A$ only, the overhead may still be large. The analysis has to be done in every production cycle. If $|P^A|$ or the number of data objects involved is large, then the time spent in finding the non-interfering productions may be prohibitive. For applications that require fast response time, such delay may not be acceptable. More discussion and examples of the static approach can be found in [ISHI85, MOLD86].


4.2. Dynamic Approach

This section describes a new approach that dynamically controls concurrent execution of productions at run time. It does not require a priori analysis of the production rules.

The conventional match, select, execute cycle of the production system is modified to increase parallelism. The match mechanism determines which of the non-active productions can be activated. Once a production's LHS is evaluated to be true, its RHS can be executed immediately. Let the number of available processors be denoted by $N_p$. As long as $N_p \ge |P^A|$, no explicit select mechanism is required.

Executing the productions in parallel is similar to concurrent execution of transactions in a DBMS environment. Concurrency control mechanisms are required to maintain consistency of the shared data in WM. Conventional locking schemes, with read and write locks, will provide the requisite consistent (i.e., serializable) execution of productions. Below is an example of such a scheme, using a centralized lock manager.

During condition evaluation, a production requests read locks for the data objects referenced. Note that condition evaluation does not require write locks. A read lock on a data object can be granted only if no other production has placed a write lock on it. If the condition evaluates to false, the production relinquishes its locks and its execution stops. If the condition is true, execution of the RHS begins and additional read and write locks may be requested. A write lock is granted by the lock manager only if there are no other locks on the data object. All the locks obtained are kept until execution of the RHS completes (i.e., commits). A commit event triggers the match mechanism. Figure 4.1 shows the logical execution sequence of a production.
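The grant rules of this scheme can be sketched as a small centralized lock manager. The class and method names are hypothetical. Two simplifications are assumed: a refused request returns `False` instead of blocking the caller, and "no other locks" is read as "no locks held by other productions", so a production may upgrade its own read lock.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Set


class LockManager:
    """Centralized manager with shared read locks and exclusive write locks."""

    def __init__(self) -> None:
        self.readers: DefaultDict[str, Set[str]] = defaultdict(set)  # item -> readers
        self.writer: Dict[str, str] = {}                             # item -> writer

    def read_lock(self, prod: str, item: str) -> bool:
        """Grant unless another production holds a write lock on the item."""
        holder = self.writer.get(item)
        if holder is not None and holder != prod:
            return False
        self.readers[item].add(prod)
        return True

    def write_lock(self, prod: str, item: str) -> bool:
        """Grant only if no other production holds any lock on the item."""
        other_readers = self.readers[item] - {prod}
        holder = self.writer.get(item)
        if other_readers or (holder is not None and holder != prod):
            return False
        self.writer[item] = prod
        return True

    def release_all(self, prod: str) -> None:
        """Called when the condition is false, or when the RHS commits."""
        for holders in self.readers.values():
            holders.discard(prod)
        for item in [i for i, h in self.writer.items() if h == prod]:
            del self.writer[item]


lm = LockManager()
granted = [
    lm.read_lock("P1", "q"),     # shared read
    lm.read_lock("P2", "q"),     # shared read
    lm.write_lock("P1", "q"),    # refused while P2 still reads q
]
lm.release_all("P2")             # P2's condition was false, or P2 committed
granted.append(lm.write_lock("P1", "q"))
granted.append(lm.read_lock("P2", "q"))    # refused while P1 writes q
```

Because every lock is held until commit, a production that only evaluated its condition on `q` can delay a writer of `q` for the whole duration of its own RHS.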

The WM content is atomically updated only when a production reaches its commit point. The changes made may activate some productions and deactivate others. Conceptually, we can think of the commit of $P_i$ as adding (subtracting) the set $A_i^a$ ($A_i^d$) to (from) the conflict set $P^A$.

In this scheme read locks are held in a shared manner and write locks are held in an exclusive manner. Thus, the commit sequence of any parallel (or interleaved) execution of productions using such a scheme is equivalent to some serial order execution [PAPA86]. However, we still must show that any such order satisfies the definition of semantic consistency of section 3, i.e., that the serializable order is indeed one of those allowed by the single thread execution.

Figure 4.1. Standard Lock Acquisition for Production (flowchart; recoverable steps: acquire read locks for condition evaluation, ..., release all locks)

Let $ES_{lock}$ be the execution semantics of the locking scheme described. Any such sequence uniquely defines an execution instance of the multiple thread execution using the above locking scheme. To prove that semantic consistency is satisfied, we must show that $ES_{lock} \subseteq ES_{single}$.

Theorem 2: The execution semantics of the locking scheme satisfies the semantic consistency condition of definition 3.2.

Proof: (By induction) Suppose the sequence of production commits $\ldots P_i P_j P_k \ldots$ is represented by the string $\ldots p_i p_j p_k \ldots$. Now, $ES_{lock} = \bigcup_{i \ge 0} \Gamma^i$, where $\Gamma^i$ is the set of all commit sequences of length $i$. Also, $ES_{single} = \bigcup_{i \ge 0} \Sigma^i$, where $\Sigma^i$ is the set of all the root-originating paths of length $i$ in the execution graph of Figure 3.2. The initial system state corresponds to the root-originating path of length 0, thus the problem reduces to showing that $\Gamma^i \subseteq \Sigma^i$, for $i \ge 1$.

Induction Base: Recall that update to WM and invocation of the match mechanism are performed only when a production commits. Let $P_i$ be the first production to commit. Consider the 1-element string $p_i \in \Gamma^1$. Just before $P_i$ commits, there would have been no new active productions created, and no changes to WM. When $P_i$ commits, the WM will be updated and productions in $A_i^d$ will be removed from the conflict set, whereas those in $A_i^a$ will be added to it. The system state is thus identical to the state $S_{p_i}$ of a single thread execution, where $S_{p_i} = \langle P^A(p_i); WM(p_i) \rangle$. Thus, the commit of $P_i$ in the above locking scheme is equivalent to the serial execution of $P_i$ in a single thread execution. Since $P_i$ is chosen arbitrarily, we have:

$$\Gamma^1 \subseteq \Sigma^1$$

Induction Hypothesis: Assume that for some $k \ge 1$, $\Gamma^j \subseteq \Sigma^j$ is true for $j = 1, 2, \ldots, k$.

Induction Step: Suppose $k$ productions have completed execution using the locking scheme above. Let the commit sequence so far be represented by the string $p_1 p_2 \ldots p_k \in \Gamma^k$. Let $P_{k+1}$ be the next production to commit. Hence $P_{k+1} \in P^A(p_1 p_2 \ldots p_k)$. Since there is no change to WM in the interval following $P_k$'s commit and before $P_{k+1}$'s commit, no new productions can be added to, or deleted from, $P^A$ during this interval. Thus, by the induction hypothesis, just before $P_{k+1}$ commits, the system state is:

$$S_{p_1 p_2 \ldots p_k} = \langle P^A(p_1 p_2 \ldots p_k); WM(p_1 p_2 \ldots p_k) \rangle$$

Once $P_{k+1}$ commits, the state of WM becomes $WM(p_1 p_2 \ldots p_k p_{k+1})$, and the new conflict set becomes:

$$P^A(p_1 p_2 \ldots p_k p_{k+1}) = \left( P^A(p_1 p_2 \ldots p_k) - A_{k+1}^d \right) \cup A_{k+1}^a$$

Thus, the system state is identical to $S_{p_1 p_2 \ldots p_k p_{k+1}}$ of a single thread execution. The commit sequence $p_1 p_2 \ldots p_k p_{k+1}$ is identical to some single thread execution of the same sequence. Since this holds true for any commit sequence $p_1 p_2 \ldots p_k p_{k+1}$, we have:

$$\Gamma^{k+1} \subseteq \Sigma^{k+1}$$

Therefore, we have $\Gamma^i \subseteq \Sigma^i$ for $i \ge 0$, or $\bigcup_{i \ge 0} \Gamma^i \subseteq \bigcup_{i \ge 0} \Sigma^i$, or

$$ES_{lock} \subseteq ES_{single}$$ □


As shown above, standard 2-phase locking provides semantic consistency for multiple thread execution. However, its conservative nature causes a serious performance drawback. The action part of a production can be long, which is the case for many database applications. Read locks acquired for evaluating the LHS are held more conservatively than necessary, while other productions ready for execution must wait for their release. In the next section, we show how concurrency can be enhanced by making these read locks more liberal.

4.3. An Improved Locking Scheme for Enhanced Parallelism

A closer look at the data access pattern during production exe- cution shows that:

(i) The LHS of a production must be executed before its RHS.
(ii) Data access in the LHS is read-only.
(iii) Data access in the RHS is read-write.

Based on this observation, we define the following 3 kinds of locks:

$R_c$: Read lock for condition evaluation.
$R_a$: Read lock for action execution.
$W_a$: Write lock for action execution.

$R_a$ and $W_a$ are the same read and write locks as before. $R_c$ is a new kind of lock whose usage will become clear below. The new lock compatibility matrix is shown in Table 4.1.

During condition evaluation, a production will request only $R_c$ locks on data items. The lock manager will grant an $R_c$ lock as long as no production has already placed a $W_a$ lock on the same data item. As before, a production can start executing its RHS, holding the $R_c$ locks, if its LHS is satisfied. If the condition is not satisfied, all locks are relinquished.

When a production begins executing its RHS, it first obtains the corresponding $R_a$ and $W_a$ locks. Again, an $R_a$ lock can be granted only if there is no other production currently holding a $W_a$ lock on the same object. Similarly, a $W_a$ lock can be granted only if there is no outstanding $R_a$ or $W_a$ lock. Note that a $W_a$ lock can be granted even if another production is holding an $R_c$ lock on the data (allowing an $R_c$-$W_a$ conflict to exist!). This is the key to enhanced parallelism. The logical execution sequence is shown in Figure 4.2.
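The grant rules stated in the text can be encoded as a lookup from the lock already held by another production to the lock being requested. This is a sketch only: the `Lock` enum, the `COMPATIBLE` table, and `can_grant` are hypothetical names, and the entries are derived from the prose rules above.

```python
from enum import Enum
from typing import Iterable


class Lock(Enum):
    RC = "Rc"    # read lock for condition evaluation
    RA = "Ra"    # read lock for action execution
    WA = "Wa"    # write lock for action execution


# COMPATIBLE[held][requested]: may the request be granted while another
# production holds `held` on the same data item?
COMPATIBLE = {
    Lock.RC: {Lock.RC: True,  Lock.RA: True,  Lock.WA: True},
    Lock.RA: {Lock.RC: True,  Lock.RA: True,  Lock.WA: False},
    Lock.WA: {Lock.RC: False, Lock.RA: False, Lock.WA: False},
}


def can_grant(requested: Lock, held_by_others: Iterable[Lock]) -> bool:
    """Grant only if the request is compatible with every lock held by others."""
    return all(COMPATIBLE[held][requested] for held in held_by_others)


# The one entry that differs from conventional read/write locking:
# a writer is not blocked by condition-evaluation readers.
writer_passes_rc = can_grant(Lock.WA, [Lock.RC, Lock.RC])
writer_blocked_by_ra = can_grant(Lock.WA, [Lock.RC, Lock.RA])
```

Only the (held $R_c$, requested $W_a$) entry differs from a conventional shared/exclusive matrix, which is where the extra parallelism comes from.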

Table 4.1. The New Lock Compatibility Matrix (Y: request granted, N: request refused)

Lock held      Lock requested by $P_i$
by $P_j$       $R_c$    $R_a$    $W_a$
$R_c$          Y        Y        Y
$R_a$          Y        Y        N
$W_a$          N        N        N

Figure 4.2. Lock Acquisition with $R_c$ Locks

The increased parallelism does not come for free. More processing needs to be done at the end of production execution to make sure that not only is data consistency preserved, but execution semantic consistency is satisfied as well. Specifically, the lock manager must make sure that when there is an $R_c$-$W_a$ conflict, execution of the productions involved is serializable. Consider two productions $P_i$ and $P_j$. $P_j$ holds an $R_c$ lock on data item $q$, and $P_i$ holds a $W_a$ lock on $q$. Assuming that they do not have other data items in common, the two valid serial execution sequences are $P_j P_i$ and $P_i$ (assuming that the update by $P_i$ nullifies the condition of $P_j$). This can be achieved by the following rules:

(i) If P_j reaches the commit point first, it commits and is not interfered with by P_i.
(ii) If P_i reaches the commit point first, P_j must be forced to abort.

The scenario is depicted in Figures 4.3(a) and 4.3(b). In the first case, P_j commits, and then P_i updates q and commits; the serial order is therefore P_j P_i. If P_i commits first, as in Figure 4.3(b), the lock manager finds all productions holding an R_c lock on q and forces them to abort, so P_j is aborted. It is possible that the update by P_i has not changed the condition of P_j to false. An alternative to rule (ii) is therefore to reevaluate P_j's condition to see whether the abort is necessary, at the expense of increased overhead.
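The two commit rules can be illustrated with a small bookkeeping sketch. It is only an illustration under simplifying assumptions: lock granting is not modelled, only the condition-read and write locks are tracked, and the class and method names (`CommitManager`, `lock_condition_read`, `lock_write`, `commit`) are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


class CommitManager:
    """Tracks condition-read and write locks and applies the commit rules."""

    def __init__(self):
        self.cond_readers = defaultdict(set)  # item -> productions holding a condition-read lock
        self.writers = {}                     # item -> production holding the write lock
        self.committed = []                   # commit order
        self.aborted = set()

    def lock_condition_read(self, prod, item):
        self.cond_readers[item].add(prod)

    def lock_write(self, prod, item):
        # One writer per item; existing condition readers do not block it.
        if self.writers.get(item, prod) != prod:
            raise RuntimeError(f"{item} is already write-locked")
        self.writers[item] = prod

    def _release(self, prod):
        for readers in self.cond_readers.values():
            readers.discard(prod)
        for item in [i for i, w in self.writers.items() if w == prod]:
            del self.writers[item]

    def commit(self, prod):
        """Return True if `prod` commits, False if it was already aborted."""
        if prod in self.aborted:
            return False
        written = [i for i, w in self.writers.items() if w == prod]
        for item in written:
            # Rule (ii): a committing writer aborts every other production
            # that still holds a condition-read lock on the written item.
            for reader in list(self.cond_readers[item]):
                if reader != prod:
                    self.aborted.add(reader)
                    self._release(reader)
        self._release(prod)
        self.committed.append(prod)
        return True
```

If the reader commits first, its locks are released and the writer later commits undisturbed (rule (i)); if the writer commits first, the reader is aborted (rule (ii)). The alternative of reevaluating the reader's condition instead of aborting it unconditionally is not modelled here.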

Note that the above scenario occurs only if the execution of P_i starts after the matching phase preceding P_j begins. Since all R_a and W_a locks are acquired at the start of P_i, if the matching phase preceding P_j begins after the execution of P_i begins, P_j will never enter the conflict set.

Consider now a more complicated example, depicted in Figure 4.4. Suppose P_i holds an R_c lock on data item q and a W_a lock on data item r, while P_j holds an R_c lock on r and a W_a lock on q. Since they interfere with each other, a valid serial execution will allow only one of them to be executed. Using the rules above, the commitment of one production always forces the other to abort. Thus the consistent execution semantics is once again satisfied.

Let P be the set of commit sequences of length k in a multiple thread execution using R_c locks. Given the same initial WM state, any order of production commits σ ∈ P will again be identical to some root originating path of length k of the execution graph of Figure 3.2. This is because the new lock, R_c, does not change the fact that reads are still done in a shared manner and updates are still done in an exclusive manner (by forcing interfered productions to abort when necessary). Thus the proof of Theorem 2 holds for the improved scheme.

It should be noted that the non-exclusive nature of the new R_c lock does not introduce new kinds of deadlocks. Thus, the deadlock prevention, avoidance, detection, or resolution schemes for standard 2-phase locking can be applied to our scheme as well.

Like regular read and write locks, R_c locks can be escalated for performance reasons. In the extreme case, an R_c lock may lock an entire relation. An example is when a condition depends on the absence of some tuples from a relation (negative dependence). In this case a lock can be placed at the relation level. Such a lock is equivalent to locking the appropriate tuple in the 'SYSTEM-CATALOG' relation.


Figure 4.3. (a), (b) [timelines of the LHS and RHS phases of productions P_i and P_j, with their commit points]

Figure 4.4. Circular Conflict Dependency

5. Performance of Single Thread vs. Multiple Thread Execution

The amount of speedup gained by using a multiple execution thread approach on parallel hardware depends on many factors. Some of them are discussed here, and their effects are illustrated through examples.

Single thread execution can be performed only on a uniprocessor. Multiple thread execution can be run on either a uniprocessor or a multiprocessor. The speedup gained by multiple thread execution depends on several contributing factors, which are discussed below. Speedup is the ratio of the execution time of the single thread mechanism to that of the multiple thread mechanism. For example, if a single thread execution sequence takes 12 time units while its corresponding multiple thread execution takes 3 time units, the speedup is 4.

The degree of parallelism attained by the multiple thread mechanism depends on various factors. The ones we discuss are:
(i) Degree of interference
(ii) Number of available processors
(iii) Execution times of individual productions

In this paper, we illustrate the effects of these parameters through examples. A formal analysis of these effects is beyond the scope of this paper, and work on it is currently underway.

Execution of the single thread mechanism on a uniprocessor will take no more time than the multiple thread mechanism.

Example 5.1

Let execution sequence σ be an element of the execution semantics of the single thread mechanism, ES_single. Its execution time on a uniprocessor, T_single(σ), is

$$T_{single}(\sigma) = \sum_{P_j \in \sigma} T(P_j)$$

where the P_j's are the productions that have completed execution so far, and T(P_j) is the execution time of production P_j.

For the multiple thread execution, all productions in the active set, P^A, are executed simultaneously. If executed on a uniprocessor, the time taken to complete all active productions is

$$T_{multiple}(\sigma) = \sum_{P_j \in \sigma} T(P_j) + f \cdot \sum_{P_k} T(P_k)$$

where the P_k's are the aborted productions and 0 ≤ f < 1. The second term is the contribution from the partial executions of these productions, which started executing but were aborted due to the committing of productions in σ. The factor f is an averaged fraction.

Since an aborted production leads to wasted computation, T_single ≤ T_multiple. Hence, single thread execution on a uniprocessor is no worse than multiple thread execution.
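The two uniprocessor execution times of Example 5.1 can be evaluated numerically. The sketch below is only illustrative: the function names are hypothetical, and the durations and the value of f in the usage note are made-up inputs, not figures from the paper.

```python
def t_single(committed_times):
    """Uniprocessor time of single thread execution: the sum of the
    execution times of the productions that commit."""
    return sum(committed_times)


def t_multiple_uniprocessor(committed_times, aborted_times, f):
    """Uniprocessor time of multiple thread execution: committed work plus
    an averaged fraction f (0 <= f < 1) of the work of aborted productions."""
    if not 0 <= f < 1:
        raise ValueError("f must satisfy 0 <= f < 1")
    return sum(committed_times) + f * sum(aborted_times)
```

For instance, with committed durations [2, 3, 4], one aborted production of duration 5, and an assumed f = 0.5, the single thread time is 9 and the multiple thread time on a uniprocessor is 11.5, consistent with the inequality stated in the example.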

A multiprocessor can exploit the concurrency inherent in the multiple thread execution mechanism. The speedup attained, as mentioned above, depends on the degree of conflict (interference), the execution times of the actions/productions, and the number of available processors.

The following examples illustrate these effects. In each example, one parameter is varied while the other two are held constant. We compare the speedups obtained by varying the parameters.

Let the conflict set be P^A = {P1, P2, P3, P4}, the number of available processors be N_p = 4, the times taken to execute the productions in P^A be T(P1) = 5, T(P2) = 3, T(P3) = 2, T(P4) = 4, and σ1 = P3 P2 P4 be an allowable execution sequence. Figure 5.1 shows the execution of P^A on a 4 processor machine. The corresponding add and delete sets are shown in Table 5.1. All subsequent examples will use the above values as a comparison base. The execution time for σ1 using a single thread approach is, as noted in Figure 5.1, T_single(σ1) = T(P3) + T(P2) + T(P4) = 2 + 3 + 4 = 9. The execution time for the multiple thread approach is 4. Hence, the speedup is 9/4 = 2.25.

Table 5.1. Add/Delete Sets for P^A

Table 5.2. Different Add/Delete Sets of P^A Showing Changed Degree of Conflict

5.1. Degree of Conflict Variation

Let us change the degree of interference of the productions in P^A to that shown in Table 5.2. If σ2 = P3 P2 is the new selected execution sequence, then T_single(σ2) = 5, T_multiple(σ2) = 3, and the speedup is 5/3 = 1.67 (see Figure 5.2). Recall that the speedup was initially 2.25. The degree of conflict is thus an important factor that affects the speedup for parallel execution of productions.
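The speedup figures quoted so far can be checked with a few lines of arithmetic. The sketch below assumes that there are at least as many processors as active productions and that all productions start at the same time, so the multiple thread time is the longest committed production; the function name `speedup_ample_processors` is hypothetical.

```python
def speedup_ample_processors(committed_times):
    """Speedup when every active production has its own processor and all
    start together: sum of committed times over the longest committed time."""
    t_single = sum(committed_times)
    t_multiple = max(committed_times)
    return t_single / t_multiple


base = speedup_ample_processors([2, 3, 4])   # sequence P3 P2 P4
changed = speedup_ample_processors([2, 3])   # sequence P3 P2
print(base, round(changed, 2))
```

Under these assumptions the base case gives 9/4 = 2.25 and the changed degree of conflict gives 5/3, which rounds to 1.67.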

5.2. Execution Time Variation

If the execution time of P2 is increased by 1 time unit, the execution time for σ1 using a single thread mechanism becomes

$$T_{single}(\sigma_1) = T(P_3) + T(P_2) + T(P_4) = 2 + 4 + 4 = 10$$

As shown in Figure 5.3, the execution time for σ1 using the multiple thread mechanism is

$$T_{multiple}(\sigma_1) = 4$$

$$\text{Speedup} = \frac{10}{4} = 2.5$$

Recall that the initial speedup was 2.25. In this case, the speedup increase is due to the increase in T_single(σ1) as a result of increasing the execution time of P2.

5.3. Number of Processors Variation

The number of processors available for the multiple thread execution, N_p, will expedite execution if

$$N_p \ge \max |P^A|$$

since |P^A| is dynamically changing. However, even if N_p ≤ max |P^A|, execution is also expedited if the current

$$|P^A| < N_p < \max |P^A|$$

Typically, if N_p ≤ max |P^A|, at least two productions will share the same processor, which is akin to an isolated single thread firing. Such a scenario is still superior to single thread execution on a uniprocessor.

The following example illustrates the effect of varying the number of available processors. When the number of available processors is reduced to 3, the following is observed (see Figure 5.4):

$$T_{single}(\sigma_1) = T(P_3) + T(P_2) + T(P_4) = 9$$

$$T_{multiple}(\sigma_1) = 6$$

$$\text{Speedup} = \frac{9}{6} = 1.5$$

The speedup decreases from 2.25 to 1.5. The reduction in speedup is intuitive, since now there is a processor that has more than one production to execute.
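The effect of the number of processors can be reproduced with a small event simulation. This is a sketch under assumptions that are not stated in the text: productions are started in the order P1, P2, P3, P4 as processors become free, a commit immediately frees the processor of any production it aborts, and P1 is the production aborted by the commit of P3. The function names `simulate` and `speedup` are hypothetical.

```python
import heapq


def simulate(durations, n_processors, aborts, start_order):
    """Return (commit order, makespan) for a multiple thread execution.

    durations:    production name -> execution time
    n_processors: number of processors (at least 1)
    aborts:       committing production -> set of productions it aborts
    start_order:  order in which waiting productions are started
    """
    waiting = list(start_order)
    running = []  # heap of (finish_time, name)
    commits = []
    now = 0
    while waiting or running:
        # Start waiting productions on idle processors.
        while waiting and len(running) < n_processors:
            name = waiting.pop(0)
            heapq.heappush(running, (now + durations[name], name))
        # The next production to finish commits.
        now, name = heapq.heappop(running)
        commits.append(name)
        # Its commit aborts interfering productions, running or waiting.
        victims = aborts.get(name, set())
        waiting = [p for p in waiting if p not in victims]
        running = [(t, p) for t, p in running if p not in victims]
        heapq.heapify(running)
    return commits, now


def speedup(durations, n_processors, aborts, start_order):
    commits, t_multiple = simulate(durations, n_processors, aborts, start_order)
    t_single = sum(durations[p] for p in commits)
    return t_single / t_multiple
```

With durations P1 = 5, P2 = 3, P3 = 2, P4 = 4 and the assumed interference, four processors yield the commit order P3 P2 P4 at time 4 (speedup 2.25), and three processors yield the same order at time 6 (speedup 1.5), because P4 cannot start until P3 commits at time 2. Raising the duration of P2 to 4 on four processors gives a speedup of 2.5. This is one schedule consistent with the reported times; the actual schedule of Figure 5.4 may differ.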

As shown in this section, all the parameters mentioned are important in determining the speedup attainable for parallel execution of productions.

6. Conclusion

Recent years have witnessed growing interest in the area of database production systems. Their need is being felt in both the database and AI communities. This widespread interest has necessitated a look at the expected performance of these systems. Traditional production systems used in the AI realm store their data in main memory, and have not dealt with the complexities of mass data storage. The mechanisms provided by today's database systems are not suitable for database production systems, primarily because of performance considerations. Good performance has always been a critical issue for database researchers, and database production systems will make it even harder to achieve. We believe the availability of parallel hardware should be exploited to achieve enhanced performance. This paper explores the possibilities of parallelism in production systems, with special emphasis on the database environment.

Our contributions are as follows. We classified various opportunities for parallelism. Parallel execution of production systems can lead to subtle problems which, if not handled carefully, can alter the execution semantics. We identified the precise conditions that any parallel production system implementation must satisfy in order to be semantically consistent. We presented a mechanism to ensure the consistency of parallel database production systems. A new kind of locking mechanism, which allows enhanced concurrency, was introduced and proven correct. It is simple and elegant, and requires only minor modifications to conventional lock managers. We illustrated, by means of examples, the effect of various parameters on the actual speed-up obtained.

There is ample evidence that database production systems can benefit a great deal from parallelism. However, many issues remain to be resolved before such systems can be built. Some of these are: (1) what kinds of parallelization techniques (data structures and algorithms) are suitable for each phase of the production cycle, (2) how these can be combined, and (3) how the specific hardware can be taken advantage of. These and many other issues are currently being investigated as part of an ongoing project.

Figure 5.1. [Execution of σ1 = P3 P2 P4 on processors 1 to 4: T_single(σ1) = 2 + 3 + 4 = 9, T_multiple(σ1) = 4, Speedup = 2.25]

Figure 5.2. Execution of σ2 for a Different Degree of Conflict

Figure 5.3. Characteristics of Varying Execution Times [T_single(σ1) = 2 + 4 + 4 = 10, Speedup = 2.5]

Figure 5.4. Execution Characteristics for Change in Number of Processors Available

7. References

[BANC86] Bancilhon, F., and R. Ramakrishnan, "An Amateur's Introduction to Recursive Query Processing," Proceedings of the ACM-SIGMOD International Conference on the Management of Data, Washington, D.C., 1986.
[CHAK86] Chakravarthy, U.S., and J. Minker, "Multiple Query Processing in Deductive Databases," Proceedings of the 12th International Conference on Very Large Databases, Kyoto, Japan, 1986.
[DAYA88] Dayal, U., et al., "The HiPAC Project: Combining Active Databases and Timing Constraints," SIGMOD Record, 17(1), 1988.
[FORG81] Forgy, C.L., "OPS5 User's Manual," Tech. Report CMU-CS-81-135, Carnegie-Mellon University, 1981.
[FORG82] Forgy, C.L., "Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem," Artificial Intelligence 19, pp. 17-37, 1982.
[GUPT84] Gupta, A., "Parallelism in Production Systems: The Sources and the Expected Speedup," Carnegie-Mellon University, Pittsburgh, December 1984.
[GUPT86] Gupta, A., et al., "Parallel Algorithms and Architectures for Rule-Based Systems," Proceedings 13th International Symposium on Computer Architecture, pp. 28-37, IEEE, 1986.
[HAYE85] Hayes-Roth, F., "Rule Based Systems," Communications of the ACM, Volume 28, No. 9, 1985.
[ISHI85] Ishida, T., and S. Stolfo, "Towards the Parallel Execution of Rules in Production System Programs," Proceedings International Conference on Parallel Processing, pp. 568-575, 1985.
[MCDE82] McDermott, J., "R1: A Rule-Based Configurer of Computer Systems," Artificial Intelligence, Vol. 19, No. 1, pp. 39-88.
[MIRA84] Miranker, D.P., "Performance Estimates for the DADO Machine: A Comparison of TREAT and RETE," Proceedings of the International Conference on Fifth Generation Computer Systems, pp. 449-457, ICOT, 1984.
[MOLD86] Moldovan, D.I., "A Model for Parallel Processing of Production Systems," Proceedings IEEE International Conference on Systems, Man, and Cybernetics, pp. 568-573, 1986.
[OSHI88] Oshisanwo, A.O., and P.P. Dasiewicz, "A Parallel Model and Architecture for Production Systems," Proceedings International Conference on Parallel Processing, pp. 147-153, 1988.
[PAPA86] Papadimitriou, C., The Theory of Database Concurrency Control, Computer Science Press, Rockville, Maryland, 1986.
[RAMN86] Ramnarayan, R., et al., "PESA-1: A Parallel Architecture for OPS5 Production System," Proceedings of the Nineteenth Annual Hawaii International Conference on System Sciences, pp. 201-205, HICSS, 1986.
[RASC88] Raschid, L., et al., "Exploiting Concurrency in a DBMS Implementation for Production Systems," International Symposium on Databases in Parallel and Distributed Systems, Austin, Texas, pp. 34-45, 1988.
[SELL88] Sellis, T., et al., "Implementing Large Production Systems in a DBMS Environment: Concepts and Algorithms," Proceedings of ACM-SIGMOD, 1988.
[SHAW81] Shaw, D.E., "NON-VON: A Parallel Machine Architecture for Knowledge-Based Information Processing," Proceedings Seventh International Joint Conference on Artificial Intelligence, pp. 961-963, IJCAI, August 1981.
[SHMU86] Shmueli, O., Tsur, S., and Zfira, H., "Rule Support in Prolog," Proc. 1st Int'l Workshop on Expert Database Systems, Benjamin-Cummings, 1986.
[STOL84] Stolfo, S.J., and Miranker, D.P., "DADO: A Parallel Processor for Expert Systems," Proceedings International Conference on Parallel Processing, pp. 74-82, IEEE, August 1984.
[STON85] Stonebraker, M., "Triggers and Inference in Data Base Systems," Proceedings of the Islamorada Expert Database Conference, 1985.
[STON86] Stonebraker, M., and L. Rowe, "The Design of Postgres," Proceedings of the ACM-SIGMOD International Conference on the Management of Data, Washington, DC, 1986.
[TENO85] Tenorio, M.F.M., and Moldovan, D.I., "Mapping Production Systems into Multiprocessors," Proceedings of the International Conference on Parallel Processing, pp. 56-62, IEEE, 1985.