Parallel Knowledge-Based Information Retrieval

on the NON-VON Machine

Computer Science Department, Columbia University, New York, NY 10027

August 1981

CUCS-21-81

NON-VON [Shaw, 1979; Shaw, et al., 1981] is a highly parallel machine adapted to the extremely efficient support of certain operations that appear central to a wide range of practical knowledge-based information processing tasks. In order to demonstrate the power of the NON-VON architecture for large-scale artificial intelligence applications, we have implemented a simple knowledge-based retrieval system, with the primitive NON-VON machine instructions emulated in software. The system compares KRL-like descriptions [Bobrow and Winograd, 1977] with the contents of what might in practice be a very large database, returning all "matching" descriptions; the criteria for description-matching require deductive inference over a domain-specific knowledge base. This paper describes the essential algorithms employed in our experimental knowledge-based retrieval system.

This research was supported in part by the Defense Advanced Research Projects Agency under contracts .'r{J)A.903-11-~ and 80003~G-<}1J2.


1. Introduction

The central focus of our recent research has been the investigation of highly parallel non-von Neumann machine architectures adapted to the kinds of operations that appear central to the operation of a broad class of large-scale knowledge-based systems. Our approach is based on an architecture [Shaw, 1979] that supports the highly efficient evaluation of the most "difficult" set theoretic and relational algebraic operators. The machine comprises a tree-structured Primary Processing Subsystem (PPS), which we are implementing using custom nMOS VLSI chips, and a Secondary Processing Subsystem (SPS) incorporating modified, highly intelligent disk drives. Our initial efforts, reported in a recent Stanford doctoral dissertation [Shaw, 1980a], have yielded:

1. The top-level architectural specification of a highly parallel machine which we have since come to call NON-VON,

2. An analysis of the time complexity of the essential parallel hardware algorithms to be executed on the NON-VON machine in the course of large-scale, meaning-based data manipulation,

3. The implementation of an operational knowledge-based information retrieval system, demonstrating the use of NON-VON (emulated in software) in support of a very high level descriptive formalism based on the language KRL [Bobrow and Winograd, 1977].

Portions of the NON-VON machine are now in the earliest stages of physical implementation as part of a cooperative effort involving the new Computer Science Department at Columbia and the Knowledge Base Management Systems Project at Stanford. Details of the NON-VON hardware are presented elsewhere [Shaw, et al., 1981], and will not be described here. The central focus of the present paper is the structure and function of the demonstration system which we have implemented as a vehicle for illustrating the use of NON-VON in a simple, but characteristic large-scale AI application. This system demonstrates the manner in which NON-VON might be utilized in support of the highly efficient retrieval of records from very large databases in applications where the criteria for description-matching require deductive inference over a domain-specific knowledge base. Our demonstration system, which was implemented in MACLISP on the DEC PDP-10 at the Stanford Artificial Intelligence Laboratory, emulates the primitive operations of the NON-VON machine in software.

Our demonstration system uses a restricted first-order predicate calculus as a sort of "intermediate form", bridging the gap between the semantics of our KRL-like description language and certain operators of a relational algebra having particular importance in the computational task of logical satisfaction. These relational algebraic operators are evaluated in parallel by the NON-VON hardware, yielding a significant improvement over the best methods known for performing equivalent operations on a conventional computer system. In this paper, we will trace the operation of our knowledge-based retrieval system from the level of KRL-like descriptions, through that of the predicate logic-based intermediate form, and down to the level of the primitive relational algebraic operators, which are evaluated in parallel on the NON-VON hardware. We begin with a discussion of a general and important class of problems which may be thought of as conceptual matching tasks, of which the knowledge-based information retrieval application is a particular instance.


2. The General Conceptual Matching Problem

Many information processing tasks performed by men and machines alike involve the process of matching, by which a correspondence is assigned between members of two sets of entities. The criteria for certain sorts of matches are quite simple to describe. Letters are routinely matched with mailboxes, for example, and Congressmen with constituents, according to straightforward algorithms based on simple, single properties of the entities in question. By contrast, our demonstration system may be thought of as concerned with a more interesting class of tasks which might be termed conceptual matching problems. This more demanding sort of task, which is nonetheless a common part of our cognitive experience, involves the assignment of a correspondence between entities which humans perceive as being highly and systematically structured, based on selected significant characteristics of those structures.

A surprisingly large share of the kinds of information processing activities with which both human and automated data processors are charged may be viewed as involving various kinds of conceptual matching problems. Although meaning-based matching may, in a COBOL-based inventory control system for airplane parts, be deeply embedded within program loops and sort routines, and hence difficult to recognize as such, the fact remains that a large proportion of the CPU cycles which will be expended in any particular run might be thought of as identifying appropriate records for manipulation on the basis of domain-specific criteria (in our example, criteria involving, say, airplanes, motors and aircraft-specific part-whole relationships). In the general spirit of much of the recent work on knowledge-based systems, we might ask for the ability to describe these matching criteria, along with the domain-specific entities and relationships on which they are based, in a very explicit, modular, easily understandable manner, mapping salient facts, rules and relationships onto independently expressible assertions within the programming system.

Previous work in the field of artificial intelligence offers a rich set of knowledge representation and matching techniques which might be employed in the pursuit of this general approach to the problem of conceptual matching. The kinds of applications with which we are most concerned, though, are those in which the quantity of data to which conceptual matching techniques must be applied may be quite large. More specifically, our doctoral research attacks the problem of matching a given pattern description against the members of what may be a very large set of candidate target descriptions according to meaning-based criteria. The potential size of the collection of target descriptions imposes special constraints on the sorts of conceptual matching techniques that might be successfully applied in practice.

3. Knowledge-Based Information Retrieval

The particular instance of the conceptual matching task that we have chosen for our demonstration system is borrowed from the general paradigm of information retrieval, and is itself most easily exemplified by the document retrieval (more accurately, reference retrieval) application. In an ordinary document retrieval system, a collection of target documents (all the books in a computer science library, for example) is first indexed by associating a target description with each document in the collection. The end user of the system, whom we will call the searcher, then prepares a pattern description which embodies some of the salient characteristics of the sorts of documents in which he is interested. The system then compares the pattern description with the candidate target descriptions in the collection, returning all targets that "match" according to certain prespecified criteria.

It is the nature of these criteria that distinguishes the behavior of a knowledge-based information retrieval system, and indeed, of a system for conceptual matching in general. In such applications, it is not possible in general to decide whether a match should succeed in a strictly mechanical, "syntactic" manner; instead, the acceptability of a match may depend on domain-specific entities and relationships, and on deductive inferences over these entities and relationships. In the case of the computer science library, for example, the system might be required to "know about" such entities as computers, algorithms, programmers, and storage devices. Certain characteristic attributes of these entities (the storage medium attribute, whose values differ for different kinds of storage devices, for example) might also be included in this domain-specific knowledge. Among the typical kinds of relationships that might be embodied in the knowledge base of such a system is the fact that a tape drive is a particular kind of storage device whose storage medium is always magnetic tape; one simple deductive inference involving this relationship might establish the fact that a pattern description in which the subject of a document is described as involving a storage device with magnetic tape as its medium would be satisfied by a target description in which a tape drive appeared in the corresponding position.

Let us now briefly consider the manner in which such a knowledge-based information retrieval system might be used in practice. In contrast with an ordinary information retrieval system, three distinct classes of users would be involved in the operation of a knowledge-based retrieval system. In addition to the searchers and indexers, a third class of users having expertise in the subject areas of the documents to be indexed would be required to formulate and encode the sorts of domain-specific knowledge described above for use in indexing and retrieval. Members of this third class of users, which has no analogue within the context of the conventional information retrieval system, might be called knowledge engineers. Our primary concern in this paper, however, will be with the process of retrieval by searching end-users, under the assumption that the knowledge base has previously been constructed and the documents in the collection indexed.

While space does not permit a detailed discussion of the weaknesses of existing information retrieval systems, and of the manner in which our knowledge-based retrieval system addresses these limitations, it is worth mentioning that our approach would offer the greatest advantage in the case where highly specific targets are to be chosen from among a large class of "conceptually heterogeneous" documents. Phrased differently, knowledge-based retrieval methods should prove most critical in the context of tasks in which the semantic criteria for satisfaction of a user's request are meaningful for only a comparatively small subset of the target collection. A very ambitious example of such a task might be the selection of a specialized journal article whose relevance might only be apparent to, say, a graduate student working in the field who had read the paper, from among the set of all documents in a large university collection.

4. A System for Knowledge-Based Retrieval

In order to demonstrate the way in which a NON-VON-like machine might be used in an actual AI application, we have implemented a simple knowledge-based information retrieval system having the basic structure outlined above. In the interest of applicability to problems other than our sample document retrieval application, however, the system we have implemented is in fact somewhat more general in one respect than suggested by the above discussion. Specifically, the rules defining the semantics of matching within the document description language have not been embedded inextricably within the code of the retrieval system, but have instead been explicitly formulated as an independent, separable set of axioms expressed in restricted first-order predicate calculus. Specification of the match semantics in the form of a separate set of declaratively specified rules also contributes to the flexibility of our demonstration system, in that the matching axioms could be easily modified to reflect changes in the description language or in the rules for description matching without affecting the behavior of the system as a whole through modifications of the code itself. More accurately, then, the knowledge-based retrieval system which we have implemented may be thought of as making reference to three conceptually distinct "databases": a target collection, a domain-specific knowledge base, and a match specification defining the matching semantics of the knowledge-based description language.

As will be seen shortly, the flexible approach to the specification of match semantics that we have outlined, together with the capacity for the use of domain-specific knowledge in evaluating the success of potential matches, supports a very powerful and highly general set of capabilities not available in a conventional information retrieval system. It is not difficult to construct a scenario in which this sort of generalized knowledge-based retrieval system might offer a number of important practical advantages by comparison with a conventional information retrieval system. Unfortunately, there is a fundamental respect in which our presently operational demonstration system, which runs on conventional computer hardware, would not be practical for use in an actual application involving a large target collection.

Specifically, the demonstration system relies very heavily on the execution of several operations which, on a von Neumann machine, are quite expensive when the operands comprise a large amount of data. As it happens, it is precisely in the case of a very large target collection that our knowledge-based approach is of the greatest potential utility. If this approach is to be regarded as a candidate for practical application, consideration must thus be given to any alternatives to the von Neumann machine architecture that might support the more efficient execution of these operations.

On the NON-VON machine, the most expensive low-level operations involved in our approach to knowledge-based retrieval (in particular, the most computationally expensive primitive operators of a relational algebra) may be evaluated in a highly efficient manner. Based on a hierarchy of intelligent storage devices, this architecture in fact permits an O(log n) improvement (with very favorable constant factors) in time complexity over the evaluation methods used for these operators on a conventional computer system, without the use of redundant storage, and using currently available and potentially competitive technology. Although it was not our original goal in pursuing this research, these results have recently begun to attract attention within the database management community by virtue of the important role played by these "difficult" relational algebraic primitives within database management systems based on the relational model of data [Codd, 1970]. Because of this connection with the concerns of relational database systems, together with the close relationship between our architectural work and earlier research on specialized hardware for database management, NON-VON is sometimes referred to as a (relational) database machine. On close examination, this more immediate byproduct of our research is not quite the coincidence it might seem at first glance; indeed, the reasons these operations have proven to be so central to our own work are closely related to at least one of the reasons for their frequent appearance in the relational database literature, as will be seen shortly.

5. Organization of the Retrieval System

The nature of the problem we have chosen to attack, and of the approach to its solution that we have adopted, imposes what might be described as a vertically integrated organization on this paper. Although we are concerned with only a single task, which is itself reasonably well defined and manageable in scope, the methodical reader will be following our approach to its solution along a long route extending from the level of very high level descriptions, through the arena of logical formula manipulation, and down to the domain of actual parallel operations in hardware. It has been our experience in describing this research before a number of audiences that the "vertical distance" which must be covered in its exposition may make it difficult to retain the overall structure of our work while attending to the details of each of its "layers".

In an attempt to mitigate this difficulty, we have attempted to illustrate in Figure 5.1 the vertical structure of our system, which is closely mirrored by the organization of this paper; the reader may wish to refer back to this figure periodically for purposes of orientation (or at very least, reassurance). At the top level, documents and domain-specific knowledge are represented using a knowledge-based description language. Examples of this language appear immediately to the right of the phrase "knowledge-based description language" in Figure 5.1; they need not be studied carefully at this point, but are presented to give some feeling for the kind of information they embody. At the bottom of our vertical column, the primitive operations of a relational algebra (again illustrated immediately to the right of the corresponding title) are interpreted by the NON-VON machine.

Most of the activity of our actual demonstration system, however, takes place in the middle portion of our illustration: the part dealing with formulas expressed in a restricted first order logic. As we shall see, the strength of predicate calculus as an "intermediate form" for our system derives from its flexibility as a descriptive tool, on the one hand, and from its interesting and useful relationship to the relational algebraic primitives, on the other. In our demonstration system, all descriptions, both target and pattern, are first converted into a special normalized form, and the resulting data structure manipulated according to the rules embodied in the match specification (which, we recall, are also expressed in predicate logic) to locate all matching target documents. The details of these manipulations are embodied in an algorithm called LSEC (for Logical Satisfaction by Extensional Constraint), whose primitive (relational algebraic) operations are implemented directly as highly parallel macroinstructions of the NON-VON machine.

The remainder of this paper describes the structure and function of the demonstration system, from the level of knowledge-based descriptions through the invocation (but not the evaluation) of the relational algebraic primitives which form the macroinstructions of the NON-VON machine. Section 6 introduces and exemplifies the knowledge-based description language. In Section 7, such aspects of the relational model of data, its relationship to predicate calculus, and its use as a database query language as are essential to our own work are presented and illustrated; the application of these techniques to our own work is then introduced. Finally, Section 8 describes the LSEC algorithm, whose execution is central to the operation of the working demonstration system.

Figure 5.1 [Diagram, top to bottom: knowledge-based description language; NORMALIZATION PROCEDURE; predicate calculus-based internal representation; LSEC ALGORITHM; relational algebraic primitives; RELATIONAL DATABASE MACHINE]

6. The Knowledge-Based Description Language

The knowledge-based description language that we have adopted for use in our demonstration system is based closely on a simple subset of the knowledge representation language KRL [Bobrow and Winograd, 1977]. Rather than begin with a formal treatment of this description language, we have chosen to first exemplify its descriptions with a simple (albeit rather contrived) example of a hypothetical document description:

    a Document
        authors
            set-with-all-of
                Thompson
                Walters
            set-of
                an Engineer
        countries-of-publication
            set-with-any-of
                USA
                Great-Britain
        subject
            involves
                an Invention with
                    purpose
                        or
                            Power-generation
                            Power-transmission
        printing-dates
            set-with-exactly
                1959
                1962
    a Textbook

(In the actual description language, parentheses are used extensively for purposes of grouping; we have taken the liberty of omitting the parentheses in this and most other examples, using indentation to convey the same structural information.)

The above description is made up of two descriptors, one of which characterizes its referent as a Document, and the other more specifically as a Textbook. The use of descriptions composed of multiple descriptors, each reflecting a different way of viewing the same real-world entity, is an essential part of our knowledge-based description language.

In all, there are eight types of descriptors that may be included in the descriptions of our language, each of which is illustrated in the above example. The two that we have just mentioned are called perspectives. Each perspective includes a single prototype (Document and Textbook, in the examples under consideration) with which is associated a particular set of characteristic slots. One or more of these slots may be filled with another description in any particular instance of a perspective. In our example, the authors and countries-of-publication slots (among others) are both filled. Alternately, a perspective may appear without any of its slots filled, as in the case of the Textbook descriptor. A target perspective is deemed to "syntactically" match (see Subsection 7.7) a pattern perspective exactly in the (recursively defined) case where the prototypes are identical, and where every slot that is filled in the pattern has a matching description in the corresponding position in the target.

Several other types of descriptors are embedded within the slots of our top-level example description. The simplest of these are the individual descriptors Thompson, Walters, USA, Great Britain, Power-generation, Power-transmission, and the two dates. Individuals are the only "atomic" (i.e., non-decomposable) descriptor type, and may, to a first approximation, be thought of as corresponding to single, specific entities in the "real world". Individual descriptors in the pattern description are matched only by identical individual descriptors in the corresponding target descriptions.

The semantics of the four "set types" (set of, set with exactly, set with all of and set with any of) conform reasonably well to the meanings suggested by their names. In our example, the authors are described as a set whose members are all engineers, and which includes both Thompson and Walters in particular (possibly along with others). The set of countries in which the hypothetical document was published is required to include either the USA, Great Britain or both. Finally, the document is to have exactly two printing dates: 1959 and 1962. For the most part, the rules for matching descriptors of the various set types are fairly intuitive; for details, the reader is referred to the matching axioms themselves, which are included as an appendix to our doctoral dissertation [Shaw, 1980a].

Involves descriptors support description-matching on the basis of structural embedding. Specifically, an involves descriptor in the pattern which has a description D as its argument will successfully match against any description in the corresponding position within the target which either itself matches D or has some subdescription which matches D. In intuitive terms, description D1 is called a subdescription of description D2 if D1 is lexically nested at any depth within the body of D2, either as a slot filler (in the case of a perspective) or an argument (in the case of a set type or disjunctive descriptor). For a more precise definition, the reader is again referred to the matching axioms.

Finally, the disjunctive descriptor is defined to match against a given target descriptor if either of its two arguments (the individuals Power-generation and Power-transmission, in our example) would match against that target descriptor.
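The matching rules stated above for individuals, perspectives, involves descriptors and disjunctive descriptors can be sketched in Python. This is a minimal illustration under stated assumptions, not the matching axioms of [Shaw, 1980a]: the encoding (a description as a list of descriptors, a perspective as a tuple holding a slot dictionary), the function names `match`, `match_descriptor` and `subdescriptions`, and the rule that a pattern description matches when each of its descriptors is matched by the target are all hypothetical. The four set types are omitted, and no domain knowledge is consulted.

```python
def subdescriptions(description):
    """Yield every description nested at any depth inside `description`."""
    for d in description:
        if isinstance(d, str):          # individuals are atomic
            continue
        # Perspective: slot fillers; involves / or: argument descriptions.
        nested = d[2].values() if d[0] == "a" else d[1:]
        for sub in nested:
            yield sub
            yield from subdescriptions(sub)


def match_descriptor(p, target):
    """Does pattern descriptor `p` match the target description?"""
    if isinstance(p, str):
        # An individual is matched only by an identical individual.
        return p in target
    kind = p[0]
    if kind == "a":
        # Same prototype, and every slot filled in the pattern has a
        # matching description in the same slot of the target.
        _, prototype, slots = p
        for t in target:
            if isinstance(t, tuple) and t[0] == "a" and t[1] == prototype:
                if all(s in t[2] and match(d, t[2][s]) for s, d in slots.items()):
                    return True
        return False
    if kind == "involves":
        # The target itself, or some subdescription of it, matches D.
        d = p[1]
        return match(d, target) or any(match(d, s) for s in subdescriptions(target))
    if kind == "or":
        # Any one of the argument descriptions matches.
        return any(match(arg, target) for arg in p[1:])
    raise ValueError(f"descriptor type not covered by this sketch: {kind}")


def match(pattern, target):
    """Assumed rule: every pattern descriptor must be matched by the target."""
    return all(match_descriptor(p, target) for p in pattern)


TARGET = [
    ("a", "Document",
     {"subject": [("a", "Invention", {"purpose": ["Power-generation"]})]}),
    ("a", "Textbook", {}),
]
PATTERN = [
    ("a", "Document",
     {"subject": [("involves",
                   [("or", ["Power-generation"], ["Power-transmission"])])]}),
]
print(match(PATTERN, TARGET))
```

In this sketch the involves descriptor succeeds because Power-generation is nested inside the Invention perspective that fills the subject slot; the same pattern without involves would fail, since the subject slot of the target does not itself contain that individual.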

Figure 6.1 defines the precise syntax of our knowledge-based description language. Consistent with the usual conventions, the names of nonterminals are enclosed in angle brackets, with alternative choices separated by vertical bars. Braces are used as meta-symbols for purposes of grouping; parentheses, on the other hand, are symbols of the description language itself. A "plus" superscript is used to indicate that a syntactic element may occur one or more times, while a superscripted asterisk marks an element which may occur zero or more times.

<description> ::= ( <descriptor>* )

<descriptor> ::= <individual> | <perspective> | <set-of descriptor> | <set-with-exactly descriptor> | <set-with-all-of descriptor> | <set-with-any-of descriptor> | <involves descriptor> | <disjunctive descriptor>

<individual> ::= <LISP identifier>

<perspective> ::= ( { a | an } <prototype> <filler pair>* )

<prototype> ::= <LISP identifier>

<filler pair> ::= ( <slot name> <description> )

<set-of descriptor> ::= ( set-of <description> )

<set-with-exactly descriptor> ::= ( set-with-exactly <description>+ )

<set-with-all-of descriptor> ::= ( set-with-all-of <description>+ )

<set-with-any-of descriptor> ::= ( set-with-any-of <description>+ )

<involves descriptor> ::= ( involves <description> )

<disjunctive descriptor> ::= ( or <description>+ )

Figure 6.1 Syntax of the Knowledge-Based Description Language
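As a rough illustration of how the grammar of Figure 6.1 might be applied, the following Python sketch checks whether a nested structure is a well-formed description. Several things here are assumptions rather than part of the original system: parenthesized forms are encoded as Python tuples and LISP identifiers as strings, the function names are hypothetical, and the repetition superscripts are read as "zero or more" for descriptors and filler pairs and "one or more" for the arguments of set-with-exactly, set-with-all-of, set-with-any-of and or.

```python
ONE_ARGUMENT = {"set-of", "involves"}
ONE_OR_MORE = {"set-with-exactly", "set-with-all-of", "set-with-any-of", "or"}


def is_description(x):
    """<description> ::= ( <descriptor>* )"""
    return isinstance(x, tuple) and all(is_descriptor(d) for d in x)


def is_filler_pair(x):
    """<filler pair> ::= ( <slot name> <description> )"""
    return (isinstance(x, tuple) and len(x) == 2
            and isinstance(x[0], str) and is_description(x[1]))


def is_descriptor(x):
    if isinstance(x, str):                      # <individual>
        return True
    if not isinstance(x, tuple) or not x or not isinstance(x[0], str):
        return False
    head, args = x[0], x[1:]
    if head in ("a", "an"):                     # <perspective>
        return (len(args) >= 1 and isinstance(args[0], str)
                and all(is_filler_pair(p) for p in args[1:]))
    if head in ONE_ARGUMENT:                    # set-of, involves
        return len(args) == 1 and is_description(args[0])
    if head in ONE_OR_MORE:                     # remaining set types, or
        return len(args) >= 1 and all(is_description(a) for a in args)
    return False


# The example document description, written out with explicit grouping.
invention = ("an", "Invention",
             ("purpose", (("or", ("Power-generation",), ("Power-transmission",)),)))
document = ("a", "Document",
            ("authors", (("set-with-all-of", ("Thompson",), ("Walters",)),
                         ("set-of", (("an", "Engineer"),)))),
            ("countries-of-publication",
             (("set-with-any-of", ("USA",), ("Great-Britain",)),)),
            ("subject", (("involves", (invention,)),)),
            ("printing-dates", (("set-with-exactly", ("1959",), ("1962",)),)))
EXAMPLE = (document, ("a", "Textbook"))

print(is_description(EXAMPLE))
```

Because each checker corresponds to one nonterminal, changing the assumed reading of a superscript only requires changing the corresponding length test.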

In the course of examining a number of actual documents chosen from the domain of computer science, we were able to identify several kinds of deductive inference mechanisms that might well prove useful in a working knowledge-based retrieval system. In our actual demonstration system, however, only a single, relatively simple form of inference, based on antecedent-consequent (or simply AC) pairs, was chosen for purposes of demonstrating our approach to meaning-based matching. Each antecedent-consequent rule expresses a relationship between two descriptions (the antecedent and the consequent) which may be thought of either in terms of implication or specialization. Under the first interpretation, the fact that a given "real-world" entity may be appropriately described by the antecedent description is taken to logically imply that the entity in question may also be described by the consequent. From the (formally equivalent) alternative viewpoint, the antecedent is considered to be a special case of the consequent.

Our system places no restrictions on the form of the antecedent and consequent descriptions. In particular, it is possible to formulate AC pairs which express simple generalization relationships (the fact that a Tape drive is a special kind of Storage device, for example), and elaborated generalizations (e.g., that a Tape drive is a Storage device whose medium is Magnetic tape), along with a number of more general relationships. It should be noted that the relationship expressed in AC pairs is transitive. As it happens, the transitivity of implication is one of the characteristics which impose certain special requirements on the matching procedures. In Section 8, we will see how this sort of requirement is accommodated by the LSEC algorithm.
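The consequence of transitivity can be illustrated with a small Python sketch that collects every description implied by a given antecedent through chains of AC pairs. This is only an illustration and is not the LSEC algorithm. It assumes, for simplicity, that antecedents and consequents are bare prototype names, although the system itself places no such restriction; the function name `consequents` and all pairs other than Tape-drive/Storage-device are hypothetical.

```python
from collections import deque


def consequents(antecedent, ac_pairs):
    """Return everything implied by `antecedent` via chains of AC pairs."""
    implied = set()
    queue = deque([antecedent])
    while queue:
        current = queue.popleft()
        for a, c in ac_pairs:
            if a == current and c not in implied:
                implied.add(c)          # direct or chained consequence
                queue.append(c)         # follow the chain further
    return implied


AC_PAIRS = [
    ("Tape-drive", "Storage-device"),    # example taken from the text
    ("Storage-device", "Peripheral"),    # hypothetical pair
    ("Programmer", "Person"),            # hypothetical pair
]

print(sorted(consequents("Tape-drive", AC_PAIRS)))
```

A pattern mentioning Peripheral would thus have to be considered satisfied by a target mentioning Tape-drive, even though no single AC pair relates the two; a matching procedure that consulted each pair only once, in isolation, would miss this.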

7. Logic, Relations and Retrieval

As noted earlier, predicate logic and the relational algebra together provide a critical link between our knowledge-based retrieval language and the underlying NON-VON machine. In this section, the connections between these two mathematical tools, and their use in both conventional database management and knowledge-based retrieval tasks, will be introduced. Subsection 7.1 reviews those aspects of the relational model of data that are essential to our research interests. In Subsection 7.2, we illustrate the simple procedure by which pattern and target descriptions, along with all antecedent-consequent pairs to be included in the knowledge base, are normalized (that is, converted into relational form for manipulation using the logical and relational algebraic tools). The relationship between predicate logic and the relational algebra is introduced in Subsection 7.3, using a simple example involving the evaluation of an ordinary (as opposed to knowledge-based) database query for purposes of illustration. The full set of relational algebraic primitives of concern to our research is described more systematically in Subsection 7.4. The remaining three subsections describe the tools used to define the rules for matching knowledge-based descriptions, ending with a discussion of the 25 matching axioms. These axioms comprise the match specification actually used in our working demonstration system, and appeared as an appendix to our Ph.D. thesis [Shaw, 1980a].

7.1 The relational model of data

The relational model of data, as typically formulated by researchers in database management systems, has its roots in two seminal papers by Codd [1970, 1972]. In this context, the term relation is used to denote a set of structured entities called tuples which, within a single relation, share a common attribute structure. We may more rigorously define a normalized relation of degree $n$ as a set of tuples, where each tuple is an element of the cartesian product of $n$ (not necessarily distinct) sets of non-decomposable entities, called the underlying domains of the relation. Since relations are sets, we may refer to the number of elements (in this case, tuples) in a relation as the cardinality of the relation. Intuitively, relations may be thought of as "tables", in which each "row" represents one tuple and each column represents one of the $n$ (simple) attributes of that relation.

It is conventional to either name or number the attributes of a relation for convenience in referring either to a whole "column" of the relation, or to the value of the attribute in question within a particular tuple. In some discussions (and in particular, in much of this paper), it is also useful to group several attributes (some possibly repeated) together, referring to them jointly as a compound attribute.

The term normalized reflects the "type distinction" between underlying domain elements, which may appear as the values of attributes, and tuples and relations, which may not. A single tuple thus cannot be used to directly represent a hierarchically nested data structure. (Our use of the term normalized corresponds to what is now commonly referred to as first normal form in the database management community.) The sample binary (order two) relation shown below expresses a part-whole relationship between airplanes and their (hypothetical) constituent parts:

PRODUCT | PART
DC-10   | wheel
DC-10   | engine-mount
DC-3    | oxygen-mask
DC-10   | oxygen-mask
DC-10   | radio

This example relation, which will be used several times in the remainder of this section, should be interpreted as asserting that a DC-10 includes at least a wheel, an engine-mount, an oxygen mask and a radio, while a DC-3 includes an oxygen mask. Note that each of the entries in the table is a "primitive domain element"; rather than represent the fact that the DC-10 contains each of these four parts by associating an explicit list of parts with a single entry for DC-10, the airplane's identity must be repeated for each part in order to satisfy the normalization requirements. As we shall see in the next subsection, however, data structures expressed as lists, trees, etc. may be easily "normalized" in a simple and mechanical fashion.
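The following fragment is an illustrative sketch only, written in Python (our choice for exposition, not a notation used by the system described here). It models the sample relation as a set of 2-tuples and shows how a nested parts list could be flattened into normalized form; the names `product_part` and `normalize_parts` are hypothetical.

```python
# Hypothetical illustration: a normalized relation modelled as a set of tuples.
product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}

degree = 2                       # number of simple attributes (PRODUCT, PART)
cardinality = len(product_part)  # number of tuples in the relation


def normalize_parts(nested):
    """Flatten a nested {product: [parts]} structure into a binary relation.

    The airplane's identity is repeated once per part, so that every
    attribute value is a non-decomposable element.
    """
    return {(product, part) for product, parts in nested.items() for part in parts}


nested = {
    "DC-10": ["wheel", "engine-mount", "oxygen-mask", "radio"],
    "DC-3": ["oxygen-mask"],
}
flat = normalize_parts(nested)
```

Because the relation is a set, repeating a part under the same product in the nested input would not create a second tuple.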

Upon casual inspection, our knowledge-based descriptions already appear to have the basic structure of relations. To be sure, the attribute/value structure observed in the tuples of a relation is a close analogue, both in form and function, of the slot/filler constructs of our knowledge-based description language. Note, however, that while the value of a simple attribute within a given tuple must be a non-decomposable element of a particular primitive domain, the filler of a slot may itself be a general description, possibly containing its own embedded subdescriptions, and so on. Furthermore, the inclusion within a slot of descriptors of any type but individual, along with the capacity for multiple abstraction through the use of multiple-descriptor descriptions, also violates the constraints of relational normalization.

The capacity for multiple viewpoints, and for an arbitrary degree of structural embedding, is essential to the kind of knowledge-based retrieval with which we are concerned. This difference between the "schematic" structures embodied in our knowledge-based descriptions, on the one hand, and the tuple of a normalized relation, on the other, is thus quite fundamental to our approach to knowledge-based retrieval. Consequently, although it is certainly tempting to apply the tools of logic and relational algebra more directly to the task at hand, the knowledge-based descriptions must be converted to normalized, relational form before this machinery can be applied. We will thus now examine the procedure by which our demonstration system normalizes the knowledge-based descriptions prior to the application of logical and relational algebraic mechanisms.

Normalization of the external (unnormalized) form of a description involves the addition of new tuples to various relations in the single extensional database which embodies all the descriptive information manipulated by the system. (The term "extensional" will be motivated in Subsection 7.5.) Consider, for example, the following description, which is composed of a single perspective descriptor having Storage-device as its prototype and Magnetic-tape as the filler of its medium slot:

a Storage-device

medium = Magnetic-tape

The internal (normalized) form of this description comprises the following set of six tuples, which would be added to five distinct relations in the extensional database (since the relation *DTOR--DTYPE* acquires two new tuples):

*DTION--DTOR* (G001, G002)

*DTOR--DTYPE* (G002, Perspective)

*PERSPECTIVE--PROTOTYPE* (G002, Storage-device)

*OBJECT--SLOT--FILLER* (G002, Medium, Storage-device)

*DTOR--DTYPE* (G003, Individual)

*INDIVIDUAL-DTOR--INDIVIDUAL* (G003, Magnetic-tape)

(In the interest of brevity, the word DESCRIPTION has been abbreviated as DTION, DESCRIPTOR as DTOR and IMPLIES as IMP, as in our actual system. We will also be using the abbreviations PAT, for PATTERN, and TAR, for TARGET.)


Note that, in order to explicitly indicate to which relation the tuples in question will be added, we have used a somewhat different representation scheme, called intensional notation, for representing the normalized form of our sample description. In intensional notation, the name of the relation appears first, followed by a parenthesized list which specifies (in a predetermined order associated with the relation in question) the value of each simple attribute in the particular tuple being represented. Particular tuples in intensional notation thus have the form of logical predicates (with constant arguments), which, as we shall see in the following subsection, will permit their incorporation in well-formed formulas of a restricted first-order predicate calculus. The significance of the asterisks surrounding the relation name in intensional notation will be discussed in Subsection 7.5.

While it will not be necessary to consider the detailed semantics of each of the nineteen relations which are employed in our demonstration system, and their relationship to the external descriptions from which they are generated, it is worth noting a few characteristic differences between the external and internal forms of a description. Perhaps most obvious is the appearance within the internal form of a number of primitive elements (those of the form G00n in our example) which may be said to "carry meaning" solely on the basis of their relationship with other, explicitly named, primitive elements. (This technique, along with the naming convention for such elements, should be familiar to LISP programmers as an analogue of the "GENSYMed" atom.)
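As a rough illustration of the role played by generated names, the sketch below (in Python, chosen here for exposition only) produces identifiers of the form G00n and emits the three tuples that the examples in this subsection associate with a description consisting of a single perspective descriptor with no slots. The function names `make_gensym` and `normalize_perspective` are hypothetical, and the sketch makes no claim about how the demonstration system itself is organized.

```python
import itertools
from collections import defaultdict


def make_gensym(prefix="G"):
    """Return a function yielding fresh names G001, G002, ... (cf. LISP GENSYM)."""
    counter = itertools.count(1)
    return lambda: f"{prefix}{next(counter):03d}"


def normalize_perspective(db, gensym, prototype):
    """Add tuples for a description with one slot-free perspective descriptor.

    Returns the generated name of the description itself.
    """
    description = gensym()
    descriptor = gensym()
    db["*DTION--DTOR*"].add((description, descriptor))
    db["*DTOR--DTYPE*"].add((descriptor, "Perspective"))
    db["*PERSPECTIVE--PROTOTYPE*"].add((descriptor, prototype))
    return description


# The extensional database is modelled as relation name -> set of tuples.
db = defaultdict(set)
gensym = make_gensym()
tape_drive = normalize_perspective(db, gensym, "Tape-drive")
```

The generated names carry no meaning of their own; they serve only to tie together the tuples that mention them.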

More interesting, however, is the appearance within the internal form of certain "naturally" named primitive elements which nonetheless do not appear explicitly within the original external description. In particular, note that while some of the named elements correspond to semantically meaningful tokens found in the external description, others (in our example, Perspective and Individual) serve a purely syntactic function, explicitly representing in normalized relational form the structural information that was implicit in the syntax of our knowledge-based description language. Loosely speaking, the process of normalization involves a flattening of the original tree structure of the external description, together with an expansion of the original description to explicitly represent syntactic information.

Since descriptions serve roles as patterns, targets, antecedents or consequents within our system, tuples must be added to the extensional database to reflect these roles. If our example were in fact a target description (which of course seems unnatural in this case), this connection would be drawn by the addition of the single tuple

*TARGET--COLLECTION* (G001, (collection name))

to the *TARGET--COLLECTION* relation. If, on the other hand, the example description were the consequent of an AC-pair whose antecedent description was a Tape-drive, the following four tuples would instead be added to the extensional database:


*ANTECEDENT--CONSEQUENT* (G004, G001)

*DTION--DTOR* (G004, G005)

*DTOR--DTYPE* (G005, Perspective)

*PERSPECTIVE--PROTOTYPE* (G005, Tape-drive)

Note that the knowledge base, which was characterized earlier as a separate database for pedagogical purposes, is actually implemented as part of the single extensional database, together with the candidate target descriptions in the collection and the internal form of the pattern description used in the course of a particular retrieval session.

In this subsection, we will examine some important connections between the descriptive capabilities of the first-order predicate calculus and the use of relational algebraic operators to make such logical descriptions computationally effective. While we might have chosen to exemplify this relationship using a realistic example of the behavior of our demonstration system, the relative complexity of the knowledge-based matching operation would have made it unnecessarily difficult to identify the essential import of such an example. We have thus chosen to illustrate the central aspects of our use of mathematical logic and relational algebra through the use of a simple example typical of conventional (as opposed to knowledge-based) retrieval in a relational database management context.

Of interest in this regard is the observation that a wide range of typical database queries can be expressed in the form of

1. a well-formed formula in the first-order predicate calculus (without function symbols), together with

2. a list of free variables of that well-formed formula, all possible joint instantiations of which are to be returned as the value of the query.

This observation is best illustrated by considering a simple example. To this end, we have supplemented the hypothetical *PRODUCT--PART* relation introduced in Subsection 7.1 with a new *CUSTOMER--PRODUCT* relation, which is to be interpreted as asserting that American Airlines has purchased one or more DC-3's and one or more DC-10's, while Western Airlines owns one or more DC-10's only:

CUSTOMER | PRODUCT        PRODUCT | PART
American | DC-3           DC-10   | wheel
Western  | DC-10          DC-10   | engine-mount
American | DC-10          DC-3    | oxygen-mask
                          DC-10   | oxygen-mask
                          DC-10   | radio


Suppose we wished to produce a list pairing each airline with each part which could be found in the inventory of that airline, independent of the identity of the model (or models) of airplane which accounted for the presence of that part within the inventory of that airline. In relational terms, we should like the result of our query to be a new binary relation having two attributes, one ranging over the same primitive domain as the CUSTOMER attribute of the *CUSTOMER--PRODUCT* relation, and one over that of the PART attribute of the *PRODUCT--PART* relation, each of whose tuples satisfies the relationship in question. Such a query might be expressed in the following way using the language of first-order logic:

$$(x, z) : \exists y \,.\, \bigl(\text{*CUSTOMER--PRODUCT*}(x, y) \land \text{*PRODUCT--PART*}(y, z)\bigr)$$

where the result variables are specified in a parenthesized list which is prefixed to the well-formed formula and followed by a colon. Here, we are assigning a correspondence between the predicate *CUSTOMER--PRODUCT* and the relation having CUSTOMER and PRODUCT as its attributes, and similarly, between the *PRODUCT--PART* predicate and the other relation. The result of this query is defined to be a new relation, having two attributes (corresponding to the two free variables $x$ and $z$ specified in the result variable list), whose tuples enumerate all of the combinations of one instantiation of $x$ and one instantiation of $z$ for which the well-formed formula is true for some instantiation of the existentially quantified variable $y$.
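The definition of the query result can be read almost literally as a procedure. The Python sketch below (an illustrative aid under our own naming, not part of the system described) enumerates every candidate pair of result values and keeps those for which some value of the quantified variable satisfies the formula; `satisfies` is a hypothetical helper.

```python
customer_product = {
    ("American", "DC-3"),
    ("Western", "DC-10"),
    ("American", "DC-10"),
}
product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}

# Primitive domains over which the three variables range.
customers = {x for x, _ in customer_product}
products = {y for _, y in customer_product} | {y for y, _ in product_part}
parts = {z for _, z in product_part}


def satisfies(x, z):
    """True if some y makes both primitive predicates hold for (x, y) and (y, z)."""
    return any(
        (x, y) in customer_product and (y, z) in product_part for y in products
    )


# (x, z) : exists y . *CUSTOMER--PRODUCT*(x, y) and *PRODUCT--PART*(y, z)
result = {(x, z) for x in customers for z in parts if satisfies(x, z)}
```

This direct reading tests every combination of domain values, which is adequate for a toy example but says nothing about how the result should be computed efficiently.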

Let us now consider how the result of such a query might be computed. All possible combinations of tuples, one of which is chosen from the *CUSTOMER--PRODUCT* relation, the other from *PRODUCT--PART*, whose product attributes share a common value are identified, as illustrated below by the lines connecting tuples of the two argument relations:

CUSTOMER | PRODUCT        PRODUCT | PART
American | DC-3           DC-10   | wheel
Western  | DC-10          DC-10   | engine-mount
American | DC-10          DC-3    | oxygen-mask
                          DC-10   | oxygen-mask
                          DC-10   | radio

[Figure: lines connecting tuples with matching PRODUCT values; the lines are not reproduced.]

For each such matching pair of tuples, a new tuple is created by concatenating the two and eliminating one copy of the common PRODUCT attribute, thus yielding the following ternary relation:


CUSTOMER | PRODUCT | PART
American | DC-3    | oxygen-mask
Western  | DC-10   | wheel
Western  | DC-10   | engine-mount
Western  | DC-10   | oxygen-mask
Western  | DC-10   | radio
American | DC-10   | wheel
American | DC-10   | engine-mount
American | DC-10   | oxygen-mask
American | DC-10   | radio

The operation that we have just described provides our first example of a relational algebraic operation, which is called the join (more precisely, the natural join) of the two argument relations. The PRODUCT attributes of each of the two argument relations are together referred to as the join attributes.
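The natural join of the example can be sketched in a few lines of Python. This is an illustration under our own naming (`natural_join_on_product` is hypothetical and is specialized to the two binary relations of the example); it pairs tuples whose PRODUCT values agree and keeps a single copy of that value.

```python
customer_product = {
    ("American", "DC-3"),
    ("Western", "DC-10"),
    ("American", "DC-10"),
}
product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}


def natural_join_on_product(cp, pp):
    """Join (CUSTOMER, PRODUCT) with (PRODUCT, PART) on the PRODUCT attribute."""
    return {
        (customer, product, part)
        for (customer, product) in cp
        for (other_product, part) in pp
        if product == other_product  # the join attributes must match
    }


joined = natural_join_on_product(customer_product, product_part)
```

The nested iteration examines every pair of tuples, so its cost grows with the product of the two cardinalities.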

Note, however, that our formulation of the query made no reference to the product by which the customer and part are related. To produce the desired result relation, we must therefore remove the PRODUCT attribute from our intermediate result. Notice, however, that the first and eighth tuples in the intermediate result are distinguished only by the value of their respective product attributes. Upon removal of this attribute, these two tuples would no longer differ, introducing a redundancy in the result relation which is prohibited by the fact that relations are sets. (As we shall see in the following subsection, our injunction against relations with redundant tuples does not reflect a superstitious adherence to our formal definition of relations, but is in fact motivated by important practical considerations.)

The final step of our example thus involves not only removal of the PRODUCT attribute, but also elimination of the redundant tuples that would otherwise result from the removal of formerly distinguishing attribute values:

CUSTOMER | PART
American | oxygen-mask
Western  | wheel
Western  | engine-mount
Western  | oxygen-mask
Western  | radio
American | wheel
American | engine-mount
American | radio

This combined operation is an example of another relational algebraic operation, called projection. We refer to the CUSTOMER and PART attributes, which are carried through to the result relation in our example, as projected attributes, while the PRODUCT attribute is described as projected out of the argument relation.

The very simple example that we have just considered has required only the informal introduction of the join and project operators. In the following subsection, we will define a larger set of relational algebraic primitives, providing a more rigorous definition for each one. The importance of the relational algebra to our work (and indeed, to a great many knowledge-based tasks) derives from the fact that the more complete set of relational algebraic primitives has all the "descriptive power" of the logic-based query language introduced above.

Perhaps the best known analogue of this observation in the relational database literature is the completeness result due to Codd [1972]. In essence, Codd gave a constructive proof that any query formulated in the calculus of α-expressions (often called the relational calculus) could be computed by application of a suitable series of relational algebraic operations. The relational calculus is a descriptive language quite similar to our own first-order predicate calculus-based query language, but one in which, among other distinctions, quantification is over tuples and not elements of the primitive domains. Ignoring for the moment the differences between strict first-order logic and relational calculus, Codd's result thus provides a systematic (though generally inefficient) way of computing the result of an arbitrary first-order query, as defined above, using only the relational algebraic operations that will be defined in the next subsection.

One way of viewing the roles of logic and relational algebra in this sort of retrieval task that we have found particularly useful in our work is based on the notion of a formal theory. Within this theoretical framework, we view the query as part of a first-order theory, and the relations in the (extensional) database as a particular interpretation of that theory. In particular, each primitive predicate in the query is associated with one relation in the extensional database, which is treated as its interpretation. Within this framework, we can view the problem of finding the result of the query as that of finding all possible values of the (free) result variables such that the query well-formed formula is logically satisfied under this interpretation. For this reason, we sometimes refer to the task of computing the result of a logical query as one of satisfaction over a finite (albeit generally large) domain.

7.4 The relational algebraic primitives

The relational algebra we have used in our research is based on a small set of algebraic operators enumerated by Codd [1972] which take one or more relations (along with certain "control" information) as arguments, returning a single new relation as their value. This set of primitives includes the ordinary set operations (which, with one restriction, are defined for relations in much the same way as for other sets) along with several structured operators, which make reference to the internal attribute structure of the constituent tuples. The two most important structured operators, project and join, have already been informally described in the previous subsection. Several other structured operations will also be introduced in this subsection which may in fact be derived from project, join and the unstructured set operations, but which serve certain particularly important functions in many practical applications of relational algebraic systems.

Specifically, we will be concerned in this paper with the following relational algebraic primitives:

1. Union

2. Intersection

3. Set difference

4. Projection

5. Join

6. Selection

7. Restriction

The three binary set operators union, intersection and set difference are defined in a relational algebraic system in the same way as for sets in general, with one exception: the relational version of each is defined only when the two relations that serve as its operands are union-compatible. Two relations are said to be union-compatible if and only if they are of the same degree $n$ and the underlying domains of the $i$-th simple attributes of the two relations are the same for all $i$, $(1 \le i \le n)$.

We thus define the union of two union-compatible relations $R_1$ and $R_2$, denoted $(R_1 \cup R_2)$, as a relation consisting of exactly those tuples that are an element of $R_1$, of $R_2$, or both. The intersection $(R_1 \cap R_2)$ is defined as that relation containing all tuples found in both $R_1$ and $R_2$. Finally, the set difference $(R_1 - R_2)$ is defined to consist of exactly those tuples of $R_1$ that are not present in $R_2$.
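The three set operators and the union-compatibility condition can be sketched as follows. The Python below is illustrative only; the `Relation` class, which simply records a domain name for each simple attribute alongside the tuples, is an assumption of ours made so that compatibility can be checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    domains: tuple     # underlying domain name of each simple attribute, in order
    tuples: frozenset  # the tuples of the relation


def union_compatible(r1, r2):
    """Same degree, and identical underlying domain for every i-th attribute."""
    return r1.domains == r2.domains


def _require_compatible(r1, r2):
    if not union_compatible(r1, r2):
        raise ValueError("operands are not union-compatible")


def union(r1, r2):
    _require_compatible(r1, r2)
    return Relation(r1.domains, r1.tuples | r2.tuples)


def intersection(r1, r2):
    _require_compatible(r1, r2)
    return Relation(r1.domains, r1.tuples & r2.tuples)


def difference(r1, r2):
    _require_compatible(r1, r2)
    return Relation(r1.domains, r1.tuples - r2.tuples)


dc10 = Relation(("product", "part"), frozenset({("DC-10", "wheel"), ("DC-10", "radio")}))
dc3 = Relation(("product", "part"), frozenset({("DC-3", "oxygen-mask")}))
owners = Relation(("customer", "product"), frozenset({("Western", "DC-10")}))
```

Here `dc10` and `dc3` may be combined freely, whereas any attempt to combine either of them with `owners` raises an error, since the underlying domains differ even though the degrees agree.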

In preparation for our formal definition of the projection operator, we will need to introduce some additional notation. First, we adopt the convention that a list of primitive domain elements enclosed by angle brackets ("$\langle$" and "$\rangle$") will designate a new tuple containing the specified elements as the values of its simple attributes, in the order listed. Furthermore, if $r$ is a tuple of some $n$-ary relation $R$, we will define $r[i]$ to be the value of the $i$-th attribute of $r$, $(1 \le i \le n)$. It will be convenient to extend this notation to allow expressions such as $r[A]$, where $A$ is a compound attribute of $R$ consisting of the $m$ (not necessarily distinct) simple attributes numbered $j_1, j_2, \ldots, j_m$, defined such that $\langle r[A] \rangle$ represents the new tuple $\langle r[j_1], r[j_2], \ldots, r[j_m] \rangle$.

We may now define the projection of a relation $R$ over the compound attribute $A$ as the set

$$\{\langle r[A] \rangle : r \in R\}$$

Note that we have defined the projection operator in such a way that simple attributes within the compound attribute $A$ may be replicated in the course of projection. Depending on certain details in the definition of the join operation, this convention may have important theoretical consequences affecting the expressive power of the resulting algebra.

The projection operator may be thought of as a sort of "vertical subsetting" operation, in which

1. the "non-projected" attributes of each tuple in the argument relation are eliminated,

2. the remaining attributes may be permuted and/or replicated, and

3. any duplicate tuples that result from the elimination of values that formerly distinguished different tuples are then removed.
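The three aspects of projection listed above can be sketched in Python as follows. This is an illustration only; the function name `project` and the convention of passing the compound attribute as a sequence of 1-based attribute numbers are our assumptions.

```python
def project(relation, attrs):
    """Projection of `relation` over the compound attribute `attrs`.

    `attrs` lists 1-based attribute numbers j1..jm, which may omit, permute
    or repeat the simple attributes of the argument relation. Building a
    set removes any duplicate tuples that result.
    """
    return {tuple(r[j - 1] for j in attrs) for r in relation}


product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}

parts_only = project(product_part, (2,))    # eliminate PRODUCT
swapped = project(product_part, (2, 1))     # permute the attributes
doubled = project(product_part, (2, 1, 2))  # replicate PART
```

Projecting onto PART alone yields four tuples rather than five, because the two oxygen-mask tuples are no longer distinguished once PRODUCT is removed.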

In most implementations on a von Neumann machine (that is, a "conventional" computer system having a single central processing unit operating on a single bank of random access memory) the attribute elimination and permutation/replication functions can both be implemented using a simple and computationally inexpensive procedure whose complexity is linear in the cardinality of the argument relation. The elimination of redundant tuples, on the other hand, may be surprisingly time-consuming, particularly when the argument relation is large. In fact, one common convention in some von Neumann implementations is to relax the requirement that relations be true sets, allowing the introduction of duplication during some or all projections. This approach introduces the following problems, however:

1. the maintenance of duplicate tuples may lead to combinatorially explosive growth in the cardinality of the intermediate results of a complex query, and

2. functions sensitive to the repetition of identical tuples (the calculation of numerical counts and statistical measures, for example) will not yield accurate results if redundant tuples are not first eliminated.
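The second problem can be made concrete with the ternary relation of Subsection 7.3. In the Python sketch below (illustrative only, with hypothetical variable names), the number of distinct parts held by each customer is counted once on a projection that retains duplicates and once on a true projection.

```python
from collections import Counter

# The ternary (CUSTOMER, PRODUCT, PART) relation from the join example.
intermediate = [
    ("American", "DC-3", "oxygen-mask"),
    ("Western", "DC-10", "wheel"),
    ("Western", "DC-10", "engine-mount"),
    ("Western", "DC-10", "oxygen-mask"),
    ("Western", "DC-10", "radio"),
    ("American", "DC-10", "wheel"),
    ("American", "DC-10", "engine-mount"),
    ("American", "DC-10", "oxygen-mask"),
    ("American", "DC-10", "radio"),
]

# Projection onto (CUSTOMER, PART) without duplicate elimination.
with_duplicates = [(customer, part) for (customer, _product, part) in intermediate]

# True projection: the result is a set, so redundant tuples disappear.
true_projection = set(with_duplicates)

count_with_duplicates = Counter(customer for customer, _ in with_duplicates)
count_true = Counter(customer for customer, _ in true_projection)
```

With duplicates retained, American appears to hold five kinds of part; the true projection gives the correct figure of four.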

One of the capabilities of the NON-VON machine is the performance of true projection without the high cost of redundant tuple elimination.

Definition of the join operation requires the definition of one additional construct: the concatenation of two tuples. If $r_1$ is a tuple of a relation $R_1$, having degree $n_1$, and $r_2$ is a tuple of relation $R_2$, having degree $n_2$, the concatenation $(r_1 r_2)$ of $r_1$ and $r_2$ is defined to be the new $(n_1 + n_2)$-tuple

$$\langle r_1[1], \ldots, r_1[n_1], r_2[1], \ldots, r_2[n_2] \rangle$$

Several variations of the join operator are commonly discussed in the literature; we will begin by defining a particularly important variant known as the equi-join. The equi-join of two relations $R_1$ and $R_2$ over the compound attributes $A_1$ and $A_2$, respectively (each assumed to be composed of the same number of simple attributes, with corresponding simple attributes having underlying domains that are comparable under the equality predicate) is defined as

$$\{(r_1 r_2) : r_1 \in R_1 \land r_2 \in R_2 \land r_1[A_1] = r_2[A_2]\}$$

$A_1$ and $A_2$ are referred to as the (compound) join attributes, which will have special significance in the algorithms introduced in this paper. In the case where $A_1$ and $A_2$ are the degenerate compound attributes containing no simple attributes, equi-join reduces to the extended Cartesian product of the tuples of $R_1$ and $R_2$, that is, to the set of all possible concatenations of one tuple from $R_1$ with one from $R_2$. The more general join operation may be intuitively thought of as a process of filtering the extended cartesian product of $R_1$ and $R_2$ by removing from the result all conjoined tuples whose respective join attributes have different values. (The computational method suggested by this interpretation, of course, would in general be impractically inefficient.)
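The filtering interpretation just described translates directly into the Python sketch below. It is offered for illustration only, deliberately follows the inefficient method of forming the full extended cartesian product, and uses names and a 1-based attribute numbering convention of our own choosing.

```python
from itertools import product


def equi_join(r1, a1, r2, a2):
    """Equi-join of r1 and r2 over compound attributes a1 and a2.

    a1 and a2 are equal-length sequences of 1-based attribute numbers.
    Each result tuple is the concatenation of one tuple from each operand.
    """
    if len(a1) != len(a2):
        raise ValueError("join attributes must have the same number of components")
    return {
        t1 + t2
        for t1, t2 in product(r1, r2)  # extended cartesian product
        if all(t1[i - 1] == t2[j - 1] for i, j in zip(a1, a2))
    }


customer_product = {
    ("American", "DC-3"),
    ("Western", "DC-10"),
    ("American", "DC-10"),
}
product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}

joined = equi_join(customer_product, (2,), product_part, (1,))
cartesian = equi_join(customer_product, (), product_part, ())
```

With empty join attributes the filter condition is vacuously true, so the second call returns all fifteen concatenations, which is the degenerate case noted above.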

The join operation is in general quite expensive on a conventional von Neumann machine, since the tuples of $R_1$ and $R_2$ must be paired for equality with respect to the join attributes before the extended cartesian product of each group of "matching" tuples can be computed. In the absence of physical clustering with respect to the join attributes (whose identity may vary in different joins over the same pair of relations), or the use of various techniques requiring a large amount of redundant storage, joining is typically accomplished most efficiently on a von Neumann machine by pre-sorting the two argument relations with respect to the join attributes. The order of the tuples following the sort is actually gratuitous information from the viewpoint of the join operation. From a strictly formal perspective, the requirements of a join (that the tuples be paired in such a way that the values of the join attribute match) are significantly weaker than those of a sort, which require that the resulting set be sequenced according to those values. The distinction is moot in the case of a von Neumann machine, where no asymptotically superior general solution to this pairing problem than sorting is presently known. One of the design goals of the NON-VON machine, however, is to make use of the weaker constraints involved in the definition of this kind of operation to obviate the need for either pre-sorting or the extravagant use of redundant storage.
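For comparison, the pre-sorting strategy described above for a von Neumann machine can be sketched as a conventional sort-merge equi-join. The Python below is illustrative only and is not a description of the NON-VON approach; the function name and the 1-based attribute numbering are assumptions of ours.

```python
from itertools import groupby


def sort_merge_join(r1, a1, r2, a2):
    """Equi-join computed by sorting both operands on their join attributes."""
    def key1(t):
        return tuple(t[i - 1] for i in a1)

    def key2(t):
        return tuple(t[j - 1] for j in a2)

    # Sort each operand, then collect runs of tuples sharing a join value.
    groups1 = [(k, list(g)) for k, g in groupby(sorted(r1, key=key1), key=key1)]
    groups2 = [(k, list(g)) for k, g in groupby(sorted(r2, key=key2), key=key2)]

    result = set()
    i = j = 0
    while i < len(groups1) and j < len(groups2):
        k1, g1 = groups1[i]
        k2, g2 = groups2[j]
        if k1 < k2:
            i += 1
        elif k1 > k2:
            j += 1
        else:
            # Cartesian product of the two groups of matching tuples.
            result.update(t1 + t2 for t1 in g1 for t2 in g2)
            i += 1
            j += 1
    return result


customer_product = {("American", "DC-3"), ("Western", "DC-10"), ("American", "DC-10")}
product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}
joined = sort_merge_join(customer_product, (2,), product_part, (1,))
```

The merge loop relies on the ordering produced by the sort only to bring matching groups together; the order itself plays no part in the result, which is a set.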

One common variant of the equi-join operator is the natural join, introduced in the example of the previous section, in which one of the two join attributes, which are redundantly represented in the result relation in the case of equi-join, is eliminated (as if by projection). Our architecture supports both the natural and equi-join in a highly efficient manner. A more general form of join often discussed in the literature is the $\theta$-join, whose definition is similar to that of the equi-join, but with the equality predicate replaced by a more general binary predicate $\theta$. (In Codd's definition, $\theta$ is defined to be one of the arithmetic operations $=$, $\neq$, $<$, $\le$, $>$ or $\ge$.) Considerations for the efficient evaluation of the general $\theta$-join operator differ in several respects from those involved in evaluating the equi-join. We will not discuss this more general case in the present paper.

Each of the other relational algebraic operators to be described in this subsection can in fact be derived from the structured operators project and join and the three unstructured set operators, and is defined here for one or both of the following reasons:

1. The operator embodies a special case of one or more of the previously defined primitives which might admit the possibility of either a less complex, or a more efficient, hardware implementation

2. The operator represents an important and frequently encountered use of some composition of the primitives defined earlier

One derived operation that occurs frequently in both practical and theoretical discussions, and which plays a particularly important role in our approach, is called selection. Most algorithms and architectures for associative retrieval implement what is essentially a process of relational selection. In the NON-VON machine, selection requires only a small, fixed amount of time, independent of the size of the database; unlike most associative processor designs, however, our architecture explicitly addresses the problems of efficiently implementing other relational operators as well. The select operator returns a subset of its single argument relation consisting of all tuples that satisfy a list of attribute/value pairs. The select operator may thus be regarded as a natural join of the argument relation with a singleton relation (a relation consisting of exactly one explicitly specified tuple) over all attributes of the singleton. More precisely, the result of a selection from relation $R$ with compound attribute $A$ and value tuple $V$ is

$$\{r : r \in R \land r[A] = V\}$$

where the corresponding $A$ and $V$ domains are again assumed to be compatible with respect to equality.

Another important derived operator is known as restriction. While restriction, like the join operator, is sometimes defined in terms of a general $\theta$, we will again be concerned only with the case where $\theta$ is the binary equality predicate. The restriction of a relation $R$ over the compound attributes $A_1$ and $A_2$ (both composed of simple attributes of $R$) is defined as

$$\{r : r \in R \land r[A_1] = r[A_2]\}$$

In its most common form, where the compound attributes $A_1$ and $A_2$ are each composed of exactly one simple attribute, the restriction operator returns all tuples of its argument relation in which the values of the two specified attributes are equal. Although restriction can be defined solely in terms of the join and project operators, an implementation based in a straightforward way on this derivation would be considerably more complex and inefficient than one specifically tailored to support the restrict operator. Restriction is an important enough operation in practice that we have treated the capacity for direct (and efficient) evaluation of restrictive expressions as a significant design objective.
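The selection and restriction operators are both simple filters over a single argument relation, as the following Python sketch suggests. It is illustrative only; the function names and the use of 1-based attribute numbers are our assumptions, and the code implements only the equality case considered in this paper.

```python
def select(relation, attrs, values):
    """Tuples r of `relation` for which r[A] equals the value tuple V."""
    values = tuple(values)
    return {r for r in relation if tuple(r[j - 1] for j in attrs) == values}


def restrict(relation, a1, a2):
    """Tuples r of `relation` for which r[A1] equals r[A2]."""
    return {
        r
        for r in relation
        if tuple(r[i - 1] for i in a1) == tuple(r[j - 1] for j in a2)
    }


customer_product = {("American", "DC-3"), ("Western", "DC-10"), ("American", "DC-10")}
product_part = {
    ("DC-10", "wheel"),
    ("DC-10", "engine-mount"),
    ("DC-3", "oxygen-mask"),
    ("DC-10", "oxygen-mask"),
    ("DC-10", "radio"),
}

# Selection: the parts of the DC-3.
dc3_parts = select(product_part, (1,), ("DC-3",))

# Restriction applied to the extended cartesian product of the two relations,
# comparing the PRODUCT attributes (positions 2 and 3 of each 4-tuple).
cartesian = {t1 + t2 for t1 in customer_product for t2 in product_part}
matching = restrict(cartesian, (2,), (3,))
```

Restricting the cartesian product in this way yields the same nine tuples as the equi-join of the two relations over their PRODUCT attributes, which illustrates how closely the derived operators are related to the join.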

Finally, we must acknowledge a derived operation that has considerable theoretical and practical importance in many applications, but to which we have devoted little special attention in our evaluation of alternative architectures. This operation, called division, is used to achieve the effects of universal quantification within the queries of a language based on the relational calculus (Codd [1972]) and may well be worthy of special attention in the course of designing a generally-applicable relational database machine. Since it was not necessary that this kind of operation be implemented efficiently in its full generality for purposes of our AI application, however, the relational division operator will not be given the same sort of special consideration in this paper as the other two derived operators described above.

The sample query formulated in Subsection 7.3 made use of two logical predicates, each associated with an explicitly defined relation that might be stored in an "extensional database" of the sort used in our demonstration system. We refer to a predicate of this kind, whose meaning derives from its association with a relation whose constituent tuples are explicitly enumerated, as a primitive predicate. As we shall see in Subsection 7.7, though, our knowledge-based retrieval task requires the use of another kind of predicate, which we will call a defined predicate.

A defined predicate corresponds to no fixed, explicitly defined relation, as does a primitive predicate, but instead derives its meaning from a proper (non-logical) axiom expressing its equivalence to a well-formed formula involving other (defined or primitive) predicates. (Our notion of a defined predicate is closely related to that of a view, as defined in the relational database literature, and to other constructs that have been introduced periodically by researchers in other areas of computer science; it is our use of defined predicates that should be of interest here.) Following a convention employed in certain recent work on logic and databases, we will sometimes refer to the set of defined predicates as the intensional database of our system (by contrast with the extensional database, which is composed of relations specified explicitly by enumerating all their tuples, each of which corresponds to a primitive predicate in the logical query language). In the interest of perspicuity, we adopt the convention of surrounding the name of each primitive predicate by asterisks, as in the examples we have already seen, while the names of defined predicates will not be so enclosed.

By way of illustration, let us consider a very simple example of a defined predicate, called CUSTOMER--PART, whose body is precisely the well-formed formula from the sample query introduced in Subsection 7.3:

$$\text{CUSTOMER--PART}(x, z) \;\equiv\; \exists y.\, \big(\text{*CUSTOMER--PRODUCT*}(x, y) \,\land\, \text{*PRODUCT--PART*}(y, z)\big)$$

In this example, the defined predicate CUSTOMER--PART is defined in terms of the two primitive predicates *CUSTOMER--PRODUCT* and *PRODUCT--PART*, allowing its reduction to predicates whose interpretations are explicitly available in the extensional database. The newly defined predicate could thus be used as a sort of "shorthand" for the sample query, which might now be expressed as

$$(x, z) : \text{CUSTOMER--PART}(x, z)$$

Although intensionally defined predicates (more precisely, their analogues within the calculus of relational expressions) were not included as part of Codd's original formalism for the relational model of data, more recent work by several researchers suggests one possible manner in which defined predicates might be employed within a first-order query. At the risk of oversimplification, this approach (exemplified by Reiter [1977] and by Chang and Lee [1973]) involves a two-step procedure for the evaluation of queries. During the first step, the query (which may involve both primitive and defined predicates), along with the predicate definition axioms found in the intensional database, is manipulated using automatic theorem-proving techniques to remove all occurrences of the defined predicates, yielding a transformed query containing only primitive predicates. The resulting transformed query is then evaluated using ordinary database retrieval techniques. As we shall see in the next subsection, however, the lack of extensional feedback in the course of intensional manipulation introduces certain problems which preclude the possibility of a straightforward application of this approach to our knowledge-based retrieval task.

7.6 Recursive predicate definition


From the perspective of our own application, one of the most serious limitations of the approach to retrieval that we informally outlined at the end of the last subsection relates to the inability to support recursively-defined predicates. Before considering the role of recursive predicate definition within our system for knowledge-based retrieval, let us consider a very simple example of a recursively-defined predicate. Imagine for the moment that the extensional database contained a binary relation, called *CHILD--PARENT*, asserting that Suzanne has Charles-Jr and Marilyn as parents, while Charles-Jr is in turn the son of Charles-Sr and Lucille, etc., as illustrated below:

CHILD      | PARENT
Suzanne    | Charles-Jr
Suzanne    | Marilyn
Charles-Jr | Charles-Sr
Charles-Jr | Lucille
Marilyn    | Benjamin
Marilyn    | [illegible]

Suppose now that we wished to construct a defined predicate DESCENDANT--ANCESTOR having the value true whenever its first argument is either the child, grandchild, great-grandchild, etc. of its second argument.

If it were not for the problem of recursion within a predicate definition, the DESCENDANT--ANCESTOR predicate could be defined as

$$\text{DESCENDANT--ANCESTOR}(x, z) \;\equiv\; \text{*CHILD--PARENT*}(x, z) \;\lor\; \exists y.\, \big(\text{*CHILD--PARENT*}(x, y) \,\land\, \text{DESCENDANT--ANCESTOR}(y, z)\big)$$
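To make the recursive definition concrete, the following Python sketch evaluates it directly for a bound first argument. It is an illustration under stated assumptions, not the LSEC algorithm: the relation holds five rows of the example *CHILD--PARENT* relation (names as read from the table), the function name `ancestors` is hypothetical, and termination relies on the data being acyclic and on skipping the recursive conjunct whenever *CHILD--PARENT*(x, y) has no instantiations for the current x.

```python
# Five rows of the example *CHILD--PARENT* relation, as (child, parent) pairs.
CHILD_PARENT = {
    ("Suzanne", "Charles-Jr"),
    ("Suzanne", "Marilyn"),
    ("Charles-Jr", "Charles-Sr"),
    ("Charles-Jr", "Lucille"),
    ("Marilyn", "Benjamin"),
}


def ancestors(x):
    """Return every z for which DESCENDANT--ANCESTOR(x, z) holds."""
    # *CHILD--PARENT*(x, y): instantiations of y for the current binding of x.
    parents = {p for (c, p) in CHILD_PARENT if c == x}
    if not parents:
        # No instantiations: the recursive conjunct is never evaluated.
        return set()
    result = set(parents)  # first disjunct: *CHILD--PARENT*(x, z)
    for y in parents:      # second disjunct: recurse through each parent y
        result |= ancestors(y)
    return result
```

Calling `ancestors("Suzanne")` walks two generations and then stops, because none of the grandparents appears as a child in the relation.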

Note, however, that the first step (query transformation) of the hypothetical two-step process outlined in the previous subsection could no longer simply replace all occurrences of the defined predicate DESCENDANT--ANCESTOR with its body, since that body itself contains another invocation of DESCENDANT--ANCESTOR; the theorem-proving technique would thus either fail to terminate, or fail to find all possible results of the query. The approach that we adopted in the LSEC algorithm to the problem of recursive predicate definition uses intermediate results of the query evaluation in order to terminate computation after all potentially relevant results have been obtained, avoiding computational loops based on the endless expansion of cycles of mutually-defined predicates. While we will not examine the details of the LSEC algorithm at this point, the basic mechanism by which LSEC handles recursion is quite simple. In essence, our approach involves the treatment of the conjunctive operator in a query well-formed formula as a "computational" (as opposed to a strictly "logical") AND. As in the case of the LISP AND, the operands of such a conjunction are evaluated in left-to-right order until the first "failure"; specifically, until the first case in which application of the LSEC procedure to some operand yields a false-extension (defined rigorously in Section 8), which may be regarded as a generalization of the boolean constant false. In the above example, DESCENDANT--ANCESTOR(y, z) will not be recursively evaluated after the primitive predicate *CHILD--PARENT*(x, y) is found to have no possible instantiations consistent with the current binding of x, that is, after reaching the level of the "grandparents".

In passing, it should be acknowledged that the importance of recursive predicate definition within our own application may well reflect characteristics specific to our thesis task. We are thus unable to make any claim regarding the suitability of our approach for use in conventional relational database applications. Indeed, Gallaire, et al. [1978], in reviewing the problems of recursive predicate definition, suggest that "it is not clear that one should permit recursive axioms (in the intensional database) for realistic problems". The suspicion that the additional theoretical power provided by recursive predicate definition may have marginal significance in the context of contemporary practical database requirements has been echoed by other authors. While our own areas of expertise do not permit a qualified opinion in this regard, it should be emphasized that in our application, in which defined predicates are not directly incorporated in a user-level description language, but instead are used internally to express and implement the semantics of a higher-level description language, recursive predicate definition is essential.

In the following subsection, we will examine the way in which recursively defined predicates in the intensional database are used to define the semantics of our knowledge-based description language.

7.7 Axioms defining the match semantics

Having now introduced the essential logical and relational tools, let us now consider the manner in which a user's knowledge-based pattern description is matched against the collection of candidate target descriptions according to the axioms defined in the match specification. In each such query task, regardless of the particular pattern description, the "top-level" logical query has the form

$$(x) : \text{DTION-IMP-DTION}(x, \textit{pat-dtion})$$

The intended result of this query is a unary relation, each of whose tuples has as the value of its single attribute a particular target description (more precisely, the primitive domain element that anchors the internal form of that target description) which matches the given pattern description according to the rules defined in the match specification. Specifically, the match specification comprises a definition of the defined predicate DTION-IMP-DTION in terms of other predicates, some of which are themselves defined in other axioms, and so on.


Figure 7.2 Axioms Defining the Match Semantics

The names of each of the 28 axioms defining the match semantics of our knowledge-based description language are listed in Figure 7.2. We have indented this list to indicate which predicates are defined in terms of which other predicates, with the name of a defined predicate shown below, and indented by one unit with respect to, the name of the defined predicate in whose body it is first used. (The "top-level" defined predicate DTION-IMP-DTION, for example, is defined in terms of the defined predicate DTION-DIRECTLY-IMP-DTION.) Exceptional definitional directions are indicated explicitly using arrows. (PERSPECTIVE-DTOR-IMP-PERSPECTIVE-DTOR, for example, is defined in terms of the top-level defined predicate DTION-IMP-DTION.) Each of the upward-pointing arrows thus identifies a recursive predicate definition loop, underscoring the central role such definitions occupy in our thesis system.

Although it will not be necessary at this point to consider the details of each defined predicate in the match specification, it may be instructive to consider one typical such predicate in an attempt to convey some feeling for the kind of information embodied in these matching rules. To this end, we consider the defined predicate PERSPECTIVE-DTOR-IMP-PERSPECTIVE-DTOR, which implements what we may call "syntactic perspective matching":


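$$
\begin{aligned}
&\exists\, \textit{proto}.\ \big(\text{Persp-proto}(\textit{pat-per}, \textit{proto}) \,\land\, \text{Persp-proto}(\textit{tar-per}, \textit{proto})\big) \;\land \\
&\forall\, \textit{slot}.\ \big(\exists\, \textit{pat-fill}.\ \text{Obj-slot-fill}(\textit{pat-per}, \textit{slot}, \textit{pat-fill})\big) \\
&\qquad \rightarrow\ \exists\, \textit{pat-fill}, \textit{tar-fill}.\ \big(\text{Obj-slot-fill}(\textit{pat-per}, \textit{slot}, \textit{pat-fill}) \\
&\qquad\qquad \land\ \text{Obj-slot-fill}(\textit{tar-per}, \textit{slot}, \textit{tar-fill}) \\
&\qquad\qquad \land\ \text{Dtion-imp-dtion}(\textit{tar-fill}, \textit{pat-fill})\big)
\end{aligned}
$$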

In the course of syntactic perspective matching, all target perspectives that match a given pattern perspective and have the same prototype as that pattern perspective are identified. By contrast with semantic perspective matching, this procedure does not identify those target perspectives that have different prototypes than that of the pattern, but that would in fact satisfy the matching criteria on the basis of domain-specific specialization relationships derived from the knowledge base. The meaning of this defined predicate is reasonably straightforward: in order for a "target-perspective-dtor" to match a "pattern-perspective-dtor" on the basis of this defined predicate, the prototype of each must be the same, and for each slot which is filled in the pattern, the corresponding slot must be filled in a compatible way (as specified by the recursive call to DTION-IMP-DTION) within the target.

The full set of 25 axioms that together constitute the match specification for our knowledge-based description language are presented as an appendix to the author's doctoral dissertation [Shaw, 1980].

Having now examined an example of the use of well-formed formulae within our demonstration system, it seems appropriate to explicitly list the respects in which our logic-based query and predicate definition language is in fact restricted by comparison with the full language of first-order predicate calculus. First, function symbols (which in fact add no formal expressive power to the predicate calculus) have been eliminated from consideration. Second, our present application has not required the use of explicit negation; the NOT operator has thus been omitted from our language as well. Note that by contrast with the case of functions, the addition of negation would significantly expand the expressive capabilities of our language; we consider such an extension to represent a potentially important avenue along which our own work might be extended. Because negation introduces a number of surprisingly difficult problems when used in a computationally effective description language, however, we have chosen to omit this construct from consideration in our work.


Finally, we have intentionally permitted only a restricted form of universal quantification. Rather than permit universal quantification of the general form

$$\forall x.\, P(x)$$

we restrict P(x) to be of the form

$$Q(x) \rightarrow B(x)$$

where Q(x), called the qualification clause, is further restricted to contain no disjunction or universal quantification, while the body B(x) is an unrestricted well-formed formula. While detailed consideration of universal quantification will be deferred to Section 8, we note for now that the qualification clause serves the important practical function of restricting the plausible range of universally quantified variables, thus limiting the set of "x values" which must be considered to those values for which Q(x) is satisfied.

8. The LSEC Algorithm for Logical Satisfaction

In this section, we will describe the function of the LSEC algorithm, through which the connection between logical descriptive mechanisms and actual relational algebraic operations is established in our demonstration system. The LSEC algorithm has been fully implemented in our demonstration system and tested on carefully chosen examples designed to exercise each portion of the algorithm.

We begin in Subsection 8.1 with an example of the use of LSEC in evaluating the results of a simple conventional database query, avoiding many of the spurious complications arising in our knowledge-based retrieval application that are peripheral to the essential operation of the LSEC algorithm. In the remainder of the section, we describe the behavior of the algorithm upon encountering each of the six types of logical formula used in constructing the match specification, ending with the procedure by which the result relation is constructed.

8.1 A simple example

The example we have chosen to illustrate the process of extensional constraint has the virtue of illustrating the essential behavior of the LSEC algorithm, but may well seem a bit contrived to the reader. To be sure that our somewhat unnatural example does not obscure the features we will be attempting to illustrate, we thus begin with a brief discussion of its "real-world" setting.

In our example, a hypothetical capitalist wishes to affect the behavior of major U.S. corporations by exerting an indirect influence on key individuals within those corporations, specifically, on the officers and directors of those firms. This indirect influence is in turn to be mediated by the attorneys and accountants of these key individuals, on whom these individuals are presumed to rely for information and advice. The capitalist might succeed, for example, in influencing the behavior of the corporation by bribing its president's attorney, or the accountant of one of its directors. To this end, our capitalist wishes to review a list of professionals (attorneys and accountants), paired with corporations on whom these professionals could ultimately exert an influence through some third party whose identity is of no concern to the capitalist.


ATTY  | CLNT
Atty1 | Jones
Atty1 | Stone
Atty2 | Jones

ACCT  | CLNT
Acct1 | Smith
Acct1 | Jones

OFCR  | CORP
Smith | XCorp

DRTR  | CORP
Jones | YCorp
Jones | XCorp
Smith | XCorp

The first associates attorneys with their clients; the second, accountants with their clients; the third, officers with the corporations by which they are employed; the fourth, directors with the corporations on whose boards they serve. The binary relation which our hypothetical capitalist wishes to review may be described using the logical query

$$
\begin{aligned}
(\textit{professional}, \textit{corporation}) :\ \exists\, \textit{client}.\ \big(&(\text{*ATTY--CLNT*}(\textit{professional}, \textit{client}) \,\lor\, \text{*ACCT--CLNT*}(\textit{professional}, \textit{client})) \;\land \\
&(\text{*OFCR--CORP*}(\textit{client}, \textit{corporation}) \,\lor\, \text{*DRTR--CORP*}(\textit{client}, \textit{corporation}))\big)
\end{aligned}
$$

Without concentrating on the details of the algorithm, we will now sketch the manner in which LSEC finds the results of this query.

The most important data structure maintained and manipulated by the LSEC algorithm is called the Accumulated Disjoint Extension (sometimes referred to as the ADE, or somewhat less precisely, as simply the extension). Loosely speaking, the ADE may be thought of as a set of relations, each of which describes one way in which that part of the logical formula thus far encountered may be satisfied. (In one sense, the ADE may thus be thought of as a sort of dynamic, extensional analogue to the intensional notion of a disjunctive normal form.) The initial ADE in any LSEC invocation is always the distinguished ADE true-extension, which is a set consisting of the single distinguished relation containing no attributes and no tuples (the true-relation). In the course of processing the top-level query (and the defined predicate bodies introduced in the course of expanding that query), this ADE is constrained by the various logical subformulae; after the whole query has been processed, the result relation may be extracted (in a manner detailed later in this section) from the final ADE.

The tree-traversal algorithm embodied in the LSEC algorithm causes the initial ADE (true-extension) to be constrained first by the disjunction

$$(\text{*OFCR--CORP*}(\textit{client}, \textit{corporation}) \,\lor\, \text{*DRTR--CORP*}(\textit{client}, \textit{corporation}))$$

The resulting ADE is a set of two relations, the first corresponding to influence exerted through officers, and the second through directors:

Smith | XCorp

Jones | YCorp
Jones | XCorp
Smith | XCorp

Constraining this ADE further to reflect the content of the first conjoined subformula of the body of our query (to reflect the fact that access to these key individuals may be through either their attorneys or accountants), LSEC "multiplies through" the original *ATTY--CLNT* and *ACCT--CLNT* relations by joining each with each of the two terms (constituent relations) in the ADE over the common existentially quantified variable client. Because none of the attorneys in our example has as a client the lone individual (Smith) who serves as a corporate officer, one of the four resulting terms will be a false-relation (a relation having an attribute structure, but no tuples), which will be eliminated from the ADE, yielding the three-relation ADE

Atty1 | Jones | YCorp
Atty1 | Jones | XCorp
Atty2 | Jones | YCorp
Atty2 | Jones | XCorp

Acct1 | Smith | XCorp

Acct1 | Jones | YCorp
Acct1 | Jones | XCorp
Acct1 | Smith | XCorp

Notice that by existentially quantifying client, instead of leaving it free in the query well-formed formula, we have indicated that the identity of the client is of no concern to the end-user; it is used only to establish that some link between professional and corporation exists. Upon exiting from the scope of the existentially quantified formula, we thus project out the CLNT attribute from all terms in the current ADE. The result (after eliminating one redundant tuple in the third relation) is

Atty1 | YCorp
Atty1 | XCorp
Atty2 | YCorp
Atty2 | XCorp

Acct1 | XCorp

Acct1 | YCorp
Acct1 | XCorp

Finally, the union of all relations in the ADE is taken, thus combining all its terms into a single result relation:

Atty1 | YCorp
Atty1 | XCorp
Atty2 | YCorp
Atty2 | XCorp
Acct1 | XCorp
Acct1 | YCorp

Note that during the performance of this final union operation, one more tuple becomes redundant, and must be eliminated.
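The walk-through above can be traced with a short Python sketch. This is only an illustration under stated assumptions: relations are lists of dictionaries, the ADE is a plain list of such relations, the attribute name `PROF` (shared by the attorney and accountant relations) and the helper names are hypothetical, and the relation contents are those of the example as read from its tables.

```python
def join(r, s):
    """Natural join of two relations (lists of dicts), without duplicates."""
    out = []
    for t in r:
        for u in s:
            if all(t[a] == u[a] for a in t.keys() & u.keys()):
                merged = {**t, **u}
                if merged not in out:
                    out.append(merged)
    return out


def project_out(r, attr):
    """Drop one attribute from every tuple, eliminating redundant tuples."""
    out = []
    for t in r:
        reduced = {k: v for k, v in t.items() if k != attr}
        if reduced not in out:
            out.append(reduced)
    return out


def rel(attrs, rows):
    return [dict(zip(attrs, row)) for row in rows]


ATTY_CLNT = rel(("PROF", "CLNT"), [("Atty1", "Jones"), ("Atty1", "Stone"), ("Atty2", "Jones")])
ACCT_CLNT = rel(("PROF", "CLNT"), [("Acct1", "Smith"), ("Acct1", "Jones")])
OFCR_CORP = rel(("CLNT", "CORP"), [("Smith", "XCorp")])
DRTR_CORP = rel(("CLNT", "CORP"), [("Jones", "YCorp"), ("Jones", "XCorp"), ("Smith", "XCorp")])

# Step 1: the officer/director disjunction yields a two-term ADE.
ade = [OFCR_CORP, DRTR_CORP]

# Step 2: "multiply through" by the attorney and accountant relations,
# discarding any term that comes out empty (a false-relation).
terms = []
for professionals in (ATTY_CLNT, ACCT_CLNT):
    for term in ade:
        joined = join(professionals, term)
        if joined:
            terms.append(joined)
ade = terms

# Step 3: leaving the scope of the existential quantifier projects out CLNT.
ade = [project_out(term, "CLNT") for term in ade]

# Step 4: the union of all terms gives the result relation.
result = []
for term in ade:
    for t in term:
        if t not in result:
            result.append(t)
```

The duplicate checks in `project_out` and in the final loop correspond to the two points in the text where a redundant tuple is eliminated.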

By way of summary, then, the LSEC algorithm starts with an unconstrained extension (the ADE true-extension), which is then constrained by the query formula. In our demonstration system, the initial query is always the defined predicate

$$(x) : \text{DTION-IMP-DTION}(x, \textit{pat-dtion})$$

whose body is expanded in the course of the algorithm, with different types of logical formula "surfacing" at different depths within this expansion. In the remainder of this section, we will review the behavior of the LSEC algorithm upon encountering each of these logical formula types.

The processing of existentially quantified formulae within the top-level query, or within some defined predicate body encountered in the course of evaluating this query, is quite simple, but provides a good introduction to the use of one of the essential data structures manipulated by the algorithm: the logical variable stack. (Although it will be referred to informally as simply "the stack", the logical variable stack should not be confused with the stack maintained by the underlying LISP system, which maintains the bindings of the λ-variables involved in execution of our demonstration system.)

A new stack frame is created each time an existentially or universally quantified formula, or a defined predicate, is encountered, to hold certain information about each quantified variable or formal parameter (in the case of the defined predicate) introduced within the current formula. There are two components to this information: the first specifies the variable's type, which may be either constant-valued or attribute-id-valued, while the second is the value itself. Quantified variables, though, are always of the latter type: upon encountering the existentially quantified formula, a unique name called an attribute identifier (attribute-id) is generated for each existentially quantified variable to designate a particular attribute that will ultimately appear in one or more relations in the ADE. Although the same attribute may be referenced using different names within the bodies of different defined predicates, each such name will be "bound" to the same attribute-id at different locations within the logical variable stack. Conversely, the same variable name may appear in various frames within the current stack in the case where the same variable name is re-used (possibly from the same place within two nested invocations of the same defined predicate) within a scope in which it has already been defined. In each of its appearances, such a name may be bound to a different attribute-id; in such cases, the most recently added stack frame (the frame on the "top" of the stack) is treated as "current".
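A minimal Python sketch of such a logical variable stack follows. The class name `VariableStack`, its methods, and the `attr-N` naming scheme for attribute-ids are assumptions made for illustration; the text specifies only that each frame records a type and a value per variable and that the topmost binding of a re-used name is the current one.

```python
import itertools


class VariableStack:
    """Stack of frames mapping variable names to (type, value) pairs."""

    def __init__(self):
        self.frames = []
        self._counter = itertools.count(1)

    def new_attribute_id(self):
        # Generate a fresh, unique attribute identifier.
        return f"attr-{next(self._counter)}"

    def push_quantified(self, names):
        # Quantified variables are always attribute-id-valued.
        frame = {n: ("attribute-id-valued", self.new_attribute_id()) for n in names}
        self.frames.append(frame)
        return frame

    def pop(self):
        return self.frames.pop()

    def lookup(self, name):
        # The most recently added frame that binds the name is "current".
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise KeyError(name)


stack = VariableStack()
stack.push_quantified(["client"])     # outer scope binds client
outer = stack.lookup("client")
stack.push_quantified(["client"])     # nested scope re-uses the same name
inner = stack.lookup("client")
stack.pop()                           # leaving the nested scope
restored = stack.lookup("client")
```

After the nested frame is popped, the name `client` refers once more to the attribute-id generated for the outer scope.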

The processing of existentially quantified formulae is now quite simple to describe. Upon entering such a formula, LSEC first creates a new stack frame containing information about each of the existentially quantified variables introduced in the current formula. Next, the current ADE is (recursively) constrained by its body, which is implicitly treated as a conjunction of one or more subformulae (see Subsection 8.4). Upon exit from the existentially quantified formula, each of these existentially quantified variables is projected out of each term in the resulting ADE for reasons of efficiency. Readers familiar with Codd's constructive completeness proof [Codd, 1972] may notice that our maintenance of a logical variable stack serves what may be regarded as a dynamic analogue of the process of conversion to prenex normal form, but adapted to the case where the required renaming operations cannot be determined statically on a purely lexical basis.

As in the case of the existential formula, processing of universally quantified formulae begins with the generation of new attribute-ids to identify each of the newly-introduced universally quantified variables, and a new stack frame is created to record the type ("attribute-id-valued") and value (the newly generated unique attribute-id name) of each such variable. The substantive portion of the processing of universally quantified formulae by LSEC, however, is considerably more complex than the corresponding part of the procedure for processing existentially quantified formulae. As noted in Subsection 7.7, all such formulae are of the form

$$\forall x.\, Q(x) \rightarrow B(x)$$

where the qualification clause Q(x) is restricted to contain no disjunction or universal quantification. This qualification clause is used to restrict the range of the universally quantified variable x in each of the possible contexts defined by alternative joint instantiations of the various quantified variables that are "visible" within the current scope. We now consider in some detail the manner in which LSEC constrains the current ADE by a universally quantified formula.

Recall first that the ADE is in general a set of several relations, called the terms of that ADE. LSEC constrains each term by the universal formula, then takes the union of the results to form a new ADE. In order to constrain a given extension term by a universal formula, LSEC must first identify a crucial set of variables called the context variables of that formula. The context variables are precisely those attribute-id-valued variables that appear within the qualification clause, but whose innermost scope is not local to the qualification clause. (Our somewhat unwieldy definition is required to ensure that a variable appearing within the qualification clause at a point where its name is bound both locally, that is, within the qualification clause, and globally is not treated as a context variable.)

LSEC next computes what is called the qualified extension term by (recursively) constraining the extension term by the qualification clause (Q(x)) of the universal formula. The qualified extension term is then projected over the attributes corresponding to each of the context variables (as determined by the current stack bindings for those context variables), yielding a set of context tuples. Each such tuple defines one context for evaluation of the body of the universally quantified formula. In intuitive terms, a context may be thought of as containing all relevant information about one possible way in which the qualification clause might be satisfied, that is, one "such that" condition in a universal formula which might be described by the English assertion "for all x such that Q(x) is true, B(x) must also be true". Note that it is not sufficient in general to find all x satisfying the "such that ..." clause, and to simply substitute all such x into the body of the universal formula to obtain a new set of constraining formulae. In order to avoid excluding joint instantiations which in fact might well satisfy the top-level query, information regarding constraints on variables global to the universal formula must be propagated along with each such x value.

Each of these context tuples will ultimately contribute to the result for the current extension term, the partial results due to each being appended together at the end to form a new extension. Let us now consider the manner in which a given context tuple is used to constrain the qualified extension term by the body of the universally quantified formula. First, the qualified extension term is selected on the attributes and values specified in the context variable list and the current context tuple, respectively, to obtain the context-bound extension slice for that context tuple. By projecting the context-bound extension slice over the universally quantified variable, it is now possible to identify the effective range of the universally quantified variable within the current context, resulting in a list of context-bound universally quantified values.

To obtain the partial result, the universally quantified variable is first projected out of the qualified extension term slice. The result (corresponding to the current context tuple) is then constrained by various instantiated versions of the body in succession: first, by a version of the body with the first context-bound universally quantified value substituted for the universally quantified variable, then with the second such value substituted for that variable, and so on for each of the possible values within the effective (context-bound) range of the universally quantified variable. The result corresponding to this context tuple is now combined with those derived from each of the others to obtain a version of the original extension term constrained by the universally quantified formula. As noted above, the final result is obtained by taking the union of the results due to each extension slice in the original ADE.

As in the case of existential quantification, the universally quantified variable is projected out of the result upon leaving the scope of the universally quantified formula for reasons of efficiency.
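The context-tuple procedure for a single extension term can be sketched as follows. This is a deliberately simplified illustration, not the LSEC procedure itself: the body B(x) is modelled as a Boolean test `body_holds(context, value)` rather than as a formula that further constrains the extension, and all names and data (`TAR`, `SLOT`, `fills`) are hypothetical.

```python
def project(rel, attrs):
    """Project a relation (list of dicts) onto attrs, without duplicates."""
    out = []
    for t in rel:
        reduced = {a: t[a] for a in attrs}
        if reduced not in out:
            out.append(reduced)
    return out


def select(rel, pairs):
    """Keep the tuples that agree with every attribute/value pair."""
    return [t for t in rel if all(t[a] == v for a, v in pairs.items())]


def constrain_universal(qualified_term, context_attrs, x_attr, body_holds):
    """Constrain one qualified extension term by a universal formula."""
    result = []
    for context in project(qualified_term, context_attrs):  # context tuples
        slice_ = select(qualified_term, context)  # context-bound extension slice
        x_values = [t[x_attr] for t in project(slice_, [x_attr])]  # effective range
        others = [a for a in slice_[0] if a != x_attr]
        partial = project(slice_, others)  # x projected out of the slice
        if all(body_holds(context, v) for v in x_values):
            result.extend(t for t in partial if t not in result)
    return result


# Hypothetical data: every slot filled in the pattern must be filled in the target.
fills = {("t1", "color"), ("t1", "size"), ("t2", "color")}
qualified = [
    {"TAR": "t1", "SLOT": "color"}, {"TAR": "t1", "SLOT": "size"},
    {"TAR": "t2", "SLOT": "color"}, {"TAR": "t2", "SLOT": "size"},
]
matches = constrain_universal(
    qualified, ["TAR"], "SLOT",
    lambda context, slot: (context["TAR"], slot) in fills,
)
```

Target `t2` is rejected because the body fails for one value in its effective range, while `t1` survives with the quantified attribute projected out.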

8.4 Conjunction

It is in the case of a conjunctive formula that the process of progressive constraint which underlies the operation of the LSEC algorithm is most evident. Upon encountering a set of conjoined subformulae

$$P(x) \land Q(x) \land R(x) \land \cdots$$

the old ADE is first constrained by P(x); the result is then further constrained by Q(x), then by R(x), and so on until either the extension has been constrained by the last conjoined subformula or a false-extension is returned by one such constraining step, in which case computation terminates with the value false-extension.
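The left-to-right, short-circuiting behavior can be sketched as below. The representation is an assumption made for illustration: each conjoined subformula is a Python callable that maps an ADE to a constrained ADE, the false-extension is modelled as an ADE with no terms, and the true-extension as an ADE whose single term holds one empty tuple.

```python
FALSE_EXTENSION = []  # assumption: an ADE with no terms


def constrain_by_conjunction(ade, operands):
    """Constrain the ADE by each operand in turn, stopping at the first failure."""
    for constrain in operands:
        ade = constrain(ade)
        if not ade:
            return FALSE_EXTENSION  # later operands are never evaluated
    return ade


calls = []


def keep(ade):
    calls.append("keep")
    return ade


def fail(ade):
    calls.append("fail")
    return FALSE_EXTENSION


def never_reached(ade):
    calls.append("never_reached")
    return ade


true_extension = [[{}]]  # assumption: one term holding a single empty tuple
outcome = constrain_by_conjunction(true_extension, [keep, fail, never_reached])
```

The `calls` log shows that the third operand is skipped once the second returns a false-extension, which is the property that lets recursive definitions terminate.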


8.5 Disjunction

In the course of constraining the ADE by a disjunctive formula, the "width" of the ADE (more precisely, the number of relations which it comprises) is, in the general case, increased to reflect a larger number of alternative ways in which the query might be satisfied. To constrain the current ADE by a disjunction, for example, two copies of the ADE are made; one is constrained by P(x), the other by Q(x), and the results are combined to form the new ADE.
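A minimal sketch of the copy-and-combine step follows. It assumes, purely for illustration, that an ADE is a Python list of relations (each a set of values) and that "combining" means concatenating the two constrained copies; the helper names are hypothetical.

```python
def filter_ade(pred):
    """Build a constraining step that filters every relation in an ADE."""
    def constrain(ade):
        constrained = [{x for x in rel if pred(x)} for rel in ade]
        return [rel for rel in constrained if rel]  # drop empty relations
    return constrain


def constrain_by_disjunction(ade, constrain_p, constrain_q):
    """Constrain one copy of the ADE by P, another by Q, then combine them."""
    copy_p = [set(rel) for rel in ade]
    copy_q = [set(rel) for rel in ade]
    return constrain_p(copy_p) + constrain_q(copy_q)


ade = [{1, 2, 3, 4}]
new_ade = constrain_by_disjunction(
    ade,
    filter_ade(lambda x: x % 2 == 0),  # stands in for P(x)
    filter_ade(lambda x: x > 3),       # stands in for Q(x)
)
```

Here the width grows from one relation to two, one per alternative way of satisfying the disjunction.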

The processing of a defined predicate within the LSEC algorithm is analogous to the binding of λ-variables within LISP. A new stack frame is created to associate with the formal parameters all relevant information inherited from the actual parameters. At the time of binding, each formal parameter is designated as either a constant-valued variable, if the corresponding actual parameter is either a constant or has itself already been classified as a constant-valued variable, or an attribute-id-valued variable, in the case where the corresponding actual parameter is itself attribute-id-valued (by virtue of having been bound at some level to a quantified variable).

Following creation of the new stack frame, the current ADE is (recursively) constrained by the body of the defined predicate, with its current bindings. For syntactic simplicity, the body of a defined predicate may be a list of conjoined subformulas, and need not be an explicit conjunction. Typically, then, a defined predicate is processed by establishing new bindings and then evaluating the list of conjoined subformulas which make up its body.
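The classification of formal parameters at binding time might be sketched as below. The `Var` and `Binding` types, the frame layout and all names are assumptions made for illustration; the paper does not specify a concrete representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable reference appearing as an actual parameter (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Binding:
    kind: str      # "constant" or "attribute-id"
    value: object  # the constant itself, or an attribute-id


def bind_parameters(formals, actuals, caller_frame):
    """Create a new stack frame mapping each formal to an inherited binding."""
    frame = {}
    for formal, actual in zip(formals, actuals):
        if isinstance(actual, Var):
            # A variable passes on whatever classification it already has.
            frame[formal] = caller_frame[actual.name]
        else:
            # A literal actual parameter yields a constant-valued variable.
            frame[formal] = Binding("constant", actual)
    return frame


caller_frame = {
    "d": Binding("attribute-id", "attr_7"),
    "t": Binding("constant", "report"),
}
frame = bind_parameters(["a", "b", "c"], [Var("d"), Var("t"), 1981], caller_frame)
```

After binding, the body would be evaluated against `frame`, much as a LISP function body is evaluated in the environment of its λ-bound variables.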

8.7 Primitive predicates

It is in the processing of primitive predicates that most of the computational effort of the LSEC algorithm is expended. For a simple illustration of the computationally demanding aspects of this process, consider the processing of a simple query whose body is the conjunction P(x, z) ∧ Q(x, y), where P and Q are both primitive predicates.

Since the body of the query is a conjunction, the initial ADE (true-extension) is first constrained by the primitive predicate P(x, z). Constraint of the true-extension by a primitive predicate is treated as a special case by the LSEC algorithm. The resulting ADE contains a single relation, the independent extension of the primitive predicate in question. The independent extension of a primitive predicate is defined as the result of selecting the corresponding primitive relation in the extensional database on the values of any constant-valued arguments that may be specified, then projecting out the attributes corresponding to all such constant-valued arguments. In our example, P has no constant-valued arguments; the result is thus the degenerate case of an independent extension: a new ADE consisting of the single primitive relation corresponding to P.
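The definition of an independent extension (select on the constant-valued arguments, then project those attributes out) can be sketched in a few lines. The representation of a relation as a set of tuples and of arguments as `("const", value)` or `("var", name)` pairs is an assumption for illustration only.

```python
def independent_extension(relation, args):
    """Select `relation` on its constant-valued arguments, then project
    those columns out. Returns (attribute names, set of rows)."""
    const_cols = [(i, v) for i, (kind, v) in enumerate(args) if kind == "const"]
    var_cols = [(i, v) for i, (kind, v) in enumerate(args) if kind == "var"]
    attrs = tuple(name for _, name in var_cols)
    rows = {
        tuple(row[i] for i, _ in var_cols)
        for row in relation
        if all(row[i] == v for i, v in const_cols)
    }
    return attrs, rows


# Hypothetical primitive relation for P.
P_REL = {("d1", "ai"), ("d2", "db"), ("d3", "ai")}

selected = independent_extension(P_REL, [("var", "x"), ("const", "ai")])
degenerate = independent_extension(P_REL, [("var", "x"), ("var", "z")])
```

With no constant-valued arguments, as in the degenerate case, the independent extension is simply the primitive relation itself.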

This ADE is next constrained by the second conjoined factor, the primitive predicate Q(x, y). In this more typical case, where the ADE is not equal to true-extension, the independent extension of Q(x, y) is first computed by selection and projection of the primitive relation corresponding to Q, as described above, and then joined with the current ADE over all common attributes. In our example, the independent extension of Q(x, y) would be joined with the old ADE (the independent extension of P(x, z)) over the common existentially quantified variable x. In our knowledge-based retrieval task, this join operation, which would in general be very expensive on an ordinary machine in the case where the relations involved are of large cardinality, occurs quite frequently in the course of executing the LSEC algorithm, and would probably account for most of the execution time in a realistic application. It is the need to perform such joins (or similar operations) in a highly efficient manner which thus provides what is probably the most important justification for the use of parallel hardware in the kinds of knowledge-based applications with which we are concerned.
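For concreteness, the join over common attributes can be sketched as an ordinary sequential hash join. This is explicitly not the parallel NON-VON method, only an illustration of the operation being performed; the `(attrs, rows)` representation and the sample data are assumptions.

```python
def natural_join(left, right):
    """Join two (attrs, rows) relations over all attributes they share."""
    l_attrs, l_rows = left
    r_attrs, r_rows = right
    common = [a for a in l_attrs if a in r_attrs]
    extra = [a for a in r_attrs if a not in l_attrs]

    # Index the right-hand rows by their values on the common attributes.
    index = {}
    for r in r_rows:
        key = tuple(r[r_attrs.index(a)] for a in common)
        index.setdefault(key, []).append(r)

    joined = set()
    for l in l_rows:
        key = tuple(l[l_attrs.index(a)] for a in common)
        for r in index.get(key, []):
            joined.add(l + tuple(r[r_attrs.index(a)] for a in extra))
    return tuple(l_attrs) + tuple(extra), joined


# Hypothetical independent extensions of P(x, z) and Q(x, y).
p_ext = (("x", "z"), {(1, "a"), (2, "b")})
q_ext = (("x", "y"), {(1, "u"), (1, "v"), (3, "w")})
joined_attrs, joined_rows = natural_join(p_ext, q_ext)
```

Only tuples agreeing on the common variable x survive, and each match contributes one output tuple, which is why the cost grows with the cardinality of the relations involved.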

8.8 The result formula

Up to this point, we have considered only the treatment of existentially and universally quantified variables by the LSEC algorithm. It will be recalled, however, that the "top-level" query formula must always contain at least one, and possibly more, free variables, all possible satisfying combinations of which will ultimately be returned as the result of the query. During the bulk of the LSEC algorithm, free variables are treated in exactly the same manner as existentially quantified variables: distinct attribute-ids are created for each, and the associated information stored on the logical variable stack without any indication of their special status.

After the initial ADE (true-extension) has been constrained by the full well-formed formula, however, each of the relations in the resulting ADE is projected over the result variables, and the union of the (necessarily union-compatible; see Subsection 7.4) resulting relations is taken to yield the query result. In our demonstration system, for example, each relation in the final ADE is projected over the attribute-id corresponding to the top-level target document descriptions, and the union of the resulting unary relations, a new relation listing each of the matching targets, is displayed to the user.
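The final project-and-union step can be sketched as follows, again assuming (for illustration only) that an ADE is a list of `(attrs, rows)` relations; the function name and sample data are hypothetical.

```python
def query_result(ade, result_vars):
    """Project every relation in the ADE over the result variables and
    return the union of the projections."""
    result = set()
    for attrs, rows in ade:
        cols = [attrs.index(v) for v in result_vars]
        result |= {tuple(row[c] for c in cols) for row in rows}
    return result


# Hypothetical final ADE with two relations sharing the result variable "t".
final_ade = [
    (("t", "x"), {("d1", 1), ("d2", 2)}),
    (("y", "t"), {(9, "d2"), (8, "d3")}),
]
targets = query_result(final_ade, ("t",))
```

Because every relation is projected over the same result variables, the projections are union-compatible, and a target matched through more than one relation appears once in the result.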

8.9 On the complexity of LSEC

As we have already noted, the NON-VON machine is designed to execute the primitive operations of the LSEC algorithm in a highly efficient manner. (The reader is referred to Shaw [1979] for the algorithms themselves, and to Shaw, et al. [1981] for details of the NON-VON architecture.) Since a number of these relational algebraic operations will in general be required in the course of an actual retrieval task, however, it is reasonable at this point to consider the complexity of the LSEC algorithm in terms of these relational algebraic primitives. First, it should be noted that the individual who constructs the set of defined predicates (which, in our demonstration system, implement the match semantics) may exercise a considerable degree of explicit control over the sequence of operations that will ultimately be performed in the course of executing the LSEC algorithm. In practice, it has been our experience that predicate definition is an activity more nearly like ordinary (albeit very high-level) programming than, say, the analogous task confronting the architect of a resolution theorem proving system. In particular, it is possible to define two "weakly equivalent" sets of predicates (i.e., two sets which are indistinguishable on the basis of their input/output behavior under the LSEC algorithm) such that one is considerably more efficient than the other.
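A toy illustration of weak equivalence: the two conjunct orderings below return the same result but do different amounts of work. The cost measure (number of candidate tests) and the predicates are assumptions chosen for illustration; they are not the relational algebraic operation counts discussed in the text.

```python
def evaluate(candidates, conjuncts):
    """Filter candidates by each conjunct in turn, counting candidate tests."""
    tests = 0
    for holds in conjuncts:
        tests += len(candidates)
        candidates = {c for c in candidates if holds(c)}
    return candidates, tests


def selective(n):
    return n % 100 == 0  # passes few candidates


def broad(n):
    return n >= 0  # passes every candidate here


universe = set(range(1000))
fast_result, fast_tests = evaluate(universe, [selective, broad])
slow_result, slow_tests = evaluate(universe, [broad, selective])
```

Placing the more selective conjunct first shrinks the intermediate result early, which is the kind of control the author of a predicate definition can exercise.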

It has been our experience that the number of relational algebraic operations which occur in the course of retrieving target descriptions matching pattern descriptions of realistic size and complexity is fairly modest (no more than a few dozen such evaluations for the most detailed of our test descriptions, for example). To be sure, the number of such operations could in theory grow quite rapidly as the size of the pattern description grew very large, the exact behavior depending both on intrinsic characteristics of the high-level description language and on factors under the control of the individual responsible for predicate definition.

In practice, however, the issue of query size and complexity is much less important than that of database size, particularly in the case of the very large databases to which our research is directed. In this regard, it is the fact that the number of relational algebraic operations, while directly related to query complexity, is independent of the size of the database which is of actual concern. The critical determinant of system behavior in realistic large-scale database applications is thus the efficiency with which the individual relational algebraic primitives (particularly the join operator, by virtue of its complexity and frequency of invocation within LSEC) are performed on the underlying machine. It is here that the NON-VON architecture offers a potentially dramatic performance improvement (with a comparable investment in hardware) over conventional computer systems.

Bobrow, Daniel G., and Winograd, Terry, "An Overview of KRL-0, a Knowledge Representation Language", Cognitive Science, 1 (1) (1977).

Chang, C. L. and Lee, R. C. T., Symbolic Logic and Mechanical Theorem Proving, Computer Science and Applied Mathematics Series, Academic Press, Inc., New York (1973).

Codd, E. F., "A relational model of data for large shared data banks", Communications of the ACM, 13 (6) (June 1970).

Codd, E. F., "Relational completeness of data base sublanguages", in Rustin, Randall (ed.), Courant Computer Science Symposium 6: Data Base Systems, Englewood Cliffs, New Jersey, Prentice-Hall, Inc. (1972).

Gallaire, Herve, Minker, Jack, and Nicolas, J. M., "An overview and introduction to logic and data bases", in Gallaire, Herve and Minker, Jack, Logic and Data Bases, New York, Plenum Press (1978).

Reiter, R., "An approach to deductive question-answering", BBN Report No. 3649, Bolt, Beranek and Newman, Inc., Cambridge, Mass. (September 1977).

Shaw, David Elliot, "A Hierarchical Associative Architecture for the Parallel Evaluation of Relational Algebraic Database Primitives", Stanford Computer Science Department Report STAN-CS-79-778 (October 1979).

Shaw, David Elliot, "A Relational Database Machine Architecture", Proceedings of the 1980 Workshop on Computer Architecture for Non-Numeric Processing, Asilomar, California (March 1980).

Shaw, David Elliot, Knowledge-Based Retrieval on a Relational Database Machine, Stanford Ph.D. Dissertation and Stanford Computer Science Department Report STAN-CS-80-823 (August 1980a).

Shaw, David Elliot, Ibrahim, Hussein, Wiederhold, Gio and Andrews, J. A., "A Highly Parallel VLSI-Based Subsystem of the NON-VON Database Machine", Columbia Computer Science Department Report (July 1981).
