
Securing XML Documents with Author-X


IEEE INTERNET COMPUTING 1089-7801/01/$10.00 © 2001 IEEE http://computer.org/internet/ MAY • JUNE 2001 21

Data Security

Elisa Bertino and Silvana Castano, University of Milan

Elena Ferrari, University of Como

This Java-based access-control system supports secure XML document administration at varying levels of granularity.

The widespread adoption of XML for Web-based information exchange is laying a foundation for flexible granularity in information retrieval. XML can “tag” semantic elements, which can then be directly and independently retrieved through XML query languages. Further, XML can define application-specific document types through the use of document type definitions (DTDs).

Such granularity requires mechanisms to control access at varying levels within documents. In some cases, a single access-control policy may apply to a set of documents; in other cases, different policies may apply to fine-grained portions of the same document. Many other intermediate situations also arise. (For an overview of research and available commercial products, see “The XML Security Page” at http://www.nue.et-inf.uni-siegen.de/~geuer-pollmann/xml_security.html.)

The typical three-tier architecture for accessing an XML document set over the Web consists of a Web client, network servers, and the back-end information system with a suite of data sources. In this framework, shown in Figure 1, public-key infrastructures (PKIs) represent an important development for addressing security concerns such as user authentication. But such facilities do not provide mechanisms for access control to document contents nor for their release and distribution.

Author-X is a Java-based system, developed at the University of Milan’s Department of Information Science, to address the security issues of access control and policy design for XML documents.1

Author-X supports the specification of policies at varying granularity levels and the specification of user credentials as a way to enforce access control. Access control is available according to both push and pull document distribution policies, and document updates are distributed through a combination of hash functions and digital signature techniques. The Author-X approach to distributed updates allows a user to verify a document’s integrity without contacting the document server.

In this article, we will first illustrate the distinguishing features of credential-based security policies in Author-X, then examine the system’s architecture, and conclude with details about its access-control and administration engines.

Credential-Based Security Policies

In general, security policies state who can access enterprise data and under which modalities. Once policies are stated, they are implemented by an access-control mechanism. In Author-X, security policies for XML documents have the following distinguishing features:

• They can be set-oriented or instance-oriented, reflecting support for both DTD- and document-level protection.

• They can be positive or negative at different granularity levels, enforcing differentiated protection of XML documents and DTDs.

• They include options for controlled propagation of access rights, whereby a policy defined for a document or DTD can be applied to other semantically related documents and DTDs (or portions of them).

• They reflect user profiles through credential-based qualifications.

Author-X security policies are implemented through six basic components.

User Credentials

The first component is a user credential, that is, a set of properties about a user that are relevant for security policies (age, position within the organization, current projects, and so on).2 Credentials allow a system security administrator to express security policies directly. Each user is assigned one or more credentials upon subscribing to the system. To simplify this task, Author-X groups credentials with similar structures into credential types. It also provides a language based on XML for encoding credentials and credential types.

Figure 2 shows example credentials for subjects entitled to access an XML purchase order document. Figure 3 represents the document graphically: Blue nodes represent elements, yellow nodes represent XML syntax attributes, and edges represent element-to-subelement and element-to-attribute relationships. Users can be qualified by XPath expressions that set conditions on credentials and credential properties.3 For example, an XPath expression could be specified to say that employees of carrier company X (CCX) can access customer information in the purchase orders for which CCX is the carrier. Similarly, we could say: All secretaries working in the sales department can access and modify all purchase order documents.
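To make the idea of qualifying users by credential expressions concrete, the following minimal sketch evaluates two such expressions against a small credential base. It is an illustration only, not the Author-X implementation: the `credential_base` wrapper element, the function name `matching_credentials`, and the sample data (whitespace-free values, company set to CCX) are assumptions, and Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical credential base; the wrapper element is an assumption so that
# several credentials can live in one well-formed document.
CREDENTIAL_BASE = """
<credential_base>
  <carrier_employee credID="154">
    <name><fname>Bob</fname><lname>Watson</lname></name>
    <company>CCX</company>
  </carrier_employee>
  <secretary credID="104">
    <name><fname>Tom</fname><lname>Moore</lname></name>
    <department>sales</department>
  </secretary>
</credential_base>
"""

def matching_credentials(base_xml, cred_expr):
    """Return the credID of every credential selected by cred_expr."""
    root = ET.fromstring(base_xml)
    # ElementTree evaluates paths relative to the root element,
    # so an expression such as "//secretary[...]" is prefixed with ".".
    return [cred.get("credID") for cred in root.findall("." + cred_expr)]

sales_secretaries = matching_credentials(
    CREDENTIAL_BASE, "//secretary[department='sales']")
ccx_employees = matching_credentials(
    CREDENTIAL_BASE, "//carrier_employee[company='CCX']")
print(sales_secretaries, ccx_employees)
```

A policy whose credential expression selects a user's credential would then apply to that user; a full XPath engine would be needed for expressions outside the subset used here.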

Protection Objects

Author-X protection objects are documents or DTDs (or portions of them). Policies can be specified for a range of protection objects:

• all instances of a given DTD;

• collections of both well-formed (follows XML grammar rules, but does not have an associated DTD) and valid (well-formed and conforms to a given DTD) XML documents; and

• selected portions within one or more documents (such as an element, attribute, or link, or a set of any of these).


Figure 1. Typical three-tier architecture for access to an XML document source. Public-key infrastructures do not provide mechanisms for access control to document contents nor for their release and distribution. [Diagram labels: clients, Internet, firewall, network server, access control mechanism, security policies, enterprise system, information system (databases, XML sources, applications ...).]


The granularity of protection objects is complemented in Author-X by support for content-based access control, which allows the security administrator to specify security policies based on document content (attribute values). This feature is important because documents with the same structure frequently have different protection requirements with respect to their contents.

Access Modes

Author-X categorizes security policies into two groups: browsing and authoring. Browsing policies allow a user to read the information in a protection object or possibly to navigate through its links, whereas authoring security policies allow a subject to modify protection objects according to different modes (such as append, write, delete, or insert).

Signs

Author-X security policies can specify either permissions (positive policies) or denials (negative policies). By using this feature, the security administrator can easily specify exceptions (granting access permissions to a whole document except for one attribute, for instance) and thus reduce the number of policies that must be specified to secure an XML source, that is, a set of XML documents and DTDs.

Author-X uses a strongest-policy principle to solve conflicts that arise between positive and negative policies. This conflict-resolution strategy relies on criteria stating that security policies defined for specific documents prevail over those defined for DTDs (because the latter are considered less specific) and that policies defined at a lower level in a document or DTD hierarchy prevail over those defined at higher levels.
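The two criteria above can be read as a priority ordering over policies. The sketch below is one possible encoding, not the Author-X algorithm: the `Policy` fields, the choice to compare document-versus-DTD before depth, and the rule that a denial wins when two policies are otherwise equally strong are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    target_kind: str  # "document" or "dtd"
    depth: int        # depth of the protected node in the hierarchy (root = 0)
    sign: str         # "GRANT" or "DENY"

def priority(policy):
    """Higher tuples are stronger.

    Document-level policies outrank DTD-level ones, and deeper policies
    outrank shallower ones. The ordering of the two criteria and the final
    tie-break (denial wins) are assumptions of this sketch.
    """
    return (policy.target_kind == "document", policy.depth, policy.sign == "DENY")

def resolve(policies):
    """Return the sign of the strongest applicable policy, or None if there is none."""
    if not policies:
        return None
    return max(policies, key=priority).sign

# A DTD-level grant is overridden by a document-level denial.
print(resolve([Policy("dtd", 0, "GRANT"), Policy("document", 0, "DENY")]))
```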

Figure 3. XML document representation. Blue nodes represent elements, yellow nodes represent XML syntax attributes, and edges represent element-to-subelement and element-to-attribute relationships. [Diagram node labels: Purchase_order; OrderID (2030); Date (10/1/2000); two Item nodes, each with Code, Descr., and Quantity (202051, RAM memory, 4; 202052, Monitor, 1); Customer; Taker; Address; P. Brown; Carrier; Name (CCX); Person_in_charge; M. Smith.]

<carrier_employee credID="154">
  <name>
    <fname> Bob </fname>
    <lname> Watson </lname>
  </name>
  <phone_number> 8005769840 </phone_number>
  <company> UPS </company>
</carrier_employee>

<secretary credID="104" managerID="154">
  <name>
    <fname> Tom </fname>
    <lname> Moore </lname>
  </name>
  <age> 25 </age>
  <department> sales </department>
  <salary> 2,000 </salary>
  <level> third </level>
  <duty> manager secretary </duty>
</secretary>

Figure 2. Two example user credentials. XML tags define properties that Author-X uses for document access control.

Propagation Options

The Author-X access-control model provides a set of propagation options. These options specify how and whether a security policy for a given protection object propagates to another. This feature thus offers a concise way to express a set of security requirements.

There are two types of propagation: implicit and explicit. Implicit propagation is applied by default to all specified policies and is based on two principles:

• DTD-level policies automatically propagate to all DTD instances, and

• policies specified on a given document or DTD element automatically propagate to all associated attributes and links.

The security administrator can, when necessary, override these default propagation principles by specifying explicit positive and negative security policies. There are three explicit propagation options available in Author-X:

• NO_PROP. The security policy applies only to the protection objects defined in its specification; no propagation is enacted.

• FIRST_LEVEL. The policy propagates to all direct child elements of the elements in the policy specification.

• CASCADE. The policy propagates to all direct and indirect child elements of the elements in the policy specification.

The security administrator can thus state if and how a given policy propagates to protection objects at lower levels in the document or DTD hierarchy.
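The three explicit options can be illustrated on an element tree. This is a minimal sketch under stated assumptions: the function name `propagate` and the sample document are hypothetical, and only elements are considered (the implicit propagation to attributes and links is left out).

```python
import xml.etree.ElementTree as ET

def propagate(element, option):
    """Return the elements covered by a policy specified on `element`."""
    if option == "NO_PROP":
        return [element]                    # the specified element only
    if option == "FIRST_LEVEL":
        return [element] + list(element)    # plus its direct children
    if option == "CASCADE":
        return list(element.iter())         # plus all direct and indirect children
    raise ValueError("unknown propagation option: " + option)

# Hypothetical, simplified purchase order used only for this example.
order = ET.fromstring(
    "<Purchase_order><item><code/><description/></item>"
    "<carrier><name/></carrier></Purchase_order>")

for option in ("NO_PROP", "FIRST_LEVEL", "CASCADE"):
    print(option, [node.tag for node in propagate(order, option)])
```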

Policy Base

All security policies for an XML source are encoded in an XML file called the policy base. Figure 4 shows an example policy base for the XML document in Figure 3. The first policy allows the secretaries of the sales department to modify and browse all purchase order documents (enforced through a policy specified at the DTD level). The next two policies give CCX employees access to information about the customer, carrier, order id, and date on purchase orders for which CCX is the carrier:

• a positive policy referring to orders for which CCX is the carrier applies to all purchase order documents, and

• a negative policy applies only to the portions of documents that are unavailable to CCX employees (that is, the item elements).

The last two policies authorize publicity agents to access the order id and description of purchase order items.

System Architecture

Author-X has a security policy engine for system specification, validation, and maintenance; the engine itself is based on XML to enforce access control. Figure 5 shows the system’s general architecture, which consists of the XML source and X-bases repositories and two Java components, X-Access and X-Admin.

Repositories

The XML source repository contains the XML documents and DTDs, if defined, which are to be protected.

X-bases repositories are the following:

• The policy base contains the security policies for the documents and DTDs in the XML source.

• The credential base contains the user credentials and credential types.


<policy_base>
  <policy_spec cred_expr="//secretary[department='sales']" target="Purchase_order.dtd" priv="ALL" type="GRANT" prop="CASCADE"/>
  <policy_spec cred_expr="//carrier_employee[company='CCX']" target="Purchase_order.xml" path="//Purchase_order[Purchase_order/carrier/name='CCX']" priv="VIEW" type="GRANT" prop="CASCADE"/>
  <policy_spec cred_expr="//carrier_employee[company='CCX']" target="Purchase_order.xml" path="//item" priv="VIEW" type="DENY" prop="CASCADE"/>
  <policy_spec cred_expr="//publicity_agent" target="Purchase_order.dtd" path="//item/description" priv="VIEW" type="GRANT" prop="NO_PROP"/>
  <policy_spec cred_expr="//publicity_agent" target="Purchase_order.dtd" path="//Purchase_order/order ID" priv="VIEW" type="GRANT" prop="NO_PROP"/>
</policy_base>

Figure 4. Example policy base. Author-X encodes the security policies for a set of XML documents in an XML file called the policy base. This policy base encodes the security policies specified for the document in Figure 3.

• The encrypted document base contains an encrypted copy of portions of the documents in the XML source.

Expressing credentials, credential types, and security policies using XML makes them fully interoperable with one another. Using XML to specify credentials also facilitates secure submission and distribution because we can apply the digital signature and encryption mechanisms we have developed to protect XML documents. Moreover, expressing security policies in XML simplifies information exportation when a document migrates from one source to another.

Java Components

X-Access implements access control over the XML source by using security policies and credentials stored in the corresponding X-bases. In response to each access request, X-Access returns a view with only those portions of the requested documents for which the policy base authorizes access. X-Access supports the controlled release of XML documents according to the pull and push dissemination modes.

X-Admin provides an environment and graphical tool set to assist the security administrator in administration activities related to policy and credential management.

Because they constitute the core components of XML document protection, we will discuss these Java components in more detail.

X-Access Dissemination Modes

The pull and push dissemination modes in Author-X are complementary, and the choice of which to use depends on many factors. The security administrator can use document characteristics and user profiles to decide which mode to employ for all documents in the source. Information pull mode is commonly used in traditional database management systems; it is suitable for documents that are not distributed as regularly or that are usually requested individually. Information push can be useful for sending documents such as monthly newsletters to a set of subscribers; it could also be appropriate for a company that distributes a monthly report containing sections that everyone can access and others (financial information, for example) that only selected users can access.

Pull-Mode Operations

Under information pull, a user submits an explicit document access request, which is represented as a tuple:

r = <subject, target, path, acc_modality>

where subject is the user requesting access, target corresponds to the requested XML documents, path is an XPath expression that optionally selects specific portions of the requested documents, and acc_modality is an element of the set {browsing, authoring} that specifies which type of access is requested.

Upon receiving an access request r, X-Access checks which authorizations (both positive and negative) the subject has on the target documents, either directly by the security policies or implicitly by propagation options. The system then presents the subject with a view of the authorized portions of the requested documents.

Figure 5. Author-X system architecture. Author-X components are the XML source and X-bases repositories and the X-Access and X-Admin Java modules. [Diagram labels: security administrator, administrative operations, X_Admin (Java component), X_Access (Java component), DOM/XQL, XML source, X_bases (encrypted doc. base, policy base, credential base), subjects, access request and view (pull), doc. package (push).]

The first step in building the subject view is the pruning phase, during which the system removes all unauthorized elements and attributes from the requested document or documents. During this phase, X-Access first queries the policy base to extract all browsing or authoring policies (depending on the type of access request) specified for the target documents, including those specified on the associated DTD. Access is denied if the query result is empty. Otherwise, the system uses the conflict-resolution rule to order the policies in decreasing priority.4

The pruning algorithm iteratively considers each extracted policy and marks the elements and attributes with a “+” to specify permission or a “-” for denial. All attributes marked with a minus sign or left unmarked are pruned from the target view, and the path expression is evaluated against this pruned document. If the user requests access to a whole document (that is, the path component is null), the system returns the pruned file. The resulting node set is turned back into a well-formed XML document for transfer and display.
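The marking and pruning steps just described can be sketched as follows. This is a simplified illustration, not the published algorithm: policies are assumed to arrive already ordered by decreasing priority as hypothetical `(path, sign, cascade)` tuples, only elements are marked (attributes are not treated separately), and an unauthorized element is kept as an empty shell when it still has an authorized descendant so that the view stays well formed. That last rule is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

def mark(root, ordered_policies):
    """Map elements to '+' or '-'; earlier (higher-priority) policies win."""
    marks = {}
    for path, sign, cascade in ordered_policies:
        for target in root.findall(path):
            covered = target.iter() if cascade else [target]
            for node in covered:
                marks.setdefault(node, sign)  # keep the first mark only
    return marks

def prune(node, marks):
    """Remove denied or unmarked subtrees in place; return True if node survives."""
    for child in list(node):
        if not prune(child, marks):
            node.remove(child)
    if marks.get(node) == "+":
        return True
    # Assumption: an unauthorized ancestor of authorized content is kept
    # as an empty shell (no attributes, no text).
    node.attrib.clear()
    node.text = None
    return len(node) > 0

def build_view(document_xml, ordered_policies, path=None):
    """Return the pruned root, the nodes selected by `path`, or None if nothing is visible."""
    root = ET.fromstring(document_xml)
    if not prune(root, mark(root, ordered_policies)):
        return None
    return root if path is None else root.findall(path)

# Hypothetical, simplified purchase order.
ORDER = ('<Purchase_order orderID="2030"><date>10/1/2000</date>'
         '<item><code>202051</code><description>RAM memory</description></item>'
         '<item><code>202052</code><description>Monitor</description></item>'
         '</Purchase_order>')

agent_view = build_view(ORDER, [(".//item/description", "+", False)])
print(ET.tostring(agent_view, encoding="unicode"))
```

With a single positive policy on the item descriptions, the view keeps only the two description elements inside otherwise empty item and Purchase_order shells.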

For example, suppose that Bob, a publicity agent, submits the following access request:

<Bob, Purchase_order.xml, //Purchase_order[@orderID='2030']/item, browsing>

as shown in the left-hand frame in Figure 6. Following the security policy stored in the policy base, Bob receives a view that contains only the description of each item in the order (that is, RAM and monitor). The right-hand frame in Figure 6 shows the purchase order view returned to Bob (top area) and its associated DTD (bottom area).

Push-Mode Operation

Under information push, the system periodically broadcasts documents to users, rather than sending them upon request. Because different users may have different viewing rights for portions of the same document, support for the push mode could entail generating different views of each document. However, information push is largely designed to distribute documents to large communities, so this approach is not practically scalable.

For this reason, X-Access adopts an approach that essentially allows the security administrator to send the same document to all subjects, while still enforcing stated security policies. Extending Gladney and Lotspiech’s Cryptolope approach,5 our system encrypts portions of a document and uses different encryption keys for different security policies. X-Access selectively distributes the keys necessary to let each authorized user access appropriate portions of the document. We have developed mechanisms to support information push for both browsing and authoring access.

Browsing access. Document encryption is driven by the specified security policies: All the elements and attributes to which a single policy (or combination of policies) applies are encrypted with a single key. The encrypted copy of the document is then stored in the encrypted document base (shown in Figure 5).

To illustrate the document encryption strategy, suppose the purchase order document in Figure 3 should be “pushed” according to the policy base in Figure 4. As shown in Figure 7, encryption generates four keys (K1, K2, K3, K4) for different portions of the document. According to the security policies, secretaries will receive all the keys, all company employees will receive K1, and publicity agents will receive K3 and K4.
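The key-generation rule (one key per distinct combination of applicable policies) can be sketched as a grouping step. The code below is an assumption-laden illustration: it handles positive, cascading policies on elements only, uses hypothetical policy identifiers and paths, generates random key material with `secrets.token_bytes`, and stops short of actual encryption, which the Python standard library does not provide.

```python
import secrets
import xml.etree.ElementTree as ET

def assign_keys(root, policies):
    """Group nodes by the set of policies covering them; one key per group.

    `policies` maps a hypothetical policy id to an ElementTree path.
    Assumption: every policy cascades to all descendants of its targets.
    """
    applicable = {}
    for policy_id, path in policies.items():
        for target in root.findall(path):
            for node in target.iter():
                applicable.setdefault(node, set()).add(policy_id)

    group_key_ids = {}  # frozenset of policy ids -> key id such as "K1"
    node_key = {}       # element -> key id protecting it
    for node in root.iter():
        group = frozenset(applicable.get(node, ()))
        key_id = group_key_ids.setdefault(group, "K%d" % (len(group_key_ids) + 1))
        node_key[node] = key_id

    keys = {key_id: secrets.token_bytes(16) for key_id in group_key_ids.values()}
    return node_key, group_key_ids, keys

def keys_for(granted_policies, group_key_ids):
    """Key ids a subject needs: every group touched by one of its policies."""
    granted = set(granted_policies)
    return sorted(kid for group, kid in group_key_ids.items() if group & granted)

order = ET.fromstring(
    "<Purchase_order><date/><item><code/><description/></item></Purchase_order>")
node_key, groups, keys = assign_keys(order, {"P1": ".", "P2": ".//item"})
print(keys_for({"P1"}, groups), keys_for({"P2"}, groups))
```

A subject covered only by the item-level policy receives a single key, while a subject covered by the document-level policy receives the keys for every group that policy touches.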

X-Access supports two different distribution methods for delivering the encrypted document and keys to a set of users:

• online mode delivers the keys and document together, whereas

• offline mode requires users to retrieve the keys through further interactions with the system.


Figure 6. Example access request and returned document view in Author-X. The request appears in the left-hand side of the screen; the returned view, top right; and the associated DTD, bottom right.

There are two different options for online mode. Under the first, the encrypted document is encapsulated into a package that includes an entry with appropriate keys for each recipient. Upon receiving the package, the user decrypts the entry with a private key, and then uses the keys it contains to decrypt the document. This approach lets us send the same package to all subjects and protect the content because each portion can be decrypted only with the corresponding private key.

The drawback is that the package can reach considerable size with a large number of users who each need multiple keys. Moreover, this approach can be vulnerable to attack because all the keys are in the same package. A malicious subject could launch a denial-of-service attack by deleting another user’s keys, for example, and making it impossible to decrypt document content. For all these reasons, X-Access also supports an online distribution mode in which keys are sent to each subject in a separate message using secure e-mail techniques.6

In offline mode, on the other hand, X-Access sends only the encrypted copy of the document to all subjects; keys are stored at the server site using the Lightweight Directory Access Protocol,7 and subjects query the LDAP directory to obtain the appropriate keys.

Choosing whether to use online or offline key distribution for a specific document depends on issues such as the number of recipients, the number of keys generated during document encryption, the sensitivity level of the information, and the users’ behavior (whether they are always connected to the network, for example, or are seldom-connected mobile users). With so many influential factors, we designed our system to support a spectrum of key distribution methods; security administrators can select the most appropriate method for each document.

Authoring access. We designed the push mode for authoring access to handle XML documents that are subject to predefined distributed and cooperative updates (usually at explicit intervals along a specified update path) during which users in different organizational roles must modify possibly different portions of the same document. A monthly report might first be modified by a secretary, for example, then signed by a manager, and passed along up the chain of command.

Using traditional pull mode, Author-X would have to mediate and manage each update request by returning only the portions of the document that each user was authorized to modify. Moreover, the system would have to verify whether each user’s updates obeyed the security policy restrictions for the document. Under information push, on the other hand, the server generates and distributes a single encrypted copy of the document, which flows from one subject to another along an attached update path.

Figure 7. Key generation. Different portions of a document are encrypted with different keys based on the security policies holding for the document. This document has been encrypted with four different keys based on the security policies specified for the document in Figure 3. [Diagram: the purchase order document tree with each node labeled by one of the keys K1, K2, K3, or K4.]

As Figure 8 shows, each user separately receives decryption keys for appropriate portions of the document. Unlike browsing access, authoring access also requires a mechanism that lets each user verify whether the source’s authoring policies allow the previous users’ authoring operations and whether the specified update path was followed. By attaching additional control information to the encrypted document, we create a distributed environment that enables users, in most cases, to perform this verification without interacting with the document server. To generate the so-called document package (the encrypted document together with the control information), we use hash functions and digital signature techniques.
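The role of hash functions in the control information can be illustrated with a deliberately reduced sketch. It is not the Author-X package format: the portion names, the use of SHA-256, and the helper names are assumptions, and the digital signature that would protect the control information itself against tampering is omitted because the Python standard library has no public-key signature primitive.

```python
import hashlib

def digest(portion: bytes) -> str:
    """Hex SHA-256 digest of one (encrypted) document portion."""
    return hashlib.sha256(portion).hexdigest()

def make_control_info(portions):
    """Record one digest per portion; in a real package this record would be signed."""
    return {name: digest(data) for name, data in portions.items()}

def changed_portions(portions, control_info):
    """Names of portions whose current content no longer matches the record."""
    return sorted(name for name, data in portions.items()
                  if digest(data) != control_info.get(name))

def update_is_allowed(portions, control_info, writable):
    """True if the previous user changed only portions it was allowed to modify."""
    return set(changed_portions(portions, control_info)) <= set(writable)

# Hypothetical portions of a monthly report, as opaque byte strings.
original = {"summary": b"encrypted-summary", "finance": b"encrypted-finance"}
control_info = make_control_info(original)

edited = dict(original, summary=b"encrypted-summary-v2")
print(changed_portions(edited, control_info))
print(update_is_allowed(edited, control_info, writable={"summary"}))
```

A receiver holding the control information can thus detect locally, without contacting the server, which portions the previous user modified and compare them with that user's authoring rights.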

X-Admin Facilities

Figure 9 shows the pool of security administration tools in the X-Admin architecture. We have implemented two components, the policy manager and the credential manager, on top of five basic viewer facilities:

• The document/DTD viewer uses a graphical notation similar to the one adopted by conventional XML editing and parsing tools, such as the Apache group’s Crimson and Xerces, to display a target document or DTD.

• The policy viewer displays the users and XML documents related to a given policy.

• The propagation viewer uses both explicit and implicit propagation principles to display all policies for the target document or DTD. In particular, it adopts different graphical styles to represent protection objects by policy type (explicit or propagated, positive or negative, and so on).

• The conflict viewer displays all conflicts arising among security policies defined for the target document or DTD. Moreover, the viewer shows the default conflict-resolution choice for each conflict, determined according to the strongest-policy principle.

• The credential viewer displays the structure of credential types and user credentials; it also provides a form-based graphical editing environment for specifying credential expressions.

The security administrator can invoke these basic facilities interactively within an X-Admin working session. Propagation and conflict viewer facilities can also be invoked on the whole policy base to visualize propagation and conflicts for all stored security policies on a per-document or DTD basis. Because documents and security-related information are specified in XML syntax, viewer facilities work internally on document object model (DOM) representations8 and exploit the XML query language (XQL), as shown in Figure 9.

Credential Manager

The credential manager component supports the security administrator in the specification and maintenance of credential types and associated user credentials. To define a new credential type, the security administrator specifies its name and structure in forms provided through the credential viewer. Once the credential-type structure has been defined, the credential manager generates the corresponding XML-based syntax for storage in the credential base.

It is also possible to add new properties to existing credential types through the credential viewer. Deleting a credential type also removes all user credentials defined for it. Interactive menus list credential types, properties, constant values, and comparison and logic operators for graphically defining expressions on credential types. The credential viewer-editor then generates the corresponding XPath-based syntax for the credential expressions.

Figure 8. Push mode for authoring access. Encrypted documents with respective keys are sent to all users. Users exploit the keys to decrypt received documents and can modify only the authorized portions. [Diagram labels: Author-X, doc package, checks and updates (at each user), updated doc package (passed from user to user).]

Policy Manager

The security administrator can specify security policies for a target document or DTD through the policy manager’s specification forms. To limit the input required for securing a target document or DTD, the policy manager derives as much information as possible from the policy base and from the documents and DTDs in the XML source, through implicit and explicit propagations. The policy manager proposes any automatically derived candidate policies to the security administrator for validation, at which point the policies can be customized as necessary or accepted “as-is.” The policy manager also supports maintenance operations for updating or revoking policies in the policy base.

Figure 10 shows the X-Admin policy specification menu. In a working session, the security administrator first selects the policy specification modality (in-site or external) for the document to be protected.

Specifying in-site-document policies. Under in-site mode, the security administrator specifies explicit security policies for protection objects for a target document stored in the XML source (in-site document). Security policies can be specified either at the DTD level or at the document level. Author-X supplies a form with all the fields for defining a security policy: subject, protection object, access mode, sign, and propagation option.

The policy manager automatically fills in the form’s protection object field with the XPath expression for a protection object selected within the graphical representation of the target document. To specify the users to whom a security policy applies, the security administrator invokes the credential viewer editor. Each of the other fields has an associated multiple-choice menu that lists admissible values and predefined default values (read, grant, and cascade), which can be modified as necessary. As policy definition proceeds, the security administrator can also use the propagation viewer to interactively analyze security policies. Once the security requirements have been satisfied for the target document, the policy manager generates the XML syntax for storing the validated policy in the policy base.

Specifying external document policies. Under external mode, the security administrator can use the policy manager to specify security policies for new documents coming from outside. If the external document is valid, the system displays the associated DTD graphically, and the security administrator can specify the DTD-level security policies in the same way as for in-site documents. If the external document is well formed, the policy manager offers both a classification-based and a document-based mechanism that help automate the policy specification process.

Figure 9. X-Admin architecture. A pool of security administration tools is based on two components, the policy manager and the credential manager, on top of five basic viewer facilities. [Diagram labels: security administrator; X-Admin; graphical user interface; administrative operations on security policies, subject credentials, and credential types; policy manager; credential manager; operation lists (specification, validation, update, revocation; specification, update, deletion); doc./DTD viewer; policy viewer; conflict viewer; propagation viewer; credential viewer; DOM/XQL; XML source; encrypted doc. base; policy base; credential base.]

Figure 10. X-Admin policy specification menu. Different options can be invoked for securing documents stored in the XML source and those coming from outside.

The classification-based option exploits the propagation principle to derive candidate policies from a DTD that is already secured in the XML source. The policy manager compares the external document against all secured DTDs to find the one with the greatest number of matching protection objects. The classification procedure implements an algorithm that analyzes graph-based structures to compare the structures of all secured DTDs against the document. This procedure identifies a DTD D as a complete or partial match depending on the number of its elements that correspond to the external document. If D is a complete match (all the elements specified in the external document are present in D, and the document in turn contains every mandatory element specified in D), the policy manager propagates all security policies defined for D to the external document.
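The matching step can be sketched in Python under a strong simplification: each DTD and the external document are flattened to sets of element names, whereas the article describes a comparison of graph-based structures. The `SecuredDTD` class, the `classify` function, and the sample data are hypothetical names introduced for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecuredDTD:
    """Hypothetical flat view of a secured DTD."""
    name: str
    elements: frozenset   # all element names declared in the DTD
    mandatory: frozenset  # subset that every valid document must contain


def classify(doc_elements, dtds):
    """Return (best DTD, kind), where kind is 'complete', 'partial', or None."""
    doc = frozenset(doc_elements)
    best, best_score = None, 0
    for dtd in dtds:
        score = len(doc & dtd.elements)  # number of matching elements
        if score > best_score:
            best, best_score = dtd, score
    if best is None:
        return None, None  # no matching DTD was found
    # Complete match: every document element is in D, and the document
    # contains every mandatory element of D.
    complete = doc <= best.elements and best.mandatory <= doc
    return best, "complete" if complete else "partial"


# Invented sample data.
dtds = [
    SecuredDTD("report", frozenset({"report", "title", "body", "author"}),
               frozenset({"report", "title"})),
    SecuredDTD("invoice", frozenset({"invoice", "total"}), frozenset({"invoice"})),
]
best, kind = classify({"report", "title", "body"}, dtds)
```

In this sketch ties are broken in favor of the first DTD examined, which is an arbitrary choice; the article does not say how the policy manager ranks equally good candidates.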

If D provides a partial match, the policy manager applies two different mechanisms for propagating policies to a nonmatching protection object of the external document:

• With the containment-based mechanism, the policy manager checks whether the protection object is a subelement of an element for which a policy is defined in D. If so, the policy for this element is propagated to the protection object.

• With the affinity-based mechanism, the policy manager checks D to see whether it contains an element or attribute that is semantically related to the protection object (based on a domain ontology and the semantic meaning of the labels).⁹,¹⁰ If so, the element or attribute is exploited to derive a policy for the protection object.
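The two mechanisms can be sketched in Python as follows. This is a minimal illustration under several assumptions: protection objects are identified by slash-separated paths, the nearest ancestor with a policy wins under containment, and a small hand-written table of related labels stands in for the domain ontology. The function names and sample data are hypothetical.

```python
def containment_policy(path, dtd_policies):
    """Return the policy of the nearest ancestor element that has one, else None."""
    parts = path.strip("/").split("/")
    # Walk from the closest ancestor up to the root.
    for depth in range(len(parts) - 1, 0, -1):
        ancestor = "/" + "/".join(parts[:depth])
        if ancestor in dtd_policies:
            return dtd_policies[ancestor]
    return None


def affinity_policy(path, dtd_policies, related_labels):
    """Return the policy of a DTD element whose label is listed as related, else None."""
    label = path.rstrip("/").split("/")[-1]
    for candidate_path, policy in dtd_policies.items():
        candidate_label = candidate_path.rstrip("/").split("/")[-1]
        if candidate_label in related_labels.get(label, set()):
            return policy
    return None


# Invented sample data: policies defined in D, keyed by element path.
dtd_policies = {"/report/body": "P1", "/report/author": "P2"}
# Stand-in for an ontology lookup: "writer" is treated as related to "author".
related_labels = {"writer": {"author"}}

by_containment = containment_policy("/report/body/footnote", dtd_policies)
by_affinity = affinity_policy("/report/writer", dtd_policies, related_labels)
```

Both functions return only a candidate; in keeping with the text, any policy derived this way would still go to the security administrator for validation.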

Author-X then presents policies propagated according to these two mechanisms to the security administrator for validation. Thus, security policies need only be created from scratch for nonmatching protection objects for which no propagated policy can be derived.

If the classification procedure finds no matching DTDs, or upon the security administrator’s explicit request, the policy manager invokes the document-based option to begin a policy specification session for the external document. In this mode, the security administrator uses the policy manager forms to define security policies from scratch for protecting the external document.

Once the policy specification process is complete, the system checks which secured documents to distribute under information push. Author-X then executes an algorithm that modifies any encrypted copies of these documents stored in the encrypted document base to bring them in line with the newly specified policy. This algorithm incrementally maintains document encryption and changes only the portions to which the specified policy applies. If no encrypted copy of the document exists, the algorithm generates one according to the strategy described earlier under “Pull-Mode Operations,” and stores it in the encrypted document base.
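The incremental idea can be sketched in Python as follows. The sketch assumes that a document is stored as a mapping from portion paths to ciphertexts and that the caller supplies the encryption function; neither the storage layout nor the function names come from Author-X, and `toy_encrypt` is a marker-only stand-in that provides no confidentiality.

```python
def refresh_encrypted_copy(encrypted_base, doc_id, portions, affected, encrypt):
    """Bring the stored encrypted copy of `doc_id` in line with a new policy.

    portions: {path: plaintext} for the whole document (hypothetical layout).
    affected: paths of the portions the new policy applies to.
    encrypt:  caller-supplied function; key handling is out of scope here.
    """
    copy = encrypted_base.get(doc_id)
    if copy is None:
        # No encrypted copy exists yet: generate one for every portion.
        copy = {path: encrypt(path, text) for path, text in portions.items()}
        encrypted_base[doc_id] = copy
    else:
        # Incremental maintenance: touch only the affected portions.
        for path in affected:
            copy[path] = encrypt(path, portions[path])
    return copy


def toy_encrypt(path, text):
    # Stand-in that only marks the text; it is not real encryption.
    return f"<enc:{path}>{text[::-1]}"


# Invented sample data.
portions = {"/report/title": "Q2", "/report/body": "secret"}
base = {"doc1": {"/report/title": "OLD-T", "/report/body": "OLD-B"}}
refresh_encrypted_copy(base, "doc1", portions, {"/report/body"}, toy_encrypt)
```

After the call, the title portion of `doc1` keeps its previous ciphertext and only the body portion is replaced, which is the property the incremental algorithm is meant to preserve.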

Conclusion

The Web community generally regards XML as the most important standardization tool for information exchange and interoperability, and we believe that XML access control will constitute the core security mechanism of Web-based enterprise architectures. The current Author-X prototype is built on top of the eXcelon XML server¹¹ and supports browsing and updating of DTD-based XML sources.

We plan to extend protection toward secure access to Web pages and compliance with XML schemas (http://www.w3.org/XML/Schema/). Additionally, we will experiment with incorporating Author-X within Web-based enterprise information system architectures by focusing on performance issues. In particular, we will study XML-based solutions for certifying user credentials as well as access-control schemes and architectures for securely disseminating information. We have proposed a preliminary set of XML-based access-control schemes for distributed architectures,¹² and we are working on a prototype extending the Author-X functionalities accordingly.

Acknowledgments

This work has been partially supported by a grant from Microsoft Research.

References

1. E. Bertino, S. Castano, and E. Ferrari, “Securing XML Documents: The Author-X Project Demonstration,” Proc. ACM Int’l Conf. Management of Data (SIGMOD 2001), ACM Press, New York, 2000.
2. M. Winslett et al., “Using Digital Credentials on the World Wide Web,” J. Computer Security, vol. 7, 1997.
3. XML Path Language (XPath), Recommendation 1.0, World Wide Web Consortium (W3C), 1999; available at http://www.w3.org/TR/xpath.
4. E. Bertino et al., “Specifying and Enforcing Access Control Policies for XML Document Sources,” World Wide Web, Baltzer Science Publishers, Netherlands, vol. 3, no. 3, 2000; available online from http://www.baltzer.nl/www/contents/2000/3-3.html.
5. H. Gladney and J. Lotspiech, “Safeguarding Digital Library Contents and Users: Assuring Convenient Security and Data Quality,” D-Lib Magazine, Corp. for National Research Initiatives, CITY, Va., May 1997; available at http://www.dlib.org/dlib/may97/ibm/05gladney.html.
6. W. Stallings, Network Security Essentials: Applications and Standards, Prentice Hall, Upper Saddle River, N.J., 2000.
7. D. Srivastava, “Directories: Managing Data for Networked Applications,” tutorial, Proc. 16th IEEE Int’l Conf. Database Engineering (ICDE 2000), IEEE CS Press, Los Alamitos, Calif., 2000.
8. Document Object Model (DOM) Level 1, W3C Recommendation; available at http://www.w3.org/TR/REC-DOM-Level-1/.
9. S. Castano and V. DeAntonellis, “A Discovery-Based Approach to Database Ontology Design,” Distributed and Parallel Databases, Kluwer Academic Publishers, Netherlands, vol. 7, no. 1, Jan. 1999, pp. 67-98.
10. A. Miller, “WordNet: A Lexical Database for English,” Comm. ACM, vol. 38, no. 11, Nov. 1995, pp. 39-41.
11. “An XML Data Server for Building Enterprise Web Applications,” white paper, Object Design Inc., 1998; available at http://www.odi.com/whitepapers/XMLEnterprisewWebApps.pdf.
12. E. Bertino, S. Castano, and E. Ferrari, “On Specifying Security Policies for Web Documents with an XML-Based Language,” to appear in Proc. Symp. Access Control Models and Technologies 2001, ACM Press, New York, 2001.

Elisa Bertino is a professor of computer science and the chair of the Department of Computer Science at the University of Milan. Her research interests include security, object-oriented databases, distributed databases, multimedia databases, interoperability of heterogeneous systems, and integration of artificial intelligence. She recently co-authored Intelligent Database Systems (Addison-Wesley). She also served as program chair of the 14th European Conference on Object-Oriented Programming (ECOOP 2000).

Silvana Castano is a professor of computer science at the University of Milan. Her research interests include database and XML security, intelligent information integration and ontologies, Web-based information systems, process reengineering, and workflows. She received a PhD in computer and automation engineering from Politecnico di Milano in 1993. She co-authored Database Security (Addison-Wesley, 1995). She is currently chair of the Associazione Italiana per l’Informatica ed il Calcolo Automatico (AICA) working group on databases.

Elena Ferrari is a professor of computer science at the University of Como, Italy. Her research interests include security, multimedia databases, and temporal object-oriented data models. She received a PhD in computer science from the University of Milan in 1997. She served as program chair of the first ECOOP Workshop on XML and Object Technology (XOT00).

Readers may contact the authors via e-mail at {bertino,castano,ferrarie}@dsi.unimi.it.
