Managing Content with Oracle XML DB An Oracle White Paper March 2005


Introduction
Benefits of using XML for Content Management
Where does XCM benefit your business?
Basic Requirements for successful XML Content Management
    Integration with popular authoring tools
    Integrated support for one or more standard content models
    Security
    Versioning
    Customizable meta-data
    Search and Retrieval
    Full programmatic access
    Integration with Multi-Format Publishing capabilities
    Performance, Scalability and Manageability
Introducing the Oracle XML DB repository
    Support for industry standard internet protocols
    Support for Industry-Standard Content Models
    Comprehensive Access Control
    Basic Version Control
    Custom Meta-data Management
    XML and Full-Text searching
    Full Programmatic Access
    Integration with Multi-Format Publishing Capabilities
Oracle XML DB Repository Architecture
    The XDB Database User and Database Schema
    The Protocol Servers
    The Hierarchical Index
A Sample XCM application with Oracle XML DB
    Architecture
    Enabling Custom Meta-data
    Versioning Support
    Multi Format publishing
    Access Control
XML Content Management Use Cases
    Emissions Reporting System
    Course Handbook Publishing System
    National Television Station Web Site
    Legislative Authoring Process
Conclusions


Managing Content with Oracle XML DB

INTRODUCTION

Organizations are now realizing that significant improvements in productivity and effectiveness can be made by capturing and making use of information that exists outside traditional IT systems. Typically, the majority of this kind of content is captured in documents that are created using office productivity tools such as word-processors, email or spreadsheets. In order to maximize the return from this ‘unstructured information’, organizations need technology to manage, search, publish and repurpose such information. While traditional relational databases have excelled at managing structured data, a platform that manages unstructured information with the same rigor has long seemed just out of reach.

Organizations have attempted to address this issue by investing significant resources in Content Management (CM) systems. Until the mid-1980s, Content Management systems tended to be developed in-house to meet the needs of very specific user communities. Examples of these systems include those used by the pharmaceuticals industry, where regulation leads to specific requirements related to the management of documents, and high-end publishing systems used by professional publishing companies to automate the process of publishing content.

From the mid-1980s until the explosive growth of the internet in the late 1990s, the evolution of the Content Management system was driven by the growth of the PC as the authoring environment. This led to Client-Server based Content Management systems that were interfaced, via custom code or application plug-ins, with the popular PC authoring environments such as MS-Word and Word-Perfect. These systems were typically installed at the department level, with little or no thought given to enterprise-level solutions.

With the explosion of the World Wide Web in the late 1990s, the primary tool for publishing content became the Web browser. Consequently, Content Management systems evolved into extremely complex applications that attempted to manage all of the material required to publish and manage an organization’s internal and external web sites. These systems tended to focus on the delivery and management of web site content, and less on supporting the process of authoring and developing the content. Nor did Web Content Management systems address many of the non-HTML documents in the organization.

The early 2000s saw the rise of enterprise-level content management systems that attempted to bring all content and all users under their purview. These systems have proven to be expensive to purchase, complex to deploy, and difficult to use and manage. Beyond departmental deployments, investment in these applications has failed to deliver the expected returns for most organizations. Their use has ended up being restricted to high-end use-cases, where the large investments needed for a successful implementation and deployment can be justified. At the lower end, users have been conditioned to manage their own content with ad-hoc mixtures of shared files, email attachments, or hardcopy-based processes. Between the highest and lowest ends there has been a gap that existing technologies and alternatives have not filled.

The rise of XML has started bridging this gap. XML is now the technology of choice for simple, powerful, standards-based, yet semantically rigorous content-management applications. Organizations are finding that XML Content Management (XCM) applications are easier and cheaper to build, maintain, and deploy than traditional high-end content management (CM) applications. They are also discovering that XCM realizes better interoperability, integrity and search than traditional CM. XML Content management also provides organizations with the flexibility needed to react to the never ending stream of media that have been used to deliver information to customers and partners.

Over the past 10 years the direct-communications industry has evolved from traditional direct-mail pieces to CD to on-line delivery. At the same time publishing has evolved from traditional print to electronic delivery with diverse formats such as HTML, PDF and Flash. Each of these revolutions has introduced its own standards, and with traditional content management techniques it has proven difficult to re-purpose content originally developed for one medium for use in another.

XCM systems address this issue by managing content using XML. XML has the potential for bringing to unstructured information what SQL brought to structured data – a robust standard that covers a wide set of use-cases with broad vendor support.

Benefits of using XML for Content Management

From a content-creation perspective XCM has the advantage that it is based on an open industry standard that is understood by a variety of tools from different vendors. This allows organizations to choose the correct tool for the kind of content they are developing. The choices available today vary from popular word processing packages like Microsoft’s Office 2003, to light-weight, web-enabled XML editors like Altova’s Authentic, to specialized custom XML editors like X-Metal from Blast Radius, EPIC from Arbortext and Frame from Adobe. The leading tools in this arena all support using the W3C’s XML Schema language to define a set of rules to which content must adhere.

From a content-management perspective there are several benefits of XML. Its self-describing nature allows XCM systems to provide extremely powerful search and retrieval capabilities. In addition, the emergence of XML-based query syntaxes, such as XPath and XQuery from the World Wide Web Consortium (W3C) and SQL:2003 from ANSI/ISO, has greatly increased the scope of an XCM system. An XCM system that understands and enforces the XML Schema data model, and can support ad-hoc, semantically rich queries using a standard syntax, allows an organization to rigorously create as well as precisely locate any document or fragment of unstructured information.
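To make the idea of locating a fragment with a standard path syntax concrete, the following is a minimal sketch in Python that uses the limited XPath support of the standard library. The contract document, its element and attribute names, and the `find_clauses` helper are all hypothetical and exist only for illustration; they are not taken from any Oracle XML DB sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical contract document; element and attribute names are illustrative only.
CONTRACT = """
<contract id="C-100">
  <party role="supplier">Acme Power</party>
  <party role="customer">City Utilities</party>
  <clause type="service-level"><title>Availability</title></clause>
  <clause type="payment"><title>Invoicing</title></clause>
</contract>
"""

def find_clauses(xml_text, clause_type):
    """Return the titles of all clauses of the given type."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a subset of XPath, including attribute predicates.
    path = f".//clause[@type='{clause_type}']/title"
    return [title.text for title in root.findall(path)]

print(find_clauses(CONTRACT, "service-level"))
```

Because the markup describes itself, the query selects the service-level clause by its meaning rather than by its position in the file, which is the property the paragraph above relies on.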

Finally, the decoupling of presentation and content afforded by XML, combined with a maturing of XML-aware publishing systems, means that it is possible to transform XML into a variety of delivery formats, including HTML, PDF, WAP and press-ready formats. Many of these publishing systems now offer on-demand generation of the required output, meaning that content is always up to date.

Where does XCM benefit your business?

Many organizations have now realized that XML is the most effective technology available for managing content. These organizations are using XML to author and publish a wide range of content including, but not limited to:

• Product Documentation and Technical Manuals. XML is particularly appropriate for documentation authoring as it enables powerful features like context-sensitive search. XML also makes it easier to analyze this kind of document for regulatory compliance. Use of an XCM system makes it easy to search both metadata and content. An XCM system also simplifies interfacing the content management system with multi-channel publishing systems.

• Product Specification and Design Documents. This class of document is similar to product documentation and technical manuals, except that it is typically intended for internal, as opposed to external, consumption. An area of interest for organizations that need to generate or consume this kind of document is the emergence of XML standards like the OASIS CGM effort. Organizations that adopt XML and XCM will be well positioned to ensure compliance with these efforts.

• News and Information Services. One of the major trends in news and information services is the move to XML-based standards such as RSS and NewsML for real-time and near-real-time publishing of information. Using XML to author and manage content simplifies the process of publishing, ingesting and aggregating this kind of data. Emerging standards like RDF help XCM systems take full advantage of semantic content.

• Financial Reporting. Many national and international organizations, such as the US Securities and Exchange Commission, the UK Financial Services Authority and the Japanese National Tax Agency (Zeimusho), are now mandating much stricter controls on the process of financial and tax reporting. Much of this effort is focused on the use of the eXtensible Business Reporting Language (XBRL) as the medium for publishing and analyzing financial reports. For an organization impacted by XBRL reporting mandates, the use of an XCM system can significantly reduce the amount of effort required to ensure compliance. Organizations that consume XBRL reports can also realize significant benefits by adopting an XCM system to manage the large volumes of XML content that they will receive.

• Contracts. Many organizations are starting to adopt XML as the basis for drafting contracts. A typical contract is a mixture of structured and unstructured information. Depending on the nature of the contract and the nature of the business, a contract can be a very rigid document or an extremely volatile one. For instance, a typical Non-Disclosure Agreement is a small document with little variability and clearly defined content (the parties involved, the effective date, etc.), while the contract for a major item like a power station may run to several thousand pages and contain large degrees of variability. Using XML and XCM to manage the process of drafting a contract allows organizations to gain control over the process, and ensure that all required components are present in the final document. The use of XML and XCM also makes it possible to clearly identify the parts of the contract that have an impact on future business operations, such as service level agreements.

• Web Site Content. The expectations placed on an organization’s web site are changing. Static and dynamic pages alone can no longer meet the needs of the organization. More and more, push technologies such as RSS are being used to provide real-time content. Adopting XML and XCM provides organizations with the ability to develop solutions that provide the flexibility needed to meet these requirements.

• Legislative Documents and Legal Publishing. The process of drafting legislation is extremely complex and requires a high degree of rigor in order to ensure that costly mistakes do not occur. XCM allows the organizations responsible for supporting the legislative bodies to deliver solutions that improve the authoring process by reducing the time required to review and publish statutes, bills or amendments. XML also allows the Legal Publishing industry to develop services that make it easy to quickly and accurately locate the documents that contain the relevant legal information.

Basic Requirements for successful XML Content Management

XCM systems are currently under development, or have been successfully deployed in diverse industries including News publishing, Legal and Scientific publishing, Financial Services and so on. From a consideration of these use-cases, successful XCM applications must provide the following basic platform functionality:

Integration with popular authoring tools

It is critical that an XCM platform not place unnecessary restrictions on the choice of authoring tools. Popular tools currently used to create and edit content include Microsoft Office and products from vendors like Altova, Macromedia and Adobe. In order to be successful an XCM platform must integrate seamlessly with these tools.

Integrated support for one or more standard content models

One of the major problems facing an organization that requires effective content management is the definition and subsequent enforcement of a consistent content model. XML has certain inherent advantages as a content model, such as being self-describing, and open. Combining the base XML standard with XML Schema greatly enhances an XCM system’s ability to understand and enforce the content model.

XML Schema is the official data definition language of the W3C. It has the full support of most major software corporations. It is used to specify the structure, content, and certain semantics of a set of XML documents. The standard is described by the W3C in http://www.w3.org/TR/xmlschema-0/. Since an XML Schema is used to define a class of XML documents, the term “instance document” is often used to describe an XML document that conforms to a particular XML Schema. XML Schema is quickly superseding the traditional Document Type Definition (DTD) as the standard way of defining the content of XML documents.

XML Schema is used as a mechanism for defining a set of rules about the content of an XML document. XML Schema aware editors are then able to ensure that any content that is created conforms to the rules defined in the XML Schema. An XCM platform that integrates with the XML Schema standard is able to ensure that all content stored with the XCM system is compliant with the content models defined by the organization.


Security

In order to allow organizations to manage content effectively, the XCM system must make it easy to define the rules which specify what operations a user is allowed to perform on a piece of content. The system must then be able to enforce those rules efficiently. The implementation must be flexible and scalable, providing support for large numbers of users and large numbers of documents. It must also ensure that there are no back-doors that can be used to bypass the rules that have been specified. This allows organizations to ensure compliance with business policies and regulatory controls.

Versioning

One of the key features required to manage content is an effective versioning model. This allows content to be preserved at a particular point in time, and ensures that content is protected from unintended modification. Once a version has been created the information contained in that version of the document is effectively frozen. As with Access Control, versioning is particularly important in situations where compliance and other regulatory forces require organizations to be able to show that documents have not been accidentally or maliciously altered in any way.

Customizable meta-data

In order to manage content effectively, organizations need to be able to tag content with custom meta-data that allows them to implement their business requirements. The uses for custom meta-data vary greatly, from tracking ownership, to defining validity dates, to improving search, to implementing business processes like compliance, workflow, or rights management.

Search and Retrieval

One of the main benefits of adopting an XCM system is having a single source for all content, whether XML Schema based or not, that scales easily as the volume of information grows. However, as the volume of content being managed by the XCM platform increases, the system must provide the indexing and search capabilities needed to allow users to find content efficiently.

It is important that search be tightly integrated with the security model, so as to ensure that users are only able to access content they have been granted access to. Search must also be tightly integrated with the structure of the repository, so that users can use location within the folder hierarchy as a method of restricting a search.

Full programmatic access

It is likely that most organizations will need to customize the out-of-the-box features of an XCM system. This makes it very important that the XCM platform provide programmatic APIs that can be used to easily extend or complement the basic functionality offered by the platform. Given the dynamic nature of today’s business environment it is important that the APIs provided by the XCM system do not constrain the organization’s ability to make or change components of its technology stack.

Integration with Multi-Format Publishing capabilities

In addition to supporting the creation and lifecycle management of content, an XCM system must allow content to be published in a variety of formats. One of the advantages of the open nature of XML is that it ensures that the majority of popular publishing systems can work directly with content stored in an XCM system.

Performance, Scalability and Manageability

In order to be successful the XCM platform must provide all of the functionality outlined above. However, it must also provide the necessary performance, scalability and manageability features, and be capable of growing with the organization.

The rest of this paper will discuss how the Oracle Database 10g Release 2 XML DB Repository can be used as the basis for a light-weight Content Management System. Integral to XML DB is an Internet-enabled, WebDAV-compliant XML Content Repository. We will discuss the Repository with special focus on:

• Understanding the functionality provided by the Oracle XML DB Repository

• Explaining the Architecture of the Oracle XML DB Repository

• Developing a simple content management application with Oracle XML DB

• Reviewing some real-life usage scenarios and success stories


Introducing the Oracle XML DB repository

The Oracle XML DB Repository extends the capabilities of Oracle Database 10g Release 2, adding the features required to build an XCM system. These include:

Support for industry standard internet protocols

The Oracle XML DB repository currently provides full support for the HTTP, WebDAV and FTP protocols. This allows Microsoft’s Windows and other operating systems that include support for the WebDAV standard to interact with content stored in the Oracle XML DB repository without requiring any additional software be installed on the client.

As a result, content creators and editors are free to use their tool of choice when adopting the Oracle XML DB repository as the foundation of an XCM system. Standard content creation tools, such as Microsoft’s Office products, work seamlessly with content managed by the Oracle XML DB repository.

The WebDAV standard

The IETF WebDAV standard defines a set of extensions to the HTTP protocol that allow an HTTP server to act as a file server. The WebDAV standard allows a DAV-enabled editor to interact with an HTTP server in the same way that a normal editor interacts with a file system.

The WebDAV standard uses the term resource to describe a file or a folder. Every resource managed by a WebDAV server is identified by a URL. This terminology is adopted by the Oracle XML DB repository. In Oracle XML DB each resource is managed as an XML document.
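As an illustration of how a client addresses a resource by URL, the sketch below builds (but does not send) a WebDAV `PROPFIND` request with Python's standard library. `PROPFIND`, the `Depth` header and the `DAV:` namespace come from the WebDAV standard; the host, port and folder path are placeholders, and the `build_propfind` helper is a hypothetical name introduced here.

```python
import urllib.request

def build_propfind(url, depth="1"):
    """Build (but do not send) a WebDAV PROPFIND request for a resource URL."""
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PROPFIND",
        headers={"Depth": depth, "Content-Type": "application/xml"},
    )

# Hypothetical repository URL; host, port and path are placeholders.
req = build_propfind("http://localhost:8080/public/docs/")
print(req.get_method(), req.full_url, req.get_header("Depth"))
```

A `Depth` of `1` asks for the properties of the folder and of its immediate children, which is roughly what a file browser needs in order to render a folder listing.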

The following screen shots show use of the WebDAV protocol to access content stored in the Oracle XML DB repository directly from the familiar environment of Microsoft Windows Explorer. No additional Oracle or Microsoft software, nor any complex middleware, is required in order to enable this capability.


The first screenshot shows Microsoft Windows Explorer being configured to access the Oracle XML DB repository, using the Windows ‘My Network Places’ feature to map a folder to the root of the repository.

Note that all the user is required to know is the URL of the repository and a valid user name and password. As can be seen below, once the Network Place has been defined, opening it in Microsoft’s Windows Explorer will show what appears to be a conventional Windows folder.


Documents stored in these folders can be accessed and updated just like any other document. In the next screen Microsoft’s Word 2003 product is seen saving content directly into a folder which is managed by Oracle Database 10g Release 2.

Web Services Support

As well as supporting HTTP, WebDAV and FTP, the Oracle XML DB repository can be accessed using Web Services. This creates a programming-language-independent model for operating on content in Oracle XML DB. This is important as many content-creation products, such as Altova’s StyleVision product and Microsoft’s Office 2003 Suite, already provide integrated support for using Web Services to access, update and manipulate content.

Resources

Each piece of content stored in the Oracle XML DB repository is associated with a resource. The content of a resource is defined by the WebDAV standard. The resource documents contain standard meta-data such as Display Name, Creator, Owner, Creation Date and Last Modification Date. The resource document is compliant with the XML Schema registered under the URL http://xmlns.oracle.com/xdb/XDBResource.xsd.


The following screenshot shows the XML representation of a resource document.

Support for Industry-Standard Content Models

The Oracle XML DB Repository uses XML Schema to define Content Models. This means that the Content Management platform and the content creation tools are able to share a single definition of what constitutes a valid document, making it impossible to store invalid documents. In addition, the level of definition provided by an XML Schema is sufficiently detailed to allow Oracle XML DB to optimize the searching, storage and management of XML documents that conform to an XML Schema.

With Oracle XML DB, XML Schema can be used to define what the contents of an XML document must look like. It can also be used to define what additional meta-data must exist for any document managed by the Oracle XML DB repository.

Comprehensive Access Control

The WebDAV standard includes a powerful security model based on the concept of using an Access Control List (ACL) to control access to content. An Access Control List consists of one or more Access Control Entries (ACE). Each ACE can grant or revoke a set of permissions for a particular principal (a user of the system). Since an ACL can contain multiple ACEs, a single ACL can grant or revoke one or more privileges for one or more principals.

The Oracle XML DB Repository enforces access control using this approach. Each document in the Oracle XML DB repository is secured using an ACL. Whenever an attempt is made to perform an operation on a document the ACL is used to determine whether the user is allowed to perform that operation. Since access control is defined at the document level different security rules can be applied to documents that are associated with the same XML Schema.
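The relationship between ACLs, ACEs, principals and privileges can be sketched as a small model in Python. This is not Oracle's implementation: the `ACE` class, the `is_allowed` function, the privilege names and the rule that the first matching entry decides are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ACE:
    principal: str          # user or role name
    grant: bool             # True grants, False denies
    privileges: frozenset   # privilege names covered by this entry

def is_allowed(acl, principals, privilege):
    """Return True if the first matching ACE grants the privilege.

    Assumption: entries are evaluated in order and the first ACE that
    mentions the privilege for one of the caller's principals decides.
    Anything not explicitly granted is denied.
    """
    for ace in acl:
        if ace.principal in principals and privilege in ace.privileges:
            return ace.grant
    return False

# Hypothetical ACL: editors may read and update, SCOTT is denied update.
acl = [
    ACE("SCOTT", False, frozenset({"update"})),
    ACE("EDITORS", True, frozenset({"read", "update"})),
]

print(is_allowed(acl, {"SCOTT", "EDITORS"}, "read"))    # True
print(is_allowed(acl, {"SCOTT", "EDITORS"}, "update"))  # False
```

Because the ACL is attached to the document rather than to its schema, two documents of the same type can carry different lists, which matches the document-level control described above.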


The ACL mechanism is implemented using the row-level security feature of Oracle Database 10g. This means that the process of evaluating access rights is highly scalable, since it is part of the underlying engine. It also means that ACL-based security is enforced regardless of whether content is accessed via the XML DB repository protocols or directly from SQL.

Access Control Lists

In Oracle XML DB each Access Control List (ACL) is represented as an XML document. An Access Control List is simply an XML document that conforms to, and identifies itself as a member of, the class defined by the Oracle XML DB ACL XML Schema. The principals referred to in an ACE can be a database user, a database role, or an Oracle Internet Directory (OID) User or Group. The Oracle XML DB repository ships with four standard or bootstrap ACLs. These documents can be found in the folder /sys/acls. New ACLs can be created by simply inserting ACL documents into the Oracle XML DB repository.

The following screenshot shows an example of an Oracle XML DB ACL document:


Basic Version Control

The Oracle XML DB repository includes a simple versioning model based on the WebDAV standard. The current implementation provides linear versioning. The version management system is initiated and controlled via simple PL/SQL procedures.

The first step in creating a versioned document is to initiate the versioning process. Once versioning has been initiated the original version of the document can no longer be modified. In order to update the document it is necessary to perform a check-out operation. Once checked-out the document can be updated multiple times. Once the necessary modifications have been made a check-in operation is required to mark the updated content as the next protected version of the document.

If during the editing process it is decided that the new version of the document is not required then an uncheck-out operation can be used to revert to the previous version of the document.
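The check-out, update, check-in and uncheck-out cycle described above can be modelled as a small state machine. The Python sketch below is a toy model only: the `VersionedDocument` class and its method names are hypothetical, and it does not call the PL/SQL procedures that control versioning in the repository.

```python
class VersionedDocument:
    """Toy model of linear versioning: versions are frozen, edits need a check-out."""

    def __init__(self, content):
        self.versions = [content]   # version 1 is frozen once versioning starts
        self.working = None         # editable copy, present only while checked out

    def check_out(self):
        if self.working is not None:
            raise RuntimeError("document is already checked out")
        self.working = self.versions[-1]

    def update(self, content):
        if self.working is None:
            raise RuntimeError("check the document out before updating it")
        self.working = content

    def check_in(self):
        if self.working is None:
            raise RuntimeError("document is not checked out")
        self.versions.append(self.working)
        self.working = None
        return len(self.versions)   # number of the newly created version

    def uncheck_out(self):
        if self.working is None:
            raise RuntimeError("document is not checked out")
        self.working = None         # discard edits; latest version is unchanged

doc = VersionedDocument("draft")
doc.check_out()
doc.update("draft, revised")
doc.update("draft, revised twice")
print(doc.check_in())          # 2
doc.check_out()
doc.update("unwanted change")
doc.uncheck_out()
print(doc.versions[-1])        # draft, revised twice
```

Keeping the working copy separate from the list of versions is what makes each checked-in version immutable: several updates between check-out and check-in produce only one new version.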

The versioning mechanism can be easily exposed to content creation applications via a simple Web Services model.

[Figure: linear version history (Version #1, Version #2, … Version n), with Check-out, Update and Check-in operations.]


Custom Meta-data Management

In Oracle Database 10g Release 2 the Oracle XML DB repository has been enhanced with the ability for organizations to extend the basic meta-data tracked by the resource document with custom meta-data specific to their business needs. This meta-data is also stored as one or more XML documents, which means all of the power of XML DB can be used to query or modify this extended meta-data and thereafter make appropriate decisions on the underlying content.

XML and Full-Text searching

While optimized for managing schema-based content, the Oracle XML DB repository also provides full content management facilities for other kinds of content, including non-schema-based XML and non-XML content. New enhancements introduced as part of Oracle Database 10g Release 2, such as support for schema-based meta-data, greatly improve the ability of the Oracle XML DB Repository to provide effective management of XML content. The Oracle XML DB Repository is also able to fully leverage the capabilities of Oracle Text for indexing and searching XML and non-XML content.

Relational database technology has traditionally been considered inefficient for managing hierarchical structures and performing the kind of operations required by content management applications. The Oracle XML DB repository includes a new Hierarchical Index that ensures that path-based access is as efficient as primary key access, and that folder traversal operations are as efficient as index range scans. This index is totally transparent to the end-user and application developer. The index ensures that path access and folder traversal operations take place at speeds that are comparable to or faster than conventional file systems.

Full Programmatic Access

Oracle XML DB also provides the application programmer with direct access to the repository from both SQL and PL/SQL. This means that it is possible to develop applications to manipulate content managed by the Oracle XML DB repository from any language that supports the invocation of SQL (e.g. Java, C, C++, C#, Visual Basic and J#).

Oracle XML DB allows the full capabilities of SQL to be used to access and manipulate the content stored in the Oracle XML DB repository. Oracle XML DB introduces two public views, RESOURCE_VIEW and PATH_VIEW, which provide direct SQL-based access to the full contents of the repository. New operators, such as UNDER_PATH and EQUALS_PATH, make it possible to restrict a SQL operation against these views to a document or a set of documents located within a particular folder inside the repository.

Integrated support for Web Services makes it easy to develop Web Services based on the Oracle XML DB repository, allowing any Web Services enabled client to integrate with Oracle XML DB.

Emerging industry standards such as JSR 170, the Java Content Management API, will offer developers alternative ways of accessing and manipulating content managed by the Oracle XML DB Repository.

Integration with Multi-Format Publishing Capabilities

In addition to supporting the creation and lifecycle management of content, an XCM system must allow content to be published in a variety of formats. The open nature of XML ensures that the majority of popular publishing systems work directly with content stored in the Oracle XML DB repository.

Oracle XML DB provides a natively-compiled XSLT Virtual machine which delivers highly optimized XSLT processing. This can be used to transform XML managed by the Oracle XML DB Repository into other XML formats and XHTML. External processing, using any PDF rendering application of the user’s choice, can transform content managed by the XML DB into PDF and other formats.

The Oracle XML DB Repository also enables dynamic publishing of relational data via XML Views. Each row in an XML View can be exposed as a document in the XML Repository. Data in the row can then be accessed by simply opening the associated XML document. XML Views are based on the concept of transforming relational data into XML using the SQL:2003 SQL/XML operators.

Oracle XML DB Repository Architecture

As we have seen, the Oracle XML DB Repository is an integrated component of Oracle Database 10g Release 2. Internally, there are three major components to the Repository:

• The XDB database user and database schema

• The protocol servers

• The hierarchical index

The XDB Database User and Database Schema

The XDB schema is a locked database account that contains all of the data required to provide the repository functionality. Under no circumstances should any attempt be made to access or update the content of the tables in this schema.

The main table in the XDB schema is XDB$RESOURCE. XDB$RESOURCE is an XMLType table that contains one row for each resource managed by the repository. A resource is an XML document that conforms to the XML Schema registered under the URL http://xmlns.oracle.com/xdb/XDBResource.xsd.

A resource document contains all of the system-maintained meta-data that is associated with each piece of content managed by the repository. Resource documents are made public via the RES columns of RESOURCE_VIEW and PATH_VIEW.

Other tables in the XDB database schema are used to track file / folder relationships, implement the Hierarchical Index, store ACLs and manage the meta-data required to provide optimized storage of schema-based XML.

The following diagram shows the main tables in the XDB database schema.

As well as managing meta-data, the XDB$RESOURCE table contains a LOB column that is used to store the content of any documents that are not associated with a registered XML Schema. For schema-based XML content, XDB$RESOURCE contains the system-maintained meta-data and a reference to the row in the XMLType table that manages the content. Triggers on the default table ensure that the Repository meta-data is maintained in sync with operations that take place on rows stored in the default table.

Extended meta-data is also stored outside of the repository. When an application extends the basic meta-data tracked by Oracle XML DB, it must register an XML Schema that describes the additional metadata. A single XMLType table is used to manage each category of extended metadata. A resource can have multiple sets of extended meta-data associated with it.

[Figure: the XDB database schema, showing PATH_VIEW and RESOURCE_VIEW, the internal tables XDB$RESOURCE, XDB$H_INDEX (hierarchical index), XDB$SCHEMA and XDB$ACL, the default tables, and the extended meta-data tables.]

XML DB provides a number of PL/SQL packages that can be used to manipulate the repository. These packages include:

• DBMS_XDB: Create, move, lock, secure and delete files and folders.

• DBMS_XDB_VERSION: Used to initiate and manage versioning.

• DBMS_XDBZ: Enable and disable ACL-based security and other repository functions.

• DBMS_XDBT: Used to initiate and manage the Oracle Text based index of the Oracle XML DB repository.

The Protocol Servers

Oracle XML DB builds on the performant and scaleable architecture of Oracle Database 10g Release 2, allowing it to manage HTTP and FTP requests in the same way that it can manage SQL requests when configured for shared server mode. This allows the database to act as an HTTP server and FTP server as well as a server for SQL*Net requests.

When an HTTP or FTP request is received by the protocol servers the request is translated into operations on the tables in the XDB database schema.

[Figure: an HTTP request and HTTP response handled by the TNS Listener, the Dispatcher and the Shared Servers.]

Since the HTTP, FTP and WebDAV protocols were designed with document-centric operations in mind, they are typically more efficient than Oracle Net for manipulating large volumes of content.

The protocols are fully internationalized. This allows documents to be loaded into the repository from clients using different character sets. While content is loaded into the Oracle XML DB repository, it is converted into the database character set. When documents are retrieved from the repository they are converted back into the client’s character set. With HTTP and WebDAV this process is automatic, based on the mechanisms defined by the HTTP protocol. With FTP, quote commands are provided that allow the client to identify its character set.
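The conversion between client and database character sets can be illustrated with a short Python sketch. It only models the round trip described above; the function names are hypothetical and the choice of UTF-8 as the database character set is an assumption made for the example.

```python
DATABASE_CHARSET = "utf-8"    # assumed database character set for this example


def load(client_bytes, client_charset):
    """Convert content from the client character set to the database character set."""
    return client_bytes.decode(client_charset).encode(DATABASE_CHARSET)


def retrieve(stored_bytes, client_charset):
    """Convert stored content back into the character set of the requesting client."""
    return stored_bytes.decode(DATABASE_CHARSET).encode(client_charset)


text = "Gr\u00f6\u00dfe: 10 \u20ac"           # contains o-umlaut, sharp s and the euro sign
original = text.encode("iso-8859-15")          # bytes as sent by one client
stored = load(original, "iso-8859-15")         # bytes as held in the database
round_trip = retrieve(stored, "iso-8859-15")   # same client reads the document back
other_client = retrieve(stored, "cp1252")      # a client using a different character set
```

Because the stored form is independent of the client that loaded it, a second client can retrieve the same document in its own character set.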

The protocol servers follow the transaction semantics of a file system. Each operation performed using the protocol servers is considered to be an atomic transaction.

Leveraging the proven scalability and reliability of Oracle’s existing infrastructure allows Oracle XML DB to deliver extremely powerful new capabilities with little or no increase in the complexity of the deployment environment. No additional software needs to be deployed or managed in order to deliver an XCM system based on the Oracle XML DB repository. Very little additional configuration is required in order to deploy this functionality.

The Hierarchical Index

The hierarchical index and the associated XDB$HI_TABLE are used to manage the information needed to allow the database to support using a file/folder mechanism to organize content. The hierarchical index is automatically created and maintained for resources participating in the Oracle XML DB Repository, and is not exposed to the application developer.
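The idea of indexing folder membership so that a path can be resolved one component at a time can be sketched in Python. This is a toy model under stated assumptions: it does not describe the internal structure of the Oracle hierarchical index, and the class name, method names and example paths are hypothetical.

```python
class FolderIndex:
    """Toy index that maps each folder id to its child names and their resource ids."""

    ROOT = 0

    def __init__(self):
        self.children = {self.ROOT: {}}   # folder id -> {child name: child id}
        self.next_id = 1

    def add(self, path):
        """Register every component of an absolute path; return the id of the last one."""
        current = self.ROOT
        for name in filter(None, path.split("/")):
            entries = self.children.setdefault(current, {})
            if name not in entries:
                entries[name] = self.next_id
                self.next_id += 1
            current = entries[name]
        return current

    def resolve(self, path):
        """Return the resource id for a path, or None if the path does not exist."""
        current = self.ROOT
        for name in filter(None, path.split("/")):
            current = self.children.get(current, {}).get(name)
            if current is None:
                return None
        return current

    def list_folder(self, path):
        """Return the sorted names of the direct children of a folder."""
        folder = self.resolve(path)
        if folder is None:
            return []
        return sorted(self.children.get(folder, {}))


index = FolderIndex()
po_id = index.add("/home/SCOTT/purchaseOrders/po1.xml")
index.add("/home/SCOTT/purchaseOrders/po2.xml")
index.add("/sys/acls/all_all_acl.xml")
```

Resolving a path costs one keyed lookup per path component, and listing a folder reads only the entries of that folder, which is the behaviour the text attributes to path access and folder traversal.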

A Sample XCM application with Oracle XML DB

This section discusses the architecture and technology for a simple XML Content Management application built using the Oracle XML DB repository. The application demonstrates some of the different techniques that can be used to create a simple HTML-based user interface for the Oracle XML DB Repository. The interface allows an end-user to browse a folder hierarchy and perform basic content management operations, such as version, check-in, check-out and publish. The application also demonstrates how the extensible meta-data feature of Oracle Database 10g Release 2 can significantly improve the searchability and manageability of content stored in the Oracle XML DB repository.

Architecture

The application is implemented as a series of Java Servlets. In this case the Servlets are hosted by the XML DB HTTP server and the Oracle database. Each Servlet generates an XML document which represents the result of the operation it performs. The Java Servlets use a mixture of JNDI, JDBC and PL/SQL to access the contents of the XML DB repository.

Simple operations such as lookup, list directory, folder, delete and save make use of a JNDI implementation which uses JDBC to access the RESOURCE_VIEW and PATH_VIEW and invoke the DBMS_XDB package. JNDI is a very natural API for this kind of operation. More advanced operations such as version and publish are performed using methods that use JDBC to directly invoke the appropriate PL/SQL packages.

The following diagram shows the overall architecture of the application:

[Figure: application architecture, showing the XML CM Application (Java Servlets and XSL style sheets), the Oracle XML DB HTTP Server, JNDI, JDBC, the XSLT VM, and RESOURCE_VIEW, PATH_VIEW, DBMS_XDB and DBMS_XDB_VERSION.]

The key components of the application are as follows:

• The FolderProcessor Servlet: This Servlet is the basis of the application. It displays the contents of a Folder. It allows a user to navigate the folder tree, select items in a folder and perform operations on them.

• The UploadFiles Servlet: This Servlet makes use of a Multipart Mime processor to allow uploading of documents via the HTML user interface. It enables the automatic versioning of documents during upload processing.

• The NewFolder Servlet: This Servlet allows the creation of new folders.

• The SelectionProcessor Servlet: This Servlet allows basic content management actions to be performed on selected resources. It supports operations like copy, move, link, delete, publish, set permissions, lock, unlock, version, check-in, check-out as well as zip and unzip.

• The VersionHistory Servlet is used to display the content of versioned resources.

The application is extremely simple to deploy. All of the components (Java Source, XSL style sheets and background images) are simply stored as resources in the XML DB repository. They can be loaded into the repository using WebDAV. In order to deploy the application all that is required is to compile the servlet code using the Oracle database JVM and then register the virtual path for each servlet with the Oracle XML DB HTTP server. All of these operations are completed using simple PL/SQL procedures. It should be noted that since the application is a pure-Java application, it could also be deployed using a Java container such as Oracle’s OC4J or the Oracle Internet Application Server.

The application makes extensive use of the XSLT rendering capabilities of Oracle XML DB. The entire user-interface is generated using a set of simple XSLT style-sheets which transform the XML documents generated by each servlet into the HTML pages seen by the user. The “View XML” and “View XSL” buttons at the bottom of each page allow direct access to the raw XML and associated XSL style sheet.

The following screen shot shows the main folder browsing interface.

From this screen users can use the drop-down list to perform all of the basic operations required to manage content in an XCM system, including navigating the folder hierarchy and controlling access rights and versioning.

The following screen shots show the XML Document generated by the FolderProcessor Servlet and the XSLT Style sheet that is used to generate the User Interface from the generated XML:

This approach has the advantage that the definition and generation of the look-and-feel of the application is totally abstracted from the XML that defines which objects appear on each page. The appearance of the application can be altered by simply replacing the XSL style sheets.

Enabling Custom Meta-data

The Oracle XML DB Repository allows developers to define multiple classes of extended meta-data. A particular resource can be associated with one or more classes of extended meta-data. There is an XML Schema and XMLType table for each class of extended meta-data.

The Repository manages extended data in a similar manner to the content of a schema-based XML document. The extended meta-data is stored as rows in an XMLType table. The Repository then keeps track of which rows in the XMLType table are associated with which resources. Whenever a resource’s properties are accessed the extended metadata appears as part of the XML representation of the resource.
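The way extended meta-data appears as part of the XML representation of a resource can be sketched with Python's standard XML library. This is a simplified illustration, not the repository's actual output: the resource document is reduced to a single DisplayName element, the helper function is hypothetical, and the sample title and file name are invented for the example.

```python
import xml.etree.ElementTree as ET

RES_NS = "http://xmlns.oracle.com/xdb/XDBResource.xsd"
IMG_NS = "http://xmlns.oracle.com/demo/imageMetadata"


def resource_with_metadata(display_name, metadata_rows):
    """Build a simplified resource document and append each extended meta-data row to it."""
    resource = ET.Element(f"{{{RES_NS}}}Resource")
    ET.SubElement(resource, f"{{{RES_NS}}}DisplayName").text = display_name
    for row in metadata_rows:
        # Each row comes from the table holding one class of extended meta-data.
        resource.append(row)
    return resource


# One row of the (hypothetical) image meta-data class, associated with the resource.
image_metadata = ET.Element(f"{{{IMG_NS}}}imageMetadata")
ET.SubElement(image_metadata, f"{{{IMG_NS}}}Title").text = "Harbour at dusk"

resource = resource_with_metadata("harbour.jpg", [image_metadata])
title = resource.find(f"{{{IMG_NS}}}imageMetadata/{{{IMG_NS}}}Title").text
```

A client that reads the resource sees the system-maintained properties and the extended meta-data in one document, even though they are stored separately.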

The sample XML Content Management application demonstrates the use of the extended meta-data feature of the Oracle XML DB repository. In this example extended meta-data is used to provide improved facilities for managing pictures taken using digital cameras.

Most modern digital cameras now support the Exchangeable Image File Format (EXIF). The EXIF standard defines how a digital camera can embed information about the settings that were used to take a picture into the JPEG files generated when a picture is taken. In Oracle Database 10g Release 2 the Intermedia component includes support for the EXIF standard, allowing it to extract EXIF information from a JPEG file. The Intermedia image processor returns this information as an XML document.

Once the extended metadata demonstration has been installed, every time a JPEG image is loaded into the repository, Oracle XML DB invokes the Intermedia image processor to extract any EXIF information contained in the image. It takes the XML document generated by the Intermedia processing and uses the extended metadata management capabilities of Oracle Database 10g Release 2 to store the EXIF information as part of the resource.

Due to the complex nature of the EXIF processing, it is desirable to perform the extraction of the meta-data asynchronously. This ensures that the response time of the protocol servers is not impacted by the overhead of the meta-data extraction process.
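The asynchronous pattern can be sketched in Python with an in-memory queue and a background worker. This is a minimal model of the idea only: the queue stands in for a real message queue, extract_metadata is a hypothetical placeholder for the expensive extraction step, and all names are assumptions.

```python
import queue
import threading

events = queue.Queue()    # stands in for the message queue
metadata_store = {}       # resource id -> extracted meta-data


def extract_metadata(content):
    # Hypothetical stand-in for the real (expensive) EXIF extraction.
    return {"size_in_bytes": len(content)}


def store_resource(resource_id, content, repository):
    """Store the content and queue a notification; no extraction happens here."""
    repository[resource_id] = content
    events.put((resource_id, "CREATE"))


def notification_worker(repository):
    """Consume notifications and maintain the meta-data in the background."""
    while True:
        message = events.get()
        if message is None:               # sentinel: stop the worker
            events.task_done()
            break
        resource_id, operation = message
        if operation == "DELETE":
            metadata_store.pop(resource_id, None)
        else:
            metadata_store[resource_id] = extract_metadata(repository[resource_id])
        events.task_done()


repository = {}
worker = threading.Thread(target=notification_worker, args=(repository,), daemon=True)
worker.start()

store_resource("res-1", b"\xff\xd8 fake jpeg bytes", repository)   # returns immediately
events.join()       # wait until the queued notification has been processed
events.put(None)
worker.join()
```

The call that stores the content only enqueues a message, so its response time does not depend on how long extraction takes.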

The following steps are required to enable custom metadata management in the Oracle XML DB repository.

• Create an XML Schema that provides the definition of the extended meta-data as desired by the user. In the case of EXIF, a basic XML schema for the XML representation of the EXIF data is shipped with the Oracle 10g Release 2 Intermedia product. The custom metadata demonstration application extends this schema, defining a simple XML Schema that defines an element called imageMetadata with child elements RESID, imageURL, Title, Description, and exifMetadata. The element exifMetadata is imported from the EXIF XML schema supplied as part of Intermedia. The schema begins as follows:

    <xs:schema targetNamespace="http://xmlns.oracle.com/demo/imageMetadata"
               elementFormDefault="qualified"
               attributeFormDefault="unqualified"
               xmlns:xdb="http://xmlns.oracle.com/xdb"
               xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:exif="http://xmlns.oracle.com/ord/meta/exif"
               xmlns="http://xmlns.oracle.com/demo/imageMetadata">
      <xs:import namespace="http://xmlns.oracle.com/ord/meta/exif"
                 schemaLocation="http://xmlns.oracle.com/ord/meta/exif"/>
      <xs:element name="imageMetadata" type="imageMetadataType"/>
      <xs:complexType name="imageMetadataType">
        <xs:sequence>
          <xs:element name="RESID" type="xs:string"/>
          <xs:element name="imageURL" type="xs:anyURI"/>
          <xs:element name="Title" minOccurs="0">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:maxLength value="80"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
          <xs:element name="Description" type="xs:string" minOccurs="0"/>
          <xs:element ref="exif:exifMetadata"/>
          <xs:element name="Thumbnail"

• Register the XML Schema with XML DB as an Extended Metadata XML Schema.

• Add code that will ensure that the extended meta-data is captured and stored whenever a resource is created, updated or deleted. The EXIF metadata application uses Oracle 10g R2 Advanced Queuing (AQ) to asynchronously invoke a PL/SQL notification procedure whenever an event takes place that affects a JPEG image stored in the Oracle XML DB repository. The message that is sent includes the Resource ID of the resource that was affected and the nature of the operation that took place.

In the case of the EXIF demonstration the process flow for the notification procedure is as follows:

1. Use the Resource ID to access the contents of the affected resource.

2. Use the Intermedia processor to extract the EXIF information contained in the body of the image and return it as an exifMetadata element.

3. Generate an imageMetadata document based on the exifMetadata element.

4. Attach the imageMetadata document to the original resource document as extended metadata using the methods provided by the DBMS_XDB package.

The outline of the PL/SQL procedure is shown below:

    create procedure addImageMetadata(RESOURCE_ID RAW)
    as
      EXIF_METADATA            xmlType;
      IMAGE_METADATA           xmlType;
      IMAGE_METADATA_NAMESPACE varchar2(128) := 'http://xmlns.oracle.com/demo/imageMetadata';
      VIRTUAL_PATH             varchar2(256);
      TARGET_PATH              varchar2(256);
      XMLREF                   ref xmlType;
    begin
      -- Step 1: locate the affected resource in the repository.
      select ANY_PATH
        into TARGET_PATH
        from RESOURCE_VIEW
       where RESID = RESOURCE_ID;

      VIRTUAL_PATH := dbms_xdb.createOIDPath(RESOURCE_ID);

      -- Step 2: extract the EXIF information from the body of the image.
      begin
        select value(METADATA)
          into EXIF_METADATA
          from table
               (
                 ordsys.ordimage.getMetadata
                 (
                   xdburiType(VIRTUAL_PATH).getBlob(),
                   'EXIF'
                 )
               ) METADATA;
      exception
        -- JPEG Image did not contain EXIF meta-data.
        when no_data_found then
          null;
      end;

      -- Step 3: generate the imageMetadata document.
      select xmlElement
             (
               "img:imageMetadata",
               xmlAttributes
               (
                 IMAGE_METADATA_NAMESPACE as "xmlns:img",
                 XDB_NAMESPACES.EXIF_NAMESPACE as "xmlns",
                 'http://www.w3.org/2001/XMLSchema-instance' as "xmlns:xsi",
                 IMAGE_METADATA_NAMESPACE || ' &4' as "xsi:schemaLocation"
               ),
               xmlElement("img:RESID", null),
               xmlElement("img:imageURL", VIRTUAL_PATH),
               EXIF_METADATA
             )
        into IMAGE_METADATA
        from dual;

      -- Step 4: attach the document to the resource as extended meta-data.
      dbms_xdb.appendResourceMetadata(TARGET_PATH, IMAGE_METADATA);
    end;
    /

The following screen shows the XML representation and the final rendition of a resource document that has extended Meta-data associated with it.

Versioning Support

The sample XCM application also provides an end user with access to the advanced versioning features of the XML DB repository.

The SelectionProcessor Servlet makes it possible to initiate and manage the versioning process. It provides manual control of each phase of the versioning process. Using this servlet it is possible to initiate versioning and perform check-out and check-in operations on one or more documents.

The UploadFiles Servlet also provides support for the versioning features of the Oracle XML DB repository. It provides an option to create a new version of a document when a document is uploaded into the repository. When this option is selected the Servlet automatically takes whatever actions are required to ensure that the document being uploaded becomes a new version of the target document.

Once a document has been versioned the VersionHistory Servlet can be used to view the metadata for each member of the version series. The following screenshot shows the VersionHistory Servlet being used to review a versioned document.

Multi Format publishing

The application also demonstrates how XCM makes it easy to integrate with multi-channel publishing software. The XCM application can be configured to invoke a custom XSLT style sheet when a request is made to render a particular type of content. Customization can take place based on factors like mime-type and, in the case of XML documents, target namespace, document element name or XML Schema. In this example the system has been configured to use a custom XSLT style sheet to display PurchaseOrder documents. Whenever a user clicks on a PurchaseOrder document this style sheet is used to display the contents of the selected document. In this example the HTML page includes support for calling Oracle’s XML Publisher facility to publish the selected document in a number of different formats.

The following screen shot shows the HTML page generated by the XSLT that has been defined as the custom style sheet for PurchaseOrder documents.

As can be seen the form provides the option of publishing the document as HTML, RTF or PDF. When the user selects one of these options the Oracle XSQL Servlet is used to generate the selected output.

The following screen shots show the PurchaseOrder document rendered in PDF and RTF formats.

Access Control

The XCM application allows users to manipulate the security permissions of the documents they own. Additional security rules can be specified by uploading new Access Control List documents via the UploadFiles Servlet or the FTP or WebDAV protocols.

Access rights for existing documents can be modified by selecting the document in the folder browser and choosing the Set Permissions feature. This will display a form which presents the list of Access Control List documents that are available to secure the selected documents.

The following screen shot shows the SelectionProcessor Servlet being used to change the permissions on a document. The form lists the standard bootstrap ACLs that ship with the Oracle XML DB repository.

• ALL_ALL_ACL.xml grants all permissions to all authenticated users.

• ALL_OWNER_ACL.xml grants all permissions to the owner of the document and denies all permissions from any other user.

• BOOTSTRAP_ACL.xml grants all permissions to the XDBADMIN role and read access to all other users.

• RO_ALL_ACL.xml grants read-only access to all users.
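The effect of the four bootstrap ACLs listed above can be summarized in a small Python model. This is an illustrative sketch based only on the descriptions in the list: the privilege names are a simplified assumption, and the function is hypothetical rather than part of any Oracle API.

```python
ALL = frozenset({"read", "write", "delete"})    # simplified privilege set (an assumption)
READ_ONLY = frozenset({"read"})
NONE = frozenset()


def privileges(acl, user, owner, roles=()):
    """Return the privileges an authenticated user holds on a document protected by acl."""
    if acl == "ALL_ALL_ACL.xml":
        return ALL                                  # everything, for every authenticated user
    if acl == "ALL_OWNER_ACL.xml":
        return ALL if user == owner else NONE       # everything for the owner, nothing for others
    if acl == "BOOTSTRAP_ACL.xml":
        return ALL if "XDBADMIN" in roles else READ_ONLY
    if acl == "RO_ALL_ACL.xml":
        return READ_ONLY                            # read-only, for every user
    raise ValueError("unknown ACL: " + acl)


owner_rights = privileges("ALL_OWNER_ACL.xml", user="SCOTT", owner="SCOTT")
other_rights = privileges("ALL_OWNER_ACL.xml", user="ADAMS", owner="SCOTT")
admin_rights = privileges("BOOTSTRAP_ACL.xml", user="ADAMS", owner="SCOTT", roles=("XDBADMIN",))
```

Switching the ACL attached to a document changes what every user can do with it without touching the document itself, which is what the Set Permissions form relies on.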

XML Content Management Use Cases

Emissions Reporting System

The UK government has mandated [1], under its E-Government initiative, the use of XML and XML Schema for all inter-agency and other forms of communication. Recently the Department of Trade and Industry (DTI) issued new regulations relating to how the Oil and Gas industry must report on the emissions of various liquids and gases into the environment. As part of this initiative the DTI developed a set of XML schemas that define the way in which this information must be reported.

All of the organizations affected by this mandate were members of an industry group. The group led the development of a system that would allow all of its members to comply with these regulations.

The system allows the various companies that operate in this industry to report their emissions in one of two ways. They can use a web-based system to author XML documents using an XML editor. The editor chosen in this case is StyleVision™ from Altova. The system also allows them to generate the reports automatically by uploading Excel spreadsheets containing the required information via a Web Service.

In both cases the reports are stored in an Oracle XML DB repository running under Oracle Database 10g Release 1. The XML is then published back to the DTI using a custom web-based API, which allows the government to access the data using Oracle’s Discoverer product. The operators can also use this system to view and edit the XML documents. The publishing system leverages Oracle Application Server 10g.

Course Handbook Publishing System

A major university needs to publish a handbook of courses and units that can be taken at the university. The descriptive content on courses and units is held as XML documents in Oracle XML DB. This content is created using Altova’s StyleVision™ XML editor. The application uses the XSL engine in the database to transform XML derived from the XML documents, and from relational content with SQLX, into HTML for the web site. The descriptive content regarding courses and units is created in this manner. The course schedule tables are rendered from XML generated with SQLX from self-referential tables and transformed with XSL scripts. The application publishes this content to the outside world using Oracle Portal as the web framework for the site.

National Television Station Web Site

A national television station needs to dynamically publish content related to its programming services and news to the Web. All of the content is authored as XML using an in-house content creation system based on XUL. The web pages seen by a visitor to the website are an aggregation of many different pieces of content. The same piece of content can appear on many different web pages. The final pages are generated from components stored in the Oracle XML DB repository using Apache Cocoon. A cache of the generated pages ensures that the site provides acceptable response time.

[1] http://www.govtalk.gov.uk/schemasstandards/schemasstandards.asp

The details of which pieces of content are required to create a particular web page are maintained in a series of relational tables. The nature of the content that needs to be generated is such that there is a highly recursive relationship between the content items. The station also needs to version its content; there can be several hundred versions of some pieces of content. Each item of content is stored as a separate resource in the Oracle XML DB Repository. The Repository’s versioning capability is used to manage the content.

Each page on the web site is associated with an XML document. The document identifies all the content, and the two levels of related content, required to generate the page. When the content editors create a new page, or alter the relationships between content in such a way as to change the content of an existing page, a new relationship document is generated for each page affected by the change. At that point cache invalidation takes place, and the next time a request is received for one of the affected pages the page is regenerated from the content stored in the Oracle XML DB Repository.

Legislative Authoring Process

A major US Legislature is now authoring and publishing all of its Bills, Measures, and Statutes using XML. Authoring, which is done by legislative assistants, takes place using a highly customized version of X-Metal™. X-Metal is a high-end, XML Schema aware XML editor from Blast Radius. X-Metal provides the legislative assistants with a familiar, word-processor-like environment for authoring this content. X-Metal ensures that the content that is created is fully compliant with the appropriate XML Schemas. The X-Metal customization adds support for important features like red-lining.

WebDAV support allows content created using X-Metal to be saved into an Oracle XML DB repository running on Oracle Database 10g Release 1. The XML Schemas are registered with Oracle XML DB, allowing the content to be validated and providing high performance query and analysis over the legislative content. Web Services are provided that allow users to control the versioning process from within X-Metal.
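From the client's side, saving a document over WebDAV amounts to an HTTP PUT against a path in the repository. The Python sketch below only builds such a request and does not send it. The function name, host, path, and sample document are hypothetical, and a real deployment would also need authentication.

```python
import urllib.request


def build_webdav_put(url: str, xml_text: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT that would save an XML document."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="PUT",
    )


# Hypothetical repository URL and document, for illustration only.
request = build_webdav_put(
    "http://repository.example.com/bills/draft-001.xml",
    "<bill><title>Example</title></bill>",
)
# Sending it would be a call to urllib.request.urlopen(request),
# normally with credentials configured for the repository.
```

An editor with WebDAV support performs the equivalent of this request when the author chooses Save, which is why no custom client code is needed on the authoring side.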

Due to the size of the legislature in question, and the amount of content that it produces, it was vital that the system be highly scalable. Prior to going live, 203,000 legacy legislative documents were converted into XML format and loaded into the Oracle XML DB repository. The converted documents allow the legislative assistants to research previous legislation during the authoring process.

The system has to be capable of handling a peak load of creating, editing, and publishing more than 1,000 documents per day.

Content publishing takes place in multiple formats, including PDF via XSL-FO and Render/X’s XSL-FO engine.

Conclusions

XCM-based systems are particularly suitable for new deployments where the majority of the content being managed is XML, or where large amounts of meta-data are associated with that content. While not designed to be traditional high-end content management systems (in terms of both custom software and expensive hardware), XCM systems are capable of delivering the majority of the features needed for a successful, performant solution based on open standards and commodity hardware.

Managing Content with Oracle XML DB

March 2005

Author: Mark D Drake

Contributing Authors:

Oracle Corporation

World Headquarters

500 Oracle Parkway

Redwood Shores, CA 94065

U.S.A.

Worldwide Inquiries:

Phone: +1.650.506.7000

Fax: +1.650.506.7200

oracle.com

Copyright © 2005, Oracle. All rights reserved.

This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.