OKC Tools for XML Metadata Management
Marlon Pierce
Community Grids Lab
Indiana University
Overview
• We discuss systems we have built for managing XML metadata.
• Applications include
– Newsgroups
– Bibtex-based citation managers
– Glossary term and abbreviation managers
– RIB compatible browsers
• Running demos available from www.xmlnuggets.org.
• Downloads of revised newsgroup application available soon.
• Challenge: promote scientific metadata usage
– Data provenance, HPC run archiving, etc.
Parts of the System
• Each application has one or more XML schemas that serve as a data model.
• The general system contains the following components:
– Form wizards for creating valid XML instances for a particular application.
– Publishers or “feeders” that post messages into the system.
– Unique URI generators for storing each message.
– Persistent storage of entries (Oracle and MySQL).
– Readers that provide RSS-based catalogues of topics.
– Support for threaded messages, keyword searching.
– Role-based access control.
[Figure: system architecture. Components shown: Users, Portal (portal menu, JSP interfaces), Wizard, NewsFeeder, NewsReader, NewsRecorder, Database, JMS Server, JMS clients, MailHandler, MailDistributor, Mailbox, SMTP Host, and NumberGenerator (resource counter). Connections are labeled HTTP, SMTP, JMS publish, JMS subscribe, JDBC, Socket, RSS & XML, HTML to Portal, and New post & reply.]
<?xml version="1.0"?>
<rss version="0.91" xmlns:cg="http://grids.ucs.indiana.edu/okc/schema/cg/ver/1">
<channel>
  <title>Community Grids Project Reports</title>
  <image>
    <title>ptllogo</title>
    <url>http://www.communitygrids.iu.edu/img/smallLOGO.gif</url>
    <link>http://www.communitygrids.iu.edu</link>
    <description>Pervasive Technology Labs Logo</description>
  </image>
  <Item>
    <name>CORBA</name>
    <URI>glossary/C/CORBA</URI>
    <description>Common Object Request Broker Architecture is an open distributed object-computing infrastructure being standardised by the Object Management Group.</description>
  </Item>
  <!-- Other items deleted -->
</channel>
</rss>
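A reader that consumes a catalogue like the one above could be sketched as follows. This is a minimal illustration using only the JDK's built-in DOM parser, not the system's actual NewsReader; the class name CatalogueReader and its methods are hypothetical. It assumes each entry is an Item element with name and URI children, as in the sample.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the system described in the slides.
class CatalogueReader {

    /** Returns one "name -> URI" string for every Item element in the catalogue. */
    static List<String> listItems(String rssXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
        List<String> entries = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("Item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            entries.add(childText(item, "name") + " -> " + childText(item, "URI"));
        }
        return entries;
    }

    /** Text of the first descendant with the given tag, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String rss = "<rss version=\"0.91\"><channel>"
                + "<title>Community Grids Project Reports</title>"
                + "<Item><name>CORBA</name><URI>glossary/C/CORBA</URI></Item>"
                + "</channel></rss>";
        System.out.println(listItems(rss)); // prints [CORBA -> glossary/C/CORBA]
    }
}
```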
Sample Applications
Overviews of newsgroup, citation manager, and BIDM applications.
Newsgroup System Features
• Email and browser-based posting.
• Supports attachments.
• Multiple topic subscriptions
• Periodic topic digests
• Multiple user privileges
– Read through browser only
– Post through browser only
– Email notification with/without attachments.
Citation Browser
• Supports multiple schema descriptions based on bibtex
– Journal articles, books, book chapters, conference proceedings, tech reports, theses
• Import/upload bibtex into system, export topic to bibtex.
RIB Compatible Applications
• Basic system can be used with any schema, so we created a version using the Basic Interoperability Data Model (BIDM)
– Developed by the RIB team
– IEEE standard
• BIDM has two important extensions that we do not currently support.
– Asset certification
– Intellectual property rights
Steps for a Metadata Generator
• There were common tasks that we performed for each application:
– Design an object model and create a W3C XML Schema to represent it.
– Create a memory object model of the schema, i.e. corresponding Java classes.
– Design an interface, i.e. HTML forms, for user inputs, and bind the interface with the memory model.
– Let users input data.
– Finally, generate XML based on input, and publish it.
• Given these repetitive tasks, we have developed a general-purpose tool that automates this process.
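The manual steps listed above can be sketched for a single record type. This is a hand-written illustration, not generated code: GlossaryEntryBean is a hypothetical memory model whose fields mirror the name, URI, and description elements of the glossary sample, the setters stand in for values bound from an HTML form, and the JDK's DOM and Transformer APIs stand in for whatever marshalling framework an application would actually use.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical memory model for one schema type (assumption for illustration).
class GlossaryEntryBean {
    private String name;
    private String uri;
    private String description;

    // Bean setters: in a JSP these would be filled from form parameters.
    public void setName(String name) { this.name = name; }
    public void setUri(String uri) { this.uri = uri; }
    public void setDescription(String description) { this.description = description; }

    /** Final step: generate an XML instance from the current bean state. */
    String toXml() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element item = doc.createElement("Item");
        doc.appendChild(item);
        appendChild(doc, item, "name", name);
        appendChild(doc, item, "URI", uri);
        appendChild(doc, item, "description", description);

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    private static void appendChild(Document doc, Element parent, String tag, String value) {
        Element child = doc.createElement(tag);
        child.setTextContent(value == null ? "" : value); // unset fields become empty elements
        parent.appendChild(child);
    }

    public static void main(String[] args) throws Exception {
        GlossaryEntryBean entry = new GlossaryEntryBean();
        entry.setName("CORBA");
        entry.setUri("glossary/C/CORBA");
        entry.setDescription("Common Object Request Broker Architecture");
        System.out.println(entry.toXml()); // the string a publisher would post
    }
}
```

Writing a class like this by hand for every schema type is the repetition that a generator removes.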
Generating XML Form Wizards
How to convert XML schemas into web applications
SchemaWizard and XML
• Schema Wizard maps XML Schema elements to HTML form elements through its schema parser, and creates the framework and logic for an XML form wizard.
• Users use the newly generated wizards to create XML instances that follow the schema and to publish them to destinations such as publish/subscribe messaging systems, or to send them through SMTP.
• XML form wizards are Web applications that also serve as validating XML editors and are customized through schema annotations.
SchemaWizard Architecture
• The steps that take place in generating an XML form wizard
1. The Schema Wizard unpacks and deploys the Web application package into a Web server’s application repository (e.g. webapps under Tomcat).
2. The user provides the location of the Schema.
3. The Schema is read in to create an in-memory representation (SOM) of the schema and also to create Java classes.
• SOM = Castor’s Schema Object Model
• The SOM API provides a convenient interface to access the W3C XML Schema structures.
4. Using the SOM, Castor SourceGenerator creates Java classes that correspond to the Schema structures. These classes form the memory model (i.e. JavaBeans for JSP) and come with the necessary framework to parse and regenerate (unmarshal and marshal) XML instances.
5. Java classes are compiled, and binaries are placed into the new project’s directory structure.
SchemaWizard Architecture
[Figure: SchemaWizard architecture, with steps numbered (1) to (8). Components shown: Annotated XML Schema, Castor Schema Unmarshaller, Castor SOM, Castor SourceGenerator, JavaBeans, Java Compiler, SchemaParser, Velocity Templates, Web Application Template (libraries, classes, JSPs), and the XML Form Wizard created as a Web application.]
SchemaWizard Architecture
• The steps that take place in generating an XML form wizard (cont.)
6. Using the SOM once again, SchemaParser traverses the in-memory schema and collects structure information, i.e. names, types, whether element or attribute, complex or simple type.
7. Based on this information, the parser chooses what type of template will be used, stores the information in a Velocity context, and invokes the template engine to generate the program logic presented in JSP. The parser also gathers the Schema annotations, e.g. page color and input sizes, at this level and places the parameters in the context.
8. The engine runs on the templates, placing each generated JSP in its directory and creating the interface based on the user schema.
SchemaParser Data Flow and Action
[Figure: SchemaParser data flow. The schema object from the Castor SOM is traversed for individual types; type information and JavaBeans info are collected into a Velocity context; a template is decided; the context and template are passed to the Velocity Template Engine, which produces the JSP.]
Templates to decide between:
– Project page
– Index page
– Simple type
– Enumerated simple type
– Unbounded simple type
– Complex type
– Unbounded complex type
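The template decision can be illustrated with a small sketch. It is an approximation under stated assumptions: it reads the schema with the JDK's DOM parser rather than Castor's SOM, the class TemplateChooser is hypothetical, elements that refer to a named complex type through a type attribute are treated as simple, and the precedence (complex, then enumerated, then unbounded) is a guess, since the slides do not state one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical stand-in for the template decision; not the real SchemaParser.
class TemplateChooser {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    /** Maps each named element declaration to the template kind it would get. */
    static Map<String, String> chooseTemplates(String xsd) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> choices = new LinkedHashMap<>();
        NodeList elements = doc.getElementsByTagNameNS(XS, "element");
        for (int i = 0; i < elements.getLength(); i++) {
            Element e = (Element) elements.item(i);
            if (!e.hasAttribute("name")) {
                continue; // skip ref="..." declarations
            }
            boolean unbounded = "unbounded".equals(e.getAttribute("maxOccurs"));
            String template;
            if (hasChild(e, "complexType")) {
                template = unbounded ? "Unbounded complex type" : "Complex type";
            } else if (e.getElementsByTagNameNS(XS, "enumeration").getLength() > 0) {
                template = "Enumerated simple type";
            } else {
                template = unbounded ? "Unbounded simple type" : "Simple type";
            }
            choices.put(e.getAttribute("name"), template);
        }
        return choices;
    }

    /** True if parent has a direct child xs:localName. */
    private static boolean hasChild(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && XS.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        String xsd = "<xs:schema xmlns:xs=\"" + XS + "\">"
                + "<xs:element name=\"topic\"><xs:complexType><xs:sequence>"
                + "<xs:element name=\"title\" type=\"xs:string\"/>"
                + "<xs:element name=\"keyword\" type=\"xs:string\" maxOccurs=\"unbounded\"/>"
                + "<xs:element name=\"status\"><xs:simpleType><xs:restriction base=\"xs:string\">"
                + "<xs:enumeration value=\"open\"/><xs:enumeration value=\"closed\"/>"
                + "</xs:restriction></xs:simpleType></xs:element>"
                + "</xs:sequence></xs:complexType></xs:element>"
                + "</xs:schema>";
        chooseTemplates(xsd).forEach((name, template) -> System.out.println(name + ": " + template));
    }
}
```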
XML Schema location is given to SchemaWizard.
XML Form Wizard is generated.
XML instance is marshaled.
Schema Annotations
• Users can make cosmetic changes for the final project beforehand with annotations in the schema.
• W3C XML Schema allows developers to embed user defined languages into the schema using <xs:annotation> and <xs:appinfo> structures.
• Annotations for the whole schema affect the whole page, e.g. page title, background color, default input sizes, leading numbers on or off, XML browsing on or off.
<xs:annotation>
  <xs:appinfo source="title">SchemaWizard Output for Topics Schema</xs:appinfo>
  <xs:appinfo source="inputsize">30</xs:appinfo>
  <xs:appinfo source="bgcolor">#e0e0ff</xs:appinfo>
  <xs:appinfo source="leadingnumbers">false</xs:appinfo>
  <xs:appinfo source="showxml">true</xs:appinfo>
</xs:annotation>
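One way such annotations could be collected into a settings map is sketched below. It uses the JDK's namespace-aware DOM parser instead of Castor, and the class AppInfoReader is hypothetical. The sketch keys each setting by the source attribute and takes the element text as its value, which matches the sample above; the input must declare the xs prefix, which the slide fragment leaves to the enclosing schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not the SchemaWizard implementation.
class AppInfoReader {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    /** Collects every xs:appinfo as a (source attribute, text content) pair. */
    static Map<String, String> read(String annotationXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(annotationXml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> settings = new LinkedHashMap<>();
        NodeList infos = doc.getElementsByTagNameNS(XS, "appinfo");
        for (int i = 0; i < infos.getLength(); i++) {
            Element info = (Element) infos.item(i);
            settings.put(info.getAttribute("source"), info.getTextContent().trim());
        }
        return settings;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xs:annotation xmlns:xs=\"" + XS + "\">"
                + "<xs:appinfo source=\"title\">SchemaWizard Output for Topics Schema</xs:appinfo>"
                + "<xs:appinfo source=\"inputsize\">30</xs:appinfo>"
                + "<xs:appinfo source=\"bgcolor\">#e0e0ff</xs:appinfo>"
                + "</xs:annotation>";
        System.out.println(read(xml));
    }
}
```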
Schema Annotations
• Annotations for individual structures override the schema annotations, e.g. the input size for each element. Labels can also be defined for each element, and input fields can be changed to larger text areas with a textarea parameter giving the number of rows, or to password fields with a password parameter whose value is set to true.
<xs:annotation>
  <xs:appinfo source="label">User Password</xs:appinfo>
  <xs:appinfo source="inputsize">15</xs:appinfo>
  <xs:appinfo source="password">true</xs:appinfo>
</xs:annotation>
…
<xs:annotation>
  <xs:appinfo source="label">Memo</xs:appinfo>
  <xs:appinfo source="textarea">5</xs:appinfo>
</xs:annotation>
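The override rule can be sketched as a merge of two setting maps followed by a choice of form control. This is an illustration only: in the real tool the markup comes from Velocity templates, the class FieldRenderer is hypothetical, the fallback size of 20 is an assumption, and HTML escaping is omitted for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical renderer for illustration; the real wizard uses Velocity templates.
class FieldRenderer {

    /** Renders one form field; element-level settings win over schema-level ones. */
    static String render(String name,
                         Map<String, String> schemaSettings,
                         Map<String, String> elementSettings) {
        Map<String, String> effective = new HashMap<>(schemaSettings);
        effective.putAll(elementSettings); // element annotations override schema annotations

        String label = effective.getOrDefault("label", name);
        if (effective.containsKey("textarea")) {
            // The textarea parameter carries the row count.
            return label + ": <textarea name=\"" + name + "\" rows=\""
                    + effective.get("textarea") + "\"></textarea>";
        }
        String type = "true".equals(effective.get("password")) ? "password" : "text";
        String size = effective.getOrDefault("inputsize", "20"); // 20 is an assumed fallback
        return label + ": <input type=\"" + type + "\" name=\"" + name
                + "\" size=\"" + size + "\"/>";
    }

    public static void main(String[] args) {
        Map<String, String> schema = Map.of("inputsize", "30");
        System.out.println(render("password", schema,
                Map.of("label", "User Password", "inputsize", "15", "password", "true")));
        System.out.println(render("memo", schema,
                Map.of("label", "Memo", "textarea", "5")));
        System.out.println(render("title", schema, Map.of()));
    }
}
```

With a schema-level inputsize of 30, the password field is rendered at size 15 because its own annotation takes precedence, while a field without annotations keeps size 30.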
[Screenshot: generated form wizard, with callouts: smaller input size; textarea, row count set to 5; unbounded element with its own add/delete buttons; XML browsing turned on; title set; background set to gray.]
Access Rights, Controls and Roles
Topic based permissions
System Access Control Overview
• The core of the system contains a JMS-based publish/subscribe system.
• Postings are thus based on JMS topics, or channels.
• Access privileges (read/write by web, read/write by email, modify privileges) are enforced for each topic.
User Privileges
• Users request access to specific topics/channels.
– Granted by administrator for that topic
• Can request
– Read/write by browser
– Read/write by email (newsgroups)
– Receive/don’t receive attachments.
• Topic admin can edit these requests.
Topic Administrator Privileges
• Topic admins can approve or revoke access to topics.
• Can also modify individual privileges
– Revoke post privilege, require email notification.
• Have all other rights of users for that topic.
• Topics can have multiple administrators.
• A person can be a regular user of one group and administer another group.
Super-Administrator Privileges
• A super admin manages an entire application.
• Can create new topics.
• Can assign administration privileges to users.
• Can act as administrator and regular user of all topics.
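The three roles described above can be modelled with a small in-memory sketch. It is not the system's implementation: the class TopicAccessControl, the Privilege names, and the use of plain maps instead of a database or JMS layer are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of topic-based permissions (illustration only).
class TopicAccessControl {
    enum Privilege { READ_WEB, WRITE_WEB, READ_EMAIL, WRITE_EMAIL, RECEIVE_ATTACHMENTS }

    private final Set<String> superAdmins = new HashSet<>();
    private final Map<String, Set<String>> topicAdmins = new HashMap<>();               // topic -> admins
    private final Map<String, Map<String, Set<Privilege>>> granted = new HashMap<>();   // topic -> user -> privileges

    void addSuperAdmin(String user) {
        superAdmins.add(user);
    }

    /** Only a super admin can create topics. */
    void createTopic(String actor, String topic) {
        requireSuperAdmin(actor);
        topicAdmins.putIfAbsent(topic, new HashSet<>());
    }

    /** Only a super admin can assign administration privileges; a topic may have several admins. */
    void assignAdmin(String actor, String topic, String user) {
        requireSuperAdmin(actor);
        topicAdmins.computeIfAbsent(topic, t -> new HashSet<>()).add(user);
    }

    boolean isAdmin(String user, String topic) {
        return superAdmins.contains(user)
                || topicAdmins.getOrDefault(topic, Set.of()).contains(user);
    }

    /** A topic admin approves a user's request for a set of privileges. */
    void approve(String actor, String topic, String user, Set<Privilege> requested) {
        requireAdmin(actor, topic);
        granted.computeIfAbsent(topic, t -> new HashMap<>()).put(user, new HashSet<>(requested));
    }

    /** A topic admin revokes a single privilege, e.g. the right to post. */
    void revoke(String actor, String topic, String user, Privilege privilege) {
        requireAdmin(actor, topic);
        Set<Privilege> held = granted.getOrDefault(topic, Map.of()).get(user);
        if (held != null) {
            held.remove(privilege);
        }
    }

    /** Admins hold every user right on their topics; others need an approved grant. */
    boolean can(String user, String topic, Privilege privilege) {
        if (isAdmin(user, topic)) {
            return true;
        }
        return granted.getOrDefault(topic, Map.of())
                .getOrDefault(user, Set.of())
                .contains(privilege);
    }

    private void requireSuperAdmin(String actor) {
        if (!superAdmins.contains(actor)) {
            throw new SecurityException(actor + " is not a super administrator");
        }
    }

    private void requireAdmin(String actor, String topic) {
        if (!isAdmin(actor, topic)) {
            throw new SecurityException(actor + " does not administer " + topic);
        }
    }
}
```

A typical sequence would be: a super admin creates Ch1 and assigns its administrator; that administrator approves a user's read/write request with `approve`, and can later call `revoke` to remove only the post privilege while leaving read access in place.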
[Figure: access request workflow. The User Request Pool holds four requests: U1 with role R1 for Ch1, U2 with role R1 for Ch1, U3 with role R2 for Ch3, and U1 with role R1 for Ch2. In the Admin Pool, the administrator of each channel confirms or rejects the requests. In the Result Pool, the requests for Ch1 and Ch2 are confirmed, so users U1 and U2 start to use channel Ch1 and user U1 starts to use channel Ch2; the request for Ch3 is rejected, so user U3 cannot use channel Ch3.]