OKC Tools for XML Metadata Management Marlon Pierce Community Grids Lab Indiana University

OKC Tools for XML Metadata Management

Marlon Pierce

Community Grids Lab

Indiana University

Overview

• We discuss systems we have built for managing XML metadata.

• Applications include:
  – Newsgroups
  – BibTeX-based citation managers
  – Glossary term and abbreviation managers
  – RIB-compatible browsers

• Running demos available from www.xmlnuggets.org.

• Downloads of revised newsgroup application available soon.

• Challenge: promote scientific metadata usage
  – Data provenance, HPC run archiving, etc.

Parts of the System

• Each application has one or more XML schemas that serve as a data model.

• The general system contains the following components:
  – Form wizards for creating valid XML instances for a particular application.
  – Publishers or “feeders” that post messages into the system.
  – Unique URI generators for storing each message.
  – Persistent storage of entries (Oracle and MySQL).
  – Readers that provide RSS-based catalogues of topics.
  – Support for threaded messages, keyword searching.
  – Role-based access control.

[Figure: system architecture. Components shown: Wizard, MailHandler, JMS Server, NewsRecorder, Database, NewsFeeder, NewsReader, Portal (portal menu, JSP interfaces), MailDistributor, NumberGenerator (resource counter), SMTP host, mailbox, JMS clients, and users. Connections are labelled JMS publish, JMS subscribe, JDBC, RSS & XML, HTML to Portal, HTTP, SMTP, Socket, and new post & reply.]

<?xml version="1.0"?>
<rss version="0.91" xmlns:cg="http://grids.ucs.indiana.edu/okc/schema/cg/ver/1">
  <channel>
    <title>Community Grids Project Reports</title>
    <image>
      <title>ptllogo</title>
      <url>http://www.communitygrids.iu.edu/img/smallLOGO.gif</url>
      <link>http://www.communitygrids.iu.edu</link>
      <description>Pervasive Technology Labs Logo</description>
    </image>
    <Item>
      <name>CORBA</name>
      <URI>glossary/C/CORBA</URI>
      <description>Common Object Request Broker Architecture is an open distributed object-computing infrastructure being standardised by the Object Management Group.</description>
    </Item>
    <!-- Other items deleted -->
  </channel>
</rss>
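As a rough illustration of how a reader might consume a catalogue like the one above, the following Python sketch parses it with the standard library. The element names (Item, name, URI, description) come from the sample; the function name list_catalogue and the shortened description text are assumptions made for the example, not part of the OKC code.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the sample catalogue (description shortened for the sketch).
SAMPLE = """<?xml version="1.0"?>
<rss version="0.91" xmlns:cg="http://grids.ucs.indiana.edu/okc/schema/cg/ver/1">
  <channel>
    <title>Community Grids Project Reports</title>
    <Item>
      <name>CORBA</name>
      <URI>glossary/C/CORBA</URI>
      <description>Common Object Request Broker Architecture</description>
    </Item>
  </channel>
</rss>"""


def list_catalogue(rss_text):
    """Return one dict per <Item> in the channel (hypothetical helper)."""
    channel = ET.fromstring(rss_text).find("channel")
    return [
        {
            "name": item.findtext("name"),
            "uri": item.findtext("URI"),
            "description": item.findtext("description"),
        }
        for item in channel.findall("Item")
    ]


entries = list_catalogue(SAMPLE)
```

Each entry's uri value could then be used to fetch the full stored record, although how the real readers resolve URIs is not described in these slides.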

Sample Applications

Overviews of newsgroup, citation manager, and BIDM applications.

Newsgroup System Features

• Email and browser-based posting.
• Supports attachments.
• Multiple topic subscriptions
• Periodic topic digests
• Multiple user privileges
  – Read through browser only
  – Post through browser only
  – Email notification with/without attachments.

Citation Browser

• Supports multiple schema descriptions based on BibTeX
  – Journal articles, books, book chapters, conference proceedings, tech reports, theses

• Import/upload BibTeX into system, export topic to BibTeX.
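To make the import step concrete, here is a minimal Python sketch that turns one BibTeX entry into an XML element. It is only an illustration: the XML element names, the function bibtex_to_xml, and the placeholder entry are all invented for the example, and the regular expressions handle only simple brace-delimited fields without nesting. The actual OKC citation schemas are not shown in these slides.

```python
import re
import xml.etree.ElementTree as ET

# Matches "@type{key, ... }" for a single entry; DOTALL lets the body span lines.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*)\}\s*$", re.S)
# Matches "field = {value}" where the value contains no nested braces.
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def bibtex_to_xml(entry_text):
    """Convert one simple BibTeX entry to an XML element (hypothetical layout)."""
    match = ENTRY_RE.match(entry_text.strip())
    if match is None:
        raise ValueError("not a single BibTeX entry")
    kind, key, body = match.groups()
    node = ET.Element(kind.lower(), {"key": key})
    for name, value in FIELD_RE.findall(body):
        ET.SubElement(node, name.lower()).text = value.strip()
    return node


# Placeholder entry, made up purely for illustration.
PLACEHOLDER = """@article{example2000,
  author = {A. Author},
  title = {An Example Title},
  year = {2000}
}"""

citation = bibtex_to_xml(PLACEHOLDER)
citation_xml = ET.tostring(citation, encoding="unicode")
```

Exporting a topic back to BibTeX would be the reverse walk over the same elements.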

RIB Compatible Applications

• Basic system can be used with any schema, so we created a version using the Basic Interoperability Data Model (BIDM)
  – Developed by the RIB team
  – IEEE standard

• BIDM has two important extensions that we do not currently support.
  – Asset certification
  – Intellectual property rights

Steps for a Metadata Generator

• There were common tasks that we performed for each application:
  – Design an object model and create a W3C XML Schema to represent it.
  – Create a memory object model of the schema, i.e. corresponding Java classes.
  – Design an interface, i.e. HTML forms, for user inputs, and bind the interface with the memory model.
  – Let users input data.
  – Finally, generate XML based on input, and publish it.

• Given these repetitive tasks, we have developed a general-purpose tool that automates this process.
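The manual steps above can be sketched end to end in a few lines of Python. This is a hand-written stand-in, not generated code: the GlossaryEntry class plays the role of a generated Java class, the dict plays the role of submitted form data, and publish simply appends to a list where the real system would post to a messaging system. All of these names are assumptions.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class GlossaryEntry:
    """In-memory model for one entry (stand-in for a generated Java class)."""
    name: str
    uri: str
    description: str

    def to_xml(self):
        """Generate the XML instance from the in-memory model."""
        item = ET.Element("Item")
        for tag, value in (("name", self.name),
                           ("URI", self.uri),
                           ("description", self.description)):
            ET.SubElement(item, tag).text = value
        return ET.tostring(item, encoding="unicode")


def publish(xml_text, outbox):
    """Stand-in for publishing; here it only appends to a list."""
    outbox.append(xml_text)


# Values a user might type into the HTML form.
form_input = {
    "name": "CORBA",
    "uri": "glossary/C/CORBA",
    "description": "Common Object Request Broker Architecture",
}
entry = GlossaryEntry(**form_input)   # bind form input to the memory model
outbox = []
publish(entry.to_xml(), outbox)       # generate XML and publish it
```

Every application repeats this same pattern with a different schema, which is what motivates generating it automatically.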

Generating XML Form Wizards

How to convert XML schemas into web applications

SchemaWizard and XML

• SchemaWizard maps XML Schema elements to HTML form elements through its schema parser, and creates the framework and logic for an XML form wizard.

• Users use the newly generated wizards to create XML instances that follow a schema and publish them to destinations such as publish/subscribe messaging systems or SMTP.

• XML form wizards are Web applications that also serve as validating XML editors and are customized through schema annotations.
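The idea of mapping schema elements to form elements can be sketched as follows. This is a simplified Python illustration using only the standard library; the real SchemaWizard is built on Castor and generates JSP, and the sample schema and the form_fields helper below are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

XS = "{http://www.w3.org/2001/XMLSchema}"

# Small sample schema, made up for this sketch.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="topic">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="year" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def form_fields(schema_text):
    """Return one HTML text input per xs:element that has a type attribute."""
    root = ET.fromstring(schema_text)
    fields = []
    for element in root.iter(XS + "element"):
        if element.get("type") is None:
            continue  # elements with an anonymous type are only containers here
        name = escape(element.get("name"), quote=True)
        fields.append(f'<label>{name} <input type="text" name="{name}"/></label>')
    return fields


fields = form_fields(SCHEMA)
```

A fuller mapping would pick different widgets for enumerations, repeated elements, and nested complex types.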

SchemaWizard Architecture

• The steps that take place in generating an XML form wizard:
  1. SchemaWizard unpacks and deploys the Web application package into a Web server’s application repository (e.g. webapps under Tomcat).
  2. The user provides the location of the schema.
  3. The schema is read in to create an in-memory representation (SOM) of the schema, which is also used to create Java classes.
     • SOM = Castor’s Schema Object Model
     • The SOM API provides a convenient interface to access the W3C XML Schema structures.
  4. Using the SOM, Castor SourceGenerator creates Java classes that correspond to the schema structures. These classes form the memory model (i.e. JavaBeans for JSP) and come with the necessary framework to parse and regenerate (unmarshal and marshal) XML instances.
  5. The Java classes are compiled, and the binaries are placed into the new project’s directory structure.
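Steps 3 to 5 can be approximated in Python to show the idea of deriving classes from a schema and using them to unmarshal and marshal instances. This is an analogy only: it does not use Castor, it handles just a flat sequence of xs:string and xs:integer elements, and it matches the type names as literal strings rather than resolving namespace prefixes. The helper names generate_class, unmarshal, and marshal are assumptions for the sketch.

```python
import xml.etree.ElementTree as ET
from dataclasses import fields, make_dataclass

XS = "{http://www.w3.org/2001/XMLSchema}"
# Only the schema types this sketch handles, matched as literal strings.
PY_TYPES = {"xs:string": str, "xs:integer": int}

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="topic">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="year" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def generate_class(schema_text):
    """Build a dataclass whose fields mirror the root element's children."""
    root_element = ET.fromstring(schema_text).find(XS + "element")
    children = root_element.findall(f"{XS}complexType/{XS}sequence/{XS}element")
    spec = [(c.get("name"), PY_TYPES[c.get("type")]) for c in children]
    return make_dataclass(root_element.get("name").capitalize(), spec)


def unmarshal(cls, xml_text):
    """Parse an XML instance into an object of the generated class."""
    node = ET.fromstring(xml_text)
    return cls(**{f.name: f.type(node.findtext(f.name)) for f in fields(cls)})


def marshal(obj):
    """Regenerate the XML instance from the object."""
    node = ET.Element(type(obj).__name__.lower())
    for f in fields(obj):
        ET.SubElement(node, f.name).text = str(getattr(obj, f.name))
    return ET.tostring(node, encoding="unicode")


Topic = generate_class(SCHEMA)
topic = unmarshal(Topic, "<topic><title>Grids</title><year>2003</year></topic>")
round_trip = marshal(topic)
```

Castor's generated classes play the same role for the JSP pages, acting as the beans that hold form data between requests.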

SchemaWizard Architecture

[Figure: SchemaWizard architecture. Components shown: Annotated XML Schema, Castor Schema Unmarshaller, Castor SOM, Castor SourceGenerator, JavaBeans, Java Compiler, SchemaParser, Velocity Templates, and Web Application Template (libraries, classes, JSPs). The output is the XML form wizard created as a Web application. Arrows are numbered (1) to (8).]

SchemaWizard Architecture

• The steps that take place in generating an XML form wizard (cont.):
  6. Using the SOM once again, SchemaParser traverses the in-memory schema and collects structure information, i.e. names, types, whether each item is an element or an attribute, and whether it is a complex or simple type.
  7. Based on this information, the parser chooses which type of template to use, stores the information in a Velocity context, and invokes the template engine to generate the program logic as JSP. At this stage the parser also gathers the schema annotations (e.g. page color, input sizes) and places the parameters in the context.
  8. The engine runs on the templates, placing each generated JSP in its directory and creating the interface based on the user’s schema.

SchemaParser Data Flow and Action

[Figure: SchemaParser data flow. The Castor SOM supplies the schema object. SchemaParser traverses the schema for individual types, collects type information and creates a Velocity context with the type info, then decides which template to use: project page, index page, simple type, enumerated simple type, unbounded simple type, complex type, or unbounded complex type. The context and template, together with JavaBeans info, are passed to the Velocity Template Engine, which produces the JSP.]
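The template decision can be illustrated with a small Python sketch. The five type categories are the ones named on this slide; the rules used to tell them apart (a maxOccurs value of "unbounded", a nested xs:complexType, an xs:enumeration inside a restriction) are a plausible guess at the logic, not the actual SchemaParser code, and the sample schema is invented.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Sample schema, made up for this sketch.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="entry">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="keyword" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="status">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="draft"/>
              <xs:enumeration value="final"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <xs:element name="author" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="surname" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def choose_template(element):
    """Pick a template category for one xs:element (assumed rules)."""
    unbounded = element.get("maxOccurs") == "unbounded"
    if element.find(XS + "complexType") is not None:
        return "unbounded complex type" if unbounded else "complex type"
    enumeration = f"{XS}simpleType/{XS}restriction/{XS}enumeration"
    if element.find(enumeration) is not None:
        return "enumerated simple type"
    return "unbounded simple type" if unbounded else "simple type"


root = ET.fromstring(SCHEMA)
choices = {e.get("name"): choose_template(e) for e in root.iter(XS + "element")}
```

In the real tool the chosen category selects a Velocity template, and the collected names and types become the context that the template is rendered with.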

XML Schema location is given to SchemaWizard.

XML Form Wizard is generated.

XML instance is marshaled.

Schema Annotations

• Users can customize the appearance of the generated project in advance with annotations in the schema.

• W3C XML Schema allows developers to embed user-defined languages into the schema using <xs:annotation> and <xs:appinfo> structures.

• Annotations for the whole schema affect the whole page, e.g. page title, background color, default input sizes, leading numbers on or off, XML browsing on or off.

<xs:annotation>
  <xs:appinfo source="title">SchemaWizard Output for Topics Schema</xs:appinfo>
  <xs:appinfo source="inputsize">30</xs:appinfo>
  <xs:appinfo source="bgcolor">#e0e0ff</xs:appinfo>
  <xs:appinfo source="leadingnumbers">false</xs:appinfo>
  <xs:appinfo source="showxml">true</xs:appinfo>
</xs:annotation>
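A short Python sketch shows how annotations of this form could be collected into a settings table keyed by the source attribute. The appinfo entries are copied from the example above; the enclosing xs:schema wrapper and the function name read_appinfo are assumptions added so that the snippet is self-contained.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:appinfo source="title">SchemaWizard Output for Topics Schema</xs:appinfo>
    <xs:appinfo source="inputsize">30</xs:appinfo>
    <xs:appinfo source="bgcolor">#e0e0ff</xs:appinfo>
    <xs:appinfo source="leadingnumbers">false</xs:appinfo>
    <xs:appinfo source="showxml">true</xs:appinfo>
  </xs:annotation>
</xs:schema>"""


def read_appinfo(node):
    """Collect the appinfo entries directly under node's own xs:annotation."""
    return {
        info.get("source"): (info.text or "").strip()
        for info in node.findall(f"{XS}annotation/{XS}appinfo")
    }


settings = read_appinfo(ET.fromstring(SCHEMA))
```

The values arrive as strings, so a consumer would still need to convert entries such as inputsize or showxml to numbers or booleans.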

Schema Annotations

• Annotations for individual structures override the schema-level annotations, e.g. the input size for each element. In addition, a label can be defined for each element, and an input field can be changed to a larger text area with a textarea parameter giving the number of rows, or to a password field with a password parameter whose value is set to true.

<xs:annotation>
  <xs:appinfo source="label">User Password</xs:appinfo>
  <xs:appinfo source="inputsize">15</xs:appinfo>
  <xs:appinfo source="password">true</xs:appinfo>
</xs:annotation>
…
<xs:annotation>
  <xs:appinfo source="label">Memo</xs:appinfo>
  <xs:appinfo source="textarea">5</xs:appinfo>
</xs:annotation>
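The override behaviour can be sketched in Python as a dictionary merge followed by a widget choice. The parameter names (label, inputsize, password, textarea) are the ones used in the annotations above; the functions, the field names passwd and memo, and the exact HTML produced are assumptions for the illustration, and a real implementation would also escape the values.

```python
def effective_settings(schema_settings, element_settings):
    """Element-level annotations win over schema-level defaults."""
    merged = dict(schema_settings)
    merged.update(element_settings)
    return merged


def render_field(name, settings):
    """Choose a textarea, password input, or text input from the settings."""
    label = settings.get("label", name)
    if "textarea" in settings:
        rows = settings["textarea"]
        return f'{label}: <textarea name="{name}" rows="{rows}"></textarea>'
    kind = "password" if settings.get("password") == "true" else "text"
    size = settings["inputsize"]
    return f'{label}: <input type="{kind}" name="{name}" size="{size}"/>'


schema_settings = {"inputsize": "30"}  # schema-wide default

password_field = render_field(
    "passwd",
    effective_settings(
        schema_settings,
        {"label": "User Password", "inputsize": "15", "password": "true"},
    ),
)
memo_field = render_field(
    "memo",
    effective_settings(schema_settings, {"label": "Memo", "textarea": "5"}),
)
```

Here the password field picks up the element-level size of 15 instead of the schema-wide 30, while the memo field ignores the size and becomes a five-row text area.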

[Screenshot: generated form wizard, with callouts:]
• Smaller input size
• Textarea, row count set to 5
• Unbounded element with its own add/delete buttons
• XML browsing turned on
• Title set
• Background set to gray

Access Rights, Controls and Roles

Topic based permissions

System Access Control Overview

• The core of the system contains a JMS-based publish/subscribe system.

• Postings are thus based on JMS topics, or channels.

• Access privileges (read/write by web, read/write by email, modify privileges) are enforced for each topic.

User Privileges

• Users request access to specific topics/channels.
  – Granted by the administrator for that topic.

• Can request:
  – Read/write by browser
  – Read/write by email (newsgroups)
  – Receive/don’t receive attachments

• Topic admin can edit these requests.

Topic Administrator Privileges

• Topic admins can approve or revoke access to topics.

• Can also modify individual privileges
  – Revoke post privilege, require email notification.

• Have all other rights of users for that topic.

• Topics can have multiple administrators.

• A person can be a regular user of one group and administer another group.

Super-Administrator Privileges

• A super admin manages an entire application.

• Can create new topics.

• Can assign administration privileges to users.

• Can act as administrator and regular user of all topics.
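The three roles described on these slides (regular user, topic administrator, super-administrator) can be modelled with a small Python sketch. The class, its method names, the privilege names such as write_web, and the sample users and topics are all assumptions chosen for illustration; the slides do not show how the real system stores privileges.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControl:
    super_admins: set = field(default_factory=set)
    topic_admins: dict = field(default_factory=dict)  # topic -> set of users
    privileges: dict = field(default_factory=dict)    # (user, topic) -> set of names

    def is_admin(self, user, topic):
        """Super admins administer every topic; topic admins only their own."""
        return user in self.super_admins or user in self.topic_admins.get(topic, set())

    def grant(self, admin, user, topic, privilege):
        if not self.is_admin(admin, topic):
            raise PermissionError(f"{admin} does not administer {topic}")
        self.privileges.setdefault((user, topic), set()).add(privilege)

    def revoke(self, admin, user, topic, privilege):
        if not self.is_admin(admin, topic):
            raise PermissionError(f"{admin} does not administer {topic}")
        self.privileges.get((user, topic), set()).discard(privilege)

    def allowed(self, user, topic, privilege):
        if self.is_admin(user, topic):
            return True  # admins have all user rights on their topics
        return privilege in self.privileges.get((user, topic), set())


acl = AccessControl(super_admins={"root"}, topic_admins={"grids": {"alice"}})
acl.grant("alice", "bob", "grids", "write_web")
```

Because privileges are keyed by user and topic, the same person can be an administrator of one topic and a regular user of another, as the slides describe.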

[Figure: access request workflow. User Request Pool: User U1 (Role R1) requests Channel Ch1; User U2 (Role R1) requests Channel Ch1; User U3 (Role R2) requests Channel Ch3; User U1 (Role R1) requests Channel Ch2. Admin Pool: the administrator of each channel (Ch1, Ch2, Ch3) confirms or rejects the requests for that channel. Result Pool: Users U1 and U2 start to use Channel Ch1; User U1 starts to use Channel Ch2; User U3 gets no use of Channel Ch3.]
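The request workflow in this figure can be sketched in Python as a pool of requests filtered by per-channel administrator decisions. The users, roles, and channels are the ones in the figure; the reading that the Ch1 and Ch2 requests are confirmed and the Ch3 request is rejected is an interpretation of the figure labels, and the Request class and process function are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user: str
    role: str
    channel: str


def process(requests, decisions):
    """Return channel -> [(user, role)] for the requests that were confirmed.

    decisions maps (user, channel) to True (confirm) or False (reject), as
    decided by that channel's administrator; missing entries count as rejected.
    """
    granted = {}
    for request in requests:
        if decisions.get((request.user, request.channel), False):
            granted.setdefault(request.channel, []).append((request.user, request.role))
    return granted


request_pool = [
    Request("U1", "R1", "Ch1"),
    Request("U2", "R1", "Ch1"),
    Request("U3", "R2", "Ch3"),
    Request("U1", "R1", "Ch2"),
]
# Decisions as read off the figure (an interpretation).
decisions = {
    ("U1", "Ch1"): True,
    ("U2", "Ch1"): True,
    ("U3", "Ch3"): False,
    ("U1", "Ch2"): True,
}
result_pool = process(request_pool, decisions)
```

The result pool then matches the outcome shown in the figure: U1 and U2 use Ch1, U1 uses Ch2, and Ch3 has no users.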

Contact Info

• See www.xmlnuggets.org for more information.

• Email: [email protected].