Extended Metadata Registry (XMDR) September 2004 Bruce Bargmeyer +1 (510) 495-2905...


Extended Metadata Registry (XMDR)

September 2004

Bruce Bargmeyer
+1 (510) 495-2905
bebargmeyer@lbl.gov

Interagency/International Cooperation on Ecoinformatics
Brussels, Belgium

2

Past, Present, Future

Lots of users
Lots of information systems
Lots of data sources

[Figure: users drawing on information systems at EEA, DOE, DoD, EPA, and others. Each system holds its own text and data on the same topics (environment, agriculture, climate, human health, industry, tourism, soil, water, air), some labeled in English and some in Spanish.]

3

Data Standards

Avoid a combinatorial explosion of data content, description, and metadata arrangements for information storage, access and interchange. Data standards and metadata registries can help.

4

Data Element Concept: Country Identifiers (Unique ID: 5769)
Value meanings: Afghanistan, Belgium, China, Denmark, Egypt, France, Germany, …………

Data Elements (three representations of the same concept):

ISO 3166 English Name    ISO 3166 3-Alpha Code    ISO 3166 3-Numeric Code
Afghanistan              AFG                      004
Belgium                  BEL                      056
China                    CHN                      156
Denmark                  DNK                      208
Egypt                    EGY                      818
France                   FRA                      250
Germany                  DEU                      276
…………                     …………                     …………

Attributes registered for the data element concept: Name, Context, Definition, Unique ID, Conceptual Domain, Maintenance Org., Steward, Classification, Registration Authority, Others.

Attributes registered for each data element (Unique IDs 4572, 3820, 1047): Name, Context, Definition, Unique ID, Value Domain, Maintenance Org., Steward, Classification, Registration Authority, Others.
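The point of this slide, that one data element concept can be carried by several data elements whose value domains share the same value meanings, can be sketched in a few lines of Python. This is only an illustration: the names `COUNTRIES` and `translate` are assumptions made for the example and are not part of ISO/IEC 11179 or of any registry API. Only the ISO 3166 values are taken from the slide.

```python
# Hypothetical sketch: one concept ("Country Identifiers"), three representations.
# Each row stands for one value meaning; each key is the permissible value
# in one ISO 3166 value domain. The rows are the ones shown on the slide.
COUNTRIES = [
    {"english_name": "Afghanistan", "alpha3": "AFG", "numeric3": "004"},
    {"english_name": "Belgium", "alpha3": "BEL", "numeric3": "056"},
    {"english_name": "China", "alpha3": "CHN", "numeric3": "156"},
    {"english_name": "Denmark", "alpha3": "DNK", "numeric3": "208"},
    {"english_name": "Egypt", "alpha3": "EGY", "numeric3": "818"},
    {"english_name": "France", "alpha3": "FRA", "numeric3": "250"},
    {"english_name": "Germany", "alpha3": "DEU", "numeric3": "276"},
]


def translate(value, source, target):
    """Map a value from one value domain to another through the shared meaning."""
    for row in COUNTRIES:
        if row[source] == value:
            return row[target]
    raise KeyError(f"{value!r} is not a permissible value in {source}")
```

For example, `translate("DEU", "alpha3", "numeric3")` looks up the row whose 3-alpha code is DEU and returns its 3-numeric code.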

5

[Figure: 11179-3 Metamodel, Main Model (UML class diagram). Proposal for Comments, 11179-3 Revision, DD Mann, page 1, 1999-12-13.]

Classes and attributes shown in the diagram:

Data_Element_Concept: administered_component (required); object_class, object_class_qualifier, property, property_qualifier (optional)
Conceptual_Domain: administered_component_information, dimensionality (optional)
Value_Domain: datatype (required); administered_component, name, maximum_character_quantity, minimum_character_quantity, format, unit_of_quantity (optional)
Enumerated_Domain
Non_enumerated_Domain: description (required)
Permissible_Value: item, begin_date (required); end_date (conditional)
Value_Meaning: identifier, begin_date (required); description (optional); end_date (conditional)
Data_Element: administered_component, representation_class (required); representation_class_qualifier (optional)
Source_Data_Element
Rule: description (required); administered_component (optional)
Example: item (required)
Data_Element_Concept_Relationship: type_description (required)
Value_Domain_Relationship: type_description (required)

Associations shown in the diagram: allowed_value, permissible_value, permissible_value_meaning, value_meaning_set, conceptual_domain_relationship, value_domain_relationship, specification, data_element_concept_relationship, data_element_concept_conceptual_domain_relationship, representation, exemplification, expression, derivation_input, derivation_output, derivation.

NOTE:

This model represents the logical structure of a registry for data elements and related components that are in a "recorded" or higher registration status.

For UML v1.3 documentation see: ftp://ftp.omg.org/pub/docs/ad/99-06-08.pdf
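As a rough illustration of how part of this metamodel might look in code, the sketch below models Value_Meaning, Permissible_Value, and an enumerated value domain as Python dataclasses, using the attribute names from the diagram. It is a simplification under stated assumptions: the `permits` and `valid_on` methods, the flattening of Enumerated_Domain and Value_Domain into one class, and the example dates are all invented for the illustration and are not part of ISO/IEC 11179-3.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ValueMeaning:
    identifier: str                      # required in the diagram
    begin_date: date                     # required
    description: Optional[str] = None    # optional
    end_date: Optional[date] = None      # conditional


@dataclass
class PermissibleValue:
    item: str                            # required: the value as stored, e.g. "DEU"
    meaning: ValueMeaning                # the value meaning this item represents
    begin_date: date                     # required
    end_date: Optional[date] = None      # conditional

    def valid_on(self, day: date) -> bool:
        # Hypothetical helper: a value is usable between begin_date and end_date.
        return self.begin_date <= day and (self.end_date is None or day <= self.end_date)


@dataclass
class EnumeratedDomain:
    name: str
    datatype: str                        # required on Value_Domain in the diagram
    permissible_values: List[PermissibleValue] = field(default_factory=list)

    def permits(self, item: str, day: date) -> bool:
        # Hypothetical helper: is this item an allowed value on the given day?
        return any(pv.item == item and pv.valid_on(day) for pv in self.permissible_values)


# Example data; the dates are placeholders, not registry facts.
start = date(2000, 1, 1)
germany = ValueMeaning(identifier="Germany", begin_date=start)
alpha3 = EnumeratedDomain(
    name="ISO 3166 3-Alpha Code",
    datatype="string",
    permissible_values=[PermissibleValue(item="DEU", meaning=germany, begin_date=start)],
)
```

With this data, `alpha3.permits("DEU", date(2004, 9, 1))` is true, while an unknown code or a date before `begin_date` is rejected.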


6

Metadata Registries: Semantics Management Evolution

Database (schema) integration
System design
Data use - metadata
Warehouse support - schema and metadata
XML support (schema)
“Backed into” terminology support
Next: Semantics servers, for semantics web and semantics based computing

7

Metadata Registries

[Figure: Metadata Registry (ISO/IEC 11179 Metadata Registries) shown together with Terminology, Thesaurus, Themes, Data Standards, Ontology, GEMET, and Structured Metadata.]

8

Elements of Terminology

[Figure: triangle relating Concept, Sign, and Object.]

Signs: trout, Salmo trutta, brown trout, truite
Concept: any of several game fishes of the genus Salmo, related to the salmon...

9

Terminology

Term            Context
trout           common name
Salmo trutta    scientific name
truite          French name

Concept (UIN=6349): any of several game fishes of the genus Salmo, related to the salmon...
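The relationship on this slide, several terms in different contexts pointing at one concept identifier, can be sketched as a small lookup. The function names `concept_for` and `terms_for` are assumptions made for this example; only the terms, contexts, definition, and UIN come from the slide.

```python
# Concept definitions keyed by UIN (value taken from the slide).
CONCEPTS = {
    6349: "any of several game fishes of the genus Salmo, related to the salmon...",
}

# (term, context, UIN) triples taken from the slide.
TERMS = [
    ("trout", "common name", 6349),
    ("Salmo trutta", "scientific name", 6349),
    ("truite", "French name", 6349),
]


def concept_for(term):
    """Return the UIN of the concept a term designates, or None if unknown."""
    for name, _context, uin in TERMS:
        if name.lower() == term.lower():
            return uin
    return None


def terms_for(uin, context=None):
    """Return all terms for a concept, optionally limited to one context."""
    return [name for name, ctx, u in TERMS if u == uin and (context is None or ctx == context)]
```

A caller that only knows the French term can reach the same definition as one that knows the scientific name: `CONCEPTS[concept_for("truite")]` and `CONCEPTS[concept_for("Salmo trutta")]` resolve to the same entry.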

10

Term            Context
brown trout     common name
Salmo trutta    scientific name
truite          French name

Concept: UIN=6349

Data Elements

Name: trout species
Definition: The names of species of trout.
Values:
brook trout        Salvelinus fontinalis
brown trout        Salmo trutta
cutthroat trout    Oncorhynchus clarkii

11

[Figure: the terminology entry (brown trout, common name; Salmo trutta, scientific name; truite, French name; concept UIN=6349) and its data elements, linked to three areas of use.]

DBMS Query: systems such as STORET, Envirofacts . . .
Data Interchange: XML schemas, EDI messages
Documents Publishing: Federal Register, regulations, reports

12

Search Example: Fish, Trout

[Figure: a search engine using a thesaurus to find documents and data items 1 to 8.]

Terms: brown trout (common name), Salmo trutta (scientific name), truite (French name)
Concept: UIN=6349
Thesaurus: fish, trout, brown trout (Salmo trutta)
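One way to read this slide is that a thesaurus lets a search for a broad term also find material indexed under narrower terms and their synonyms. The sketch below illustrates that idea under stated assumptions: the hierarchy direction (fish, then trout, then brown trout), the sample documents, and the names `expand` and `search` are invented for the example, and plain substring matching stands in for a real search engine.

```python
# Assumed thesaurus: broader term -> narrower terms, plus synonyms per term.
NARROWER = {"fish": ["trout"], "trout": ["brown trout"]}
SYNONYMS = {"brown trout": ["Salmo trutta", "truite"]}


def expand(term):
    """Return the term, its synonyms, and everything narrower (depth-first)."""
    result = [term] + SYNONYMS.get(term, [])
    for child in NARROWER.get(term, []):
        result.extend(expand(child))
    return result


def search(term, documents):
    """Return ids of documents mentioning the term or any expanded term."""
    wanted = [t.lower() for t in expand(term)]
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(t in text.lower() for t in wanted)
    ]


# Hypothetical documents, not taken from the slide.
DOCUMENTS = {
    1: "Salmo trutta counts by river",
    2: "Soil acidity report",
    3: "La truite en Europe",
}
```

Searching for "fish" here also matches the documents that mention only the scientific or French name, which a plain keyword match on "fish" would miss.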

13

Intelligent Information Services (IIS): Ontology

[Figure: Query Agent, Broker, Mediator, and Resource Agent, with Local Mapping and Central Mapping, drawing on the terminology entry below.]

Terms: brown trout (common name), Salmo trutta (scientific name), truite (French name)
Concept: UIN=6349

Example: fish, trout, brown trout

14

Semantic Mapping

Global Ontology: Observation, Station, Unit, Determinant, Medium, AnalyticalFraction, TimeStamp
Local Ontology: NERIObservationCharacteristics, NERITime, NERIStation
Local DB Schema: Table(x), Table(y), Table(z), Table(m)
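The layering on this slide (global ontology, local ontology, local database schema) suggests a two-step lookup from a shared concept to the table that holds the data. The sketch below shows that shape only. The slide does not say which global concept maps to which local class or table, so every pairing in the two dictionaries is an assumption made for illustration, as is the function name `resolve`.

```python
# Assumed pairings: the slide names these items but not how they connect.
GLOBAL_TO_LOCAL = {
    "Station": "NERIStation",
    "TimeStamp": "NERITime",
    "Determinant": "NERIObservationCharacteristics",
}

LOCAL_TO_TABLE = {
    "NERIStation": "Table(x)",
    "NERITime": "Table(y)",
    "NERIObservationCharacteristics": "Table(z)",
}


def resolve(global_concept):
    """Follow global ontology -> local ontology -> local DB schema.

    Returns the table name, or None when no mapping has been registered.
    """
    local_class = GLOBAL_TO_LOCAL.get(global_concept)
    if local_class is None:
        return None
    return LOCAL_TO_TABLE.get(local_class)
```

A query phrased in global terms, such as "Station", can then be routed to a local table without the caller knowing the local schema. Unmapped concepts return `None` rather than a guess.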

15

Terminology Sources

GEneral Multilingual Environmental Thesaurus (GEMET) and CNR Earth Thesaurus
DOE, NIH and NCI Safety and Health concepts/terms
I/ICE Participants
Ecoterm
Government
State/Local
Private Enterprise
Academe

[Figure: these sources feeding the terminology entry (brown trout, common name; Salmo trutta, scientific name; truite, French name; concept UIN=6349) and its data elements.]

16

Terminology Management

[Figure: Dictionary, Keyword, Thesaurus, Ontology, and Data Elements connected to Search Engine, DBMS/EDI/Documents, IIS, and Semantics Server.]

Example definition: a category of vertebrate, cold-blooded craniate animals with permanent gills...

17

Purpose of XMDR

Extend Semantics Management Capabilities of ISO/IEC 11179

Test & Demo Extended Capabilities in a Reference Implementation

Produce Design for Next Generation Operational 11179 Registries

Propose Revisions to 11179 Parts 2 & 3 (Ver. 3)

Adapt & Adopt Emerging (Semantic) Technologies

Help Resolve Registration & Interrelation Issues for Complex Metadata Standards

Forging Semantics Based Computing

18

Project Background

Collaborative, Interagency Effort: DOD, EPA, LBNL, USGS, NCI, Mayo Clinic…Others?

Draws on and Contributes to Interagency/International Cooperation on Ecoinformatics

Involves International, National, State, Local Government Agencies, other Organizations

Recognizes Great Potential of Semantics-based Computing, Management of Metadata

Improving Collection, Maintenance, Dissemination, Processing of Very Diverse Data Structures

Collaboration Arises from Need to Share Diverse Data Across Multiple Organizations

Project Duration Expected to be July 04 - Jun 05, +

Many Players, Many Interests…Shared Context

19

Concept Of Operations

Service Oriented Architecture
  Enables Heterogeneous, Disparate Systems to Interoperate
  Agreement in the Interface, Not the Implementation
  Publish, Find, Bind

Standards Based Design
  Lifecycle Application Support
  Abstract Technology Commonalities
  Open Standards, Technology Agnostic
  Durable: Used for Current and Future Technologies

Semantic Web Service
  Publish, Find, Bind…Automatically
  Component of Semantic Web
  Bootstrap Semantic Web?
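The "Publish, Find, Bind" pattern named above can be shown with a toy, in-process registry. This is a sketch under stated assumptions, not the XMDR design: `ServiceRegistry`, the interface name `"terminology.lookup"`, and the `lookup_uin` provider are all hypothetical, and a real service oriented architecture would do this over a network with an agreed interface description.

```python
class ServiceRegistry:
    """In-memory stand-in for a service registry (hypothetical)."""

    def __init__(self):
        self._providers = {}

    def publish(self, interface, provider):
        # Publish: a provider announces that it implements a named interface.
        self._providers.setdefault(interface, []).append(provider)

    def find(self, interface):
        # Find: a consumer asks which providers implement the interface.
        return list(self._providers.get(interface, []))


def lookup_uin(term):
    """Example provider: resolve a term to its concept UIN (values from the slides)."""
    known = {"brown trout": 6349, "salmo trutta": 6349, "truite": 6349}
    return known.get(term.lower())


registry = ServiceRegistry()
registry.publish("terminology.lookup", lookup_uin)

# Bind: the consumer calls whatever it found, knowing only the interface name.
provider = registry.find("terminology.lookup")[0]
result = provider("Salmo trutta")
```

The consumer never imports `lookup_uin` directly. It depends on the interface name alone, which is the "agreement in the interface, not the implementation" idea from the slide.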

20

Major Tasks, Deliverables & Milestones

Task/Deliverable                                                 Date
Develop Project Plan                                             Jul 04
Identify, Select Technologies                                    Dec 04
Identify, Select Metadata Sources                                Dec 04
Initial Architecture Design                                      Aug 04
Research and Development                                         < EOP
System Test & Evaluation (Internal Participants)                 < Mar 05
Test Implementation (External Users)                             Mar 05
Present Proposed 11179 Part 2 Revisions to SC32 WG2 mtg in DC    Nov 04
Prepare Draft Revision of 11179 Part 2 for SC32 mtg in Berlin    Apr 05

21

Potential Standards/Technologies

DBMS: Object, XML, Relational, RDF/Graph, Logic, Text, Document, Multimedia

Knowledge Representation: Web Ontology Language (OWL), Simple Common Logic (SCL)

Middleware/Messaging: Cocoon 2, Jini, CoABS, JMS, XMLBlaster, SOAP

XML [Semantic] Web Services: Axis, JWSDP

Agent Development: ABLE, JADE

Engines/Servers: OMS (IBM), Federator/OMS (OWI), Jess

22

ISO/IEC 11179 Expressed as an Ontology

<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.owl-ontologies.com/unnamed.owl#"
    xml:base="http://www.owl-ontologies.com/unnamed.owl">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Registrar">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="contact"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
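To show what the first restriction in this fragment says (a Registrar has exactly one contact), the sketch below reads it with Python's standard `xml.etree.ElementTree`. Because the slide's fragment is cut off, the `DOC` string is an assumed completion that simply closes the open tags after the first restriction. The function name `cardinality_restrictions` is also an assumption for this example. A dedicated RDF or OWL library would be the more usual tool for real ontologies.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumption: the slide's fragment is truncated, so this copy closes the
# open elements right after the first restriction.
DOC = """<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.owl-ontologies.com/unnamed.owl#"
    xml:base="http://www.owl-ontologies.com/unnamed.owl">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Registrar">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="contact"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def cardinality_restrictions(xml_bytes):
    """Return (class, property, cardinality) for each exact-cardinality restriction."""
    root = ET.fromstring(xml_bytes)
    found = []
    for cls in root.findall(f"{{{OWL}}}Class"):
        class_name = cls.get(f"{{{RDF}}}ID")
        for restriction in cls.iter(f"{{{OWL}}}Restriction"):
            card = restriction.find(f"{{{OWL}}}cardinality")
            prop = restriction.find(f"{{{OWL}}}onProperty/{{{OWL}}}ObjectProperty")
            if card is not None and prop is not None:
                found.append((class_name, prop.get(f"{{{RDF}}}ID"), int(card.text)))
    return found


# Bytes are passed so the parser honours the declared ISO-8859-1 encoding.
restrictions = cardinality_restrictions(DOC.encode("iso-8859-1"))
```

Each tuple in `restrictions` names the class, the constrained property, and the required number of values, which is the registry rule the OWL fragment encodes.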

23

Potential Content Domains

Environmental (Ecoterm, GBIF, …)
Biomedical
Chemical
Geographic Information Systems
Bibliographic
Ontologies/Metadata Standards
General Terminologies/Ontologies
Economic
Code Sets
Other

Diverse Domains, Structures…Representative Samples

24

Other Calendar Events

Date            Event                                Location        Comments
13-14 Jul 04    Project Kickoff                      Berkeley, CA    LBNL Hosted
7-11 Sep 04     MedInfo 2004                         San Fran, CA
13-15 Oct       QPR                                  Berkeley, CA    Quarterly Project Review
4-6 Nov 04      FOIS                                 Torino, IT      Formal Ontologies in Info. Systems
7-11 Nov 04     ISWC                                 Hiroshima       Int’l Semantic Web Conference
8-12 Nov 04     SC32/WG2 Meeting                     DC              Provide Progress Report
11-14 Apr 05    Open Forum on Metadata Registries    Berlin          Report 11179 Revisions

Involved in Other Disciplines

[Figure: users served by an Environmental Data Grid, an Environmental Computer Grid, and an Environmental Semantics Grid.]

Participants: Companies, Universities, Agencies, Others
Services: Data Services, Semantic Services, Computation Services
Environmental Computer Grid: high performance, cluster, personal
Environmental Semantics Grid: Metadata Registries, Terminology, Thesaurus, Ontology, Taxonomy, Structured Metadata, Data Standards
Software: models, visualization, analysis; agent systems; semantic based computing

26

XMDR & I/ICE

How do we collaborate?
Ecoterm
GBIF
EPA EDR/TRS
EEA Data dictionary
NIH/NCI
Agriculture
