
Page 1: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP)

Best Practices Committee (BPC), CIO Council

December 28, 2005

http://web-services.gov/ and http://colab.cim3.net/cgi-bin/wiki.pl?SICoP

http://colab.cim3.net/cgi-bin/wiki.pl?DRMImplementationThroughIterationandTestingPilotProjects

Page 2: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Preface

• Conceptual Data Model (CDM) – a model to guide data architecture, not a model to guide database development.

• But an ontology provides both, and the pilot is both a CDM and an executable application based on DRM 2.0!

• So Data Architecture can be implemented in ontology-driven information systems.
  – See next slide.

Page 3: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Preface

• Ontology-Driven Information Systems:
  – Methodology Side – the adoption of a highly interdisciplinary approach:
    • Analyze the structure at a high level of generality.
    • Formulate a clear and rigorous vocabulary.
  – Architectural Side – the central role in the main components of an information system:
    • Information resources.
    • User interfaces.
    • Application programs.

See for example: Nicola Guarino, Formal Ontology and Information Systems, Proceedings of FOIS ’98, Trento, Italy, 6-8 June 1998.

Page 4: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Preface

Layer | FHA/DAWG | SICoP DRM 2.0 | Pilot
1 | CDM | Ontology | FHA Health Domains & Data Element Concepts
(1-2) | Wiring diagram | Semantic Relationships | Semantic links & associations
2 | DB Systems | Metadata & Data | Health US 2005

Page 5: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Preface

• Health Information Technology CoP’s Health Information Technology Ontology Project (HITOP)* Major Objectives and Examples:
  – 1. Ontology-graph assisted search of medical literature: SemanTx Life Sciences Pilot.
  – 2. Ontology in major health standards development: Barry Smith - HL7 RIM.
  – 3. Ontology in the FHA Data Architecture Work Group: Brand Niemann – DRM 2.0 Pilot.
  – 4. Ontologies in bioinformatics: Ken Baclawski – Book and Keynote Presentation.
  – 5. Ontologies in operational clinical systems: Mark Musen – Stanford Medical Informatics.
  – 6. Ontologies in large-scale medical research systems: Connor Skankey – Visual Knowledge BioCAD.

* Marc Wine, GSA Office of Intergovernmental Solutions, CoP Lead

Page 6: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Preface

• Question: Should the FHA DAWG be overly focused on metadata?
  – Metadata and data are integrated together in DRM 2.0 and the pilot.
• Question: Should FHA DAWG work with unstructured or semi-structured data or defer this task to partners/agencies?
  – All three types of data (structured, semi-structured, and unstructured) are integrated together in DRM 2.0 and the pilot.
• Question: Should FHA DAWG also add physical data modeling to its methodology?
  – The DRM ITIT Pilot shows how both conceptual and physical data modeling are done together with ontologies.
• Question: Should educational material on metadata and data modeling be present in the Data Strategy?
  – DRM 2.0 put educational material in the DRM Reference Model and ITIT Wiki Pages, not the Reference Model Document itself.

Page 7: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Preface

• Question: Should we align more closely to FEA DRM?
  – Aligning with DRM 2.0 adds credibility to the work, and the pilot specifically demonstrates the three components of DRM 2.0.
• Question: How detailed a level of analysis can be performed by the FHA DAWG?
  – This depends on the level of detailed data and information that the FHA partners are willing to expose, e.g. the pilot uses summary data that is in the public domain.
• Question: Does the FHA DAWG analyze only (discover) or does it prescribe a solution (recommendation) like semantic harmonization scenarios?
  – SICoP and DRM ITIT are concerned with achieving semantic harmonization and interoperability. E.g., the suggestion to include the CHI vocabularies in the pilot should be implemented.

Page 8: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Overview

• 1. The New Data Reference Model 2.0
• 2. Health, United States, 2005
• 3. Data Architecture Working Group
• 4. Pilot Project
• Appendix. Other Related Work

Page 9: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot


1. The New Data Reference Model 2.0

• The FEA framework and its five supporting reference models (Performance, Business, Service, Technical and Data) are now used by departments and agencies in developing their budgets and setting strategic goals. With the recent release of the Data Reference Model (DRM), the FEA will be the “common language” for diverse agencies to use while communicating with each other and with state and local governments seeking to collaborate on common solutions and sharing information for improved services.

Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

Page 10: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

• The following chart illustrates the potential uses of the newly released DRM Version 2.0:
  – The FEA mechanism for identifying what data the Federal government has and how it can be shared in response to a business/mission requirement.
  – The frame of reference to facilitate Communities of Interest (which will be aligned with the Lines of Business) toward common ground and common language to facilitate improved information sharing.
  – Guidance for implementing repeatable processes for sharing data Government-wide.

Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

Page 11: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

Page 12: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

Paradigm Shifts

From:
• FEA Reference Model Taxonomies
• FEA “Common Language”
• DRM 1.0 by committee
  – Implementation after development.

To:
• FEA Reference Model Ontology
• FEA Semantic Model
• DRM 2.0 by open, collaborative process
  – Implementation through iteration and testing during development.

Page 13: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

• Original FEA Lines of Business (6):
  – Data and Statistics:
    • Opted out because of FedStats, Federal Committee on Statistical Methodology, etc. (it had its act together for statistical data management)
• Now it’s back with:
  – The new Data Reference Model 2.0 because statistical programs generally have the best data and metadata and data management practices.
  – The National Infrastructure for Community Statistics Community of Practice (NICS CoP)
  – The Federal Health Architecture Data Architecture Working Group because FHA agencies are statistical agencies:
    » See for example Health, United States, 2005 from the National Center for Health Statistics!

Page 14: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

• Metamodel: Precise definitions of constructs and rules needed for abstraction, generalization, and semantic models.
• Model: Relationships between the data and its metadata.
• Metadata: Data about the data.
• Data: Facts or figures from which conclusions can be inferred.

Relationships and associations

The purpose of this schematic is to show that we need to describe information model relationships and associations in a way that can be accessed and searched.

Source: Professor Andreas Tolk, August 16, 2005
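
The layering in this schematic can be sketched in a few lines of code. The following is a minimal Python illustration, not taken from the pilot: the class names, the relation names (`describes`, `has_source`), and the sample identifiers are all hypothetical. It assumes a metamodel that only lists permitted relation names, a model that records relationships between metadata and data, and a simple lookup that makes those relationships searchable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metamodel:
    """Rules for the layer below: which relation names a model may use (hypothetical)."""
    allowed_relations: frozenset


@dataclass
class Model:
    """Relationships between data items and their metadata."""
    metamodel: Metamodel
    links: list = field(default_factory=list)  # (metadata_item, relation, data_item)

    def relate(self, metadata_item, relation, data_item):
        # Reject relationships that the metamodel does not define.
        if relation not in self.metamodel.allowed_relations:
            raise ValueError(f"relation {relation!r} is not defined in the metamodel")
        self.links.append((metadata_item, relation, data_item))

    def find(self, relation):
        # Searchable associations: every (metadata, data) pair joined by `relation`.
        return [(m, d) for m, r, d in self.links if r == relation]


# Illustrative values only; none of these identifiers come from the pilot.
metamodel = Metamodel(frozenset({"describes", "has_source"}))
model = Model(metamodel)
model.relate("table_title", "describes", "table_values")
model.relate("source_note", "has_source", "table_values")
```

Calling `model.find("describes")` returns the metadata/data pairs linked by that relation, which is one simple way to make relationships and associations accessible and searchable.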

Page 15: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot


1. The New Data Reference Model 2.0

The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).

Page 16: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

• Five Key Activities Over the Next Year:
  – 1. Education and Training in DRM Version 2.0 and use in FEA – DRM-based Information Sharing Pilots (started June 13th).
  – 2. Testing of XML Schemas and OWL Ontologies by NIST and the National Center for Ontological Research, respectively, among others (began October 27th).
  – 3. Inventory/Repository of Semantic Interoperability Assets and Development of a Common Semantic Model (COSMO) by the new Ontology and Taxonomy Coordinating Work Group (ONTACWG) (started October 5th).
  – 4. Continued early implementation of DRM 2.0 concepts and artifacts by industry in “open collaboration with open standards” pilot projects and workshops (started July 19th). E.g. FHA/DAWG.
  – 5. Fostering champions of DRM Best Practices to improve (1) data architectures within agencies and (2) data sharing across agencies in funded projects (started June 13th).

Page 17: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

Where is SICoP DRM Implementation Going? Super Pilot: Address as Many Boxes as Possible!

Scale of Activity / Metadata Function | Agency | CoP/LoB | Cross-CoP/LoB
Discovery | e.g., EAAF 2.0 | e.g., FHA/DAWG | e.g., Indicators
Integration | | |
Reasoning | | |

? ? Yes

CoP: Community of Practice
LoB: Line of Business
EAAF: OMB Enterprise Architecture Assessment Framework 2.0
FHA/DAWG: Federal Health Architecture – Data Architecture Working Group

Page 18: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

1. The New Data Reference Model 2.0

• December 5-7, 2005, Knowledge Management Collaboration & Knowledge Sharing Conference, Orlando, Florida:
  – Using CoPs To Simplify Processes and Unify Work Across Agencies: Cross-Industry Applications:
    • Semantically Enabled Content (Wiki Purple Numbers, Ontology Modeling Before Content is Created, e.g., SiberLogic, Repurposed Content, etc.)
• December 13, 2005, Invited Presentation to the Federal Metadata Management Consortium (FMMC):
  – SICoP and DRM Implementation Through Iteration and Testing Work: Making It Real:
    • Semantic Knowledge Modeling and a Knowledge Reference Model for Implementing the Semantic Web in the Federal Government.

Page 19: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

2. Health, United States, 2005

• 156 tables in Excel plus 37 tables in Excel for figures
• Metadata (multiple levels and types)
  – For tables
  – Sources of data
  – Data stories
• Definitions - 194
• Repurpose this excellent content and model and map it to the DRM 2.0.

Page 20: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot


2. Health, United States, 2005

http://www.cdc.gov/nchs/hus.htm

Page 21: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

3. Data Architecture Working Group

Source: FHA Data Strategy, DRAFT V1.0, December 28, 2005.

CDM: Conceptual Data Model.

Page 22: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot


3. Data Architecture Working Group

• The scope of the Data Architecture Working Group is to help partner agencies to ensure that the FHA and its partners have a comprehensive and accurate view of the data needs of the FHA and to collect, store, and access the metadata in a consistent way. This charter extends to all Federal Departments whose mission is to provide and/or support the delivery of health care services that have been recommended and accepted. The Data Architecture Work Group also focuses on health metadata collection, analysis, and planning activities that are supported by the FHA Partner Council. The DAWG, as it pursues its data architecture objectives, will coordinate these activities with the other established workgroups of the FHA.

Source: FHA Data Architecture Working Group Initial Kickoff Meeting, December 13, 2005.

Page 23: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

• DRM 2.0:
  – Description (slides 24 and 27):
    • Metadata (Title, Data, Notes, and Sources)
    • Data Story
    • Definitions and Methods
  – Context:
    • Taxonomy and Search (slides 25-26)
  – Sharing:
    • Separation of Presentation and Data (slides 28-29)

Page 24: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot


4. Pilot Project

See http://web-services.gov and Dynamic Knowledge Repositories

This Data Architecture Provides the Three S’s: Structure, Searchability, and Semantics.

Page 25: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

Query of HUS 2005 Taxonomy Nodes

Federated Search of All FHA Taxonomy Nodes

See next slide for explanation.

Page 26: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

• Query of HUS 2005 Taxonomy Nodes:
  – This is the Expert Search Form Interface in the Web Browser, where (1) the left pane has the hierarchical table of contents structure in which the document(s) and their subsections are selected for search, and (2) the right pane has the boxes for the actual search query terms (“IDC codes”), the number of words around the highlighted search terms that are desired (none), the search execution button, and the query syntax explanation.
• Federated Search of All FHA Taxonomy Nodes:
  – This is the same interface as described above, except that a different set of boxes is checked in (1) the left pane (the entire FHA Node), and a different query (“data architecture”) and number of words around the highlighted search terms (five) are used in (2) the right pane.
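
The two searches differ only in which taxonomy nodes are selected, the query phrase, and how many words of context are shown around each hit. A minimal Python sketch of that behavior follows. It is not the pilot's search engine: the node paths, the sample text, and the `search` function are hypothetical, and the phrase matching is a plain word-by-word comparison.

```python
# Hypothetical taxonomy nodes (path -> text); the content is illustrative only.
NODES = {
    "FHA/DAWG/Scope": (
        "The working group defines a data architecture for health "
        "metadata collection and planning"
    ),
    "FHA/HUS2005/Definitions": (
        "Population estimates are used as denominators for rates"
    ),
}


def search(nodes, selected_prefixes, query, context_words=0):
    """Find a phrase in the selected nodes and return (path, snippet) pairs.

    selected_prefixes plays the role of the checked boxes in the left pane;
    query and context_words play the role of the right-pane inputs.
    """
    terms = query.lower().split()
    hits = []
    for path, text in nodes.items():
        if not any(path.startswith(prefix) for prefix in selected_prefixes):
            continue  # node was not selected for this search
        words = text.split()
        lowered = [w.lower().strip(".,;:") for w in words]
        for i in range(len(words) - len(terms) + 1):
            if lowered[i:i + len(terms)] == terms:
                start = max(0, i - context_words)
                end = min(len(words), i + len(terms) + context_words)
                hits.append((path, " ".join(words[start:end])))
    return hits


# Federated-style search: whole (hypothetical) FHA node, two words of context.
federated = search(NODES, ["FHA"], "data architecture", context_words=2)
# Narrow search: one sub-node only, no context words.
narrow = search(NODES, ["FHA/HUS2005"], "data architecture")
```

Widening the selection to the top-level prefix corresponds to checking the entire FHA Node, while `context_words` corresponds to the "number of words around the highlighted search terms" box.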

Page 27: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

Note: Can Highlight Table and Copy and Paste to Spreadsheet Because of XML Markup.

Figure labels: Metamodel, Model, Metadata, Data, Data Story.

Recall Slide 8

Page 28: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

Figure labels: Data & Metadata (see next slide); Data Presentation/Visualization.

http://web-services.gov/statabs2003no1.htm

Separation of the Data Presentation from the Data & Metadata.

Page 29: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

Figure label: Data & Metadata in XML.

http://web-services.gov/statabs2003no1.htm

The Data & Metadata Travel Together in XML Format!
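
As a rough illustration of data and metadata traveling together, the Python sketch below parses one small XML document that carries both, then produces a tab-separated text view from it. The element names (`table`, `metadata`, `data`, `row`, `cell`) and all values are assumptions made for this example; they are not the markup used at the URL above.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup and placeholder values, for illustration only.
DOC = """
<table>
  <metadata>
    <title>Example table</title>
    <source>Example source</source>
    <notes>Values are placeholders.</notes>
  </metadata>
  <data>
    <row><cell>Year</cell><cell>Value</cell></row>
    <row><cell>2001</cell><cell>10</cell></row>
    <row><cell>2002</cell><cell>12</cell></row>
  </data>
</table>
"""


def load(xml_text):
    """Read the metadata and the data rows from the same XML document."""
    root = ET.fromstring(xml_text.strip())
    metadata = {child.tag: child.text for child in root.find("metadata")}
    rows = [
        [cell.text for cell in row.findall("cell")]
        for row in root.find("data").findall("row")
    ]
    return metadata, rows


def render_text(metadata, rows):
    """One possible presentation: title, tab-separated rows, then the source."""
    lines = [metadata["title"]]
    lines += ["\t".join(row) for row in rows]
    lines.append("Source: " + metadata["source"])
    return "\n".join(lines)


metadata, rows = load(DOC)
text_view = render_text(metadata, rows)
```

Because the presentation is generated from the XML rather than stored in it, a different renderer (for example, one that emits HTML) could reuse the same `metadata` and `rows` unchanged. The tab-separated form is chosen here because it pastes cleanly into a spreadsheet.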

Page 30: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

• Federal Health Architecture Data Architecture Ontology Metamodel:
  – Use Lines of Business/Business Reference Model to Define the “Upper Ontology”:
    • See slide 31.
  – Use Data Elements to Define the Domain-Specific Ontology:
    • See slide 32.

Page 31: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

FHA Health Domains (FEA Health LoB Sub-Functions), each with its Definition and Instances:

• 1. Access to Care
  – Definition: Focuses on the access to appropriate care
  – Instances: Access to Care
• 2. Population Health and Consumer Safety
  – Definition: Assesses health indicators and consumer products as a means to protect and promote the health of the general population
  – Instances: Healthy People Chartbook Indicators; Supporting disease registries, e.g., cancer, hepatitis C, SCI, and immunology
• 3. Health Care Administration
  – Definition: Assures that federal health care resources are expended effectively to ensure quality, safety, and efficiency
  – Instances: Health Care Expenditures and Payors
• 4. Health Care Delivery Services
  – Definition: Provides and supports the delivery of health care to its beneficiaries
  – Instances: Health Care Utilization and Health Care Resources
• 5. Health Care Research and Practitioner Education
  – Definition: Fosters advancements in health discovery and knowledge
  – Instances: HUS, 2005, Special Feature: Adults 55–64 Years of Age

Page 32: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

4. Pilot Project

Concept | Definitions and Methods | Instance
Population | Population | Data Table for Figure 1
Age | Age | Data Table for Figure 2
Race | Race | Data Table for Figure 3
Poverty | Poverty level | Data Table for Figure 4
Income | Family income | Data Table for Figure 5
Health insurance coverage | Health insurance coverage | Data Table for Figure 6

SEE ONLINE VERSION
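
The two-level structure described on slides 30 to 32 (an upper ontology from the health domains, a domain-specific ontology from data element concepts) can be sketched as plain data. The Python below is only an illustration: the `Concept` class, the field names, and the `instances_for` lookup are hypothetical, while the domain names and the concept rows are copied from slides 31 and 32. It does not show how the pilot actually stores or links its ontology.

```python
from dataclasses import dataclass

# Upper ontology: the five FHA Health Domains listed on slide 31.
UPPER_ONTOLOGY = (
    "Access to Care",
    "Population Health and Consumer Safety",
    "Health Care Administration",
    "Health Care Delivery Services",
    "Health Care Research and Practitioner Education",
)


@dataclass(frozen=True)
class Concept:
    """One row of the slide 32 table (field names are hypothetical)."""
    name: str
    definition_entry: str  # entry under "Definitions and Methods"
    instance: str


# Domain-specific ontology: the data element concepts from slide 32.
DOMAIN_ONTOLOGY = (
    Concept("Population", "Population", "Data Table for Figure 1"),
    Concept("Age", "Age", "Data Table for Figure 2"),
    Concept("Race", "Race", "Data Table for Figure 3"),
    Concept("Poverty", "Poverty level", "Data Table for Figure 4"),
    Concept("Income", "Family income", "Data Table for Figure 5"),
    Concept("Health insurance coverage", "Health insurance coverage",
            "Data Table for Figure 6"),
)


def instances_for(term):
    """Return instances whose concept name or definition entry matches term."""
    wanted = term.strip().lower()
    return [
        concept.instance
        for concept in DOMAIN_ONTOLOGY
        if wanted in (concept.name.lower(), concept.definition_entry.lower())
    ]
```

For example, `instances_for("Family income")` and `instances_for("income")` both resolve to the same data table, which is the kind of concept-to-instance link the table expresses.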

Page 33: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Appendix. Other Related Work

• Building an Ontology of the National Health Information Network (NHIN): Status Report:
  – http://web-services.gov/nhinrfiontology04052005.ppt
  – http://ontolog.cim3.net/cgi-bin/wiki.pl?NhinRfi
  – http://ontolog.cim3.net/cgi-bin/wiki.pl?HealthOntologyMapping
• Collaborative Expedition Workshops (examples):
  – December 9, 2004, Standard Vocabularies in Health Care, Kathy Lesh, Kevric.
  – July 19, 2005, Building a Hospital Incident Reporting Ontology (HIRO) in the Web Ontology Language (OWL) Using the JCAHO Patient Safety Event Taxonomy (PSET), Liju Fan, Kevric, et al.
  – December 6, 2005, Introduction to the Semantic Web for Bioinformatics, Ken Baclawski, Northeastern University.
  – December 6, 2005, Boston Children's Hospital "smart search" and Semantic UMLS Ontology-based Professional Language Processing PubMed Search, Michael Belanger, President, SemanTx Life Sciences.
• See http://colab.cim3.net/cgi-bin/wiki.pl?ExpeditionWorkshop

Page 34: FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Appendix. Other Related Work

• Health Information Technology Ontology Project (HITOP):
  – New CoP Led by Marc Wine, GSA Office of Intergovernmental Solutions:
    • Develop a roadmap on the state-of-the-art use of ontology tools to achieve semantic interoperability for high priority health IT applications involving clinical decision support systems (DSS) and electronic health records (EHRs).
  – See http://colab.cim3.net/cgi-bin/wiki.pl?HealthInformationTechnologyCommunityofPractice
  – Fourth Semantic Interoperability for E-Government Conference, February 9-10, 2006, Work Group Reports:
    • Featured Presentation: Barry Smith, HL7 RIM
  – See http://colab.cim3.net/cgi-bin/wiki.pl?FourthSemanticInteroperabilityforEGovernmentConference_2006_2_0910