Data.gov Wiki: A Semantic Web Approach to Government Data Li Ding, Dominic DiFranzo, Sarah Magidson, Alvaro Graves, James R. Michaelis, Xian Li, Deborah L. McGuinness, Jim Hendler Tetherless World Constellation Nov 2, 2009

Page 1: Data Wiki:  A Semantic Web Approach to Government Data

Data.gov Wiki: A Semantic Web Approach to Government Data

Li Ding, Dominic DiFranzo, Sarah Magidson, Alvaro Graves, James R. Michaelis, Xian Li, Deborah L. McGuinness, Jim Hendler

Tetherless World Constellation

Nov 2, 2009

Page 2: Data Wiki:  A Semantic Web Approach to Government Data

Synergy

• Government: data is out there “as is”

• Loop: gov data and linked data

• Loop: gov data and web developers

• Loop: gov data and end users

Page 3: Data Wiki:  A Semantic Web Approach to Government Data

Government Data on the Web

Page 4: Data Wiki:  A Semantic Web Approach to Government Data

Objectives

• Investigate the role of the semantic web in producing, processing and utilizing government datasets

– To enrich the value of data via normalizing, linking and information extraction

– To realize the value of data via applications, esp. visualization

– To support web developers via machine-friendly data access and web services

Page 5: Data Wiki:  A Semantic Web Approach to Government Data

Semantic Web Architecture for Government Data

[Figure: architecture diagram. Stages: Convert Data, Link & Enrich Data, View & Use Data.]

Labels in the diagram:

• Data Processors (Web Services & Analyzers)

• SPARQL Web Service, SPARQL End Point, XSLT Service, Diff Service, RSS Generator, Link Annotator

• GOV data (RDF), Linked Data, RDF/XML

• Google Viz, MIT Exhibit, RSS 1.0, tagCloud, Tabulator

• CSV, XSL, …

• Sem Wiki

Li Ding, Dominic DiFranzo, Sarah Magidson, and Jim Hendler · Tetherless World Constellation · Rensselaer Polytechnic Institute · Aug 7 2009 · http://data-gov.tw.rpi.edu/

Page 6: Data Wiki:  A Semantic Web Approach to Government Data

The Landscape

Page 7: Data Wiki:  A Semantic Web Approach to Government Data

The catalog data

Page 8: Data Wiki:  A Semantic Web Approach to Government Data

Data-gov Cloud (Aug 2009)

[Figure: dataset cloud.]

Category labels:

• Budget

• Population

• Energy and Utilities

• Geography and Environment

Dataset labels:

• (#10) Residential Energy Consumption Survey

• (#401) Budget Authority and offsetting receipts 1976-2014

• (#403) Governmental Receipts 1962-2014

• (#402) Outlays and offsetting receipts 1962-2014

• (#249) 2006 Toxics Release Inventory

• (#90) 2005-2007 ACS PUMS Housing

• (#191) 2005 Toxics Release Inventory

• (#91) 2005-2007 ACS PUMS Population

• (#34) Worldwide M1+ Earthquakes past 7 days

• (#9) CASTNET Visibility

• (#397) 2007 Toxics Release Inventory

• (#8) CASTNET Ozone

• (@10001) CASTNET sites

Page 9: Data Wiki:  A Semantic Web Approach to Government Data

Data-gov Cloud (Oct 2009)

Li Ding and Jim Hendler · Tetherless World Constellation · Rensselaer Polytechnic Institute · Oct 2009 · http://data-gov.tw.rpi.edu/

[Figure: dataset cloud.]

Dataset labels (with coverage years):

• US-COMMUNITY (2005-2007)

• CASTNET (1990 – Present)

• RECS (2005)

• GOV-BUDGET (1962-2014)

• TOXIC-RELEASE (2005-2008)

• EARTHQUAKE (Present)

• STATE-LIB (2006-2007)

• PUBLIC-LIB (1992-2006)

• MED-COST (1994-2009)

• LABOR-STAT (19xx-Present)

• DATA-GOV-CATALOG (present)

• USAspending (2008-2010)

Category labels: Government, Community, Services, Environment

Other labels: CASTNET sites, RECS code, US agency, US location, Linked Data, GeoNames

Page 10: Data Wiki:  A Semantic Web Approach to Government Data

More statistics

Page 11: Data Wiki:  A Semantic Web Approach to Government Data

Demos

Page 12: Data Wiki:  A Semantic Web Approach to Government Data

Data.gov + epa.gov

Page 13: Data Wiki:  A Semantic Web Approach to Government Data

Gov Data + Corporate Data + User Data

Page 14: Data Wiki:  A Semantic Web Approach to Government Data

Computing Difference of Revisions
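
As a rough illustration of what computing the difference between two revisions of a dataset can mean, the sketch below treats each revision as a set of RDF-style triples and reports which triples were added and which were removed. This is an assumption made for illustration, not the Diff Service from the slides: the function name `diff_revisions` and the `ex:` sample triples are hypothetical, and a real RDF diff would also need to handle blank nodes.

```python
def diff_revisions(old_triples, new_triples):
    """Return the triples added and removed between two revisions.

    Each revision is an iterable of (subject, property, value) tuples.
    Plain set difference is used, so blank nodes are not handled.
    """
    old, new = set(old_triples), set(new_triples)
    return {
        "added": sorted(new - old),    # present only in the new revision
        "removed": sorted(old - new),  # present only in the old revision
    }


# Hypothetical sample data, not taken from data.gov.
revision_1 = [
    ("ex:site1", "ex:ozone", "0.031"),
    ("ex:site2", "ex:ozone", "0.027"),
]
revision_2 = [
    ("ex:site1", "ex:ozone", "0.031"),
    ("ex:site2", "ex:ozone", "0.029"),
]

changes = diff_revisions(revision_1, revision_2)
print(changes["added"])
print(changes["removed"])
```

An updated value shows up as one removed triple plus one added triple, which is often enough to summarize what changed between two published versions.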

Page 15: Data Wiki:  A Semantic Web Approach to Government Data

More demos?

• http://data-gov.tw.rpi.edu/wiki/demos

Page 16: Data Wiki:  A Semantic Web Approach to Government Data

Technical Issues

Page 17: Data Wiki:  A Semantic Web Approach to Government Data

Issues in Data.gov

• Duplicated Datasets - Some datasets are part of another dataset.

– Dataset 140 (2005 Toxics Release Inventory data for the state of California (EPA)) is a subset of Dataset 191.

• Formatting Issues - The format of some datasets is not friendly to machine processing.

– Dataset 37 (Lower Colorado River Daily Average Water Elevations and Releases (US Bureau of Reclamation)).

– Dataset 335 (National Longitudinal Surveys (US Bureau of Labor Statistics)) only tells you how to order data from the government.

• Access Point Issues - Some access points are interactive web pages, which are not friendly to machine access.

– Dataset 330 (Local Area Unemployment Statistics (US Bureau of Labor Statistics)).

Sarah

Page 18: Data Wiki:  A Semantic Web Approach to Government Data

Linking Data

1. link similar datasets by reusing property namespace

2. link to rdfs:label (via rdfs:subPropertyOf) using semantic wiki

3. link to DBpedia (via owl:sameAs) using wikipedia widget

4. link instances (via common <property, literal-value> pair)

5. link government data with web data (via time and location)

6. link revisions of government data (via knowledge provenance)
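
Item 4 above (linking instances via a common <property, literal-value> pair) can be sketched as a simple grouping step. This is a minimal sketch under stated assumptions, not the project's implementation: the function name `link_instances`, the `ex:` identifiers and the sample values are all hypothetical.

```python
from collections import defaultdict


def link_instances(triples, link_property):
    """Group subjects that share the same literal value for link_property.

    triples is an iterable of (subject, property, value) tuples.
    Returns {value: sorted list of subjects}, keeping only values
    shared by at least two distinct subjects.
    """
    by_value = defaultdict(set)
    for subject, prop, value in triples:
        if prop == link_property:
            by_value[value].add(subject)
    return {
        value: sorted(subjects)
        for value, subjects in by_value.items()
        if len(subjects) > 1
    }


# Hypothetical sample data from two imagined datasets.
sample_triples = [
    ("ex:site1", "ex:state", "NY"),
    ("ex:facility9", "ex:state", "NY"),
    ("ex:site2", "ex:state", "CA"),
    ("ex:site1", "ex:name", "Site One"),
]

linked = link_instances(sample_triples, "ex:state")
print(linked)
```

A shared literal value is only evidence that two instances are related, so the groups are returned as candidates rather than asserted as identical resources.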

Page 19: Data Wiki:  A Semantic Web Approach to Government Data

Semantic mapping: AI + CI

need manual disambiguation!

Map to Wikipedia/DBpedia Name

Page 20: Data Wiki:  A Semantic Web Approach to Government Data

RDF => SPARQL => Web

• We use SPARQL to bridge Web developers and Semantic Web data.

• A triple store is used to handle RDF datasets with millions of triples.
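
To show how a web developer might reach RDF data through SPARQL, the sketch below builds an HTTP GET URL that carries a query in the `query` parameter, as the SPARQL Protocol specifies. The endpoint address `http://example.org/sparql` is a placeholder assumption and not the project's endpoint, and no network request is made.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; not taken from the slides.
ENDPOINT = "http://example.org/sparql"

# A generic query that lists ten arbitrary triples from the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint, query):
    """Return a GET URL that sends the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, which is what lets ordinary web tooling consume the data without an RDF library.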

Page 21: Data Wiki:  A Semantic Web Approach to Government Data

Conclusion

Semantic web enabled portal for linked government data

• 5 billion triples from data.gov

• hosts apps, demos & services

• provides education services

• integrates web users’ contributions