AIP-2 Design Review Catalogue, Clearinghouse, Registry, Metadata (CCRM) WG Use Case Review. Josh Lieberman GEOSS AIP-2 Design Review 2 December 2008. Review Outline. Transverse use cases relevant to CCRM GEOSS Registries Components GetRecordbyID Standards Registry coordination
AIP-2 Design Review
Catalogue, Clearinghouse, Registry, Metadata (CCRM) WG Use Case Review
Josh Lieberman
GEOSS AIP-2 Design Review
2 December 2008
Review Outline
• Transverse use cases relevant to CCRM
• GEOSS Registries
  – Components
  – GetRecordbyID
  – Standards Registry coordination
• GEOSS Clearinghouse deployments (Compusult, ESRI, USGS):
  – Registry harvest
  – Community Catalog harvest
  – Other service harvest
  – Discovery interface and model
• Metadata standards and practices
  – Discovery / binding / evaluation roles
  – ISO 19115 / 19119 / 19139 profiles
  – GEOSS Common record / queryables
  – ISO Application Profile
• Next Steps
Clearinghouse and Common Infrastructure
• GEOSS Registries: Services, Components, Standards, Requirements
• GEOSS Clearinghouse: OGC CSW, ISO 23950 / SRU, OpenSearch?, ebRS?
What is infrastructure (and who cares)?
• Clearinghouse / Registry – the tracks
• Portals – the terminals
• Applications – the trains
• Users – the passengers
• Content – the baggage
• SBAs – the destinations
Without metadata, SOA itself would be impossible
[Diagram labels: Datasets; Service Instances; Community Catalogs; Provisions; Clearinghouse Harvests / Cascades; Service / Dataset Description Metadata; GetCapabilities?]
Annex B Use Case
• Disaster Scenario
  – Discover which data, products and services are available for this area of interest and this theme. Connect to the FedEO networks.
  – After identifying products of interest for [user] needs, perform a catalogue search of these products to get a quick look and more information.
  – Order the previously identified products directly from the providers.
Transverse Use Cases
1. Register organization
2. Define & publish component with services
3. Develop component metadata system
4. Register component & services
5. Search for components
6. Search community catalogs / metadata services
7. Bind client to service
8. Access services for application
9. Evaluate component quality / usability
10. Subscribe to alerts
11. Construct & publish workflow
12. Register standards & best practices
Define & Publish Component with Services
1. Precondition: Organization registration
2. Select component type to represent contribution
3. Select service types which represent component access
4. Expose service endpoints
5. Postcondition: Services expose component
Develop Component Metadata System
1. Precondition: Develop component and services
2. Choose component metalevels
3. Choose metadata standards and formats
4. Choose publishing method (e.g. community catalog)
5. Construct metadata document(s)
6. Deploy / publish metadata documents
7. Postcondition: Descriptions of components and services are available
Register component and services
1. Precondition: component and services are exposed and described adequately by metadata
2. Enter information into Registry application for component
3. Enter information into Registry applications for associated services
4. Postcondition: component and services are registered
Search for components and services (data first)
1. Precondition: appropriate components are registered in the Registry and are searchable through a Clearinghouse.
2. User opens a Clearinghouse client, e.g. Geo Portal
3. User enters one or more target values for queryable parameters and initiates search.
4. Alternative: user browses candidate components presented by the Geo Portal
5. Alternative: user issues search by way of a semantic mediator which expands / maps the search terms for finding components and services in disparate domains.
6. User adjusts search parameters to refine the search
7. User searches / browses services associated with candidate components
8. Postcondition: set of candidate components with suitable service interfaces for drill down.
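The search-and-refine steps above can be sketched as a filter over harvested summary records. This is a minimal illustration, not the Clearinghouse implementation: the record fields (`title`, `keywords`, `bbox`), the function names, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentRecord:
    """Hypothetical summary record for a registered component."""
    title: str
    keywords: set = field(default_factory=set)
    # (west, south, east, north); defaults to global coverage
    bbox: tuple = (-180.0, -90.0, 180.0, 90.0)


def bbox_intersects(a, b):
    """True when two (west, south, east, north) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, keyword=None, bbox=None):
    """Return the records that match every queryable that was supplied."""
    hits = []
    for rec in records:
        if keyword is not None:
            if keyword.lower() not in {k.lower() for k in rec.keywords}:
                continue
        if bbox is not None and not bbox_intersects(rec.bbox, bbox):
            continue
        hits.append(rec)
    return hits


# Made-up records standing in for harvested component metadata.
records = [
    ComponentRecord("Flood extent maps", {"flood", "disaster"}, (5.0, 45.0, 10.0, 50.0)),
    ComponentRecord("Air quality stations", {"air quality"}, (-125.0, 25.0, -65.0, 50.0)),
    ComponentRecord("Global flood model", {"flood", "model"}),
]

# Step 3: initial search on one queryable.
first_pass = search(records, keyword="flood")
# Step 6: refine the search by adding a spatial constraint.
refined = search(first_pass, keyword="flood", bbox=(-100.0, 30.0, -90.0, 40.0))
```

Keeping each queryable optional mirrors the use case: the user starts broad and narrows the candidate set before drilling down into services.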
Search community “catalogs”
1. Precondition: community catalog is registered in Registry and findable through Clearinghouse and/or otherwise known to user.
2. User opens client application (e.g. community portal) with capabilities to access community catalog.
3. User searches community catalog for holdings of interest
4. Alternative: user searches Clearinghouse for holdings of community catalogs which have been harvested by or federated with the Clearinghouse.
5. Alternative: user finds candidate community catalog summary records through Clearinghouse, then drills down into more detailed metadata in the originating community catalog
6. Postcondition: user has identified resources of interest, both data products, and services which make them available via the Web or other communication medium.
Evaluate component quality / usability
• Precondition: user has identified a component (e.g. dataset) relevant to their project needs and accessed it through appropriate services.
• User develops decision support workflow to derive analysis / visualization of dataset(s).
• User obtains metadata descriptions of dataset(s) sufficient to determine validity of derived result (e.g. statistical power of a decision).
• Optional: user contacts provider of dataset to obtain additional metadata elements needed for evaluation
• Optional: user iterates through workflow and evaluation to optimize decision validity.
• Postcondition: user has determined the significance of an observation-based decision.
Registry Providers
• Organization / Component / Service Registry – GMU
  – ebRIM based, but user interface is limited to O / C / S
  – Service interface is discovery only
• Standard and Special Arrangements Registry – IEEE
  – Coordination with Service registry being developed
• User Requirements Registry – IEEE
  – Role is loosely defined
• Best Practices Wiki – IEEE
  – Not an authoritative register
Component Registration
• Observing System or Sensor Network
• Exchange and Dissemination System
• Modeling and Data Processing Center
• Data set or Database
• Catalog, Registry, Metadata Collection
• Portal or website
• Software or application
• Computational model
• Initiative or Programme
• Information feed, RSS, or alert
• Training or educational resources
• Web-accessible document, file, or graphic
• Other, enter information in box:
Registry Status
• Component Types
• GetRecordbyID
• Standards Registry coordination
• Content cleaning, testing, status
• Work on attributes for harvest-able metadata and other links
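For reference, a record lookup of the kind listed above can be expressed as an OGC CSW 2.0.2 `GetRecordById` request in key-value (HTTP GET) encoding. The sketch below only builds the request URL. The endpoint and the record identifier are placeholders, not real GEOSS addresses, and the helper name is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def get_record_by_id_url(endpoint, record_id, element_set="full"):
    """Build a CSW 2.0.2 GetRecordById request URL (KVP encoding)."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecordById",
        "id": record_id,
        "elementSetName": element_set,  # brief, summary, or full
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and identifier, for illustration only.
url = get_record_by_id_url("https://example.org/csw", "urn:example:component:42")

# Decode the query string again to check what would be sent.
query = parse_qs(urlsplit(url).query)
```

A registry that answers this request lets a Clearinghouse re-fetch a single record by its identifier instead of re-harvesting the whole registry.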
Resource Discovery / Summary Needs
• Datasets
  – Data type / feature type
  – Observable(s)
  – Coverage in space and time
  – Origin / authority
  – Quality / usage
• Services
  – Service type
  – Accessed content / data
  – Functionality / operations / options
  – Bindings
  – Quality / availability
• Catalogs
  – Record types
  – Holdings / collections
  – Supported interfaces
  – Queryable properties
  – Response types / formats
  – Tags / categories / relations
• Portals / applications
  – Functionality
  – Client interfaces
  – Supported workflow
  – Intended users
  – Technology platform
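One way to make the dataset summary needs concrete is a small record type plus a completeness check. This is only a sketch: the class, field names, and sample values are hypothetical and do not come from any GEOSS schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DatasetSummary:
    """Hypothetical summary record mirroring the dataset needs listed above."""
    data_type: Optional[str] = None    # data type / feature type
    observables: Optional[str] = None  # observable(s)
    coverage: Optional[str] = None     # coverage in space and time
    origin: Optional[str] = None       # origin / authority
    quality: Optional[str] = None      # quality / usage


def missing_fields(summary):
    """List the summary elements that a provider has not filled in."""
    return [f.name for f in fields(summary)
            if getattr(summary, f.name) in (None, "")]


# Made-up values: a record with only two of the five elements supplied.
partial = DatasetSummary(data_type="grid coverage", coverage="global, 2000-2008")
gaps = missing_fields(partial)
```

A check like this could feed a registry status report, flagging records that are registered but too sparse to be discovered or evaluated.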
Resource Description Relationships
[Diagram labels: Dataset Description; Service Description; Collection Description; Product Description; Catalog Description; Application Description; Workflow Description; Derivative Description; Provision; Operation; Operates on; Provided by]
Service – data model (Cat 2.0.2 ISO Profile)
• Includes extensions to 19119.
• Not ingest-able from OGC Capabilities without constrained MetadataURL provision
• Related to but not the same as Inspire metadata profile
• Question whether this is sufficient / needed for discovery
Community Catalog / Component Providers
• JAXA EO Catalog
• CNES / Erdas Catalog
• FedEO Community Clearinghouse
• ICAN Coastal Atlas / Mediator
• WUSTL / ESIP AQ Catalog
• NOAA SNAAP
• IP3 Mediator
• NOAA WAF
Clearinghouses
• Clearinghouse Status – Archie (FGDC)
  – "Simple Clearinghouse" harvests component and service records from the registry. Working on the ingest procedure from e.g. Z39.50 community catalogs, then moving on to CS/W catalogs and FGDC records from Landsat.
  – Lots of non-functioning endpoints still in the service registry, but they are slowly being cleaned out and/or set up for testing.
  – Not yet harvesting registered services except for catalogs and the service registry, but have experimented with some WMS instances.
  – Doesn't yet implement a CS/W Discovery interface, but working on it.
• Clearinghouse Status – Marten (ESRI)
  – Still some issues with harvesting the service registry: complaints about CS/W capabilities not being valid. Need to sync up with Yuqi and Archie to resolve this.
  – Able to harvest Ted Habermann's WAF records. Some metadata validity issues worked through, but how lenient should the ingest be? At what point does lenience interfere with "findability"? (Josh)
  – Also harvested the Renewable Energy registered services and Biodiversity site.
  – GeoGratis Z39.50 interface is still problematic (e.g. AVHRR imagery).
  – Last week at GEO meeting there were some folks from CEOS (EROS) with lots of medium-resolution imagery which is only searchable through Globus.
• Clearinghouse Status – Robert (Compusult)
  – Ingestion from Registry being run every few days. No notification yet if harvesting fails.
Clearinghouse Distributed search vs. Harvest
• Harvest advantage: quick searches. Disadvantage: metadata duplication and scale of processing for large catalogs / archives
• Distributed search advantage: metadata is maintained closer to source. Disadvantage: searching takes longer to complete and has more chances of not completing.
• Recommend Harvest when possible
  – Harvest only collection metadata at appropriate scale
  – Policy of community catalogue must be respected
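The trade-off above can be illustrated with a toy model in which catalogs are in-memory lists rather than remote services. All class and function names and the sample records are hypothetical. The point is only to show where each approach touches the remote catalog.

```python
class CommunityCatalog:
    """Toy stand-in for a remote community catalog (hypothetical)."""

    def __init__(self, name, records):
        self.name = name
        self.records = list(records)
        self.available = True

    def _check(self):
        if not self.available:
            raise ConnectionError(f"{self.name} did not respond")

    def export_all(self):
        """Hand over every record, as a harvest would request."""
        self._check()
        return list(self.records)

    def search(self, term):
        """Answer a single query, as a distributed search would request."""
        self._check()
        return [r for r in self.records if term.lower() in r.lower()]


def harvest(catalogs):
    """Copy records into a local index (duplicates the metadata)."""
    return [(cat.name, rec) for cat in catalogs for rec in cat.export_all()]


def search_harvested(index, term):
    """Query the local copy only; no remote calls at search time."""
    return [rec for _, rec in index if term.lower() in rec.lower()]


def search_distributed(catalogs, term):
    """Query each catalog at search time; any of them may fail."""
    hits, failed = [], []
    for cat in catalogs:
        try:
            hits.extend(cat.search(term))
        except ConnectionError:
            failed.append(cat.name)
    return hits, failed


a = CommunityCatalog("A", ["Flood maps 2008", "Land cover"])
b = CommunityCatalog("B", ["Flood gauge readings"])

index = harvest([a, b])   # done ahead of time, while both catalogs respond
b.available = False       # later, catalog B goes offline

local_hits = search_harvested(index, "flood")
remote_hits, failed = search_distributed([a, b], "flood")
```

The harvested index still returns catalog B's record while B is offline, at the cost of holding a copy that may go stale. The distributed search reflects the current state of each source but returns an incomplete result.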
Integration Issues
• Catalogues registered with GEOSS have a wide variety of standardization. Protocols include:
  – ISO 23950 (Z39.50) “GEO” Profile Version 2.2
    • FGDC (CSDGM Metadata)
    • ANZLIC Metadata
    • ISO 19115 Metadata
  – OGC Catalogue Service for the Web (Version 2.0.1 and 2.0.2)
    • ebRIM Profile (incl ISO and EO Extension Packages)
    • FGDC Profile
    • ISO 19115 Profile
  – SRU/SRW / OpenSearch
  – OAI-Protocol for Metadata Harvesting (OAI-PMH)
  – Dublin/Darwin Core Metadata
  – Web-accessible folder/ftp?
Mediation Issues - TBD
• Where should mediation occur and when?
• What and how many controlled vocabularies, taxonomies, ontologies?
  – Organizations, SBAs, CoPs
  – Top-down <-> Bottom-up <-> Free-for-all
• How to manage and leverage mappings?
• Is there a role for knowledgebase inference?
Workplan Elements for Catalogue / Clearinghouse Thread
• Persistence, completeness, findability
• More resources and resource types, e.g. applications, workflows
• Minimum interoperability measures, e.g. geoss:Record
• Best practices for federated harvest and query
• User requirements refinement and added registry / clearinghouse value
• Controlled vocabularies, mediation resources, cross-community enablement
• On-going role for search and discovery in scenarios and decision support applications
• Facilitation of usable OpenSearch / GeoSearch entry points to the Clearinghouse
• Role for publish-subscribe-notify interaction style in Clearinghouse
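As an illustration of what an OpenSearch entry point involves, the sketch below fills an OpenSearch-style URL template. The `{searchTerms}` and `{startPage?}` parameters follow OpenSearch 1.1 template syntax and `{geo:box?}` follows the OpenSearch Geo extension naming. The template URL itself and the helper function are hypothetical, not an existing Clearinghouse endpoint.

```python
import re
from urllib.parse import quote

# Hypothetical template of the kind an OpenSearch description document advertises.
TEMPLATE = "https://example.org/search?q={searchTerms}&bbox={geo:box?}&page={startPage?}"


def fill_template(template, values):
    """Substitute template parameters; optional ones ('?') default to empty."""
    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


# Made-up search: free text plus a bounding box (west,south,east,north).
url = fill_template(TEMPLATE, {"searchTerms": "flood extent", "geo:box": "5,45,10,50"})
```

Because the client only needs the template and a handful of parameter names, this style of entry point is much lighter than a full CS/W client, which is part of its appeal for portals and simple applications.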
Other CCRM Issues
• Telecon Notes 26 November