Dr. Eric Boom, Ir. Hans de Wolf
A Virtual Data Portal for CcSP Projects (project COM1)
September 13, 2007
Contents of this Presentation
• Purpose of COM-1
• What is a ‘Virtual Data Centre’ ?
• How Does It Work ?
• Status of the Project
• Conclusion
Purpose of the COM-1 Project
• The COM-1 project develops a ‘Virtual Data Centre’
• All CcSP projects can use it (non-exclusively) to disseminate their data products
• It provides the users of data products with a ‘one-stop shopping’ web portal to obtain data products
What is a Virtual Data Centre ?
• A Virtual Data Centre is a system that provides access to geographically distributed data repositories, through Grid and Web services
What Does It Do ?
• The Virtual Data Centre helps users to find data products
  – A data product is a deliverable collection of packaged data with associated metadata
  – Metadata is “data about data” that describes the content, quality, condition, and other characteristics of the data
• The Virtual Data Centre helps users to obtain data products
  – It helps projects to disseminate their data products
Finding Data Products …
• Finding Data Products consists of two steps:
  – Find the correct type of data product
    • Question: “I am looking for historical precipitation data”
    • Answer: “Look in the CS8 collection from KNMI. It covers the whole of the Netherlands between 1850-1950. Or look in …”
  – Find the correct instance of data products
    • Question: “I want the daily precipitation data for the period of 1900 to 1960 at Utrecht from the CS8 collection”
    • Answer: “There are 6 data products in tab-delimited text format, each covering a period of 10 years. Here are the download links …”
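The two-step search can be illustrated with a minimal Python sketch. It is not the COM-1 implementation: the class names, field names, the `EX1` collection, and the `portal.example` URLs are hypothetical, and the catalog contents are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductType:
    """One entry in the index: describes a type (collection) of data product."""
    collection: str
    provider: str
    keywords: frozenset


@dataclass(frozen=True)
class ProductInstance:
    """One entry in a catalog: describes a single downloadable data product."""
    collection: str
    station: str
    start_year: int
    end_year: int  # inclusive
    url: str


def find_types(index, keyword):
    """Step 1: search the index for product types tagged with a keyword."""
    return [t for t in index if keyword.lower() in t.keywords]


def find_instances(catalog, station, first_year, last_year):
    """Step 2: search one catalog for instances that overlap the period."""
    return [
        p for p in catalog
        if p.station == station
        and p.start_year <= last_year
        and p.end_year >= first_year
    ]


# Hypothetical index: one entry per data product type.
INDEX = [
    ProductType("CS8", "KNMI", frozenset({"precipitation", "historical"})),
    ProductType("EX1", "Example provider", frozenset({"temperature"})),
]

# Hypothetical catalogs: one catalog per type, one entry per 10-year file.
CATALOGS = {
    "CS8": [
        ProductInstance("CS8", "Utrecht", year, year + 9,
                        f"https://portal.example/cs8/utrecht-{year}.txt")
        for year in range(1850, 1950, 10)
    ],
}
```

A user would first call `find_types(INDEX, "precipitation")` to learn which collection to look in, and then call `find_instances(CATALOGS["CS8"], "Utrecht", 1900, 1929)` to get the matching download links.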
About Metadata
• Finding types of data products searches through the metadata in the index
• Finding instances of data products searches through the metadata in the catalogs (one catalog for each type)
• COM-1 metadata is based on standards
  – Dublin Core
  – ISO 19115
  – Dutch Metadata Standard for Geography Core Set
  – Allows future integration with Geographical Information Systems (GIS)
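As an illustration of standards-based metadata, the sketch below checks a record against the fifteen element names of the Dublin Core Metadata Element Set. The set of required elements and the sample record are assumptions made for this example; the slides do not say which elements COM-1 actually requires.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})

# Assumption for this sketch only: elements a record must always carry.
REQUIRED_ELEMENTS = frozenset({"title", "creator", "format", "coverage"})


def check_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in sorted(set(record) - DUBLIN_CORE_ELEMENTS):
        problems.append(f"unknown element: {name}")
    for name in sorted(REQUIRED_ELEMENTS - set(record)):
        problems.append(f"missing element: {name}")
    return problems


# Hypothetical record describing one data product.
record = {
    "title": "Daily precipitation, Utrecht, 1900-1909",
    "creator": "KNMI",
    "format": "text/tab-separated-values",
    "coverage": "Utrecht, the Netherlands; 1900-1909",
}
```

Keeping to standard element names is what would let a catalog built this way be read by other tools, such as a GIS, without a custom mapping.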
How Does It Work (1)
• To obtain data products, the user accesses the COM-1 portal on the World-Wide Web
[Figure: Web Portal]
How Does It Work (2)
• The portal does not have the data products
• The data products are located at the repositories of the projects.
[Figure labels: Measurements and/or Models; Data Product Creation; Data Product; Data Product Repository. Caption: How to connect the project to the portal ?]
How Does It Work (3)
• The conceptual solution is simple:
• We ‘just’ need a server at each project
  – The portal can ask the project’s servers if there are new data products
  – The user and/or portal can download data products from the project’s servers
• All is standard ICT technology …
[Figure labels: Add a Server; Server; Big Ugly Firewall]
How Does It Work (4)
• Reality strikes back !
• The conceptually simple solution (‘just a server’) turns into a major complication
  – Installing a server requires equipment
  – Installing a server requires expertise
  – Installing a server requires network configuration (firewall settings)
  – Operating a server requires effort
    • Status Monitoring
    • User administration
• COM-1 provides a solution …
[Figure labels: Project Administrator; Portal Administrator; Data Product Specification; Project Creates Data Products; Project Creates Announcements; Portal Receives Announcements; Check and Update Catalog; Check & Store in Index; User Queries Web Portal; Find Matching Data Products]
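The announcement flow in this diagram (the project creates announcements, the portal receives them, then checks and updates the catalog) might look roughly like the Python sketch below. The tab-delimited layout, the column names, and the `SPECIFICATION` table are assumptions for illustration; the slides do not describe the actual announcement format.

```python
import csv
import io

# Hypothetical data product specification: the fields an announcement
# must fill in for each collection before it is accepted.
SPECIFICATION = {
    "CS8": ("station", "start_year", "end_year", "file"),
}

# Hypothetical announcement text as a project might produce it.
ANNOUNCEMENT = (
    "collection\tstation\tstart_year\tend_year\tfile\n"
    "CS8\tUtrecht\t1900\t1909\tutrecht-1900.txt\n"
    "CS8\tUtrecht\t1910\t\tutrecht-1910.txt\n"   # end_year missing
    "XX9\tDelft\t1900\t1909\tdelft.txt\n"        # unknown collection
)


def receive_announcements(text, catalogs):
    """Check each announced product and add valid ones to the catalogs.

    Returns (accepted, rejected) counts. A product that is announced
    again counts as accepted but is not stored a second time.
    """
    accepted = rejected = 0
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        required = SPECIFICATION.get(row.get("collection"))
        if required is None or any(not row.get(field) for field in required):
            rejected += 1
            continue
        entry = {field: row[field] for field in required}
        catalog = catalogs.setdefault(row["collection"], [])
        if entry not in catalog:
            catalog.append(entry)
        accepted += 1
    return accepted, rejected
```

Calling `receive_announcements(ANNOUNCEMENT, catalogs)` with an empty `catalogs` dictionary stores the one valid row under `"CS8"` and rejects the other two.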
How Does It Work (5)
• The user has received a list of data products that match the criteria specified.
• Next step: obtain the data products
[Figure labels: User Selects Data Product(s); Order Queue; Agent Asks For Jobs; Agent Retrieves Order; Retrieve Data Product From Repository; Place Data Product at the Portal in parking area; User Downloads Data Product(s)]
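The retrieval flow in this diagram can be sketched in Python as a pull model: the agent at the project asks the portal for jobs, so the portal never has to open a connection into the project's network. That reading of "Agent Asks For Jobs" is an assumption, and the class and method names below are hypothetical; a real agent would talk to the portal over the network, not through in-memory objects.

```python
from collections import deque


class Portal:
    """Holds the order queue and the parking area for retrieved products."""

    def __init__(self):
        self.order_queue = deque()
        self.parking_area = {}

    def place_order(self, product_id):
        """The user selects a data product; the order waits in the queue."""
        self.order_queue.append(product_id)

    def next_job(self):
        """Called by an agent; returns the oldest order, or None if idle."""
        return self.order_queue.popleft() if self.order_queue else None

    def park(self, product_id, data):
        """The agent places a retrieved product in the parking area."""
        self.parking_area[product_id] = data

    def download(self, product_id):
        """The user downloads a parked product; None if not yet available."""
        return self.parking_area.get(product_id)


class Agent:
    """Runs at the project and has local access to its repository."""

    def __init__(self, portal, repository):
        self.portal = portal
        self.repository = repository  # here: a dict of product_id -> bytes

    def poll_once(self):
        """Ask the portal for one job; return True if an order was handled."""
        product_id = self.portal.next_job()
        if product_id is None:
            return False
        data = self.repository.get(product_id)
        if data is not None:
            self.portal.park(product_id, data)
        return True
```

In use, the agent would call `poll_once()` on a schedule; after a user's `place_order(...)`, the next poll moves the product into the parking area, where `download(...)` can return it.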
Minimal Project Effort
• The Virtual Data Centre concept minimizes the effort required to disseminate data (for the science projects):
  – A computer with Java VM runtime
  – Internet connection (as web browser)
  – Define Data Product properties (once)
  – Install COM-1 software (once)
  – Create project specific delivery script (“copy…”)
  – Create text files announcing new data products (with production process, MS Excel, …)
  – Run COM-1 software (ad-hoc or automatically)
No Network Gurus Needed
Status of the Project
• Prototype is running
  – Projects can feed information to portal
  – Users can search in index for data product types
  – Users can search in catalog for data product instances
  – Portal can retrieve data products from projects
  – User can download from the portal
• But …
Status of the Project (cont.)
• Work is not yet finished
  – Access rules are present, but there is no ‘nice’ way to define them
  – Performance, Scalability, Security
  – Logging & Reporting
  – Ergonomics and cosmetic issues
  – Not production ready: bugs, robustness, management, traceability
  – Hosting of operational system
  – User feedback
  – Additional functionality
Additional Functionality
• The software provides a framework that allows additional functionality
  – Access models (in addition to passive data)
  – Stored queries
  – Notification
  – Orchestration
• Standard scenarios
  – Designed by experts
  – Parameterized
• Custom (user-defined) scenarios
Conclusion
• Concept has been proven
• Technology allows additional functions
• COM-1 project has constructed the building, it is up to the science projects to put in the furniture
• User Feedback is welcome / needed
  – Hans de Wolf ([email protected])
  – Eric Boom ([email protected])