Collaborative Data Management
Computer Science Department
Stanford University
2
Structured Data
Sources
• Company Directories
• Product Catalogs
• Inventories
• Airline Schedules
• Weather Reports
• Patient Records
• Drug Studies
Formats
• Relational Database
• Files (tab-delimited text, XML/RDF)
• Application Programs with data interfaces
3
Relationships Among Sources
Replicated data
• Cached data
• Materialized views (as in data warehouses)
Heterogeneity (different schemas or vocabularies)
• Values in euro vs. values in dollars
• French instead of English
• Different numbers of tables or attributes
Real World Constraints
• Physical laws
• Governmental laws
• Business rules
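As a rough illustration of the heterogeneity bullets above, the following Python sketch maps two sources with different vocabularies and currencies onto one shared schema. The field names, the sample rows, and the exchange rate are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass

# Assumed, illustrative exchange rate; a real system would look this up.
EUR_TO_USD = 1.10


@dataclass
class Product:
    """Shared target schema (hypothetical): English names, prices in dollars."""
    sku: str
    name: str
    price_usd: float


def from_us_source(row: dict) -> Product:
    """Source A already uses English field names and dollar prices."""
    return Product(sku=row["sku"], name=row["name"], price_usd=float(row["price"]))


def from_french_source(row: dict) -> Product:
    """Source B uses French field names and euro prices."""
    return Product(
        sku=row["reference"],
        name=row["nom"],
        price_usd=round(float(row["prix_eur"]) * EUR_TO_USD, 2),
    )


# Two records describing the same product in different vocabularies.
us_row = {"sku": "A-100", "name": "Widget", "price": "20.00"}
fr_row = {"reference": "A-100", "nom": "Widget", "prix_eur": "20.00"}

products = [from_us_source(us_row), from_french_source(fr_row)]
```

Each source gets its own small adapter, so adding a source with yet another schema means writing one more function rather than changing the shared schema.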
4
Data Integration
[Diagram labels: Single Query, Data Broker, Single Answer. Sources: Manufacturer 1, Manufacturer 2, Marketplace Data, Product Analysis, Satisfaction Ratings, Supplier 1, Supplier 2, Supplier 3, Supplier 4]
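One possible reading of this diagram is a broker that forwards a single query to every source and merges the results into a single answer. The sketch below assumes that reading. The `DataBroker` class, the `make_source` helper, the sample rows, and the "first value wins" merge policy are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Source = Callable[[str], List[Record]]


def make_source(name: str, rows: List[Record]) -> Source:
    """Wrap a list of rows as a queryable source (hypothetical helper)."""
    def query(product_id: str) -> List[Record]:
        return [dict(r, source=name) for r in rows if r["product_id"] == product_id]
    return query


class DataBroker:
    """Forwards one query to every source and merges the rows into one answer."""

    def __init__(self, sources: List[Source]) -> None:
        self.sources = sources

    def ask(self, product_id: str) -> Record:
        answer: Record = {"product_id": product_id}
        contributing: List[str] = []
        for source in self.sources:
            for row in source(product_id):
                contributing.append(str(row["source"]))
                for key, value in row.items():
                    if key not in ("product_id", "source"):
                        # Assumed conflict policy: the first source to supply a field wins.
                        answer.setdefault(key, value)
        answer["sources"] = contributing
        return answer


manufacturer_1 = make_source("Manufacturer 1", [{"product_id": "P-1", "name": "Widget"}])
supplier_1 = make_source("Supplier 1", [{"product_id": "P-1", "in_stock": 12}])
ratings = make_source(
    "Satisfaction Ratings",
    [{"product_id": "P-1", "rating": 4.5}, {"product_id": "P-2", "rating": 3.0}],
)

broker = DataBroker([manufacturer_1, supplier_1, ratings])
result = broker.ask("P-1")
```

The caller issues one query and never needs to know which source holds which field.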
5
Collaborative Data Management
When sources are independent, they can be updated independently. In the face of interrelationships among sources, individuals performing updates must collaborate (explicitly or implicitly) to ensure correct updates.
Collaborative Data Management must replace independent data management.
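To make the point about interrelated sources concrete, here is a minimal Python sketch in which an update to a base source is checked against a business rule and then propagated to a dependent, materialized total. The data, the rule (stock may not go negative), and all names are assumptions chosen for illustration.

```python
class ConstraintViolation(Exception):
    """Raised when an update would break a rule shared across sources."""


# Base source: stock per (warehouse, product).
inventory = {("W1", "P-1"): 5, ("W2", "P-1"): 7}
# Dependent source: a materialized total per product, derived from inventory.
totals = {"P-1": 12}


def refresh_total(product: str) -> None:
    """Recompute the materialized total from the base source."""
    totals[product] = sum(qty for (_, p), qty in inventory.items() if p == product)


def update_stock(warehouse: str, product: str, delta: int) -> None:
    """Apply an update only if the rule holds, then refresh the dependent data."""
    key = (warehouse, product)
    new_qty = inventory.get(key, 0) + delta
    if new_qty < 0:  # assumed business rule: stock may not go negative
        raise ConstraintViolation(f"{key} would drop to {new_qty}")
    inventory[key] = new_qty
    refresh_total(product)


update_stock("W1", "P-1", -3)  # inventory and totals change together
```

If the inventory were edited directly, the total would silently go stale. Routing every change through `update_stock` is the implicit form of collaboration between the two sources.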
6
Setting
Closed Information Systems
• Single CIO
• Participants with different views of data
Open Information Systems
• Composite enterprises
• Consortia / virtual enterprises
• Supply chains
• Scientific communities
• World Wide Web
7
Examples
Corporate Logistics - Enterprise Resource Directories
• Personnel, locations, organizations, equipment, orders
Electronic Commerce - Integrated Product Catalogs
• Catalogs, inventories, product ratings, contracts
Health Care - Consolidated Patient Records
• Doctors, nurses, lab technicians, administrators, patients
Multidisciplinary Engineering - Concurrent Engineering
• Architects, engineers, construction planners
Command and Control - Situation Assessment
• Commanders, intelligence, field officers, consultants
8
Collaborative Data Management Initiative (CDMI)
Collaboration
• Stanford Computer Science Department
• Companies
• Governmental Agencies (e.g. SEC)
Funding
• NSF
• Corporate Contributions
• Corporate Contracts (“ears”)
9
Research and Development
Research
• Data Integration
• Collaborative Logical Spreadsheets / Websheets
• Differential Logic
• Paraconsistent Logic
• Dynamic Constraints
• Workflow Management
Development
• Web Standards
• Web-Based Data Brokers
• Web-Based Workflow Managers
10
Proposed Applications
Product Data Management (much past work)
• Structured Data about products
• Correct on Capture
• Mfrs, Retailers/Customers, Trade Associations
• Lifecycle of products
Health Care Data Management (maybe, probably not)
• Electronic Patient Records
• Drug Trials
Organizational Data Management (underway)
• Orgnet - data on public organizations, public figures
• Stanford Online Proxy Reporting Initiative
Structure and Partners
11
• Seed funding to launch the CDMI comes from four major Silicon Valley companies: CISCO, HP, IBM, and SAP.
• The CDMI is an initiative of the Stanford Computer Forum (http://forum.stanford.edu), of which all of the founders are already members.
• Project seed funding is $50K per member per year, as part of supporting this Computer Forum initiative.
• The Computer Forum will provide the appropriate letters pledging this support at our kickoff meeting.
• NSF funding is based on the solicitation for “Information Integration and Informatics (III)”, which is part of the “Information and Intelligent Systems (IIS) Program”.
CDMI Pillars & Benefits
12
CDMI
• Visiting Scholar hosted by Professor Genesereth
• Office within the Stanford Logic Group
• The Stanford Computer Forum normally charges $60K p.a., but support of the CDMI entitles founding members to be charged only $25K
Projects:
• SOPRI
• Digital Department
• CODEX
• …
Activities:
• Invitations to guest lectures
• Invitations to group events
• Invitations as guest speakers
• …
• Product data management in support of electronic commerce
• Organizational data management in support of corporate governance and reporting
• Techniques and tools for support of Collaborative Data Management, i.e. integration and updating of dependent data sources with complex relations and constraints.
• CDM must replace today’s independent data management.
13
Digital Department
Goal
• Enterprise Data Management for Stanford
Components
√ Room reservations (Existing Gates system)
• Event Management (Equipment, Food, Mail lists)
• Office Assignments
• Curriculum (prerequisites, requirements, policies)
• Programs (Undergraduate, Masters, Doctoral)
Integration
• Multiple Departments
• University Databases
Examples
14
Core Research & Technology
Software
Users
CDMI Ear Project
Showcase
Evolving Project with CDMI Externals
…
15
16