An Introduction to RDF O’Reilly Enterprise Java Conference, 2001
Facilities to put machine-understandable data on the Web are becoming a high priority for many communities. The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people. For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently. The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.
History
What is the Web, Really?
• Millions upon millions of computers all using the same communications protocol
TCP/IP
HTTP
HTML
HTML
<B><I><FONT FACE="Tahoma" SIZE=2><P ALIGN="CENTER">DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE</P>
<P ALIGN="CENTER">SAN JOSE STATE UNIVERSITY</P></I>
<P ALIGN="CENTER">SPRING 2000 COLLOQUIUM SERIES, PART II</P></FONT>
<I><FONT SIZE=2><P ALIGN="CENTER">Each talk with be on a Thursday at 3:00 p.m. in MacQuarrie Hall 523 </P>
<P ALIGN="CENTER">Please join us for refreshments beforehand, at 2:30 p.m., in MacQuarrie Hall 210</P></I>
<P ALIGN="CENTER">Parking available in the Seventh Street Garage at South Seventh and San Salvador Streets, San Jose, CA</P></FONT>
<I><FONT FACE="Tahoma" SIZE=2></I></FONT>
<FONT SIZE=2><P>April 6		Zvezdelina Stankova-Frenkel, Mathematics, Mills College</P>
<I><P>From Desargues to Modern Algebraic Geometry</P></B></I></FONT>
<FONT FACE="Arial" SIZE=2><P>We will look at some classical plane geometry . . . mathematics. </P>
The Evolution of Web Technology
• HTML 1.0 became 2.0 became ... 4.0
• Cascading style sheets and other formatting and layout standards defined by W3C
• Proprietary technologies such as Shockwave and PDF invented
The Implicit Assumptions
• Point to point (direct) communication
• The primary task of a web server is to deliver information to a human who is asking for that information
  – Key points: to a human, already asking for information
The First Business Opportunity
• “The Web is like mail-order”
• Put Catalogs on the web
  – Easy to update
  – Easy to link in auxiliary information
• “People who bought that also bought …”
• Availability information
• In many cases, simply putting an “HTML front end” on existing systems
Leads to Another Opportunity
• Catalogs prime the pump
  – Easy to understand application that is compelling
  – Side-effect: lots of information is now available on the internet
• How do we take advantage of it?
  – Automate existing processes
  – Enable new applications
HTML is a Problem
• It’s a markup language based on document structure
  – Most tags are visual, about presentation
  – HTML solves document-level navigation problems, for humans
  – Lots of information encoded in images
• Fundamentally, the wrong idea.
eXtensible Markup Language (XML)
• Basically, a language for defining markup languages
• Key idea: separate data from presentation information
• Replace HTML with two things
  • A domain specific markup language (defined in XML)
  • A map from that markup language to HTML (defined using XSL)
Split Data
<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
    <DIVISION>
      <DIVISION_NAME>East</DIVISION_NAME>
      <TEAM>
        <TEAM_CITY>Atlanta</TEAM_CITY>
        <TEAM_NAME>Braves</TEAM_NAME>
        <PLAYER>
          <SURNAME>Malloy</SURNAME>
          <GIVEN_NAME>Marty</GIVEN_NAME>
          <POSITION>Second Base</POSITION>
          <GAMES>11</GAMES>
          <GAMES_STARTED>8</GAMES_STARTED>
          <AT_BATS>28</AT_BATS>
          <RUNS>3</RUNS>
          <HITS>5</HITS>
          <DOUBLES>1</DOUBLES>
          .....
Meaning!
From: The XML Bible by Harold
From Presentation
<HTML xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <HEAD>
    <TITLE>
      <xsl:for-each select="SEASON">
        <xsl:value-of select="YEAR"/>
      </xsl:for-each>
      Major League Baseball Statistics
    </TITLE>
  </HEAD>
  <BODY>
    <xsl:for-each select="SEASON">
      <H1 ALIGN="CENTER">
        <xsl:value-of select="YEAR"/>
        Major League Baseball Statistics
      </H1>
Formatting!
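The same split can be sketched in a few lines of Python. This is only an illustration of the idea, not XSL: the standard library has no XSLT engine, so a hypothetical `render_html` function stands in for the XSL map, and `SEASON_XML` is a shortened, closed-off stand-in for the truncated baseball sample.

```python
import xml.etree.ElementTree as ET
from html import escape

# Data only: a shortened, well-formed stand-in for the truncated sample.
SEASON_XML = """
<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
    <DIVISION>
      <DIVISION_NAME>East</DIVISION_NAME>
      <TEAM>
        <TEAM_CITY>Atlanta</TEAM_CITY>
        <TEAM_NAME>Braves</TEAM_NAME>
      </TEAM>
    </DIVISION>
  </LEAGUE>
</SEASON>
"""


def render_html(season):
    """Presentation only: map the data vocabulary to HTML (hypothetical helper)."""
    year = escape(season.findtext("YEAR", default=""))
    title = f"{year} Major League Baseball Statistics"
    items = ""
    for team in season.iter("TEAM"):
        city = escape(team.findtext("TEAM_CITY", default=""))
        name = escape(team.findtext("TEAM_NAME", default=""))
        items += f"<LI>{city} {name}</LI>"
    return (
        f"<HTML><HEAD><TITLE>{title}</TITLE></HEAD>"
        f'<BODY><H1 ALIGN="CENTER">{title}</H1><UL>{items}</UL></BODY></HTML>'
    )


season = ET.fromstring(SEASON_XML)
page = render_html(season)
print(page)
```

The data document never mentions fonts or alignment, and the rendering function never hard-codes a team name; either side can change without touching the other.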
What is the Web, XML Version
• HTML is a tag language, defined using XML
  – One of many tag languages (and the likely target for XSL transformations)
TCP/IP
HTTP
XML
XHTML | Special Purpose Tag Languages
XML Has Lots of Problems
• Everything bottoms out in strings
• DTDs provide simple structure at the level of “documents”
  – Very simple inter-document structure
  – No provisions for intra-document structure
• No support for versioning
The VISA DTD
<!ELEMENT Invoice (InvoiceHeader, InvoiceDetails+, InvoiceSummary)>
<!ATTLIST Invoice sectorUsageVersion CDATA #IMPLIED >
<!ELEMENT InvoiceHeader (InvoiceType, InvoiceStatus, TaxTreatment, DiscountTreatment?, InvoiceTreatment, InvoiceNumber, InvoiceDate, TaxPointDate?, Currency, Party, Party, Party*, Payment?, PONum?, DeliveryNoteNum?, Ref*, Date*, GenText*)>
<!ELEMENT InvoiceType EMPTY>
<!ATTLIST InvoiceType stdValue (380|381) "380"
                      stdName (UNTDID:1001) "UNTDID:1001">
<!-- 380 = Invoice
     381 = Credit Note -->
<!ELEMENT InvoiceStatus EMPTY>
<!ATTLIST InvoiceStatus stdValue (9|10|53) "9"
                        stdName (UNTDID:1225) "UNTDID:1225">
<!-- 9 = Original, 10 = Copy, 53 = Test -->
<!ELEMENT TaxTreatment EMPTY>
<!ATTLIST TaxTreatment stdValue (NIL|GIL|NLL|GLL|NON) "NLL"
                       stdName (VISA:TAXT) "VISA:TAXT">
<!-- NIL = Line item net amounts, invoice level tax
     GIL = Line item gross amounts, invoice level tax
     NLL = Line item net amounts, line level tax
     GLL = Line item gross amounts, line level tax
     NON = Tax does not apply to this invoice -->
<!ELEMENT DiscountTreatment EMPTY>
<!ATTLIST DiscountTreatment stdValue (UN|UG|TN) "UG"
                            stdName (VISA:DSCT) "VISA:DSCT">
<!-- UN = Line item unit price, net of discount
     UG = Line item unit price, gross of discount
     TN = Line item sub-total, net of discount
     TG = Line item sub-total, gross of discount. -->
<!ELEMENT InvoiceTreatment EMPTY>
<!ATTLIST InvoiceTreatment stdValue (P|EP|E) "P"
                           stdName (VISA:INVT) "VISA:INVT">
<!-- P = Invoice printed and given to purchaser, and then used for tax reclaim
     S = Printed, but printed invoice treated as supplemental invoice since electronic copy used for tax reclaim
     E = Printed invoice suppressed since electronic master version used for tax reclaim -->
It Gets Worse
<!ATTLIST InvoiceTreatment stdValue (P|EP|E) "P"stdName (VISA:INVT) "VISA:INVT">
<!-- P = Invoice printed and given to purchaser, and then used for tax reclaim S = Printed, but printed invoice treated as supplemental invoice since electronic copy used for tax reclaim E = Printed invoice suppressed since electronic master version used for tax reclaim -->
<!ELEMENT InvoiceNumber (#PCDATA)><!-- String, 1..35 characters -->
<!ELEMENT InvoiceDate (#PCDATA)><!-- String, 1..19 Character DateTime (CCYY-MM-DDTHH:MM:SS) -->
<!ELEMENT TaxPointDate (#PCDATA)><!-- String, 1..19 Character DateTime (CCYY-MM-DDTHH:MM:SS) -->
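Because the DTD only says `#PCDATA`, the rules in those comments have to be re-implemented by hand in every program that reads an invoice. A minimal Python sketch of that hand-written checking follows. The function names are hypothetical, and the sketch assumes only the full CCYY-MM-DDTHH:MM:SS form is acceptable, which the comment's "1..19" leaves ambiguous.

```python
from datetime import datetime

# Constraints copied from the DTD comments; the DTD itself cannot enforce them.
INVOICE_NUMBER_MAX = 35
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"  # CCYY-MM-DDTHH:MM:SS


def parse_invoice_number(text):
    """Accept a string of 1..35 characters, as the comment requires."""
    if not 1 <= len(text) <= INVOICE_NUMBER_MAX:
        raise ValueError(f"InvoiceNumber must be 1..{INVOICE_NUMBER_MAX} characters")
    return text


def parse_invoice_date(text):
    """Accept a 19-character CCYY-MM-DDTHH:MM:SS value and return a datetime.

    Assumption: shorter forms are rejected, although the comment says 1..19.
    """
    if len(text) != 19:
        raise ValueError("InvoiceDate must be 19 characters (CCYY-MM-DDTHH:MM:SS)")
    return datetime.strptime(text, DATETIME_FORMAT)


print(parse_invoice_date("2001-03-27T09:30:00"))
```

Every consumer of the DTD has to write, test, and keep in sync something like this, which is where the 182 pages of prose end up.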
The Accompanying Prose
• The DTD is 4 pages
• The manual is 182 pages
The aim of this Guide is to provide sufficient information about the XML Invoice Document to enable its implementation. It documents the file structure, the business usage of the elements, and all the elements and attributes in detail.
RDF
Goal: The Semantic Web
• Different sites each maintain small amounts of information
• Sites need to refer to each other’s information with full semantic integrity
  – Information is maintained by owners and referred to by other sites
  – Information should be accessible, and coherent, in very small chunks
Needed: Precision
• Need the ability to specify things like dates, times, and monetary amounts
• Compile in those VISA comments
  – The more of this we can do, the fewer programmer-hours are needed
• Ultimately, most web-based computation will not involve a browser
Needed: Granularity
• Saying things at the “page” level is too coarse-grained
• Small chunks of data necessary
  – And ability to aggregate into larger chunks important
Use Classes and Instances
• Objects are a natural way to represent information
• A web page can contain hundreds of instances, each with its own URI
• Hard part is figuring out how to do this in a way that works on the web
Start with Resources
• A resource is a thing you talk about (can reference)
• Everything is a resource
• Resources have URIs
How to say things in RDF
• Small set of canonical tags
• Use XML syntax to define vocabularies
• Information asserted via triples
  – Assertions require three things:
    • Subject: What the assertion is about (always a resource)
    • Property: A property whose value is being asserted (always a resource)
    • Object: The value of the property (either a resource or a primitive value)
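The triple model is small enough to sketch directly. The Python below is an illustration, not an RDF library: `Resource`, `Triple`, and `values` are hypothetical names, and the example resources reuse the `bill:` namespace and the MyChevy example from this deck.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """Anything that can be referenced: it has a URI."""
    uri: str


class Triple(NamedTuple):
    subject: Resource                # always a resource
    prop: Resource                   # always a resource
    obj: Union[Resource, str, int]   # a resource or a primitive value


BILL = "http://www.grosso.org/rdfexample#"
RDF_TYPE = Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

triples = [
    Triple(Resource(BILL + "MyChevy"), RDF_TYPE, Resource(BILL + "MotorVehicle")),
    Triple(Resource(BILL + "MyChevy"), Resource(BILL + "rearSeatLegRoom"), 47),
]


def values(subject, prop):
    """Return every object asserted for the given subject and property."""
    return [t.obj for t in triples if t.subject == subject and t.prop == prop]


print(values(Resource(BILL + "MyChevy"), Resource(BILL + "rearSeatLegRoom")))
```

Note that the property is itself a `Resource`, so further triples can be asserted about it.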
Important Tags
• rdf:Description
• rdfs:Class
• rdfs:Property
• rdf:type
• rdfs:subClassOf
• rdfs:domain
• rdfs:range
Defining a Class
<?xml version='1.0' encoding='ISO-8859-1'?>
<!-- Version Tue Feb 01 18:29:46 PST 2000 -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#"
         xmlns:rdfutil="http://www.w3.org/rdfutil#"
         xmlns:bill="http://www.grosso.org/rdfexample#">
  <rdf:Description rdf:ID="MotorVehicle">
    <rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Class"/>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
  </rdf:Description>
</rdf:RDF>
An Instance of Motor Vehicle
<?xml version='1.0' encoding='ISO-8859-1'?>
<!-- Version Tue Feb 01 18:29:46 PST 2000 -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#"
         xmlns:rdfutil="http://www.w3.org/rdfutil#"
         xmlns:bill="http://www.grosso.org/rdfexample#">
  <rdf:Description rdf:ID="MyChevy">
    <rdf:type resource="http://www.grosso.org/rdfexample#MotorVehicle"/>
  </rdf:Description>
</rdf:RDF>
Resources Define Tags
<rdfs:Class ID="MotorVehicle">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
</rdfs:Class>
<bill:MotorVehicle ID="MyChevy"/>
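The short form says exactly what the longer `rdf:Description` plus `rdf:type` form says. A rough Python sketch can check this by reading both forms and comparing the type triples. It is not a real RDF parser: `type_triples` is a hypothetical helper, it handles only top-level nodes, it uses the qualified `rdf:ID` and `rdf:resource` attributes throughout, and it assumes IDs resolve against the `bill:` namespace.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BILL = "http://www.grosso.org/rdfexample#"

LONG_FORM = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:bill="{BILL}">
  <rdf:Description rdf:ID="MyChevy">
    <rdf:type rdf:resource="{BILL}MotorVehicle"/>
  </rdf:Description>
</rdf:RDF>
"""

SHORT_FORM = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:bill="{BILL}">
  <bill:MotorVehicle rdf:ID="MyChevy"/>
</rdf:RDF>
"""


def type_triples(rdf_xml, base=BILL):
    """Return (subject, rdf:type, class) triples for each top-level node."""
    found = set()
    for node in ET.fromstring(rdf_xml):
        # Assumption: rdf:ID values resolve against `base`.
        subject = base + node.get(f"{{{RDF}}}ID")
        if node.tag == f"{{{RDF}}}Description":
            for child in node.findall(f"{{{RDF}}}type"):
                found.add((subject, RDF + "type", child.get(f"{{{RDF}}}resource")))
        else:
            # A typed node: the element name itself names the class.
            namespace, local = node.tag[1:].split("}")
            found.add((subject, RDF + "type", namespace + local))
    return found


print(type_triples(LONG_FORM) == type_triples(SHORT_FORM))
```

This is the sense in which resources define tags: once `MotorVehicle` exists as a class, its URI can be used as an element name.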
Classes
• Object-oriented notion
• There are classes, arranged in a taxonomy (with subclass relationships)
• Instances can be instances of more than one class
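A small Python sketch of both points: subclass links form a taxonomy that is followed transitively, and one instance may belong to several classes. Only MotorVehicle, Resource, and MyChevy come from this deck; `Truck`, `Collectible`, and the function names are hypothetical.

```python
# Hypothetical taxonomy: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "MotorVehicle": {"Resource"},
    "Truck": {"MotorVehicle"},
    "Collectible": {"Resource"},
}

# One instance, two classes.
TYPES = {"MyChevy": {"Truck", "Collectible"}}


def superclasses(cls):
    """All classes reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance(instance, cls):
    """True if the instance has cls as a direct or inherited type."""
    direct = TYPES.get(instance, set())
    return any(cls == d or cls in superclasses(d) for d in direct)


print(sorted(superclasses("Truck")))
print(is_instance("MyChevy", "MotorVehicle"), is_instance("MyChevy", "Collectible"))
```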
Adding a Property
<rdf:Description rdf:ID="rearSeatLegRoom">
  <rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Property"/>
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/xmlschema-2/#integer"/>
</rdf:Description>
<rdfs:Property ID="rearSeatLegRoom">
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/xmlschema-2/#integer"/>
</rdfs:Property>
Setting Property Values
<rdf:Description rdf:ID="MyChevy">
  <bill:rearSeatLegRoom>47</bill:rearSeatLegRoom>
</rdf:Description>
Properties
• Similar to fields (data members, attributes...)
• Big difference: they’re first class objects
  – Defined independently of classes
  – Asserted independently of classes
    • Classes don’t come with a set of data members
    • Other people (other pages) can assert properties about your classes and instances without your knowledge or permission
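One way to picture this, as a sketch only: two independent pages each publish triples about the same resource, and a consumer merges them. The `REVIEWS` vocabulary, its `rating` property, and the `merge` helper are hypothetical; the sketch also keeps a single value per property for brevity.

```python
from collections import defaultdict

BILL = "http://www.grosso.org/rdfexample#"
REVIEWS = "http://reviews.example.org/terms#"   # hypothetical third-party vocabulary

# Statements published by the page that defines the car.
owner_page = [
    (BILL + "MyChevy", BILL + "rearSeatLegRoom", 47),
]

# Statements published elsewhere, without the owner's involvement.
reviewer_page = [
    (BILL + "MyChevy", REVIEWS + "rating", 4),
]


def merge(*pages):
    """Combine triples from several sources into one subject-indexed graph."""
    graph = defaultdict(dict)
    for page in pages:
        for subject, prop, value in page:
            graph[subject][prop] = value   # single-valued, for brevity
    return graph


graph = merge(owner_page, reviewer_page)
print(dict(graph[BILL + "MyChevy"]))
```

Nothing in the owner's data had to anticipate the `rating` property; the class has no fixed list of members for it to be missing from.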
The Web of Knowledge
Corning Fiberglass has a product catalog
Home Appliances. Defines things like “Blender”
Sears has an on-line store that uses (and extends) both of these as standard vocabularies
The Web of Knowledge
Public Opinion and Ratings Terminology
Corning Fiberglass has a product catalog
Home Appliances. Defines things like “Blender”
Sears has an on-line store that uses (and extends) both of these as standard vocabularies
Consumer Reports uses the product catalogs and attaches more information to them
What is the Web, RDF Version
• Usually called “The Semantic Web”
TCP/IP
HTTP
XML
RDF and RDF-Schema
Schema Schema Schema
Instances Instances
Further Information
• http://www.w3.org/RDF/
• http://www.w3.org/2001/sw/
• http://www.semanticweb.org/
• http://www.mozilla.org/rdf/doc/
• http://www.xml.com/pub/a/2001/01/24/rdf.html
• http://xml.coverpages.org/rdf.html
Programmatic Resources
• Protege (http://www smi.stanford.edu/projects/protege)
• RDF DB (http://web1.guha.com/rdfdb/)
• Redland (http://www.redland.opensource.ac.uk/)
• Java API (http://www-db.stanford.edu/~melnik/rdf/api.html)
• Squish (http://swordfish.rdfweb.org/rdfquery/)
High Profile Uses
• Electric Power Industry (http://www.langdale.com.au/XMLCIM.html)
• DMOZ (http://www.dmoz.org/)
• Epinions (http://www.epinions.com)
• DAML (http://www.daml.org)