121
Making the Web Searchable Peter Mika Researcher, Data Architect Yahoo! Research

Semantic Web Austin Yahoo

  • View
    9.065

  • Download
    0

Embed Size (px)

DESCRIPTION

Semantic Web Austin meetup with Yahoo's Peter Mika

Citation preview

Page 1: Semantic Web Austin Yahoo

Making the Web Searchable

Peter Mika

Researcher, Data Architect

Yahoo! Research

Page 2: Semantic Web Austin Yahoo

- 2 -

Yahoo! Research (research.yahoo.com)

Page 3: Semantic Web Austin Yahoo

- 3 -

Yahoo! Research Barcelona

• Established January, 2006

• Led by Ricardo Baeza-Yates

• Research areas

– Web Mining

• content, structure, usage

– Distributed Web retrieval

– Multimedia retrieval

– NLP and Semantics

Page 4: Semantic Web Austin Yahoo

- 4 -

Yahoo! by numbers (April, 2007)

• There are approximately 500 million users of Yahoo! branded services, meaning we reach 50 percent – or 1 out of every 2 users – online, the largest audience on the Internet (Yahoo! Internal Data).

• Yahoo! is the most visited site online with nearly 4 billion visits and an average of 30 visits per user per month in the U.S. and leads all competitors in audience reach, frequency and engagement (comScore Media Metrix, US, Feb. 2007).

• Yahoo! accounts for the largest share of time Americans spend on the Internet with 12 percent (comScore Media Metrix, US, Feb. 2007) and approximately 8 percent of the world’s online time (comScore WorldMetrix, Feb. 2007).

• Yahoo! is the #1 home page with 85 million average daily visitors on Yahoo! homepages around the world, an increase of nearly 5 million visitors in a month (comScore WorldMetrix, Feb. 2007).

• Yahoo!’s social media properties (Flickr, delicious, Answers, 360, Video, MyBlogLog, Jumpcut and Bix) have 115 million unique visitors worldwide (comScore WorldMetrix, Feb. 2007).

• Yahoo! Answers is the largest collection of human knowledge on the Web with more than 90 million unique users and 250 million answers worldwide (Yahoo! Internal Data).

• There are more than 450 million photos in Flickr in total and 1 million photos are uploaded daily. 80 percent of the photos are public (Yahoo! Internal Data).

• Yahoo! Mail is the #1 Web mail provider in the world with 243 million users (comScore WorldMetrix, Feb. 2007) and nearly 80 million users in the U.S. (comScore Media Metrix, US, Feb. 2007)

• Interoperability between Yahoo! Messenger and Windows Live Messenger has formed the largest IM community approaching 350 million user accounts (Yahoo! Internal Data).

• Yahoo! Messenger is the most popular in time spent with an average of 50 minutes per user, per day (comScore WorldMetrix, Feb. 2007).

Nearly 1 in 10 Internet users is a member of a Yahoo! Groups (Yahoo! Internal Data).• Yahoo! is one of only 26 companies to be on both the Fortune 500 list and the Fortune’s “Best Place to Work”

List (2006).

Page 5: Semantic Web Austin Yahoo

- 5 -

Agenda

• Part 1

– Publishing content on the Semantic Web

• Intro to RDF and the Semantic Web

• Six ways to publish data on the Semantic Web

• History of embedded metadata on the Web

• RDFa, best practices and tools

• Exercise

• Part 2

– Semantic Web in use

• SearchMonkey

• BOSS and YQL

• Semantic Search and Navigation

• Part 3

– Research in Semantic Search

Page 6: Semantic Web Austin Yahoo

- 6 -

Motivation

• Why publish data on the Semantic Web?

– Multiply the value of your data by increasing content agility

• The potential for reuse and aggregation with other datasets

• Make your data more easily findable

• Why develop applications using semantic technologies?

– Content agility means you can more rapidly develop applications by reusing and recombining data. Content agility leads to increased agility and robustness of your application.

Page 7: Semantic Web Austin Yahoo

Intro to the Semantic Web

Page 8: Semantic Web Austin Yahoo

- 8 -

Basic RDF

• RDF has two basic types of entities: resources and literals

– Roughly objects and built-in types in Object Oriented Programming

– Resources are identified by a URI or otherwise called a blank node

• URIs are a generalization of URLs

• Notation: <http://www.example.org/Person> or ex:Person

– Literals have an optional language and datatype (string, integer etc.)

• Datatypes are identified by URIs, e.g. XML Schema datatypes

• Two literals are the same if their components are the same

• Notation: “Joe B.” or Joe@en^^http://…#string

Page 9: Semantic Web Austin Yahoo

- 9 -

RDF models

• A triple aka a statement is a tuple of (subject, predicate, object)

– Example: (Joe, loves, Mary)

– Each triple gives the value of a property for a given resource or relates two objects to one another

– A predicate is always a resource with a URI

– A triple is also called a statement

• An RDF model is a set of triples

– Ordering of statements in an RDF document is irrelevant (unlike XML)

Page 10: Semantic Web Austin Yahoo

- 10 -

Graphical and textual notation

• A number of text-based interchange formats for RDF

– RDF/XML, Turtle, N3, N-Triples

– Example: http://www.cs.vu.nl/~pmika/foaf.rdf

my:Joe

“Joe A.”

name

foaf:Persontype

Page 11: Semantic Web Austin Yahoo

- 11 -

Ontologies

• Ontologies are collections of classes and properties used to describe objects in a particular domain

– Ontologies themselves are described in RDF or OWL (the Web Ontology Language), an extension of RDF

– Example: the Friend-Of-A-Friend (FOAF) ontology for personal profiles

• Classes can be described by sub- and superclasses, required properties

– Class membership in RDF is expressed using the rdf:type property

– An instance can have multiple classes (types)

– A class can have multiple superclasses

• Properties can be described by their domain, range, cardinalities, etc.

Page 12: Semantic Web Austin Yahoo

- 12 -

Advanced topic: Resources vs Literals

• Resources are objects, Literals are strings

• Resources are instances of classes, Literals have datatypes

• Whether something is a resource or literal sometimes depends on the detail of modeling

<meta property=“myvocab:knows”>Paris Hilton</meta>

<item rel=“foaf:knows”><meta property=“foaf:name”>Paris Hilton</meta>

</item>

• You cannot make statements about literals (literals are always the object in a triple)

• Resources can carry a globally unique identifier, literals have no identity

• Web resources such as documents and images are resources– <item rel=“rdfs:seeAlso” resource=“http://www.some.related.page.com/”/>

– <item rel=“foaf:img” resource=“http://photosite.example.org/photo.jpg”/>

• When in doubt: it’s a resource

Page 13: Semantic Web Austin Yahoo

- 13 -

Advanced Topic:Informational resources vs. Conceptual resources

• Informational resource: an HTML document, image, any other file on the Web

– Retrievable in its entirety from the Web

– Retrieving it can return a 200 OK

• Conceptual (non-informational) resource: a person, an event, a place, etc.

– A description of it may be retrievable from the Web

– When identified by a URL, retrieving it should return a 303 Redirect

• Never confuse a webpage with what it describes!

– You are not your Facebook profile: one is a document, the other is a person. A document has properties such as byte-size, media-type etc, a person has name, age, etc.

– Make sure you don’t use the URL of an existing webpage as the URI of a resource

Page 14: Semantic Web Austin Yahoo

- 14 -

RDF is designed for distributed systems

• URIs provide web-wide global identification across documents– A resource may be described by multiple documents

– We know it’s the same resource because the same URI is used or through reasoning (advanced topic…)

– URIs are intented to be reused

– Unique, but not single identifiers: two URIs may denote the same thing

• URIs are dereferencable (can be retrieved)– A well-behaved URI returns a description of the resource

– Provides authority: the definition of foaf:Person lives at that URI

• Ontologies can be looked up as well– Typically at the root of the URIs, also known as the namespace

– Example: http://xmlns.com/foaf/0.1/Person redirects to the specification

Page 15: Semantic Web Austin Yahoo

- 15 -

URIs implicitly link data together

(#joe, #name, “Joe A.”)(#joe, #email, mailto:[email protected])

(#mary, name, “Mary B.”)(#mary, gender, “female”)

(#joe, #loves, #mary)

Joe’s homepage

A dating site

Mary’s homepage

(#name, #type, #Property)(#name, #domain, #Person)

Schema doc

Linked Data:Following links from

one document to another allows to

discover the entire graph (data and

ontologies)

Page 16: Semantic Web Austin Yahoo

- 16 -

When put together, they form a single ‘global’ graph

“Joe A.”

#joe

#name

[email protected]

#email

#mary

#loves

“Mary B.”

“female”

#name

#gender

Page 17: Semantic Web Austin Yahoo

- 17 -

The even larger picture: entire datasets connected

Page 18: Semantic Web Austin Yahoo

Publishing data on the Web

Page 19: Semantic Web Austin Yahoo

- 19 -

RDF on the Web II.

• Six ways of publishing RDF

1. Standalone files (static or dynamically generated)

2. Metadata inside webpages

3. SPARQL endpoints

4. Feeds

5. XSLT/GRDDL

6. Automated tools

• Note: these are non-exclusive

Page 20: Semantic Web Austin Yahoo

- 20 -

Option 1: Standalone RDF documents

• RDF documents linked to other RDF documents

– Use rdfs:seeAlso to point to a related document

• It says: Go and look at that document if you want to know more

• Advantages:

– No change to the publishing of the HTML documents

– Data can be published by third party

• Tools

– RDB-to-RDF mappers such as D2RQ or Triplify

– Linked Data browsers

• Examples: Most datasets in the Linked Data cloud

..#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

..#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

..#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

Page 21: Semantic Web Austin Yahoo

- 21 -

Option 1: cntd.

• For discovery, the metadata is often linked from HTML pages

<link rel="meta" type="application/rdf+xml" title="FOAF" href="http://www.cs.vu.nl/~pmika/foaf.rdf" />

• Additional advantages:

– Discovery from the webpage

– It’s clear that the metadata is a machine representation of the human-targeted content of the page

• Examples: FOAF profiles, BestBuy

..

Peter Mika was born

in Budapest.

Peter Mika was born

in Budapest.

#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

Page 22: Semantic Web Austin Yahoo

- 22 -

Option 2: Metadata inside web pages

• Using microformats, RDFa, MicroData (more later)

• Advantages:

– No separate database export required

– Browser plug-in friendly

– Search engine friendly

– Copy-paste friendly

• Tools:

– XML editors (e.g. Oxygen)

– Triplr

– RDFa Distiller

– RDFa bookmarklet

– Ubiquity RDFa plugin

– Optimus microformat parser

• Examples: many, including SlideShare, YouTube, LinkedIn, Digg, Myspace, Facebook…

Peter Mika was born

in Budapest.

Peter Mika was born

in Budapest.

#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

Peter Mika was born

in Budapest.

Peter Mika was born

in Budapest.

#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

Page 23: Semantic Web Austin Yahoo

- 23 -

Option 3: SPARQL endpoints

• Query access to your RDF database

– Similar to exposing your database on the Web and giving someone read-only SQL access

• Advantages:

– Most flexible and best performing access from a consumer perspective

• Tools:

– Triple stores (Oracle, Virtuoso, Sesame, Jena, OWLIM etc.)

– RDB-to-RDF mappers such as D2RQ and Triplify

#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

Page 24: Semantic Web Austin Yahoo

- 24 -

Option 4: feeds

• The equivalent of a database dump

• No standard feed format for RDF

• Advantages

– Submit your data without making it public

• Yahoo! consumes:

– DataRSS

– GoogleBase feeds

– NewsML

• Submit your feed using SiteExplorer

..#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

#PeterM

#Bud

born

“Peter Mika”

label

“Budapest”

label#Hun

capital-of

“2,000,000”

population

Page 25: Semantic Web Austin Yahoo

- 25 -

• Publish the transformation from HTML to structured data

• GRDDL is a standard for linking an HTML page to a transformation that produces RDF data

• Advantages

– No change to the page

• Disadvantages

• Transformation needs to be executed to get to the data

• Tools

• Intel MashMaker

• Dapper

• Glue API from AdaptiveBlue

Option 5: XSLT

xx yy

1 2

<XSLT><XSLT>

Page 26: Semantic Web Austin Yahoo

- 26 -

Option 6: Automatic markup

• Restricted mostly to tagging entities with identifiers

• Advantages

– Less manual effort

• Disadvantages

– Limited to finding relevant entities in text

• Tools

– OpenCalais

– Zemanta APIPeter Mika was

born in Budapest.

Peter Mika was born in

Budapest.

<person>Peter Mika</person>

was born in <location>Budapest</location>.

<person>Peter Mika</person>

was born in <location>Budapest</location>.

Page 27: Semantic Web Austin Yahoo

- 27 -

Example: Zemanta

• A personal writing assistant for bloggers

– Plugin for popular blogging platforms and web mail clients

• Analyzes text as you type and suggests hyperlinks, tags, categories, images and related articles

• API available with the same functionality

Page 28: Semantic Web Austin Yahoo

Metadata in HTML

Page 29: Semantic Web Austin Yahoo

- 29 -

Brief history of the Annotated Web

• 1995: HTML meta tags• 1996: Simple HTML Ontology Extensions (SHOE)• 1998: RDF/XML

– RDF/XML in HTML– RDF linked from HTML

• 2003: Web 2.0– Tagging– Microformats– Metadata in Wikipedia– Machine tags in Flickr

• 2005: eRDF • 2008: RDFa

Page 30: Semantic Web Austin Yahoo

- 30 -

HTML meta tags

<HTML><HEAD profile="http://dublincore.org/documents/dcq-html/"><META name="DC.author" content="Peter Mika"><LINK rel="DC.rights copyright"

href="http://www.example.org/rights.html" /> <LINK rel="meta" type="application/rdf+xml" title="FOAF"

href= "http://www.cs.vu.nl/~pmika/foaf.rdf"> </HEAD> …</HTML>

Page 31: Semantic Web Austin Yahoo

- 31 -

SHOE example (Hefflin & Hendler, 1996)

<ONTOLOGY "our-ontology" VERSION="1.0"> <ONTOLOGY-EXTENDS "organization-ontology" VERSION="2.1" PREFIX="org"

URL="http://www.ont.org/orgont.html"> <ONTDEF CATEGORY="Person" ISA="org.Thing"> <ONTDEF RELATION="lastName" ARGS="Person STRING"> <ONTDEF RELATION="firstName" ARGS="Person STRING"> <ONTDEF RELATION="marriedTo" ARGS="Person Person"> <ONTDEF RELATION="employee" ARGS="org.Organization Person">

</ONTOLOGY>

<HEAD><META HTTP-EQUIV="Instance-Key" CONTENT="http://www.cs.umd.edu/~george"> <USE-ONTOLOGY "our-ontology" VERSION="1.0" PREFIX="our" URL="http://ont.org/our-ont.html"> </HEAD><BODY>

<CATEGORY "our.Person">

<RELATION "our.marriedTo" TO="http://www.cs.umd.edu/~helena">

<RELATION "our.employee" FROM="http://www.cs.umd.edu">

My name is

<ATTRIBUTE "our.firstName"> George </ATTRIBUTE>

<ATTRIBUTE "our.lastName"> Cook </ATTRIBUTE> and I live at...

Page 32: Semantic Web Austin Yahoo

- 32 -

SHOE system

Page 33: Semantic Web Austin Yahoo

- 33 -

SHOE Text-based query interface

Page 34: Semantic Web Austin Yahoo

- 34 -

SHOE Graphical Query Interface

Page 35: Semantic Web Austin Yahoo

- 35 -

Example: Creative Commons

Embedding CC license in HTML (now deprecated):

<HTML><HEAD>… </HEAD><BODY>…

<!–- <rdf:RDF xmlns="http://creativecommons.org/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <Work rdf:about="http://www.yergler.net/averages/"> <dc:title>The Law of Averages</dc:title> <dc:description>...because eventually i&apos;ll be right...</dc:description> <license rdf:resource="http://creativecommons.org/licenses/by-nc/1.0/" /> </Work> <License rdf:about="http://creativecommons.org/licenses/by-nc/1.0/"><requires rdf:resource="http://web.resource.org/cc/Notice" /> <permits rdf:resource="http://web.resource.org/cc/Reproduction" /> <permits rdf:resource="http://web.resource.org/cc/Distribution" /> <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" /> </License> </rdf:RDF>

-->

Page 36: Semantic Web Austin Yahoo

- 36 -

Example: Creative Commons

• Current: rel attribute (HTML4)

This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/3.0/us/">Creative Commons Attribution 3.0 United States License</a>.

• Use of the “rel” attribute for semantic annotation is the birth of the microformat…

Page 37: Semantic Web Austin Yahoo

- 37 -

Microformats (μf)

• Community centered around microformats.org– Specifications and discussions are hosted there

• Agreements on the way to encode certain kinds metadata in HTML– Reuse of semantic-bearing HTML elements– Based on existing standards– Minimality

• Microformats exist for a limited set of objects– hCard (persons and organizations)– hCalendar (events)– hResume– hProduct– hRecipe

• Varying degrees of support and stability– hCard and rel-tag are widely supported

Page 38: Semantic Web Austin Yahoo

- 38 -

Microformats: limitations

• No shared syntax

– Each microformat has a separate syntax tailored to the vocabulary

• No formal schemas

– Limited reuse, extensibility of schemas

– Unclear which combinations are allowed

• No datatypes

• No namespaces, unique identifiers (URIs)

– no interlinking

– mapping between instances is required

• Relationship to page context is often unclear

Page 39: Semantic Web Austin Yahoo

- 39 -

Example: microformats

<cite class="vcard">

<a class="fn url" rel="friend colleague met"

href="http://meyerweb.com/">Eric Meyer</a>

</cite> wrote a post (<cite>

<a href="http://meyerweb.com/eric/thoughts/2005/12/16/tax-relief/">

Tax Relief</a></cite>) about an unintentionally humorous letter

he received from the <span class="vcard">

<a class="fn org url" href="http://irs.gov/">

Internal Revenue Service</a> </span>.

<div class="vcard">

<a class="email fn" href="mailto:[email protected]">Joe Friday</a>

<div class="tel">+1-919-555-7878</div>

<div class="title">Area Administrator, Assistant</div>

</div>

Page 40: Semantic Web Austin Yahoo

- 40 -

Microformats vs. RDFa

• Choose microformats when you find a microformat that fits your needs and supported by Yahoo!– Microformats are first option because they are simple

– We support all major microformats, see the documentation

– It’s a common misconception that RDFa requires XHTML: it doesn’t

• If you find none that perfectly fits your needs then you need RDFa– Microformats have a fixed schema: you can not add your own

attributes

• Example: a social networking site with user profiles– VCard is a good candidate, but for example it doesn’t have a way to

express the user’s social connections

– You either live without this, or go with RDFa

• The rest of this presentation is about RDFa, which is thus more powerful, but also more complex– We will focus on the concepts that are hard to grasp

Page 41: Semantic Web Austin Yahoo

- 41 -

Keep an eye on HTML5

• Currently under standardization at the W3C

– Last Call this fall, keep an eye on it

• Introduces Microdata

– Similar to microformats

• Some predefined vocabularies with central registration

– Some of the flexibility of RDFa

– Introduce new terms using reverse domain names or full URIs

• Semantic HTML elements such as <time>, <video>, <article>…

Page 42: Semantic Web Austin Yahoo

- 42 -

Microdata example<div item>

<p>My name is <span itemprop="name">Neil</span>.</p>

<p>My band is called

<span itemprop="band">Four Parts Water</span>.

I was born on

<time itemprop="birthday" datetime="2009-05-10">May 10th 2009</time>.

<img itemprop="image" src=”me.png" alt=”me”>

</p>

</div

Page 43: Semantic Web Austin Yahoo

Slides courtesy of Mark BirbeckIntroduction to RDFa

Page 44: Semantic Web Austin Yahoo

- 44 -

What does RDFa look like?

• There are some metadata features in HTML already...

• ...so we give them an RDF interpretation...

• ...then we generalise them...

• ...and then we add a few more.

Page 45: Semantic Web Austin Yahoo

- 45 -

HTML's metadata features (1)

<html> <head>   <title>RDFa: Now everyone can have an API</title>   <meta name="author" content="Mark Birbeck" />   <meta name="created" content="2009-05-09" />   <link rel="license"

    href="http://creativecommons.org/licenses/by-sa/3.0/" /> </head> . . .</html>

Page 46: Semantic Web Austin Yahoo

- 46 -

HTML's metadata features (2)

<a href="http://creativecommons.org/licenses/by-sa/3.0/" >CC Attribution-ShareAlike</a>

<a rel="license"

 href="http://creativecommons.org/licenses/by-sa/3.0/" >CC Attribution-ShareAlike</a>

Page 47: Semantic Web Austin Yahoo

- 47 -

RDFa extends @rel/@href to images

<img src="image01.png"

rel="license"

 href=“http://creativecommons.org/licenses/by-sa/3.0/”

/>

<img src="image02.png"

rel="license"

 href=“http://creativecommons.org/licenses/by-sa/3.0/”

/>

Page 48: Semantic Web Austin Yahoo

- 48 -

RDFa extends meta/@content to body

<html> <head>   <title>RDFa: Now everyone can have an API</title>   <meta name="author" content="Mark Birbeck" />   <meta name="created" content="2009-05-09" /> </head> <body>   <h1>RDFa: Now everyone can have an API</h1>   Author: <em>Mark Birbeck</em>   Created: <em>May 9th, 2009</em> </body>

</html>

Page 49: Semantic Web Austin Yahoo

- 49 -

RDFa extends meta/@content to body

<html> <head>   <title>RDFa: Now everyone can have an API</title> </head> <body>   <h1>RDFa: Now everyone can have an API</h1>   Author: <em property="author" content="Mark Birbeck"    >Mark Birbeck</em>   Created: <em property="created" content="2009-05-09"    >May 9th, 2009</em> </body>

</html>

Page 50: Semantic Web Austin Yahoo

- 50 -

RDFa extends meta/@content to body

<html> <head>   <title>RDFa: Now everyone can have an API</title> </head> <body>   <h1>RDFa: Now everyone can have an API</h1>   Author: <em property="author"    >Mark Birbeck</em>   Created: <em property="created" content="2009-05-09"    >May 9th, 2009</em> </body>

</html>

Page 51: Semantic Web Austin Yahoo

- 51 -

Vocabularies use CURIEs

<html xmlns:dc="http://purl.org/dc/terms/">

 <head>   <title>RDFa: Now everyone can have an API</title> </head> <body>   <h1>RDFa: Now everyone can have an API</h1>   Author: <em property="dc:creator"    >Mark Birbeck</em>   Created: <em property="dc:created" content="2009-05-09"    >May 9th, 2009</em> </body>

</html>

Page 52: Semantic Web Austin Yahoo

- 52 -

CURIEs, or Compact URIs

• Named after Marie Curie, who was the first person to receive two Nobel prizes, one for physics and one for chemistry.

• CURIEs allow a full URI to be expressed in a simple prefix:suffix form.

• The 'suffix' part is looser than in XML namespaces, supporting formulations such as abc:123.

Page 53: Semantic Web Austin Yahoo

- 53 -

Properties can also apply to images

<img src="image01.png”

rel="license"

 href=“http://creativecommons.org/licenses/by-sa/3.0/”

/>

<img src="image02.png”

rel="license"

 href="http://creativecommons.org/licenses/by-sa/3.0/”

/>

Page 54: Semantic Web Austin Yahoo

- 54 -

Properties can also apply to images

<img src="image01.png"

rel="license"

 href="http://creativecommons.org/licenses/by-sa/3.0/"

property="dc:creator" content="Mark Birbeck”

/>

<img src="image02.png"

rel="license"

 href="http://creativecommons.org/licenses/by-sa/3.0/" property="dc:creator" content="Mark Birbeck"

/>

Page 55: Semantic Web Austin Yahoo

- 55 -

Relationships and properties on anything

<a

 href="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds" >The 5 minute guide to RDFa...in only 6 minutes and 40 seconds</a>

Page 56: Semantic Web Austin Yahoo

- 56 -

Relationships and properties on anything

<a rel="license"

 href="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds" >The 5 minute guide to RDFa...in only 6 minutes and 40 seconds</a>

Doesn't say what we want.

Page 57: Semantic Web Austin Yahoo

- 57 -

Relationships and properties on anything

<a

  href="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds" >The 5 minute guide to RDFa...in only 6 minutes and 40 seconds</a>

is licensed under

<a

 href="http://creativecommons.org/licenses/by-sa/2.5/"

 >CC BY SA</a>.

Page 58: Semantic Web Austin Yahoo

- 58 -

Relationships and properties on anything

<a

  href="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds" >The 5 minute guide to RDFa...in only 6 minutes and 40 seconds</a>

is licensed under

<a about="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds"

 rel="license"

 href="http://creativecommons.org/licenses/by-sa/2.5/"

 >CC BY SA</a>.

Page 59: Semantic Web Austin Yahoo

- 59 -

Relationships and properties on anything

<a

  href="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds" >The 5 minute guide to RDFa...in only 6 minutes and 40 seconds</a>is licensed under

<a about="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds"

 rel="license"

 href="http://creativecommons.org/licenses/by-sa/2.5/"

 property="dc:creator" content="Mark Birbeck>

CC BY SA

</a>.

Page 60: Semantic Web Austin Yahoo

- 60 -

@about sets context

<div about="http://www.slideshare.net/mark.birbeck/the-5-minute-guide-to-rdfain-only-6-minutes-40-seconds">

   <h1>The 5 minute guide to RDFa...</h1>   Author: <em property="dc:creator"    >Mark Birbeck</em>   Created: <em property="dc:created" content="2009-05-09"    >May 9th, 2009</em>

</div>

Page 61: Semantic Web Austin Yahoo

- 61 -

@about sets context

<html xmlns:dc="http://purl.org/dc/terms/">

 <head>   <title>RDFa: Now everyone can have an API</title>

 </head> <body>

   <h1>RDFa: Now everyone can have an API</h1>   

   Author: <em property="dc:creator"    >Mark Birbeck</em>   Created: <em property="dc:created" content="2009-05-09"    >May 9th, 2009</em> </body>

</html>

Page 62: Semantic Web Austin Yahoo

- 62 -

Basics of RDFa

– generalise HTML's existing semantic features;

– add support for CURIEs for property and relationship names;

– add @about.

Page 63: Semantic Web Austin Yahoo

- 63 -

Advanced RDFa

– use of @datatype to set the data type of @content;

– use of @typeof to set rdf:type;

– support for bnodes;

– support for XML literals;

– ability to chain statements together.

• Note that since RDFa supports all of the features you'll find in RDF, then it means that you can even mark-up OWL documents in HTML.

Page 64: Semantic Web Austin Yahoo

- 64 -

The process of annotating with RDFa

1. Invest in familiarizing with the RDFa syntax by reading the RDFa Primer– It is also highly recommended that you read the RDF Primer. RDF is the data model used

by RDFa.

2. Choose a vocabulary from the SearchMonkey documentation that fits your needs– A vocabulary describes a set of types and attributes within a given domain – If you don’t find a good candidate, extend an existing one or create a new one

3. Annotate your page.– Before you start, you might want to validate your page for (X)HTML conformance

using the W3C’s (X)HTML Validator to reduce the chance of errors. Choose Document Type XHTML + RDFa.

– No specific tool support. If you have an HTML or XML editor that supports DTDs, you will have syntax checking and highlighting.

– Use the RDFa Distiller to validate which data can be extracted from your page.– If you fancy, use the RDF Validator to graphically visualize the RDF graph that is

outputted.

4. Put the annotated page online. The data will extracted the next time your page is crawled– No need to explicitly submit anything– No notification when your site is crawled

• See http://rdfa.info/rdfa-implementations for new tools and APIs

Page 65: Semantic Web Austin Yahoo

- 65 -

RDFa pitfalls

• Validation problems can stop us from extracting data– Use the W3C validator

– Use the right DOCTYPE declaration if using XHTML

– Set the encoding of your page properly (using HTTP headers or XML declaration)

• Prefixes need to be defined using the xmlns attribute

• Unless you are making statements about the document, set the subject using the about attribute

• Do not include HTML elements in literal values– Incorrect: <div property=“foaf:name”><b>Peter Mika</b></div>

• Use absolute URIs as the value of the resource attribute– Or make sure you specify HTML base

Page 66: Semantic Web Austin Yahoo

- 66 -

More pitfalls: precedence rules

• Be careful when using rel and typeof in combination because of the precedence rules

• BAD example:

<div about=“#id”>

<span property=“foaf:name“>Peter Mika</span>

<span rel=“foaf:img“ typeof=“foaf:Image”>

<span property=“dc:format”>jpg</span>

</span

</div>

• To correct, you need to put the typeof inside the <span> node with rel=“foaf:img”

Page 67: Semantic Web Austin Yahoo

- 67 -

More pitfalls: the typeof attribute

• Typeof does two things at once: it creates a new subject resource and assigns the type to it

• BAD example:

<div about=“#id”>

<span property=“foaf:name“>Peter Mika</span>

<span rel=“foaf:img“ resource=“http://www.example.org/photo.jpg”>

<span typeof=“foaf:Image”>

<span property=“dc:format”>jpg</span>

</span

</span

</div>

• To correct, you have to repeat the resource attiribute on the span node with the typeof

Page 68: Semantic Web Austin Yahoo

- 68 -

HTML markup pitfalls

• Marking up <h1>:<h1 property=“dc:title”>My homepage</h1>

NOT: <h1><div property=“dc:title”>My homepage</h1>

•  Marking up an image:<a href="http://example.org/user/alex">

     <span about="#user1" rel="foaf:img media:image">        <img alt="Alex" src="http://example.org/photos/alex.jpg"/>     </span>

</a>

This doesn’t work:

<img rel=“foaf:img” src=“photo.jpg/>

• In the header you need<meta property=“…” content=“…”>

NOT

<meta name=“…” content=“…”>

Page 69: Semantic Web Austin Yahoo

- 69 -

More pitfalls: breaking up descriptions

• You can not break up a description like this:

<span rel=“foaf:knows">   <span property=“foaf:name">Peter Mika</span></span>….

<span rel=“foaf:knows">   <a rel=“foaf:email“ href=“mailto:[email protected] /></span>

• This is not the same as:

<span rel=“foaf:knows">   <span property=“foaf:name">Peter Mika</span>   

<a rel=“foaf:email“ href=“mailto:[email protected] />

</span>

• In the first case there are two related resources, with one attribute each, in the second case there is a single related resource with two attributes.

Page 70: Semantic Web Austin Yahoo

- 70 -

Tips

• Hiding information from being displayed

– Links without content will not be rendered

– Use <span property=“foaf:name” content=“Peter Mika”/>

• Use datatypes to provide the expected type of a literal.

– This helps validation because any tool can check whether the literal is indeed of that type.

Page 71: Semantic Web Austin Yahoo

- 71 -

Choosing a vocabulary

• Look at SearchMonkey objects

– Video, Games, Presentations, Events, News, Businesses, Products, Discussion

• Search the Web or ask for advice on mailing lists

[email protected]

[email protected]

• Beware of people who claim to have the vocabulary of everything

– Preferably you want something small and targeted

• Never a 100% fit you will need to introduce vocabulary terms (classes and properties)

– Do not introduce new classes/properties in existing namespaces

– Example: the namespace http://xmlns.com/foaf/0.1/ is used by the FOAF project. Try not to introduce a new term without contacting the owner, i.e. the membership of the FOAF mailing list.

Page 72: Semantic Web Austin Yahoo

- 72 -

Advanced topic: creating a vocabulary

1. Get advice on methodology– vocamp.org and semanticweb.org

2. Choose a namespace and a prefix– Give sensible names, e.g. name it after your site, but don’t call it searchmonkey

– Namespace ends either with a slash or a hash

3. Create an RDF or OWL document describing your classes and properties• Use an ontology editor such as Protégé 4.0

• Follow naming conventions

4. Publish your vocabulary– Make sure the URIs of your properties and classes are resolvable

1. E.g. myvocab:digicam should resolve to a document containing the definition of myvocab:digicam

• Convince others to adopt your vocabulary1. If you are in fishing, convince other fishing businesses

Page 73: Semantic Web Austin Yahoo

- 73 -

Exercise

• Explore data on the Web

– Microformats

• Search for pages on Yahoo using searchmonkey:com.yahoo.page.uf.hcard

• Try Operator Firefox Plug-in

• Try Optimus

– RDFa

• Search for pages on Yahoo using searchmonkey:com.yahoo.page.rdf.rdfa

• Try RDFa bookmarklet

• Try RDFa Distiller

• Mark up your webpage using RDFa

– See process on previous slides

Page 74: Semantic Web Austin Yahoo

Semantic Web in Use

Page 75: Semantic Web Austin Yahoo

- 81 -

Applications

• Yahoo’s SearchMonkey and Google’s Rich Snippets

• BOSS and YQL

• Semantic search and navigation

Page 76: Semantic Web Austin Yahoo

- 82 -

• Creating an ecosystem of publishers, developers and end-users – Motivating and helping publishers to implement semantic

annotation

– Providing tools for developers to create compelling applications

– Focusing on end-user experience

• Rich abstracts as a first application

• Addressing the long tail of query and content production

• Standard Semantic Web technology– dataRSS = Atom + RDFa

– Industry standard vocabularies

• http://developer.yahoo.com/searchmonkey/

SearchMonkey

Page 77: Semantic Web Austin Yahoo

- 83 -

Before After

an open platform for using structured data to build more useful and relevant search results

What is SearchMonkey?

Page 78: Semantic Web Austin Yahoo

- 84 -

image

deep links

name/value pairs or

abstract

Enhanced Result

Page 79: Semantic Web Austin Yahoo

- 85 -YAHOO! CONFIDENTIAL | 85

Infobar

Page 80: Semantic Web Austin Yahoo

- 86 -

Acme.com’sdatabase

Index

RDF/Microformat Markup

site owners/publishers share structured data with Yahoo!. 1

consumers customize their search experience with Enhanced Results or Infobars

3

site owners & third-party developers build SearchMonkey apps.2

DataRSS feed

Web Services

Page Extraction

Acme.com’s Web Pages

SearchMonkey

Page 81: Semantic Web Austin Yahoo

- 87 -

Standard enhanced results

Embed markup in your page, get an enhanced results without any programming

Page 82: Semantic Web Austin Yahoo

- 88 -

Documentation

• Simple and advanced, examples, copy-paste code, validator

Page 83: Semantic Web Austin Yahoo

- 92 -

Developer tool: create custom presentations

Page 84: Semantic Web Austin Yahoo

- 93 -

Developer tool

Page 85: Semantic Web Austin Yahoo

- 94 -

Developer tool

Page 86: Semantic Web Austin Yahoo

- 95 -

Developer tool

Page 87: Semantic Web Austin Yahoo

- 96 -

Developer tool

Page 88: Semantic Web Austin Yahoo

- 97 -

Gallery

Page 89: Semantic Web Austin Yahoo

- 98 -

Example apps

• LinkedIn

– hCard plus feed data

• Creative Commons by Ben Adida

– CC in RDFa

Page 90: Semantic Web Austin Yahoo

- 99 -

Example apps. II.

• Other me by Dan Brickley

– Google Social Graph API wrapped using a Web Service

Page 91: Semantic Web Austin Yahoo

- 100 -

Google’s Rich Snippets

• Shares a subset of the features of SearchMonkey

– Encourages publishers to embed certain microformats and RDFa into webpages

• Currently reviews, people, products, business & organizations

– These are used to generate richer search results

• SearchMonkey is customizable

– Developers can develop applications themselves

• SearchMonkey is open

– Wide support for standard vocabularies

– API access

Page 92: Semantic Web Austin Yahoo

API access to metadataYahoo BOSS & YQL

Page 93: Semantic Web Austin Yahoo

- 102 -

BOSS: Build your Own Search Service

• Ability to re-order results and blend-in addition content

• No restrictions on presentation

• No branding or attribution

• Access to multiple verticals (web search, image, news)

• 40+ supported language and region pairs

• Pricing (BOSS)

– Pay-by-usage

– 10,000 queries a day still free

– Serve any ads you want

• For more info, http://developer.yahoo.com/search/boss/

Page 94: Semantic Web Austin Yahoo

- 103 -

BOSS API to structured data

• Simple HTTP GET calls, no authentication

– You need an Application ID: register at developer.yahoo.com/search/boss/

• http://boss.yahooapis.com/ysearch/web/v1/{query}?appid={appid}&format=xml&view=searchmonkey_feed

• Restrict your query using special words

– searchmonkey:com.yahoo.page.uf.{format}

• {format} is one of hcard, hcalendar, tag, adr, hresume etc.

– searchmonkey:com.yahoo.page.rdf.rdfa

Page 95: Semantic Web Austin Yahoo

- 104 -

Demo: resume search

• Search pages with resume data and given keywords

{keyword} searchmonkey:com.yahoo.page.uf.hresume

• Parse the results as DataRSS (XML)

• Extract information and display using YUI

Page 96: Semantic Web Austin Yahoo

- 105 -

Demo

Page 97: Semantic Web Austin Yahoo

- 106 -

Yahoo Query Language (YQL)

• Query web APIs as virtual tables

– Mash-up data by joining tables

– Add an API by adding a table definition

• Example: select my friends and sort by nickname

Page 98: Semantic Web Austin Yahoo

- 107 -

PHP example: select the last 100 photos from Flickr with the word Austin

<?php

$url = "http://query.yahooapis.com/v1/public/yql?q=";

$q = "select * from flickr.photos.search(100) where text=’Austin'";

$fmt = "xml";

$x = simplexml_load_file($url.urlencode($q)."&format=$fmt");

foreach($x->attributes('http://www.yahooapis.com/v1/base.rng') as $k=>$v) {

$$k=(string)$v;

}

echo <<<EOB

$count photos fetched from

{$x->diagnostics->url} in

{$x->diagnostics->url['execution-time']} seconds<br>

EOB;

$flickr = "http://static.flickr.com/";

foreach($x->results->photo as $p) {

echo "<img src=\"$flickr{$p['server']}/{$p['id']}_{$p['secret']}_s.jpg\"/>\n";

}

?>

Page 99: Semantic Web Austin Yahoo

- 108 -

YQL example (source)

Page 100: Semantic Web Austin Yahoo

- 109 -

That’s all there is to it!

<?php

$root = 'http://query.yahooapis.com/v1/public/yql?q=';

$city = 'Barcelona';

$loc = 'Barcelona';

$yql = 'select * from html where url = \'http://en.wikipedia.org/wiki/'.$city.'\' and xpath="//div[@id=\'bodyContent\']/p" limit 3';

$url = $root . urlencode($yql) . '&format=xml';

$info = getstuff($url);

$info = preg_replace("/.*<results>|<\/results>.*/",'',$info);

$info = preg_replace("/<\?xml version=\"1\.0\"".

" encoding=\"UTF-8\"\?>/",'',$info);

$info = preg_replace("//",'',$info);

$info = preg_replace("/\"\/wiki/",'"http://en.wikipedia.org/wiki',$info);

$yql = 'select * from upcoming.events.bestinplace(5) where woeid in '.

'(select woeid from geo.places where text="'.$loc.'")'.

' | unique(field="description")';

$url = $root . urlencode($yql) . '&format=json';

$events = getstuff($url);

$events = json_decode($events);

foreach($events->query->results->event as $e){

$evHTML.='<li><h3><a href="'.$e->ticket_url.'">'.$e->name.'</a></h3><p>'.

substr($e->description,0,100).'&hellip;</p></li>';

}

$yql = 'select * from flickr.photos.info where photo_id in '.

'(select id from flickr.photos.search where woe_id in '.

'(select woeid from geo.places where text="'.$loc.'")) limit 16';

$url = $root . urlencode($yql) . '&format=json';

$photos = getstuff($url);

$photos = json_decode($photos);

foreach($photos->query->results->photo as $s){

$src = "http://farm{$s->farm}.static.flickr.com/{$s->server}/".

"{$s->id}_{$s->secret}_s.jpg";

$phHTML.='<li><a href="'.$s->urls->url->content.'"><img alt="'.

$s->title.'" src="'.$src.'"></a></li>';

}

$yql='select description from rss where '.

' url="http://weather.yahooapis.com/forecastrss?p=SPXX0015&u=c"';

$url = $root . urlencode($yql) . '&format=json';

$weather = getstuff($url);

$weather = json_decode($weather);

$weHTML = $weather->query->results->item->description;

function getstuff($url){

$curl_handle = curl_init();

curl_setopt($curl_handle, CURLOPT_URL, $url);

curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);

curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);

$buffer = curl_exec($curl_handle);

curl_close($curl_handle);

if (empty($buffer)){

return 'Error retrieving data, please try later.';

} else {

return $buffer;

}

}?>

Page 101: Semantic Web Austin Yahoo

Semantic Search and Navigation

Page 102: Semantic Web Austin Yahoo

- 111 -

Browsing Linked Data

• Browse the Linked Data graph by going from one URI to the next

– OpenLink’s Linked Data browser

– OpenLink’s Data Explorer

– Tabulator

– Disco

– Marbles

Page 103: Semantic Web Austin Yahoo

- 112 -

Semantic Search Engines

• Natural Language search engines

– Hakia

– Powerset (now built into Bing)

– TrueKnowledge

• Structured data search engines

– Searching open web data

• Sindice

• Sigma

– Searching closed world data

• Wolfram Alpha (closed structured data + computation)

Page 104: Semantic Web Austin Yahoo

- 113 -

Exercise

• Build a SearchMonkey application

• Try a few YQL queries

• Try Zemanta online

– You don’t need to install the plugin

• Find data on the Web using search or navigation

Page 105: Semantic Web Austin Yahoo

Research on Semantic Search

Page 106: Semantic Web Austin Yahoo

- 115 -

Semantic Search

• Def. matching the user’s query with the Web’s content at a conceptual level, often with the help of world knowledge

• Related disciplines

– Semantic Web, IR, Databases, NLP, IE

• As a field

– ISWC/ESWC/ASWC, WWW, SIGIR

– Exploring Semantic Annotations in Information Retrieval (ECIR08, WSDM09)

– Semantic Search Workshop (ESWC08, WWW09)

– Future of Web Search (FoWS09)

Page 107: Semantic Web Austin Yahoo

- 116 -

Hard searches

• Ambiguous searches– Paris Hilton

• Multimedia search– Images of Paris Hilton

• Imprecise or overly precise searches – Publications by Jim Hendler– Find images of strong and adventurous people (Lenat)

• Searches for descriptions– Search for yourself without using your name– Product search (ads!)

• Searches that require aggregation– Size of the Eiffer tower (Lenat)– Public opinion on Britney Spears

• Queries that require a deeper understanding of the query, the content and/or the world at large

– Note: some of these are so hard that users don’t even try them any more

Page 108: Semantic Web Austin Yahoo

- 117 -

Not just search…

Page 109: Semantic Web Austin Yahoo

- 118 -

In the best of cases…

• Matching the query intent with the document metadata can be trivial:

<adjunct id="com.yahoo.query.intent" version="0.5"> <type typeof="fb:music.artist foaf:Person"> <meta property="foaf:name">Madonna</meta> </type> </adjunct>

<adjunct id="com.yahoo.page.hcard" version="0.5"> <type typeof=“foaf:Person"> <meta property="foaf:name">Madonna</meta> </type> </adjunct>

Query:

Document metadata:

dna_checksum:AF514FE45DD33BB7CD8DCCC89AA

dna_checksum:AF514FE45DD33BB7CD8DCCC89AA

Page 110: Semantic Web Austin Yahoo

- 119 -

Semantics at every step of the IR process

bla bla bla?

bla

blabla

q=“bla” * 3

Document processing bla

blabla

blabla

bla

Ranking “bla”θ(q,d)

Query processing

Result presentation

The IR engine The Web

Page 111: Semantic Web Austin Yahoo

- 120 -

Study: improving text analysis using structured data

• Problems – Creating training data manually is expensive– Existing taggers trained on financial-political news

• Idea:

1. Learn the correspondence between entities in text and metadata

2. Extend knowledge base and generate in-domain training data

Learning to Tag and Tagging to Learn: A Case Study on Wikipedia Peter Mika; Massimiliano Ciaramita; Hugo Zaragoza; Jordi Atserias, IEEE Intelligent Systems, 2008, 5

Page 112: Semantic Web Austin Yahoo

- 121 -

Study: processing metadata using cloud computing

• Question: Can we use Pig to effectively query and reason with large amounts of RDF data?– Mapping SPARQL to PigLatin

– Forward-chaining RDF(S) reasoning

– Acknowledgement: Ben Reed

• Experimental results (LUBM)– Useful for long running queries and reasoning

– Not useful for interactive queries (< 100 s)

• Co-organizing:– Billion Triples Challenge at ISWC 2008, October 26-28,

Karlsruhe, Germany

Web Semantics in the Clouds Peter Mika; Giovanni Tummarello, IEEE Intelligent Systems, 2008, 5

Page 113: Semantic Web Austin Yahoo

- 122 -

Study: metadata analysis

• What vocabularies are being used? . • What microformats should we support? • How much vocabulary reuse/extension there is?

– Is there a convergence?

• What is the quality of metadata? – Datatype conformance – Logical consistency– Conformance to common use wrt common attributes

• How much spam is there? – Distribution of spamicity scores– Do spamicity scores transfer to metadata?

• Are there new schemas emerging through the combination of existing vocabularies?

• What is the metadata coverage in terms of queries? – What percentage of queries from query logs would result in metadata? – How many would result in metadata that could answer the query? (by some

approximation)

Page 114: Semantic Web Austin Yahoo

- 123 -

Study: Semantic Search Assist

• Observation: the same type of objects often have the same query context– Users asking for the same aspect of the type

• Could we make query suggestions based on the type of the entity?– Improvement for infrequent queries

apple ipod nano review sony plasma tv reviewjerry yang biography biography tim berners lee tim berners lee blog

peter mika yahoobritney spears shaves her head

Page 115: Semantic Web Austin Yahoo

- 124 -

Study: evaluation of semantic search

• Analysis of user needs

– How are these needs aligned with data on the Web?

– How do the vocabularies differ?

• Analysis of query types

– Object queries? Object-attribute queries? Relationship queries?

• What it means for an object or a set of triples to be relevant to a query?

– Show me the answer and only the answer

– Put me near the answer in the graph

– Show me the justification (or at least the source) of the answer

– …

• Semantic Search evaluation campaign planned for 2010

Page 116: Semantic Web Austin Yahoo

- 125 -

Challenges

• Future work in Semantic Web

– (Semi-)automated ways of metadata creation• How do we go from 5% to 95%?

– Data quality• We allow providing metadata for other people’s sites!

– Reasoning• To what extent is reasoning useful?

• For example, how much would entity resolution or taxonomic reasoning help?

– Scale• How do we exploit cluster computing techniques?

• What is between databases and IR engines?

– Fostering social agreements• How do we get people to reuse vocabularies?

Page 117: Semantic Web Austin Yahoo

- 126 -

Challenges

• Future work in IR– Query interpretation

– Ranking with metadata

– Evaluation of semantic search

– Personalization

– Semantic ads

• Constraints

– Users still want to see a document• Keyword-based search cannot suffer

• Whole page relevance, monetization can only increase

– Established expectations • Query entry

• Result presentation

Page 118: Semantic Web Austin Yahoo

- 127 -

Contact

• Peter Mika

[email protected]

– Come to Barcelona and stop by

• SearchMonkey

– developer.yahoo.com/searchmonkey/

– mailing lists

[email protected]

[email protected]

– forums

• http://suggestions.yahoo.com/searchmonkey

– Semantic Web FAQ

• http://devel.yahoo.com/searchmonkey/smguide/faq.html

Page 119: Semantic Web Austin Yahoo

- 128 -

the monkey is out!

Page 120: Semantic Web Austin Yahoo

- 129 -

Application: query intent

Paris Hilton is a person!

Page 121: Semantic Web Austin Yahoo

- 130 -

Application: query intent #2

Hugo is a person!