Why use Semantic Annotations?
• to annotate HTML content with specific machine-readable labels
• to allow generic scripts to provide custom services to the page
• to enable content from many sources to be processed by a single script in a consistent manner
“specific machine-readable labels”?
• Indeed, you have to use "labels" from a specific vocabulary, e.g. Schema.org
• Schema.org is endorsed by Google, Bing and Yahoo!
• This presentation uses the Schema.org vocabulary but does not discuss it in detail
• Check http://getschema.org for many other examples.
How to use these "labels"? (1)
<h1>Pita Pizza</h1>
<p>By Mindy Pretner</p>
Pita Pizza is a quick snack or meal that can be customized to your liking!
<h2>1 Serving Piece</h2>
<p> Prep Time: 5 Min<br/> Cook Time: 15 Min<br/> Ready In: 20 Min</p>
<h3> Ingredients</h3>
<ul>
<li>1 pita bread round</li>
<li>5 ml olive oil</li>
<li>45 ml pizza sauce</li>
<li>55 g shredded mozzarella cheese</li>
<li>25 g sliced crimini mushrooms</li>
<li>0.7 g garlic salt</li>
</ul>
<h3> How to do it</h3>
<ol>
<li>Preheat grill for medium-high heat.</li>
<li>Spread one side of the pita with olive oil and pizza sauce. Top with cheese and mushrooms, and season with garlic salt.</li>
<li>Lightly oil grill grate. Place pita pizza on grill, cover, and cook until cheese completely melts, about 5 minutes.</li>
</ol>
<h3> Nutritional Information </h3>
<p> <strong>Amount Per Serving</strong>
Calories: 405 | Total Fat: 18g | Cholesterol: 44mg
</p>
This is my HTML. I want to publish a recipe, what should I do now?
Well, you have to tell Google that you published a Recipe.
<div itemscope itemtype="http://schema.org/Recipe">
<h1>Pita Pizza</h1>
…
</div>
Use a container element (such as <div>) to state that all enclosed content is about the same thing.
itemscope? itemtype? Indeed, these are new attributes you need to use. This is Microdata.
What about http://schema.org/Recipe? This is the thing your content is about :)
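To see what a consuming script might make of itemscope and itemtype, here is a minimal sketch in Python using only the standard library. The class name ItemTypeFinder and the sample page are illustrative assumptions, not part of any real crawler.

```python
from html.parser import HTMLParser


class ItemTypeFinder(HTMLParser):
    """Collects the itemtype of every element that carries itemscope."""

    def __init__(self):
        super().__init__()
        self.types = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope has no value, so html.parser reports it as ("itemscope", None)
        if "itemscope" in attrs:
            self.types.append(attrs.get("itemtype"))


# Hypothetical page content, modelled on the recipe container shown above
page = '<div itemscope itemtype="http://schema.org/Recipe"><h1>Pita Pizza</h1></div>'

finder = ItemTypeFinder()
finder.feed(page)
print(finder.types)
```

The script never reads the human text: it only needs the two attributes to learn that the page contains a Recipe.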
Microdata (1) – what is it?
• Microdata is a collection of HTML5 attributes that help us describe items (such as a Recipe) and their properties (such as cooking time)
• An item is a group of property-value pairs (Oh, that's why I need a container. Earlier I just defined a Recipe item...)
• To create an item you MUST use the itemscope attribute
• Each item should have a type, defined with the itemtype attribute
Microdata (2) – itemscope, itemtype
• Usually HTML attributes have a value... What about this itemscope?
• Well, itemscope is a boolean attribute: its presence alone is what counts. Therefore use itemscope without any value.
• How do I know the value of the itemtype attribute?
• This value is a special name (technically called a URI, Uniform Resource Identifier) which uniformly identifies what your content is about.
• The value http://schema.org/Recipe is defined by the Schema.org vocabulary. This is the vocabulary processed by Google, Bing and Yahoo!. Check http://getschema.org for examples.
• itemtype MUST always be used together with itemscope
Microdata (3) - property-value pair
• What about these "property-value pairs"?
• Suppose you want to state that your recipe is for 1 serving...
What? I did this in my HTML...
Well, you have just plain text, which is difficult for machines to process... They would need to understand human language, even ALL the languages in the world...
What should I do better?
Microdata (4) - itemprop
• Use a property-value pair to describe this
• Microdata has the itemprop attribute ("item property")
<div>
<span itemprop="recipeYield">1</span>
Serving
</div>
I've got it! The property is recipeYield and the value is "1".
So we have a property-value pair.
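The following Python sketch shows how a script could turn the recipeYield markup into a property-value pair. It is a simplified illustration, not the full Microdata algorithm: the class name PropertyReader is an assumption, and the sketch expects every element to be explicitly closed.

```python
from html.parser import HTMLParser


class PropertyReader(HTMLParser):
    """Reads the text of each element that has an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._name = None  # property currently being read
        self._depth = 0    # nesting depth inside that property's element

    def handle_starttag(self, tag, attrs):
        if self._name is not None:
            self._depth += 1  # a child element inside the property
            return
        name = dict(attrs).get("itemprop")
        if name:
            self._name, self._depth = name, 1
            self.properties[name] = ""

    def handle_endtag(self, tag):
        if self._name is not None:
            self._depth -= 1
            if self._depth == 0:  # the property's own element just closed
                self.properties[self._name] = self.properties[self._name].strip()
                self._name = None

    def handle_data(self, data):
        if self._name is not None:
            self.properties[self._name] += data


reader = PropertyReader()
reader.feed('<div><span itemprop="recipeYield">1</span> Serving</div>')
print(reader.properties)
```

Only the text inside the annotated span becomes the value; the word "Serving" stays human-only content.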
Microdata (5) - more about itemprop
• A single itemprop can also introduce multiple properties at once (separated by spaces), to avoid duplication when those properties have the same value.
<p itemprop="author creator" itemscope
itemtype="http://schema.org/Person">
By <a itemprop="url"
href="http://fourstarcooks.org/mindypretner">
<span itemprop="name">Mindy Pretner</span>
</a>
</p>
More about values
• Property values are generally strings taken from the element's text, but in many cases the value is a URL (such as the image of my pita pizza). The value of such a property is then taken from the attribute that carries the URL (such as the src attribute of <img>)
<div itemscope itemtype="http://schema.org/Recipe">
<h1 itemprop="name">Pita Pizza</h1>
<img itemprop="image"
src="http://myrecipes.com/images/pitapizza.png"
alt="Pita Pizza"/>
…
</div>
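As a rough illustration of values that come from attributes, the Python sketch below looks up which attribute holds the value for a handful of elements. The VALUE_ATTRIBUTE table is a deliberately partial list and the class name is an assumption made for this example.

```python
from html.parser import HTMLParser

# Attribute that carries the value for a few common elements (partial list)
VALUE_ATTRIBUTE = {"img": "src", "a": "href", "link": "href",
                   "meta": "content", "time": "datetime"}


class AttributeValueReader(HTMLParser):
    """Records properties whose value lives in an attribute, not in the text."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        source = VALUE_ATTRIBUTE.get(tag)  # None for elements such as <h1>
        if attrs.get("itemprop") and attrs.get(source):
            self.properties[attrs["itemprop"]] = attrs[source]


page = """
<div itemscope itemtype="http://schema.org/Recipe">
<h1 itemprop="name">Pita Pizza</h1>
<img itemprop="image" src="http://myrecipes.com/images/pitapizza.png" alt="Pita Pizza"/>
</div>
"""

reader = AttributeValueReader()
reader.feed(page)
print(reader.properties)
```

The name property is skipped here on purpose, because its value comes from the text of the h1 element rather than from an attribute.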
There are many ways to mark up (1)
<p itemprop="author" itemscope
itemtype="http://schema.org/Person">
By <a itemprop="url"
href="http://fourstarcooks.org/mindypretner">
<span itemprop="name">Mindy Pretner</span>
</a>
</p>
<p>
<span itemprop="author" itemscope itemtype="http://schema.org/Person">
By <a itemprop="url"
href="http://fourstarcooks.org/mindypretner">
<span itemprop="name">Mindy Pretner</span>
</a>
</span>
</p>
There are many ways to mark up (2)
<p itemprop="author"
itemscope itemtype="http://schema.org/Person">
By
<span itemprop="name">
<a itemprop="url"
href="http://fourstarcooks.org/mindypretner">
Mindy Pretner
</a>
</span>
</p>
Can the machine understand "20 Min"?
• Indeed, this is a long story... Some text is easy for a machine to understand, other text is very difficult
• The total time of my recipe is 20 min...
• Therefore we sometimes need to separate the human-readable content (20 Min) from the machine-readable one (PT20M)...
<div itemscope itemtype="http://schema.org/Recipe">
...
Ready In:
<time itemprop="totalTime" datetime="PT20M">
20 Min
</time>
...
</div>
How to describe collections?
<ul>
<li itemprop="recipeIngredient">1 pita bread round</li>
<li itemprop="recipeIngredient">5 ml olive oil</li>
…
</ul>
A property may appear many times
What about "5 ml"... This is a quantity. Is there any way to describe it better?
Well, there are some solutions but, if you are a beginner, just do as above
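A property that appears many times naturally becomes a list on the consumer side. The Python sketch below assumes flat list items with no nested elements; the class name is an assumption, and the code works the same whatever property name the markup uses (recipeIngredient is used here).

```python
from collections import defaultdict
from html.parser import HTMLParser


class RepeatedPropertyReader(HTMLParser):
    """Gathers every occurrence of a property into a list of values."""

    def __init__(self):
        super().__init__()
        self.values = defaultdict(list)
        self._name = None  # property of the element currently open, if any

    def handle_starttag(self, tag, attrs):
        self._name = dict(attrs).get("itemprop")
        if self._name:
            self.values[self._name].append("")  # start a new value

    def handle_data(self, data):
        if self._name:
            self.values[self._name][-1] += data

    def handle_endtag(self, tag):
        self._name = None


page = """<ul>
<li itemprop="recipeIngredient">1 pita bread
round</li>
<li itemprop="recipeIngredient">5 ml olive oil</li>
</ul>"""

reader = RepeatedPropertyReader()
reader.feed(page)
# Collapse line breaks and repeated spaces inside each value
ingredients = [" ".join(v.split()) for v in reader.values["recipeIngredient"]]
print(ingredients)
```

Note that "5 ml" stays an opaque string: the script collects it but cannot tell the amount from the unit.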
Properties have "expected values"
• Schema.org properties have expected values.
• What is an "expected value"?
• Let's say it is the kind of value a machine expects to find for your property
• Remember the duration of your recipe?
• The property totalTime has a Duration as expected value
• A Duration is a precise string using ISO 8601 duration format
• When such a format is not provided the machine may fail to understand the value
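To make the point about expected values concrete, here is a small Python sketch of a duration reader. It is an assumption-laden simplification: parse_duration is a hypothetical helper that accepts only days, hours, minutes and seconds, which is a subset of ISO 8601 durations.

```python
import re
from datetime import timedelta

# P[nD][T[nH][nM][nS]]: a subset of the ISO 8601 duration format
DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_duration(value):
    """Convert a string such as 'PT20M' into a timedelta, or raise ValueError."""
    match = DURATION.match(value)
    if not match or not any(match.groups()) or value.endswith("T"):
        raise ValueError(f"not a supported ISO 8601 duration: {value!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


print(parse_duration("PT20M"))  # 0:20:00

try:
    parse_duration("20 Min")
except ValueError as error:
    print(error)
```

The "T" matters: "PT20M" means 20 minutes, while an "M" before the "T" stands for months, which this sketch rejects because a month has no fixed length.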
Referring items
• How can I refer to an item which is not in the same container?
• You have to use a combination of the HTML id attribute and the itemref attribute, sharing the SAME value
• This way you can still annotate data that does not follow a convenient tree structure...
<div itemprop="author" itemscope itemtype="http://schema.org/Person" id="x">
<p itemprop="name">Friedrich Hayek</p>
</div>
<div itemscope itemtype="http://schema.org/Book" itemref="x">
<p itemprop="name">The Road to Serfdom</p>
</div>
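The sketch below illustrates, in Python, how a script might follow itemref: a Book item pulls in an author that sits outside its own container. This is a simplified reading of the Microdata rules, not a complete implementation; the names Node, TreeBuilder and item_properties are assumptions, every element must be explicitly closed, and only an element's direct text is used as its value.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, attrs, parent):
        self.tag, self.attrs, self.parent = tag, dict(attrs), parent
        self.children, self.text = [], ""


class TreeBuilder(HTMLParser):
    """Builds a very small element tree from well-nested markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("root", [], None)
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self._current)
        self._current.children.append(node)
        self._current = node

    def handle_endtag(self, tag):
        if self._current.parent is not None:
            self._current = self._current.parent

    def handle_data(self, data):
        self._current.text += data


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def item_properties(item, by_id):
    """Collect properties from the item's children and its itemref targets."""
    pending = list(item.children)
    for ref in (item.attrs.get("itemref") or "").split():
        if ref in by_id:
            pending.append(by_id[ref])
    props = {}
    while pending:
        node = pending.pop(0)
        if "itemscope" not in node.attrs:
            pending.extend(node.children)  # nested items keep their own properties
        if node.attrs.get("itemprop"):
            if "itemscope" in node.attrs:
                value = item_properties(node, by_id)  # the value is itself an item
            else:
                value = node.text.strip()
            for name in node.attrs["itemprop"].split():
                props.setdefault(name, []).append(value)
    return props


page = """
<div itemprop="author" itemscope itemtype="http://schema.org/Person" id="x">
  <p itemprop="name">Friedrich Hayek</p>
</div>
<div itemscope itemtype="http://schema.org/Book" itemref="x">
  <p itemprop="name">The Road to Serfdom</p>
</div>
"""

builder = TreeBuilder()
builder.feed(page)
by_id = {n.attrs["id"]: n for n in walk(builder.root) if "id" in n.attrs}
book = next(n for n in walk(builder.root)
            if n.attrs.get("itemtype") == "http://schema.org/Book")
book_properties = item_properties(book, by_id)
print(book_properties)
```

The element with id "x" is treated as if it were a child of the Book container, which is how the author ends up among the Book's properties.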
Vocabularies define types
• How do I know what type (itemtype) to use?
• Microdata does NOT define these types and their properties
• Types and properties are part of Web vocabularies, e.g., Schema.org
• Examples of types:
o http://schema.org/Recipe
o http://schema.org/Book
• Each type defines its own properties and inherits properties from its supertypes, e.g.,
• http://schema.org/CreativeWork is a supertype (or parent/ancestor type) of http://schema.org/Diet, therefore Diet inherits all CreativeWork properties
• Diet is a subtype of CreativeWork
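Inheritance between types can be pictured with a small Python sketch. The tables below are a tiny hand-picked subset of Schema.org, simplified so that each type has a single supertype; they are an illustration, not the vocabulary's real data model.

```python
# Simplified, single-parent view of a few Schema.org types (assumption)
SUPERTYPE = {"Diet": "CreativeWork", "CreativeWork": "Thing", "Thing": None}

# A few properties each type defines itself (hand-picked subset)
OWN_PROPERTIES = {
    "Thing": {"name", "url", "image"},
    "CreativeWork": {"author"},
    "Diet": {"dietFeatures"},
}


def all_properties(type_name):
    """Return a type's own properties plus those of all its supertypes."""
    props = set()
    while type_name is not None:
        props |= OWN_PROPERTIES[type_name]
        type_name = SUPERTYPE[type_name]
    return props


print(sorted(all_properties("Diet")))
```

Walking up the chain is why a Diet can carry an author or a name even though it does not define those properties itself.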
Which vocabulary should I use?
• Depending on the semantics you want to encode, there are plenty of Web vocabularies: FOAF, SKOS, DOAP, ...
• However, these types and properties have to be understood by Web applications...
• Therefore when you use e.g. SKOS, ONLY applications that process SKOS will interact with your page
• Remember, Google, Bing and Yahoo! agreed to process a common vocabulary, Schema.org
See examples at http://getschema.org
Happy?
• Visit http://getschema.org to learn more
• Request an account if you would like to contribute