Presentation at VfM Seminar 'Medieninformation und Mediendokumentation', Bonn, 12.03.2013
Semantic Search (Semantische Suche)
Bonn, 12 March 2013
Dr. Harald Sack
Hasso-Plattner-Institut for IT-Systems Engineering
University of Potsdam
Thursday, 14 March 2013
vfm - Seminar: Metadatenmanagement in Medienunternehmen, 05. September 2012, Bonn Jörg Waitelonis, Hasso-Plattner-Institut Potsdam
■ HPI was founded in October 1998 as a public-private-partnership
■ HPI research and teaching is focused on IT Systems Engineering
■ 10 professors and 100 scientific coworkers
■ 450 bachelor / master students
■ Winner of CHE-ranking 2010
Hasso Plattner Institute for IT Systems Engineering, University of Potsdam
■ Research Topics
  ■ Semantic Web Technologies
  ■ Ontological Engineering
  ■ Information Retrieval
  ■ Multimedia Analysis & Retrieval
  ■ Social Networking
  ■ Data/Information Visualization
■ Research Projects:
Research Group Semantic Technologies & Multimedia Retrieval
Semantic Search
Contents:
■ Introduction
■ Media Analysis
■ Semantic Analysis
■ Semantic Search
■ Explorative Search
■ Realization
Albrecht Dürer: Melancholia I, 1514
Lecture 'Semantic Web', Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam
The 'Google Dilemma'
Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, Workshop 'Corporate Semantic Web', XInnovations 2011, Berlin, 19. Sep. 2011
Google Multimedia Search
Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, LDW 2011, Magdeburg, 30. Sep. 2011
Content Based Retrieval is Based on Textual Metadata
Google Multimedia Search
Search by Media Content
The Ordinary Archive is a Small World...
Jules Verne
But wouldn't it be nice if ...
Jules Verne
...but maybe you are also interested in
- George Melies (2 videos)
- Mark Twain (1 video)
- H.G. Wells (2 videos)
- science fiction (11 videos)
- adventure (20 videos)
- France (101 videos)
- Moon (33 videos)
- literature (434 videos)
- art (1,205 videos)
(Traditional) Information Retrieval
(Simplified) Information Retrieval Model
(acc. to Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
[Diagram labels: Set of Documents (files of records); Set of Queries (information requests); indexing; query formulation; indexing language; similarity]
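Evaluation of Information Retrieval Systems
R: relevant documents
P: retrieved documents
R ∩ P: relevant documents that have been retrieved
$$\mathrm{Recall} = \frac{|R \cap P|}{|R|} \qquad \mathrm{Precision} = \frac{|R \cap P|}{|P|}$$
$$F_\alpha = \frac{(1+\alpha)\cdot(\mathrm{Recall}\cdot\mathrm{Precision})}{\alpha\cdot(\mathrm{Recall}+\mathrm{Precision})}$$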
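The measures on this slide can be computed directly from two sets of document identifiers. The following is a minimal sketch: the function name `evaluate` and the document IDs are made up for illustration, and only the balanced case α = 1 is shown, where the F-measure reduces to the harmonic mean of recall and precision.

```python
def evaluate(relevant: set, retrieved: set) -> dict:
    """Recall, precision and balanced F-measure (alpha = 1) for one query."""
    hits = relevant & retrieved  # relevant documents that have been retrieved
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    total = recall + precision
    # For alpha = 1 the F-measure is 2 * R * P / (R + P).
    f1 = 2 * recall * precision / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical example: 4 relevant documents, 3 retrieved, 2 in common.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d3", "d4", "d5"}
scores = evaluate(relevant, retrieved)
print(scores)
```

Here two of the four relevant documents are found (recall 0.5) and two of the three retrieved documents are relevant (precision 2/3).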
Search Engines in the World Wide Web
• The World Wide Web is a distributed hypermedia system that
  • consists of multimedia documents and
  • is connected via hyperlinks
Search Engines in the WWW
Information Gathering via Web Crawler (Robot)
[Diagram with numbered steps 1 to 4; labels:]
URL list: http://www.xxxx.de/1234..., http://www.xxxx.de/2234..., http://www.xxxx.de/3234..., http://www.xxxx.de/4234..., http://www.xxxx.de/5234..., http://www.xxxx.de/6234..., http://www.xxxx.de/7234..., ...
HTTP Request to the WWW server
WWW server delivers requested HTML documents to the web crawler
HTML documents containing hyperlinks: <a href="..." .../>
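The crawl loop on this slide (take a URL from the list, request the document, extract its hyperlinks, append new URLs to the list) can be sketched as follows. To keep the sketch runnable offline, the HTTP request is replaced by a lookup in a hypothetical in-memory dictionary `FAKE_WEB`; the URLs and the `fetch` helper are assumptions for illustration, not part of the original slides.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical stand-in for the web: URL -> HTML document.
FAKE_WEB = {
    "http://example.org/a": '<a href="http://example.org/b">b</a> '
                            '<a href="http://example.org/c">c</a>',
    "http://example.org/b": '<a href="http://example.org/a">a</a>',
    "http://example.org/c": '<a href="http://example.org/d">d</a>',
}


def fetch(url):
    """Stand-in for the HTTP request; unknown URLs yield an empty document."""
    return FAKE_WEB.get(url, "")


def crawl(seed, max_pages=10):
    """Breadth-first crawl starting at the seed URL."""
    queue = deque([seed])   # the URL list
    seen = {seed}           # avoids requesting a URL twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


order = crawl("http://example.org/a")
print(order)
```

A production crawler would additionally respect robots.txt, rate limits and URL normalization, none of which are shown here.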
Data Normalization
Web Crawler
Data Analysis and creation of
index data structures
Preprocessing and Indexing
Search Engines in the WWW
Tokenization
Speech Identification
Word Stemming
POS-Tagging
Descriptor Generation
Document Preprocessing
Efficient Index Data Structures
Aachen
Altavista
Ananas
……
Zustand
Zypern
Index
Entry for the index term "Ananas":
DocID  Pos          Frequency  Weight
D123   1;13;77;132  4          9.4
D456   22;38        2          6.7
…      …            …          …
D998   15           1          1.2
Location List D123
Frequency URL <H1> … <H6> <title> … text
4 1 1 0 1 … 1
D123 http://producers.ananas.org/index.htm
<html><head><title>Ananas around the World</title></head><body> … </body></html>
Inverted File
File
Search Engines in the WWW
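A minimal sketch of such an inverted file: for each term it stores the documents that contain it together with the word positions, from which the frequency follows. The two sample documents and the helper names are hypothetical, and the weight column from the slide is left out because the slides do not say how it is computed.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]}, with positions counted from 1."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text), start=1):
            index[term].setdefault(doc_id, []).append(pos)
    return index


# Hypothetical mini-collection.
docs = {
    "D123": "Ananas around the world: ananas producers",
    "D456": "World markets for ananas",
}
index = build_inverted_index(docs)

for doc_id, positions in index["ananas"].items():
    print(doc_id, positions, "frequency:", len(positions))
```

Looking up a query term is then a single dictionary access instead of a scan over all documents.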
Relevance Ranking
• based on Link Popularity (Google PageRank)
Start: A = 1.0, B = 1.0, C = 1.0, D = 1.0
Iteration of the PageRank computation
Nr.  PR(A)  PR(B)   PR(C)   PR(D)
1    1.0    1.0     1.0     1.0
2    1.0    0.575   2.275   0.15
3    2.083  0.575   1.1912  0.15
…    …      …       …       …
n    1.49   0.7833  1.577   0.15
Resulting PageRank: A = 1.49, B = 0.78, C = 1.57, D = 0.15
Search Engines in the WWW
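The iteration on this slide can be reproduced with a few lines of code. Note the assumptions: the link structure of the four pages was a figure and did not survive in the text, so the graph below (A links to B and C, B to C, C to A, D to C) is inferred because it reproduces the numbers in the table; the damping factor d = 0.85 and the non-normalized update PR(p) = (1 - d) + d · Σ PR(q)/outdegree(q) are likewise inferred from the value 0.15 for the page without incoming links.

```python
# Assumed link structure (page -> pages it links to); inferred, not from the slides.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank_step(ranks, links, d=0.85):
    """One synchronous update of all PageRank values."""
    new_ranks = {}
    for page in ranks:
        incoming = sum(
            ranks[src] / len(targets)
            for src, targets in links.items()
            if page in targets
        )
        new_ranks[page] = (1 - d) + d * incoming
    return new_ranks


def pagerank(links, iterations=100, d=0.85):
    """Return the list of rank tables, starting with all ranks set to 1.0."""
    ranks = {page: 1.0 for page in links}
    history = [ranks]
    for _ in range(iterations):
        ranks = pagerank_step(ranks, links, d)
        history.append(ranks)
    return history


history = pagerank(LINKS)
for nr, ranks in enumerate(history[:3], start=1):
    print(nr, {page: round(value, 4) for page, value in ranks.items()})
print("n", {page: round(value, 4) for page, value in history[-1].items()})
```

Each row uses only the values of the previous row, which matches rows 2 and 3 of the table.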
The Web is big. Really big. You just won't believe how vastly, hugely, mind-bogglingly big it is. (...according to Douglas Adams)
Language has its fallacies...
in particular, if we don't know the language
Searching a (Multi) Media Archive
Step 1: Digitization of analog media
Step 2: Annotation with (text-based) metadata
Step 3: Content-based Retrieval based on available metadata
Today: Manual Annotation
Automated Media Analysis
• Visual Analysis
• Visual Concept Detection
• Text Recognition
• Face Detection
• Logo Detection
• Audio-Mining
• structural analysis
• Automated Speech Recognition
• audio event detection
■ Result: multimedia data with spatio-temporal annotations
temporal metadata (e.g. MPEG-7)
...
<Video>
  <TemporalDecomposition>
    <VideoSegment>
      <TextAnnotation>
        <KeywordAnnotation>
          <Keyword>Astronaut</Keyword>
        </KeywordAnnotation>
      </TextAnnotation>
      <MediaTime>
        <MediaTimePoint>T00:05:05:0F25</MediaTimePoint>
        <MediaDuration>PT00H00M31S0N25F</MediaDuration>
      </MediaTime>
      ...
    </VideoSegment>
  </TemporalDecomposition>
</Video>
...
time
Automated Media Analysis
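Temporal annotations of this kind can be read back with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a simplified copy of the snippet from the slide; the elided parts ("...") are simply left out, and real MPEG-7 documents carry XML namespaces and schema attributes that this illustration ignores. The function name `extract_annotations` is an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free copy of the slide's example.
SNIPPET = """
<Video>
  <TemporalDecomposition>
    <VideoSegment>
      <TextAnnotation>
        <KeywordAnnotation>
          <Keyword>Astronaut</Keyword>
        </KeywordAnnotation>
      </TextAnnotation>
      <MediaTime>
        <MediaTimePoint>T00:05:05:0F25</MediaTimePoint>
        <MediaDuration>PT00H00M31S0N25F</MediaDuration>
      </MediaTime>
    </VideoSegment>
  </TemporalDecomposition>
</Video>
"""


def extract_annotations(xml_text):
    """Return keywords, start time and duration for every video segment."""
    root = ET.fromstring(xml_text)
    results = []
    for segment in root.iter("VideoSegment"):
        keywords = [k.text.strip() for k in segment.iter("Keyword") if k.text]
        start = segment.findtext("MediaTime/MediaTimePoint", default="").strip()
        duration = segment.findtext("MediaTime/MediaDuration", default="").strip()
        results.append({"keywords": keywords, "start": start, "duration": duration})
    return results


annotations = extract_annotations(SNIPPET)
print(annotations)
```

Each segment then yields a keyword together with the time span in which it applies, which is what a time-based search index needs.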
■ Result: multimedia data with spatio-temporal annotations
time
Automated Media Analysis
spatial metadata (e.g. MPEG-7)
...
<SpatialDecomposition>
  <TextAnnotation>
    <KeywordAnnotation>
      <Keyword>Astronaut</Keyword>
    </KeywordAnnotation>
  </TextAnnotation>
  <SpatialMask>
    <SubRegion>
      <Polygon>
        <Coords>480 150 620 480</Coords>
      </Polygon>
    </SubRegion>
  </SpatialMask>
  ...
</SpatialDecomposition>
...
• Authoritative Metadata
  • structured data
  • semi-structured data
  • natural language text
• Non-authoritative Metadata
  • (free) user tags and comments
  • restricted vocabularies
• (Media) Analysis Metadata
  • low level features
  • high level features
• etc.
How to Determine the Meaning of Metadata?
Semantic Analysis
reliability
context
pragmatics
location dependency
accuracy
time dependency
level of abstraction
• MPEG-7 has been re-engineered to become an OWL-DL ontology (2007: Arndt et al., COMM model)
Multimedia Ontologies: Semantic Metadata
• Localize a region → Draw a bounding box
• Annotate the content → Interpret the content → Tag 'Astronaut'
Example: Tagging with an MPEG-7 Ontology
Reg1
mpeg7:image
mpeg7:depicts
“Man on the Moon“
mpeg7:spatial_decomposition Reg1
mpeg7:StillRegion
rdf:type
mpeg7:depicts
“Neil Armstrong“
mpeg7:SpatialMask
mpeg7:polygon
mpeg7:Coords
Multimedia Ontologies: Semantic Metadata
Neil Armstrong
Astronaut
is a
Person
is a
Science Occupation
subClassOf
Employment
subClassOf
Entities
Ontologies
has an
'Neil Armstrong' is more than just a character string
Kosmonaut
same as
Juri Gagarin
is a
is NOT a
Where does the knowledge come from...?
Web of Data = Linked Open Data
Where does the knowledge come from...?
:Neil_Armstrong rdf:type dbpedia-owl:Astronaut .
subject property object
:Neil_Armstrong
rdf:type
dbpedia-owl:Astronaut
Astronaut
Named Entity Mapping
Person
Neil Armstrong
Science Occupation
Employment
is a is a
subClassOf
subClassOf
rdfs:label Neil Armstrong
rdf:type dbpedia-owl:Astronaut
rdf:type foaf:Person
Semantic Multimedia Retrieval
Video Analysis /Metadata Extraction
[Timeline figure: metadata segments along a time axis]
e.g., person xy, location yz, event abc
e.g., bibliographical data, geographical data, encyclopedic data, ...
Entity Recognition/ Mapping
N. Ludwig, H. Sack: Named Entity Recognition for User-Generated Tags. In Proc. of the 8th Int. Workshop on Text-based Information Retrieval, IEEE CS Press, 2011
http://dbpedia.org/resource/Neil_Armstrong
Semantic Analysis: Named Entity Mapping
„Armstrong landed the Eagle on the Moon.“ Text
Entity Mapping
How do I find the right entity?
Armstrong
Semantic Analysis: Named Entity Mapping
„Armstrong landed the Eagle on the Moon.“ Text
Armstrong, Florida
Determine possible Entity Mapping Candidates
Armstrong, Ontario
Armstrong County, Texas
Armstrong Tunnel
Louis Armstrong
Armstrong Tools
Armstrong (moon crater)
Armstrong (car)
The Armstrongs
Craig Armstrong
Anton Armstrong
Edward Armstrong
Gary Armstrong
George Armstrong
The Armstrong Twins
Ian Armstrong
+ 400 more...
Neil Armstrong
Armstrong Bridge
Lance Armstrong
Armstrong, Ontario
Entity Candidate Generation
Semantic Analysis: Named Entity Mapping
„Armstrong landed the Eagle on the Moon.“ Text
Determine possible Entity Mapping Candidates
• linguistic analysis (POS tagging)
• normalization
• encoding and spelling
• special (language dependent) characters
• language dependent spellings
• abbreviations, acronyms
• type dependent spellings
• alternative names and synonyms
• fuzzy string mapping
• ...
Entity Candidate Generation
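Two of the steps listed above, normalization and fuzzy string mapping, can be sketched with the Python standard library. This is an illustration only: the label list is a hypothetical stand-in for the labels of a knowledge base such as DBpedia, the function names are made up, and the similarity cutoff of 0.8 is an arbitrary choice.

```python
import difflib
import unicodedata

# Hypothetical entity labels, standing in for a knowledge base.
LABELS = [
    "Neil Armstrong",
    "Louis Armstrong",
    "Lance Armstrong",
    "Armstrong, Ontario",
    "Armstrong (moon crater)",
    "Buzz Aldrin",
]


def normalize(text):
    """Strip diacritics and case so that spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold().strip()


def candidates(surface, labels, fuzzy_cutoff=0.8):
    """Return all labels that contain a token equal or similar to the surface form."""
    needle = normalize(surface)
    found = []
    for label in labels:
        cleaned = normalize(label)
        for char in ",()":
            cleaned = cleaned.replace(char, " ")
        tokens = cleaned.split()
        exact = needle in tokens
        fuzzy = difflib.get_close_matches(needle, tokens, n=1, cutoff=fuzzy_cutoff)
        if exact or fuzzy:
            found.append(label)
    return found


exact_matches = candidates("Armstrong", LABELS)
fuzzy_matches = candidates("Armstrng", LABELS)  # misspelled surface form
print(exact_matches)
print(fuzzy_matches)
```

Candidate generation deliberately favors recall: it returns every plausible entity and leaves the choice to the subsequent selection step.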
Semantic Analysis: Named Entity Mapping
„Armstrong landed the Eagle on the Moon.“ Text
Entity Selection is determined by
• context
• ambiguity of source data / mapping
• accuracy / reliability of source data / mapping
Armstrong, Florida
Armstrong, Ontario
Armstrong County, Texas
Armstrong Tunnel
Louis Armstrong
Armstrong Tools
Armstrong (moon crater)
Armstrong (car)
The Armstrongs
Craig Armstrong
Anton Armstrong
Edward Armstrong
Gary Armstrong
George Armstrong
Ian Armstrong
Neil Armstrong
Armstrong Bridge
Armstrong, Ontario
Entity Selection Process
Semantic Analysis: Named Entity Mapping
Temporal Context
Spatial Context
Context Item
Social Context
Contextual Description
Class Diversity
Level of Structure
Source Reliability
Source Diversity
Context Dimensions
Ambiguity, Accuracy
influences, influences
Relevance
determines
N.Steinmetz, H.Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013
„Armstrong landed the Eagle on the Moon.“ Text
SEMEX Multimedia Context Model
Semantic Analysis: Named Entity Mapping
„Armstrong landed the Eagle on the Moon.“
Consider all entities within the same context
Entity counts shown on the slide: 95 entities, 448 entities, 156 entities
Candidates for "Armstrong": George Armstrong Custer; Neil Armstrong; The Armstrong Twins; Armstrong, Florida; Armstrong, Ontario; Armstrong Automobile; Joe Armstrong; Armstrong County, Texas; Armstrong Gun; Craig Armstrong; Armstrong (Moon Crater); Louis Armstrong; Armstrong Tunnel; Louis Armstrong International Airport; Armstrong's Theorem; Sir Thomas Armstrong; Ian Armstrong; Armstrong (British Columbia); Karen Armstrong; Curtis Armstrong; Gillian Armstrong; Hilary Armstrong; William L. Armstrong
Candidates for "Eagle": Eagle (Bird); Eagle (heraldry); USCGC Eagle; The Eagle (2011 film); Eagle (song); John H. Eagle; Eagle (typeface); Eagle Falls (Washington); Eagle (Moon Crater); Eagle (comic); Eagle (lunar module); Eagle TV; The Eagle (Pub); War Eagle; The Eagle (newspaper); Eagle (racehorse); Angela Eagle; Linda Eagle; James Philipp Eagle
Candidates for "Moon": Man on the Moon (film); Moon (song); Moon Son-Ri; C Moon; The Moon (Tarot card); Edgar Moon; Moon OS; Moon (Band); Moon; Moon 44; Man on the Moon (soundtrack); William Moon; Lottie Moon; Mr. Moon (song); Man on the Moon (musical); Darvin Moon; Moon 83; Francis Moon; Gary Moon; Robert Charles Moon; Black Moon; Allan Moon; Ban-Ki Moon; Fly me to the Moon (song)
Semantic Analysis: Named Entity Recognition
Select matching entities from all possible candidate entities:
• Popularity based strategies
• Linguistic strategies
• Statistical strategies
• Semantic based strategies
General Approach
1. Make an assumption
2. Do the strategies support or contradict your assumption?
3. Make a decision according to logical and probabilistic rules/constraints
• reference text corpus (Wikipedia)
• link graph (Wikipedia)
• semantic graph (DBpedia)
N. Ludwig, H. Sack: Named Entity Recognition for User-Generated Tags, TIR 2011
Entity Selection Process
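One way to illustrate a semantic (graph based) selection strategy is to pick, for all ambiguous terms of a sentence at once, the combination of candidates that is most strongly interlinked. The sketch below is not the method of the cited papers: the candidate lists are shortened, the links are hypothetical stand-ins for a Wikipedia or DBpedia link graph, and the exhaustive search over all combinations only works for small candidate sets.

```python
from itertools import product

# Shortened candidate lists for the example sentence.
CANDIDATES = {
    "Armstrong": ["Neil Armstrong", "Louis Armstrong"],
    "Eagle": ["Eagle (lunar module)", "Eagle (Bird)"],
    "Moon": ["Moon", "Moon (song)"],
}

# Hypothetical undirected links between entities.
LINKS = {
    frozenset(("Neil Armstrong", "Eagle (lunar module)")),
    frozenset(("Neil Armstrong", "Moon")),
    frozenset(("Eagle (lunar module)", "Moon")),
    frozenset(("Louis Armstrong", "Moon (song)")),
}


def coherence(entities):
    """Number of linked pairs within one combination of candidates."""
    return sum(
        1
        for i, first in enumerate(entities)
        for second in entities[i + 1:]
        if frozenset((first, second)) in LINKS
    )


def select_entities(candidates):
    """Choose one candidate per term so that the combination is maximally linked."""
    terms = list(candidates)
    best = max(product(*(candidates[term] for term in terms)), key=coherence)
    return dict(zip(terms, best))


selection = select_entities(CANDIDATES)
print(selection)
```

The context resolves the ambiguity here: no single term decides, but only one combination of candidates is mutually connected.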
Semantic Analysis: Named Entity Recognition
Entity Selection Process: (Semantic) Graph Analysis
„Armstrong landed the Eagle on the Moon.“
[Figure: the candidate cloud from the slide "Consider all entities within the same context" is shown again (95 entities, 448 entities, 156 entities)]
Entities shown separately from the candidate cloud: Neil Armstrong; Eagle (lunar module); Moon; Louis Armstrong; Fly me to the Moon (song)
N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013
Turmbau zu Babel (The Tower of Babel), Pieter Brueghel, 1563
How to use semantic metadata in retrieval?
Semantic Search (one of many definitions...)
• Annotation of (text-based) metadata with semantic entities
• Entity-based Information Retrieval
• Make use of semantic relations, e.g. content-based similarities of relationships
• Interoperable metadata via semantic annotations
  • for content-based description
  • for structural / technical description (Multimedia Ontologies)
Overall Goal: Quantitative and qualitative improvement of Information Retrieval
Semantic metadata enable improvement of traditional keyword-based retrieval by
(1) Query String Extension/Refinement: enables more precise or more complete search results
(2) Cross Referencing: complements search results with additional associated or similar information
(3) Exploratory Search: enables visualization and navigation of the search space
(4) Reasoning: complements search results with implicitly given information
Semantic Search: Query String Extension
• Keyword-based search does not deliver all search results that are relevant for a query, because synonyms and metaphors might describe the queried content.
• Extension of the original query string (Query Extension)
  • from dictionaries and thesauri
    • extend query with synonyms, hyponyms (specializations), etc.
  • from domain ontologies
    • extend query with meronyms (part-of), related concepts, etc.
Original query string: Bank
possible extensions: Bank ∨ depository financial institution ∨ credit union ∨ acquirer ∨ federal reserve ∨ ...
→ increase recall
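The effect of query extension on recall can be shown with a toy example. The thesaurus entries below are the extensions listed on the slide; the three documents, the substring matching and the function names are assumptions made for illustration.

```python
# Toy thesaurus; the entries are the extensions listed on the slide.
THESAURUS = {
    "bank": ["depository financial institution", "credit union",
             "acquirer", "federal reserve"],
}

# Hypothetical mini-collection.
DOCS = {
    1: "The bank raised its interest rates.",
    2: "A credit union offers loans to its members.",
    3: "Erosion damaged the river bank.",
}


def expand(term, thesaurus):
    """Return the original term plus all thesaurus alternatives (a disjunction)."""
    return [term] + thesaurus.get(term.lower(), [])


def search(alternatives, docs):
    """Return the IDs of documents containing at least one alternative."""
    wanted = [alternative.lower() for alternative in alternatives]
    return sorted(
        doc_id for doc_id, text in docs.items()
        if any(alternative in text.lower() for alternative in wanted)
    )


plain_hits = search(["Bank"], DOCS)
expanded_hits = search(expand("Bank", THESAURUS), DOCS)
print(plain_hits, expanded_hits)
```

The extended query additionally finds document 2, which never mentions the word "bank". Document 3 (the river bank) is still returned, which shows that extension alone raises recall but does nothing for precision.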
Semantic Search: Query String Refinement
• Keyword-based search also delivers search results that are not relevant for a query, because query terms and document terms might be ambiguous.
• Refinement of the original query string (Query Refinement)
  • from dictionaries and thesauri
    • disambiguate polysemic terms with hypernyms (generalizations)
  • from domain ontologies
    • disambiguate polysemic terms with holonyms
Original query string: Bank
possible refinements:
(1) Bank ∧ financial institution
(2) Bank ∧ incline ∧ slope ∧ side
(3) Bank ∧ container
(4) Bank ∧ deposit ∧ repository
→ increase precision
Semantic Search: Cross Referencing
• Provide search results that do not literally contain the query string but are closely related to the query by content
• Apply domain ontologies for determining related concepts
• Apply statistical analysis of large (text) document corpora
dbpedia:Neil_Armstrong
dbpedia:Apollo_11
dbprop:mission
Neil Armstrong NER
dbprop:mission
dbprop:mission
query string
dbpedia:Buzz_Aldrin
dbpedia:Michael_Collins
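The cross-referencing idea in this example can be sketched over a small list of triples: entities that share a property value with the queried entity are offered as related results. The direction of the `dbprop:mission` edges is an assumption, since the diagram did not survive extraction, and the function name is made up for illustration.

```python
# Triples (subject, property, object); edge direction assumed from the slide labels.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbprop:mission", "dbpedia:Apollo_11"),
]


def cross_references(entity, triples):
    """Return other subjects that share a (property, object) pair with the entity."""
    shared = {(prop, obj) for subj, prop, obj in triples if subj == entity}
    related = {
        subj for subj, prop, obj in triples
        if subj != entity and (prop, obj) in shared
    }
    return sorted(related)


related = cross_references("dbpedia:Neil_Armstrong", TRIPLES)
print(related)
```

A query for "Neil Armstrong" that has been mapped to `dbpedia:Neil_Armstrong` can thus also surface material about the other two crew members, although their names do not occur in the query string.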
Semantic Search: Exploratory Search
• Provide additional search results that do not necessarily contain the query string but are related to the query by content, or are related to the search results of the direct query
• Apply domain ontologies and heuristics to determine the relevance of facts
dbpedia:Apollo_11
category:Apollo_program
dbpedia:Apollo_13
dcterms:subject
yago:Space_accidents_and_incidents
rdf:type
rdf:type
dbpedia-owl:mission
dbpedia:Neil_Armstrong
dbpedia:Space_Shuttle_Challenger
dcterms:subject
Semantic Search: Reasoning
• Provide additional search results (and information) that do not necessarily contain the query string but are related to the query by content, whereby the relation may not be a direct one, but can be derived via entailment.
• Apply domain ontologies, reasoning algorithms and heuristics to find new facts and determine the relevance of facts
Semantic Search: Reasoning
dbpedia:Neil_Armstrong
dbpedia:Apollo_11
dbpedia-owl:mission
category:Missions_to_the_Moon
dcterms:subject
category:Exploration_of_the_Moon
dcterms:subject
category:Spaceflight
skos:broader
dbpedia:Moon category:Animals_in_Space
dcterms:subject skos:broader
Example: query string = Neil Armstrong
(Hard) questions to solve via reasoning:
• Will there be the Moon or documents about the Moon in the search results?
• How is Neil Armstrong related to the Moon? (is he?)
• Was Neil Armstrong (really) on the Moon?
• ...
category:Moon
skos:broader
http://www.gocomics.com/calvinandhobbes/
Searching is not always just searching
I‘m looking for the book „Brave New World“ by Aldous Huxley in the first German edition...
Brave New World. - Aldous H U X L E Y.
- The Albatros Continental Library, 47
(Hamburg usw., Albatros Verlag, 1933)
257 S. 8“
II 1, 2506, 34548
I really liked „Brave New World“ by Aldous Huxley but how should I find what to read next...?
Exploratory Search
• What if the user does not know which query string to use?
• What if the user is looking for complex answers?
• What if the user does not know the domain he/she is looking for?
• What if the user wants to know all(!) about a specific topic?
• ... 'Browsing' instead of 'Searching'
• ... to find something by chance, i.e. Serendipity
• ... to get an overview
• ... enable content based navigation
Gather knowledge about dbpedia:Brave_New_World and decide which interesting fact to follow....
http://dbpedia.org/page/Brave_New_World
Enable Exploratory Search based on Linked Open Data
dbpedia:Brave_New_World
dbpedia-owl:author
dbpedia:Aldous_Huxley
dbpedia-owl:author
dbpedia-owl:author
dbpedia-owl:author
dbpedia:Brave_New_World
dbpedia-owl:author
dbpedia:Aldous_Huxley
dbpedia:ontology/influences
dbpedia:H._G._Wells
dbpedia:ontology/influences
dbpedia:George_Orwell
dbpedia:ontology/influences
dbpedia:Michel_Houellebecq
dbpedia:H._G._Wells dbpedia:George_Orwell dbpedia:Michel_Houellebecq
dbpedia-owl:notableWork
dbpedia:Les_Particules_élémentaires
dbpedia-owl:notableWork
dbpedia:Nineteen_Eighty-Four
dbpedia-owl:notableWork
dbpedia:The_Time_Machine
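The navigation path of the last three slides (book → author → related authors → their notable works) can be sketched as a walk over triples. The identifiers are taken from the slides; the direction of the edges is not recoverable from the extracted diagrams, so the sketch follows each edge in either direction, and the helper names are assumptions.

```python
AUTHOR = "dbpedia-owl:author"
INFLUENCES = "dbpedia:ontology/influences"
NOTABLE_WORK = "dbpedia-owl:notableWork"

# Triples assembled from the slide labels; edge directions are assumed.
TRIPLES = [
    ("dbpedia:Brave_New_World", AUTHOR, "dbpedia:Aldous_Huxley"),
    ("dbpedia:Aldous_Huxley", INFLUENCES, "dbpedia:H._G._Wells"),
    ("dbpedia:Aldous_Huxley", INFLUENCES, "dbpedia:George_Orwell"),
    ("dbpedia:Aldous_Huxley", INFLUENCES, "dbpedia:Michel_Houellebecq"),
    ("dbpedia:H._G._Wells", NOTABLE_WORK, "dbpedia:The_Time_Machine"),
    ("dbpedia:George_Orwell", NOTABLE_WORK, "dbpedia:Nineteen_Eighty-Four"),
    ("dbpedia:Michel_Houellebecq", NOTABLE_WORK, "dbpedia:Les_Particules_élémentaires"),
]


def neighbours(node, predicate, triples):
    """Nodes connected to `node` via `predicate`, ignoring edge direction."""
    result = []
    for subj, prop, obj in triples:
        if prop != predicate:
            continue
        if subj == node:
            result.append(obj)
        elif obj == node:
            result.append(subj)
    return result


def suggest_next_reads(book, triples):
    """Follow author -> influences -> notableWork starting from a book."""
    suggestions = []
    for author in neighbours(book, AUTHOR, triples):
        for other in neighbours(author, INFLUENCES, triples):
            for work in neighbours(other, NOTABLE_WORK, triples):
                if work != book and work not in suggestions:
                    suggestions.append(work)
    return suggestions


next_reads = suggest_next_reads("dbpedia:Brave_New_World", TRIPLES)
print(next_reads)
```

In a real system the triples would come from a Linked Open Data source, and heuristics would decide which of the many available properties are worth following.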
dbpedia-owl:author
dbpedia:Aldous_Huxley
...and now please surprise me.....SERENDIPITY
dbpedia:Tim_Berners-Lee
rdf:type
dbpedia:World_Wide_Web
dbpprop:inventor
Yago:EnglishExpatriatesInTheUnitedStates
rdf:type
rdf:type
dbpedia:Patrick_Stewart
dbpedia:Star_Trek:_The_Next_Generation
dbpedia-owl:starring
Explorative Search
dbpedia-owl:mission
dbpedia:Neil_Armstrong
dbpedia:Apollo_11
dbpedia-owl:mission
category:Apollo_program
dcterms:subject
dbpedia:Apollo_13
dcterms:subject
yago:Space_accidents_and_incidents
rdf:type
rdf:type
dbpedia:Space_Shuttle_Challenger
dbpedia-owl:mission
dbpedia:Buzz_Aldrin
dbpedia:Michael_Collins
Exploratory Search and Serendipity
•Find something that you were not looking for on purpose ...
dbpedia:Buzz_Aldrin
dbpedia:Cookie_Monster
dbpedia:Strictly_Come_Dancing
Entity Based Search
• linguistic ambiguities of traditional keyword based search can be avoided
• enables high precision and high recall retrieval
http://www.yovisto.com/labs/autosuggestion/
• Query string refinement / extension
• entity auto-suggestion
• interpretation of natural language queries
J. Osterhoff, J. Waitelonis, H. Sack, Widen the Peepholes! Entity-Based Auto-Suggestion as a rich and yet immediate Starting Point for Exploratory Search, IVDW 2012
http://mediaglobe.yovisto.com:8080/mggui-dev2/
search facets
C. Hentschel, H. Sack, et al., Open up cultural heritage in video archives with mediaglobe, I2CS 2012
Exploratory Search with yovisto
J. Waitelonis, H. Sack: Towards exploratory video search using linked data, MTAP Volume 59, Number 2 (2012), 645-672
http://mediaglobe.yovisto.com:8080/
Contact:
Dr. Harald Sack
Hasso-Plattner-Institut für Softwaresystemtechnik
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam
Homepage: http://www.hpi.uni-potsdam.de/meinel/team/sack.html
Blog: http://yovisto.blogspot.com/
E-Mail: [email protected]
Twitter: lysander07 / biblionomicon / yovisto
Slides can be found at http://slideshare.com/lysander07/
More about Semantic Web Technologies at http://www.openhpi.de/
Thank you very much for your attention!