Antoine Isaac Europeana – VU University Amsterdam Dagstuhl Multilingual Semantic Web seminar


Page 1: Antoine Isaac Europeana – VU University Amsterdam

Antoine Isaac

Europeana – VU University Amsterdam

Dagstuhl Multilingual Semantic Web seminar

Page 2: Antoine Isaac Europeana – VU University Amsterdam

Europeana

24 M objects (images, text, sound and video)

From over 2,200 libraries, museums and archives

From 33 countries

For everyone

“A digital library that is a single, direct and multilingual access point to the European cultural heritage.”

European Parliament

Page 3: Antoine Isaac Europeana – VU University Amsterdam

Multilingual Access in Europeana

Page 4: Antoine Isaac Europeana – VU University Amsterdam

Dimensions of multilingual access

Interface

Search (query translation or document translation)

Result presentation

Browsing

Page 5: Antoine Isaac Europeana – VU University Amsterdam

Europeana's efforts

Interface translated into 26 languages

Query translation: prototype only

Query result filtering by country/language

Document translation (user enabled)

Semantic contextualization of objects

• Multilingual enrichment/annotation of metadata

Page 6: Antoine Isaac Europeana – VU University Amsterdam

Making metadata work for multilingual access

Page 7: Antoine Isaac Europeana – VU University Amsterdam

Current metadata in Europeana

Simple object records

Flat (text values)

Without language tags!

Only language-related info on metadata is at collection level

• Can be "mul"

Need to change!

a new Europeana Data Model (EDM)
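A minimal sketch of why collection-level language information is not enough. The record, its field names and the `tag_values` helper are hypothetical and not part of Europeana's data model; the only fact assumed is that "mul" is the ISO 639-2 code for "multiple languages", so it cannot be used as the language of any single value.

```python
from typing import Dict, Optional, Tuple

# Hypothetical flat record: plain text values without language tags.
record = {"title": "Tafelklavier", "subject": "Musikinstrument"}


def tag_values(
    record: Dict[str, str], collection_language: Optional[str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Pair each value with a language tag, or None when it cannot be known."""
    # "mul" only says that the collection mixes languages, so it gives
    # no usable tag for an individual value.
    usable = collection_language if collection_language not in (None, "mul") else None
    return {field: (value, usable) for field, value in record.items()}


print(tag_values(record, "de"))   # every value inherits the collection language
print(tag_values(record, "mul"))  # every value stays untagged
```

With a single-language collection the inherited tag is a guess that happens to be right most of the time; with "mul" there is nothing to inherit, which is the case the slide flags.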

Page 8: Antoine Isaac Europeana – VU University Amsterdam

"Semantic layer" of contextual resources (concepts, persons, places, events...)

Networked objects

Cultural artefact

Painting, Sculpture, Building

Exploiting semantic relations, e.g. “broader concept”, “place of birth”, “involved person”…

Page 9: Antoine Isaac Europeana – VU University Amsterdam

Multilingual metadata

Page 10: Antoine Isaac Europeana – VU University Amsterdam

Fetching already available linked data

http://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset/

E.g., from libraries

Page 11: Antoine Isaac Europeana – VU University Amsterdam

Interoperability

Encouraging the use of RDF + common and simple elements

Page 12: Antoine Isaac Europeana – VU University Amsterdam

Interoperability

Encouraging the use of common and simple data elements

<skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2308">
  <skos:prefLabel xml:lang="fr">Piano carré</skos:prefLabel>
  <skos:prefLabel xml:lang="it">Pianoforte a tavolino</skos:prefLabel>
  <skos:prefLabel xml:lang="en">Square pianoforte</skos:prefLabel>
  <skos:prefLabel xml:lang="de">Tafelklavier</skos:prefLabel>
  <skos:prefLabel xml:lang="nl">Tafelpiano</skos:prefLabel>
  <skos:prefLabel xml:lang="sv">Taffel</skos:prefLabel>
  <skos:broader>
    <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2273">
      <skos:prefLabel xml:lang="en">Pianofortes</skos:prefLabel>
    </skos:Concept>
  </skos:broader>
</skos:Concept>
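As a rough illustration of how such language-tagged labels can drive result presentation, the sketch below reads the concept with Python's standard library and picks a label for the user's language. The `rdf:RDF` wrapper with namespace declarations, the helper names and the English fallback are assumptions added to make the fragment parseable; they are not from the slides.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The slide's fragment, wrapped in rdf:RDF so the prefixes are declared.
SNIPPET = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:skos="{SKOS_NS}">
  <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2308">
    <skos:prefLabel xml:lang="fr">Piano carré</skos:prefLabel>
    <skos:prefLabel xml:lang="it">Pianoforte a tavolino</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Square pianoforte</skos:prefLabel>
    <skos:prefLabel xml:lang="de">Tafelklavier</skos:prefLabel>
    <skos:prefLabel xml:lang="nl">Tafelpiano</skos:prefLabel>
    <skos:prefLabel xml:lang="sv">Taffel</skos:prefLabel>
    <skos:broader>
      <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2273">
        <skos:prefLabel xml:lang="en">Pianofortes</skos:prefLabel>
      </skos:Concept>
    </skos:broader>
  </skos:Concept>
</rdf:RDF>
""".strip()


def pref_labels(concept):
    """Map language tag -> preferred label for one concept element."""
    # findall with a bare tag only visits direct children, so the labels
    # of the nested broader concept are not mixed in.
    return {
        el.get(XML_LANG): el.text
        for el in concept.findall(f"{{{SKOS_NS}}}prefLabel")
    }


def label_for(labels, lang, fallback="en"):
    """Return the label in `lang`, or the fallback language if missing."""
    return labels.get(lang) or labels.get(fallback)


root = ET.fromstring(SNIPPET)
concept = root.find(f"{{{SKOS_NS}}}Concept")
labels = pref_labels(concept)

print(label_for(labels, "nl"))  # Dutch label is available
print(label_for(labels, "es"))  # no Spanish label, falls back to English
```

A production system would more likely use an RDF library and a richer fallback chain; plain XML parsing is used here only to keep the sketch dependency-free.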

Page 13: Antoine Isaac Europeana – VU University Amsterdam

Interoperability

mixed nature of eligible contextual resources: dictionaries, synonym/translation lists, thesauri, authority lists, gazetteers…

interplay: “semantic” data next to multilingual data

Page 14: Antoine Isaac Europeana – VU University Amsterdam

Simultaneous approaches

Getting richer semantic/multilingual metadata from providers

Fetching third-party contextual data and linking it to “un-contextualized” objects

Linking contextual data from an institution to another more general / more commonly used contextual dataset

• DBpedia.org, VIAF.org…
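One simple way to propose links between an institution's contextual data and a more general dataset is to look for labels shared in the same language. The sketch below is only that: the URIs, labels and the `candidate_links` helper are made up for illustration, and the output should be read as candidates for review (for example as possible `skos:exactMatch` statements), not as confirmed links.

```python
from collections import defaultdict

# Hypothetical concepts: URI -> set of (language, label) pairs.
local = {
    "http://example.org/local/1": {("en", "Square pianoforte"), ("de", "Tafelklavier")},
    "http://example.org/local/2": {("en", "Lute")},
}
general = {
    "http://example.org/general/A": {("de", "Tafelklavier"), ("en", "Square piano")},
    "http://example.org/general/B": {("en", "Harp")},
}


def candidate_links(source, target):
    """Pair concepts that share at least one label in the same language."""
    # Index the target dataset by (language, case-folded label).
    index = defaultdict(set)
    for uri, labels in target.items():
        for lang, text in labels:
            index[(lang, text.casefold())].add(uri)

    links = set()
    for uri, labels in source.items():
        for lang, text in labels:
            for match in index.get((lang, text.casefold()), ()):
                links.add((uri, match))
    return sorted(links)


for src, tgt in candidate_links(local, general):
    print(src, "-> candidate match ->", tgt)
```

Here the German label links the two piano concepts even though their English labels differ, which is one reason multilingual labels help alignment.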

Page 15: Antoine Isaac Europeana – VU University Amsterdam

Status and challenges

Page 16: Antoine Isaac Europeana – VU University Amsterdam

Current status

All this is work in progress and will take time

R&D prototypes (EuropeanaConnect) showing the challenges of gathering appropriate multilingual tools and data

First tests of simple techniques in production portal: GeoNames (places) and GEMET (concepts)

Encouraging, but they illustrate the issues with overly naïve approaches (no NLP) and incomplete data

• Cheval

• Poison

http://www.europeana.eu
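The slides do not spell out what went wrong with these two words, so the following is only one plausible failure mode: matching metadata words against a gazetteer by exact string, with no knowledge of the text's language. The toy gazetteer, its identifier, the common-word list and both functions are hypothetical.

```python
# Toy gazetteer with a made-up identifier; a real one would be far larger.
GAZETTEER = {"cheval": "place:cheval"}

# Hypothetical per-language list of ordinary words that should not be
# treated as place names when the text is known to be in that language.
COMMON_WORDS = {"fr": {"cheval", "poison"}}


def naive_enrich(text):
    """Link every word that exactly matches a place name, ignoring language."""
    return [GAZETTEER[w] for w in text.lower().split() if w in GAZETTEER]


def guarded_enrich(text, lang):
    """Same matching, but skip words that are common in the text's language."""
    common = COMMON_WORDS.get(lang, set())
    return [
        GAZETTEER[w]
        for w in text.lower().split()
        if w in GAZETTEER and w not in common
    ]


title = "Cheval de trait"  # French: "draught horse"
print(naive_enrich(title))          # the horse is linked to a place
print(guarded_enrich(title, "fr"))  # no link once the language is known
```

The guard only works when the language of the metadata value is known, which ties this issue back to the missing language tags in current records.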

Page 17: Antoine Isaac Europeana – VU University Amsterdam

Problems & requirements

For providers & Europeana

Continue work on metadata

Benchmarking (cf. CHiC lab @ CLEF)

Positioning as consumers and contributors of data (cf. Asun’s slides)

data.europeana.eu

For language-intensive tools and resources

Availability: open resources

Interoperability

Simplicity

• But not always! E.g., not only “first hit” translations

Scale: scalability of tools, number and scope of datasets

Many languages, some lesser-resourced (compared with English)

Page 18: Antoine Isaac Europeana – VU University Amsterdam

Another illustration: VOICES project

Something entirely different but not completely unrelated

Voice-based community-centric mobile services for social development

Easing communication on agricultural trade

Listing of products/prices via phone/radio

Pilot in Mali

Challenges

Data-centric project, but language technology plays a crucial role

Objects should be provided with textual and audio labels (text-to-speech system) in different languages

Local languages: e.g., Bambara

Lack of resources: need low-cost, easy-to-adapt solutions

Victor de Boer, VU Amsterdam

Page 19: Antoine Isaac Europeana – VU University Amsterdam

Thank you


http://www.few.vu.nl/~aisaac/

Some slides based on Marlies Olensky and Juliane Stiller, Multilingual Web Workshop, June 11, 2012, Dublin