Multilingual Information Exchange APAN, Bangkok 27 January 2005 Margherita.sini@fao.org

Multilingual Information Exchange

APAN, Bangkok

27 January 2005

Margherita.sini@fao.org

The general problem

• Searching for multilingual resources is not easy:
– on the web
– in metadata catalogues / bibliographical databases
– in full-text documents

• Results are generally in the language used in the search query

=> We need a multilingual approach and multilingual tools (Thesauri / Ontologies, etc.)

What we can achieve (1): Multilingual concept resolution

• With a multilingual thesaurus or ontology we can find resources in any language, because we can achieve multilingual concept resolution

What we can achieve (2): Brokering

With a multilingual thesaurus or ontology we can find resources from several sources even if we do not know the terminology or the language used in those sources

[Diagram: terms from several sources and languages (vessels; crafts; fishing vessels; ships; navio; navire; 船舶; bateau de pêche; fishing boat) linked to the concept "fishing vessel"]

Results in multiple languages from multiple databases
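
The brokering idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the concept identifier, the two toy catalogues and their records, and the function names are all hypothetical, and a real broker would query actual databases and load labels from a thesaurus such as AGROVOC.

```python
# Hypothetical term-to-concept data; hard-coded here for illustration only.
CONCEPT_LABELS = {
    "fishing_vessel": {
        "en": ["fishing vessel", "fishing boat"],
        "fr": ["bateau de pêche"],
    },
}

# Two toy "databases", each indexed in a different language (made-up records).
SOURCES = {
    "english_catalogue": [
        {"title": "Report A", "keywords": ["fishing boat"]},
        {"title": "Report B", "keywords": ["aquaculture"]},
    ],
    "french_catalogue": [
        {"title": "Rapport C", "keywords": ["bateau de pêche"]},
    ],
}


def find_concept(term):
    """Return the id of the concept that has `term` as a label, or None."""
    wanted = term.casefold()
    for concept_id, labels_by_lang in CONCEPT_LABELS.items():
        for labels in labels_by_lang.values():
            if wanted in (label.casefold() for label in labels):
                return concept_id
    return None


def broker_search(term):
    """Search every source with all labels of the concept behind `term`."""
    concept_id = find_concept(term)
    if concept_id is None:
        return []
    all_labels = {
        label.casefold()
        for labels in CONCEPT_LABELS[concept_id].values()
        for label in labels
    }
    hits = []
    for source_name, records in SOURCES.items():
        for record in records:
            if all_labels & {k.casefold() for k in record["keywords"]}:
                hits.append((source_name, record["title"]))
    return hits
```

A query for "fishing vessel" finds records in both catalogues, although neither of them uses that exact term.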

How to build a multilingual Thesaurus / Ontology

• Lexicalizations of concepts in multiple languages:
– { … fishing boat; bateau de pêche; 捕捞渔船 … }

• For every language we can have synonyms:
– { … fishing vessel, fishing boat, fishing craft … }
– { … bateau de pêche, navire de pêche, … }
– { … 捕捞渔船, … }
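
A minimal Python sketch of such a structure, assuming one record per concept with a list of labels per language. The class name, the identifier, and the rule that the first label in each list is the preferred one are assumptions of this sketch, not something the slides specify.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A language-independent concept with its labels in each language."""
    concept_id: str  # hypothetical identifier, not an AGROVOC code
    # language code -> labels; the first label is treated as preferred
    # (an assumption of this sketch).
    labels: dict = field(default_factory=dict)

    def preferred(self, lang):
        """Return the first label for `lang`, or None if there is none."""
        labels = self.labels.get(lang, [])
        return labels[0] if labels else None


FISHING_VESSEL = Concept(
    concept_id="c_fishing_vessel",
    labels={
        "en": ["fishing vessel", "fishing boat", "fishing craft"],
        "fr": ["bateau de pêche", "navire de pêche"],
        "zh": ["捕捞渔船"],
    },
)


def build_index(concepts):
    """Map every label (case-folded) to the concept that carries it."""
    index = {}
    for concept in concepts:
        for labels in concept.labels.values():
            for label in labels:
                index[label.casefold()] = concept
    return index


def translate(term, target_lang, index):
    """Resolve `term` to a concept and return its label in `target_lang`."""
    concept = index.get(term.casefold())
    return concept.preferred(target_lang) if concept else None
```

Because every synonym points to the same concept, any of them can serve as an entry point; the concept, not the string, carries the link between languages.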

FAO activities (ongoing)

• Food safety ontology (English, Spanish, French)
• Fishery ontology (English, Chinese)
• Food and Nutrition ontology-based portal (English, Spanish, French)
• Extensive work with AGROVOC
– RDFS / OWL version
– Semantic refinements
– Expand multilingual coverage
– Expand subject coverage

The multilingual vocabulary...

• Must cover all concepts of interest to the users in the various languages,

• ... at a minimum all domain concepts lexicalized in any of the participating languages

• Must accommodate hierarchical structures suggested by different languages

(Dr. Soergel)

Problems (1)

• Translating an English thesaurus into German does not make a German thesaurus
=> whenever possible we need to consider the concept in its entirety (many languages, definitions, “surrounding context”, etc.)

• Equivalence of terms holds only in some contexts

• More difficult to translate non-specialized terms

(Dr. Soergel)

Problems (2)

• Two terms mean almost the same thing but differ slightly in meaning or connotation:
– English: alcoholism
– French: alcoolisme
– English: vegetable (includes potatoes)
– German: Gemüse (does not include potatoes)

• If the difference is big enough, one needs to introduce two separate concepts under a broader term; otherwise a scope note needs to clearly instruct indexers in all languages how the term is to be used so that the indexing stays, as far as possible, free from cultural bias or reflects multiple biases by assigning several descriptors. (Dr. Soergel)
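
The first option (two separate concepts under a broader term, each with a scope note) might look like this in Python. The identifiers, the label of the broader concept, and the scope-note wording are invented for illustration; only the vegetable / Gemüse distinction comes from the example above.

```python
# Hypothetical concept records keyed by made-up identifiers.
CONCEPTS = {
    "vegetables_broad": {
        "labels": {"en": "vegetables (broad sense)"},
        "broader": None,
        "scope_note": "Umbrella concept for both usages below.",
    },
    "vegetables_incl_potatoes": {
        "labels": {"en": "vegetable"},
        "broader": "vegetables_broad",
        "scope_note": "Includes potatoes.",
    },
    "vegetables_excl_potatoes": {
        "labels": {"de": "Gemüse"},
        "broader": "vegetables_broad",
        "scope_note": "Does not include potatoes.",
    },
}


def narrower_of(concept_id):
    """Return the ids of the concepts directly below `concept_id`, sorted."""
    return sorted(
        cid for cid, concept in CONCEPTS.items()
        if concept["broader"] == concept_id
    )


def expand(concept_id):
    """Return the concept and every concept below it in the hierarchy."""
    result = [concept_id]
    for child in narrower_of(concept_id):
        result.extend(expand(child))
    return result
```

An indexer picks the narrower concept whose scope note fits the document, while a search on the broader concept can be expanded to cover both.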

Available resources: example

• SuperThes, ...

• SWAD-Europe initiative: thesaurus activities
– RDF encoding of multilingual thesauri

• Multilingual labelling approach (mirroring relations for every language)

• Interlingual mapping approach (different structures to be mapped)

SWAD-Europe: Inter-Thesaurus Mapping

• SKOS mapping:
– Exact
– Inexact
– Major
– Minor
– Partial
– Broad
– Narrow
– AND
– OR
– NOT

Inter-Thesaurus Mapping: example

<ag:Concept>
  <descriptor xml:lang="fr">Academie</descriptor>
  <map:exactMatch>
    <map:AND>
      <map:memberList rdf:parseType="Collection">
        <aat:Concept>
          <descriptor xml:lang="en">Academy</descriptor>
        </aat:Concept>
        <aat:Concept>
          <descriptor xml:lang="en">Buildings</descriptor>
        </aat:Concept>
      </map:memberList>
    </map:AND>
  </map:exactMatch>
</ag:Concept>
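
One plausible way to use such a mapping at search time is sketched below in Python. It reads the AND combination as "a record in the target thesaurus must carry both descriptors"; that reading, the tuple representation of the mapping, the toy records, and the function names are assumptions of this sketch rather than part of the SWAD-Europe proposal.

```python
# The mapping from the example above as nested tuples:
# source descriptor -> (relation, expression over target descriptors).
MAPPINGS = {
    "Academie": ("exactMatch", ("AND", "Academy", "Buildings")),
}

# Made-up records indexed with target-thesaurus descriptors.
RECORDS = [
    {"id": 1, "descriptors": {"Academy", "Buildings"}},
    {"id": 2, "descriptors": {"Academy"}},
    {"id": 3, "descriptors": {"Buildings", "Museums"}},
]


def matches(expr, descriptors):
    """Evaluate a mapping expression against a record's descriptor set."""
    if isinstance(expr, str):
        return expr in descriptors
    op, *args = expr
    if op == "AND":
        return all(matches(arg, descriptors) for arg in args)
    if op == "OR":
        return any(matches(arg, descriptors) for arg in args)
    if op == "NOT":
        return not matches(args[0], descriptors)
    raise ValueError(f"unknown operator: {op}")


def records_for(source_term, records):
    """Return the ids of records that satisfy the mapping of `source_term`."""
    _relation, expr = MAPPINGS[source_term]
    return [r["id"] for r in records if matches(expr, r["descriptors"])]
```

Only the record indexed with both "Academy" and "Buildings" is returned for the French descriptor; the OR and NOT operators from the mapping list are handled by the same evaluator.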

Available resources: another possibility

• Use OWL
– Define concepts
– Define terms
– Define strings
– Define relationships between these 3 elements:
• <similarTo>, <equivalentTo>, (+ SKOS suggestions)
• <hasSynonym>, <hasAntonym>, <hasCognate>
• <hasSpellingVariant>, <hasTranslation>
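
The relationship names above can be tried out without an OWL toolchain by storing them as plain subject / predicate / object triples, as in the Python sketch below. It covers only term-to-term links and leaves out the concept and string layers; the triples themselves and the helper names are illustrative assumptions.

```python
from collections import deque

# (subject term, relationship, object term); the data are illustrative only.
TRIPLES = [
    ("fishing vessel", "hasSynonym", "fishing boat"),
    ("fishing vessel", "hasTranslation", "bateau de pêche"),
    ("fishing vessel", "hasTranslation", "捕捞渔船"),
    ("bateau de pêche", "hasSynonym", "navire de pêche"),
]


def related(term, predicate):
    """Return the objects linked from `term` by `predicate`, in stored order."""
    return [o for s, p, o in TRIPLES if s == term and p == predicate]


def connected_terms(start):
    """Collect every term linked to `start`, following links both ways."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for s, _p, o in TRIPLES:
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    queue.append(b)
    return seen
```

Starting from any one term, `connected_terms` gathers all synonyms and translations that are linked to it directly or indirectly, which is the set of strings a multilingual search would need.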

Available resources: other techniques (NLP)

• Knowledge discovery: helps in creating ontologies in a specific language

• Used to build good information systems (IS)
– Concept extraction
– Multilingual search engine

• …

Conclusion

• We need multilingual tools
– Ontologies are better than traditional thesauri

• The task is not easy
– Subject experts are essential
– NLP could help

• We need tools
– To help experts carry out the mapping
– To do annotations
– …

Live demo

http://www.fao.org

Thank you.