AGROVOC and the NAL Agricultural Thesaurus at the
Ontology Alignment Evaluation Initiative
Willem Robert van Hage
TNO Industrie & Techniek / Vrije Universiteit Amsterdam
Overview
• AGROVOC & NALT
• the Goal
  – federated access to heterogeneous data sources
• the Method
  – align the ontologies
  – the OAEI workshop (2006 and 2007)
• a challenge for ontology alignment researchers
• an opportunity for environmental researchers
AGROVOC & NALT
• AGROVOC: 28,000 descriptor terms, 10,000 non-descriptor terms
  – ten languages (currently 17 and another 4 under construction)
  – used to index AGRIS/CARIS (and other resources)
• NALT: 41,000 descriptor terms, 24,000 non-descriptor terms
  – English (currently also Spanish)
  – used to index AGRICOLA, FSRIO, AgNIC, NALDR (and other resources)
the Goal
[Figure: AGROVOC (indexing AGRIS/CARIS) and NALT (indexing AGRICOLA, FSRIO, AgNIC, NALDR) side by side. Labels in the figure: "wheat aphids"; "wheat aphid, Russian USE: Diuraphis noxia"; "Diuraphis noxia"; "麦双尾蚜" (Chinese label).]
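The goal of federated access can be pictured as follows: a user's entry term is resolved to a descriptor in one thesaurus, carried across the alignment to the other thesaurus, and both indexes are searched. The sketch below is only an illustration. Every dictionary, record ID and function name in it is hypothetical toy data, and only the term labels are taken from the figure.

```python
# Toy data: all structures and record IDs are hypothetical.
# USE relation inside the source thesaurus: non-descriptor -> descriptor.
SOURCE_USE = {"wheat aphid, Russian": "Diuraphis noxia"}

# Alignment between descriptors of the source and the target thesaurus.
ALIGNMENT = {"Diuraphis noxia": "Diuraphis noxia"}

# Records indexed with descriptors of each thesaurus.
SOURCE_INDEX = {"Diuraphis noxia": ["src-1", "src-2"]}
TARGET_INDEX = {"Diuraphis noxia": ["tgt-7"]}


def federated_search(term):
    """Return records from both collections for a source-thesaurus term."""
    descriptor = SOURCE_USE.get(term, term)      # resolve the USE reference
    hits = list(SOURCE_INDEX.get(descriptor, []))
    mapped = ALIGNMENT.get(descriptor)           # cross the alignment
    if mapped is not None:
        hits.extend(TARGET_INDEX.get(mapped, []))
    return hits


print(federated_search("wheat aphid, Russian"))
```

Without the alignment entry, the same query would return only the records of the source collection.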
the Method - OAEI
• OAEI 2006
  – A friendly competition between scientists
  – Six ontology alignment tasks: benchmark, anatomy, conference, directory, jobs, and AGROVOC-NALT
• OAEI 2006 AGROVOC-NALT
  – Five participants: Falcon-AO, PRIOR, RiMOM, COMA++, HMatch
• The systems align the SKOS versions of the ontologies without manual intervention.
• The alignments (around 10,000 per participant) are collected, and samples are judged by experts (FAO, USDA and AI researchers at the EKAW conference).
• The results are published at: http://www.few.vu.nl/~wrvhage/oaei2006
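To give an idea of what aligning without manual intervention involves, here is a minimal baseline that proposes a mapping whenever two concepts share a label after normalization. It is not the method of any OAEI participant. The dictionary layout (concept ID mapped to a list of preferred and alternative labels) and the concept IDs are assumptions made for the illustration.

```python
def normalize(label):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(label.lower().split())


def label_index(concepts):
    """Map each normalized label to the set of concept IDs that carry it."""
    index = {}
    for concept_id, labels in concepts.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(concept_id)
    return index


def align_by_label(source, target):
    """Propose (source_id, target_id) pairs whose labels match exactly."""
    target_index = label_index(target)
    pairs = set()
    for source_id, labels in source.items():
        for label in labels:
            for target_id in target_index.get(normalize(label), ()):
                pairs.add((source_id, target_id))
    return sorted(pairs)


# Hypothetical concept IDs; labels as they might appear in two thesauri.
source = {"a1": ["Diuraphis noxia"], "a2": ["caffeine"]}
target = {"n1": ["diuraphis  noxia", "wheat aphid, Russian"], "n2": ["theine"]}
print(align_by_label(source, target))
```

The Latin name is matched, while the two names for the same chemical are not, because they share no label.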
OAEI 2006 - performance
[Bar chart: precision and recall for RiMOM, Falcon-AO, PRIOR, COMA++ and HMatch. Bar labels in extraction order: 81%, 50%, 83%, 46%, 71%, 45%, 54%, 23%, 61%, 46%.]
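Precision is the share of proposed mappings that are correct, and recall is the share of correct mappings that were found. The sketch below computes both against a complete reference set, which is a simplifying assumption: in the OAEI 2006 task the figures were based on samples judged by experts. The mapping pairs are made-up examples.

```python
def precision_recall(proposed, reference):
    """Return (precision, recall) of proposed mappings against a reference."""
    proposed, reference = set(proposed), set(reference)
    correct = proposed & reference
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical mappings: (source concept ID, target concept ID).
proposed = {("a1", "n1"), ("a2", "n2"), ("a3", "n9"), ("a4", "n4")}
reference = {("a1", "n1"), ("a2", "n2"), ("a5", "n5")}

precision, recall = precision_recall(proposed, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```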
OAEI 2006 - conclusions
• Alignments that are easy to automate:
  – High-consensus topics such as:
    • Latin species names (rigid Linnaean naming scheme)
    • Some geographical names
    • Parts of the medical domain (parts of anatomy and diseases)
• Alignments that are hard to automate:
  – Low-consensus topics such as:
    • Different notations of chemicals (e.g. caffeine vs. theine vs. 1,3,7-trimethyl-1H-purine-2,6(3H,7H)-dione)
    • Some other geographical names (e.g. time dependency or administrative vs. physical)
    • Products and processes
    • Concepts that are culturally dependent (e.g. law and economy)
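The contrast between the two groups can be illustrated with plain string similarity. The sketch uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure. This choice is an assumption for the illustration, not a technique attributed to the participating systems.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


pairs = [
    ("Diuraphis noxia", "Diuraphis noxia"),
    ("caffeine", "theine"),
    ("caffeine", "1,3,7-trimethyl-1H-purine-2,6(3H,7H)-dione"),
]
for left, right in pairs:
    print(f"{left!r} vs {right!r}: {similarity(left, right):.2f}")
```

The Latin name matches itself perfectly, while the three notations of the same chemical score well below a perfect match, so a matcher that relies on labels alone needs extra knowledge (for example synonym lists) to relate them.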
OAEI 2007
• OAEI 2007
  – at ISWC 2007 in Busan, Korea, November 11th
  – again AGROVOC-NALT (did things improve?)
  – but also AGROVOC-GEMET, NALT-GEMET
• AGROVOC-NALT
  – Falcon-AO, PRIOR+, DSSim, RiMOM
• AGROVOC-GEMET, NALT-GEMET
  – Falcon-AO, PRIOR+, DSSim
• Deadlines
  – submission of results: October 1st
  – evaluation of results: November 11th
• More information can be found at: http://www.few.vu.nl/~wrvhage/oaei2007
References
• AGROVOC: http://www.fao.org/agrovoc
• NALT: http://agclass.nal.usda.gov/agt
• GEMET: http://www.eionet.europa.eu/gemet/
• OAEI 2007: http://oaei.ontologymatching.org/2007
• OAEI 2007: http://www.few.vu.nl/~wrvhage/oaei2007
• OAEI 2006 results: http://www.few.vu.nl/~wrvhage/oaei2006