Food and Agriculture
Organization of the UN
Library and Documentation
Systems Division
July 2005
Ontologies creation,
extraction and maintenance
6th AOS Workshop
Vila Real (Portugal)
26-27 July 2005
Ontology creation, extraction, and maintenance
6th AOS Workshop
Vila Real (Portugal)
26-27 July 2005
Discussion forum n.1
Chair: Anita Liang
The AOS/CS Workbench
• Support and manage the multi-language terminology work of information management specialists in the development, maintenance, and quality assurance of the AOS/CS
The AOS/CS Workbench
• Features
  – Text Processing
  – Corpus Creation
  – Corpus Analysis
  – Term/Relationship Management
  – Quality Assurance
  – Versioning and Deployment
[Diagram: a multilingual text corpus (.doc, .pdf, .xml, etc.) is the input to the AOS/CS Workbench (concordance, pattern-matching), which produces a Concept Hierarchy]
Tool Features: Text Processing Capabilities
• Multilingual
• Font support for Chinese and Arabic at minimum, also Lao, Thai
• Other
  – Entity extraction
  – POS tagging
  – Parsing
Tool Features: Corpus Creation and Maintenance
• Spidering tools (http://www.manageability.org/blog/stuff/open-source-web-crawlers-java/view)
• Document input and storage
  – .doc, .pdf, .html, .xml
• Text extraction (http://multivalent.sourceforge.net/)
• Domain-specific repositories
  – specifiable: agriculture, chemistry
  – combine and remove
Tool Features: Corpus Analysis
• Text file management
  – Add/delete files
  – Add/delete directories
  – Add/delete URLs
• Search
  – Word (or part-word) or phrase; string, regular expression, tag search
  – Number of hits
  – Case (in)sensitive
• Display
  – Hide keyword option
  – Toggle between a KWIC format and sentence mode
• Sorting
  – 1L (first left), 1R, 2L, 2R, as well as by search word and by text order
  – Primary and secondary sorts (e.g., first right, then first left)
• Frequency information
  – Display of words in alphabetical or frequency order
• Collocates
  – Search collocates of spans from 1L-1R to 4L-4R
  – Collocate highlighting
• Output: the concordance results can be saved to a file and/or printed
• Pattern-matching with pattern language
• Other: pattern-matching using POS tags, parallel text concordancing
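As a rough illustration of the concordance features listed above (KWIC display, 1L/1R-style sorting, and collocate counts within a span), here is a minimal Python sketch. It is not the Workbench's implementation: the function names `tokenize`, `kwic`, `sort_hits`, and `collocates` are hypothetical, and the tokenizer is deliberately naive (lowercased word characters only).

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens (naive; an assumption of this sketch)."""
    return re.findall(r"\w+", text.lower())

def kwic(tokens, keyword, span=4):
    """Return (left_context, keyword, right_context) for every hit of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            hits.append((left, tok, right))
    return hits

def sort_hits(hits, primary="1R", secondary="1L"):
    """Sort hits by context position: '1R' is the first word right, '2L' the second left."""
    def key_for(hit, pos):
        left, _, right = hit
        n, side = int(pos[:-1]), pos[-1]
        # Reverse the left context so index 0 is the word nearest the keyword.
        words = left[::-1] if side == "L" else right
        return words[n - 1] if len(words) >= n else ""
    return sorted(hits, key=lambda h: (key_for(h, primary), key_for(h, secondary)))

def collocates(hits, span=1):
    """Count words within `span` positions (span >= 1) on either side of the keyword."""
    counts = Counter()
    for left, _, right in hits:
        counts.update(left[-span:])
        counts.update(right[:span])
    return counts

text = "Rice is a cereal crop. Maize is a cereal crop too. Wheat is another cereal grain."
hits = kwic(tokenize(text), "cereal", span=2)
for left, kw, right in sort_hits(hits, primary="1R", secondary="1L"):
    print(f"{' '.join(left):>12} [{kw}] {' '.join(right)}")
```

Calling `collocates(hits, span=1)` on the same hits counts the immediate neighbours of the keyword, which corresponds to the 1L-1R span on the slide; a span of 4 corresponds to 4L-4R.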
Tool Features: Automatic KOS Search
• Specify online KOS URLs
• Automatic suggested parent and placement within hierarchy
Tool Features: Term/Relationship Management
• Modifications (AGROVOC Maintenance Tool)
  – term
    • add
    • delete
    • edit
  – relationship
    • add
    • delete
    • edit
• Machine learning (Annotation Tool)
  – WordNet
  – AGROVOC itself
  – other thesauri
• Batch/bulk modifications based on patterns and structure (rules-as-you-go)
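The add/delete/edit operations and the pattern-based bulk modifications listed above could be sketched as follows. This is an illustrative in-memory model only, not the data model of the AGROVOC Maintenance Tool: the class `Thesaurus`, its method names, and the relation code `"BT"` (broader term) used in the example are assumptions of this sketch.

```python
import re

class Thesaurus:
    """Hypothetical in-memory store for multilingual terms and their relationships."""

    def __init__(self):
        self.terms = {}         # term_id -> {language code: label}
        self.relations = set()  # (source_id, relation_type, target_id)

    def add_term(self, term_id, lang, label):
        self.terms.setdefault(term_id, {})[lang] = label

    def edit_term(self, term_id, lang, label):
        if term_id not in self.terms or lang not in self.terms[term_id]:
            raise KeyError(f"no {lang} label for term {term_id}")
        self.terms[term_id][lang] = label

    def delete_term(self, term_id):
        del self.terms[term_id]
        # Drop relationships that would otherwise point at a missing term.
        self.relations = {r for r in self.relations if term_id not in (r[0], r[2])}

    def add_relation(self, source, rel, target):
        if source not in self.terms or target not in self.terms:
            raise KeyError("both ends of a relationship must exist")
        self.relations.add((source, rel, target))

    def delete_relation(self, source, rel, target):
        self.relations.discard((source, rel, target))

    def bulk_edit(self, lang, pattern, replacement):
        """Apply a regex substitution to every label in one language; return labels changed."""
        changed = 0
        for labels in self.terms.values():
            if lang in labels:
                new_label = re.sub(pattern, replacement, labels[lang])
                if new_label != labels[lang]:
                    labels[lang] = new_label
                    changed += 1
        return changed

t = Thesaurus()
t.add_term(1, "en", "cereals")
t.add_term(2, "en", "rice")
t.add_term(2, "fr", "riz")
t.add_relation(2, "BT", 1)                     # rice has broader term cereals
n_changed = t.bulk_edit("en", r"^rice$", "Rice")
```

Editing a relationship is modelled here as a delete followed by an add. Deleting a term also removes its relationships, which is one possible policy; a real tool might instead refuse the deletion or ask for confirmation.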
AOS/CS Workbench
Tool Features: Versioning and deployment
– CVS-type system to check out and check in changes
– Administrator-level functionalities for publishing versions
– Language-level versioning
Tool Features: Quality Assurance
• Logging and reporting of user actions
• Confirmation/verification of user actions
• User rights management
• Ownership
Technical Platform Details
• Centralized relational database backend
  – Local and remote databases?
  – Is there a need for (1) referential integrity, triggers, etc., or (2) can we get by with publishing and storage?
    • (1): PostgreSQL
    • (2): MySQL
• Web-based GUI
• Distributed client-server architecture
• Java-based
• Scalability and performance for network
  – web services
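To make option (1) in the database question above concrete, the sketch below shows what referential integrity buys: the database itself rejects a term that points at a non-existent concept. It uses Python's built-in `sqlite3` module purely as a stand-in so the example is self-contained; the table and column names (`concept`, `term`, `concept_id`) are hypothetical and are not taken from the Workbench design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE concept (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE term (
        id INTEGER PRIMARY KEY,
        concept_id INTEGER NOT NULL REFERENCES concept(id),
        lang TEXT NOT NULL,
        label TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO concept (id) VALUES (1)")
conn.execute("INSERT INTO term (concept_id, lang, label) VALUES (1, 'en', 'rice')")

def try_insert_orphan(connection):
    """Try to insert a term whose concept does not exist; return True if the DB rejects it."""
    try:
        connection.execute(
            "INSERT INTO term (concept_id, lang, label) VALUES (99, 'en', 'orphan')"
        )
    except sqlite3.IntegrityError:
        return True
    return False

rejected = try_insert_orphan(conn)
term_count = conn.execute("SELECT COUNT(*) FROM term").fetchone()[0]
print(rejected, term_count)
```

Under option (2), with no constraints enforced by the database, the same check would have to live in application code, which is the trade-off the slide raises.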
Workshops & Training
• Promote the Workbench
• Contact AGROVOC center of excellence
• Nominate AGROVOC managers
• Organize workshop
• Organize training
Discussion points
• Tools (workbench)
• Modeling concept / term / string
• Concepts vs. instances
• Knowledge representation language (OWL, SKOS)
• Updating: is it a problem?