LEXUS: a web based lexicon tool
Jacquelijn Ringersma
Max Planck Institute for Psycholinguistics
Nijmegen, The Netherlands
Content
Max Planck Institute – Archive of linguistic resources
Documentation of endangered languages projects (DoBeS)
Tool support (archiving software and enrichment software)
LEXUS and ViCoS
Interdisciplinary software development – challenges and problems
Max Planck Institute for Psycholinguistics
Max Planck Gesellschaft
78 research institutes (Germany)
3 outside Germany:
2 Italy (art)
1 The Netherlands (psycholinguistics)
The study of mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture
Max Planck Institute for Psycholinguistics
Archive for linguistic resources
Different types of linguistic material: endangered languages archive, the European second learner corpus, the National Corpus of Spoken Dutch, gesture corpora, acquisition corpora and language documentation corpora
More than 230,000 objects, 25 TB of data:
digitized audio and video
images
annotations
Included formats: among others XML, HTML, CHAT, Toolbox, PDF, WAV, MPEG-1, -2, -4
Organization: metadata descriptions, database
Access via the Internet: metadata search and content search
Access to these resources is limited and can be made available upon request
Documentation of endangered languages
DoBeS = Dokumentation Bedrohter Sprachen
DoBeS has two major pillars:
language documentation by experienced teams to preserve part of our cultural heritage and
to help in revitalization where possible
creating an organized, accessible and persistent archive
Archive Content: Yélî Dnye (Rossel Island)
Multimedia Lexicon
Typed Relations within the Lexicon
Annotated Media
Described Corpus
Photos
Tool Support
Archiving: IMDI, LAMUS, AMS
Data enrichment: ELAN, Synpathy, ADDIT, ANNEX, LEXUS
LEXUS - Lexicon tool
LEXUS
Web based lexicon tool
Based on the ISO recommendations for linguistic resources:
LMF: Lexical Markup Framework (lexicon structure)
DCR: Data Category Registry (concept naming)
LMF/DCR: a modular structure for content interoperability between (all aspects of) lexical resources.
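The split between lexicon structure (LMF) and concept naming (DCR) can be illustrated with a minimal sketch. It is not the LEXUS schema: the class names loosely follow LMF core components (Lexicon, Lexical Entry, Form, Sense), and the `DCR` table with its `dcr:` identifiers is a made-up stand-in for a real Data Category Registry.

```python
from dataclasses import dataclass, field

# Hypothetical data category identifiers. A real lexicon would reference
# entries in a Data Category Registry instead of these made-up strings.
DCR = {
    "lemma": "dcr:lemma",
    "partOfSpeech": "dcr:partOfSpeech",
    "gloss": "dcr:gloss",
}


@dataclass
class Feature:
    """One piece of information, named by a data category."""
    category: str  # key into DCR (concept naming)
    value: str

    def __post_init__(self):
        if self.category not in DCR:
            raise ValueError(f"unknown data category: {self.category}")


@dataclass
class Form:
    features: list[Feature] = field(default_factory=list)


@dataclass
class Sense:
    features: list[Feature] = field(default_factory=list)


@dataclass
class LexicalEntry:
    form: Form
    senses: list[Sense] = field(default_factory=list)


@dataclass
class Lexicon:
    name: str
    entries: list[LexicalEntry] = field(default_factory=list)


def build_example() -> Lexicon:
    """Build a one-entry lexicon using a sample word from these slides."""
    entry = LexicalEntry(
        form=Form([Feature("lemma", "kauo’e mei")]),
        senses=[Sense([Feature("gloss", "terminal bud (female)")])],
    )
    return Lexicon(name="demo", entries=[entry])
```

Because every feature names its category through the shared table, two lexica with different structures can still agree on what "gloss" or "lemma" means, which is the interoperability point made above.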
LEXUS - Lexicon tool
• Creation of lexica from scratch, import lexica from other formats
• User defined view of the information in the lexical entries
• Linking multi-media fragments to lexical entries
• Creation of links in images
LEXUS - Lexicon tool
Link to: kauo’e mei ‘terminal bud (female)’
• Link to resources within the digital archive (or other external web-based resources) – interaction with other archiving tools
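As a rough illustration of the linking features listed above, the sketch below models a media fragment (a resource plus an optional time interval) and a clickable image region that points to a lexical entry. All class names, fields, and the `example.org` URLs are assumptions for illustration; the slides do not describe how LEXUS stores such links.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaFragment:
    """A resource in the archive (or elsewhere on the web), optionally cut to an interval."""
    url: str
    start_ms: Optional[int] = None  # None for both bounds means the whole recording
    end_ms: Optional[int] = None

    def __post_init__(self):
        if (self.start_ms is None) != (self.end_ms is None):
            raise ValueError("give both start_ms and end_ms, or neither")
        if self.start_ms is not None and not 0 <= self.start_ms < self.end_ms:
            raise ValueError("need 0 <= start_ms < end_ms")


@dataclass(frozen=True)
class ImageRegion:
    """A rectangle in an image that links to another lexical entry."""
    image_url: str
    x: int
    y: int
    width: int
    height: int
    target_entry: str  # headword of the entry the region links to

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def entry_at(regions, px, py):
    """Return the headword linked at pixel (px, py), or None if nothing is linked there."""
    for region in regions:
        if region.contains(px, py):
            return region.target_entry
    return None
```

A click handler in a viewer could call `entry_at` with the pointer position and open the returned entry, for example the link to kauo’e mei shown in the slides.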
LEXUS - further developments
Towards a multi-media dictionary of the Marquesan and Tuamotuan languages of French Polynesia
• Building a digital multi-media encyclopedic dictionary with LEXUS
• Improving basic LEXUS functionalities
• Conceptual spaces
• Improved User Interface
Project team:
• Linguist team (Cablitz, Mosel)
• Developers (Kemps, Zinn, Alcock)
• Speech community (Kape, Guillome, Tetahiotupa, Tahia, Mataiki, Bruneau Pati)
Aim:
• MM Dictionary
• Speech community input and extensions
• Community based instance of the lexicon
LEXUS - further developments
Project workflow
Field work
Data archiving and annotation
Lexicon creation
Joint action linguist and speech community
LEXUS basic functionalities
Developers
Definition of SW constraints
Definition of SW requirements
Lexicon import and creation of multi-media encyclopedic lexicon
Further developments of LEXUS
all
LEXUS - further developments
Issues that came up:
User Interface
Conceptual spaces in multi media encyclopedia
LEXUS - further developments
User Interface
Users want to enter the lexicon through the lexical entries, either from the listed lexicon or by search:
LEXUS - further developments
New User Interface
LEXUS - further developments
Conceptual spaces in multi media encyclopedia
Conventional paper dictionaries: the network of meanings is less visible
Paper dictionaries are of limited usefulness in language maintenance and language revival (Manning et al., 2000)
Members of the speech community prefer following semantic links of different semantic types (synonyms, antonyms, lexical, taxonomies)
ViCoS
Complement lexical spaces with ontological spaces
Allow users to construct a space of culturally relevant concepts
Concepts as centres for all sorts of information
relations to other concepts
anchored in the language to express them
linked to multimedia archive to describe them
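The idea of a concept as a centre for relations, language anchors, and media can be sketched as a small labelled graph. This is only an illustration under assumptions: the `ConceptSpace` class, its method names, and the relation labels (`has_part`, `is_a`) are hypothetical and do not describe the actual ViCoS data model.

```python
from collections import defaultdict


class ConceptSpace:
    """A toy space of concepts with typed relations, lexical anchors, and media links."""

    def __init__(self):
        self._relations = defaultdict(set)  # concept -> {(relation_type, other concept)}
        self._words = defaultdict(set)      # concept -> headwords that express it
        self._media = defaultdict(set)      # concept -> URLs of archive resources

    def relate(self, source, relation_type, target):
        """Add a typed relation from one concept to another."""
        self._relations[source].add((relation_type, target))

    def anchor(self, concept, headword):
        """Anchor a concept in the language used to express it."""
        self._words[concept].add(headword)

    def attach_media(self, concept, url):
        """Link a concept to a multimedia resource that describes it."""
        self._media[concept].add(url)

    def neighbours(self, concept, relation_type=None):
        """Concepts reachable in one step, optionally filtered by relation type."""
        return sorted(
            target
            for rel, target in self._relations.get(concept, ())
            if relation_type is None or rel == relation_type
        )

    def describe(self, concept):
        """Collect everything known about one concept."""
        return {
            "relations": sorted(self._relations.get(concept, ())),
            "words": sorted(self._words.get(concept, ())),
            "media": sorted(self._media.get(concept, ())),
        }


space = ConceptSpace()
space.relate("tree", "has_part", "terminal bud")  # hypothetical concepts and relation
space.anchor("terminal bud", "kauo’e mei")        # headword taken from the slides
```

Keeping the relation type on every edge lets a user follow only one kind of link at a time, for instance only part-whole relations, while the lexical entries stay separate from the ontological space.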
Visualizing Conceptual Spaces
ViCoS
Interdisciplinary software development challenges and problems
Our challenge:
Design a product that fits the needs of the speech community (SC)
and thus
contribute to maintaining and possibly revitalizing a documented language, and consequently to presenting and preserving the cultural heritage
More practical:
Simple user interface for a complex tool – is it possible?
Collaborative workspaces to work in a Wiki-like manner
Interdisciplinary software development challenges and problems
So, what do we encounter:
Interesting project and collaboration, but NOT easy:
Need to bridge the ‘concept gap’
Communication over distances
Different expectations – different (sub)-goals
Software limitations of an online tool
IPR between developer team and linguist team
IPR between speech community and linguist team
Interdisciplinary software development challenges and problems
Is there a positive conclusion?
Interaction opens worlds
First reactions from the SC to the concept UI and ViCoS are positive
First experience of SC and LS is useful for the development of ViCoS
More DoBeS projects are interested in using LEXUS as an ‘exploitation’ tool
Still almost a year to go..
Acknowledgements:
Thanks to Gaby Cablitz, Jean Kape, Guillome Taimana for their contributions