Work Group 2: Ontological Concepts for Lexical Entries

Page 1: Work Group 2: Ontological Concepts for Lexical Entries

Page 2: Work Group 2: Ontological Concepts for Lexical Entries

An example (Sesana; Gur; Ghana):
hórro: (1) "at its heart, dirty"; (2) "bad"
mε- (prefix, verb>modifier); productive but with exceptions
mεhórro 'dirty' (only)
Must list in lexical entries which verbs take mε-

Proposed solution:
Assign identifiers (senses and subsenses)
Use subsense identifier to link mεhórro to "be dirty"
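The proposed solution can be sketched roughly as follows. This is a minimal illustration, not the work group's schema: the class names, the `base_sense_id` field, the `find_sense` helper, and the identifier scheme (`horro.1`, `horro.1a`, `horro.2`) are all assumptions made for the example, as is placing "be dirty" as a subsense of sense 1.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Sense:
    sense_id: str                      # hypothetical identifier, e.g. "horro.1a"
    gloss: str
    subsenses: Dict[str, "Sense"] = field(default_factory=dict)


@dataclass
class LexEntry:
    form: str
    senses: Dict[str, Sense] = field(default_factory=dict)
    # For a derived form: the id of the (sub)sense of the base it is built on.
    base_sense_id: Optional[str] = None


def find_sense(entry: LexEntry, sense_id: str) -> Optional[Sense]:
    """Look up a sense or a first-level subsense by identifier."""
    for sense in entry.senses.values():
        if sense.sense_id == sense_id:
            return sense
        if sense_id in sense.subsenses:
            return sense.subsenses[sense_id]
    return None


# Base verb with two senses; the subsense split is assumed for illustration.
horro = LexEntry(
    form="hórro",
    senses={
        "horro.1": Sense(
            "horro.1",
            "at its heart, dirty",
            subsenses={"horro.1a": Sense("horro.1a", "be dirty")},
        ),
        "horro.2": Sense("horro.2", "bad"),
    },
)

# The derived modifier points at exactly one subsense of the base.
me_horro = LexEntry(form="mεhórro", base_sense_id="horro.1a")

linked = find_sense(horro, me_horro.base_sense_id)
print(me_horro.form, "->", linked.gloss)
```

Because the link targets a subsense identifier rather than the headword, the entry records that the derived form carries only the "dirty" reading and not the "bad" one.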

Page 3: Work Group 2: Ontological Concepts for Lexical Entries

<Citation form>: Attributes
(1) native-speaker type (e.g. Ingush and Turkish - use infinitive; Navajo - infl form)
(2) linguist-conventions type (Ingush, Turkish, and Navajo: roots)
<Orthographic variant>
replace <Allomorphs> with <Unpredictable variation> morph/morpheme -? only when irregular (e.g. suppletion); types: Suppletion, ….
<MSI> morphosyntactic information: could have subtypes morphology and syntax, limited, vs.??
<pronunciations>: link media stream to transcription (MMaxwell's Form)
<senses> MM: definition, gloss, SciName

suggested elements:
<realm>/<semantic field> - kinship term
<dialect>
<etymology> - cognate, reconstructions, loans/copies, source language
<use> includes register and stylistic value - formal, informal, taboo, colloquial, child language, archaic
<comment>

Page 4: Work Group 2: Ontological Concepts for Lexical Entries

[Diagram: structure of a LexEntry. Node labels, in reading order: type; headword/lemma; MSI; head citation form; orthog. variant; sense (id); unpredictable variation; transcription (phonetic, gesture); media (audio, video, image); example (id, example, gloss); semantic field; etymology; use; access; lexical relation; scientific term; dialect; region.]
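One way to read the LexEntry structure on this slide is as a nested record. The sketch below is an assumption-laden illustration: the field names are derived from the slide's labels, but the nesting (which labels belong to the entry and which to a sense) and the sample values are guesses made for the example, not something the slide specifies.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Example:
    example_id: str
    text: str
    gloss: str


@dataclass
class Sense:
    sense_id: str
    semantic_field: Optional[str] = None
    scientific_term: Optional[str] = None
    examples: List[Example] = field(default_factory=list)


@dataclass
class LexEntry:
    entry_type: str                    # the "type=" attribute on the slide
    headword: str
    msi: str                           # morphosyntactic information
    citation_form: Optional[str] = None
    orthographic_variants: List[str] = field(default_factory=list)
    unpredictable_variation: List[str] = field(default_factory=list)
    senses: List[Sense] = field(default_factory=list)
    etymology: Optional[str] = None
    use: Optional[str] = None          # register / stylistic value
    dialect: Optional[str] = None
    region: Optional[str] = None


# Placeholder values; "lexeme" and the sense id are invented for the example.
entry = LexEntry(
    entry_type="lexeme",
    headword="hórro",
    msi="verb",
    senses=[Sense(sense_id="horro.2")],
)

# asdict() recursively converts nested dataclasses into plain dicts and lists,
# which makes the entry easy to serialize or to split into separate parts.
record = asdict(entry)
print(record["headword"], record["senses"][0]["sense_id"])
```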

Page 5: Work Group 2: Ontological Concepts for Lexical Entries
Page 6: Work Group 2: Ontological Concepts for Lexical Entries

• Our thinking about lexical resources/structures has been dominated by print models, primarily dictionaries, less so thesauruses and encyclopedias.

Page 7: Work Group 2: Ontological Concepts for Lexical Entries

• We have the opportunity to design electronic, specifically web, lexical resources in new ways, combining the parts in whatever way is best for specific purposes. This suggests a highly modular design so that the parts can be combined as needed, not just for looking up the meaning or pronunciation of individual words.

Page 8: Work Group 2: Ontological Concepts for Lexical Entries

• The natural unit of analysis is the lexical entry, or lexeme. But its parts (phonetic, phonological, orthographic, morphological, syntactic, semantic, pragmatic, perhaps even etymological) are discrete, separable, and recurring.

Page 9: Work Group 2: Ontological Concepts for Lexical Entries

• Bell and Bird recognize this to the extent of suggesting as a data structure a set of triples L = {T_a = <F_i, M_j, S_k>}, where each T, F, M, and S can be separately identified and combined. We can go further with this breakdown, particularly by customizing the parts covered by M for the language: providing complete paradigms, derivational patterns, etc.
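A minimal sketch of the triple structure, assuming F, M, and S stand for form, morphosyntactic information, and sense (the slide does not spell these out). The table names, the identifiers, and the helper functions `resolve` and `triples_with_sense` are hypothetical; the sample values reuse the Sesana example from the earlier slide.

```python
from typing import Dict, List, NamedTuple, Tuple


class Triple(NamedTuple):
    form_id: str
    msi_id: str
    sense_id: str


# Each component lives in its own table and has its own identifier,
# so it can be reused by any number of triples.
forms: Dict[str, str] = {"F1": "hórro", "F2": "mεhórro"}
msis: Dict[str, str] = {"M1": "verb", "M2": "modifier"}
senses: Dict[str, str] = {"S1": "be dirty", "S2": "bad"}

# The lexicon L is a set of identified triples T_a = <F_i, M_j, S_k>.
lexicon: Dict[str, Triple] = {
    "T1": Triple("F1", "M1", "S1"),
    "T2": Triple("F1", "M1", "S2"),
    "T3": Triple("F2", "M2", "S1"),
}


def resolve(triple_id: str) -> Tuple[str, str, str]:
    """Assemble the form, MSI, and sense that a triple points to."""
    t = lexicon[triple_id]
    return forms[t.form_id], msis[t.msi_id], senses[t.sense_id]


def triples_with_sense(sense_id: str) -> List[str]:
    """Find every triple that shares a given sense."""
    return sorted(tid for tid, t in lexicon.items() if t.sense_id == sense_id)


print(resolve("T3"))
print(triples_with_sense("S1"))
```

Keeping the components in separate tables is what allows the M part to be extended per language (paradigms, derivational patterns) without touching forms or senses.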

Page 10: Work Group 2: Ontological Concepts for Lexical Entries

• We know how to break down phonology, orthography, and to some extent, morphology and syntax into smaller units of analysis. We have had less success, consensus, and hence experience with semantics.

Page 11: Work Group 2: Ontological Concepts for Lexical Entries

• On the one hand we have Bloomfield-Fodor “atomism” (the unit of meaning is the meaning of the morpheme), and on the other Pustejovsky-Wierzbicka “decompositionalism” into primitive semantic units (properties and relations).

Page 12: Work Group 2: Ontological Concepts for Lexical Entries

• We need to come to a practical working agreement about semantic analysis. Our friends and colleagues in computer science and artificial intelligence are guiding, even driving, us to do so. They are busily developing commonsense ontologies (Cyc Corp, Teknowledge) and practical reasoners, the “agents” that will work for us behind the scenes in Web transactions, for example. I therefore recommend that we plunge into this research area with gusto.

Page 13: Work Group 2: Ontological Concepts for Lexical Entries

Conclusion: A distributed lexicon, with the parts identified; some parts pre-assembled (e.g., Bell and Bird-style N-tuples), others assembled and presented on the fly (e.g., the inflectional paradigms for a particular stem).
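The contrast between pre-assembled and on-the-fly parts might look like the sketch below. It uses regular English verb inflection purely for illustration; the table names, the cell labels, and the `paradigm` function are assumptions, and the naive stem-plus-suffix rule ignores spelling adjustments.

```python
from typing import Dict, Tuple

# Productive pattern: applied on the fly to any stem.
REGULAR_VERB_SUFFIXES: Dict[str, str] = {
    "base": "",
    "3sg.present": "s",
    "past": "ed",
    "present.participle": "ing",
}

# Unpredictable variation (e.g. suppletion) is stored pre-assembled.
STORED_FORMS: Dict[Tuple[str, str], str] = {
    ("go", "3sg.present"): "goes",
    ("go", "past"): "went",
}


def paradigm(stem: str) -> Dict[str, str]:
    """Assemble the paradigm for a stem, preferring stored irregular forms."""
    return {
        cell: STORED_FORMS.get((stem, cell), stem + suffix)
        for cell, suffix in REGULAR_VERB_SUFFIXES.items()
    }


print(paradigm("walk"))
print(paradigm("go"))
```

Only the exceptions need to be listed in the lexicon; everything predictable is generated when a user or program asks for it.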