Defining User Access to the Romanian Online Dialect Atlas
Sheila M. Embleton, Dorin Uritescu & Eric S. Wheeler
York University, Toronto, Canada
Context
Romania
Source: http://en.wikipedia.org/wiki/Romanian_language#Geographic_distribution
Romanian
22+ million speakers
Critical exemplar of the eastern Romance language family
Noul Atlas lingvistic român. Crisana
Crisana: a region in north-west Romania
Hard-copy atlas by Stan and Uritescu (1996, 2003, etc.)
Digitize it to make it more accessible
RODA: Romanian Online Dialect Atlas
Digitize and present the hard-copy atlas:
• Work done mostly by graduate students in Canada and Romania
• Enter data from maps into text files
• When complete, it will be posted to the Internet for general use
Objective
Use Information Technology to permit a broad range of scholars to
• access the data,
• select the data appropriately, and
• present the data clearly;
and so gain greater understanding of its significance.
Example from RODA
Crisana, Romania
Seeing Words Change
Word-final /u/ from Latin
Latin             Romanian (standard and   Dialectal variation   Dialectal variation
                  most dialects)           (vowel present)       (nonsyllabic)
canto ‘I sing’    cânt                     cântu                 cântu
oculum ‘eye’      ochi                     ochiu                 ochiu
Is word-final /u/ random?
Look for a geographic pattern over all potential occurrences.
The maps for single examples, such as /ochi/ and others, are in the dialect atlas,
but the total data for all examples is spread widely over many maps.
/u/ Pattern
There is a pattern:
• Word-final /u/ is retained in central and north-eastern areas
• It is syllabic only in parts of the central area
• Latin noun vs Latin verb: no difference
• Non-Latin: less data, but consistent with the Latin pattern
Note:
• Horizontal values include all word-final /u/
• Vertical values are non-syllabic word-final /u/
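The two values plotted per location can be read as a pair of tallies gathered across all maps. The following is a minimal sketch in Python, not RODA's actual code: the `Response` record layout, the location numbers, and the `nonsyllabic_u` flag are assumptions made for illustration, since the atlas file format is not described here.

```python
from collections import defaultdict
from typing import NamedTuple


class Response(NamedTuple):
    location: int        # hypothetical survey-point id
    map_id: str          # hypothetical label of the source map
    form: str            # transcribed response
    nonsyllabic_u: bool  # hypothetical flag: the final /u/ is non-syllabic


def tally_final_u(responses):
    """Return {location: (all_final_u, nonsyllabic_final_u)} over all maps."""
    totals = defaultdict(lambda: [0, 0])
    for r in responses:
        if r.form.endswith("u"):
            totals[r.location][0] += 1      # "horizontal" value
            if r.nonsyllabic_u:
                totals[r.location][1] += 1  # "vertical" value
    return {loc: tuple(pair) for loc, pair in totals.items()}


# Invented sample data; only the word forms come from the slides.
SAMPLE = [
    Response(101, "ochi", "ochiu", False),
    Response(101, "cânt", "cântu", True),
    Response(102, "ochi", "ochi", False),
    Response(103, "cânt", "cântu", True),
]

print(tally_final_u(SAMPLE))
```

Locations with no word-final /u/ at all (102 in the sample) simply do not appear in the result, which corresponds to an empty spot on the map.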
RODA as linguistic technology
The technology allows one to:
• Ask a user-defined question
• Compare one query to another
• See the correlation (vertical vs horizontal)
• See the strength of the data (short vs long bars)
• Save the results for further processing or presentation
Requirements
Multiple comparisons, using:
• Shapes
• Colours
• Symbols
Reference to original data:
• See numeric counts
• Locate raw data (especially when there are few examples)
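One way to meet the "reference to original data" requirement is to have a query keep the matching records themselves, so that counts and raw data come from the same result. This is a sketch under assumptions: the `(location, map_id, form)` tuples, the function names, and the `min_count` threshold are hypothetical and not taken from RODA.

```python
from collections import defaultdict


def run_query(records, predicate):
    """Group the records whose form satisfies `predicate` by location."""
    hits = defaultdict(list)
    for location, map_id, form in records:
        if predicate(form):
            hits[location].append((map_id, form))
    return dict(hits)


def counts(hits):
    """Numeric count of matching records per location."""
    return {loc: len(recs) for loc, recs in hits.items()}


def sparse_locations(hits, min_count=2):
    """Raw records for locations with fewer than `min_count` matches."""
    return {loc: recs for loc, recs in hits.items() if len(recs) < min_count}


# Invented sample records; only the word forms come from the slides.
RECORDS = [
    (101, "ochi", "ochiu"),
    (101, "cânt", "cântu"),
    (102, "ochi", "ochi"),
    (103, "cânt", "cântu"),
]

hits = run_query(RECORDS, lambda form: form.endswith("u"))
print(counts(hits))
print(sparse_locations(hits))
```

Keeping the raw records in the result means that a location with only one or two examples can be inspected directly rather than judged by its count alone.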
RODA: function
Custom-defined maps
• You select the data
• You see the result as a map
Programmable access to the whole set of digitized data
• You ask about data spread over many maps
• You can customize what you search for (not just the editor’s choice)
RODA: selection of data
Context of search becomes important
• Word-final vs non-final vs either
• Plain character vs accented character
• Character vs (superposed) alternate
Choice of fields to search
• E.g. with nouns: sg. vs pl. entries
• Variations heard by field workers
• Flags to mark special situations (e.g. hesitation)
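The search-context options above (position in the word, plain vs accented character) can be sketched as a single matching function. This is an illustration only: the function names and the `position` and `ignore_accents` parameters are hypothetical, and treating "plain vs accented" as Unicode accent stripping is an assumption about how such a search might be done, not a description of RODA.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks, e.g. 'cânt' becomes 'cant'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(form, char, position="either", ignore_accents=False):
    """Test whether `char` occurs in `form` in the requested position.

    position: "final", "non-final", or "either" (hypothetical option names).
    """
    if ignore_accents:
        form, char = strip_accents(form), strip_accents(char)
    else:
        # Compare composed forms so that 'â' is one character.
        form = unicodedata.normalize("NFC", form)
        char = unicodedata.normalize("NFC", char)
    if position == "final":
        return form.endswith(char)
    if position == "non-final":
        # The earliest occurrence is non-final if any occurrence is.
        idx = form.find(char)
        return idx != -1 and idx + len(char) < len(form)
    if position == "either":
        return char in form
    raise ValueError(f"unknown position: {position!r}")


print(matches("cântu", "u", position="final"))       # True
print(matches("cântu", "a"))                         # False: 'â' is not plain 'a'
print(matches("cântu", "a", ignore_accents=True))    # True
```

The superposed-alternate case and the choice of fields are left out, because they depend on how the transcriptions are encoded in the data files.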
Bigger challenge
Access to Data
In the humanities:
• Large amounts of data
• Diverse ways of selecting it
Information Technology:
• Has the technology
• May not understand the needs
Need to learn how to apply IT to our discipline effectively
Development Process
Requirements gathering
• Prototypes
• Cycles of propose-and-revise
User testing
• Test versions on web
• User feedback is important
Explore technology
• Changes fast
• Much to learn
Summary
Data will soon be available
You are invited to apply your techniques to the data
Digital data and IT methods permit:
• Widely accessible data
• Flexible searching and custom presentation
• Repeatable processing
Contacts
Sheila [email protected]
Dorin [email protected]
Eric [email protected]
Test sites:
ericwheeler.ca/test
aml.yorku.ca/~ewheeler/test