Defining User Access to the Romanian Online Dialect Atlas Sheila M. Embleton, Dorin Uritescu & Eric S. Wheeler York University, Toronto, Canada


Defining User Access to the Romanian Online Dialect Atlas

Sheila M. Embleton, Dorin Uritescu & Eric S. Wheeler

York University, Toronto, Canada


Context


Romania

Source: http://en.wikipedia.org/wiki/Romanian_language#Geographic_distribution


Romanian

• 22+ million speakers
• Critical exemplar of the eastern Romance language family


Noul Atlas lingvistic român. Crisana

• Crisana region in north-west Romania
• Hard copy atlas by Stan and Uritescu (1996, 2003, etc.)
• Digitize to make it more accessible


RODA: Romanian Online Dialect Atlas

Digitize and present hard copy atlas:
• Mostly graduate students in Canada and Romania
• Enter data from maps into text files
• When complete, it will be posted to the Internet for general use


Objective

Use Information Technology to permit a broad range of scholars to
• access the data,
• select the data appropriately, and
• present the data clearly;
and so gain greater understanding of its significance.


Example from RODA


Crisana, Romania


Seeing Words Change

Word final –u from Latin


Word final /u/ from Latin

Latin             Romanian (standard     Dialectal variation   Dialectal variation
                  and most dialects)     (vowel present)       (nonsyllabic)
canto ‘I sing’    cânt                   cântu                 cântu
oculum ‘eye’      ochi                   ochiu                 ochiu


Is word final /u/ random?

• Look for a geographic pattern over all potential occurrences
• The maps for single examples, such as /ochi/ and others, are in the dialect atlas,
• but the total data for all examples is spread widely over many maps.
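Pooling evidence that is spread over many maps can be sketched as a simple tally per location. The snippet below is only an illustration: the record layout (map id, location id, transcribed form), the ids, and the function name `count_final_u` are assumptions made for this example, not the actual RODA file format or software.

```python
from collections import Counter

# Hypothetical records: (map_id, location_id, transcribed_form).
# The real RODA text-file layout is not described in the slides.
records = [
    ("map_001", "loc_101", "ochiu"),
    ("map_001", "loc_102", "ochi"),
    ("map_002", "loc_101", "cântu"),
    ("map_002", "loc_102", "cânt"),
    ("map_002", "loc_103", "cântu"),
]

def count_final_u(records):
    """Per location, count forms ending in 'u' and all forms, across every map."""
    hits = Counter()
    totals = Counter()
    for _map_id, location, form in records:
        totals[location] += 1
        if form.endswith("u"):
            hits[location] += 1
    # Counter returns 0 for locations that never showed a final 'u'.
    return {loc: (hits[loc], totals[loc]) for loc in totals}

summary = count_final_u(records)
for loc in sorted(summary):
    with_u, total = summary[loc]
    print(f"{loc}: {with_u} of {total} forms end in -u")
```

Keeping the total alongside the hit count makes it visible when a location's result rests on only a few examples.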


/u/ Pattern

There is a pattern:
• Word final /u/ is retained in central and north-eastern areas
• It is syllabic only in parts of the central area
• Latin noun vs Latin verb: no difference
• Non-Latin: less data but consistent with Latin pattern

Note:
• Horizontal values include all word final /u/
• Vertical values are non-syllabic word final /u/


RODA as linguistic technology


The technology allows one to:

• Ask a user-defined question
• Compare one query to another
• See the correlation (vertical vs horizontal)
• See the strength of the data (short vs long bars)
• Save the results for further processing or presentation
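One way to picture comparing two queries and saving the result is to pair their per-location counts and write them out as CSV. This is a rough sketch under stated assumptions: the counts, the location ids, and the helper names `compare_queries` and `to_csv` are invented for illustration and do not describe how RODA itself stores or exports results.

```python
import csv
import io

def compare_queries(horizontal, vertical):
    """Pair two per-location count tables; a missing location counts as 0."""
    locations = sorted(set(horizontal) | set(vertical))
    return [(loc, horizontal.get(loc, 0), vertical.get(loc, 0)) for loc in locations]

def to_csv(rows):
    """Serialize the paired counts so they can be reused in other tools."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["location", "horizontal", "vertical"])
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical counts for two queries (not real atlas data).
all_final_u = {"loc_101": 12, "loc_102": 3, "loc_103": 7}
nonsyllabic_final_u = {"loc_101": 9, "loc_103": 7}

rows = compare_queries(all_final_u, nonsyllabic_final_u)
print(to_csv(rows))
```

The size of each count plays the role of bar length (strength of the data), while the pair of values at one location shows how the two queries relate there.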


Requirements

Multiple comparisons, using:
• Shapes
• Colours
• Symbols

Reference to original data:
• See numeric counts
• Locate raw data (especially when there are few examples)


RODA: function

Custom-defined maps
• You select the data
• You see the result as a map

Programmable access to the whole set of digitized data
• You ask about data spread over many maps
• You can customize what you search for (not just the editor’s choice)


RODA: selection of data

Context of search becomes important
• Word-final vs non-final vs either
• Plain character vs accented character
• Character vs (superposed) alternate

Choice of fields to search
• E.g. with nouns: sg. vs pl. entries
• Variations heard by field workers
• Flags to mark special situations (e.g. hesitation)
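The search-context options above (position in the word, plain vs accented character) could be expressed as a small matching function. The sketch below is an assumption-laden illustration: the function names and parameters are hypothetical, and treating "accented" as "carries a Unicode combining mark" is a simplification that may not match RODA's own transcription encoding.

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks so that, for example, 'â' compares equal to 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matches(form, target, position="either", ignore_accents=False):
    """Report whether `target` occurs in `form` in the requested context.

    position: "final" (at the end of the word), "non-final" (anywhere
    except the end), or "either".
    """
    if ignore_accents:
        form, target = strip_accents(form), strip_accents(target)
    else:
        # Normalize both sides so precomposed and decomposed input compare equal.
        form = unicodedata.normalize("NFC", form)
        target = unicodedata.normalize("NFC", target)
    if position == "final":
        return form.endswith(target)
    if position == "non-final":
        # Any occurrence inside form[:-1] necessarily ends before the last character.
        return target in form[:-1]
    if position == "either":
        return target in form
    raise ValueError(f"unknown position: {position!r}")

print(matches("cântu", "u", position="final"))       # True
print(matches("cânt", "u", position="final"))        # False
print(matches("cântu", "a"))                         # False: 'â' is not plain 'a'
print(matches("cântu", "a", ignore_accents=True))    # True
```

Making position and accent sensitivity explicit parameters keeps the choice with the person asking the question rather than fixing it in the search code.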


Bigger challenge


Access to Data

In the humanities,
• Large amounts of data
• Diverse ways of selecting it

Information Technology
• Has the technology
• May not understand the needs

Need to learn how to apply IT to our discipline effectively


Development Process

Requirements gathering
• Prototypes
• Cycles of propose-and-revise

User testing
• Test versions on web
• User feedback is important

Explore technology
• Changes fast
• Much to learn


Summary

Data will soon be available

You are invited to apply your techniques to the data

Digital data and IT methods permit:
• Widely accessible data
• Flexible searching and custom presentation
• Repeatable processing


Contacts

Sheila [email protected]
Dorin [email protected]
Eric [email protected]

Test sites:
ericwheeler.ca/test
aml.yorku.ca/~ewheeler/test