Language-Sites: Accessing Language Resources via Geographic Information Systems

Dieter van Uytvanck, Alex Dukers, Paul Trilsbeek, Jacquelijn Ringersma (Peter Wittenburg)
MPI for Psycholinguistics
DOBES Endangered Languages Project

Little Background Information

In the MPI Archive we have
• data for professionals in computational linguistics and phonetics, such as the Dutch Spoken Corpus, the Second Learner Corpus, gesture corpora, etc.
• but also data about small languages, anthropological data, etc.

• the users of the latter are mainly
    • linguists, ethnologists, musicologists, ethnobiologists, etc., and
    • speech community members

• overview of the “small languages” in the archive

DOBES Languages

40 language teams from the DOBES program documenting about 60 languages and working independently

MPI Languages

• about 100 researchers at the MPI
• also an increasing number of deposits from external people

User Interests

• researchers have completely different interests compared to HLT
    • non-linguistic influences on language development
    • language contact effects (cognate sets)
    • music systems and relevance of patterns
    • cultural differences in parent-child relations
    • kinship and other relations between persons
    • cultural differences in the relation between “language and thought”
    • etc.

• speech community interests
    • revitalize the language
    • find identity and pass it on to their children
    • document cultural knowledge encoded in language + music
    • get acquainted with modern technology
    • etc.

Standard way of Access in LAT

• the standard way of accessing a large archive is to browse and/or search in a catalogue
• the MPI archive offers the IMDI infrastructure
• such a canonical catalogue needs to be based on predefined classifications by the researcher and organization principles defined by the archivist

• some professionals like it since it is neutral and offers atomic access
• most users find it boring and not functional
• certainly for the speech community this presentation is completely meaningless

metadata browsing & searching

Offering new Views in LAT

1. allow everyone to build his/her own virtual collection, i.e. step away from canonical pre-defined hierarchy

2. allow people to create community portals where metadata queries are used to present the resources in a web-site style

3. allow everyone to access complex objects such as annotated multimedia recordings

4. allow people to start from a semantic conceptual space

5. allow people to start from geographic information

Create own virtual Collections

• recombining and linking metadata descriptions
• result is a new linked structure of nodes

• still the same “boring” style
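
The idea of a virtual collection as "a new linked structure of nodes" can be illustrated with a minimal sketch. It assumes that each metadata description is reachable through some link; the `Node` class, its fields, and the example.org links are hypothetical and are not taken from the IMDI tools.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a user-defined virtual collection (hypothetical structure)."""
    name: str
    # Link to an existing metadata description; None for pure grouping nodes.
    target: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def links(self) -> List[str]:
        """Collect all linked metadata descriptions, depth first."""
        found = [self.target] if self.target else []
        for child in self.children:
            found.extend(child.links())
        return found


# Placeholder links: the archive's descriptions are referenced, not copied.
collection = Node("My collection")
stories = collection.add(Node("Stories"))
stories.add(Node("Session A", target="https://example.org/imdi/session-a"))
collection.add(Node("Session B", target="https://example.org/imdi/session-b"))
print(collection.links())
```

Because the nodes only point at existing descriptions, the same description can appear in any number of personal collections without touching the canonical hierarchy.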

Create Community Portals

• creating “nice” web-sites with categories according to some criteria such as genre
• take care: our genres are not the same as community genres
• basis is a dynamic REST-based query on the metadata registry and properly filled-in metadata

• communities like this and it is maintainable for the archivist
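
How a portal page might be driven by a metadata query can be sketched as follows. This is only an illustration under stated assumptions: the base URL, the query parameter names, and the record fields (`genre`, `title`) are hypothetical and do not describe the actual interface of the metadata registry. The `genre_map` argument reflects the caveat that archive genres and community genres differ.

```python
from collections import defaultdict
from urllib.parse import urlencode


def build_query_url(base_url, **criteria):
    """Build a REST-style query URL from metadata criteria (hypothetical parameters)."""
    # Sorting keeps the URL stable regardless of keyword order.
    return f"{base_url}?{urlencode(sorted(criteria.items()))}"


def group_by_category(records, genre_map):
    """Group record titles under community-defined categories.

    genre_map translates archive genres into community genres; records whose
    genre is missing or unmapped end up under "Other".
    """
    pages = defaultdict(list)
    for record in records:
        category = genre_map.get(record.get("genre"), "Other")
        pages[category].append(record["title"])
    return dict(pages)


# Placeholder values for illustration only.
url = build_query_url("https://example.org/metadata", language="X", genre="narrative")
records = [
    {"title": "Recording 1", "genre": "narrative"},
    {"title": "Recording 2", "genre": "song"},
    {"title": "Recording 3"},  # genre not filled in
]
print(url)
print(group_by_category(records, {"narrative": "Stories", "song": "Music"}))
```

The third record shows why properly filled-in metadata matters: without a genre it can only be placed in a catch-all category.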

Complex Access to Resources

• navigate from resource to resource by using content links
• resources can be annotated media resources, lexicons with multimedia extensions, metadata descriptions, etc.

• nice, but very specific and time consuming (work in progress)

Navigation in Conceptual Spaces

• creating conceptual spaces with semantically meaningful relations
• allow people to navigate in such spaces and jump to detail information in media, lexicons, photos, etc.

• turns out to be very attractive to researchers and community members (work in progress)

Geographic Views

• for many users a GIS view is very attractive
• they like to relate languages and cultures to regions
• combining with other resources (geographical, historical, political, etc.)
• we are creating GoogleEarth overlays (XML -> no dependency on big brother)
• some examples on the following slides

GIS Link to Catalogue Node

• as an appetizer and entry point to the appropriate catalogue node
• then continuation in the IMDI tree
• automatic generation if coordinates are filled in

(from Gunter Senft)
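
The automatic generation of an overlay from filled-in coordinates can be sketched roughly as below. This is a minimal illustration, not the archive's actual generator: the record fields (`name`, `lat`, `lon`, `node_url`), the coordinates, and the example.org links are invented placeholders. The sketch writes KML, the XML format read by Google Earth, using only the Python standard library.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(sites):
    """Return a KML document (as a string) with one placemark per located site."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for site in sites:
        # Records without coordinates cannot be placed on the map; skip them.
        if site.get("lat") is None or site.get("lon") is None:
            continue
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = site["name"]
        # The description carries the link back to the catalogue node.
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = site["node_url"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{site['lon']},{site['lat']}"
    return ET.tostring(kml, encoding="unicode")


# Invented placeholder records, for illustration only.
example_sites = [
    {"name": "Site A", "lat": -8.5, "lon": 151.0, "node_url": "https://example.org/imdi/site-a"},
    {"name": "Site B", "lat": None, "lon": None, "node_url": "https://example.org/imdi/site-b"},
]
print(build_kml(example_sites))
```

Since the result is a plain XML file, it can be opened in any viewer that understands KML, which fits the goal of not depending on a single vendor.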

GIS Link to Complex Resources

• as an appetizer and entry point to complex resources such as annotated media or lexicons

(from Stephen Levinson)

GIS as organization Mechanism

• some researchers have organized their material according to field trips and visited places
• a GIS overlay gives easy links to all steps
• from there, links to the IMDI nodes

(from Niclas Burenhult)

GIS for anthropological Marks

• anthropologists like to set marks about mythical places, historical events and sociologically relevant material
• combination with material from archeology, for example
• zooming in and out to see geographic relations

(from G. Boden)

GIS as entry points for Communities

• here is an example from the DOBES Beaver team (Canada)
• used to point to toponyms and their etymology, with direct links to resources, web-sites, etc.

(from J. Miller)

GIS as entry point to LR Archives

• could be used to find regional archives with interesting language material
• here the archive at CONICET in Buenos Aires

Other known Usages

• Jamieson: sounds of the world with Apple Hypercard

• CNRS/Quai Branly: explanation of aspects of languages in the world

• WALS (Haspelmath): relating language typology features to regions

• trends to combine geological and time information

• de Vriend: adding coordinates to lexemes for microvariation studies

Pros and Cons

• make GIS view one view on data amongst others but maintain a proper repository structure

• GIS is excellent for geographically oriented overviews
    • almost everyone is used to reading maps
    • equipment is tuned to allow adding coordinates automatically

• GIS methods allow easy visual correlations
    • geographic parameters influencing language contact
    • very easy to see that big swamps hampered influences

• GIS is optimal for bringing data from various disciplines together

• take care that you are not dependent on big brother

Thanks for your attention.

Language Archiving Technology

• LAT to support operations during resource life-time

Phases shown in the overview diagram: preparation, integration, utilization, management framework

Components shown in the overview diagram:
• Shoebox/CHAT/Transcriber (XML)
• ELAN/LEXUS/SYNPATHY: Annotation + Lexicon
• IMDI: Data Organization, Metadata
• LAMUS: Data Uploading and Management
• Access Management
• Data Archiving and Copying
• IMDI / GIS: Metadata Browsing & Searching
• ANNEX/LEXUS/IMEX/TROVA: Complex Access via Web
• ODIT/ISOcat: Ontology
• ADDIT/VICOS/MEL: Enrichments/Views
• Archive Grid / Federation