The European Language
Resources Coordination –
ELRC 2nd Conference
Prof. Josef van Genabith, Dr. Andrea Lösch
German Research Center
for Artificial Intelligence (DFKI)
• How do we bridge language gaps?
• How do we make sure that information does not stay in silos?
• How do we make sure that nobody is discriminated against because of language?
• How do we treat all languages in the same way?
• Translation!
• Human translation
• Machine translation
• MT supports HT and is sometimes also used on its own
Translation
Human languages are complex!
• Human languages are:
– Elegant
– Efficient
– Flexible
– Complex
• One word/sentence may mean many
things
• Many ways of saying the same thing
• Meaning depends on context
• Literal and figurative language
(metaphor)
• Language and culture (different ways of
conceptualising the same thing)
• Language is complex
• We cannot compute it exactly
• We tried: rule-based LT …
• What do we do?
• Machine Learning
– Learns from data
– Approximate solution, not a perfect one
• Robust
• Scalable
• Translation everywhere
• Industry
• Culture
• Travel
• Education
• The EC is a prime producer and consumer of translation
• One of the largest translation operations on the planet
• Long-term and expert user of MT in public service
• State-of-the-art SMT based on EU funded
research: the Moses SMT system
Translation, the EC and Beyond
Why ELRC…?
• EC has decided to expand translation to the needs of the
public services in the member states
• Automated Translation platform of the Connecting
Europe Facility (CEF AT) to facilitate multilingual
communication and exchange of documents in key
public administration scenarios:
Consumer rights, health, public procurement, social security,
culture, justice.
Public online services: Open Data Portal, Europeana, Online
Dispute Resolution, eJustice etc. (Digital Service Infrastructures, DSIs, of CEF …)
• EC has good data for its own needs: EU parliamentary debates,
EU laws etc. …
• It doesn’t have the right kind of data for the needs of national
public services and the DSIs
• Training on Harry Potter and translating weather reports …!
• It needs the right kind of data!
• Who has the best data for their needs?
• The national public services of the member states!
• ELRC: working for the EC with the national public services to
obtain this data for the EC to provide MT services back to
national public services and CEF DSIs
Who is ELRC?
• The ELRC Consortium
– German Research Center for Artificial Intelligence (DFKI) – Josef
van Genabith, Andrea Lösch
– Evaluations and Language Resources Distribution Agency
(ELDA) – Khalid Choukri
– TILDE – Andrejs Vasiljevs
– ILSP (Institute for Language and Speech Processing) – Stelios
Piperidis
• PLUS:
– 30 ELRC Technological National Anchor Points (NAPs), one per CEF-affiliated country
– 30 ELRC Public Services NAPs, one per CEF-affiliated country
– Legal advisors (e.g. iRights)
ELRC Workshops
• Past workshops:
– 24.09.15 Greece
– 29.09.15 Germany
– 05.08.15 Latvia
– 23.11.15 Hungary
– 01.12.15 Cyprus
– 08.12.15 Slovenia
– 15.12.15 Czech Republic
– 26.01.16 Spain
– 28.01.16 Ireland
– 11.02.16 Estonia
– 19.02.16 Finland
– 24.02.16 Lithuania
– 26.02.16 Malta
– 01.03.16 Portugal
– 07.03.16 Denmark (Copenhagen)
– 09.03.16 Poland
– 10.03.16 Sweden
– 15.03.16 Italy
– 18.03.16 Bulgaria
– 23.03.16 Romania
– 14.04.16 Slovakia
– 15.04.16 Austria
– 19.04.16 Netherlands
– 21.04.16 Croatia
– 11.05.16 France
– 08.06.16 Norway
– 14.06.16 Luxembourg
ELRC Workshops
• Localized workshops in each of the 30 participating
countries
• Target audience: National public service administrations
• Goals:
– To raise awareness about the importance of language data held
by public administrations for public administrations
– To understand the needs of national public service
administrations with regard to automated translation
– To jointly identify relevant sources of multilingual language
resources
– To discuss any technical and legal issues involved in the use of
data for automated translation
ELRC Helpdesk
• Continuous support for data contributors
• Accessible through
– ELRC website: http://www.lr-coordination.eu/helpdesk
– Phone, Skype, Email
• Services:
– Answering any technical questions related to the use, production,
collection, processing, and sharing of language resources.
– Answering any legal questions related to the use, production,
collection, processing, and sharing of language resources.
• Response times:
– 24 hours (simple query)
– Up to 5 days (complex query)