49
META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER (grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119), CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899). The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany [email protected] Lisbon, Portugal, July 04/05, 2016

The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany [email protected]

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER (grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119), CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).

The Strategic Agenda for the Multilingual Digital Single Market V0.9

Georg RehmMETA-NET General Secretary

DFKI, [email protected]

Lisbon, Portugal, July 04/05, 2016

Page 2: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

q  Top priority in the European Union.

q  Expected to add 400b€ to European GDP and hundreds of thousands of new jobs.

q  Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).

Page 3: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de
Page 4: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de
Page 5: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Andrus Ansip’s Blog Post

q  First public acknowledgment of the EC that the language topic is of very high relevance for the Digital Single Market.

q  “Overcoming language barriers is vital for building the DSM, which is by definition multilingual. It is now time to reduce and remove the language barriers that are holding back its advance, and turn them into competitive advantages.”

q  The door is open now.

http://www.meta-net.eu 5

Page 6: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

http://www.meta-net.eu 6

q  We need a Strategic Agenda for the Multilingual Digital Single Market.q  Multilingual Services and Multilingual Applications.q  Inherent component: EU data economy – LT for multilingual data value chains.

Page 7: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Language as a Data Type

q  Language technology is a necessary ingredient of the multilingual DSM and mandatory enabler for the European data economy.

q  Big Data is never only numerical – there’s always a language component: unstructured text content, column heads, metadata etc.

q  Without language technology, Big Data analytics won’t happen.

q  The EU Data Economy needs Multilingual Big Data Content Analytics and Multilingual Big Data Content Generation.

http://www.meta-net.eu 7

Unstructured data

Language Technology

Structured dataHeterogeneous data Homogeneous data

Big data KnowledgeUnorganised data Organised data

Multilingual big data Crosslingual analytics

Page 8: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

q  Overall goal: “deliver new Big Data technology allowing for deep analytics capacities on data-at-rest and data-in-motion while providing sufficient privacy guarantees, optimized user experience support and a sound data engineering framework.”

q  “In Europe, text-based data resources occur in many different languages […].”

q  “This multilingualism of data sources makes it often impossible to use existing tools and to align available resources, because they are generally provided only in the English language.”

q  “Thus, the seamless aligning of data sources for data analysis or business intelligence applications is hindered by the lack of language support and availability of appropriate resources.” (p. 23)

http://www.meta-net.eu 8

Page 9: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

BDVA SRIA V2.0: Challenges and Needsq  BDVA SRIA Technical Priority “Data Management”:

§  Tools for handling unstructured and semi-structured data for different languages.§  Annotation frameworks for integration of annotation technologies and data formats. §  Techniques for semantic interoperability such as standardised data models and

interoperable architectures for different sectors.§  Standards and multilingual knowledge repositories that allow the seamless linking of data.

q  BDVA SRIA Technical Priority “Data Analytics”: §  Improved, more accurate statistical models, especially with regard to semantic analysis. §  Deep learning, contextualisation, machine learning, NLP, smart data analytics and real-

time semantic analysis, including event and pattern discovery. §  Methods for unstructured multimedia analytics and data mining, linking

algorithms to deliver cross-domain and cross-sector intelligence.

q  BDVA SRIA Technical Priority “Data Processing”: §  Real-time analytics and event processing of highly heterogeneous

data sources and formats§  Processing, linking, aligning data sets with one another, including

semantic representations, unstructured, semi-structured and structured data, and multimedia data etc.

§  Knowledge extraction out of heterogeneous data sets.§  Special emphasis on quality, precision, robustness

http://www.meta-net.eu 9

Page 10: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

New Version of the SRIA

q  SRIA V0.5 unveiled at META-FORUM 2015.q  SRIA V0.9 unveiled at META-FORUM 2016.q  Goal is to fully align V1.0 with BDVA SRIA V2.0/V3.0.q  Prepared and presented by Cracking the Language

Barrier federation (editorial team: 13 colleagues).q  SRIA addresses how the LT community is going

to act united in order to make the DSM multilingual.q  Framework constraints are straightforward (2018-2020).q  Document available on http://www.cracker-project.eu

and also on http://www.cracking-the-language-barrier.eu.

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFT

DRAFTStrategic Agenda for the

Multilingual Digital Single Market

Technologies for Overcoming Language Barriers towardsa truly integrated European Online Market

DRAFT

Version 0.5 – April 22, 2015

Page 11: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Strategic Research and Innovation Agenda

Language as a Data Type and Key Challenge for Big Data

Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing

and curating natural language content

SRIA Editorial Team

Version 0.9 – July 2016

http

://w

ww.

crac

ker-p

roje

ct.e

uhttp://w

ww.cracking-the-language-barrier.eu

Page 12: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Value Programme

q  Multilingual Value Programe (MLV Programme)§  Highly focused three-year programme§  Requires small and modest investment

q  Three components address the main needs of the Multilingual DSM and how to put them into practice:1.  Multilingual Application Areas2.  Multilingual Services3.  Research

http://www.meta-net.eu 12

Strategic Research and Innovation Agenda

Language as a Data Type and Key Challenge for Big Data

Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing

and curating natural language content

SRIA Editorial Team

Version 0.9 – July 2016

Page 13: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

High-Level Goals and Needs

q  Crosslingual communication for SMEs, public institutions, citizensq  Crosslingual SME presales communication and aftersales servicesq  Multilingual (big) data, language and knowledge value chainsq  Multilingual websites, product catalogues and product descriptionsq  Multilingual knowledge bases and knowledge graphsq  Multilingual voice interfaces for connected devicesq  Crosslingual business intelligenceq  Crosslingual social media analytics for EU-wide societal issuesq  Multilingual text and report generation from big data sourcesq  All services must be domain-adaptable (no one size fits all)q  Translation Centre – high-quality automated translation for all

http://www.meta-net.eu 13

Page 14: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 15: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Institutions Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 16: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 17: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programmeen

able

make use of

Page 18: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Institutions Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 19: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 20: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 21: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

We can start with the

Multilingual Services.

Page 22: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

No need to start with

and wait for Research

(due to previous EC investments)

Page 23: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

But: We need continuous

research efforts to support

Services and Applications!

Page 24: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Activities on all three layers can be

started or continued at the same time.

Page 25: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

RAs

RIAsRAs

RIAs

CSAsfor MLV

planning purposes

Projects need to address

at least two of the layers

Page 26: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

CSA (business models, community building, interoperability, standardisation, testing and evaluation, communication with stakeholders etc.)

CSA (business models, community building, interoperability, standardisation, testing and evaluation, communication with stakeholders etc.)

CSA (research infrastructures, methods, datasets, community building etc.)

CSA

Targeted support

through several CSAs.

Page 27: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

RESTful services,

cloud services, SaaS;

integratable components

Page 28: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Start with a small set of

mission-critical services

Page 29: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Initial services need to be

able to scale organically into

one or more bigger platforms.

Page 30: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Important: provide flexibility by

stimulating a highly innovative

ecosystem that enables the

emergence of more complex

sets of services and platforms.

Organic emergence of multiple, de-centralised platforms(high performance and quality, reliability, scalability, privacy)

Page 31: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Decentralised and decoupled:

•  interoperable services to be developed independently

•  platforms grow organically

•  enabler for an agile multilingual value ecosystem

•  driven by innovative research and interoperability

Page 32: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

After the MLV Programme,

the Multilingual Services need

to become independent,

profitable and self-sustainable.

Page 33: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Multilingual Applications

are meant to include a

commercial component

right from the start.

Page 34: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

From projects

to spinoffs,

startups or

products.

Spinoff, startup or product (B2B)

Spinoff, startup or product (B2B)

Spinoff, startup or product (B2B)

Page 35: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV ProgrammeSpinoff, startup, or product (B2B, B2C)

Spinoff, startup, or product (B2B, B2C)

Spinoff, startup, or product (B2B, B2C)

From projects

to spinoffs,

startups or

products.

Page 36: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

From services to platforms to startups/spinoffs?

Maybe! However: huge, non-trivial

design and planning overhead!

Spinoffs or startups (B2B)

Emerging platforms – assemble services, technologies, resources, datasets …

MLV Programme

Page 37: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Fully adopt hybrid research,

i.e., tight and integrated loop of

research, development, operations

(early testing, short cycles)

Page 38: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Still to be discussed:

priorities and concrete features

Page 39: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 40: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Application Areas

q  Multilingual E-commerce§  Customer-facing vs. back-office facing (after-market, after-sales)§  Crosslingual search, CRM, helpdesks, processes, workflows§  Semantic, crosslingual product descriptions and catalogues§  Online dispute resolution

q  Multilingual Content, Media, Verticals§  Content analytics, curation, generation (incl. authoring support)§  Multimodal communication (speech, written, IoT)§  Vertical domains: health, government, mobility, energy, legal.

q  Translation, Language, Knowledge, Data§  Translation Centre – written/spoken, automatic/human§  Crosslingual public and social intelligence, business intelligence§  HQ resources, under-resourced languages, domain-specific LRs

Page 41: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Roadmap (excerpt)q  2018: Technologies and workflows for Multilingual CMSsq  2018: Domain-specific multilingual vocabularies, product cataloguesq  2018: Multilingual, multimodal text, data, media analyticsq  2018: Tools, systems, workflows for bridging human translation and MTq  2018: Integrating multilingual technologies into CRM systemsq  2019: Multilingual, crosslingual search engines and product aggregationq  2019: Integration of content and data across modalitiesq  2019: Product descriptions for cross-lingual product discoveryq  2019: Generation of reports based on Big Data or Linked Dataq  2019: Analysing user feedback for cross-lingual communication in CRMq  2019: Intelligent cross-lingual authoring and enrichment of contentq  2019: HQ MT for many languages, subject fields and text typesq  2020: Automatic translation of online shopsq  2020: Unified, multilingual customer experienceq  2020: Support EC’s ODR platform through cross-lingual technologiesq  2020: Repurposing of media content across languagesq  2020: Content curation services for new business models in mediaq  2020: Semantic interoperability of data sourcesq  2020: Structured data analysis for automatic multilingual text generationq  2020: Automatic localisation and translation of ecommerce text types

Page 42: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 43: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Roadmap (excerpt)q  2018: Large-scale dynamic, multilingual knowledge graphs including LOD and

domain-specific vocabularies, taxonomies, ontologies, data setsq  2018: Ease re-use of linguistic resources in all parts of the data value chain

across languages and sectors.q  2018: Basic monolingual technologies and resources for Multilingual Europeq  2018: Next generation LR/LT services to cope with current requirements q  2019: Scalable creation, discovery and exploitation of multilingual public sector

information data sets for re-use of PSI across languages and countriesq  2019: Self-contained, adaptable, flexible, services for generic, pluggable,

configurable data, language and knowledge services that can grow into a larger ecosystem, also including Big Data; wide range of services, e.g., basic low-level technologies such as POS tagging and high-level (combined) ones such as MT including special terminology and human post-editing, generation of spoken usage instructions, or email classification by sentiment and enrichment with background information.

q  2019: Allow for joint exploitation of public and private data sourcesq  2019: Automatise the creation of data needed for multilingual and cross-lingual

semantic annotation scenarios in a scalable and sustainable mannerq  2020: Generate rich, linked knowledge resources for multimodal and multilingual

repurposing of heterogeneous content for different challenges, natural languages, and audiences (including linking resources, visual story generation from multimodal data, semantic user profiles). Linked Data to create a unified information space by bringing together heterogeneous data including product data, customer data, and social data.

q  2020: Multilingual technologies for Multilingual Europe, especially for under-resourced languages

Page 44: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Multilingual Digital Single Market

Automated Translation

E-Commerce Content, Media, Verticals

Translation, Language, Knowledge, Data

Knowledge and Data Repositories

Multilingual Applications

Multilingual Services

ResearchCrosslingual

Big Data Language Analytics

Meaning, Semantics, Knowledge

High-Quality Machine

Translation

SMEs CEF DSIs IT Integrators Researchprovide innovative

applications

fills gaps

H2020 RIAs

H2020 CSAs, IAs, RIAs

H2020 CSAs, RAs, national funding

Multimodal Interaction

Language Processing, Analysis and Production – Language Resources

Citizens Public Business

interoperable and standardised

collaboration with member states

Conversational Technologies

MLV Programme

Page 45: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Three Phases: 2018–2020q  Pre-MLV (2016/2017): stakeholder discussions and consensus building; finalisation of

strategy and roadmap; selection and prioritisation of topics; small prototype projects for proofs of concept; CSAs for planning, coordination, support, community building.

q  MLV Phase 1 (2018): first Multilingual Services; conceptualise Multilingual Applications; integration of services into applications; start and continue activities in the priority research themes

q  MLV Phase 2 (2019): extension of Multilingual Services (coverage, quality, precision); standardisation activities; deployment of first applications; business models; continue research activities

q  MLV Phase 3 (2020): extension of Multilingual Services (incl. standardisation); deployment of Multilingual Applications; transformation of projects into sustainable entities; continue research

q  Post-MLV (2021+): Scaling up and extending Multilingual Applications and Multilingual Services; expanding language and domain coverage; going beyond Europe, penetrating other markets; exploration of novel research strands etc.

http://www.meta-net.eu 45

Page 46: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Setup – Timeframe – Costs

q  Close collaboration with BDVA, EC, EP and all stakeholders (including SMEs, research centres, universities, NGOs etc.).

q  Mix of funding sources: §  H2020 ICT LEIT (2018-2020) for EU projects (RA, RIA, CSA)§  Connecting Europe Facility, CEF AT – role to be discussed§  National/regional funding sources for work on individual LTs and

LRs and also to support and grow SMEs working in this area

q  Estimated costs for basic MLV implementation: 175-200M€ §  Set of mission-critical services and applications §  Timeframe: 2018, 2019, 2020§  Includes 20% industry contribution

http://www.meta-net.eu 46

Page 47: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

q  Published in early 2013.

q  First strategic research agenda for our field.

q  Complex process of collecting and shaping technology visions.

q  Hundreds of researchers participated.

q  Broad topics around multilingual Europe in general.

Page 48: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Timeline and Next Steps

a)  extend the Cracking the Language Barrier federation (bring more members on board, including, crucially, BDVA);

b)  discuss V0.9 of the MDSM SRIA within LT community and also with BDVA and the EC;

c)  discuss with EC potential role of CEF AT in MLV Programme;d)  specify concrete set of research results and services; prioritise

needed applications, services, research areas;e) MDSM SRIA V1.0 to be finalised by Sept./Oct. 2016;f)  get LT back on the radar of the EU as well as into the Work

Programme 2018-2020 – and also beyond.

http://www.meta-net.eu 48

Page 49: The Strategic Agenda for the Multilingual Digital …The Strategic Agenda for the Multilingual Digital Single Market V0.9 Georg Rehm META-NET General Secretary DFKI, Germany georg.rehm@dfki.de

Thank you.

[email protected]

http://www.meta-net.euhttp://www.facebook.com/META.Alliance

49

Strategic Research and Innovation Agenda

Language as a Data Type and Key Challenge for Big Data

Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing

and curating natural language content

SRIA Editorial Team

Version 0.9 – July 2016