The Future is All Mine
Text and Data Mining Projects in Europe
@openminted_eu @futuretdm
Funded by:
Text and data mining is the future
“Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns.” (JISC)
Text and data mining helps us understand the past
Mining historical books: the evolution of language
Source: http://www.sciencemag.org/content/331/6014/176 (Baylor College of Medicine, Houston)
Text and data mining predicts the future
Mining newspapers: predicts revolutions
Source: http://journals.uic.edu/ojs/index.php/fm/article/view/3663/3040 (University of Illinois)
Text and data mining saves the future
Mining scientific publications about diseases: save lives
Source: http://dl.acm.org/citation.cfm?id=2623667 (Baylor College of Medicine, Houston)
Text mining – it seems so easy:
STAGE 1: Information Retrieval
STAGE 2: Linguistic Analysis: Entity Recognition
STAGE 3: Information Extraction
STAGE 4: Data Mining, Knowledge Discovery
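To make the four stages named on this slide concrete, here is a minimal Python sketch that runs them end to end on a toy corpus. Everything in it is an assumption made for illustration: the corpus, the entity dictionary, and the function names (`retrieve`, `recognise_entities`, `extract_pairs`, `mine`) are hypothetical and are not taken from OpenMinTeD, FutureTDM, or any real mining platform. A dictionary lookup stands in for a trained entity recogniser.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real project would pull documents from a repository.
CORPUS = {
    "doc1": "Aspirin reduces fever. Aspirin can irritate the stomach.",
    "doc2": "Ibuprofen reduces fever and pain.",
    "doc3": "The library opened a new reading room.",
}

# Hypothetical entity dictionary standing in for a trained recogniser.
ENTITIES = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "fever": "SYMPTOM",
    "pain": "SYMPTOM",
    "stomach": "ANATOMY",
}


def retrieve(corpus, query):
    """Stage 1, information retrieval: keep documents that mention the query."""
    return {doc_id: text for doc_id, text in corpus.items()
            if query.lower() in text.lower()}


def recognise_entities(text):
    """Stage 2, linguistic analysis: split sentences, tokenise, tag known entities."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    tagged = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        tagged.append([(tok, ENTITIES[tok]) for tok in tokens if tok in ENTITIES])
    return tagged


def extract_pairs(tagged_sentences):
    """Stage 3, information extraction: (drug, symptom) pairs within one sentence."""
    pairs = []
    for entities in tagged_sentences:
        drugs = sorted({tok for tok, label in entities if label == "DRUG"})
        symptoms = sorted({tok for tok, label in entities if label == "SYMPTOM"})
        pairs.extend((d, s) for d in drugs for s in symptoms)
    return pairs


def mine(pairs):
    """Stage 4, data mining: count how often each pair occurs across documents."""
    return Counter(pairs)


def run_pipeline(corpus, query):
    """Chain the four stages over every retrieved document."""
    all_pairs = []
    for text in retrieve(corpus, query).values():
        all_pairs.extend(extract_pairs(recognise_entities(text)))
    return mine(all_pairs)


if __name__ == "__main__":
    for pair, count in sorted(run_pipeline(CORPUS, "fever").items()):
        print(pair, count)
```

Each stage only consumes the output of the previous one, which is why the pipeline looks easy on a slide. In practice every stage hides its own questions about data access, formats, and method choice.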
But it actually poses many challenges…
Where do I find data?
How do I make my texts readable by machines?
Which mining method to use?
Current Barriers in Europe
Awareness across Institutions & Stakeholders
- Lack of awareness among research communities
- Lack of guidance to uncover TDM potential
Skills and Tools
- Availability and accessibility across disciplines
- Gap in skills across various sectors
Licensing & Open Access
- License proliferation and interoperability issues
- License barriers to transparent open access
Copyright and Data Protection
- TDM activities infringing current copyright laws
- Legal and policy limitations and barriers for TDM
EU PROJECTS on TDM
FutureTDM: Identify TDM barriers and policy solutions
OpenMinTeD: Build a TDM eInfrastructure
Main Objectives of FutureTDM
ASSESS existing studies, legal regulations and policies on TDM
ANALYSE current application areas and best practices in TDM
INVOLVE all key stakeholders to identify practices, requirements, and specific challenges
INCREASE awareness of TDM to attract new target groups and science domains
BUILD a website: a Collaborative Knowledge Base and an Open Information Hub combined
ELABORATE a legal and policy framework for future TDM and specify a research agenda to foster the spread of TDM
This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No 665940.
FutureTDM
Bottom-up approach: stakeholder workshops and knowledge cafes throughout Europe
Objective of OpenMinTeD
Layer 1: Interoperability of text mining services (platforms or components)
- Mining platforms, including proprietary architectures
- Platform services: Registry, Workflow Management, Auth2 & Policy management, Annotator, Accounting
Layer 2: Interoperability of language resources & corpora
- Language resources and corpora registry service
- Language resources
- Text corpora: Publisher, OpenAIRE/CORE, PMC, other text corpora, other types of text corpora
Layer 3: Interoperability to shared storage and computing resources
- Data centres, including data centres in public cloud
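The idea behind Layer 1, text mining components from different platforms that can be registered, looked up, and chained into a workflow, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Registry` class, the `run_workflow` helper, and the two toy components are hypothetical names invented here, and none of them reflects the actual OpenMinTeD interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a document is a plain dict, and every component takes a
# document and returns it with additional annotations attached.
Document = Dict[str, object]
Component = Callable[[Document], Document]


@dataclass
class Registry:
    """Hypothetical in-memory registry mapping service names to components."""
    _services: Dict[str, Component] = field(default_factory=dict)

    def register(self, name: str, component: Component) -> None:
        self._services[name] = component

    def lookup(self, name: str) -> Component:
        return self._services[name]

    def names(self) -> List[str]:
        return sorted(self._services)


def tokenizer(doc: Document) -> Document:
    """Toy component: add whitespace tokens to the document."""
    return {**doc, "tokens": str(doc["text"]).split()}


def token_counter(doc: Document) -> Document:
    """Toy component: count the tokens produced by an earlier step."""
    return {**doc, "token_count": len(doc["tokens"])}


def run_workflow(registry: Registry, steps: List[str], doc: Document) -> Document:
    """Look up each named step in the registry and apply the steps in order."""
    for name in steps:
        doc = registry.lookup(name)(doc)
    return doc


if __name__ == "__main__":
    registry = Registry()
    registry.register("tokenizer", tokenizer)
    registry.register("token_counter", token_counter)
    print(registry.names())
    print(run_workflow(registry, ["tokenizer", "token_counter"],
                       {"text": "text and data mining"}))
```

Because both components share one input and output shape, the workflow can combine them without knowing which platform they came from. That shared shape is the interoperability property the layer is meant to provide.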
OpenMinTeD brings together:
ACCESSIBLE CONTENT: Via standardised programmatic interfaces and access rules
DISCOVERABLE SERVICES: Easily discoverable text mining services and workflows which process, analyse and annotate text
EFFICIENT PROCESSING: Operate on public e-Infrastructures via standardised APIs
TDM COMMUNITIES: Different scientific communities have different challenges
VALUE ADDED APPS: Community-driven applications to illustrate the value of the infrastructure. Engage with industry.
OPENMINTED = The Open Mining Infrastructure for Text and Data
Become involved
Follow us on Twitter for the latest updates and blogs: @openminted_eu, @futuretdm
Follow our websites: www.openminted.eu, www.futuretdm.eu
THANK YOU
PARTNERS OPENMINTED
• Athena RIC
• Univ. of Manchester (NaCTeM)
• Univ. of Darmstadt
• INRA
• EMBL-EBI
• Agro-Know
• LIBER
• Univ. of Amsterdam
• Open University UK
• EPFL
• CNIO
• Univ. of Sheffield (GATE)
• GESIS
• GRNET
• Frontiers
• Univ. of Stirling
PARTNERS FUTURETDM
• SYNYO GmbH (SYNYO)
• LIBER Europe
• Open Knowledge Foundation LBG (OK/CM)
• Radboud Univ. Nijmegen
• The British Library Board
• Univ. of Amsterdam
• Athena RIC
• Ubiquity Press
• Fundacja Projekt: Polska (FPP)