Upload
martin-kaltenboeck
View
329
Download
2
Tags:
Embed Size (px)
Citation preview
BIG DATA EUROPEHTTP://WWW.BIG-DATA-EUROPE.EU/
Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges
CMG-AE Tagung
Big Data: Strategien, Technologien und Nutzen
19. Mai 2015, Expat Center der Wirtschaftsagentur Wien
Semantic Web Company
(SWC)SWC was founded 2001, head-quartered in Vienna
25 experts in linked data technologies
Product: PoolParty Semantic Suite (launched 2009)
Serving customers from all over the world
EU- & US-based consulting services
Semantic Web Company
(SWC)Some of our Customers
● Credit Suisse
● Boehringer Ingelheim
● Roche
● Wolters Kluwer
● BMJ Publishing Group
● Red Bull Media House
● Canadian Broadcasting Corporation (CBC)
● Pearson
● Council of the EU
● DG Environment, EC
● Healthdirect Australia
● Ministry of Finance (Austria)
● World Bank Group
● Inter-American Development Bank (IADB)
● International Atomic Energy Agency (IAEA)
● Buildings Performance Institute Europe
(BPIE)
● Renewable Energy & Energy Efficiency P
(REEEP)
● Global Buildings Performance Network
(GBPN)
● American Physical Society
● Education Services Australia (ESA)
Finance / Automotive / Publisher / Health Care / Public Administration / Energy /
Education
Selected Partners● EBCONT
● EPAM Systems
● iQuest
● PwC
● Tenforce
● OpenLink Software
● Ontotext
● MarkLogic
● Gravity Zero
● Altotech
● Wolters Kluwer
● Taxonomy Strategies
● Digirati
● Fraunhofer (IAIS)
● University of Leipzig
(INFAI)
● The Open Data Instizute
We all have one goal in mind: Make machines smart enough so that
they can help us to find those needles in the haystack, which are
really relevant to us.
The Motivation – Big Data
Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the
world today has been created in the last two years alone.
This data comes from everywhere: sensors used to gather climate information, posts to
social media sites, digital pictures and videos, purchase transaction records, and cell phone
GPS signals to name a few.
This data is big data. Source:
IBM
Big Data Dimensions
Volume
Velocity
Variety
100010101010101010101010
010100101010101010010100
101010010100101010010100
101001010100101010100101
010001010101010010101010
101010101001010101010101
001010101110001010101010
101010101001010010101010
101001010010101001010010
101001010010100101010010
101010010101000101010101
00101010101
1000101010
1010101010
1010010100
1010101010
1001010010
…….
………….
……
………..…….
.
……………
1 0
1000101010
1010101010
1010010100
1010101010
100101oo11
Veracity!
Big Data Dimensions
Big Data in Europe: Challenges, Opportunities
HealthClimateEnergyTransportFoodSocietiesSecurity
Loremipsumdolors
KSDJOPSCKKSDKAB
LKASJL
LAWWD
S
wpweppe
pwpisio
we
10101001101010010101
0
Regional Data Repositories10101001101010010101
0
10101001101010010101
0
#2: Interlink, Centralise Access, Explore
101010100101010101001011010001010101010010101010100001011010001010101010010101010100100101010100101010101001011010
001010
Data Eleme
nt
#3: Analyse, Discover, Visualize#4: Mashup, Cross-domain Exploitation
Journalists Authorities
Big Data in Europe: Obstacles
19-mai-15
#1 Big Data “Variety“ problem Multiple Data Sources
Required: Integration, Harmonisation
#2 Opening-up Data concerns Loss of control, lack of tracking
Reservations about large corporations
#3 Limited Skills, Training, Technology
Lack of Data Scientists
Lack of Generic Architectures, components
Big Data in Europe: Obstacles
19-mai-15
Extraction, Curation Quality, Linking, Integration
Publication, Visualization, Analysis
Extraction, Curation, Quality, Linking, Integration, Publication,
Visualization, Analysis
Health
Transport
Security
Extraction Curation Quality Linking Integration Publication Visualization Analysis
Data Repositories Linked Open Data Cloud
Stage 1
Stage 2
Stage 3
Food SocietiesClimate Energy
Rationale
Show societal value of Big Data
Lower barrrier for using big data technologies
o Required effort and resources
o Limited data science skills
Help establishing cross-
lingual/organizational/domain Data Value
Chains19-mai-15
Rationale
COORDINATION
Stakeholder Engagement
(Requirements Elicitation)
SUPPORT
Design, Realise, Evaluate
Big Data Aggregator
Platform
Create and Manage Societal
Big Data Interest Groups
Cloud-deployment ready
Big Data Aggregator
Platform
CSA
Measures
Results
Summary
Two clearly defined coordination and support measures:
Coordination: Engaging with a diverse range of stakeholder groups representing particularly
the Horizon 2020 societal challenges Health, Food & Agriculture, Energy, Transport, Climate,
Social Sciences and Security; Collecting requirements for the ICT infrastructure needed by data-
intensive science practitioners tackling a wide range of societal challenges; covering all aspects
of publishing and consuming semantically interoperable, large-scale data and knowledge assets;
Support: Designing, realizing and evaluating a Big Data Aggregator platform infrastructure
that meets requirements, minimises disruption to current workflows, and maximises the
opportunities to take advantage of the latest European RTD developments (incl. multilingual data
harvesting, data analytics & visualisation).
BigDataEurope will implement and apply two main instruments to successfully realize these
measures:
Build Societal Big Data Interest/Community Groups in the W3C interest group scheme &
involving a large number of stakeholders from the Horizon 2020 societal challenges as well as
technical Big Data experts;
Design, integrate and deploy a cloud-deployment-ready Big Data aggregator platform
comprising key open-source Big Data technologies for real-time and batch processing, such as
Orthogonal Dimensions of Big Data
EcosystemsGeneric Big Data Enabling Technologies
Data Value Chain
Data Generation & Acquisition
Data Analysis & Processing
Data Storage & Curation
Data Visualization &
Usage
Data-driven Services
So
cie
tal
Ch
all
en
ge
s
Do
ma
in S
pe
cifi
c D
ata
Ass
ets
& T
ech
no
log
y
Healthcare
Food Security
Energy
Intelligent Transport
Climate & Environment
Inclusive & Reflective Societies
Secure Societies
BDE Stakeholder Engagement Approach &
Activities
BDE Community Tools – JOIN IN NOW !
• Website: news, events, community, …
• 7 x BDE W3C Community Groups
• 7+1x Mailing Lists
• 7 x SC Workshops/Year = 21 Workshops
• Full set of communication tool-set…
Future Outlook
• BDE Aggregator Platform
• For download / internal use
• Cloud Version
• Big Data Technology Support Tools
Domains, Focus Areas & Data
AssetsSocietal Domain Preliminary Big Data Focus area Selected Key Data assets
Life Sciences &
Health
Heterogeneous data Linking &
integration
Biomedical Semantic Indexing & QA
ACD Labs / ChemSpider, ChEBI, ChEMBL, Con-ceptWiki, DrugBank, EN-
ZYME, Gene Ontology, GO Annotation, Swis-sProt, UniProt, Wik-iPathways,
PubMed, MeSH, Disease Ontology (DO), Joint Chemical Dic-tionary
(Jochem), Bio-ASQ datasets
Food &
AgricultureLarge-scale distributed data integration
INFOODS, AQUASTAT Green Learning Network (GLN), Agricultural
Bibliography Network (ABN), AGRIS, AquaMaps, Fishbase
Energy
Real-time monitoring, stream
processing, data analytics, and
decision support
European Energy Exchange Data, smart meter measurement data,
gas/fuels/energy market/price data, consumption statistics, equipment
condition monitoring data)
TransportStreaming sensor network & geo-
spatial data integration
GTFS data, OSM/ LinkedGeoData, MobilityMaps, Transport sensor data,
ROSATTE Road safety attributes, European Road Data Infrastructure -
EuroRoadS
ClimateReal-time monitoring, stream
processing, and data analytics.
European Grid Infrastructure (EGI), Databases hosting atmospheric data.
Several software frameworks for simulation, calibration and reconstruction.
Social SciencesStatistical and research data linking &
integration
Federated social sciences data catalogs, statistical data from public data
portals and statistical offices (e.g. EuroStats, UNESCO, WorldBank)
Security
Real-time monitoring, stream
processing, and data analytics.
Image data analysis
Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired
from commercial providers and governmental systems) and collateral data
for supporting CFSP/CSDP missions and operations, Databases hosting
atmospheric Data. Experimental and simulation data concerning dispersion
of hazardous substances
Work Packages & Implementation
Phases
Community Building
M1-M12 M13-M24 M25-M36
Enabling Technologies
Component Integration
Uptake
Integrator Deployment
Community Assessment
WP3 – Big Data Generic Enabling Technologies & Architecture
WP5 – Big Data Integrator Instances
WP7 – Dissemination & Communication
WP2 – Community Building & Requirements
WP4 – Big Data Integrator Platform
WP6 – Real-life Deployment & User Evaluation
Blueprint of the Data Aggregator
Platform
Batch Layer
Speed Layer
Data Storage
Real-time data &
Transactions …
Batch View
Real-time
View
me
ssa
ge
pa
ssin
g
message passing
Applications & Showcases
Real-time dashboards
Domain-specific BDE apps
Big Data Analytics
In-stream Mining
BD
E P
latfo
rm&
Inte
lligence
Input data
Stream
Spatial
Social
Statistical
Temporal
Transaction
al
Imagery
+ Semantic Layer (Retaining Semantics using LD
approach )
Lambda Architecture
Announcements….
Free: European Data Economy WorkshopFocus Data Value Chain & Big Data, 15.9.2015 WU ViennaREGISTER NOW: bit.ly/Data-Economy-WS-SEMANTiCS2015
Your Questions please….
www.big-data-europe.eu
Martin Kaltenböck, [email protected]
Semantic Web Company GmbH
Mariahilfer Strasse 70/8, A-1070 Vienna
+43-1-4021235
http://www.semantic-web.at
http://www.poolparty-software.com
http://slideshare.net/semwebcompany
http://youtube.com/semwebcompany
Slides: CC BY-NC-SA (except otherwise stated)
19-mai-15#BigDataEurope