BIG DATA EUROPE H2020 CSA (2015-17)
BDE PILOT INSTANTIATION
Ronald Siebes, VU Amsterdam
Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges
09.12.2016

SC1 Workshop 2 Pilot instantiations


www.big-data-europe.eu

The 7 Societal Challenges and their first pilots


SC1: Life Sciences & Health


Partners:

A not-for-profit membership organization, which supports and continues the development of the information infrastructure created during the Open PHACTS project of the Innovative Medicines Initiative (IMI).

The VU Amsterdam was a key participant in the Open PHACTS project, responsible for developing the Linked Data infrastructure.

Big Data Focus area: Large-scale heterogeneous pharma-research data linking & integration

Selected Key Data assets: ACD Labs / ChemSpider, ChEBI, ChEMBL, ConceptWiki, DrugBank, ENZYME, Gene Ontology, GO Annotation, SwissProt, WikiPathways


Pilot 1: Duplicate Open PHACTS functionality on the BDE infrastructure using open-source solutions

Reasons:
• In-house deployment becomes possible
• Other domains can be covered (e.g. agriculture)
• Extra BDE functionality can be used (e.g. logging, analysis)


BDE infrastructure
- Large scale RDF reasoning over 3 billion+ triples
- RESTful API
- Various front ends


SC2: Food & Agriculture


Partners:

FAO, the largest autonomous agency within the United Nations system and one of the main players in the agricultural information community.

Semantic Web Company (SWC) is a technology provider headquartered in Vienna (Austria). SWC supports organizations from all industrial sectors worldwide in improving their information management. Its core product extracts meaning from big data using linked data technologies.

Big Data Focus area: Large-scale distributed agricultural data integration

Selected Key Data assets: INFOODS, AQUASTAT, Green Learning Network (GLN), Agricultural Bibliography Network (ABN), AgroVoc, AquaMaps, Fishbase


AGINFRA


Pilot focus area: Viticulture

Viticulture (from the Latin word for vine) is the science, production, and study of grapes. It deals with the series of events that occur in the vineyard.


Pilot 2: Support advanced crop data discovery, processing, combining and visualization from distributed and heterogeneous data repositories

Reasons:
• Vine and wine sector: an emerging market in the EU
• Sustainability and biodiversity challenges: local varieties are being lost
• Exploitation of new grapevine varieties and clones in terms of climate change adaptation
• Quality and health status of viticultural products
• Contribution to human health (antioxidants, prevention of heart diseases etc.)
• Wide variety of heterogeneous (and big) data from various information sources


BDE infrastructure tasks
- Large scale data extraction and integration processing from external data sources (tables, figures, texts)
- Analysis batch jobs for generating statistical data
- Rich query support combining various parameters (e.g. location, genotypes/phenotypes, publications, soil data)
- Various front ends similar to PubMed


SC3: Energy


Partners:

A public entity supervised by the Ministry of Environment, Energy and Climate Change in Greece, founded in September 1987, active in the fields of Renewable Energy Sources (RES), Rational Use of Energy (RUE) and Energy Saving (ES).

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Big Data Focus area: Real-time turbine monitoring stream processing and analytics

Selected Key Data assets: European Energy Exchange Data, smart meter sensor data, gas/fuels market/price data, consumption statistics, stratigraphic model data (geology, geophysics)


Pilot focus area: System monitoring in energy production units.


Pilot 3: Operation, maintenance and production forecasting for wind turbines on real-time sensor data.

Reasons:
• Current technology cannot handle the full amount of valuable data that is available
• Economic benefit of predicting output and preventing damage (if the imminent failure of one part can be predicted, damage to other parts can be prevented)
• Large continuous stream of sensor data, well suited to testing the BDE platform


Data:
• Raw sensor and SCADA data from a given wind farm
• Third-party raw or synthetic data
• Analysis results from built-in analysis modules

Processing:
• Near-real-time execution of parameterized models to return operational statistics, including correlation analysis of data across units
• Weekly execution of operational statistics
• Weekly execution of model parametrization
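
The correlation analysis of data across units listed under Processing could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the pilot's implementation: the turbine identifiers, the power readings, and the function names `pearson` and `correlate_units` are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_units(readings):
    """Return the correlation for every pair of units.

    `readings` maps a (hypothetical) unit id to its time-aligned samples.
    """
    return {
        (a, b): pearson(readings[a], readings[b])
        for a, b in combinations(sorted(readings), 2)
    }


# Made-up power readings (kW) for three turbines over the same five time steps.
readings = {
    "T1": [100.0, 120.0, 140.0, 160.0, 180.0],
    "T2": [200.0, 240.0, 280.0, 320.0, 360.0],  # rises and falls with T1
    "T3": [180.0, 160.0, 140.0, 120.0, 100.0],  # moves against T1
}
correlations = correlate_units(readings)
```

A unit whose correlation with its neighbours drops sharply compared with earlier windows would be one candidate signal for closer inspection; how the pilot actually uses the statistics is not stated in the slides.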


SC4: Transport


Partners:

The Fraunhofer Society is a German research organization with 67 institutes spread throughout Germany, each focusing on different fields of applied science.

The Centre for Research and Technology-Hellas (CERTH), founded in 2000, is one of the leading research centres in Greece. CERTH includes the Hellenic Institute of Transport (HIT): Land, Sea and Air Transportation as well as Sustainable Mobility services.

ERTICO - ITS Europe is a partnership of around 100 companies and institutions involved in the production of Intelligent Transport Systems (ITS).

Big Data Focus area: Real-time monitoring stream processing and analytics

Selected Key Data assets: European Energy Exchange Data, smart meter sensor data, gas/fuels market/price data, consumption statistics, stratigraphic model data (geology, geophysics)


Pilot focus area: Info-mobility and traffic planning


Pilot 4: Multisource data collection for the provision of accurate info-mobility and advanced transport planning services in Thessaloniki, Greece

Reasons:
• Congestion is a major problem in Europe, especially in urban areas.
• Using real-time probe data to provide accurate info-mobility services and advanced transport planning leads to better decisions.
• Using mobility data from multiple sources presents significant challenges: the datasets differ in content and in spatio-temporal terms, and the data must be collected and processed in real time.


Data:
• Traffic counts and speed (330 locations, a data set every 1.5 to 5 minutes, 300k records, 15 MB)
• Travel times from Bluetooth detectors (43 locations, a data set every 15 minutes, 250k-300k records, 50 MB)
• Floating Car Data position and speed (1200 vehicles, a data set every 2 minutes, 2M records, 200 MB)
• Check-in events from social networks


SC5: Climate


Partners:

A public entity supervised by the Ministry of Environment, Energy and Climate Change in Greece, founded in September 1987, active in the fields of Renewable Energy Sources (RES), Rational Use of Energy (RUE) and Energy Saving (ES).

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Big Data Focus area: Enormous simulation time. Extremely complicated computing model.

Selected Key Data assets: European Grid Infrastructure (EGI). Access to several data centres hosted at CNRS-Lyon, NCSR-D Athens, INFN-Milan, NIKhEF-Amsterdam.


Pilot focus area: Supporting data-intensive climate research


Pilot 5: Downscaling and retrieval process on (raw) climate data via user-defined parameters (e.g. geographical areas, time period, physical variables, computational grids, time steps)

Reasons:
• The provision of climate model data serves an important objective: assessing the potential impacts of climate change on well-being for adaptation, prevention and mitigation measures, and supporting other policy-making decisions.
• This awareness has led to the availability of huge datasets.
• Downscaling is a computationally intensive process.
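
Retrieval via user-defined parameters, as named in the Pilot 5 description, can be pictured with the small sketch below. It is illustrative only and makes several assumptions: the `RetrievalRequest` class, the record layout, and all values are hypothetical, and a real pipeline would operate on gridded files rather than on an in-memory list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetrievalRequest:
    """User-defined parameters: variable, geographical area, time period."""
    variable: str
    lat_range: tuple
    lon_range: tuple
    start: date
    end: date

    def matches(self, record):
        lat_min, lat_max = self.lat_range
        lon_min, lon_max = self.lon_range
        return (
            record["variable"] == self.variable
            and lat_min <= record["lat"] <= lat_max
            and lon_min <= record["lon"] <= lon_max
            and self.start <= record["day"] <= self.end
        )


def retrieve(records, request):
    """Keep only the records that satisfy every parameter of the request."""
    return [r for r in records if request.matches(r)]


# Made-up records; "tas" and "pr" are used here only as example variable names.
records = [
    {"variable": "tas", "lat": 38.0, "lon": 23.7, "day": date(2016, 1, 1), "value": 283.1},
    {"variable": "tas", "lat": 52.4, "lon": 4.9, "day": date(2016, 1, 1), "value": 278.4},
    {"variable": "pr", "lat": 38.0, "lon": 23.7, "day": date(2016, 1, 1), "value": 0.2},
    {"variable": "tas", "lat": 38.0, "lon": 23.7, "day": date(2017, 1, 1), "value": 284.0},
]
request = RetrievalRequest(
    variable="tas",
    lat_range=(34.0, 42.0),
    lon_range=(19.0, 29.0),
    start=date(2016, 1, 1),
    end=date(2016, 12, 31),
)
selected = retrieve(records, request)
```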


Data:
• Earth System Grid Federation (ESGF) data:
    • CMIP5 data (global climate model simulations)
    • CORDEX data (regional climate model simulations)
• NetCDF data
• European Centre for Medium-Range Weather Forecasts (ECMWF) data


SC6: Social Sciences


Partners:

CESSDA provides large scale, integrated and sustainable data services to the social sciences. CESSDA is organised as a limited company under Norwegian law, owned and financed by the individual EU member states’ ministries of research or a delegated institution.

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.

Big Data Focus area: Statistical and research data linking & integration

Selected Key Data assets: Federated social sciences data catalogs, statistical data from public data portals and statistical offices (e.g. Eurostat, UNESCO, World Bank)


Pilot focus area: Citizens' budget spending at the municipal level


Pilot 6: Citizens' budget at the municipal level

Reasons:
• Budget: the most important document of public policy
• Budget execution affects everyday lives
• Citizens are more involved at city level
• A platform that integrates heterogeneous budget data (many municipalities have their own data formats) and generates infographics would benefit citizens, the research community and policy makers


Data:
• Data stream from Greek municipalities, with codes that are unique identifiers based on the national accounting system for municipalities
• Data from 3 cities in Greece (highest detail)
• Updated several times a day (streams with no memory) -> converted into daily observations
• Available through API or CSV/XLS
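
The conversion of a memoryless intraday stream into daily observations might be sketched as follows. This is a minimal illustration, not the pilot's code: the budget codes, amounts, and the function name `to_daily_observations` are hypothetical, and the rule of taking the latest update of each day as that day's value is an assumption of the sketch.

```python
from datetime import datetime


def to_daily_observations(updates):
    """Collapse intraday updates into one observation per code and day.

    Each update is a (timestamp, budget_code, amount) tuple. Because the
    stream keeps no history, the latest update seen for a day is stored
    as that day's observation (an assumption of this sketch).
    """
    latest = {}
    for timestamp, code, amount in updates:
        key = (code, timestamp.date())
        if key not in latest or timestamp >= latest[key][0]:
            latest[key] = (timestamp, amount)
    return {key: amount for key, (_, amount) in latest.items()}


# Made-up updates for two hypothetical budget codes.
updates = [
    (datetime(2016, 12, 9, 9, 0), "CODE-A", 1000.0),
    (datetime(2016, 12, 9, 17, 30), "CODE-A", 1250.0),
    (datetime(2016, 12, 9, 12, 0), "CODE-B", 300.0),
    (datetime(2016, 12, 10, 8, 15), "CODE-A", 1300.0),
]
daily = to_daily_observations(updates)
```

Storing one value per code and day gives the stream the memory it lacks, so later analysis can compare days with each other.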


SC7: Security


Partners:

The Centre supports the decision making of the European Union in the field of the Common Foreign and Security Policy (CFSP), by providing products and services resulting from the exploitation of relevant space assets and collateral data, including satellite imagery and aerial imagery, and related services.

NCSR "Demokritos", the largest multidisciplinary research centre of Greece, hosts significant scientific research, technological development and educational activities, coordinated by eight Institutes.


Big Data Focus area: Image data analysis

Selected Key Data assets: Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired from commercial providers and governmental systems) and collateral data for supporting CFSP/CSDP missions and operations


Pilot focus area: Gaining insight into man-made surface changes, triggered by automatic detection, news, or social media information


Pilot 7: Ingestion of remote sensing images and social sensing data to detect and verify man-made changes on the Earth's surface for security applications

Reasons:
• Evacuation route planning
• Monitoring of critical infrastructures
• Border security
• Satellite image data is huge and computationally intensive to compare
• Smart ‘focus’ algorithms are needed to prioritize the analysis jobs


Data:
• All data products are distributed in the SENTINEL Standard Archive Format for Europe (SAFE)
• The SENTINEL-SAFE format wraps a folder containing image data in a binary data format and product metadata in XML
• Social media, demonstrated by consuming Twitter streams
• News agencies, demonstrated by consuming Reuters RSS feeds
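
The folder layout described above (binary image data plus XML product metadata) suggests a simple first ingestion step: walk the product folder, parse the XML files, and list the binary files for later processing. The sketch below is a hedged illustration. The folder and file names are invented stand-ins, `inspect_product` is a hypothetical helper, and metadata files are assumed to be recognizable by an `.xml` suffix, which may not hold for every file in a real product.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def inspect_product(folder):
    """Split a product folder into parsed XML metadata and binary files.

    Returns a dict {relative path: XML root tag} and a sorted list of the
    relative paths of all other (assumed binary) files.
    """
    folder = Path(folder)
    metadata, binaries = {}, []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(folder).as_posix()
        if path.suffix.lower() == ".xml":  # assumption: metadata ends in .xml
            metadata[rel] = ET.parse(str(path)).getroot().tag
        else:
            binaries.append(rel)
    return metadata, binaries


# Build a tiny stand-in product so the sketch runs without real satellite data.
with tempfile.TemporaryDirectory() as tmp:
    product = Path(tmp) / "EXAMPLE_PRODUCT.SAFE"
    (product / "measurement").mkdir(parents=True)
    (product / "metadata.xml").write_text("<product><name>example</name></product>")
    (product / "measurement" / "band1.bin").write_bytes(b"\x00\x01\x02\x03")
    metadata, binaries = inspect_product(product)
```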
