BDE SC6-hang out - technology part-SWC - Martin


BIG DATA EUROPE
PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL
EUROPE IN A CHANGING WORLD - INCLUSIVE, INNOVATIVE AND REFLECTIVE SOCIETIES

HANG OUT
28 SEPTEMBER 2016
MARTIN KALTENBOECK (CFO, SEMANTIC WEB COMPANY)

Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges

                    

BDE SC6 Hangout

Big Data Europe (CSA: 2015-17)

Show societal value of Big Data: 7 Domains

Lower barrier for using big data technologies
  o Required effort and resources
  o Limited data science skills

Help establish cross-lingual, cross-organizational and cross-domain Data Value Chains

3 May 2023

Big Data Europe


CSA Measures
• COORDINATION: Stakeholder Engagement (Requirements Elicitation)
• SUPPORT: Design, Realise, Evaluate Big Data Aggregator Platform

Results
• Create and Manage Societal Big Data Interest Groups
• Cloud-deployment ready Big Data Aggregator Platform

THE BDE PLATFORM ARCHITECTURE & COMPONENTS


The three Big Data "V"s: Variety is often neglected

Current State of Platform Architecture

Adding a Semantic Layer to Data Lakes

Manufacturing, Marketing, Sales, Support, Accounting

Semantic Data Lake
• central place for model, schema and data historization
• combination of scale-out (cost reduction) and semantics (increased control & flexibility)
• grows incrementally (pay-as-you-go)

Inbound
• Data Sources
• Inbound Raw Data Store

Data Lake (order of magnitude cheaper scalable data store)

Knowledge Graph for Relationship Definition and Meta Data

Outbound and Consumption
• Frontend to Access Relationship and KPI Definition / Documentation
• Frontend to Access (ad hoc) Reports
• Outbound Data Delivery to Target Systems

JSON-LD, CSVW, R2RML, XML2RDF
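
To illustrate the semantic layer, the sketch below lifts rows of a tabular budget file into RDF triples (N-Triples syntax), in the spirit of the CSVW and R2RML mappings named above. It is a minimal sketch using only the Python standard library. The column names, the sample rows, the http://example.org/ namespace and the helper functions are assumptions made for illustration and are not part of the BDE platform.

```python
import csv
import io

# Hypothetical namespace and sample data, for illustration only.
BASE = "http://example.org/budget/"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

SAMPLE_CSV = """id,municipality,year,amount
1,Vienna,2016,1250.50
2,Cologne,2016,980.00
"""


def literal(value, datatype=None):
    """Render an N-Triples literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


def row_to_triples(row):
    """Map one CSV row (a dict) to a list of N-Triples lines."""
    subject = f"<{BASE}item/{row['id']}>"
    return [
        f"{subject} <{BASE}municipality> {literal(row['municipality'])} .",
        f"{subject} <{BASE}year> {literal(row['year'])} .",
        f"{subject} <{BASE}amount> {literal(row['amount'], XSD_DECIMAL)} .",
    ]


def csv_to_ntriples(text):
    """Convert CSV text into a flat list of N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(row_to_triples(row))
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV):
        print(line)
```

In a real deployment the mapping would more likely be declared in R2RML or CSVW metadata than hard-coded; the function form is used here only to keep the example self-contained.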

Why use BDE Technology?

Feature | Hortonworks | Cloudera | MapR | Bigtop | BDE
File system | HDFS | HDFS | NFS | HDFS | HDFS
Installation | Native | Native | Native | Native | Lightweight virtualization
Plug & play components (no rigid schema) | no | no | no | no | yes
High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self healing, multiple failure recovery | Single failure recovery (YARN) | Multiple failure recovery
Cost | Commercial | Commercial | Commercial | Free | Free
Scaling | Freemium | Freemium | Freemium | Free | Free
Addition of custom components | Not easy | No | No | No | Yes
Integration testing | yes | yes | yes | yes | --
Operating systems | Linux | Linux | Linux | Linux | All
Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm UI + custom

SC6 PILOT
CITIZENS BUDGET ON MUNICIPAL LEVEL
ARCHITECTURE & COMPONENTS

SC6: Social Sciences

www.big-data-europe.eu

Pilot Architecture & Components

SC6: Social Sciences


Pilot focus area: Citizens budget spending on municipal level

Big Data focus area: Statistical and research data linking & integration

Selected key data assets: Detailed budget execution data at city level, statistical data from public data portals and statistical offices, federated social sciences data catalogs

4 Vs of Big Data in SC6 Pilot

• Variety: requirement based on the harvesting of budget data and budget execution data from several sources, available in different structures and formats.

• Volume: requirement arising from the growing amount of open budget data and budget execution data available.

• Velocity: requirement arising from budget execution data that publishers provide on a continuous basis (daily, weekly, monthly).

• Veracity: refers to the biases, noise and abnormality in data. Even within the same country, published data differs, because it often comes from different systems or because public accounting standards are not enforced uniformly (e.g. across different municipal departments).
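
The Variety and Veracity requirements can be made concrete with a small sketch: two hypothetical publishers deliver the same kind of budget execution record in different shapes (semicolon-separated CSV with decimal commas, and JSON), and both are normalised into one common structure before a basic plausibility check. All field names, sample values and helper functions are assumptions for illustration and do not describe the pilot's actual schema.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetRecord:
    """Common target structure (hypothetical) for all sources."""
    municipality: str
    period: str    # e.g. "2016-09"
    amount: float  # amount spent, in euros


# Two made-up sources publishing the same information differently.
CSV_SOURCE = "city;month;spent_eur\nAthens;2016-09;1200,50\n"
JSON_SOURCE = '[{"municipality": "Vienna", "period": "2016-09", "amount": -30.0}]'


def normalise_csv(text):
    """Parse semicolon-separated CSV that uses decimal commas."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        BudgetRecord(
            municipality=row["city"],
            period=row["month"],
            amount=float(row["spent_eur"].replace(",", ".")),
        )
        for row in reader
    ]


def normalise_json(text):
    """Parse a JSON array whose keys already match the target structure."""
    return [
        BudgetRecord(obj["municipality"], obj["period"], float(obj["amount"]))
        for obj in json.loads(text)
    ]


def needs_review(records):
    """Return records with negative amounts so a person can check them."""
    return [record for record in records if record.amount < 0]


if __name__ == "__main__":
    records = normalise_csv(CSV_SOURCE) + normalise_json(JSON_SOURCE)
    for record in needs_review(records):
        print("check:", record)
```

A negative amount is not necessarily an error (it may be a refund or correction), which is why the sketch flags such records for review instead of dropping them.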


SC6 Pilot - Architecture


SC6 Pilot: Technical Components

• Apache Flume, https://flume.apache.org/ (data ingestion)
• Apache Kafka, http://kafka.apache.org (messaging service)
• Apache Spark, http://spark.apache.org (distributed analysis, transformation)
• Apache HDFS, http://hadoop.apache.org (raw data storage)
• SWC's PoolParty Semantic Suite, http://poolparty.biz (data consolidation, curation, mapping)
• OpenLink's Virtuoso, http://virtuoso.openlinksw.com (triple store, data storage)
• Apache HTTP Server, http://httpd.apache.org (linked data serving)
• Apache Avro, http://avro.apache.org/docs/current/ (intermediate data schema)
• D3 JS Library, https://d3js.org/ (visualisation of RDF data using SPARQL queries)
• SWC's PoolParty GraphSearch (SPARQL based interface component for filter & faceted search)
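
As a hedged illustration of how a visualisation or search frontend might pull data from the triple store, the sketch below prepares a SPARQL query as an HTTP GET request, following the standard SPARQL 1.1 Protocol (`query` parameter, JSON results requested via the `Accept` header). The endpoint URL, the `ex:` vocabulary and the query itself are assumptions for illustration; the pilot's real endpoint and data model are not given in these slides. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed local endpoint; replace with the address of the actual triple store.
ENDPOINT = "http://localhost:8890/sparql"

# Hypothetical vocabulary: total spending per municipality, largest first.
QUERY = """
PREFIX ex: <http://example.org/budget/>
SELECT ?municipality (SUM(?amount) AS ?total)
WHERE {
  ?item ex:municipality ?municipality ;
        ex:amount ?amount .
}
GROUP BY ?municipality
ORDER BY DESC(?total)
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_request(ENDPOINT, QUERY)
    print(request.get_method(), request.full_url)
    # To execute against a live endpoint:
    #   import json, urllib.request
    #   with urllib.request.urlopen(request) as response:
    #       bindings = json.load(response)["results"]["bindings"]
```

The JSON bindings returned by such a request are the kind of input a D3 chart could consume directly in the browser.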


SC6 Pilot: 1st MockUp / WireFrame


SC6 Pilot: Pilot Evaluation

Evaluation approach SC6 Pilot:
• Invite municipalities to evaluate and use the system
• Invite community (open data, data community, BDE community, W3C)
• Evaluate within the 2 participating projects (BDE, YourDataStories)
• BDE SC6 workshop in Cologne, 5.12.2016 + overall BDE Tech WS (ApacheCon)

Additional evaluation: tests over time with
• a growing amount of data
• a growing number of different sources & formats docked onto the system
• additional analytics in place


How to benefit best from BDE


BDE workshops by domain:
• Health: 19 October, Brussels. Standalone Workshop
• Food & Agri: 30 September 2016, Brussels. Collocated with DG AGRI WP2018-20 stakeholder consultation
• Energy: 20 September 2016, Brussels. Collocated with H2020 Energy InfoDay (19th)
• Transport: 16 September 2016, Brussels. Collocated with TM 2.0 Steering Body meeting
• Climate: February 2017, Brussels. Collocated with EC JRC ISPRA Workshop
• Societies: 5 December 2016, Cologne. Collocated with EDDI16, the 8th Annual European DDI User Conference: http://bde-sc6-2016.eventbrite.com (40 seats)
• Security: 18 October 2016, Brussels. Standalone Workshop

• BDE Workshops & Webinars
• Use & expand the BDE Platform
• Visit Website: news, events, community, …
• Big Data Europe W3C Community Group
• 7+1x Mailing Lists

Contacts:
• CESSDA, http://cessda.net/: Ivana Ilijasic Versic, ivana.versic@cessda.net; Hossein Abroshan, hossein.abroshan@cessda.net
• NCSR-D, http://www.demokritos.gr/?lang=en: Michalis Vafopoulos, vafopoulos@gmail.com
• Semantic Web Company (SWC), http://www.semantic-web.at: Martin Kaltenböck, m.kaltenboeck@semantic-web.at; Jürgen Jakobitsch, j.jakobitsch@semantic-web.at


Questions & Contacts

www.big-data-europe.eu
#BigDataEurope

Martin Kaltenböck
CFO, Semantic Web Company
m.kaltenboeck@semantic-web.at

http://www.linkedin.com/in/martinkaltenboeck
https://twitter.com/kalte2707
http://de.slideshare.net/MartinKaltenboeck
http://blog.semantic-web.at
