Smart and Inclusive Solutions for a Better Life in Urban Districts

Smart Data Platform Munich Deliverable D4.4.1

Version 2

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 691876

SMARTER TOGETHER – Smart Data Platform Munich – D4.4.1 – Version 2.0 – 03/05/2019 2

REVISION CHART AND HISTORY LOG Versions

Version number Date Organisation name Comments

V0.01 24.09.2017 VMZ Berlin Table of Contents Proposal

V0.02 08.11.2017 VMZ Berlin Chapter 1

V0.03 14.12.2017 VMZ Berlin Chapter 2

V0.04 19.10.2018 VMZ Berlin Chapter 2.2.4

V0.05 25.01.2018 LHM/Montag/Glock Update Deliverable

V0.06 25.03.2018 VMZ Berlin Chapter 2.5

V0.07 12.04.2018 VMZ Berlin Restructuring of Document

V0.08 12.04.2018 ALG Layout and quality check

V1.0 16.04.2018 SPL Final version ready for submission

V1.1 23.01.2019 VMZ Berlin Adding additional content, Chapter 3, Chapter 4

V1.2 05.02.2018 VMZ Berlin Adding additional content, Chapter 5, Chapter 6

V1.3 12.02.2019 LHM/Montag/Glock Final update of deliverable for V2.0

V1.4 25.02.2019 ALG Quality check

V2 25.02.2019 VMZ Berlin Comments integrated

Deliverable quality review

Quality check Date Status Comments

Technical Manager 25/04/2019 Ok -

Quality Manager 26/02/2019 Ok

Project Coordinator 03/05/2019 Ok

This report reflects only the author’s view, neither the European Commission nor INEA is responsible for any use that may be made of the information it contains.

Table of contents

1. Introduction
1.1 Purpose of the Document
1.2 Structure of the Document
2. Context and Background
2.1 Origins and Development History
2.2 SMARTER TOGETHER Project Requirements towards the Platform
2.3 Synchronisation with Data Gatekeeper
2.4 From City Intelligence Platform to Smart Data Platform
3. Architecture and System Components
3.1 Architecture Overview
3.2 System Components
3.2.1 Data Warehouse
3.2.2 Analytics Module
3.2.3 Data Gatekeeper Registry
3.2.4 Single Sign-on Service
3.2.5 Application Programming Interfaces
3.2.6 API Catalogue
3.2.7 Analysis Dashboard
3.2.8 Transparency Dashboard
4. Data Processing Framework
4.1 Data Classification
4.2 DGK Data Model and Processing Rules
4.3 Functional Library for Data Processing
5. Data Exchange in Use Cases
5.1 Energy
5.2 Intelligent Lampposts
5.3 Mobility
5.4 KPIs
6. Conclusion
7. Annex

List of figures

Figure 1: Trends covered by City Intelligence Platform
Figure 2: Example analysis of bus prioritisation performance in the city of Böblingen
Figure 3: New and existing components of the Smart Data Platform
Figure 4: Smart Data Platform Architecture Overview
Figure 5: Impacts of Data Gatekeeper registry on Smart Data Platform
Figure 6: JSON data format example for smart home data
Figure 7: JSON data format example for intelligent lamppost data
Figure 8: Screenshot of the API catalogue
Figure 9: Analysis Dashboard Start Page_Pilot Area with POIs
Figure 10: Screenshot of Transparency Dashboard (browser view)
Figure 11: Screenshot of Transparency Dashboard (mobile device view)
Figure 12: Role assignment for SDP users/clients
Figure 13: Screenshot of roles list (extract)
Figure 14: Screenshot of role assignment for Global_CompositeRole
Figure 15: Screenshot of role assignment for Energy_CompositeRole
Figure 16: Data Gatekeeper Data Model
Figure 17: SDP Workflow of Use Case Energy
Figure 18: Screenshot 1 Energy analyses
Figure 19: Screenshot 2 Energy analyses
Figure 20: Workflow Use Case Lampposts
Figure 21: Screenshots from Munich Smart City App with Intelligent Lampposts
Figure 22: Screenshot Intelligent Lampposts analyses
Figure 23: Workflow Use Case Mobility
Figure 24: Screenshot Mobility analyses
Figure 25: Workflow Use Case KPIs
Figure 26: Screenshot KPIs analyses

List of tables

Table 1: Data Classification Criteria
Table 2: Technical Implementation of Data Classification
Table 3: SDP Functional Library

Glossary

API Application Programming Interface

CIP City Intelligence Platform

CSS Cascading Style Sheets

CSV Comma-separated Values

DGK Data Gatekeeper

DWD German Meteorological Office (Deutscher Wetterdienst)

HTML Hypertext Markup Language

HTTPS Hypertext Transfer Protocol Secure

IAM Identity and Access Management

ID Identity

IoT Internet of Things

JSON JavaScript Object Notation

JWT JSON Web Token

KPI Key Performance Indicator

MQTT Message Queuing Telemetry Transport

POI Point of Interest

RDBMS Relational Database Management System

REST Representational State Transfer

SDP Smart Data Platform

SQL Structured Query Language

SSO Single Sign-On

SVG Scalable Vector Graphics

SMARTER TOGETHER BENEFICIARIES

N° Organisation name Short name Country

1 Lyon Confluence SPL France

2 Lyon Métropole GLY France

3 HESPUL Association HES France

4 Toshiba TSF France

5 Enedis END France

6 Enertech ETC France

7 City of Munich MUC Germany

8 Bettervest BET Germany

9 G5-Partners G5 Germany

10 Siemens Germany SIDE Germany

11 Spectrum Mobil STA Germany

12 Securitas SCU Germany

13 City of Vienna VIE Austria

14 BWS Gemeinnutzige BWSG Austria

15 Wiener Stadtwerke WSTW Austria

16 Kelag Wärme KWG Austria

17 Siemens Austria SIAT Austria

18 Sycube Informationstechnologie SYC Austria

19 Austrian Post POST Austria

20 Fraunhofer FHG Germany

21 Austrian Institute of Technology AIT Austria

22 Energy Cities ENC France

23 Gopa COM GPC Belgium

24 University of St Gallen UNISG Switzerland

25 Technical University of Munich TUM Germany

26 Deutsches Institut fuer Normung DIN Germany

27 Algoé ALG France

28 City of Santiago de Compostela STC Spain

29 City of Sofia SOF Bulgaria

30 City of Venice VEN Italy

31 SA Régionale d’HLM de Lyon HLM France

32 Wavestone WAV France

33 WEG Radolfzeller str. 40-46 RZL Germany

34 WEG Wiesenthauerstr. 16 WHR Germany

EXECUTIVE SUMMARY

The Smart Data Platform Munich (SDP) is the central data management platform for the SMARTER TOGETHER lighthouse city of Munich. It enables the implementation of all project use cases that require the integration, analysis and exchange of data. The Smart Data Platform is based on the City Intelligence Platform (CIP), a product developed by Siemens AG, and has been modified to meet the specific project requirements of SMARTER TOGETHER, with the synchronization of the SDP and the Data Gatekeeper concept being the core challenge. The Data Gatekeeper (DGK) is a comprehensive blueprint that describes how to handle data (incl. data privacy and data security aspects) in a Smart City context.

The role of the Smart Data Platform in the lighthouse city of Munich is to provide open and easily implementable interfaces for integrating data from various use cases into the platform, to store and refine data in the database layer, to analyse, predict and simulate, to combine and visualize data for the platform's dashboards, and to provide data to third-party applications and services over standardized Application Programming Interfaces (APIs). All these data handling processes need to be in line with the data privacy rules and multi-client access authorizations defined by the project's Data Gatekeeper concept.

While the architecture and the standard APIs of the platform are defined, the system is flexible enough to react to the specific requirements of different use cases and provides an open ecosystem for data integration, analysis and distribution. The platform has evolved into the final system, which will be used in the operating phase of the project. To present the outcomes of this process, this document provides in-depth documentation of the platform, its technical components and functionalities.

The main platform building blocks and system elements are installed and running. The general functionality was presented in fall 2017 in a proof-of-concept demonstrator including all main system elements (raw data input, data correlation, API definition, smart data output). The main use-case-related elements of the platform were realised during 2018. Some final work and use-case-related adaptations of the Smart Data Platform were finalized by the end of February 2019.

A comprehensible and detailed technical description of all structural and architectural elements of the Smart Data Platform is included in this document.

1. Introduction

The introduction of this report summarizes the purpose of this deliverable and explains how its content is structured into chapters.

1.1 Purpose of the Document

This document aims to provide detailed documentation of the Smart Data Platform. It should give the reader a short but comprehensive description of the technical components, dashboards and utilized technologies as well as of the data processing rules and mechanisms based on the use case and data privacy requirements. After reading this documentation, it should be clear how the platform has been implemented in terms of components and functionalities, why certain technical decisions have been taken in coordination with the project stakeholders and what added value the platform provides to SMARTER TOGETHER.

It outlines the evolution from the City Intelligence Platform to the Smart Data Platform, which takes the project requirements into account and implements them, and introduces the system and its interfaces for the project's operation phase. To this end, the document covers two major topics: the system architecture and its technical components on backend and frontend level, and the data processing rules and functionalities based on the data privacy classification of each data set as defined by the Data Gatekeeper rules and standards.

1.2 Structure of the Document

Chapter 1 describes the purpose and structure of the document.

In chapter 2, a short development history is provided, describing the development process of Siemens' City Intelligence Platform and how it has evolved into the Smart Data Platform in the context of SMARTER TOGETHER. The delta between the CIP and the SDP results from specific project requirements, which necessitate a customized technical solution in terms of components and data processing functionalities. The major requirement, which is described in a separate subchapter, is the synchronization of the platform's data management framework with the regulations defined by the Data Gatekeeper concept.

Chapter 3 focuses on the system architecture and the technical components. It covers the architectural approach and a system overview and has dedicated subchapters on the platform's main components: the Data Warehouse, the Data Gatekeeper Registry and the two frontends of the platform, namely the Analysis Dashboard and the Transparency Dashboard.

In chapter 4, the data processing framework is covered. It describes the general flow of data through the platform, from data integration via two standardized APIs, through the storage and management of data in the data warehouse and the analysis and fusion of different data for visualization purposes in a dashboard, to the provision of enhanced data to third-party applications and systems over an API provided by the platform.

Chapter 5 takes up the general data processing framework and the system functionalities and illustrates these processes with practical examples from the different use cases implemented in the platform.

In chapter 6, a summary is provided presenting the current status of the implementation, open issues to be discussed as well as an outlook on the operation phase of the project.

2. Context and Background

This chapter outlines how the system evolved from the City Intelligence Platform to the Smart Data Platform used in SMARTER TOGETHER. It focuses especially on the project requirements that led to the adaptation of the existing product for the context of SMARTER TOGETHER.

2.1 Origins and Development History

The Smart Data Platform is based on the CIP, which was developed by Siemens in the context of different research projects over 5 years until product status was reached at the end of 2016.

The CIP is an open platform, developed and implemented by Siemens Corporate Technology. It has been designed to meet requirements from global trends like urbanization, climate change and pollution, the impacts of which create significant new challenges for urban infrastructure.

It is expected that, within the next years, a huge amount of new data will become available. Examples include Floating Car Data, Car-to-X technology and crowd-sourced data in the mobility domain. Such data requires tools to access, store and evaluate it and to create services and user interfaces for various user groups.

In this context, the CIP supports fast, research-oriented development and feasibility tests, in particular in new technology fields. Thus, the CIP is intended to complement existing and well-established Siemens product platforms, such as the modular Sitraffic platform in the field of road traffic control, management and parking guidance (Sitraffic Scala, Concert and Guide), with more than 100 implementations worldwide.

The principle behind the CIP implementation is to develop services and applications for various new types of data and use cases. It offers an open interface to enable an IT ecosystem around its deployment. This includes the possibility to exchange raw and aggregated data as well as results from data analytics. It also allows third parties to develop their own applications making use of the open interfaces. Most technical innovations are being made within the analytics and evaluation modules. Their task is to aggregate and analyse available data in a way that enables new IT solutions within the relevant infrastructure domains.

Figure 1 : Trends covered by City Intelligence Platform

In the analytics module of the CIP, different data analyses based on research projects have been implemented and visualized via diagrams or maps, as shown in Figure 2. In this example, different analyses based on real-time and historic data were implemented in a research project to help better understand bus prioritization performance in the city of Böblingen, Germany.

Figure 2 : Example analysis of bus prioritisation performance in the city of Böblingen

After the use of the CIP in multiple pilots, the platform reached product status at the end of 2016. The platform was therefore moved from a Siemens development unit to an operating unit, VMZ Berlin, a 100% subsidiary of Siemens. VMZ Berlin continues to operate and enhance the CIP and, in parallel, adapts the existing system to the Smart Data Platform used in SMARTER TOGETHER, taking into account the specific project requirements described in the following subchapters.

2.2 SMARTER TOGETHER Project Requirements towards the Platform

There are various requirements from different SMARTER TOGETHER project domains, which need to be taken into account when evolving the City Intelligence Platform into the Smart Data Platform. These requirements consist of technological, administrative or use-case-specific demands towards the implementation of the system. However, similar requirements will also be found outside of the SMARTER TOGETHER project framework, so the transferability of the platform into other contexts is always taken into consideration.

First of all, the Smart Data Platform (SDP) needs to be able to cover a variety of use cases. The use cases implemented in SMARTER TOGETHER range from the mobility domain and energy topics to intelligent lamppost sensor solutions and require an open platform ecosystem that is capable of providing technological and data management solutions covering a broad field of topics. The SDP is thus required to be an open, secure and citywide IT ecosystem: it should act as a virtual data backbone that collects city data from different domains as a basis for a holistic view of city data, and it should be operated under the control of the public authority to ensure the security and quality of data.

The project requirements affect different technological layers of the Smart Data Platform as well as administrative data management regulations.

Data integration

As the central data platform for the lighthouse city of Munich, the SDP integrates all data that is needed for the use cases of SMARTER TOGETHER. The data integration covers the import of static data from different databases outside the platform, georeferenced objects for analyses and visualization in maps, real-time backend to backend interfaces as well as real-time sensor interfaces.

The standard APIs of the platform need to be suitable for communication with various data sources, secure and well documented. The standard APIs of the platform are described in chapter 3.2.5.

Data processing and storage

The integrated data needs to be securely stored in the platform's data warehouse. Different data analysis services of the platform are built upon this database layer in order to combine and visualize data from different sources and/or provide it to third-party applications using the SMARTER TOGETHER API.

Depending on the confidentiality of the data, the platform is required to provide anonymization and aggregation mechanisms that are applied before the data is entered into the database. Furthermore, it may be required that data is automatically deleted from the database after a defined period.
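As an illustration only, the following Python sketch shows how aggregation before storage and deletion after a defined period could look in principle. It is not the platform's implementation: the retention period of 30 days, the hourly aggregation interval and the function names are assumptions made for this example.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical retention period; the real value would be defined per data set.
RETENTION = timedelta(days=30)


def aggregate_hourly(readings):
    """Average (timestamp, value) readings per hour so single values are not stored."""
    buckets = {}
    for timestamp, value in readings:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(hour, []).append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


def purge_expired(rows, now, retention=RETENTION):
    """Keep only rows whose timestamp is still inside the retention period."""
    return {ts: value for ts, value in rows.items() if now - ts <= retention}


readings = [
    (datetime(2019, 1, 1, 10, 5), 2.0),
    (datetime(2019, 1, 1, 10, 35), 4.0),
    (datetime(2019, 2, 20, 9, 15), 5.0),
]
stored = aggregate_hourly(readings)                          # aggregated before storage
current = purge_expired(stored, now=datetime(2019, 2, 25))   # periodic clean-up
```

In this sketch the two January readings are merged into one hourly average, and that average is removed by the clean-up because it is older than the assumed retention period.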

Data analysis and visualization

One major objective of the Smart Data Platform is to create added value for city decision-makers by calculating and visualizing relevant analyses based on the data integrated. These analyses need to be able to combine data from different data sources and across different domains. Based on the available data, the Munich Task-leaders and other project stakeholders responsible for a use case discussed together which analyses would be helpful for future decision-making and how those should be visualized in order to be understandable and valuable for the final users.

In this context, the SDP is required not only to perform and visualize these analyses but also to verify which data is actually allowed to be combined based on data privacy regulations and which users are allowed to use specific analyses based on their access authorizations defined in the Data Gatekeeper (see chapter 2.3).

Data distribution

The Smart Data Platform not only integrates, stores, combines and visualizes data; it is also the central data provision point for third-party systems that want to request data for their apps or services. The central access point for the different data sets is the SMARTER TOGETHER API.

For data distribution, it is crucial that only those data sets that are allowed to be openly distributed are openly accessible over the SMARTER TOGETHER API. Other data sets might be restricted to certain third-party users who have signed a licensing contract. Therefore, it is important that the SMARTER TOGETHER API Store is able to distinguish between different users, manage different access authorization rights and provide additional user validity checks such as the use of certificates or tokens.
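The distinction between openly distributed and licence-restricted data sets can be sketched as a simple access check. The data set names, the `ApiClient` structure and the rule that every request needs a valid token are assumptions for illustration; the actual checks of the API Store are not specified here.

```python
from dataclasses import dataclass, field

# Hypothetical catalogue: "open" sets need no licence, "restricted" ones do.
DATASETS = {
    "lamppost_air_quality": "open",
    "smart_home_energy": "restricted",
}


@dataclass
class ApiClient:
    name: str
    token_valid: bool                       # outcome of a certificate/token check
    licences: set = field(default_factory=set)


def may_access(client, dataset):
    """Return True if the client may retrieve the data set over the API."""
    access = DATASETS.get(dataset)
    if access is None or not client.token_valid:
        return False                        # unknown data set or failed validity check
    if access == "open":
        return True
    return dataset in client.licences       # restricted: licence contract required


app = ApiClient("city_app", token_valid=True)
partner = ApiClient("energy_partner", token_valid=True, licences={"smart_home_energy"})
expired = ApiClient("old_client", token_valid=False, licences={"smart_home_energy"})
```

With these example clients, `app` can read only the open data set, `partner` can additionally read the restricted one, and `expired` is rejected for both.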

It can be summarized that there are various requirements that need to be taken into account for the development of the Smart Data Platform. One major aspect in this context is the confidentiality classification of each integrated data set that determines the data processing and distribution rules applied to specific data. One major objective of the Smart Data Platform is to make as transparent as possible what happens inside the platform with each data set and to guarantee data privacy where it is needed. This regulatory framework for data management in SMARTER TOGETHER is the so-called Data Gatekeeper concept described in the next subchapter.

The following SMARTER TOGETHER use case data deriving from WP 4, Tasks 4.2 – 4.6, are integrated into and analysed within the Smart Data Platform:

▪ Information exchange with the Task Citizens and Stakeholder Engagement (Task 4.2)

▪ Data exchange with the Task Low Energy Districts (Task 4.3)

▪ Data exchange with the Task Integrated Infrastructure and Services (Task 4.4)

▪ Data exchange with the Task Sustainable Mobility (Task 4.5)

▪ Data exchange with the Smart City App (part of Task 4.4)

▪ Information exchange with the Task Monitoring and Evaluation (Task 4.6)

2.3 Synchronisation with Data Gatekeeper

The Data Gatekeeper in SMARTER TOGETHER is a strategic and organisational concept that ensures data privacy, data quality and compliance. It describes a data management and processing framework as a transferable and generic blueprint that provides requirements for the implementation of the Smart Data Platform in the lighthouse city of Munich.

The Data Gatekeeper (DGK) defines strategic guidelines for the Smart Data Platform as mandatory external requirements, which need to be considered before the implementation process starts. These include applicable data protection laws and legal restrictions, fundamental technical recommendations for data security as well as city-specific strategies regarding open data access and data transparency, which, for example, resulted in the implementation of the SDP's Transparency Dashboard.

The DGK furthermore defines how smart data use cases are created and administratively handled in SMARTER TOGETHER. It specifies which stakeholders should be involved in the use case creation process and who is responsible for defining what is going to happen with certain data sets inside the platform. These requirements affect the administration processes and mechanisms of the Smart Data Platform rather than the direct technical implementation.

The Data Gatekeeper also sets the framework for the classification of data sets that are integrated into the SDP. Each classification has its own impacts on how the data can be processed within the platform. For example, as the data classification requires that certain data sets need to be pseudonymized or aggregated before further processing, the SDP provides corresponding mechanisms. All these data processing rules, which are based on the metadata model of the DGK, are described in detail in chapter 4.
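The idea that a data set's classification selects the processing applied to it can be sketched as a lookup from classification to processing function. The class names, the `device_id` field and the keyed-hash pseudonymization shown here are assumptions for illustration and do not reproduce the DGK classification or the SDP mechanisms.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would be a managed key, not a constant.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(record):
    """Replace the identifying field with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, record["device_id"].encode(), hashlib.sha256)
    return {**record, "device_id": digest.hexdigest()[:16]}


def drop_identifier(record):
    """Remove the identifying field so only aggregatable values remain."""
    return {key: value for key, value in record.items() if key != "device_id"}


# Illustrative classification names mapped to their processing step.
PROCESSING_RULES = {
    "public": lambda record: dict(record),
    "pseudonymize": pseudonymize,
    "aggregate_only": drop_identifier,
}


def process(record, classification):
    """Apply the rule configured for the data set's classification."""
    return PROCESSING_RULES[classification](record)


raw = {"device_id": "meter-42", "kwh": 1.7}
```

A keyed hash is used in this sketch so that the same device always maps to the same pseudonym while the original identifier cannot be recomputed without the key.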

2.4 From City Intelligence Platform to Smart Data Platform

As described, the implemented system is based on the technology of the existing City Intelligence Platform. However, there are new project-specific requirements towards the platform, of which the implementation of the Data Gatekeeper framework is the most crucial one. Figure 3 illustrates which components of the Smart Data Platform were already part of the CIP and which have been introduced due to SMARTER TOGETHER project requirements.

Figure 3 : New and existing components of the Smart Data Platform

The existing components include the standard integration APIs of the CIP, which offer the options to transfer data over a JSON/REST API or over the MQTT protocol. These APIs are described in detail in chapter 3.2.5.
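To give a feel for what a data source would hand to either integration option, the following sketch serializes one sensor reading as JSON using only the Python standard library. The field names (`sensorId`, `timestamp`, `measurements`) and the sensor identifier are hypothetical; the data formats actually agreed for the platform are shown in chapter 3.2.5.

```python
import json
from datetime import datetime, timezone


def build_message(sensor_id, measurements, timestamp):
    """Serialize one sensor reading as a JSON document (field names are illustrative)."""
    payload = {
        "sensorId": sensor_id,
        "timestamp": timestamp.isoformat(),
        "measurements": measurements,
    }
    return json.dumps(payload, sort_keys=True)


message = build_message(
    "lamppost-007",
    {"temperature": 21.5, "humidity": 48},
    datetime(2019, 2, 25, 12, 0, tzinfo=timezone.utc),
)
# The same string could serve as the body of an HTTPS POST request
# or as the payload of an MQTT message; the transport is not shown here.
decoded = json.loads(message)
```

Keeping the message format independent of the transport means a data provider can switch between the REST and the MQTT option without changing how readings are encoded.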

Furthermore, the Data Warehouse of the CIP, which consists of a database layer of raw data and another layer for aggregated or pseudonymized data, is used in the Smart Data Platform. The Data Warehouse is described in chapter 3.2.1.

The other components have been newly introduced in the SMARTER TOGETHER context. The Data Gatekeeper Registry, described in chapter 3.2.3, is the component that translates the requirements of the Data Gatekeeper, such as the data classification, into data processing rules within the platform. This can be the requirement to aggregate raw data for data privacy reasons or to limit external access to one of the platform's APIs. The data processing rules configured in the DGK Registry are the subject of chapter 4.

The API catalogue (chapter 3.2.6) is the access point for third-party systems and applications to retrieve data from the SDP, e.g. intelligent lamppost data. The API catalogue is linked to the platform's single sign-on service (chapter 3.2.4), which manages the access authorizations for the different APIs based on the DGK Registry.

The Smart Data Platform also provides two frontends in the SMARTER TOGETHER project. The first is the Analysis Dashboard (chapter 3.2.7), which visualizes the thematic analyses defined in each use case, platform statistics as well as the project-specific KPIs (Key Performance Indicators). The Analysis Dashboard is linked with the single sign-on service to check whether the respective dashboard users are allowed to access certain analyses.

The second frontend is the Transparency Dashboard described in chapter 3.2.8, which is a website for the general public providing information about the different data sets that are integrated into the platform and their data privacy classification.

3. Architecture and System Components

This chapter illustrates the architectural approach of the Smart Data Platform, its main technical components and the technologies used.

3.1 Architecture Overview

The design of the Smart Data Platform architecture needs to take different requirements into account. Data sources from the different project use cases need to be integrated easily, data needs to be processed for data analysis, and enhanced data must be provided to third party systems over APIs that are simple to understand and add value for the applications and systems accessing the Smart Data Platform. The backend system architecture can be layered into the domains of data integration, data management, analysis and data provision. Data for the project use cases are transferred from different data sources to the platform via the two standard interface technologies. However, the exact data fields and communication intervals of each interface always need to be specified between data providers, platform operator and use case stakeholders to ensure that all data required for the different analysis use cases is available in the platform.

In the data management and data analysis layer, the Smart Data Platform provides different types of databases for storing the data. Different database operations build on the raw database layer to aggregate and pseudonymize data if required by the Data Gatekeeper framework and/or to process data within the analytics module of the Smart Data Platform.

The data provision layer provides different APIs to allow the analysis dashboard as well as external systems to access raw or aggregated data from the platform. The APIs are thematically divided by use case (e.g. Intelligent Lamppost API) and are summarized in an API catalogue user interface for developers. Retrieving data over these APIs requires an access authorization, which is managed by the platform operator over a single sign-on service.

Figure 4 : Smart Data Platform Architecture Overview

3.2 System Components

This subchapter outlines the technology stack of the main components, including backend components such as the data warehouse, the frontend components of the dashboards and the standard APIs of the Smart Data Platform.

3.2.1 Data Warehouse

For the data warehouse, two different types of databases are provided to manage and store the data integrated in the different use cases: a relational database management system (RDBMS) using SQL (Structured Query Language) and a document-oriented NoSQL database. Which database type is chosen depends on the specific data storage and processing requirements of each use case. If the schema of certain data (tables and field types) is well defined before integrating it into the Smart Data Platform, it is more likely that a relational database will be applied, whereas a NoSQL database may be suited for use cases where the initial data requirements are difficult to ascertain, as this kind of database provides more flexibility. Both databases are in line with the data processing requirements posed by the Data Gatekeeper. The relational database used in the Smart Data Platform is Oracle DB; the NoSQL database is MongoDB.

3.2.2 Analytics Module

The analytics module is the system layer that manages the different database operations which calculate, aggregate, combine and enhance the raw data from the different data sources into smart data, ready to be presented in the analysis dashboard or provided over the platform APIs.

Besides implementing sensible and correct algorithms, a focus of the analytics module is high performance of the database operations, in order to minimize the response time when a user requests a certain analysis in the analysis dashboard.

3.2.3 Data Gatekeeper Registry

The Data Gatekeeper Registry is the central component that translates the requirements of the Data Gatekeeper, such as the data classification, into a data processing system within the platform. This might include pseudonymizing raw data for data privacy reasons or controlling external access to one of the platform's APIs.

The Data Gatekeeper Registry is more a data processing framework to be implemented manually by the platform administrator than a coherent and automated piece of software. More precisely, the Data Gatekeeper Registry is realized within other components such as the data warehouse or the single sign-on service. The following figure illustrates at which steps of data processing in the Smart Data Platform the Data Gatekeeper Registry comes into play.

After raw data has been integrated into the Smart Data Platform, the data provider or data privacy regulations might require that only aggregated or pseudonymized data be stored. This requirement is defined in the data classification for each data source. In this case, the raw data will be deleted after aggregation/pseudonymization. The next step where the Data Gatekeeper Registry affects the platform processes is between the data warehouse and the analytics module. The data classification might prohibit combining certain data sets (e.g. when the combination of data might lead to an identification of individuals), which restricts the range of possible analyses. The Data Gatekeeper Registry also sets the framework for which data can be published over an API to an external system and which system is allowed to access which data at all. Furthermore, it defines which user can view which analyses in the analysis dashboard, as a certain person might be allowed to view analyses of use case 1 but not of use case 2.


Figure 5 : Impacts of Data Gatekeeper registry on Smart Data Platform
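The first step described above (store only derived data when the classification requires it) can be sketched as follows. This is a minimal illustration, not the platform's implementation: the record fields (`sensor_id`, `timestamp`, `value`), the hourly aggregation level and the in-memory `store` are assumptions made for the example. In practice the aggregation level is agreed with the data provider.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_hourly(raw_rows):
    """Average 'value' per (sensor_id, hour). Field names are hypothetical."""
    buckets = defaultdict(list)
    for row in raw_rows:
        hour = row["timestamp"].replace(minute=0, second=0, microsecond=0)
        buckets[(row["sensor_id"], hour)].append(row["value"])
    return [
        {"sensor_id": sid, "hour": hour, "mean_value": mean(values), "count": len(values)}
        for (sid, hour), values in sorted(buckets.items())
    ]


def ingest(raw_rows, store, keep_raw):
    """Persist derived rows; persist raw rows only if the classification allows it."""
    store["derived"].extend(aggregate_hourly(raw_rows))
    if keep_raw:
        store["raw"].extend(raw_rows)
    # With keep_raw=False the raw rows are never written to the store.


store = {"raw": [], "derived": []}  # stands in for the data warehouse layers
rows = [
    {"sensor_id": "lamp-1", "timestamp": datetime(2019, 1, 1, 10, 5), "value": 20.0},
    {"sensor_id": "lamp-1", "timestamp": datetime(2019, 1, 1, 10, 35), "value": 30.0},
    {"sensor_id": "lamp-1", "timestamp": datetime(2019, 1, 1, 11, 0), "value": 40.0},
]
ingest(rows, store, keep_raw=False)
```

After the call, `store["derived"]` holds one row per sensor and hour, while `store["raw"]` stays empty.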

3.2.4 Single Sign-on Service

The Smart Data Platform uses a single sign-on service to manage the different user roles and access rights (Identity and Access Management, IAM). The single sign-on service manages the access rights for the communication with the Smart Data Platform over the APIs as well as the access rights to certain analysis domains in the analysis dashboard. The users and authorizations are configured in the graphical user interface of the single sign-on service. The service also issues authorization tokens (JSON Web Tokens) to the external systems; communication with the Smart Data Platform can only be established with a valid token. The single sign-on service implemented in the platform is the open-source tool Keycloak. It also provides the functionality to manage the data processing framework of the Data Gatekeeper Registry by connecting operations, such as a database query, with the different roles of a user or a system component.
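To illustrate how a role carried in a JSON Web Token can gate access, the following sketch reads the token payload and checks its expiry and roles. It rests on assumptions: the roles are expected under the `realm_access.roles` claim (Keycloak's default location for realm roles), and the demo token is built locally and is unsigned. The sketch deliberately does not verify the signature; a real deployment must verify it against the issuer's public key before trusting any claim.

```python
import base64
import json
import time


def decode_jwt_payload(token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def is_allowed(claims, required_role, now=None):
    """True if the token has not expired and carries the required role."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False
    roles = claims.get("realm_access", {}).get("roles", [])
    return required_role in roles


def _b64(obj):
    """Helper to build an unsigned demo token for this example only."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


demo_token = ".".join([
    _b64({"alg": "none"}),
    _b64({"exp": 2000, "realm_access": {"roles": ["Energy_CompositeRole"]}}),
    "",
])
claims = decode_jwt_payload(demo_token)
```

With these claims, `is_allowed(claims, "Energy_CompositeRole", now=1000)` succeeds, while the same check at `now=3000` fails because the token has expired.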


3.2.5 Application Programming Interfaces

The Smart Data Platform provides two standard APIs for integrating data in the context of the various use cases: the MQTT (Message Queuing Telemetry Transport) protocol and JSON REST. These two interfaces, of which JSON REST will be the more likely solution in most use cases, are described in the following.

The MQTT protocol is suitable for use cases in which sensor data is sent directly from the sensor to the Smart Data Platform. The communication is triggered by a tool installed locally on the communication unit of the sensor; the tool is provided by the Smart Data Platform operator. The tool includes the application for the JSON export, the required Java version and preconfigured certificates for a secured data transfer. Over the MQTT protocol, the tool automatically exports the current JSON data sets to the Smart Data Platform at configurable intervals. After the export, the data sets are removed from the sensor unit. MQTT is an open messaging protocol for machine-to-machine communication and the Internet of Things (IoT); it is standardized by OASIS and as ISO/IEC 20922.
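The export behaviour described above (buffer data sets locally, publish them, then remove them from the sensor unit) can be sketched as follows. This is not the operator's export tool: the class name, the topic `sdp/smarthome` and the reading fields are hypothetical, and the `publish` callable merely stands in for a real MQTT client.

```python
import json


class ExportBuffer:
    """Holds readings locally until they have been handed to the publisher."""

    def __init__(self, publish, topic):
        self._publish = publish  # callable(topic, payload); stand-in for an MQTT client
        self._topic = topic
        self._pending = []

    def add(self, reading):
        self._pending.append(reading)

    def pending_count(self):
        return len(self._pending)

    def flush(self):
        """Publish all pending readings; a reading is dropped only after publishing it."""
        sent = 0
        while self._pending:
            self._publish(self._topic, json.dumps(self._pending[0]))
            self._pending.pop(0)  # if publish raised, this line is not reached
            sent += 1
        return sent


published = []
buffer = ExportBuffer(lambda topic, payload: published.append((topic, payload)),
                      "sdp/smarthome")
buffer.add({"humidity": 41.5, "temperature": 21.3, "battery": 87})
buffer.add({"humidity": 42.0, "temperature": 21.1, "battery": 86})
sent = buffer.flush()
```

Removing a reading only after `publish` returns means a failed transfer leaves the data on the sensor unit for the next export attempt.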

The JSON REST interface is more suitable for backend-to-backend communication. Via POST requests, the current data sets are sent from the data provider backend to the Smart Data Platform. The respective endpoints are provided by the platform operator. To secure the communication, each request has to contain a valid JSON Web Token in the HTTP header. This access token is issued by the single sign-on service and can be retrieved over an upstream request. The token has a defined validity period and must be renewed once it expires. The JSON data format is predefined by the platform operator but can be modified in consultation with the data provider in order to meet the use case requirements.

The following two examples show the JSON data formats of the use cases smart home data and intelligent lampposts. For the smart home data use case, the MQTT export tool has been implemented whereas in the intelligent lampposts use case the JSON REST interface is applied.

In case of the smart home data, the MQTT export tool sends data sets to an endpoint of the Smart Data Platform whenever new measured values for humidity, temperature and battery level are available. This means that there is no fixed transmission interval. A simplified example of a smart home data set is shown in the figure below.


Figure 6: JSON data format example for smart home data

For the case of intelligent lamppost data, the JSON REST interface is applied. The process to send data includes one POST request to retrieve a valid token and a second POST request with the received token in the header to send the actual data sets. Below, a simplified data set for intelligent lamppost data is illustrated.

Figure 7: JSON data format example for intelligent lamppost data
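The two-step process described for the JSON REST interface (one POST to obtain a token, a second POST carrying the token and the data sets) can be sketched as below. The URLs, the OAuth2 `client_credentials` grant, the `Authorization: Bearer` header and the data fields are assumptions for illustration; the actual endpoints and data format are provided by the platform operator. The sketch only builds the requests and does not send them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://sso.example.invalid/token"     # hypothetical endpoint
DATA_URL = "https://sdp.example.invalid/lampposts"  # hypothetical endpoint


def build_token_request(client_id, client_secret):
    """First POST: ask the single sign-on service for an access token."""
    body = urlencode({
        "grant_type": "client_credentials",  # assumed OAuth2 grant type
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request(TOKEN_URL, data=body, method="POST",
                   headers={"Content-Type": "application/x-www-form-urlencoded"})


def build_data_request(access_token, data_sets):
    """Second POST: send the data sets with the token in the HTTP header."""
    return Request(DATA_URL, data=json.dumps(data_sets).encode("utf-8"), method="POST",
                   headers={"Content-Type": "application/json",
                            "Authorization": f"Bearer {access_token}"})


token_request = build_token_request("demo-client", "demo-secret")
data_request = build_data_request("abc.def.ghi", [{"lamppost_id": "L-1", "pm10": 12.5}])
```

Each request could then be sent with `urllib.request.urlopen`; the token obtained in the first step is reused until its validity period ends.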


3.2.6 API Catalogue

The API catalogue is a user interface for developers who want to access data of the Smart Data Platform over an API (JSON REST APIs). The API catalogue lists the different APIs that the Smart Data Platform provides. Every accessing system has to register for each of these APIs, and access is granted by the single sign-on service according to data classifications and agreements among the project stakeholders. Requests towards the Smart Data Platform, or the API catalogue respectively, are secured via HTTPS, and access is authorized via OAuth2 with JSON Web Tokens. The external system can access the data only if a valid token is included in the request. The data fields of the interface and the request parameters are designed to meet the specific use case requirements, if possible in coordination with the requesting external system or app. Via a “try it out” button, responses with real data can be requested as a practical aid for developers to see how the API works. Furthermore, an example data format and response codes are provided. The figure below shows a screenshot of the SMARTER TOGETHER API catalogue.


Figure 8 : Screenshot of the API catalogue

3.2.7 Analysis Dashboard

The analysis dashboard is the central user interface for viewing the visualized use case specific analyses. Firstly, it is linked to the analytics module, which generates use case specific data sets based on the data warehouse technologies described above. Secondly, it is linked to the single sign-on service, which manages the user access rights for the different analyses, as certain users are only allowed to see certain analyses.


For the visualization of the analyses (charts, diagrams), the Smart Data Platform uses the open-source software D3.js, a JavaScript library for generating dynamic and interactive data visualizations in web browsers. It makes use of the widely implemented SVG, HTML5 and CSS standards. Furthermore, the cross-platform JavaScript library jQuery is used to simplify the client-side scripting of HTML. jQuery is free, open-source software released under the MIT License.

The analysis dashboard is divided into four main domains: “Home”, “Thematic Analyses”, “Platform Statistics” and “Key Performance Indicators”. On the “Home”-screen, a map of the pilot area in Munich is shown containing different types of points of interest (POIs) such as intelligent lampposts or mobility stations. When clicking on one of these POIs a details window opens showing master data of this POI (e.g. type of POI, address, name etc.) as well as dynamic data (e.g. last measured PM10 value of lamppost sensor).

Under “Thematic Analyses” the dashboard user accesses the use case specific analyses such as for example the historical data of pollution sensor measurements as a line chart or the usage of mobility stations etc. “Thematic Analyses” is subdivided into the three topics of “Mobility”, “Intelligent Lampposts” and “Energy”.

“Platform Statistics” enables the user to visualize different statistics on the Smart Data Platform's performance, such as the number of data sets sent over the data integration and data provision APIs, including indications of API downtimes.

Lastly, the domain “Key performance indicators” summarizes and visualises the key performance indicators for the different project use cases that are needed for the project performance monitoring.


Figure 9 : Analysis Dashboard start page (pilot area with POIs)

3.2.8 Transparency Dashboard

The Transparency Dashboard is a public website that aims at showing citizens which data are collected in SMARTER TOGETHER Munich and how they are being used in the project use cases. It is the “window” into the Smart Data Platform Munich and illustrates transparently which data sources are integrated, how data protection issues are addressed and what the different data are used for.

The Transparency Dashboard shows

▪ which data in the platform are publicly available

▪ what happens with the data within the Smart Data Platform

▪ how these data can be used / accessed

The information on the Transparency Dashboard is divided into the following domains: general project information, energy, mobility, intelligent lamppost and data usage with further information and subsites.

The Transparency Dashboard is a responsive website with no database in the background. No special technologies are necessary apart from a content management system (CMS) to dynamically add or modify text and information. The CMS used for the Transparency Dashboard is WordPress.


The figures below show screenshots of the Transparency Dashboard (http://transparency.smartdataplatform.info) in browser and mobile device resolution.

Figure 10 : Screenshot of Transparency Dashboard (browser view)


Figure 11 : Screenshot of Transparency Dashboard (mobile device view)

4. Data Processing Framework

Whereas chapter 3 presents the different system components, which together form the overall software system, this chapter outlines the framework in which the different data are processed, managed, combined and published. Only together do the software components and the data processing mechanisms constitute a smart data platform.

4.1 Data Classification

As different data sets need to be handled differently in the Smart Data Platform according to their specific data security and privacy requirements, each integrated data source receives a data classification along multiple criteria. The data classification criteria were defined by the SMARTER TOGETHER project partners in Munich involved in the design of the Smart Data Platform. The criteria are the following: Integrity and Data Privacy, Data Processing and Data Analysis, Transfer and Allocation of Data, Back-up and Deletion of Data.


For each criterion, one of four scale levels has to be chosen per data source so that the data processing mechanisms in the Smart Data Platform can be adjusted accordingly. The scale levels have to be chosen by the data owner in consultation with the project stakeholders.

| Criterion | Scale Level 1 | Scale Level 2 | Scale Level 3 | Scale Level 4 |
|---|---|---|---|---|
| Integrity and data privacy | I1: No further data privacy measures necessary | I2: Data has to be anonymized / pseudonymized before further processing | I3: Data has to be anonymized / pseudonymized as well as aggregated before further processing | I4: Data may be stored in SDP's data warehouse but can only be processed after individually defined criteria |
| Data processing and data analysis | D1: Data may be analysed and correlated with any other data | D2: Data may only be correlated with other data if no information hereby occurs with which individuals or individual groups can be identified | D3: Data may only be correlated or analysed according to the specific criteria defined for the individual use case | D4: Data may not be correlated or analysed at all |
| Transfer and allocation of data | T1: Data may be transferred to any user internally and externally | T2: Data may only be transferred to external users under specific conditions | T3: Data may only be transferred to the original data provider | T4: Data may only be provided to predefined internal users |
| Back-up and deletion of data | B1: Data may be stored at any physical location and as long as desired | B2: Data may be internally stored as long as desired | B3: Data may be stored internally with a time limit after which the data must be deleted | B4: Data may only be stored at predefined physical locations with access protection. A time limit must be provided if needed |

Table 1: Data Classification Criteria
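One way to represent such a classification in code is a small record with one scale level per criterion. The sketch below is illustrative only: the class names, the compact code format `"I3-D2-T2-B3"` and the helper methods are assumptions, not part of the platform. The level names follow Table 1.

```python
from dataclasses import dataclass
from enum import Enum

Integrity = Enum("Integrity", ["I1", "I2", "I3", "I4"])
Processing = Enum("Processing", ["D1", "D2", "D3", "D4"])
Transfer = Enum("Transfer", ["T1", "T2", "T3", "T4"])
Backup = Enum("Backup", ["B1", "B2", "B3", "B4"])


@dataclass(frozen=True)
class DataClassification:
    """One scale level per criterion for a single data source."""
    integrity: Integrity
    processing: Processing
    transfer: Transfer
    backup: Backup

    @classmethod
    def parse(cls, code):
        """Parse a hypothetical compact code such as 'I3-D2-T2-B3'."""
        i, d, t, b = code.split("-")
        return cls(Integrity[i], Processing[d], Transfer[t], Backup[b])

    def needs_pseudonymization(self):
        return self.integrity in (Integrity.I2, Integrity.I3)

    def needs_aggregation(self):
        return self.integrity is Integrity.I3

    def needs_time_limit(self):
        # B3 always requires a time limit; B4 only "if needed", so it is not forced here.
        return self.backup is Backup.B3


lamppost = DataClassification.parse("I3-D2-T2-B3")
```

Looking up an unknown level, for example `DataClassification.parse("I5-D1-T1-B1")`, raises a `KeyError`, so invalid classifications are rejected early.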

The following table lists how these criteria and their scale levels are technically implemented in the Smart Data Platform.

| Criterion | Data Classification Scale Level | Technical Implementation in Smart Data Platform |
|---|---|---|
| Integrity and data privacy | I1: No further data privacy measures necessary | Raw data is stored in SDP's data warehouse without anonymization / pseudonymization measures and provided to the analytics module. |
| Integrity and data privacy | I2: Data has to be anonymized / pseudonymized before further processing | If data has not been anonymized / pseudonymized by the data provider before integration into the SDP, a data converter is applied to anonymize / pseudonymize the data before storing it in SDP's databases. The anonymization / pseudonymization criteria are defined together with the data provider. |
| Integrity and data privacy | I3: Data has to be anonymized / pseudonymized as well as aggregated before further processing | In addition to anonymization / pseudonymization measures, the data will be automatically aggregated to a level to be discussed with the data provider and other use case stakeholders. |
| Integrity and data privacy | I4: Data may be stored in SDP's data warehouse but can only be processed after individually defined criteria | Data are stored in the data warehouse but are only processed according to use case specific and individual processing rules to be defined by the data provider and other use case stakeholders. |
| Data processing and data analysis | D1: Data may be analysed and correlated with any other data | Data stored in the database is provided to the analytics module and 3rd party systems without any restrictions. |
| Data processing and data analysis | D2: Data may only be correlated with other data if no information hereby occurs with which individuals or individual groups can be identified | If this scale level is chosen, data is only correlated with other data if no individual-related data results. In case of any doubt, all use case stakeholders have to agree to the specific analysis and data visualisation. Since all analyses and data visualisations have to be developed individually, unintended data correlations cannot occur. |
| Data processing and data analysis | D3: Data may only be correlated or analysed according to the specific criteria defined for the individual use case | If this scale level is chosen, analyses are only developed based on this data if all use case stakeholders agree to the analysis and data visualisation. Since all analyses and data visualisations have to be developed individually, unintended data correlations cannot occur. |
| Data processing and data analysis | D4: Data may not be correlated or analysed at all | No analyses will be developed based on this data. |
| Transfer and allocation of data | T1: Data may be transferred to any user internally and externally | The data is accessible over the API catalogue of the SDP to all users. However, a registration at the Smarter Together API is necessary. |
| Transfer and allocation of data | T2: Data may only be transferred to external users under specific conditions | The API catalogue and the connected Identity and Access Management distinguish between different users. Only approved users with a username and password can access the data over the Smarter Together API. The Transparency Dashboard describes under which conditions access to the data can be provided. |
| Transfer and allocation of data | T3: Data may only be transferred to the original data provider | The API catalogue and the connected Identity and Access Management distinguish between different users. Only the original data provider can access the data. |
| Transfer and allocation of data | T4: Data may only be provided to predefined internal users | The API catalogue and the connected Identity and Access Management distinguish between different users. Only internal users with the permission of the original data provider can be unlocked for the specific API. Even the original data provider needs to be unlocked for the API. |
| Back-up and deletion of data | B1: Data may be stored at any physical location and as long as desired | The data is stored at a location of the SDP operator's choice without any further requirements, for the lifetime of the project and beyond. |
| Back-up and deletion of data | B2: Data may be internally stored as long as desired | The data is stored in the data warehouse of the SDP for the lifetime of the project and beyond. If the data is transferred to a 3rd party system over the Smarter Together API, the handling of the data is the responsibility of the operator of that 3rd party system. In this case, the handling of the data should be determined in a data licensing contract. |
| Back-up and deletion of data | B3: Data may be stored internally with a time limit after which the data must be deleted | The data are stored in the data warehouse of the SDP with a specific time-to-live attribute, which manages the automated deletion of the data after a validity period to be specified by the data provider. |
| Back-up and deletion of data | B4: Data may only be stored at predefined physical locations with access protection. A time limit must be provided if needed | Data providers who have chosen this scale level are briefed about the IT security measures and server locations of the SDP. If desired, the data provider can choose a preferred server location. The data are stored with a specific time-to-live attribute, which manages the automated deletion of the data after a validity period to be specified by the data provider. |

Table 2 : Technical Implementation of Data Classification
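The time-to-live mechanism mentioned for scale levels B3 and B4 can be sketched as a periodic purge. The record layout (`stored_at`, `ttl`) and the function name are assumptions for illustration; the table does not specify how the platform's databases implement the attribute.

```python
from datetime import datetime, timedelta


def purge_expired(records, now):
    """Return only the records whose validity period has not yet ended.

    A record without a 'ttl' (e.g. classified B1 or B2) is kept indefinitely.
    """
    kept = []
    for record in records:
        ttl = record.get("ttl")
        if ttl is None or record["stored_at"] + ttl > now:
            kept.append(record)
    return kept


records = [
    {"id": 1, "stored_at": datetime(2019, 1, 1), "ttl": timedelta(days=30)},
    {"id": 2, "stored_at": datetime(2019, 1, 1), "ttl": timedelta(days=365)},
    {"id": 3, "stored_at": datetime(2019, 1, 1), "ttl": None},
]
remaining = purge_expired(records, now=datetime(2019, 3, 1))
```

In this example the record with the 30-day validity period is dropped, while the other two remain.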

Although each data classification entails a specific implementation in the Smart Data Platform, as presented in the table above, these data processing mechanisms are not triggered automatically. The data classification for each data source tells the SDP operator how to manually handle, restrict and configure the data sets; this manual configuration is a necessary step in the current status of the Smart Data Platform. If, for example, a data owner requires that his or her data be aggregated to a certain level before further processing, he or she can choose the data classification accordingly offline. However, this does not mean that the aggregation is triggered automatically, but that the SDP operator enters into a discussion with the data owner and the use case owner to define the level to which the data should be aggregated. After this offline discussion, the aggregation algorithm is implemented or configured accordingly.

In the Smart Data Platform, this configuration is handled via the user interface of the single sign-on service. Each system component or external actor is defined as a user/client in the single sign-on service, to which certain roles are assigned. These roles determine which users/clients are allowed to perform which database or system operations. The roles are defined by the SDP operator, represent the different system operations available and have to be assigned manually per user/client by the platform operator. The roles are in principle divided into a three-level hierarchy, as the figure below shows. A user/client can be assigned a 1st level composite role, which includes several 2nd level composite sub-roles which, in turn, contain the single 3rd level system operation roles.

Figure 12: Role assignment for SDP users/clients
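The three-level hierarchy can be pictured as a mapping from composite roles to their sub-roles, which is expanded recursively into the system operation roles a user/client effectively holds. In the sketch below, `Global_CompositeRole` and `Energy_CompositeRole` are role names used in this report; `Mobility_CompositeRole` and all operation role names are hypothetical, as is the resolution function itself (in the platform, the single sign-on service resolves composite roles).

```python
# Composite roles and their members; leaves are 3rd level system operation roles.
COMPOSITE_ROLES = {
    "Global_CompositeRole": ["Energy_CompositeRole", "Mobility_CompositeRole"],
    "Energy_CompositeRole": ["read_energy_consumption", "read_energy_production"],
    "Mobility_CompositeRole": ["read_station_usage"],
}


def expand_roles(role, composites, _seen=None):
    """Return the set of system operation roles contained in `role`."""
    seen = set() if _seen is None else _seen
    if role in seen:  # already expanded in this call; also guards against cycles
        return set()
    seen.add(role)
    children = composites.get(role)
    if children is None:  # not a composite role, so it is an operation role
        return {role}
    operations = set()
    for child in children:
        operations |= expand_roles(child, composites, seen)
    return operations


global_operations = expand_roles("Global_CompositeRole", COMPOSITE_ROLES)
energy_operations = expand_roles("Energy_CompositeRole", COMPOSITE_ROLES)
```

A user assigned only `Energy_CompositeRole` thus ends up with the two energy operations, whereas `Global_CompositeRole` covers the operations of all sub-roles.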

To illustrate this process, the use of the analysis dashboard of the Smart Data Platform is briefly described as an example. An analysis dashboard user needs to register before being able to see any visualization of an analysis. After registration, the user receives a basic default role. More advanced roles, with which certain analyses can be requested, have to be additionally assigned to this user by the platform administrator. If the platform administrator is in doubt whether a newly registered user may be unlocked for more advanced roles, he or she will check with the data owner and use case owner beforehand. If this user is allowed to see all thematic analyses, he or she can, for example, be assigned a “Global_CompositeRole”, which includes certain composite sub-roles (see figures below).


Figure 13: Screenshot of roles list (extract)

For each composite role, sub roles can be added or removed until they represent the exact scope and limitations of each role as here for the example of a Global_CompositeRole.

Figure 14: Screenshot of role assignment for Global_CompositeRole

The sub roles, in turn, include the different system operation roles that can also be added and removed. In the screenshot below, we see an extract of the third level system operation roles assigned to the “Energy_CompositeRole” as an example.

Figure 15: Screenshot of role assignment for Energy_CompositeRole

This configuration process of role assignment per user and the associated data processing settings in accordance with the data classification is, as mentioned above, a manual process to be performed by the Smart Data Platform operator.

This manual configuration could in future be replaced by an automated process in which a smart city platform decides “itself” how certain data should be handled and which system operations should be allowed for which components. This could be possible if every data set directly carried a classification attribute that automatically defines and triggers the role composition and the associated access restrictions for all system components and platform users. If a larger number of use cases evolve and resemble each other in such a way that they require similar system operations, it could be helpful to automate this process. For the research context of SMARTER TOGETHER, with a limited number of use cases that are very different from each other, it was decided that the effort for automation would have outweighed the added value it would have provided.

4.2 DGK Data Model and Processing Rules

This report has documented how data needs to be handled in the SDP according to the data classification and other requirements set by the Data Gatekeeper concept. One important outcome of the Data Gatekeeper is the data model that documents the different role schemes of the various stakeholders for handling data sets. The data model is shown in the figure below and illustrates the different components and roles in the SDP as well as the flow of data through the system.

Figure 16: Data Gatekeeper Data Model


The data model consists of rectangles representing system components and system actors as well as arrows representing system interactions such as events or information flow. The data model is structured in four sectors: “Inside Smart Data Platform”, “User Roles”, “External” and “Smart Data Systems”.

In order to explain the data model, the flow of data provided by an external data source is followed through the different processing steps of the SDP up to a third party system requesting data from the SDP over the SMARTER TOGETHER API catalogue. The initial data set is provided by a third party data source such as a backend, which is administered and operated by the Supplier's Technical Data Steward. Over an API or file export, the data is imported into the data warehouse of the SDP. This raw data set is immediately processed according to the specific data classification and/or aggregation/pseudonymization mechanisms set in the Data Gatekeeper Registry for this data source. If, for example, an aggregation mechanism is necessary, the raw data is processed accordingly and, from that point on, stored in the data warehouse as a “Derived Data Set”. The initial raw data may be deleted after that process if required by the data classification. The data classification and aggregation framework is set by the data owner after discussion with other stakeholders, especially the “owner” of that specific use case.

The derived data set is then used for visualisation purposes in the analysis dashboard and/or for publication towards third party systems over the SMARTER TOGETHER API. A user of the analysis dashboard is only able to see the visualisation of that derived data set if he or she is granted access by the access authorization service (SSO). Which users are allowed to see which visualized analyses is administered in the DGK Registry and decided by the data provider and the use case owner. Access to the SMARTER TOGETHER API is restricted by the same framework: external systems or apps can request the derived data set over the API catalogue only if they are unlocked for this specific part of the API. The decision on which external systems can access which data over the SMARTER TOGETHER API catalogue is taken by the owner of the data and the use case owner.

4.3 Functional Library for Data Processing

As some of the data sets to be integrated in the SDP need to be processed according to the specific needs of each data classification, the platform provides a toolset of data processing mechanisms. Data providers and use case owners can choose from this toolset how specific data sets should be processed within the SDP. The platform offers different mechanisms to anonymize / pseudonymize data sets as well as different options to aggregate data to meet data privacy requirements. The data analysis and data access functionalities, in turn, do not offer options to choose from; for these topics, the table states which functions are implemented in the SDP.

Anonymization / Pseudonymization of person-related data

After the integration of raw data into the Smart Data Platform over the JSON REST interface, data can be anonymised / pseudonymised before it is written into the platform's database.

Functionality 1: Removing person-related data fields

For a full anonymization of the transferred data sets, data fields containing person-related data can be removed completely before the data is written into the database. These data fields need to be specified by the data provider and other use case stakeholders.

Functionality 2: Removing ID digits

Before raw data sets are written into the database, the person-related IDs in specific data fields can be reduced by a number of digits to be specified by the data provider and other use case stakeholders. If the data set contains a person-related ID such as “D42197334”, it can be written into the database as e.g. “D42197” or “D42197XXX”. The number of digits cut off needs to ensure that no data can be traced back to a specific person.

Functionality 3: Creating hash values

For data pseudonymization purposes, a hash value can be created for a person-related ID before the data is written into the database. The content of a specific data field is replaced by a new “string” created with the algorithm of the hash function. The created hash value can only be traced back to the original data values with the correct key that was used for creating the hash value. This key can either be deleted or safely stored in a different system.

Data Aggregation

Another set of functions to meet data protection requirements is the aggregation of data, either before writing it into the platform's database or before providing data to the platform's analysis module.

Functionality 1: Remove subcategories of a data set

Subcategories of a data set can be removed. If a data field contains a category and there are subcategories specifying this category even further, these subcategories can be removed. For example, if the data set contains a location category “city” and the subcategories “district” or “street”, the latter two can be removed so that the data set is aggregated to city level.

Functionality 2: Create an average value for a defined period of time

If data sets contain measured values for an interval of e.g. 5 minutes, but due to data protection guidelines only measured values for e.g. an hour should be processed in the platform, an average value for the desired period of time can be calculated (either before writing the data set into the database or before further processing in the analysis module).

Functionality 3: Aggregate (sub-)categories

Categories of a data set can be aggregated to a “higher” category level. If a data set contains, for example, the category “district”, the districts “1” and “2” can be aggregated to the new category “1/2” to provide a higher aggregation level.

Access to data

The Smart Data Platform foresees two ways of accessing the data. The first way is to access the visualized analyses of data in the analysis dashboard. The second way is to request data over a JSON REST API, e.g. for integration in applications. In both cases, the user needs to have the access rights to receive certain data. For some data, it might for example be necessary to sign a licensing contract with the data provider. Once all requirements to access certain data are fulfilled and approved “offline”, the Smart Data Platform will grant access to certain analyses or APIs with the implemented multi-client capability.

Functionality 1: Access to visualized analyses with log-in credentials

On the analysis dashboard the user needs to log in and will only be able to access the visualized analyses connected to his access rights, which are to be approved by the use case stakeholders.

Functionality 2: Access over JSON REST API

Access to a specific API will be granted after approval by the use case stakeholders, especially the data provider. The endpoint of the API will be secured via OAuth2 with JSON Web Token (JWT). The access token will be exchanged in the HTTP headers. The access token contains a signed JSON file including the defined access rights of the user.

Table 3 : SDP Functional Library
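The three anonymization / pseudonymization functions of the table can be illustrated with a short Python sketch. It is not the platform's code: the function names are hypothetical, and the use of HMAC-SHA256 as the keyed hash is an assumption, since the document only states that a hash function with a key is used.

```python
import hashlib
import hmac


def remove_fields(record: dict, person_fields: set[str]) -> dict:
    """Functionality 1: drop the data fields specified as person-related."""
    return {k: v for k, v in record.items() if k not in person_fields}


def truncate_id(value: str, digits: int, mask: bool = False) -> str:
    """Functionality 2: cut off the last `digits` characters, optionally masking them with X."""
    digits = max(0, min(digits, len(value)))
    kept = value[:len(value) - digits]
    return kept + "X" * digits if mask else kept


def keyed_hash(value: str, key: bytes) -> str:
    """Functionality 3: replace an ID by a keyed hash (HMAC-SHA256 is an assumption)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


record = {"person_id": "D42197334", "name": "hypothetical name", "temp_c": 21.5}
anonymised = remove_fields(record, {"name"})
anonymised["person_id"] = truncate_id(anonymised["person_id"], 3, mask=True)
```

With the same key, `keyed_hash` always maps an ID to the same pseudonym, so records can still be linked to each other; deleting the key removes the practical means of re-identification by recomputation.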

5. Data Exchange in Use Cases

This chapter illustrates how the platform processes different data in the four main use cases of the Munich smart data pilot. Now that the general system and the data processing framework have been explained, it briefly shows the workflow of the SDP in the concrete project use cases.

5.1 Energy

The use case “Energy” is about collecting data that measure temperature and air humidity in flats. A measurable direct effect on energy efficiency and energy consumption is not part of this use case.

Two data sources are integrated in the Energy use case: first, smart home sensors of the provider Securitas, which measure temperature and humidity in different apartments and buildings, and second, outdoor weather data (temperature and humidity) from a weather station of the DWD (German Meteorological Office) located in Munich. In the Securitas backend, all single measurements of the different sensors are bundled in a database. This database is exported by a tool provided to Securitas by the Smart Data Platform operator. The tool sends JSON files with the current measurements over the MQTT protocol to the SDP, where the raw data is stored in the data warehouse.

The DWD weather data is accessible to third parties (in this case the SDP) as .csv files from the DWD weather data server. As only historical data is needed for the further analyses in the SDP, the file transfer is scheduled only once a day.

The two data sources are used for analyses visualized in the analysis dashboard under the menu point thematic analyses > energy. Only users that are specifically unlocked for the energy analyses can open this menu point.

Figure 17 : SDP Workflow of Use Case Energy

The raw data of these two data sources are processed, analysed, and combined so that they can be presented in the analysis dashboard of the SDP. The screenshots below show two different visualisations of the analysed data. In the first screenshot the temperature measurements of different smart home sensors are visualised throughout a single day. For the same day the outdoor temperature is included based on the DWD weather data for comparison.

Figure 18: Screenshot 1 Energy analyses

Furthermore, a comfort analysis has been implemented. This analysis presents the measurements of a smart home sensor, along with the dimensions of indoor temperature, absolute humidity and relative humidity, that lie within the so-called comfort field for indoor climate. Using the DWD weather data, the measurements can be filtered, for example, to a day with certain outdoor weather conditions such as the coldest day of the year.

Figure 19: Screenshot 2 Energy analyses
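A comfort analysis of this kind could be sketched as follows. The bounds of the comfort field used in the SDP are not given in this document, so the `COMFORT` ranges below are purely hypothetical placeholders; the absolute humidity is derived from temperature and relative humidity with a common Magnus-type approximation, which is also an assumption about how that dimension is obtained.

```python
import math


def absolute_humidity(temp_c: float, rel_humidity_pct: float) -> float:
    """Approximate absolute humidity in g/m^3 from temperature (deg C) and relative humidity (%)."""
    saturation_hpa = 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))
    return saturation_hpa * rel_humidity_pct * 2.1674 / (273.15 + temp_c)


# Hypothetical comfort field; replace with the bounds agreed for the use case.
COMFORT = {"temp_c": (20.0, 24.0), "rel_humidity_pct": (35.0, 65.0)}


def in_comfort_field(measurement: dict) -> bool:
    """True if every monitored dimension lies inside its comfort range."""
    return all(low <= measurement[key] <= high for key, (low, high) in COMFORT.items())


def comfort_share(measurements: list[dict]) -> float:
    """Share of measurements (0..1) that lie inside the comfort field."""
    if not measurements:
        return 0.0
    inside = sum(1 for m in measurements if in_comfort_field(m))
    return inside / len(measurements)


day = [
    {"temp_c": 21.0, "rel_humidity_pct": 50.0},
    {"temp_c": 17.5, "rel_humidity_pct": 70.0},
]
share = comfort_share(day)
```

Filtering to a particular day, such as the coldest day of the year according to the outdoor data, would simply mean selecting the measurements of that date before calling `comfort_share`.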

5.2 Intelligent Lampposts

The Intelligent Lampposts use case contains data from sensors of multiple sensor operators, which are installed at different lampposts across the pilot area. It can be divided into the two sub-topics of environmental sensors and traffic sensors.

For all sensor use cases, the data is first sent from the sensor modules to the respective operator backends, where the data is bundled and processed. Over different JSON REST interfaces provided by the sensor operators, the data is transferred (pull) to the Smart Data Platform, where the raw data is stored. Additionally, the static data of the lampposts (address, location, height etc.) is imported as a JSON file provided by the geo data server of Munich.

The different sensor data are processed and combined by the analytics module of the SDP and the resulting data is stored so that it can be requested by the analysis dashboard for visualisation.

In this use case, the lamppost data is also provided to a 3rd party system over the SMARTER TOGETHER API.

Figure 20: Workflow Use Case Lampposts

The JSON REST API over which the Munich Smart City App accesses the lamppost data is provided by the Smart Data Platform via an API catalogue. With GET requests the app can retrieve data from three different endpoints. The Munich Smart City App has to register for each of these APIs, and access is granted by the single sign-on service according to data classification and agreements among the project stakeholders. Each GET request is secured via HTTPS, and access is authorized via OAuth2 with JSON Web Tokens. The app can access the data only if a valid token is included in the GET request. The data fields of the interface and the request parameters were designed to meet the specific use case requirements for displaying the values in the Smart City App.

Via the “authorize” button, the user, in this case the Munich Smart City App, requests access to the specific API. Once access is granted, real data can be retrieved. The request parameters that are possible for the API are listed in the API documentation.
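A client-side view of such a secured GET request can be sketched with the Python standard library. The URL below is a hypothetical placeholder (the real endpoints are listed in the API catalogue), and passing the access token as a `Bearer` token in the `Authorization` header is the usual OAuth2 convention, assumed here. `read_jwt_claims` only decodes the token payload for inspection and does not verify the signature.

```python
import base64
import json
import urllib.request


def build_get_request(url: str, access_token: str) -> urllib.request.Request:
    """Prepare an HTTPS GET request that carries the access token in the HTTP headers."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}", "Accept": "application/json"},
        method="GET",
    )


def read_jwt_claims(token: str) -> dict:
    """Decode the payload part of a JWT WITHOUT verifying its signature (inspection only)."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Build an unsigned demo token so the sketch runs without any server.
demo_claims = {"sub": "munich-smart-city-app", "scope": "lampposts:read"}
demo_payload = base64.urlsafe_b64encode(json.dumps(demo_claims).encode()).rstrip(b"=").decode()
demo_token = f"e30.{demo_payload}.signature-placeholder"

request = build_get_request("https://sdp.example.invalid/api/lampposts", demo_token)
# urllib.request.urlopen(request) would send the request to a real endpoint.
```

The server, not the client, is responsible for verifying the token signature and the access rights it contains before returning any data.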

The Munich Smart City App uses this data to display the positions of the Intelligent Lampposts in the pilot area and to show the measured values of the sensors installed at the lampposts. Both current measurements and historical value diagrams, e.g. the development of values in the last week, are shown to the app user.

Figure 21: Screenshots from Munich Smart City App with Intelligent Lampposts

Different analyses are calculated for the Intelligent Lampposts use case and visualized in the analysis dashboard, where they can be accessed by dashboard users who are unlocked for the topic of Intelligent Lampposts. The screenshot below shows an example of such an analysis. In this diagram, the user can choose freely among all environmental sensors and display the different measured values (temperature, relative humidity, wind speed, air pressure, wind direction, SO2, CO, NO2, ozone, PM2.5, PM10) over the specified period of time. Different aggregation levels can be chosen to display single measurements as well as hourly, daily and monthly average values. To compare different value types, e.g. temperature with SO2, the diagram can display the values on two y-axes.

Figure 22: Screenshot Intelligent Lampposts analyses
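The selectable aggregation levels (hourly, daily, monthly averages) can be illustrated with a small helper. This is a sketch under the assumption that measurements arrive as (timestamp, value) pairs; `average_by` is a hypothetical name, not a function of the platform.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Bucket label per aggregation level.
BUCKET_FORMATS = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}


def average_by(measurements: list[tuple[datetime, float]], level: str = "hour") -> dict[str, float]:
    """Average single measurements per hour, day or month."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for timestamp, value in measurements:
        buckets[timestamp.strftime(BUCKET_FORMATS[level])].append(value)
    return {bucket: mean(values) for bucket, values in sorted(buckets.items())}


temperature = [
    (datetime(2019, 1, 15, 10, 0), 20.0),
    (datetime(2019, 1, 15, 10, 5), 22.0),
    (datetime(2019, 1, 15, 11, 0), 19.0),
]
hourly = average_by(temperature, "hour")
daily = average_by(temperature, "day")
```

The same helper covers the averaging function of the functional library, where 5-minute values are reduced to hourly values before storage or analysis.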

5.3 Mobility

In the Mobility use case, different data on the mobility offerings at the pilot area's mobility stations are integrated in the SDP. For this use case, the data is not transferred to the SDP over an API but as .csv files regularly provided by the Mobility use case owner to the SDP operator. These .csv files are initially generated by the operators of the different mobility offerings at the mobility stations, such as MVG for bike sharing or STATTAUTO for carsharing. The data is processed for visualisation in the analysis dashboard.

Figure 23: Workflow Use Case Mobility

The analyses implemented for Mobility are essentially usage statistics of the different mobility services in the pilot area. For each mobility station, it can be analysed how the use of the services has evolved over the last days, weeks or months. Furthermore, the usage data of the different services is combined with weather data from the environmental sensors integrated in the Intelligent Lampposts use case to investigate how different weather conditions influence the use of, for example, bike sharing.

Figure 24: Screenshot Mobility analyses
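Combining usage data with weather data could look like the following sketch. It assumes daily rental counts and daily mean temperatures keyed by date; the function name, the 10 °C threshold and the sample numbers are hypothetical and serve only to show the join-and-group step.

```python
from collections import defaultdict
from statistics import mean


def rentals_by_temperature(daily_rentals: dict[str, int], daily_mean_temp_c: dict[str, float],
                           threshold_c: float = 10.0) -> dict[str, float]:
    """Average daily rentals on cold vs. mild days (days without weather data are skipped)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for day, rentals in daily_rentals.items():
        if day not in daily_mean_temp_c:
            continue
        label = "cold" if daily_mean_temp_c[day] < threshold_c else "mild"
        groups[label].append(rentals)
    return {label: mean(values) for label, values in groups.items()}


# Made-up sample values for illustration only.
rentals = {"2019-01-14": 12, "2019-01-15": 8, "2019-04-20": 40, "2019-04-21": 60, "2019-04-22": 55}
temps = {"2019-01-14": 1.5, "2019-01-15": -2.0, "2019-04-20": 16.0, "2019-04-21": 19.5}
result = rentals_by_temperature(rentals, temps)
```

The day without a temperature value is ignored rather than guessed, so gaps in the sensor data do not distort the comparison.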

5.4 KPIs

The purpose of the KPI use case is to present the key performance indicators of the SMARTER TOGETHER project in a bundled way in the analysis dashboard, the central visualisation interface.

For each reporting period of SMARTER TOGETHER, a .csv file with all KPIs for all topics to be monitored is provided to the SDP by the monitoring task leader. Based on these numbers, the SDP visualizes the overall KPI tables and, derived from them, different diagrams of KPI development. An exception is the topic of Energy: here, the KPIs are not provided by the task leader but are calculated automatically by the SDP based on data integrated over a JSON REST API from an external database called E-Manager, in which all energy-related developments are documented by the owner of the Energy use case.

Figure 25: Workflow Use Case KPIs

For each of the KPI topics Participation, Energy, Integrated Infrastructure, Mobility, Monitoring, Replication and Other, the respective KPIs are presented in an overview table. For many KPIs, diagrams show the development of the value over time as well as the current status, i.e. the percentage of the target value that has already been achieved. All tables and diagrams, including the diagrams of the other use cases described above, can be exported as .csv or .svg files.

Figure 26: Screenshot KPIs analyses
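The status value "percentage of the target achieved" can be computed directly from a KPI file. The layout of the real .csv file is not described in this document; the column names `topic`, `kpi`, `value` and `target` and the sample rows below are assumptions made for this sketch.

```python
import csv
import io

# Hypothetical file content; the real KPI names and columns may differ.
SAMPLE_CSV = """topic,kpi,value,target
Mobility,Hypothetical KPI A,30,120
Energy,Hypothetical KPI B,45,60
"""


def kpi_status(csv_text: str) -> list[dict]:
    """Return one row per KPI with the share of the target achieved, in percent."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value, target = float(row["value"]), float(row["target"])
        achieved = round(100.0 * value / target, 1) if target else None  # no target, no status
        rows.append({"topic": row["topic"], "kpi": row["kpi"], "achieved_pct": achieved})
    return rows


status = kpi_status(SAMPLE_CSV)
```

A KPI without a target value gets no percentage instead of causing a division error, so such rows can still be listed in the overview table.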

6. Conclusion

The Smart Data Platform, as an evolution of the former City Intelligence Platform, was designed and implemented to meet all requirements of the SMARTER TOGETHER use cases regarding the integration, processing, analysis and exchange of data. The Intelligent Lampposts use case exemplifies the full platform workflow: integrating lamppost data from various data providers, processing, combining and analysing the data and, finally, visualising the analyses in a user interface and providing data to a 3rd party over a dedicated SMARTER TOGETHER API.

A wide range of different APIs has been implemented between data providers and the Smart Data Platform. In half of the cases, the JSON REST APIs have been designed and provided by the SDP, and in the other cases proprietary APIs (also JSON REST) have been provided by data providers for implementation by the SDP. The implementation of the Smart Data Platform in SMARTER TOGETHER has demonstrated that a wide range of technical interfaces and very different data types can be integrated into one system and brought together meaningfully for cross-topic analyses and use cases.

It has turned out that the user interface for visualisation of analyses, the analysis dashboard, is a crucial component of the Smart Data Platform, as this dashboard makes all data processing mechanisms visible and represents an important added value for actual users. Even if the backend services require more development effort, the analysis dashboard is the most important visible part of the platform within SMARTER TOGETHER. As a basis for the actual development process, it can be summarized that many efforts have to be made for designing, discussing and redesigning the analyses. It has proved helpful to work with several loops of screen designs as a development basis, agreed by all involved stakeholders, especially the later users of the analyses. As the challenge lies in creating meaningful analyses based on large amounts of raw data, defining filter options for the user interface, and deciding on a suitable visualisation, this process has turned out to be a very manual one. Based on the experience within the use cases of SMARTER TOGETHER, it seems impracticable to design automated standard analyses.

It can also be summarized that the system architecture and the design of the single components cannot stand on their own but have to reflect, in their functionalities, the specific data processing framework defined by the various use case stakeholders. The data model and data classification framework of the DGK had to be translated into specific technical implementations in the Smart Data Platform, as described in chapter 4. As documented, this is still mainly a manual configuration and not an automated process.

Now that all use cases have been fully implemented, the Smart Data Platform is ready for the operational phase from February 2019 until the end of the project. Apart from small improvements, the efforts for the SDP thus shift from development to hosting and operation of the implemented APIs, services and applications to guarantee a stable flow of data through the system. Furthermore, the Smart Data Platform will be used for the upcoming monitoring and replication tasks.

7. Annex

The concept description “Data Gatekeeper” (DGK) is added as annex (but as a separate document) to this deliverable. Although the DGK is an independent concept that is applicable to any Data Platform environment, the Smart Data Platform frequently refers to the DGK when integrating SW building blocks, data classification and the complete data model.

The project partner Fraunhofer IAO designed the DGK concept in close cooperation with Landeshauptstadt München.

Smart and Inclusive Solutions for a Better Life in Urban Districts

Data Gatekeeper Munich

Final Documentation

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 691876

SMARTER TOGETHER - Data Gatekeeper Munich - 22/02/2019 2

Table of Contents Table of Contents .................................................................................................................................... 2

List of Figures ........................................................................................................................................... 7

Preambel ................................................................................................................................................. 7

Executive Summary ................................................................................................................................. 9

Introduction ................................................................................................................................... 12

1.1 General Approach of the Data Gatekeeper ........................................................................... 14

1.2 Structure of the Document ................................................................................................... 17

1.3 Use Case Example “Smart Home” ......................................................................................... 19

Strategic Guidelines ....................................................................................................................... 23

2.1 Legal Regulations ................................................................................................................... 24

2.1.1 Privacy for Personal Data .............................................................................................. 25

2.1.2 Opening Administrative Data ........................................................................................ 28

2.1.1 Use Case Reflection ....................................................................................................... 29

2.1.2 Golden Rules .................................................................................................................. 30

2.2 Internal Compliance .............................................................................................................. 30

2.2.1 Use Case Reflection ....................................................................................................... 31

2.2.2 Golden Rules .................................................................................................................. 32

2.3 Smart Data Infrastructure Basics ........................................................................................... 33

2.3.1 Data Protection Concept ............................................................................................... 34

2.3.2 Data Security ................................................................................................................. 35

2.3.3 Hardware Location ........................................................................................................ 35

2.3.4 Data Integration, Processing and Analysis .................................................................... 36

2.3.5 Use Case Reflection ....................................................................................................... 37

2.3.6 Golden Rules .................................................................................................................. 39

2.4 Smart Data Strategy .............................................................................................................. 40

2.4.1 Drivers for Smart Data ................................................................................................... 41

2.4.2 The Formulation of a City-specific Strategy .................................................................. 42

2.4.3 Use Case Reflection ....................................................................................................... 44

SMARTER TOGETHER - Data Gatekeeper Munich - 22/02/2019 3

2.4.4 Golden Rules .................................................................................................................. 45

2.5 Responsibilities ...................................................................................................................... 46

2.5.1 Strategy ......................................................................................................................... 47

2.5.2 Participation .................................................................................................................. 48

2.5.3 Compliance .................................................................................................................... 49

2.5.4 Operations ..................................................................................................................... 50

2.5.5 Data Provision................................................................................................................ 52

2.5.6 Data Consumption ......................................................................................................... 53

2.5.7 Use Case Reflection ....................................................................................................... 54

2.5.8 Golden Rules .................................................................................................................. 56

2.6 Participation .......................................................................................................................... 56

2.6.1 Use Case Reflection ....................................................................................................... 58

2.6.2 Golden Rules .................................................................................................................. 61

Smart Data Creation Guidelines .................................................................................................... 62

3.1 Requirement Specification .................................................................................................... 63

3.1.1 User Goals ...................................................................................................................... 63

3.1.2 Use Case Project Information ........................................................................................ 63

3.1.3 Specification Checklist ................................................................................................... 64

3.1.4 Agreement Meeting and Approval ................................................................................ 66

3.1.5 Use Case Reflection ....................................................................................................... 66

3.1.6 Golden Rules .................................................................................................................. 67

3.2 Data Categorisation ............................................................................................................... 68

3.2.1 Categories ...................................................................................................................... 69

3.2.2 Use Case Reflection ....................................................................................................... 71

3.2.3 Golden rules .................................................................................................................. 72

3.3 Data Usage Agreement and Licensing ................................................................................... 72

3.3.1 Standard Licences .......................................................................................................... 73

3.3.2 Individual Agreement or Side Letter ............................................................................. 74

3.3.3 Use Case Reflection ....................................................................................................... 75

3.3.4 Golden Rules .................................................................................................................. 75

SMARTER TOGETHER - Data Gatekeeper Munich - 22/02/2019 4

3.4 Data Classification ................................................................................................................. 75

3.4.1 Integrity and Data Protection ........................................................................................ 77

3.4.2 Processing and Analysis ................................................................................................. 77

3.4.3 Redistribution and modification .................................................................................... 78

3.4.4 Storage and Deletion ..................................................................................................... 78

3.4.5 Use Case Reflection ....................................................................................................... 78

3.4.6 Golden Rules .................................................................................................................. 80

3.5 Quality Gate 1 ........................................................................................................................ 81

3.5.1 Elaborate Final Use Case Concept ................................................................................. 81

3.5.2 Final Concept Approval (Q1) ......................................................................................... 81

3.5.3 Sign Data Usage Agreement .......................................................................................... 81

3.5.4 Use Case Reflection ....................................................................................................... 81

3.5.5 Golden Rules .................................................................................................................. 82

3.6 Data Sets and Technical Implementation.............................................................................. 83

3.6.1 Create Data Scheme and Interface Specification .......................................................... 83

3.6.2 Create Technical Specification....................................................................................... 83

3.6.3 Technical Specification Approval ................................................................................... 84

3.6.4 Implementation and Data Set Creation......................................................................... 84

3.6.5 Implementation of Data Interface ................................................................................. 84

3.6.6 Develop App or Hardware solution ............................................................................... 85

3.6.7 Use Case Reflection ....................................................................................................... 85

3.6.8 Golden Rules .................................................................................................................. 85

3.7 Data Processing ..................................................................................................................... 85

3.7.1 Anonymisation and Pseudonymisation ......................................................................... 87

3.7.2 Data Aggregation ........................................................................................................... 88

3.7.3 Golden Rules .................................................................................................................. 89

3.8 Quality Gate 2 ........................................................................................................................ 90

3.8.1 Specification and Implementation Review .................................................................... 90

3.8.2 Use Case Integration Test .............................................................................................. 90

3.8.3 Go Live Pilot Phase ........................................................................................................ 90

SMARTER TOGETHER - Data Gatekeeper Munich - 22/02/2019 5

3.8.4 Analysis Test

3.8.5 Final Approval and Go Live

3.8.6 Golden Rules

Data Model

4.1 Exemplary Data Model

Smart Data Infrastructure

5.1 Basic Architecture

5.2 Data Integration

5.3 Connection with External Systems

5.4 Analysis Dashboard

5.5 Platform Statistics Dashboard

5.6 Transparency Dashboard

Conclusion and Outlook

Glossary

Bibliography

Appendix

7.1 Legal Regulations

7.1.1 Principles on Privacy for Personal Data

7.1.2 Objectives and Principles of Opening Administrative Data

7.2 Checklist Templates for the Smart Data Creation

7.2.1 Requirement Specification (Epic)

7.2.2 Data Categorisation

7.2.3 Data Usage Agreement and Licensing

7.2.4 Data Classification

7.2.5 Quality Gate 1

7.2.6 Data Sets and Technical Implementation

7.2.7 Data Processing

7.2.8 Quality Gate 2

7.3 RACI

7.4 Categories

7.5 Classification

7.6 Data Model

List of Figures

Figure 1: General Approach for Smart Data on Strategic, Organisational and Technical level
Figure 2: Overview of the concept and the structure of the document
Figure 3: IT-Components of the Use Case
Figure 4: Categorisation of the Use Case
Figure 5: Classification of the Use Case
Figure 6: Data Model: Inside Smart Data Platform and External
Figure 7: Data Model: Smart Data Systems
Figure 8: Users relevant for the Smart Data Platform
Figure 9: Data Model of a specific Smart Data Platform with two Use Cases
Figure 10: Smart Data Platform Layer Architecture
Figure 11: Smart Data Platform Layer Architecture and Data Gatekeeper Registry
Figure 12: Platform Core Functionalities and their Administration
Figure 13: Exemplary analysis of Smart Home Use Case
Figure 14: Example of a Transparency Dashboard Prototype for Smarter Together Munich

Preamble

The idea of creating a Data Gatekeeper concept was born during the early preparation phase of the EU project “Smarter Together” in Munich. All team members were convinced that, when dealing with subjects like Smart City, Digitalisation or Smart Data Analysis in a City environment, the question of how to handle data correctly will become a fundamental leitmotif. In contrast to an institution that collects, interprets and sells data for purely monetary business purposes, a City has various motivations and underlying benefit models for how and why it handles Smart Data.

It quickly became obvious that “handling data correctly” comprises several aspects. Not only the technical and legal aspects need to be discussed, but also the ethical question of how a City wants to use data in a “digitalized future”. In an increasingly digitalized world, owning and handling data is becoming, from a City Administration perspective, a game-changing lever for the ability to steer and plan a complex future City. Doing this as transparently and yet as efficiently as possible from the beginning is a key factor in convincing and involving all potential stakeholders and the citizens in and around a City, whose participation is required to reach a new level of a future City.

This concept should be regarded as a living document rather than a complete and conclusive final paper. It is intended as a helping hand mainly, but not exclusively, for the IT strategy and City planning management of a City Administration, whose task is to start, design or implement innovative data-driven Smart City and Digitalization concepts and projects.

In order to discuss the Data Gatekeeper approach as comprehensively as possible, this document offers several functional viewpoints that cover as many of the evolving aspects as possible. These viewpoints should be seen as impulses for better understanding the interdisciplinary nature of such a concept.

In a first step we look at the topic from a legal and City-strategic perspective. In parallel, and to reflect more practical experience, we continuously describe and discuss a real Use Case of the Smarter Together project, offering “Golden Rules” that can be read as potential best-practice Data Gatekeeper experiences. Last but not least, the IT perspective and the underlying technical implementation models are described in order to show possible approaches for integrating a Data Gatekeeper concept into an existing IT landscape or into a Smart City Data Platform. A short overview of the underlying Data Model and IT structures is also provided.

Capitalised terms refer to chapters or to glossary entries.

Executive Summary

Urban contribution to and shaping of the development of Smart Cities is important for keeping up with the digitized industry. Digitalisation is the main driver for the analysis of urban challenges, as it produces more and more data and accelerates information-exchange processes. When citizens’ data is processed in a Smart City context, the transparency called for in recent data privacy debates needs to be ensured.

The role of a Smart City in the creation of Smart Data is to position itself as a trustworthy partner, to install IT solutions that allow efficient data exchange and analysis, and to comply with national and international law and regulations. Moreover, sufficient maintenance and governance of data as well as the prevention of security vulnerabilities require continuous effort in order to ensure data quality, data privacy and compliance.

Therefore, since a Smart City is an interdisciplinary challenge, this document proposes a holistic concept that covers the strategic anchoring, the creation of Use Cases and the underlying Smart Data Platform concept. The so-called “Data Gatekeeper concept” is a generic concept that can be realized in a City, in a consortium of local towns or in other urban environments in a practically oriented and comprehensible manner.

A systematic approach is applied in this document to guide cities through the implementation of organisational structures and a Smart Data Platform. These guidelines are illustrated in a practical manner with a paradigmatic Use Case, “Smart Home”. Data such as temperature, humidity and battery status is submitted to the Smart Data Platform in order to analyse the benefit of refurbishment. The data is made available to the voluntarily participating residents via an app for information and comparison with other participants.

The Data Model is the essential bridge between the Data Gatekeeper as an organisational concept for Smart Data and the Smart Data Platform. It translates the considerations and specifications of the Strategic and Smart Data Creation Guidelines into logical IT-oriented structures, including all available and selectable options that characterize the components needed within the Smart Data Platform.

The Smart Data Platform allows efficient data exchange between City stakeholders and offers innovative data analysis methodologies. Its architecture includes the Transparency Dashboard as an additional offering that makes data processing transparent to citizens for all Use Cases.

Driving motivations for a City to create Smart Data may be to increase quality of life, to improve environmental sustainability through fewer emissions or, in the case of the “Smart Home” Use Case, to support renovation planning. Drivers on the side of companies can be new revenue opportunities or improvements in efficiency, productivity or decision making. Examples of benefits to citizens are cost savings, e.g. for energy, and time savings, e.g. in public transportation and traffic.

The recommended tasks of a Smart City are to

1. make use of appropriate IT solutions that convert available data into Smart Data, and introduce a collaborative data platform that allows efficient data exchange between the City stakeholders,

2. maintain and govern data sufficiently, so that no data becomes obsolete and no privacy flaws occur,

3. position itself as a trustworthy partner in the context of data collection and handle data security and privacy in conformity with society by

o involving many stakeholders by means of co-creation activities

o building a positive transparency image, e.g. with a publicly accessible website

o setting up responsibilities and organisational data governance structures in order to ensure that rules, including approval steps, are strictly followed by the stakeholders

o complying with national and international law and regulations at any time,

4. define a Smart Data Strategy including a long-term vision as well as a related financial budget to ensure backing and sponsorship by the City’s mayor and municipal council,

5. raise awareness and educate the citizens, as well as internal and external Data Suppliers, about how and where data is collected, which analyses and evaluations are to be generated or expected from the raw data collected, and to whom the evaluated data will be made available.

The development of Smart Cities is important for keeping up with industrial digitalization and for not ceding dominance to individual companies. The threshold for entering a digitized city is lower if relationships with suppliers already exist and one can build on existing structures. Actively innovating is important to avoid expensive catching up on missed developments.

Introduction

The expression “Smart City” is used in discussions on urban projects like e-mobility, energy efficiency, City planning or environmental protection. The term Smart City can be defined as the provision of all kinds of new innovative services, as well as of existing services in a more efficient manner, and includes the usage of information and communication technologies in cities.

“Smart Data” is the key for these innovations. Within the given context the term is

defined as digital information that is filtered, processed, formatted and optimized from

all available data sources (e.g. sensors) in a way that is ready to use and allows more

precise and focused analytics of the use cases. So what makes Smart Data smart is

that it allows a much faster extraction of valid and meaningful information than

“conventional” data.

Driving objectives are e.g. to find answers to challenges like monitoring aging

infrastructure, handling the increase of traffic, improving the urban administration’s

effectiveness and efficiency as well as raising citizens’ quality of life in metropolitan

areas.

Today, a main driver that influences the way problems can be analyzed and

eventually be solved is digitalization. The process is called “digital transformation”

which takes place in industries (“Industry 4.0”), governments (“Smart/Future Cities”) as

well as in the society as a whole.

Especially in very privacy-aware countries such as Germany, social media websites and internet companies currently tend to have an image of over-collecting data. Additionally, over the past years many data privacy issues have been raised by national and European governmental and judicial bodies, which have complained in particular about non-transparent policies (cf. BBC, 2012). If actors and stakeholders in a Smart City context also collect and process larger amounts of citizens’ data, they need to ensure outstanding transparency from the very beginning.

Digital transformation is often used to specify and accelerate collaboration processes

between involved stakeholders and their underlying processes of exchanging

information. Therefore, the role of a City in the context of digitalization is rapidly

changing as more and more data from citizens, authorities and private organisations

is being generated.

This implies three main requirements for a future-proof City:

– First, determine how cities position themselves as trustworthy partners in the context of data collection and how they handle data security and data privacy issues. An open and transparent position will determine whether citizens accept or reject “Smart Data” concepts and proposed “Smart City” solutions in the future.

– Second, make use of appropriate IT solutions that convert available data (including potentially large amounts of data) into Smart Data, and introduce a collaborative data platform that allows efficient data exchange between the City stakeholders and offers innovative data analysis methodologies for more integrated City planning activities (“Smart City Platform”). Smart Data in that context can be interpreted as data that is filtered and optimized from all available data sources in a way that allows more precise and focused analytics of the originally described topics or use cases.

– Third, cities need to ensure they comply with national and international law and regulations at any time. Even leaving public visibility aside, non-compliance leads to increasing financial penalties, especially with regard to new EU regulations like the GDPR, and to growing mistrust from the public.

Even when all three requirements are addressed adequately, it is not sufficient to just install a technical solution or instruct an IT service provider to operate a Smart Data Platform. If data is not maintained and governed sufficiently, it becomes obsolete. On the technical side, security vulnerabilities arise from outdated software releases. A Smart Data Platform as a technical solution for a Smart City must therefore be complemented by a strategic and organisational counterpart that ensures data quality, data privacy and compliance. Such a holistic socio-technical solution can be understood as a “managed data control area” which ensures that the City’s own trustworthiness rules, given compliance regulations and essential data security and data privacy rules are met at any time.

The present document describes a concept for operating a management system involving such a “data control area”, including the underlying guiding principles. The concept, named “Data Gatekeeper”, describes the implementation of organisational structures as well as a Smart Data Platform for cities, town consortiums, City districts or urban environments in a practically oriented and comprehensible manner.

The described Data Gatekeeper mainly consists of the following aspects:

– Transparent and traceable definition of compliance rules that all input and output data have to follow strictly. These rules cover the relevant data security legislation, data privacy aspects as well as individual policies that a City and other relevant stakeholders may have defined in a co-creation manner as indispensable for secure and reliable Smart Data Platform operation.

– Organisational data governance and workflow structures ensuring that every stakeholder strictly follows these rules, including the requested approval steps. Furthermore, clear responsibilities shall define who is allowed to change City-specific components of these rules.

– Well-documented principles, Data Schemes and workflows as part of the IT architecture that allow a Data Gatekeeper to be implemented in a given City environment, including the coupling of an IT data platform.
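The first aspect, compliance rules that every input and output record has to pass with a traceable result, can be illustrated with a minimal Python sketch. The rule names, the record fields and the helpers `check_record` and `is_compliant` are hypothetical and serve only as an illustration; they are not part of the Data Gatekeeper specification.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named compliance rule; names and predicates are illustrative only."""
    name: str
    predicate: Callable[[dict], bool]


# Hypothetical rules that a City might define together with its stakeholders.
RULES = (
    Rule("consent_given", lambda rec: rec.get("consent") is True),
    Rule("no_resident_name", lambda rec: "resident_name" not in rec),
)


def check_record(record: dict, rules: tuple = RULES) -> dict:
    """Evaluate every rule and return a per-rule result for traceability."""
    return {rule.name: rule.predicate(record) for rule in rules}


def is_compliant(record: dict, rules: tuple = RULES) -> bool:
    """A record may pass the gate only if all rules hold."""
    return all(check_record(record, rules).values())


accepted = {"flat_id": "flat_a", "temperature_c": 21.4, "consent": True}
rejected = {"flat_id": "flat_b", "resident_name": "Jane Doe", "consent": True}

print(check_record(accepted))
print(check_record(rejected))
```

Returning a result per rule, rather than a single yes or no, keeps the decision traceable: a rejected record shows which rule it violated.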

Therefore, the Data Gatekeeper concept plays an important role not only in helping to protect sensitive data against potential misuse, but also in underlining that reliable and transparent ways of arranging IT-based Smart City concepts and of handling Smart Data in general are crucial future success factors for any City.

The following subsections give a brief overview of the general approach of the Data

Gatekeeper concept and outline the document structure.

1.1 General Approach of the Data Gatekeeper

The Data Gatekeeper Concept is a strategic and organisational concept ensuring data quality, data privacy and compliance. It aims to be a generic concept that can be realized in a City, in a consortium of local towns or in another urban environment as a tool for maintaining a reputation for secure data handling.

Data Privacy

In general there are two approaches of addressing data privacy leaks: “a priori” and

“a posteriori” (cf. Li et al., 2016). Compliance management brings both together: It

prevents, detects and elaborates strategies for responding (cf. Holzmann, 2016, p.37).

The Data Gatekeeper concept implements an “a priori” approach specifically for

Smart Data in cities.

If the privacy impacts of a particular product or service are to be evaluated, a so-called Privacy Impact Assessment (PIA) is necessary (BSI 2011). A PIA must always be conducted on a concrete Use Case level and pursues the goals of ensuring legal conformity, determining risks and effects, and evaluating protection and mitigation measures. The quality gates described in the Use Case Creation Process of the Data Gatekeeper concept address this issue.

In case a City provides administrative data itself for a particular Smart City Use Case,

all guidelines and recommendations for opening governmental data apply (Manske,

Knobloch 2017).

In addition to the above-mentioned aspects, the impact of fundamental elements of the new GDPR (General Data Protection Regulation) is briefly discussed in this document.

Data Quality

The main principles of the following general approach stem from the concept of Data Governance, which is an organisational function to ensure data quality (cf. Scheuch, Gansor & Ziller, 2012). Data quality can be seen as a sub-aspect of compliance, as some privacy principles and regulations specifically demand that personal data be accurate, correct, precise and current (cf. EU, 2016; Wang & Kobsa, 2008).

The Data Gatekeeper Concept is a three-tier approach addressing strategy,

organisation and technology. On the strategic level, prerequisites regarding Legal

Regulations and City-internal compliance are taken into account as a basis for the

Smart Data Strategy, responsibilities and participation strategy (cf. Figure 2 and

structure of the document). Especially the City-specific Smart Data Strategy must be

defined once, taking into account all relevant stakeholders including a political

mandate from the municipal council of the City in order to have a clear commitment

on budget and planned activities.

Furthermore, a participation strategy that enhances citizens’ involvement is part of the concept.

On the organisational level, Smart Data responsibilities and operations are the next elements to be established according to the Smart Data Strategy (cf. Chapter 2). Those responsible for Smart Data should realize continuous improvement based on citizens’ feedback, analysis reports and Smart Data Platform statistics. Following the Smart Data Creation processes (cf. Chapter 3), digital smart services based on an evolving Data Model can then be created according to the requirements of citizens, companies and City departments. An important task of the smart services is ensuring processing transparency, e.g. by a Transparency Dashboard that provides data Classification details publicly to data owners and citizens.

The main focus of the organisational level is on the Use Case lifecycle processes. The Data Gatekeeper Concept focuses on the most important of them, the Use Case Creation Process. Further relevant processes are the Use Case Update Process, for instance in case a compliance shortcoming has been identified, and the Use Case Deletion Process. As two Smart Data Use Cases may share a single Data Set, the lifecycle processes of Data Sets are also relevant, even though they mostly run as part of a particular Use Case Creation or Update Process.
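The dependency between Use Case and Data Set lifecycles can be sketched in a few lines of Python. The `Registry` class, its method names and the second Use Case (“mobility”) are assumptions made for illustration only; the sketch merely shows that a Data Set can be retired only when no remaining Use Case references it.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Tracks which Use Cases reference which Data Sets (illustrative only)."""
    use_cases: dict = field(default_factory=dict)

    def create_use_case(self, name: str, data_sets: set) -> None:
        self.use_cases[name] = set(data_sets)

    def update_use_case(self, name: str, data_sets: set) -> set:
        """Replace the Data Sets of a Use Case; return Data Sets now unused."""
        old = self.use_cases[name]
        self.use_cases[name] = set(data_sets)
        return self._orphans(old)

    def delete_use_case(self, name: str) -> set:
        """Delete a Use Case; return Data Sets no other Use Case still needs."""
        old = self.use_cases.pop(name)
        return self._orphans(old)

    def _orphans(self, candidates: set) -> set:
        # Data Sets still referenced by any remaining Use Case must be kept.
        still_used = set().union(*self.use_cases.values())
        return candidates - still_used


registry = Registry()
registry.create_use_case("smart_home", {"indoor_climate", "weather"})
registry.create_use_case("mobility", {"weather", "traffic"})

removable = registry.delete_use_case("smart_home")
print(removable)  # {'indoor_climate'}
```

In this example the shared “weather” Data Set survives the deletion of the “smart_home” Use Case because the second Use Case still depends on it.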

On the technical tier a data architecture is necessary with a specific Data Model

based on the Data Gatekeeper Data Model holding relevant master and transaction

data. Operational smart services finally ensure the optional distribution of data to

various other systems or to a citizen’s smartphone via a service interface.

The following figure illustrates the general approach as a holistic framework for Smart

Data in Cities ensuring transparency and Participation as well as compliance and data

privacy for personal data. The parts of the Strategic Guidelines are highlighted blue.

Figure 1: General Approach for Smart Data on strategic, organisational and technical level

[Figure not reproduced. Its labelled elements include: Smart Data Strategy; Legal Regulations; Internal Compliance; Transparency; Participation; Challenges & Problems; Solutions & Benefits; Responsibilities; Operations; Smart Data Lifecycle (Collect Data, Process Data, Analyze Data); Use Case Creation, Change & Deletion; Data Model; Smart Data Infrastructure; Smart Data Platform; ICT Architecture.]

Due to the limitations of the research project, this document focuses on only some aspects of the necessary holistic approach, which would also address continuous monitoring of compliance flaws in line with the concept of compliance management. The focus is in particular on the strategic parts, which are defined once per City, and on the Use Case Creation Process. In addition, key aspects of the Smart Data Architecture are described, such as the Data Model and the smart services with the Transparency Dashboard.

1.2 Structure of the Document

In general, the document is structured according to the previously described general approach. All capitalised terms are described in a glossary, which is appended at the end of the document. Figure 2 gives an overview of the structure of the document at hand.

Figure 2: Overview of the concept and the structure of the document

The document is divided into the following chapters:

The first chapter serves as the Introduction and contains a brief description of the motivation and approach. It also summarizes the structure of the Data Gatekeeper concept.

Chapter 2 describes the underlying Strategic Guidelines. These guidelines comprise mandatory axioms, i.e. given boundary conditions that cannot be neglected. The chapter describes the given laws and other legal or local restrictions. Furthermore, it covers the Smart Data Strategy as well as Smart Data Infrastructure, Tasks and Responsibilities, and Participation. This information helps the responsible project stakeholders to consider the principal starting conditions before detailing any Gatekeeper implementation.

[Figure 2 not reproduced. The diagram maps Chapter 1 (Introduction, 1.1 to 1.3), Chapter 2 (Strategic Guidelines, 2.1 to 2.6), Chapter 3 (Smart Data Creation Guidelines, 3.1 to 3.8), Chapter 4 (Data Model), Chapter 5 (Smart Data Platform and Services, 5.1 to 5.6) and Chapter 6 (Conclusion and Outlook) to their subsections.]

Chapter 3 describes the Smart Data Creation Guidelines which summarize the

individual variables and preconditions that need to be defined, thought of or brought

together before starting any concrete Data Gatekeeper procedure. Therefore, the

chapter describes the preparation work that is required to articulate an individual Use

Case. Furthermore for each subsection it contains a description of an exemplary Use

Case.

Chapter 4 contains the design and description of the Data Model that summarizes the

considerations and specifications from the previous chapters. The Data Model serves

as a steering basis for the Smart Data Platform. The goal of the Model is to visualize a

comprehensive structure including all available and selectable options that

characterize the Data Gatekeeper concept.

Chapter 5 discusses the Smart Data Infrastructure. The chapter contains a description

of necessary technical requirements and gives an overview of the architecture and

Dashboards of an exemplary Smart Data Platform.

Chapter 6, the final chapter, summarizes the findings and provides a Conclusion and Outlook.

1.3 Use Case Example “Smart Home”

The Use Case “Smart Home” was chosen as a representative example for this Data Gatekeeper concept. It consists of an adapted workflow description of a real Use Case that was implemented during the Smarter Together project in the City of Munich. The description contains all relevant elements of the real Use Case implementation. For simplicity, however, some stakeholder names and minor process details are not mentioned here.

The main stakeholders involved in this Use Case are:

– Flat inhabitants (to provide raw data and benefit from feedback on the measurements),

– External service provider (to initially collect, monitor and store raw data from the flats),

– Operator of the underlying Smart Data Platform (to collect additional data and execute predefined analyses),

– University (to define the required information and Data Sets as well as the scientific analysis scheme),

– City, including the required City departments and subsidiary companies. One of the subsidiary companies, which offers refurbishment consulting and coordinates related refurbishment projects for the City of Munich, defines the overall Use Case. A “Smart City Team” of the City IT department coordinates the complete Use Case from a technical viewpoint with all relevant City and external stakeholders.

The Use Case “Smart Home” was developed in parallel to the creation and definition of the Data Gatekeeper and the Smart Data Platform. We therefore chose an iterative approach, with stakeholders being invited to various workshops. In these workshops we elaborated the necessary steps and described the common goals, all mandatory data inputs and the required Analysis Dashboards. Although most of the stakeholders were familiar with the content of a Use Case in general, it took a stepwise approach to bring all new requirements and thoughts concerning the Data Gatekeeper and the Smart Data Platform into this new Use Case description.

Goal of the Use Case

The Use Case “Smart Home” is about measuring temperature and humidity in flats in

the Munich project area.

The goal of the Use Case is to measure the efficiency of refurbishment activities before

and after the refurbishment. On top of this it offers inhabitants additional information

on their ventilation behaviour via the so-called “feel-good-app” which indicates the

impact of ventilation activities.

Data Sources required for this Use Case are:

– Ongoing temperature and air humidity measurements of individual flats (provided by the external service provider)

– Addresses of the flats (provided by the external service provider)

– External weather conditions (provided by the German Meteorological Service, DWD)

– Information about the refurbishment status of the flats (from an internal database of the City of Munich)

For this purpose, about 400 starter sets, each including 2 temperature/air-humidity sensors and

a base-station were offered “free of cost” to the flat inhabitants of the project area.

The interested inhabitants were asked to register themselves using a WEB-based

process. This registration process included a clear description of the planned analysis goals

as well as the individual participant’s consent to the use of the privacy related

data required for the data collection and analysis process. Access to an individual password

protected WEB-APP was then granted to each participating user. The information on

the personalised WEB-APP indicates the individual data measured in the personal flat

as well as the measured average data from all participating flats.

The complete process including supply of the starter set, raw data transfer, as well as

the design and provisioning of the WEB-APP was owned by an external service

provider participating in the Smarter Together project, but not by the City

Administration itself.

All relevant data from each flat was transmitted to a central Database of the external

service provider (using the starter kit base-station and an existing Internet access in

each flat).

Only the part of the transmitted raw data which is relevant for the above mentioned

scientific examination was then transmitted from the external service provider to the

Smart Data Platform (which is the central project platform to store, analyse and

distribute all project related data). Another project partner (University) then got access

to the transmitted data and executed their evaluations, also taking into account

additional information deriving from other Data Sources like external temperature and

air humidity and refurbishing status of the affected flats/buildings.
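The forwarding step described above, in which only the part of the raw data relevant for the scientific examination leaves the external service provider, can be illustrated with a minimal sketch. The field names and the helper `minimise` are hypothetical and chosen for illustration only; the project's actual data schema is not given in this document.

```python
from typing import Any

# Hypothetical field names (assumption): the real schema of the
# external service provider is not described in this document.
FIELDS_REQUIRED_FOR_ANALYSIS = {
    "flat_address",
    "timestamp",
    "temperature_c",
    "humidity_percent",
}


def minimise(raw_record: dict[str, Any]) -> dict[str, Any]:
    """Return only the fields needed for the scientific examination."""
    return {
        key: value
        for key, value in raw_record.items()
        if key in FIELDS_REQUIRED_FOR_ANALYSIS
    }


# Example raw record as it might be stored by the service provider.
raw = {
    "family_name": "Example",
    "flat_address": "Example Street 1",
    "timestamp": "2018-11-05T08:00:00",
    "temperature_c": 20.5,
    "humidity_percent": 48.0,
    "base_station_serial": "XYZ-123",
}

# Only the whitelisted fields would be sent on to the Smart Data Platform.
forwarded = minimise(raw)
```

Using an explicit whitelist rather than a blacklist means that any field added later at the source is withheld by default until it is deliberately approved.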

Findings (Golden Rules) for the Data Gatekeeper:

– As the first and most important step, for each Use Case an underlying

business and/or benefit model has to be defined that offers a clear and

comprehensive definition of the individual contribution as well as the individual

benefit of the planned outcome for each individually involved stakeholder,

participant, partner and the public

– If possible, the business model should pass through a professional reality

check in advance, to avoid useless investments and to repair potential weak

points from the beginning. A main driver for complexity in that phase can be

the necessity to collect and analyse data that is considered as “privacy

related”. The more privacy related data are involved, the more complex the

discussions and policies that need to be considered subsequently (see also

“Golden Rules” in Chapter “Legal Regulations”)

– If possible avoid the necessity to work with any privacy related data in any Use

Case from the beginning. If this cannot be avoided, restrict privacy related data

needs to an absolute minimum.

Brief Summary of “Introduction”

Smart City concepts and Digitalisation are big drivers and solution pillars for Cities to

tackle some of their main arising real-world problems. Being able to provide and

analyse Smart Data is one of the levers a City needs to control in the future. One of

the challenging problems in that context for cities will be to find ways to correctly

handle the required raw-data as well as the Smart Data deriving from analysis

processes. Data handling requires not only compliance with the latest data-privacy

related EU-laws, but also consideration of the City’s data handling policies,

development of the desired “data-image” of the City, and sensitisation of the City’s organisation

on how to behave correctly in that context. Last but not least, it also requires finding

ways to implement the correct processes in the existing IT-infrastructure. The

Data Gatekeeper mainly addresses City administration’s IT-Strategy and City planning

management. It is written as a living document for cities. It offers a structured view on

several of these aspects, from a legal perspective to a potential technical

implementation description including concrete experiences and golden rules from a

project Use Case. The concept always focuses on the goal of offering a broader

awareness of how to potentially handle data in a smart and digitalised City of the

future.

Strategic Guidelines

In order to establish Smart Data applications and Use Cases in a City, a strategic

baseline is needed. The following chapter describes aspects that need to be

considered as well as measures that shall be applied initially once per City when a

common Smart Data organisation and architecture is envisioned.

On the one hand, the following strategic concept translates externally given boundary

conditions like EU-law to strategic guidelines for an establishment of architecture and

applications supervised by the City administration. On the other hand, strategic

decisions on, for instance, the management of citizen data are needed in order to

define the image of the City as it is perceived in public. The following chapter on

strategic guidelines is a general approach on questions and aspects each City should

take into account when establishing a Smart Data Platform and a coherent

organisation. The following subsection therefore answers two main questions with

respective sub aspects:

– What are the prerequisites for using Smart Data?

o Legal Regulations

o Internal Compliance

o Smart Data Infrastructure

– What do we want to define strategically? How to proceed?

o Smart Data Strategy

o Roles and Responsibilities

o Participation

Before the sections describe these aspects in detail, an executive summary briefly

summarizes key issues of the chapter.

2.1 Legal Regulations

When considering Smart Data in the context of Cities or urban environments, legal

issues in two main dimensions are relevant. First, data privacy for the storage and

processing of personal data or data that refer to personal data is highly relevant.

Second, if City-internal data is also used and provided to the public within the scope

of Smart Data, the opening of administrative data may be an issue.

In this context, when selecting Data Sets for Use Cases, a distinction must be made

between personal and public data so that each can be treated according to its

protection needs.

“‘personal data’ means any information relating to an identified or identifiable natural

person (‘Data Subject’); an identifiable natural person is one who can be identified,

directly or indirectly, in particular by reference to an identifier such as a name, an

identification number, location data, an online identifier or to one or more factors

specific to the physical, physiological, genetic, mental, economic, cultural or social

identity of that natural person” (GDPR, Article 4).

All other information is public data, which can be accessible to the general public and

freely used, reused and redistributed by anyone. In contrast to public data, open data

meet a format standard and are released in a form that permits re-use.

Sensitive data can be seen as a special category of personal data and “Processing

of personal data revealing racial or ethnic origin, political opinions, religious or

philosophical beliefs, […] shall be prohibited.” (GDPR, Article 9, 1).

An exception applies, for instance, if the “processing relates to personal data which

are manifestly made public by the Data Subject;” (GDPR, Article 9, 2 (e)). When the

Data Subject communicates and disseminates personal data to the public (e.g. via social

media), the person remains traceable, but the data is treated as public data.

This indicates that the assignment of Data Sets to personal or open data is not always

obvious or final, so it is highly recommended to consult a data protection officer.

In chapter 3.2.1, an example of the distinction between four fundamental

Classification Categories of data types is presented, each with different protection

needs. The next sections provide an overview on personal and open data especially

with regard to current EU law.

2.1.1 Privacy for Personal Data

As legal compliance is highly relevant for a City’s Smart Data Strategy, the following

subsections describe the principles for handling personal data as laid down in law. They are mandatory

and refer to the EU (2016) General Data Protection Regulation (GDPR), which results

from the European data protection reform (see footnote 1). After a two-year transition period the

General Data Protection Regulation has been applicable since May 25th, 2018 and

has replaced the Data Protection Directive (Directive 95/46/EC) from 1995. The regulation

updates the principles of the old directive to guarantee people’s right to personal

data protection.

The General Data Protection Regulation applies to everyone located/established or

resident in the European Union, who processes personal data through the use of IT

systems or through storage in filing systems, as well as to everyone who processes

data of persons in the European Union (GDPR, Articles 2 and 3).

This section aims to raise awareness of how data protection law governs the handling of personal

data. It is nevertheless recommended to consult a Data Protection Officer, as

further regulations exist, for instance on (artistic) copyrights or e-privacy.

Required aspects of privacy for personal data are outlined as key principles enhanced

with quotes from the GDPR in the following:

Footnote 1: The European data protection reform was adopted by the European Parliament and the Council of the European Union on April 27th, 2016. The data protection reform package includes the General Data Protection Regulation and the Data Protection Directive for the police and criminal justice sector. More details are provided at https://gdpr-info.eu.

Personal Data shall be…

1) “… processed lawfully […].” (GDPR, Article 5, 1(a)). Processing is lawful if the Data Subject has given consent or if the processing is necessary (see GDPR, Article 6, 1):

o for the performance of a contract with the Data Subject (see Glossary)

or of a task carried out in the public interest or in the exercise

of official authority

o for compliance with a legal obligation

o to protect the vital interests of the Data Subject or of another person

o for the purposes of the legitimate interests pursued by the controller or by

a third party

2) “… limited to what is necessary in relation to the purposes for which they are

processed.” (GDPR, Article 5, 1(c)), i.e. limited

o to what is actually needed

o to the least privacy-invasive option

o to the period of processing for which they are necessary

3) “… collected for specified, explicit and legitimate purposes and not further

processed in a manner that is incompatible with those purposes.” (GDPR, Article

5, 1(b)); otherwise, consent needs to be obtained again

4) “… kept in a form which permits identification of Data Subjects for no longer

than is necessary for the purposes for which the personal data are processed.”

(GDPR, Article 5, 1(e)), unless they are processed solely for archiving purposes in

the public interest, scientific or historical research purposes or statistical

purposes (see GDPR, Recitals 156, 162)

5) “… processed […] in a transparent manner in relation to the Data Subject.”

(GDPR, Article 5, 1(a)) by providing information to the Data Subject (e.g. purposes

of the processing), providing information on the actions taken on a request (free of

charge) and facilitating the exercise of Data Subject rights.

The Data Subject has the right…

o to obtain confirmation as to whether personal data are being processed, and access to

the personal data and related information

o to obtain the rectification of inaccurate personal data

o to obtain from the controller the erasure of personal data

o to obtain from the controller restriction of processing (e.g. when the

processing is unlawful or the accuracy is contested etc.)

o to receive the provided personal data and to transmit those data to

another controller (e.g. when the processing is based on consent or it

doesn’t affect the rights and freedoms of others)

o to object to processing on grounds relating to his or her particular situation

o not to be subject to a decision based solely on automated processing,

including profiling.

6) “… accurate and, where necessary, kept up to date.” (GDPR, Article 5, 1(d))

Every reasonable step must be taken to ensure that personal data that are

inaccurate are erased or rectified without delay.

7) “… processed in a manner that ensures appropriate security of the personal

data […].” (GDPR, Article 5, 1(f))

The controller and the processor are responsible for the implementation of

appropriate technical and organisational measures to ensure a level of security

appropriate to the risk. They shall:

o consider the state of the art, costs of implementation, scope, context,

and the risk of varying likelihood and severity for the rights and freedoms

of natural persons.

o consider especially risks regarding accidental or unlawful destruction,

loss, alteration and unauthorised disclosure of personal data as well as

access to transmitted, stored or otherwise processed personal data.

o where available, use an approved code of conduct or certification mechanism to

demonstrate compliance with this principle.

o ensure that any natural person who has access to personal data does

not process them except on instructions from the controller.

8) processed in line with several responsibilities and statutory obligations of the controller

and processor, who need to

o erase it (e.g. when unlawfully processed, for compliance reasons, the

Data Subject withdraws consent, etc)

o notify the supervisory authority of a personal data breach, not later than

72 hours after having become aware of it

o notify the Data Subject when the breach is likely to result in a high risk to the rights

and freedoms of natural persons and carry out an assessment of the

impact and seek the advice of the data protection officer

o designate a data protection officer (e.g. when the processing is carried

out by a public authority or body)

An overview with a more detailed explanation of each key principle is given in the

appendix.
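Two of the key principles above translate directly into simple date arithmetic: storage limitation (principle 4) and the 72-hour notification period for personal data breaches (principle 8). The following is a minimal sketch under stated assumptions. The retention period, the list of identifying fields and both function names are hypothetical; actual values would have to be agreed with the Data Protection Officer.

```python
from datetime import datetime, timedelta

# Assumed values for illustration only (not taken from the project).
RETENTION = timedelta(days=365)
IDENTIFYING_FIELDS = ("family_name", "flat_address")


def apply_storage_limitation(record: dict, now: datetime) -> dict:
    """Drop identifying fields once the retention period has passed."""
    if now - record["collected_at"] <= RETENTION:
        return dict(record)
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def breach_notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time to notify the supervisory authority (72 hours)."""
    return became_aware_at + timedelta(hours=72)


old_record = {
    "family_name": "Example",
    "flat_address": "Example Street 1",
    "temperature_c": 21.0,
    "collected_at": datetime(2017, 1, 1),
}
# The measurement survives, the identifying fields do not.
limited = apply_storage_limitation(old_record, now=datetime(2019, 1, 1))
deadline = breach_notification_deadline(datetime(2019, 2, 22, 9, 0))
```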

2.1.2 Opening Administrative Data

Opening administrative data means making generally available documents held by

the public sector freely available for everyone to use and republish. This is seen as a

fundamental instrument for extending the right to knowledge,

which is a basic principle of democracy.

According to the EU (2003), differences in the rules and practices in the different

member states concerning the use and reuse of public sector data/information

prevent the full economic use of this resource. Therefore, a general framework for the

conditions governing re-use of public sector documents is needed in order to ensure

fair, proportionate and non-discriminatory conditions for the re-use of such

information. This idea is supported by the directive 2003/98/EC on the re-use of public

sector information (see objectives in the appendix 7.1.2).

The general principle of the directive states that:

‘Member States shall ensure that, where the re-use of documents held by public sector

bodies is allowed, these documents shall be re-usable for commercial or non-

commercial purposes […]’ (EU, 2003)

The directive includes principles covering the aspects ‘requests for re-use’, ‘conditions

for re-use’ and ‘non-discrimination and fair trading’ (see appendix).

2.1.3 Use Case Reflection

The Use Case “Smart Home” also deals with personal data such as names and

addresses of the participating inhabitants in the project area. When defining the Use Case,

the new EU data protection law (the so-called “General Data Protection

Regulation” (GDPR) for the handling of personal data, with its underlying principles such as

data minimisation, as described above) was considered in a first step. In a

final step, the open data strategies of the EU have to be considered.

Within the Use Case “Smart Home” specific user agreements have to be signed

between the participants and the responsible project partners (Smart Home Partner)

and between the other involved project partners (City, University…) to obtain and secure

permission to transfer and process certain personal data that is required for

a meaningful analysis in the context of the measured sensor data.

For the analysis of the ventilation behaviour of a flat inhabitant in comparison to the

energetic refurbishment status of the flat, it is necessary to know the address of each

participant (but not the individual family name).

The involved stakeholders (e.g. the City and the University) need to define the data

that is required for their analysis in advance, as the new EU-law allows data to be

collected and analysed only for the predefined and specified scope. Collecting and

storing other data for potential future evaluation is not readily permitted. In

any case the people concerned need to have insight into this data and agree to its analysis.

Temperature and air-humidity do not seem to be privacy related data. If measured

inside a home however, temperature can give away whether a person is at home or

not. This data might seem not directly privacy related, but can be used to derive

private information. This shows that even in apparently non-critical cases the data

collection and handling have to be discussed and defined very carefully. Permission

to use the address for further analysis, however, required a written

agreement with the flat inhabitant. This consent process therefore required the goals of the

analysis and the further usage of the analysed data to be clearly described and limited,

and to be made transparent to the flat inhabitant during the registration

process.
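The purpose limitation described in this reflection (data may only be analysed for the scope the inhabitant agreed to) can be sketched as a simple check before any analysis runs. The consent register, the purpose names and the function `may_process` are hypothetical and serve only as an illustration; they do not describe the project's actual implementation.

```python
# Hypothetical consent register (assumption): maps each data field to the
# purposes the inhabitant agreed to during registration.
CONSENTED_PURPOSES = {
    "temperature": {"refurbishment_efficiency", "ventilation_feedback"},
    "humidity": {"refurbishment_efficiency", "ventilation_feedback"},
    "flat_address": {"refurbishment_efficiency"},
}


def may_process(fields: list[str], purpose: str) -> bool:
    """True only if every requested field was consented to for this purpose."""
    return all(
        purpose in CONSENTED_PURPOSES.get(field, set()) for field in fields
    )


# Linking measurements to the address is covered by the agreed scope.
allowed = may_process(["temperature", "flat_address"], "refurbishment_efficiency")

# Deriving whether somebody is at home was never agreed to, so it is refused.
refused = may_process(["temperature"], "presence_detection")
```

Fields that are not listed at all, such as the family name, are refused for every purpose, which matches the observation that the address is needed for the analysis but the individual family name is not.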

2.1.4 Golden Rules

– Analyse precisely whether any potentially privacy related data is required. Knowing

this from the beginning is crucial for all discussions with Data Security Officers

and for the organisation of any definition of data processing.

– Define the required data transmission interfaces and data formats; in particular,

discuss the secure end-to-end internet transmission policies of the project with

the data stakeholders.

– Make information about measures, goals and the kind of data collected in a Smart Data Platform

(SDP) available and easily accessible to all inhabitants,

giving transparent access to the project, e.g. via a Transparency

Dashboard, a webpage that informs all participants about the status of all data used

within the project.

– Investigate whether licenses (legal or commercial) are required for the collected data.

2.2 Internal Compliance

Compliance requires the ‘observance of the requirements of the general law and of

rules and regulations imposed by any regulatory bodies to which a firm is subject.’

(Edwards & Wolfe, 2005, p.55)

Compliance stands for conforming to a rule, usually with respect to laws. Besides

binding regulatory compliance it is usually necessary to adhere to further rules,

standards, policies, regulations and other requirements. So there are external rules as

well as internal rules, but not all of them need to be mandatory.

Internal Compliance comprises optional regulations and seals, which are specified by

the City (authorities) itself, e.g. for publicity reasons and for the public image of the

City. When defining a Smart Data Strategy, it is important to know these internal

policies and take them into account.

Internal compliance typically includes rules and requirements with respect to:

– IT strategy

– IT governance

– IT security

– Information security

– Privacy levels

– Risk management

– Objects and their protection needs

– Necessary provisions in order to comply with a certain standard

In general, when setting up a Smart Data Strategy it is recommended to know the internal compliance regulations of the City

or of the involved organisations in order to build on them

and integrate them into technical and organisational decisions.

2.2.1 Use Case Reflection

In the next step the Use Case “Smart Home” was checked against the Munich IT-Strategy

in general and, in particular, against the guidelines for IT security and information security. In this

step the relevant people, such as IT-strategists and data security officers, were involved in

the process so that the City specific guidelines and regulations, as well as required processes

for the handling of personal data such as risk declarations and conformity declarations, were

considered.

As all data that is generated within the Use Case “Smart Home” will be stored in

external databases of project partners (Smart Home Platform and Smart Data

Platform) no further processes have to be considered within the City of Munich to

guarantee the safety of the data. Therefore the main task of the City was to provide

all relevant documents covering risk management and IT-security to the defined

partners and to elaborate and check whether all these guidelines were covered

accordingly. More detailed information on IT-Infrastructure can be found in the

following chapter “Smart Data Infrastructure”.

In the Smarter Together project, the Use Case Responsible is a subsidiary of the City of

Munich and works mainly for the planning office of the City. Their main task is to advise

citizens and companies in terms of Energy Refurbishment issues.

Although the Smart Data Platform was built in a project environment that is only

connected to the existing City IT-network via defined interfaces, all data security

aspects and data Classification rules (see also examples in chapter “Data

Classification”) have to be considered and an underlying risk analysis has to be carried out

based on the existing compliance rules of the City of Munich. The handling of

privacy related information as well as the terms of use of collected data were

discussed and finalised in close cooperation with the partners and the Data Protection

Officer.

2.2.2 Golden Rules

– A stakeholder specific and comprehensible description of the City compliance rules

that apply to this Use Case must be defined and signed.

– Involve the Data Protection Officer of the City (assuring e.g. data

access security in a data center as well as data handling risk minimisation, in

contrast to the Data Security Office, which takes care of the legal data privacy

aspects) to understand, tighten and agree to the underlying Use Case

requirements from the beginning. (This Golden Rule applies accordingly in the

chapter “Smart Data Infrastructure”, but it is recommended to also involve the

Data Protection Officer as early as possible in the process, once the relevant

stakeholders dealing with the technical equipment are defined.)

Brief Summary of “Strategic Guidelines”

One fundamental new aspect of future data handling principles is based on the

EU-legislation “General Data Protection Regulation” (GDPR), applicable since May

2018. It contains clear rules on how EU-cities (amongst others) need to behave when it

comes to data collection, data analysis and data privacy. The problems that can

occur when dealing with data are often underestimated and might have

severe legal implications during or after the implementation of a Use Case. Therefore,

although the described legal rules seem simple and logical at first sight, it turns out that

any data driven Use Case (whether data collection, data analysis or opening

administrative data) must be checked and approved by a City’s Data

Protection Officer.

A second aspect is to know and strictly follow the City’s Internal Compliance rules. They

might not be as momentous as an EU-law. Nevertheless, a City, when claiming and

introducing innovative new “Smart City” or “Digitalisation” approaches, needs

to be able to refer to proven and reasonable rules of its own that are transparent and

known to all involved stakeholders including the citizens. In most cases, public

awareness of and strict compliance with these rules form the base of a City’s positive data

handling image, which is seen as a basic necessity when planning or performing Smart

City or Digitalisation projects.

2.3 Smart Data Infrastructure Basics

It is expected that, within the next years, a huge amount of new data will become

available. Examples include Floating Car Data, Car to X-Technology and Crowd

Sourced data in the mobility domain, which requires tools to access, store and

evaluate the data and to create services and user interfaces for various user groups.

Therefore a Smart Data Platform has to meet several challenges.

The following subsection summarizes strategic technical aspects that are relevant for

setting up an urban Smart Data solution by the City administration. As a key topic, a data

protection concept is recommended. To safeguard data with adequate

technical means, guidelines for data security are given. Further, a strategic decision

on hardware location is necessary because it determines which Legal Regulations apply. The main

recommendation is to establish one central Smart Data Platform per City or urban

environment that performs data integration, processing and analysis.

2.3.1 Data Protection Concept

Due to the complexity of data protection issues and the many prevailing

interdependencies between these topics, it is important to either base the Smart Data

Strategy on the existing data protection concept or to define such a concept in the

scope of the City’s Smart Data Strategy. The concept should be comprehensive and

seamlessly integrated within the organisational Smart Data concept. The most

important fields of actions to consider when developing a data protection strategy

are:

– Backup and recovery: the safeguarding of data by making offline copies of the

data to be restored in the event of disaster or data corruption. The backup must

also accommodate external demands, for example when users request deletion of their data.

– Remote data movement: the real-time or near-real-time moving of data to a

location outside the primary storage system or to another facility to protect

against physical damage to systems and buildings. The two most common

forms of this technique are remote copy and replication. These techniques

duplicate data from one system to another, in a different location.

– Storage system security: applying best practices and security technology to the

storage system to augment server and network security measures.

– Data Lifecycle Management (DLM): the automated movement of critical data

to online and offline storage. Important aspects of DLM are placing data

considered to be in a final state into read-only storage, where it cannot be

changed, and moving data to different types of storage depending on its age.

– Information Lifecycle Management (ILM): a comprehensive strategy for valuing,

cataloging and protecting information assets. It is tied to regulatory compliance

as well. ILM, while similar to DLM, operates on information, not raw data.

Decisions are driven by the content of the information, requiring policies to take

into account the context of the information.

All these methods should be deployed together to form a proper data protection

strategy.
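The Data Lifecycle Management idea listed above (read-only storage for data in a final state, and different storage types depending on age) can be expressed as a small decision rule. This is a sketch only: the tier names, the 30-day threshold and the function `storage_tier` are assumptions made for illustration.

```python
from datetime import date

# Assumed threshold (not from the project): data younger than this
# stays in online storage.
ONLINE_DAYS = 30


def storage_tier(created: date, today: date, is_final: bool) -> str:
    """Pick a storage tier from the data's state and age."""
    if is_final:
        # Final data goes to storage where it can no longer be changed.
        return "read-only archive"
    age_days = (today - created).days
    if age_days <= ONLINE_DAYS:
        return "online"
    return "offline"


today = date(2019, 2, 22)
tier_recent = storage_tier(date(2019, 2, 10), today, is_final=False)
tier_old = storage_tier(date(2018, 6, 1), today, is_final=False)
tier_final = storage_tier(date(2019, 2, 10), today, is_final=True)
```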

2.3.2 Data Security

In general, data security is the degree of protection to safeguard an IT system. This

covers all the processes and mechanisms by which information and services are

protected from unintended or unauthorized access, change or destruction.

Data security, including logical security (authorisation, authentication, encryption and

passwords) along with physical security (locked doors, surveillance or access control),

has traditionally been associated with large enterprise applications or environments

with sensitive data. The reality is that, given increased reliance on information and

data privacy awareness, data security is an issue of concern for all environments.

Logical security includes securing networks with firewalls, running anti-spyware and

virus-detection programs on servers and network-addressed storage systems. No

storage security strategy is complete without making sure that applications,

databases, file systems and server operating systems are secure to prevent

unauthorized or disruptive access to the stored data. Storage-system-based

volume or logical unit number (LUN) mapping and masking should be implemented as a last line of defense for the

stored data.
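As one small illustration of the logical security measures named above (authentication and passwords), the sketch below stores a salted password hash instead of the password itself, using only the Python standard library. The iteration count and the function names are assumptions; a production system would follow the City's own IT security guidelines.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, to be tuned per deployment


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is created if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
```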

2.3.3 Hardware Location

The location of the servers of a data platform is important because it affects

which regional, national and supranational data privacy regulations apply.

One option is that a platform is exclusively hosted in the premises of the Platform

Provider. Usually a Platform Provider has its own computer center where all platform

components could be hosted. This option would mean that the national data privacy

regulations of the Platform Provider are applicable as well as supranational data

privacy regulations beyond national borders as for example in the case of European

data privacy regulations. The drawback of this locally hosted option would be that the

reliability level is lower than that of a cloud hosting solution with redundant instances in

different computer centers of cloud hosting providers.

Another possibility would be a cloud hosted solution where the servers of the computer

centers are either located in one specific country or a distributed system with server

locations in different countries. Which data privacy policies apply in this case is

determined by the server locations of the cloud hosting provider and needs to be

considered carefully. Some cloud hosting providers make their server locations

transparent and even allow selecting the specific location of servers on which the

platform would be hosted. In general, it has to be stated that different levels of

reliability can be configured in each cloud hosting service. For research projects it is

usually a lower level than for business projects as a result of cost-benefit considerations.

Cloud hosting services also allow for a dynamic adjustment of server utilisation based

on load-balancing principles when the number of accesses rises.

2.3.4 Data Integration, Processing and Analysis

In order to establish an organisational and technical Smart Data approach for an

urban environment or City it is recommended to establish a central technical Smart

Data Platform once per City. The platform shall not only store Data Sets, but also import

and integrate them just-in-time from technical interfaces of Data Sources provided by

Data Suppliers (see glossary).

Requirements for such a central infrastructure can briefly be summarized as follows:

– Data Storage and Integration

– Capability for Pre- and Post-Processing for Anonymisation and

Pseudonymisation

– Data management, Aggregation and Linkage

– Several User Interfaces for Data Visualisation

– Connection of analysis modules with the data management system

– Precalculation of coordinated analyses for faster response-time within the

application


– Data Categorisation and Rights Management

– Use Case independent and consistent data provision

– Formulation of specific data formats/shapes for later analysis

– Implementation of control mechanisms in order to uphold data quality

These central technical capabilities are the basis for the organisational concept

described in the following. For more details regarding the Smart Data Platform’s

requirements and architecture refer to chapter 5.
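
The pre-processing requirement for pseudonymisation listed above can be illustrated with a minimal sketch. It is not the platform's actual implementation: the record fields (`participant_id`, `temperature_c`, `humidity_pct`), the function name and the key handling are assumptions made for illustration only. The sketch replaces the identifier with a keyed hash (HMAC-SHA-256), so that records of one participant stay linkable while the identifier itself is not stored.

```python
import hashlib
import hmac


def pseudonymise(record, secret_key, id_field="participant_id"):
    """Return a copy of the record with the identifier replaced by a keyed hash."""
    digest = hmac.new(secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256)
    # Copy every field except the identifier, then add the pseudonym.
    cleaned = {k: v for k, v in record.items() if k != id_field}
    cleaned["pseudonym"] = digest.hexdigest()
    return cleaned


# Hypothetical key and reading; in practice the key would be kept outside the platform.
key = b"example-key-kept-outside-the-platform"
reading = {"participant_id": "flat-0042", "temperature_c": 21.5, "humidity_pct": 48}
print(pseudonymise(reading, key))
```

Note that this is pseudonymisation, not anonymisation: whoever holds the key can recompute the mapping, so the key would have to stay with the party that is allowed to know the identities.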

2.3.5 Use Case Reflection

The Use Case “Smart Home” runs (like all Smart City Use Cases) in a secure, trusted

area controlled by the Use Case owner in order to meet the framework

conditions mentioned above. However, the IT-infrastructure needed for the Use Case

“Smart Home” was not integrated into the City’s existing IT-infrastructure. For

simplicity and flexibility reasons (during the project phase) all elements were installed

and controlled in the stakeholders’ IT-departments and in their purpose-built project

environments. All collaboration policies and data handling rules between all

involved stakeholders (excluding only the flat inhabitants) were included in the General

Agreement of the “Smarter Together” project. One existing and common agreement

between all stakeholders shortened the individual discussions and avoided the work

needed to set up custom-made agreements. Also, a common agreement between

all stakeholders underlined the transparent character of the Use Case handling in

general.

In Smarter Together the following IT-components have been built up for the Use Case:


Figure 3: IT-Components of the Use Case

– Smart Home-Platform (SHP) for storing Master Data of the participants and

measurement data (temperature and humidity); the server is also the Database

for the so-called “feel good app” operated by the Smart Home Partner

– Smart Data Platform (SDP) for importing and storing the anonymised

measurement data and other Data Sources for energy Master Data of

buildings, for analysis and for exporting data for external use

– Secure interfaces for importing data, referencing current geo data from the geo

portal of Munich and providing data for external applications, e.g. the Smart City

App

All the platforms mentioned above were built up by the respective project

partners (external service provider, operator of the Smart Data Platform, City of

Munich subsidiary company). In the next steps all partners discussed and agreed on

the following parameters and definitions:

– All platforms are operated autonomously by the partners (SaaS or Cloud Service),

without installation of Smarter Together-specific server components on site at the

City of Munich (only using interfaces to existing systems like the Munich geo

portal).

– Communication only via defined and secure data interfaces (JSON REST) using

secure data transfer protocols (HTTPS)

– All interfaces, APIs and data handling processes used must be replicable and

based on open standards

– All software designs of all platforms must be scalable and “cloud-ready”

– The system architecture, incl. the Data Gatekeeper Data Model, is designed as an

open “Blueprint” for future reuse.

– All components can be used as a playground during the development phase

(prototyping …)

After defining the above-mentioned parameters, the companies started implementing

their components and functionalities based on this defined Use Case.
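
The agreed interface rule (JSON over REST, transported via HTTPS only) can be sketched as follows. This is an illustrative example under stated assumptions, not the project's real API: the endpoint URL, the payload fields and the function name `build_measurement_request` are hypothetical. The sketch only prepares the request and refuses non-HTTPS endpoints; it does not send anything.

```python
import json
import urllib.request
from urllib.parse import urlparse


def build_measurement_request(url, payload):
    """Prepare a JSON POST request and refuse any endpoint that is not HTTPS."""
    if urlparse(url).scheme != "https":
        raise ValueError("only HTTPS endpoints are allowed")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_measurement_request(
    "https://sdp.example.org/api/measurements",  # hypothetical endpoint
    {"pseudonym": "a1b2c3", "temperature_c": 21.5, "humidity_pct": 48},
)
print(request.get_method(), request.full_url)
```

Checking the URL scheme before any data leaves the component is one simple way to enforce the "secure transfer only" agreement in code rather than by convention.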

2.3.6 Golden Rules

~ Define and sign common technical processes and policies between involved

stakeholders, also taking into consideration existing partner infrastructures, rules

and know-how

~ Discuss and plan the seamless integration of new Smart City-related

infrastructures with your IT-department (if required). An overarching new Smart

City platform or application normally will not be physically integrated onto an

already existing City IT-Server. Plan for a reasonably long test phase, running all

new Smart City-related equipment on dedicated new servers, cloud

environments or in a Smart City “sandbox”.

~ It is not necessary to build up the whole smart IT-infrastructure on your own –

it is better to start smart and also use existing infrastructure and

functionality from your partners (an early “Make or Buy” decision saves a lot of

discussion about later deep integration into existing, complex

IT-infrastructure)


~ All components must have user management for access control, defined

interfaces (e.g. JSON REST) and secure data transfer (HTTPS) between all

components and data repositories

2.4 Smart Data Strategy

When dealing with internal and external data, it is most important to ensure the backing

and sponsorship of the City’s mayor and municipal council. This can be managed by

defining a Smart Data Strategy and requesting a related financial budget. As throughout

this concept, the term “City” can also refer to a group of neighbouring

cities or towns or to a particular City district. “Strategy” in this context means having a basic

plan for realizing a long-term goal. The term itself is more widely used at executive

level and helps to involve City leaders and to secure their wide support.

On the other hand, a short-term goal should be defined for each Use Case. Benefit

models clarify the intended (not necessarily monetary) gain from data provision for all

participants, partners and the public, as well as the benefits stakeholders draw from data

consumption. Establishing the drivers of the Use Cases is part of the

Requirements Specification and the basis of the benefit models, which can be

determined in the agreement meetings but are not discussed further in this document (see footnote 2).

The following sections outline potential motivations and drivers for Smart Data. Finally,

a general approach to setting up a City’s strategy for Smart

Data is described and illustrated with examples.

2 A good external reference is the St. Gallen ST deliverable of the “Smarter Together” project.

[Figure: overview diagram of the concept, showing the elements Participation, Transparency, Data Provision, Data Consumption, Problem Awareness, Requirements Specification, Collect Data, Benefit Models, Efficiency Increase, Benefit for Stakeholders, Challenges & Problems, Solutions and Benefits, Legal Regulations, Internal Compliance, Smart Data Strategy, Political Support & Drivers, Operations, Responsibilities, Smart Data Lifecycle, Data Model and Smart Data Infrastructure]


2.4.1 Drivers for Smart Data

Why does a City want to become smart? Generally, the City’s scope for action may

widen and multiple problems can be addressed in a more sophisticated way.

A service and data platform may support the development of a City

district. Other potential reasons why cities get involved in Smart Data and decide to

become active as a Smart Data Platform Provider are:

– Benefit from existing data of Data Suppliers (e.g. a company that delivers flat

temperatures)

– Benefit from Smart Data related to City infrastructure and link several Data Sets

– Enrich the citizen’s/user’s data with other administrative infrastructure data and

use it for subject-matter analysis (e.g. on usage of cycle tracks)

More generally, there are drivers for value creation that exist in every City. Those drivers

and ideas indicate how economic value can be generated in a Smart City and help

to create a viable business model with regard to the respective boundary conditions

of every City.

Drivers on the side of the public sector can be:

– Development of the tourism sector and economic growth

– Attract new industries to settle in town

– Sell valuable add-on City information to commercial 3rd party service providers

– (Environmental) sustainability through lower emissions

– Efficiency improvements within City departments

– Cost savings by offering digital services

– Better and faster decision making

– Tackling problems with increasing traffic volume

Drivers on the side of companies can be:

– New markets and new revenue opportunities


– Efficiency improvements

– Growth in productivity

– Better and faster decision making and more precise planning

Drivers on the side of citizens can be:

– Cost savings in e.g. energy

– Time savings regarding e.g. public transportation and traffic

– Empowerment and engagement

– Value addition, e.g. obtaining additional information from the City

– Increased general attractiveness of the City and quality of life

2.4.2 The Formulation of a City-specific Strategy

To gain the initial insights needed to define a Smart Data Strategy, it is

recommended to perform a stakeholder analysis and interview key stakeholders

relevant for Smart Data. The targets that are part of the Smart Data Strategy should

be derived from general City targets and should be clearly formulated as a bullet

list of three to five items. Furthermore, a reference to internal compliance prerequisites,

for instance an IT strategy, should be included.

When defining a strategy for data handling, it is important to consider the

following three steps:

– Target definition: A general question for a City operating a Smart Data Platform

is why it should do so and to what extent this supports its core operations. A

reference to the general City strategy may be helpful.

– Strategic Mission Statement: Derived from long-term targets, more concrete Dos

and Don’ts should be formulated. The sooner realisation options are

ruled out, the less effort they cost.

– Plan for Realisation: So as not to lose sight of important aspects, at least rough milestones

at yearly or quarterly level should be defined.


A City could, for example, define as its targets to cooperate with industry in providing Smart Data

applications and to position itself as a trustworthy partner for citizens’ services. However,

these two targets are difficult to harmonize. Hence, for a Smart City

strategy it is all the more important to also define targets for general directives on

desired practices that go beyond compliance with legal regulations. For instance, it

could be compliant to use (anonymized) citizens’ data from companies but not

desirable because it could harm the image of the City administration. Therefore, citizens’

trust in their City and the City’s public image could play a role in a strategic target

related to Smart Data and related compliance issues.

Possible areas of target definitions are:

– Exceed existing Legal Regulations (e.g. comply with a seal)

– External partners and internal stakeholders (corporations or only public

organisations)

– Opening administrative data to citizens as part of Smart Data Use Cases

(Legal Regulations see section 2.1)

– New organisational roles and efforts to be spent on employees

– Conditions and licensing (e.g. only use a particular license contract)

– Data Classification (only process data of a particular category)

– IT infrastructure (e.g. two databases for different purposes, use a particular

system or not)

In the context of this concept, a new Internal Compliance directive could be issued,

e.g. requiring City data services to comply with a particular seal or regulation even if

this is not prescribed by law. Importantly, this directive should be communicated

to employees and optionally to the public.

As an example, a City could decide not to provide any fee-based services or not to

allow the development of smart apps by external companies. A typical directive

related to personal data could be a commitment to citizens to comply with the GDPR or with

a seal that demands more than the GDPR.


2.4.3 Use Case Reflection

For Smarter Together, the City-specific strategy of the City of Munich is generally

based on the definitions and implementations of the existing E-government and

Open-government activities (eo-gov, www.muenchen.de/eogovernment) as well as

on the general Digitalisation Guidelines, which are currently evolving. One of these

specific strategies is to produce and provide open data for the use of the City as

well as for the benefit of the broader community. Another aspect is to use as little

personal data as possible and to be as transparent as possible towards our partners and the

citizens. The City of Munich aims to underline its goal and image of

collecting and providing data only for the benefit of the City and its citizens.

To achieve the goal to produce and to provide mainly open data, it is necessary to

check which basic data is needed and whether it can be used as open data. The

discussion of, and contractually binding agreements on, which resulting data should be

classified as open were integrated into the early Use Case definitions.

City-related data needs to be collected, processed and analysed for the wellbeing of

the citizens. The increasing digitalization of all kinds of City-related processes creates

valuable databases that need neutral and non-discriminatory handling and control.

The City of Munich sees its role as managing such data and acting as a neutral

“data gatekeeper” instance.

In this context, the City of Munich’s compliance guidelines provide for the City to act as a

transparent and fair data handling partner with clearly defined and published rules on

how and why the collected data is used to contribute to the City goals (see e.g.

https://www.muenchen.de/rathaus/Stadtrecht/Informationsfreiheitssatzung.html).

One concrete driver of the broader Smarter Together Task “Low Energy

Districts” in Munich is to analyse temperature and air humidity data from selected

flats in the project area (the “Smart Home” Use Case). The analysis of this data

has two goals for the City:

– One is to examine how inhabitants of energy-refurbished and

non-energy-refurbished flats ventilate their flats over time in

relation to the flat temperature and air humidity. The outcome of this

analysis helps to improve the efficiency of future refurbishment projects in

the City.

– The other is to inform the inhabitants (via a dedicated web app) when an

optimal ratio between flat temperature and air humidity is reached and how

energy consumption can be decreased and mould formation avoided

by adjusting individual ventilation behaviour. Besides the opportunity to

contribute individually to an innovative EU research project, this aspect is the

main driver and motivation for flat inhabitants to contribute their data to

the project.

The underlying concrete motivation of the Smart Home Use Case for the City of Munich

is a measurable contribution to the overall City goal to decrease CO2 emissions and

energy consumption in the City.

2.4.4 Golden Rules

~ Know your City strategy, e.g. in the context of IT, data transparency, E-Gov and

open data policies, and also with regard to comparable efforts to build

or confirm the overall City image.

~ Apply each specific Use Case to the benefit of the City strategy

Short Summary of “Smart Data Infrastructure Basics and Strategy”

Although any IT-department knows at first sight how to set up a new data

infrastructure, a Smart Data Platform including a Data Gatekeeper

approach requires some additional strategic considerations first. The driver and

challenge of such an additional IT-approach is not the technology itself. Most

questions arise when discussing the City’s initial drivers for a Smart City approach.

What are the underlying business models or Benefit Models for a City? What are the

expected outcomes of the various analyses? Who provides data, who owns it and

who administers it? Which departments need to contribute their available data

(often seen as private department treasures) and who might get access to it? Last but

not least, it needs to be discussed whether the Smart Data Platform incl. the Data

Gatekeeper will be installed on one additional City-wide, centralised IT-platform.

The question of which technical approach should be chosen (e.g. cloud

versus on-premise) also often arises in this context. Hence, one general activity before

starting any Smart City, digitalization or Data Gatekeeper project should be to design

and approve a clear strategic and infrastructure guideline for Smart City & digitalization

that sets out the main pillars and drivers going forward.


2.5 Responsibilities

City districts, towns or counties are organised in hierarchical structures and particular

roles with corresponding Responsibilities. These structures can be used in order to

implement a Smart Data Strategy and harmonize existing roles within the

administration with new tasks and Responsibilities arising from the provision of smart

services.

Required Roles and Responsibilities in the context of a Data Gatekeeper strongly

depend on various aspects of a real City, e.g. how the City is organised, which

department-overarching structures are in place, which know-how is available, how

many people are available to staff a Smart City & Data Gatekeeper team, which

concrete collaboration policies are established in a City, etc.

The following discussion of roles and Responsibilities offers a broad view of which roles

need to be planned in an ideal City. Knowing that every City is different, various

aspects should be considered and their implementation impacts need to be discussed

before any final decision on roles is taken. The goal is not to implement as many

theoretical roles as possible, but rather to find a pragmatic way to integrate the

main required Responsibilities into a City’s concrete situation, in order not to underrate

the integration complexity of an innovative Smart City & Data Gatekeeper project

over time.

Fundamentally, Smart City is understood as a long-term function which, once

established, brings benefits for cities and their citizens without any time

limit. It is therefore not regarded as a programme or project, since both terms

by definition imply an end point. With that in mind, it is crucial for

municipalities to keep an overview of the whole transformation process and gain a solid

understanding of upcoming tasks and Responsibilities.

[Figure: overview diagram of the concept, showing the elements Participation, Transparency, Data Provision, Data Consumption, Problem Awareness, Requirements Specification, Collect Data, Benefit Models, Efficiency Increase, Benefit for Stakeholders, Challenges & Problems, Solutions and Benefits, Smart Data Strategy, Political Support & Drivers, Operations, Responsibilities, Smart Data Lifecycle, Data Model, Smart Data Infrastructure, Legal Regulations and Internal Compliance]

The following concept describes Responsibilities that should all be fulfilled by either the

City or the external stakeholders in order to establish a complete Smart Data

compliance organisation. In exceptional cases it may even be possible for one person

to fulfil multiple Responsibilities, as long as the dual control principle is not affected.

The following Responsibilities are categorized into the fields of Strategy, Participation,

Compliance, Operations, Data Provision and Data Consumption.
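
The dual control condition mentioned above can be checked mechanically once role assignments are written down. The following is a minimal sketch only: which role pairs must be kept apart is an assumption made for illustration (the concept itself does not list them), and the names `CONFLICTING_ROLES` and `dual_control_conflicts` are hypothetical.

```python
# Role pairs that must not be held by the same person (assumed for illustration).
CONFLICTING_ROLES = {
    frozenset({"Data Protection Officer", "Use Case Responsible"}),
    frozenset({"Data Protection Officer", "Data Owner"}),
}


def dual_control_conflicts(assignments):
    """Return (person, role pair) for everyone holding both roles of a conflicting pair."""
    return [
        (person, pair)
        for person, roles in assignments.items()
        for pair in CONFLICTING_ROLES
        if pair <= set(roles)
    ]


# Hypothetical staffing: person B approves data protection and also owns the data.
staff = {
    "A": {"Strategic Manager", "Communication and Participation Manager"},
    "B": {"Data Protection Officer", "Data Owner"},
}
for person, pair in dual_control_conflicts(staff):
    print(person, "holds conflicting roles:", sorted(pair))
```

A check of this kind lets a small City combine roles pragmatically while still flagging combinations in which a person would approve their own work.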

2.5.1 Strategy

At the strategic level it is recommended to assign Responsibilities for decision-making

and management.

Strategic Committee

A decision-making body is consulted on any strategic decisions and serves as the

escalation level for resolving operational conflicts. It is the body that decides on

new Smart Data concepts or Use Cases.

Possible members of the administration for major strategic issues and organisational

design changes are:

– Mayor

– Responsible member of municipal council

– Main City-internal Data Suppliers

– External Data Suppliers, e.g. companies

– Data Protection Officer

[Figure: detailed concept diagram, showing the elements Participation, Transparency, Smart Data Strategy, Data Provision, Data Consumption, Communication, Political Support & Mandate, Stakeholder, Operations, Coordination, Responsibilities, Smart Data Lifecycle, Data Model, Smart Services, Analyze Data, Process Data, Change Processes, Smart Data Platform, ICT Architecture, Data Sets, Use Cases, Data Rules, Process Rules, Processing Transparency, Problem Awareness, Requirements Specification, Collect Data, Monitoring, Classification, Compliance Roles, Decider, and Use Case Creation, Change & Deletion]

Tasks:

– Escalation of Decisions

– Approve new Smart Data concepts, campaigns or Use Cases

– Decide strategically on new Smart City Use Cases and Smart Data topics

Strategic Manager

Moreover, an additional person responsible for the data strategy is needed. This stakeholder

has to implement the committee’s decisions on what kind of data the City would like to collect

and publish through the Smart City Data Platform. The Strategic Manager (who could

be the Chief Digital Officer, CDO, of a City) must maintain an overview (e.g. in the form

of a list) that records the City’s internal and external data ownership. She or he can be

part of the central decision-making body.

He or she should be supported by someone fulfilling (or should himself or herself fulfil) an

internal strategic and communicative role that also develops concepts for new

Smart Data Use Cases or new features of the Smart Data Platform. He or she should

further mediate between City-internal departments and promote new digital

Smart City Use Cases.

Tasks:

– Leading person (e.g. CDO or municipal board member) responsible for the

Smart Data Platform

– Main responsibility for sponsoring and the Smart Data budget

– Draws up Data Set usage agreements with suppliers

– Moderates and conceptualizes Use Cases

2.5.2 Participation

It is important to assign responsibility for managing public relations work and

other communication tasks.


Communication and Participation Manager

This stakeholder is in charge of all external communication, i.e. informing the

public and drawing public attention to the Use Cases as an essential part of

the transparency principle. This communication raises awareness and increases the

interest and Participation of users, who can be integrated into the Use Case

through Participation events. Organising and moderating these events is the

main task of a Communication and Participation Manager.

Tasks:

– Communicate Use Cases

– Public relations work

– Organize Participation events

– Continuously accompany the design of the Use Case concept and its development

– Moderate and recruit users

2.5.3 Compliance

In the field of compliance it is recommended to assign responsibility for data protection

on the side of the City and, if necessary, additionally on the side of the broader Use

Case (that could include external stakeholders).

Data Protection Officer

This role is key for the concept and must be staffed. She or he must have adequate

legal training, because she or he has to verify that all legal requirements and

privacy laws are met in each particular case. However, as this cannot be carried out by a

non-technical person alone, she or he needs good moderation skills

and has to collaborate with the operational roles of Platform Administrator, Data Analyst and

Service Developer. This might be challenging because the multiple sources of data in a

Smart City environment might, taken together, jeopardize the privacy of its citizens.

Full transparency on data and its future use should be prioritized in any case.

Personal data must be handled cautiously, and sensitive data that could conflict with

privacy requirements must be anonymized. The person in this role therefore acts on behalf of the

citizens’ privacy concerns and in the best interest of the law. The role approves

the general workflow but has no access to any privacy-related raw data:


With regard to the Use Case Creation Process, the Data Protection Officer receives

the completed form (see appendix). On the basis of the form, she or he will be able to

understand the intended procedure and verify it accordingly. If there are no conflicts

or privacy issues she or he will approve the Use Case and the form can be forwarded

to the Use Case Responsible for the final approval. If there are conflicts or privacy

issues, she or he needs to describe the problem(s) and either decide what measures

need to be taken or suggest possible alternatives in order to ensure that the Use Case

is legally acceptable. The Use Case then needs to be revised and reviewed

again with regard to data protection standards. This process continues until she or

he can guarantee that it complies with all relevant data protection law and approves

it on behalf of the City.

Tasks:

– Decide on necessary data processing steps in order to prevent possible legal

violations

– Ensure compliance with Legal Regulations, e.g. on data privacy

– Responsible for legal aspects and pre-check of final approval of data provision

– Ensure compliance of internal and external data usage for the whole data

platform and specific Use Cases
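
The review loop described for the Data Protection Officer in the Use Case Creation Process can be sketched as a small state model. This is an illustration under stated assumptions, not the platform's implementation: the class `UseCaseForm`, the status names and the function names are hypothetical, and the form itself (see appendix) is reduced to a name and a list of open issues.

```python
from dataclasses import dataclass, field


@dataclass
class UseCaseForm:
    name: str
    open_issues: list = field(default_factory=list)
    status: str = "submitted"
    history: list = field(default_factory=list)


def review_by_dpo(form, issues):
    """Data Protection Officer step: approve, or return the form for revision."""
    form.open_issues = list(issues)
    form.status = "needs_revision" if issues else "dpo_approved"
    form.history.append(form.status)
    return form


def final_approval(form):
    """Use Case Responsible step: only possible after the DPO has approved."""
    if form.status != "dpo_approved":
        raise RuntimeError("data protection approval is missing")
    form.status = "approved"
    form.history.append(form.status)
    return form


form = UseCaseForm("Smart Home")
review_by_dpo(form, ["raw identifiers are exported"])  # first review finds an issue
review_by_dpo(form, [])  # the revised form passes the second review
final_approval(form)
print(form.history)
```

Keeping the history of statuses makes the sequence of reviews traceable, which fits the transparency principle of the concept.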

2.5.4 Operations

Smart Data Operations involve the Responsibilities for platform administration, analysis and

service development. These tasks can either be assigned to a newly founded

department or outsourced to an external party.

Platform Administrator

The data security and reliability of the Smart Data Platform and related systems have to be

guaranteed. In this concept, this is ensured by two roles. The Platform Administrator,

for example, technically connects new Data Sets to the platform and provides

support to the person responsible for data on the supplier side. He or she administers the technical

systems Analysis Dashboard, Transparency Dashboard and Platform Statistics

Dashboard (see chapter 5).

Tasks:

– Ensure a working data platform for all stakeholders

– Collaborate with Data Protection Officer for Data Security and privacy issues

– Maintain access control following the “need to know” principle

– Provide internal and external support

Data Analyst

Any person concerned with insights from data or aggregated data, inside or

outside the City administration, could become a Data Analyst. Typical candidates

are employees of, for instance, the mobility department or other subject-matter departments in

a City or district administration. Analysts may use any created Data Sets and

visualisations, provided that a corresponding purpose of use has been granted by the Data

Subject (see glossary) or owner.

Tasks:

– Access various Data Sets and analyze them for granted purposes of use

– Provide Data Analysis results to the Use Case Responsibles
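
The rule that a Data Set may only be analysed for a granted purpose of use can be expressed as a simple check. The sketch below is illustrative only; the dictionary layout, the field name `granted_purposes` and the purpose labels are assumptions, not part of the platform's data model.

```python
def may_analyse(data_set, purpose):
    """Allow analysis only for a purpose of use that has been granted for this Data Set."""
    return purpose in data_set.get("granted_purposes", set())


# Hypothetical Data Set entry with a single granted purpose of use.
flat_climate = {
    "name": "flat temperature and humidity",
    "granted_purposes": {"refurbishment efficiency analysis"},
}
print(may_analyse(flat_climate, "refurbishment efficiency analysis"))
print(may_analyse(flat_climate, "marketing"))
```

A Data Set without any granted purpose is rejected by default, which matches the "need to know" principle named for the Platform Administrator.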

Service Developer

The person in this role is responsible for preventing data security flaws

at the application/hardware level (compare section 2.3.2). If the hardware or

software app involves user interaction, e.g. with citizens, he or she needs additional

competences in user experience, usability engineering and interaction design.

Otherwise an additional person responsible for conceptual user interface design and

evaluation may be needed.

Tasks:

– Develop App or Hardware solution

– Prevent security flaws on application level

– Ensure availability of Use Case and consult City


2.5.5 Data Provision

In the field of data provision, the roles of Data Owner and Data Administrator are

relevant; they are described in the following.

Data Owner

The Data Owner is responsible for the Use Case’s Data Sets. He or she is the project manager

on the side of the supplier organisation (e.g. a company) and hence involved as a

stakeholder.

He or she bears the responsibility and is therefore required to sign the form containing all the

information listed in the previous chapter once it is complete, thereby approving the Use Case.

This role should therefore be taken by an executive.

Tasks:

– Main Responsible for data

– Defines the general or specific purposes for which data may be used

Data Administrator

This role primarily has technical tasks. She or he may

for instance be responsible for an application programming interface (API) provided

to the Platform Administrator. He or she ensures that the data is provided according

to the contract and monitors this process.

Due to her or his technical know-how, she or he should be involved in the Use Case

Creation Process, especially when the technical parts are discussed and entered

in the form.

If another person is already responsible for this, someone with technical expertise

and insight must in any case verify the details given (in the form) to

ensure that they are correct and that the Use Case can be implemented as described.

Tasks:

– Technically supply data to the City data platform

– Provide technical information in Use Case Creation Process

– Supply support to City and Use Case developers regarding data


– Finally approve data provision

2.5.6 Data Consumption

In the field of data consumption, persons from City departments, companies or citizens

can develop their own Use Case, become a Use Case Responsible, request

particular smart services and thus profit from the results of the Use Case – the Smart

Data.

Use Case Responsible

An internal responsible person is needed for every Use Case; one

person may be the owner of more than one Use Case. She or he is in charge of the Use Case

and is therefore required to sign the completed form.

This role should be taken by a person involved with the topic and the other

stakeholder(s), such as a City department manager.

Tasks:

– Responsible for subject-matter aspects of a Smart City Use Case

– Ensure running operation and provide support

– Analyze usage and use data for City-internal or external purposes according to

usage agreements

If the Use Case to be run on the Smart Data Platform is a City-internal project,
the responsibilities may be taken over by persons within the municipality. In general,
a person can have the same or a different responsibility in different Use Cases.
However, especially once many Use Cases run on the platform, it is sufficient to staff
the data owner role from the respective administrative departments.

Municipalities are often understaffed and often lack financial resources or technical
competences. For them in particular it is recommended to fall back on external Data
Suppliers and solutions.

If data for a Use Case is provided by a City-external supplier, there must be an

additional data compliance approver from the external organisation who supervises

the published data and thus guarantees that it complies with relevant data protection

law from the other organisation’s perspective.

The glossary contains explicit examples on how these listed tasks may be allocated to

specific roles within the Smart Data Platform.

2.5.7 Use Case Reflection

Although the previous chapter discusses the Responsibilities extensively and in detail,
it must be clear that not all of these theoretical tasks, roles and Responsibilities can
be realised without restriction in every City. Each city has its own rules and
hierarchies that have grown over time. Nevertheless, it is important to know that any
new or innovative Smart City project approach will challenge the existing structures.
(In some cities it is perhaps even one of the main tasks of a Smart City project to
reflect on and challenge outdated processes, to propose new ways of thinking or to
break up structural silos within the organisation.)

In most cases a Smart City is an overarching project approach that requires support
and commitment from various (if not all) established departments in parallel. Top-down
support and backing is therefore the first and most important lever for gaining
involvement and real commitment among the various operative departments of the City.

In Munich an interdisciplinary team staffed by all relevant departments of the City was

established from the very beginning, creating the needed common rules and goals

for the project collaboration. As all participants are experienced and highly motivated

specialists with a good knowledge of the standard City processes, all described

Responsibilities in the City were covered. In some cases (also in order to break up silos)

a combination of Responsibilities was given over to single staff members.

The Use Case “Smart Home” needs to be integrated into the existing processes and
policies of the whole City. In some cases, however, new roles had to be defined (and
staffed) to cover new processes within the project, such as the co-creation process
(see the dedicated description in this document).

Within the Use Case “Smart Home” the following main roles were defined:

– Flat inhabitant

– External Service Provider

– University

– City

– Operator of the Smart Data Platform

The impact and understanding of this chapter was a very important prerequisite for

the definition of the Use Case.

First, the stakeholders and their project-specific roles and Responsibilities, as well
as their motivation to participate, had to be discussed and clarified. Even in such a
rather simple process the number of relevant stakeholders was remarkable:

The Flat Inhabitant is the owner of the raw data and shares it with the project on a
voluntary basis. He or she has to agree to share the raw data and potentially other
information needed for the scientific analysis.

The External Service Provider owns the process of collecting the data, defines the
contract (including all legal aspects) with the inhabitant, designs and runs the web
app and transmits the data necessary for the scientific analysis to the operator of the
project database. A data processing supply contract with the operator of the data
platform has to be negotiated, as well as a Classification of the transferred data that
determines its usage within the Smart Data Platform (SDP).

The University, as scientific analyst of the data, had to sign an agreement that
describes the procedure and goals of the analysis and determines the legal usage of
the analysed data.

The Operator of the SDP had to sign a contract that defines the legal and technical

usage of the transferred and analysed data as well as the complete service of the

platform.

The City (forming one internal Smart City team including participants from all relevant
departments and subsidiaries) acts as the responsible and very visible overall project
partner. It had to supervise and steer the complete end-to-end process. The City is
interested in the analysed data for its future Smart City replication activities. In
addition, in any digitalization project it is very important for the image of the City to
comply with all laws and City-internal rules and to communicate with the citizens and
project stakeholders as transparently as possible.

2.5.8 Golden Rules

~ A clear definition of all stakeholders in advance, including their motivation and
roles, simplifies the discussion and interaction during the project.

~ The formation of an interdisciplinary “Smart City Team” that includes all relevant
City departments and is strongly backed by top-down support (e.g. by installing an
upper-management Smart City steering committee incl. the mayor or his/her deputy)
is one of the (perhaps even the) most powerful levers to overcome potential and
established communication barriers between the long-established departments of a
City. Without open and honest inter-department communication based upon common
goals, an innovative Smart City project is unlikely to be executed successfully.

2.6 Participation

Before starting any Smart City project that requires data to be generated and
evaluated, the responsible project lead / stakeholder has to describe which raw data
needs to be retrieved or collected and where it should come from. In this context it
needs to be determined which data evaluation or information analysis is requested,
in which format the outcome needs to be made available, and for whom. Such an
overall description of goals and items is called a “Use Case”.

In the future, Use Case descriptions in the context of a Smart City will depend more
and more on data input coming from sensors. These sensors might be placed in private
or public space to automatically provide raw data and even aggregated data
(depending on the sensor). Sensor data can e.g. comprise traffic counting or
measurement of air pollution. This data allows e.g. for more focused urban development
planning in dedicated focus areas or supports other Use Case goals.

[Figure: overview of the Data Gatekeeper topics, arranged under “Challenges & Problems” and “Solutions and Benefits”: Participation, Transparency, Data Provision, Data Consumption, Problem Awareness, Requirements Specification, Collect Data, Benefit Models, Efficiency Increase, Benefit for Stakeholders, Smart Data Strategy, Political Support & Drivers, Operations, Responsibilities, Smart Data Lifecycle, Data Model, Smart Data Infrastructure, Legal Regulations, Internal Compliance]

The installation of sensors in public space (e.g. to detect available parking places in
a certain district) and the new public services that may result can cause a high degree
of uncertainty or scepticism among citizens, especially if they are not well informed
about the nature and goals of the sensor installation. Therefore, irrespective of legal
requirements or City-internal policies, it is recommended to actively involve the
affected citizens at an early stage of a Smart City project in order to achieve the best
possible local Participation. This can be done through co-creation workshops on basic
elements of a Smart City (mobility solutions, energy efficiency, etc.) or more detailed
co-creation workshops, e.g. on the use of sensors. It is also helpful to organize new
Participation formats such as barcamps, or to bring individual questions to specific
target groups, in order to get new and fresh ideas as input for the co-creation
workshops. Young people and students, for example, often take part in GovJams, and
the local ICT community is interested in hackathons.

Even in case target groups are not directly or personally affected by a data collection,

data analysis or data visualisation (e.g. in Use Cases that deal with “local

meteorological data”, “anonymous traffic counting” or “general utilisation of mobility

hubs”) the organisation of appropriate information measures is key to allow for a

successful Smart City project.

A so-called “co-creation process” can be helpful to concretise or challenge goals

addressed by the Smart City project. In context with the Data Gatekeeper concept

the co-creation process is understood as a number of successive workshops with

relevant stakeholders located in different areas of expertise reaching from potential

data providers, over City administration personnel, industry partners to citizens as well

as local residents. The content of these workshops exceeds information, discussion and

consultation, and includes collective idea generation sessions and collaborative

development of concrete solutions/ recommendations for the implementation. It is

necessary to receive input about which solutions are acceptable, but also to get an

idea about which kind of data-collecting is not tolerated by local stakeholders. The

overall workshop goal is to uncover diverse stakeholders’ views on data security and

related controversies. Furthermore, it is important to determine which scenarios

consisting of appropriate technology, local measures and planned visualisation of

data shall be further developed and which are not suitable for the specific context.

Thus, the workshops need to be individually designed according to a local context

(topic, location, legal frameworks, timeframe, planning practices etc.). The outcome

of the stakeholder involvement should be an increased mutual understanding,

substantial contribution to and acceptance of a planned Smart City project, or a part

of a project.

2.6.1 Use Case Reflection

For the Use Case “Smart Home” it is important to involve the participants very early in
the process, because Smart City projects are not only about technical integration; they
also deal with social integration and collaboration with the citizens.

Because the required functionality in the Use Case “Smart Home” was already fixed,
it did not make sense to perform any co-creation process beyond the normal
information process. In the following example we therefore describe a comparable
Participation process, conducted as part of the co-creation process in the affected
City district of Munich. In this other Use Case (dealing with the sensors in the
intelligent lampposts) we offered several “creative workshops” and barcamp sessions
to define desired functionalities.

Several months before the planned implementation phase, the local citizens of the

project area were guided towards the topic of “Smart Home” within the following

activities:

1) Public events introducing the topic broadly and sensitising groups that might be
concerned, e.g. inhabitants of the whole project City district and, in particular,
inhabitants of the block of flats to be potentially refurbished

2) Additional meetings with interested citizens in the project meeting-hall

demonstrating the functionality at a demo wall showing all needed

components of the Smart Home

3) Info booth and discussions with citizens at the local district festival conducting

the campaign “become a smart home pioneer”

The local citizens of the project area were guided towards the topic of “sensors”

within the following process:

1) Meetings and discussions in the district lab presenting the system with slides and

demo equipment

2) Additional meetings with interested citizens demonstrating the functionality at

a demo wall showing all needed components and the GUI (feel-good-app) of

the Smart Home including following aspects:

o Live demo of Smart Home

o What is a Smart Home and what is my benefit?

o Presentation of additional Smart Home components (security …)

3) Info booth and discussions with citizens at the local district festival conducting

the campaign “become a smart home pioneer”

o Discussions and demo of the Smart Home

o Photo-session for the children “Smart Home Pioneer”

o Presentation, discussion and documentation of the outcomes

The following describes the Co-Creation Workshops that Smarter Together Munich
organised for the innovative lamppost sensors approach. In close co-operation with the
City's project stakeholders, the Technical University of Munich (TUM) defined and
conducted a series of workshops.

The professorship for Participatory Technology Design is based at the Munich Center
for Technology in Society and the TUM Department of Architecture. In the context of
Smarter Together it has been in charge of enabling instances of co-creation in support
of the tasks mobility, energy and technology. Customized co-creation processes were
therefore designed for several Smarter Together solutions to make sure that possible
public concerns and requirements are incorporated into the planned infrastructural
enlargements. In principle everybody is invited to contribute to the co-creation
processes; however, they are not designed to involve a large number of people and
hence are no substitute for broad public engagement. Rather, the goal of the
co-creation process is to establish an ongoing collaboration between City officials,
concerned residents and other relevant civil society actors throughout the ideation,
planning and implementation phases of the respective solution.

A significant element for the success of co-creation processes is the openness of all
responsible stakeholders to disclose certain components of the planned projects to
additional stakeholders. Co-creation does not merely collect and evaluate ideas; it
allows for the co-design of concrete aspects within a project.

In the following example a co-creation process is described that was conducted in

Munich:

The local citizens of the project area were guided towards the topic of “sensors”

within the following process:

1) Public events introducing the topic broadly and sensitizing possible concerned

groups:

o Panel discussion on the use of gamification principles for Participation

and sustainable urban planning

o Moderated exhibition: What are „smart services“? Do I need them? And

what do they need from me?

2) First co-creation workshop to make the smart lamppost topic accessible,

dissolve strong hierarchies of expertise and stimulate collaboration

o Short Input on the planned measures

o Playful group activity on the questions: What is a sensor and what can it

do?

o Presentation, discussion and documentation of the outcomes of the

game

3) Second co-creation workshop to develop ideas for useful sensor-based services

o Take home task: Find an everyday situation where a sensor-based service

could be helpful

o Group activity based on previous workshop to define desirable and

undesirable services (i.e. impact on privacy or quality of stay, usefulness,

etc.)

o Presentation, discussion and documentation of the outcomes

4) Third co-creation workshop to develop recommendations to be incorporated

into the planned open call for sensors

o Site visits for “reality check” of imagined services

o Expert-input from legal scholar on data protection and privacy in smart

cities

o Collective formulation of recommendations concerning 1) technical

requirements, 2) functional requirements, 3) “no go” criteria, 4) data

ownership, open data and licensing as well as 5) transparency

requirements.

All data collection that local citizens define as “no go” during the workshops shall be
recorded as such, in a binding way, in the related Use Cases. All other data-related
items can be included in the Use Cases, provided that any boundary conditions for
future implementations are observed. An example of a boundary condition is that
dedicated sensors may be used to collect data only when dedicated technical or
organisational guidelines can be guaranteed and are documented in a transparent
way. Continuous communication and collaboration on identified and not yet identified
issues is highly recommended.

2.6.2 Golden Rules

~ A Smart City without a reasonable social integration or Participation process is

not really smart

~ The design of processes needs the involvement and acceptance of the employees,
partners and citizens

~ An innovative and professional Participation or Co-Creation process sometimes
cannot yet be performed by existing City stakeholders. The special skills required
for such a process are often only available from dedicated agencies or University
departments and need to be built up step by step by City employees.

Short Summary of “Responsibilities and Participation”

At first glance, discussions about “Responsibilities” or “Participation” do not seem to
have much to do with the described Data Gatekeeper topic. But the more one thinks
about what a Data Gatekeeper approach within any Use Case really means for an
innovative and data driven Smart City environment, the more the question arises of
how a city needs to change or adapt old behavior, also from an organizational
perspective, to be able to respond to the upcoming challenges in the “new data driven
world”. Any Smart City approach comprising the described Data Gatekeeper approach
always includes both technical and social integration aspects.

Often the established roles in a given City organization and the traditional interworking
between departments are not designed to support “data driven benefit models”.
Interdisciplinary teams need to be set up and well supported in their new roles from a
“top down” perspective. The City’s own role in enabling structured and honest citizen
Participation processes often has to be designed from scratch too.

What matters is not the theoretical question of what a perfect team should look like,
but the pragmatic approach: to challenge outdated collaboration traditions in a city
and to allow, trial and optimize more “Use Case driven collaboration methodologies”
for a future, data driven Smart City.

Smart Data Creation Guidelines

Smart Data results from a Smart Data Platform that collects and processes data from
several separate Use Cases. Therefore, the main aspect of Smart Data creation is the
creation of Smart Data Use Cases. The following chapter describes a detailed process
that needs to be initiated at the beginning of every Use Case creation and completed
in full once per Smart Data Use Case.

The following subsections describe the workflow of the creation of a concrete Smart

Data Use Case, which includes the following necessary steps:

– Create a requirement specification

– Classify each Data Set

– Define a data usage contract/agreement

– First Quality Gate and approval by all Use Case stakeholders

– Describe Data Sets and consider technical implementation issues

– Choose adequate data processing methods

– Second Quality Gate with technical acceptance by all stakeholders

A checklist for each step is attached to the document.
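The ordered steps and the two Quality Gates listed above can be sketched as a small state check. This is only an illustration under stated assumptions: the names `Step`, `next_open_step` and `gate_passed`, as well as the stakeholder labels in the usage note, are hypothetical and not part of the Data Gatekeeper concept or of any platform API.

```python
from enum import Enum


class Step(Enum):
    """Steps of the Use Case creation workflow, in the order listed above."""
    REQUIREMENT_SPECIFICATION = 1
    DATA_SET_CLASSIFICATION = 2
    DATA_USAGE_AGREEMENT = 3
    QUALITY_GATE_1 = 4                # approval by all Use Case stakeholders
    DATA_SETS_AND_IMPLEMENTATION = 5
    DATA_PROCESSING_METHODS = 6
    QUALITY_GATE_2 = 7                # technical acceptance by all stakeholders


def next_open_step(completed):
    """Return the first step that is not yet completed, or None if all are done."""
    for step in Step:  # Enum iteration follows definition order
        if step not in completed:
            return step
    return None


def gate_passed(gate, approvals, stakeholders):
    """A Quality Gate is passed only when every stakeholder has approved it."""
    return stakeholders <= approvals.get(gate, set())
```

For example, with the hypothetical stakeholders `{"data_owner", "use_case_responsible", "platform_operator"}` and the first three steps completed, `next_open_step` returns `Step.QUALITY_GATE_1`, and `gate_passed` stays false until all three have approved.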

3.1 Requirement Specification

As soon as the responsible bodies have initiated the Use Case creation, the
requirements specification starts as the first activity of the Use Case Creation Process.

When creating a Use Case, several aspects need to be considered and specified
beforehand. On the one hand, classical requirements engineering questions are
relevant, such as user goals, preconditions and other Use Case descriptions. On the
other hand, first considerations regarding data Classification are required. The
following subsections give an overview of these aspects, including a brief specification
checklist for validation and an agreement meeting.

3.1.1 User Goals

A Use Case defines procedures and interactions between actors and/or systems to
achieve a specified goal. Therefore, a Use Case description includes information
about the goal that should be achieved and the activities and steps that need to be
taken in order to achieve it. Furthermore, it specifies the technical requirements
regarding the system used.

Every Use Case needs a detailed description of the analysis purpose(s), its underlying

motivation and goals as well as the requested results, including their presentation.

Different roles may have different goals within the Use Case.

3.1.2 Use Case Project Information

In addition, the Use Case description contains comprehensive information on all steps
and requirements needed to get a solid overview of the Use Case’s project and to
understand its context completely.

[Figure: Smart Data Creation Guidelines (Chapter 3), showing the process steps Requirement Specification, Data Categorization, Data Usage Agreement and Licensing, Data Classification, Data Sets and Technical Implementation and Data Processing across sections 3.1 to 3.8, with Quality Gate 1 and Quality Gate 2]

In effect, this means that a Use Case defines all required data (sources) and specifies
the data streams and processing steps up to the final requested result(s). It also means
that a Use Case is always project based.

Knowing the objectives of the data analysis in full detail and at an early stage makes
it easier to determine whether data from other sources is needed. A further benefit is
being able to ensure data security and data privacy. To achieve that, the privacy
Classification level (see chapter 3.4) of every Use Case needs to be specified in
advance according to its protection needs. This level shows whether special protection
is necessary due to privacy risks, and it reflects further conditions that were agreed on.

Describing the exact purpose and approach from the beginning simplifies discussions
about data handling, data Classification, data exchange, technical interface
requirements as well as possible data licensing, data security and privacy. Detailed
documentation further allows errors to be detected in advance and makes it possible
to respond accordingly. Moreover, when personal data is handled, the law prescribes
detailed documentation, as described in chapter 2.1.

3.1.3 Specification Checklist

A Use Case should therefore contain:

– A basic Use Case description including:

o An identification number

o A unique name

o Analysis purpose and motivation/business model

(what should be analyzed and why)

o Task assignments: How to achieve the goal

(indicating every single step and the persons involved)

o Analysis results (output)

o A precise description of the data that is needed

o The time period

o Where applicable: Design ideas of the analysis

o Where applicable: Exceptions or special cases

o Where applicable: Specific requirements or special conditions

– Drivers for the Use Case (see 2.4 strategy):

o Benefits from data provision of all participants, partners and the public

o Benefits from data consumption of all participants, partners and the

public

– Information on Project Stakeholders and Use Case specific Responsibilities:

o Definition of contact persons

o Assignment of required tasks

o Where applicable: Definition of further required tasks

o Where applicable: Assignment of new task(s)

– Technical Requirements:

o The kind of Data Set (e.g. data base/sensor) and the location

o Data format(s)

o Interfaces for the import and export of the data (API access)

o Time and rate of exchange

o Where applicable: A description of further specific requirements or

special conditions (modalities)

– Suggestion for Data Processing and Exchange

o Type of contract for data sharing/usage

o A Categorisation (see chapter 3.2) of the data involved

o Potential necessary Pre- and Post-Processing measures for

Anonymisation and Pseudonymisation

All this information should be set down in a written agreement. It is recommended to
create and use a standardized form, which must be filled out for every Use Case.
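As an illustration of such a standardized form, the checklist above could be captured in a simple record type that reports which mandatory entries are still missing. This is a minimal sketch, assuming that the “where applicable” items are optional; the class name, field names and the identifier format used in the usage note are hypothetical and not taken from the concept.

```python
from dataclasses import dataclass, field, fields


@dataclass
class UseCaseSpecification:
    # Basic Use Case description
    identification_number: str = ""
    name: str = ""
    analysis_purpose: str = ""
    task_assignments: list = field(default_factory=list)
    analysis_results: str = ""
    required_data: str = ""
    time_period: str = ""
    # Drivers for the Use Case
    benefits_data_provision: str = ""
    benefits_data_consumption: str = ""
    # Project stakeholders and responsibilities
    contact_persons: list = field(default_factory=list)
    # Technical requirements
    data_set_kind_and_location: str = ""
    data_formats: list = field(default_factory=list)
    interfaces: str = ""
    exchange_time_and_rate: str = ""
    # Suggestion for data processing and exchange
    contract_type: str = ""
    categorisation: str = ""
    # All "where applicable" items (assumed to be optional)
    optional_notes: dict = field(default_factory=dict)

    def missing_fields(self):
        """Return the names of mandatory fields that are still empty."""
        return [f.name for f in fields(self)
                if f.name != "optional_notes" and not getattr(self, f.name)]
```

A draft such as `UseCaseSpecification(identification_number="UC-001", name="Example")` would then list every remaining mandatory field via `missing_fields()`, which could serve as input for the agreement meeting.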

3.1.4 Agreement Meeting and Approval

As soon as a first draft of the specification checklist is filled out, the responsible
person or bodies initiate and moderate a Use Case agreement meeting. The relevant
stakeholders are defined within the Tasks and Responsibilities (chapter 2.5).

All kinds of data analysis queries have to be specified and confirmed by all
stakeholders in a discussion so that legal requirements can be checked in the next
steps.

A clear benefit model that takes all stakeholders into consideration (data providers,
analysis beneficiaries, others) needs to be defined.

In a next step the technical feasibility needs to be approved. Hereby especially the

proposed data formats and interfaces are evaluated. Finally a legal check of the

Requirement Specification is carried out.

Within the scope of this concept, basic information on data classification and on
allowed purposes of use (incl. approval as open data, restricted use or exclusive use
of data for defined target groups etc.) is defined. It needs to be determined which
granularity of information should be made open to the public by means of the
Transparency Dashboard (chapter 5.6). This transparency towards the public helps to
establish the city or urban district as a trustworthy partner.

3.1.5 Use Case Reflection

For any Use Case, including “Smart Home”, it is important to define clear and
complete requirements from the very beginning. Otherwise there is a risk of incomplete
solutions and expensive rework. All requirements and possible data-usage
restrictions from all stakeholders should be defined at this point.

The overall goal of the use case “Smart Home” is to measure the efficiency of

refurbishment activities before and after the refurbishment and give inhabitants

additional information on their ventilation behaviour via the so-called “feel-good-

app” which indicates the impact of ventilation activities.

Therefore about 400 starter sets, each including 2 temperature/air-humidity sensors
and a base station, were offered free of charge to the flat inhabitants of the project
area.

The requirement specification of the Use Case “Smart Home” was basically divided

into the following processes:

– Data import from the City’s own “Energy Master Database” into the Smart Data
Platform: energy-related master data on buildings in the district is prepared and
imported into the SDP to obtain the master information needed on flats and
buildings whose owners have been advised and which might be refurbished. The
classification (usage restrictions of data) was defined.

– Data import from the Service Provider’s “Smart Home Platform” into the Smart
Data Platform: sensor data (temperature and humidity) is collected in selected
flats and transferred via the base station to the Smart Home Provider, which
prepares and anonymises the data and transfers it to the SDP. The classification
(usage restrictions of data) was defined.

– Data import from “Deutscher Wetterdienst” (DWD, German Weather Forecast
Service) into the Smart Data Platform, to receive the external temperature and
air humidity conditions on a regular basis

– Analysis of the required indicators for the EU reporting: the SDP imports data and
provides several functions for maintaining, analysing, presenting and exporting
data; the analysis algorithm is based on specifications worked out by the
Technical University of Munich (TUM)

3.1.6 Golden Rules

~ Glossary definition at the beginning – always keep in mind that a Use Case is

not necessarily a familiar tool for the involved City departments and other

stakeholders. The Use Case content should translate between the customer’s

requirements and the future IT-solution.

~ Clearly understanding and describing the underlying department needs, as well

as the requested analysis demands, might take several steps for all involved

people. A reasonable time schedule should be considered for this work.

~ The more precise the definition of all requirements from the beginning, the less
work and the fewer “misunderstandings” arise during the implementation phase.

~ A clear and detailed description of all required analysis results (“what should the
output for the involved departments look like, what is the intended output
format, etc.”) is a key element of a successful implementation.

3.2 Data Categorisation

Data categorisation is understood as the assignment of data privacy and security

levels to a Data Set stemming from one Data Source. A categorisation can also

include economic or other important aspects that might restrict or open a data set.

This is necessary due to compliance reasons that result from legal or City-internal rules,

which are listed above (see chapter 2.1 and 2.2).

One possible way of proceeding consists of the following 2 steps:

- First, within this concept, Data Sets are assigned to different previously specified
categories with regard to their protection needs. The protection needs of each
Data Set result from legal requirements and from demands laid down by the Data
Owner or Data Provider (Categorisation).

- Second, the Data Set is rated in 4 Classification Dimensions (levels: 1–4).

Labeling data before it enters the Smart Data Platform ensures legal and safe
processing.
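The two-step labeling can be sketched as a small validated record: one Category (1 to 4) plus one level (1 to 4) for each of the four Classification Dimensions. This is only a sketch under assumptions. The dimension names below are placeholders, since this section does not name the dimensions, and `DataSetLabel` is a hypothetical type, not part of the Smart Data Platform.

```python
from dataclasses import dataclass

# Placeholder names: this section does not name the four Classification Dimensions.
DIMENSIONS = ("dimension_1", "dimension_2", "dimension_3", "dimension_4")


@dataclass(frozen=True)
class DataSetLabel:
    data_set: str
    category: int   # step 1: Category 1..4
    levels: dict    # step 2: one level (1..4) per Classification Dimension

    def __post_init__(self):
        # Reject incomplete or out-of-range labels before the data enters the platform.
        if self.category not in range(1, 5):
            raise ValueError("category must be between 1 and 4")
        if set(self.levels) != set(DIMENSIONS):
            raise ValueError("all four Classification Dimensions must be rated")
        if any(level not in range(1, 5) for level in self.levels.values()):
            raise ValueError("each level must be between 1 and 4")
```

Constructing a label with a missing dimension or an out-of-range value raises a `ValueError`, so an incompletely labeled Data Set can be rejected at the point of import.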

__________________________________________________________________________________

Remark: In this Data Gatekeeper concept we chose a 2-step approach to raise
awareness of how data Classification can be done in detail. We first describe 4
possible Categories, followed by 4 Classification types, each with an additional 4-step
level rating. Please keep in mind that this procedure is an example and not the only
possible approach. Any reasonable existing methodology is fine, as long as it is used
consistently within a City or a project.


If the available raw data cannot be used for an analysis, e.g. under the existing data
privacy laws and/or city policies, applying pre- and post-processing algorithms (such
as anonymisation or pseudonymisation) to the data (e.g. at the data provider or within
the Smart Data Platform) might allow changes in the Category and Classification of the
data. A solid anonymisation algorithm is often a prerequisite for analysing data that
originally was not allowed to be used (e.g. due to privacy constraints). Applied in close
cooperation with the data security / data protection officer, this can be a very
powerful method to overcome potential privacy barriers without violating or even
touching any privacy law or policy. Introducing e.g. innovative anonymisation
algorithms into a Smart Data Platform can change the complete categorisation and
classification scheme that was originally required by the data providers. The present
version of the Data Gatekeeper concept does not examine these potential dynamic
aspects in depth. Possible Categorisation and Classification changes facilitated by
robust and reliable algorithms of a Smart Data Platform need to be examined in more
detail in future extensions of the Data Gatekeeper concept.

__________________________________________________________________________________

In line with this exemplary approach (see Annex Data Categorization and
Classification), the following section describes the four Categories from open data to
personal data. The process description can vary depending on the individual use case
or Categorisation scheme used in other projects.

3.2.1 Categories

In the given context, a distinction is made between four fundamental Categories of
data, each with different protection needs:

- category 1: open data

The main focus lies on the prerequisites for passing on the data and the
respective user agreements, licensing, and further agreed conditions. Data in
this category is not subject to any privacy restrictions, meaning that it
contains no personal or person-related data. This data has the lowest
protection need of the four Categories.

- category 2: non-personal data but not open (organisation internal)

Data in this category cannot be unequivocally classified as public data, but is
not subject to any privacy restrictions: it does not contain any information
about persons that requires protection and consequently cannot be used to
violate individual personality rights.

Because the data have not (yet) been released/marked as open data, there
might be restrictions regarding their use, processing, distribution or
storage. Topics that need to be considered within this category therefore
include usage and distribution rights and agreements as well as licenses and
further restrictions.

- category 3: non-personal data but restricted

Data belonging to this category cannot be unequivocally classified as personal

data, but contains a (potential) risk that under special circumstances this data

can be attributed to persons (or other restricted areas with a certain degree of

risk potential) and therefore needs to be treated accordingly.

This means that data in this category does not contain any direct personal or
privacy-related information. However, when the data is linked or evaluated
with other information, it may be possible to extract individual information,
such as the behaviour of small groups, or even to draw conclusions about an
individual. Furthermore, data in this category can be of such a nature that
there is a certain (individually specified) risk of misuse, even without any
person-related information being involved.

When processing this kind of data, it needs to be verified and ensured that the
processing does not allow any kind of traceability to individuals or micro groups.
If this is not possible, the data is either not allowed to be processed within the
Smart Data Platform or it needs to be treated as personal data, and hence
compliance with all privacy regulations needs to be ensured.

- category 4: Personal data

Personal data is highly sensitive and confidential and hence has the
highest protection need. When processing personal data, the responsible or
assigned stakeholder is obliged to document in detail and in writing the
reasons for the processing, including possible risks and all measures taken to
ensure data privacy and security. Personal data should not be processed unless
it is absolutely necessary for the fulfilment of a use case. In some cases,
however, it may be necessary to process personal data, which is possible if the
principles in chapter 2.1 and 2.2 are taken into account and adhered to. There
are different ways to transform (personal) data within a platform so that it
can no longer be linked to an individual. Measures that can or need to be
taken can be found in chapter 3.7.

For more details see Checklist Template in the Annex.
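
As an illustration only, the four Categories can be modelled as an ordered enumeration. The following minimal Python sketch assumes hypothetical names (`Category`, `effective_category`) that are not part of the Data Gatekeeper concept. For category 3 it implements only one of the two options described above (treating the data as personal data when traceability cannot be excluded); the other option, not processing the data at all, is left out.

```python
from enum import IntEnum


class Category(IntEnum):
    """The four data Categories; a higher value means a higher protection need."""
    OPEN = 1          # category 1: open data
    INTERNAL = 2      # category 2: non-personal data but not open
    RESTRICTED = 3    # category 3: non-personal data but restricted
    PERSONAL = 4      # category 4: personal data


def effective_category(category: Category, traceability_excluded: bool) -> Category:
    """Return the Category that the data must be handled as.

    Assumption: category 3 data is handled as personal data unless it has
    been verified that processing cannot be traced back to individuals or
    micro groups.
    """
    if category is Category.RESTRICTED and not traceability_excluded:
        return Category.PERSONAL
    return category
```

Using an ordered type makes comparisons such as "at least restricted" straightforward, e.g. `category >= Category.RESTRICTED`.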

3.2.2 Use Case Reflection

All data used within the Use Case were first analysed and categorised to clarify the
terms of licensing and terms of use. Data providers often do not initially have a clear
picture of which data Classification they need to choose. They may also be uncertain
about how their data should be classified throughout the whole end-to-end process and
would therefore restrict the usage to a minimum. For that reason, we consider it
helpful to offer a first orientation on how data can be categorised. We used the
Categorisation discussion as preparation for the subsequent data Classification. The
Classification is a much more precise tool than the Categorisation, but it requires a
certain basic understanding of how data processing in the Smart Data Platform is
organised.

Therefore the following three steps have to be done:

– Analyse Categorisation of data

– Check terms of licensing (see also chapter 3.3)

– and / or check compliance (terms of use)

Within the Use Case Smart Home, Categorisation, licence checking and compliance
checking were needed for:

– Master Data from the Geo Portal Munich

– Master Data of buildings and refurbishment information deriving from the City's
“Energy Master Database”

– Measurement data from flats (sensors)

– Weather data from DWD³

³ Deutscher Wetterdienst – National German Weather Service (www.dwd.de)

The analysis results for the data are as follows:

Data Source                          Categorisation Type   Licence   Terms of use
Master Data from Geo Portal Munich   Restricted (cat 3)    no        yes
Energy Master Data                   Restricted (cat 3)    no        yes
Measurement data from flats          Restricted (cat 3)    no        yes
Weather data from DWD                Open (cat 1)          yes       yes

Figure 4: Categorisation of the Use Case

3.2.3 Golden rules

~ The more detailed the analysis of the provided data, the easier the IT
implementation is, and misunderstandings can be avoided at an early stage of
the project.

~ Categorisation of data often helps the data provider to get a better
understanding of the real value of the offered data. This simplifies the subsequent
Classification of the data and creates a basis for trustful collaboration between
the involved stakeholders.

~ Arranging and actively using pre- and post-processing algorithms for data (e.g.
at the data provider or within the Smart Data Platform) might allow changes in
the Category and Classification of data. This tool is often a prerequisite for
analysing data that originally was not allowed to be used (e.g. due to privacy
constraints).

3.3 Data Usage Agreement and Licensing

[Figure: Smart Data Creation Guidelines, Chapter 3 process overview, steps 3.1 to 3.8: Requirement Specification, Data Categorization, Data Usage Agreement and Licensing, Data Classification, Quality Gate 1, Data Sets and Technical Implementation, Data Processing, Quality Gate 2]

There are different options to share data and make it available for others to use. In all
cases it is necessary to specify the conditions of use in a Data Usage Agreement. Either
the data are already subject to an existing license, for instance a Creative Commons
license, or an individual agreement is necessary. An individual agreement in addition
to a standard licence is called a Side Letter.

3.3.1 Standard Licences

In the usual licensing procedure, no negotiation takes place between supplier and
user in the individual case. Hence, the use of open data licenses or sample user
contracts (templates) minimizes effort. Moreover, further conditions can be agreed
in an additional agreement if necessary (side letter agreement).

Common standard open data licenses in Germany and beyond are briefly
summarized in the following:

Creative Commons (CC)

URL: https://creativecommons.org/share-your-work/licensing-types-examples/

- Internationally known and widespread public copyright licenses that enable
the free distribution of an otherwise copyrighted work

- Several types of CC licenses

- Current version: 4.0
  - German translation was provided in January 2017
  - There is no porting for German law yet

- There is a porting of the older version 3.0 for German law

Datenlizenz Deutschland

URL: https://www.govdata.de/lizenzen

- Current version: 2.0

- Available in two versions:
  - „Datenlizenz Deutschland – Namensnennung – Version 2.0": obligates the data user to state (the name of) the data provider
  - „Datenlizenz Deutschland – Zero – Version 2.0": allows unrestricted use

Geodatennutzungsverordnung (GeoNutzV)

URL: http://www.gesetze-im-internet.de/geonutzv/

- Applicable for geodata, geodata services and metadata

- Must be adapted by the data-holding body to the respective circumstances

3.3.2 Individual Agreement or Side Letter

If there is no existing license or an additional agreement is desired, it is necessary
to set up a Data Usage Agreement contract. The agreement should contain detailed
information on the conditions of use, the delivery arrangements and all tasks, with
the respective responsibilities and rights of the two parties. Topics like further
distribution and the conditions for every state of the data should be addressed as well.
Such a Data Usage Agreement should either be created by a lawyer or at least be
reviewed and approved by one.

If the data is not already under a license, the Data Usage Agreement has to be
drafted as a contract defining the rules and conditions of the data usage and
processing. If the agreement is a Side Letter to an existing license, it just needs to
describe additional usage conditions or exchange arrangements.

Distinguished by the kind of mutual compensation of the two parties, common types

of user agreements are:

– Fee-based agreements

– Exchange agreements (e.g. data exchange)

– Individual additional agreement (e.g. additionally to an existing license)


3.3.3 Use Case Reflection

In the Use Case “Smart Home”, some of the data usage agreements between the
consortium partners were already defined within the General Agreement of the EU
project “Smarter Together”. Details concerning the concrete Classification and
licensing of Data Sets (e.g. data privacy concerns or commercial aspects), however,
had to be settled on an individual basis. Another external licence agreement that we
had to consider and investigate in more detail was that for the DWD weather forecast
data (which is open data). We also needed to find an agreement on how to restrict the
usage of the flat inhabitants’ personal data (“Address”). This was done by means of
the above-mentioned “written individual agreement” that every participating flat
inhabitant had to sign (on a voluntary basis) before starting to use the offered service.

3.3.4 Golden Rules

~ Make sure that for each Data Source that exchanges data with the project
database (e.g. Smart Data Platform), there is a written agreement with the
data provider on how the data is categorised.

~ Make sure that for every Data Source that exchanges data with the project
database (e.g. Smart Data Platform), there is a clear written agreement on the
data usage licenses (e.g. monetary conditions or other restrictions required).

3.4 Data Classification

Remark (see also Data Categorisation): In this Data Gatekeeper concept a two-step
approach was chosen to raise awareness of how data Classification can be done
in detail. Four possible Categories and four Classification types were described, each
with an additional 4-step level rating.

Please keep in mind that this procedure is not the only possible approach. Any
reasonable existing methodology can be fine, as long as it is used consistently within
a City or a project.


Also note that the subdivision of the respective Categories (Classification) must be
realised by means of previously defined pre- or post-processing steps (see Data
Processing) of the respective derived Data Sources, as it is an instrument for
Anonymisation and Pseudonymisation.

__________________________________________________________________________________

In this concept, Classification was made in the following four Dimensions:

I for integrity and data protection (Level I1 – I4)

P for processing and analysis (Level P1 – P4)

R for redistribution and modification (Level R1 – R4)

S for storage and deletion (Level S1 – S4)

There are four different levels for each Classification Dimension. The level
shows the required type of treatment for the data on the Smart Data Platform. The
settings start with level 1 for open data, with the lowest protection needs, and go
up to level 4 for personal data, with the highest protection needs.

Thus, in total there are 16 different Classification settings. A Data Set therefore will
always have 4 Classification Data Fields, one for every Classification Dimension.

A Data Set can be level 1 regarding integrity and data protection but level 2 regarding
redistribution and modification (e.g. I1-P1-R2-S1). The exact meaning of each
Classification setting is described in the following subchapters.
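
To make the four Classification Data Fields more tangible, the following minimal Python sketch parses and formats a label built from the dimension letters I, P, R and S defined above. The class name `Classification`, the field names and the hyphen-separated label format are assumptions made for this illustration; the concept itself does not prescribe a storage format.

```python
import re
from dataclasses import dataclass

# Hypothetical label format: one level (1-4) per dimension, e.g. "I1-P1-R2-S1".
LABEL_PATTERN = re.compile(r"I([1-4])-P([1-4])-R([1-4])-S([1-4])")


@dataclass(frozen=True)
class Classification:
    integrity: int        # I: integrity and data protection
    processing: int       # P: processing and analysis
    redistribution: int   # R: redistribution and modification
    storage: int          # S: storage and deletion

    def __post_init__(self):
        # Every dimension must carry one of the four defined levels.
        for name, level in vars(self).items():
            if level not in (1, 2, 3, 4):
                raise ValueError(f"{name} level must be 1..4, got {level}")

    @classmethod
    def parse(cls, label: str) -> "Classification":
        match = LABEL_PATTERN.fullmatch(label)
        if match is None:
            raise ValueError(f"not a valid classification label: {label!r}")
        return cls(*(int(group) for group in match.groups()))

    def label(self) -> str:
        return (f"I{self.integrity}-P{self.processing}"
                f"-R{self.redistribution}-S{self.storage}")
```

For example, `Classification.parse("I1-P1-R2-S1").redistribution` evaluates to `2`, matching the Data Set described above.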

In addition to the guidelines shown in chapter 2, mainly resulting from legal

requirements, the Classification is a further orientation guidance by specifying in which

area a Use Case can be set up.

The Categorisation hence helps (potential) Use Case creators by setting rules and

requirements that need to be complied with. It offers further guidance by giving an

overview of the different existing data Categories, enabling the creator to see whether

a Use Case idea is legal and realisable. Use Cases and data that do not meet these

basic requirements or which do not comply with the basic legal regulations are not

allowed on the Smart Data Platform.

Within one Use Case, depending e.g. on the analysis goals, there can be several

Categorisations of Data Sets simultaneously. They however must be clearly separated

from each other in order to not produce misunderstandings in the data processing.

The following information is stored as relevant for processing within the Smart Data
Platform: whether the Data Set is Open Data or not (refer to the respective
Category), a time-to-live for each Data Object (may be unset or set upon decision by
Data Protection Officers), and the 4-level rating for the Classification Dimensions
Integrity & Data Protection (I), Processing & Analysis (P), Redistribution &
Modification (R) and Storage & Deletion (S).
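
The stored information listed above could be represented roughly as follows. This is a sketch under stated assumptions: the record name `DataObjectMetadata`, the field names and the expiry rule (a Data Object expires once its creation time plus time-to-live has been reached) are hypothetical and not taken from the Smart Data Platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DataObjectMetadata:
    is_open_data: bool                  # Open Data flag (see Category)
    created_at: datetime                # assumed reference point for the TTL
    time_to_live: Optional[timedelta]   # None means the TTL is unset
    integrity: int                      # I level, 1..4
    processing: int                     # P level, 1..4
    redistribution: int                 # R level, 1..4
    storage: int                        # S level, 1..4

    def is_expired(self, now: datetime) -> bool:
        """True once the time-to-live has elapsed; never True if it is unset."""
        if self.time_to_live is None:
            return False
        return now >= self.created_at + self.time_to_live
```

A periodic deletion job could then select all Data Objects for which `is_expired(now)` holds.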

The four Classification Dimensions and their possible settings are detailed in the

following.

3.4.1 Integrity and Data Protection

The Classification I stands for Integrity and Data Protection. It ensures that data is
processed lawfully and securely with regard to those aspects.

Level 1 (I1): No further measures need to be taken.

Level 2 (I2): Data need to be anonymized before further processing.

Level 3 (I3): Data need to be anonymized before further processing and if

needed also aggregated before that.

Level 4 (I4): Data needs to be treated as personal data and therefore it is

necessary to comply with data protection laws and legal requirements,

described in chapter 3.4.4.

3.4.2 Processing and Analysis

The Classification P stands for Processing and Analysis. It ensures that data is
processed lawfully and securely with regard to those aspects.

Level 1 (P1): No restrictions.

Level 2 (P2): Data can only be processed and analyzed with other data when

it is ensured that by processing, it (still) won’t be possible to trace data back to

an individual.

Level 3 (P3): Data can only be processed and analyzed with preassigned data

within the Use Case.

Level 4 (P4): Data are not allowed to be processed or analyzed with other data.


3.4.3 Redistribution and modification

The Classification R stands for Redistribution and Modification. It ensures that data is
processed lawfully and securely with regard to those aspects.

Level 1 (R1): No restrictions.

Level 2 (R2): Data can be transmitted to external users, only under specific

conditions, which are specified in the usage agreement.

Level 3 (R3): Data are only allowed to be transmitted internally and to the data

provider.

Level 4 (R4): Data are only allowed to be transmitted to specified internal users.
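
A rule base for the redistribution levels could be sketched as below. The recipient types, the function name `may_transmit` and the `agreement_permits` flag are assumptions for illustration. In particular, the sketch assumes that internal users and the data provider may always receive R2 data, which the level description does not state explicitly.

```python
from enum import Enum


class Recipient(Enum):
    EXTERNAL_USER = "external user"
    DATA_PROVIDER = "data provider"
    INTERNAL_USER = "internal user"
    SPECIFIED_INTERNAL_USER = "specified internal user"


def may_transmit(level: int, recipient: Recipient,
                 agreement_permits: bool = False) -> bool:
    """Decide whether data with redistribution level R1..R4 may be transmitted."""
    if level == 1:
        return True                       # R1: no restrictions
    if level == 2:
        if recipient is Recipient.EXTERNAL_USER:
            return agreement_permits      # R2: only under agreed conditions
        return True
    if level == 3:
        # R3: internal transmission and transmission to the data provider only
        return recipient is not Recipient.EXTERNAL_USER
    if level == 4:
        # R4: specified internal users only
        return recipient is Recipient.SPECIFIED_INTERNAL_USER
    raise ValueError(f"unknown redistribution level: {level}")
```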

3.4.4 Storage and Deletion

The Classification S stands for Storage and Deletion. It ensures that data is processed
lawfully and securely with regard to those aspects.

Level 1 (S1): No restrictions.

Level 2 (S2): Data need to be stored in-house, but are not subject to any

restriction regarding deletion.

Level 3 (S3): Data need to be stored in-house and the deletion limit needs to be

observed.

Level 4 (S4): Data contain personal data or comparable vulnerable information

and therefore need to be stored at a preassigned, secure place if necessary

with specific access protection and the deletion limit needs to be observed.

3.4.5 Use Case Reflection

The previously categorised data are classified in detail to determine the rules for
processing and maintaining them within the SDP.

In this step the Classification of the following data will be determined in close

cooperation with data owners:

– Master Data from geo portal Munich

– Energy Master Data of buildings and refurbishment

– Measurement data from flats (sensors)

– Weather data from DWD

All data in the SDP deriving from the flats are classified by the data owner or the data
provider in order to meet the above-mentioned legal requirements and to technically
define and restrict the complete data handling on the SDP accordingly (use of
anonymisation and aggregation …). If, for example, only 2 flats of a building/address
with 20 flats participate in the project, it could be determined relatively easily
which flat owner contributed which data. This would theoretically allow the use of
algorithms that draw conclusions about the personal behaviour of a flat inhabitant.
To prevent this kind of potential misuse from the outset, anonymisation or
aggregation procedures for the collected data also needed to be taken into account.
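
One common way to address the small-group problem described above is to publish only aggregates that are based on a minimum number of participating flats. The following Python sketch illustrates this idea; the function name, the input layout and the threshold of 5 flats are hypothetical choices and not values taken from the Use Case.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_building(readings, min_flats=5):
    """Average the readings per building, suppressing small groups.

    readings: iterable of (building_id, flat_id, value) tuples.
    Buildings with fewer than min_flats distinct participating flats are
    left out, so that values cannot easily be attributed to a single flat.
    """
    values = defaultdict(list)
    flats = defaultdict(set)
    for building_id, flat_id, value in readings:
        values[building_id].append(value)
        flats[building_id].add(flat_id)
    return {
        building_id: mean(values[building_id])
        for building_id in values
        if len(flats[building_id]) >= min_flats
    }
```

With this rule, a building in which only 2 flats participate would not appear in the aggregated output at all.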

The results of the Classification are as follows:

Data Source                          Integrity and     Processing     Redistribution /   Storage /
                                     Data Protection   and Analysis   Modification       Deletion
Master Data from geo portal Munich   Tbd²              Tbd²           Tbd²               Tbd²
Energy Master Data                   Tbd²              Tbd²           Tbd²               Tbd²
Measurement data from flats          4                 2              4                  4
Weather data from DWD                1                 1              1                  1

Figure 5: Classification of the Use Case

__________________________________________________________________________________

Remark: ²: some of the project specific classifications couldn’t be finalised during the

description of this concept.

__________________________________________________________________________________


3.4.6 Golden Rules

~ Classification of data needs to be discussed in close cooperation with the data
owner and the chief data officer before transferring any data into the Smart
Data Platform. Using more than 4 levels of Classification increases the risk of
excessive complexity and of losing track of the details. All impacts and
individual conditions need to be written down and agreed with the platform
operator and the party that implements the rule base (algorithms) in the Smart
Data Platform.

~ Arranging and actively using pre- and post-processing algorithms for data (e.g.
at the data provider or within the Smart Data Platform) might allow changes in
the Category and Classification of data. This “tool” is often the only reasonable
way to allow the analysis of data that originally was not allowed to be used (e.g.
due to privacy constraints) (the same golden rule as already mentioned in
“Data Categorisation”).

Short Summary of “Data Creation Guidelines incl. Data

Categorisation and Classification”

The preparation and creation of the Use Case is the interdisciplinary translation process
between all participating stakeholders, from business orientation to IT implementation.
It is one of the most important preparation steps, in which the essential elements of the
Data Gatekeeper concept are brought together and its required implementation details
are recorded. Step by step, it sharpens and concretises all of the original requirements
and guideline parameters so that in the end a programmer can clearly understand them
and implement them in the Smart Data Platform.

The Use Case anticipates the complete end-to-end process. It describes the
challenges that need to be analysed, e.g. by the operative department, lists the Data
Sources required and defines the output parameters of the analysis. It determines the
data security and data privacy aspects as well as the underlying business or benefit
models for the City, a City department and other stakeholders. Last but not least,
it raises the value of the input and output data by categorising and classifying it
accordingly. The discussion about the Classification of the data is often a very
fruitful and important step in expressing the fundamental Data Gatekeeper related
rules of a Use Case.

3.5 Quality Gate 1

The Smart Data and Use Case Creation Process has several quality gates that ensure

lawful processing of the data as well as data security and privacy.

The first quality gate in the process comprises the elaboration of the final Use Case
concept, its approval and the signing of the Data Usage Agreement.

3.5.1 Elaborate Final Use Case Concept

As soon as the Requirements Specification, the Data Categorisation and Classification
as well as the Data Usage Agreement and the related meetings and negotiations are
finished, the responsible instance can elaborate the final Use Case concept. It is
recommended to summarise the agreements in a document that includes the previously
provided specification checklist. For a checklist template on activities up to the
Quality Gate, refer to the Appendix (7.2.2f.).

3.5.2 Final Concept Approval (Q1)

The Data Owner now approves the final Use Case Concept in agreement with the
Data Supplier. Approval by the decision-making body concerning the related
budget is required too. Other important stakeholders need to be informed about
decisions and about any rectifications demanded and shortcomings found.

3.5.3 Sign Data Usage Agreement

If the approval has been successful, the data owner on the side of the City and the

Data Supplier sign the Data Usage Agreement as negotiated in previous steps.

3.5.4 Use Case Reflection

Quality Gate 1 serves to check the Use Case for compliance with all Legal Regulations
and internal rules and for completeness of all definitions that have been
determined so far. It is the basis for the implementation of the Use Case in the different
systems and platforms. Incomplete or unclear definitions can cause high costs in
terms of rework.


Within the Use Case “Smart Home” we prepared the written Use Case together with
the involved parties. Every party had the opportunity to look at the complete Use Case
description before final approval. We also organised a final meeting between the
involved stakeholders “Use Case owner”, “Analytics draftsman” and “Data platform
operator” in order to go through all requirements again and answer all remaining
questions.

We recommend creating a checklist for every single step that has been done so far
and having it signed by at least your Chief Data Officer, Data Security Officer, Use
Case owner and Data Platform Provider.

As a minimum check you should verify:

– Used data and corresponding terms of use

– Classification and corresponding processing rules

– Use Case completeness and workflow (incl. requirement spec for Analytics)

– Interface definition (all systems involved)

– Functionality and GUI of the components (depending on the implementation
process) such as Analysis Dashboard, SC-App …
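
The minimum check and the recommended signatures can be tracked in a very simple structure. The sketch below is only an illustration: the function name `quality_gate_1_open_points` and the exact wording of the check items are assumptions derived from the list above, not an official checklist template.

```python
REQUIRED_CHECKS = (
    "Used data and corresponding terms of use",
    "Classification and corresponding processing rules",
    "Use Case completeness and workflow",
    "Interface definition",
    "Functionality and GUI of the components",
)

REQUIRED_SIGNATORIES = (
    "Chief Data Officer",
    "Data Security Officer",
    "Use Case owner",
    "Data Platform Provider",
)


def quality_gate_1_open_points(verified_checks, signatures):
    """Return the checks and signatures that are still missing.

    The gate can be considered passed when both lists are empty.
    """
    return {
        "checks": [c for c in REQUIRED_CHECKS if c not in verified_checks],
        "signatures": [s for s in REQUIRED_SIGNATORIES if s not in signatures],
    }
```

Keeping the structure this small is in line with the recommendation to keep the checklist simple while insisting on a signed gate.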

3.5.5 Golden Rules

~ Clearly identify the issue and the benefit of the Use Case. Get the commitment

on this from the involved partners

~ Invite all relevant Use Case stakeholders to comment on the final version of the

Use Case (give them a reasonable time schedule), invite them for a final

approval and Q&A session (if needed) before releasing the Use Case for the

concrete implementation work.

~ Keep the checklist for the quality gate as simple as possible but insist on a signed

and agreed quality gate 1


3.6 Data Sets and Technical Implementation

In the previous chapters primarily organisational tasks and measures were described.

The following section describes an example for the technical layer, focusing on related

procedures with respect to the previously described organisational part. As described

in section 2.3 it is recommended to establish a central Smart Data Platform

infrastructure for Integration, Processing and Data Analysis. The following section

describes necessary steps of the Data Platform and potential other Use Case-specific

systems.

3.6.1 Create Data Scheme and Interface Specification

The first main task of the Data Supplier is to create a Data Scheme based on the Data
Model outlined in chapter 4. In addition, she or he can consult the Platform
Administrator about the Data Scheme and the planned interface of the external Data
Source for data exchange.

Key questions arising for the Data Supplier can be:

– Is the data format compatible with the interface and the Smart Data Platform?

– Does the Data Set have all the necessary attributes to be correctly processed
in the Smart Data Platform? Does this necessitate further steps on the Supplier side?

– What sampling rate is needed - are different time granularities needed?

The Platform Administrator can assist in answering these questions by providing

established standard solutions that are in use for several other Smart Data Use Cases.
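
Two of the key questions above (required attributes and sampling rate) lend themselves to an automated pre-check on the Supplier side. The following Python sketch assumes records given as dictionaries with a numeric `timestamp` field in seconds; the function name, the field name and the gap rule are hypothetical and do not describe the actual platform interface.

```python
def validate_records(records, required_attributes, max_interval_seconds):
    """Return a list of human-readable problems found in the records.

    Checks that every record carries all required attributes and that
    consecutive timestamps are not further apart than the agreed
    sampling interval.
    """
    problems = []
    for index, record in enumerate(records):
        missing = sorted(set(required_attributes) - set(record))
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
    timestamps = sorted(r["timestamp"] for r in records if "timestamp" in r)
    for earlier, later in zip(timestamps, timestamps[1:]):
        if later - earlier > max_interval_seconds:
            problems.append(f"gap of {later - earlier} s after t={earlier}")
    return problems
```

An empty result means that no problem was detected by these two checks; it does not replace the joint test with the Platform Administrator.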

3.6.2 Create Technical Specification

The Service Developer, who designs and develops the main Use Case hardware
component or software application, has been involved in the previous requirements
engineering and concept steps. As a first main responsibility, he or she now elaborates
a technical specification for the External System that uses the API⁴ of the Smart Data
Platform.

As the conceptual approach is now translated into technical concepts for the first time,
a question that may arise is whether it is necessary to add human-generated content
to the Data Set. Do users generate data on their own while using e.g. an App?
Where is it stored safely? Either the Smart Data Platform or another service run by the
conceptual and technical developer could take over this task.

__________________________________________________________________________________

Remark: The following descriptions (3.6.3. – 3.6.6.) are kept rather short, as most

activities follow a standard IT implementation process.

__________________________________________________________________________________

3.6.3 Technical Specification Approval

As soon as the technical specification covering e.g. the interaction of users (citizens),
External System integration, Smart Data Platform (API, Analysis Dashboard) and Data
Source Interfaces has been finished, the Use Case Responsible can approve it in
consultation with the assigned responsible person or forum (see chapter responsibilities).

3.6.4 Implementation and Data Set Creation

Now the technical implementation of the Use Case content can start. The most important
task is to create the Data Set with all required attributes in the Smart Data Platform or,
if it already exists, to check its compatibility with the given Use Case. This is done
by the Platform Administrator. He or she further ensures the linkage of the Analysis
Dashboard and several Derived Data Sets and their visualisation for Data Analysts.

3.6.5 Implementation of Data Interface

The Data Supplier now implements the previously planned interface of the external Data
Source and tests the solution together with the Platform Administrator.

⁴ Application Programming Interface

3.6.6 Develop App or Hardware solution

Now the External System, which depending on the Use Case is a smartphone app or
a hardware solution, can be developed and tested by the Service Developers.

3.6.7 Use Case Reflection

The definition of the technical specifications mainly used established standard IT-

processing tools. One important aspect however was the discussion of potential data

formats, protocols and the sandbox testing of APIs and data flows. As most of the
project partners did not know each other personally before the project, it was
important to agree on a common process, to arrange meetings with specialists and to
arrange test flows before the final technical implementation took place.

3.6.8 Golden Rules

In this phase of the project, the people involved are normally different from those
in the preparation and Use Case definition phase. Although these people will have
very dedicated and specific IT-related work to do (e.g. installing and testing SW and
HW, programming algorithms and GUIs, etc.), it is very helpful to inform them about
the complete end-to-end process and background of the Use Case. For that
reason it is recommended to prepare standard documentation (e.g. slides)
explaining the Use Case and providing the most important facts and expectations
of the relevant stakeholders.

3.7 Data Processing

__________________________________________________________________________________

Remark: The following descriptions (3.7 – 3.7.3.) cover what is standard practice in many
IT projects. Nevertheless, some background information is provided within the scope of
this document to facilitate a better understanding of some Data Gatekeeper aspects for
readers who are not familiar with these IT-related details.

__________________________________________________________________________________

As the next activity, the realisation of the previously defined pre- or post-processing steps of the respective derived Data Sources is recommended. These designated processing steps now have to be backed by adequate processing functionalities, and the Classification has to be realized by means of processing. There is no standard processing; however, an exemplary procedure that covers the different aspects relevant for a Smart Data Platform is as follows:

1) extract, transform and load process

2) adequate processing of data according to following functionalities

3) load of data to Smart Data Platform

In the scope of this concept, the conceptual procedures Anonymisation and Pseudonymisation are understood as processing measures. The technical, more concrete realisation of such a measure can use different processing functionalities, for instance the removal of a technical identifier (ID). Function libraries, for example, are collections of software tools that can be applied to format any available raw data or raw information.

Technical key concepts relevant for processing are the Raw Data Set (see glossary), which stores data obtained from an external Data Source (via a technical interface), and the Derived Data Set, which uses an Algorithm to calculate modified Data Objects (see glossary) from the Raw Data Set.

There are two general possibilities (mechanisms/measures) for transforming data within the platform so that it can no longer be traced back to an individual.

– Pre-Processing: After the integration of raw data into the Smart Data Platform, data can be anonymised / pseudonymised before it is written into the platform’s Database.

– Post-Processing: The data is stored in the platform’s Database as a Raw Data Set and dynamically processed when accessed by a Derived Data Set. The platform’s Analysis Dashboard, in turn, accesses the Derived Data Set.
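The difference between the two mechanisms can be illustrated with a minimal Python sketch. The class `Platform`, the function `drop_person_id` and the field name `person_id` are hypothetical names chosen for illustration only; they are not part of the Smart Data Platform.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Transform = Callable[[Record], Record]


def drop_person_id(record: Record) -> Record:
    """Hypothetical processing function: remove the 'person_id' field."""
    return {k: v for k, v in record.items() if k != "person_id"}


class Platform:
    """Toy stand-in for the platform's Database (an in-memory list)."""

    def __init__(self) -> None:
        self.database: List[Record] = []

    def store_pre_processed(self, raw: Record, transform: Transform) -> None:
        # Pre-Processing: transform first, so the identifier never reaches storage.
        self.database.append(transform(raw))

    def store_raw(self, raw: Record) -> None:
        # Post-Processing, step 1: the Raw Data Set is stored unchanged.
        self.database.append(dict(raw))

    def derived_view(self, transform: Transform) -> List[Record]:
        # Post-Processing, step 2: transform on each access by a Derived Data Set.
        return [transform(record) for record in self.database]


pre = Platform()
pre.store_pre_processed({"person_id": "D42197334", "temp_c": 23.0}, drop_person_id)

post = Platform()
post.store_raw({"person_id": "D42197334", "temp_c": 23.0})
```

In the Post-Processing variant the identifier remains in the stored Raw Data Set, so access to the raw storage has to be restricted accordingly.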

In the scope of this concept, the two main tasks of the Platform Administrator are to derive the necessary processing measures and functions and to implement the processing functions. In consultation with all stakeholders responsible for data security and protection (possibly the Data Analyst and the Use Case Responsible), he or she is in charge of ensuring compliance with internal and external data privacy requirements.

The following subsections describe the two Data Processing Measures

Anonymisation/Pseudonymisation for person-related data and Data Aggregation.

Beyond these, further measures may be necessary, implemented and used in the

future.

3.7.1 Anonymisation and Pseudonymisation

Anonymisation of person-related data means processing it with the aim of irreversibly

preventing the identification of the individual to whom it relates. It must also be

impossible to identify any individual from the data by any further processing of that

data or by processing it together with other information which is available or likely to

be available.

It is rather difficult to predict if a particular technique will be 100% effective in

protecting the identity of Data Subjects except for methods offering formal

guarantees. However, it is possible to minimise the risks to Data Subjects when

anonymising data. Identification means the possibility of retrieving a person's name

and/or address, but also the potential identifiability by singling out, linkability and

inference.

Pseudonymisation of person-related data means replacing any identifying

characteristics of data with a pseudonym, or, in other words, a value which does not

allow the Data Subject to be directly identified.

Although pseudonymisation has many uses, it should be distinguished from

anonymisation, as it only provides a limited protection for the identity of Data Subjects

in many cases as it still allows identification using indirect means. Where a pseudonym

is used, it is often possible to identify the Data Subject by analysing the underlying or

related data.

This means that personal data is processed in such a way that it can no longer be attributed to a specific Data Subject without the use of additional information. In order to pseudonymise a Data Object, the “additional information” must be kept separately and be subject to technical and organisational measures that ensure non-attribution to an identified or identifiable person. In sum, it is a privacy-enhancing technique in which directly identifying data is held separately and securely from the processed data to ensure non-attribution.

The following Data Processing Functionalities realize the Anonymisation and Pseudonymisation Measures:

Removing person-related Data Fields

For a full anonymisation of the transferred Data Sets, Data Fields containing person-related data can be removed completely before the Data Set is written into the Database. These Data Fields need to be specified by the data provider and the other Use Case stakeholders.
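A minimal sketch of this functionality in Python is given below. The function name `remove_fields` and the field names in the example record are assumptions made for illustration; in practice the list of person-related Data Fields comes from the agreement with the data provider.

```python
from typing import Dict, Iterable


def remove_fields(record: Dict[str, object], person_related: Iterable[str]) -> Dict[str, object]:
    """Return a copy of the record without the specified person-related Data Fields."""
    blocked = set(person_related)
    return {name: value for name, value in record.items() if name not in blocked}


# Hypothetical record; only the measured value is kept.
cleaned = remove_fields(
    {"name": "Jane Doe", "address": "Example Street 1", "temp_c": 21.5},
    ["name", "address"],
)
```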

Removing ID digits

Before writing raw Data Sets into the Database, the person-related IDs indicated in

specific Data Fields can be reduced by a number of digits to be specified by the data

provider and other Use Case stakeholders. If the Data Set contains a person-related

ID as for example “D42197334”, it can be written into the Database as e.g. “D42197”

or “D42197XXX”. The number of digits to be cut off needs to ensure that no data can

be referenced to a specific person.
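The two variants from the example above (cutting digits off or replacing them with a filler character) could be sketched as follows. The function names `truncate_id` and `mask_id` are hypothetical; the number of digits to cut remains a decision of the Use Case stakeholders.

```python
def truncate_id(identifier: str, digits_to_cut: int) -> str:
    """Cut the given number of trailing characters off a person-related ID."""
    if not 0 <= digits_to_cut <= len(identifier):
        raise ValueError("digits_to_cut must be between 0 and the ID length")
    return identifier[: len(identifier) - digits_to_cut]


def mask_id(identifier: str, digits_to_cut: int, filler: str = "X") -> str:
    """Replace the trailing characters with a filler so the ID keeps its length."""
    return truncate_id(identifier, digits_to_cut) + filler * digits_to_cut


truncated = truncate_id("D42197334", 3)
masked = mask_id("D42197334", 3)
```

Note that a shortened ID only protects individuals if enough different original IDs map to the same shortened value.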

Creating hash values

A hash value is a numeric value of fixed length that is computed from the data and identifies it; with a suitable hash function, two different inputs are very unlikely to produce the same value. Hash values represent large amounts of data as much smaller numeric values, which is why they are used with digital signatures: signing a hash value is more efficient than signing the larger original value.

For data pseudonymisation purposes, a hash value can be created for a person-related ID before the data is written into the Database. The content of a specific Data Field is replaced by a new “string” created with the algorithm of the hash function. If a secret key is used for creating the hash value, the hash value can only be linked back to the original Data Values with that key. The key can either be deleted or stored safely in a different system.
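One possible way to realize such a keyed hash is an HMAC, sketched below with the Python standard library. The choice of HMAC-SHA256 and the function name `pseudonymise_id` are assumptions for illustration; the concept itself does not prescribe a specific hash function.

```python
import hashlib
import hmac
import secrets


def pseudonymise_id(identifier: str, key: bytes) -> str:
    """Replace a person-related ID by its keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# The key is generated once and then deleted or stored in a separate system.
key = secrets.token_bytes(32)
pseudonym = pseudonymise_id("D42197334", key)
```

The hash cannot be reversed. Whoever holds the key can, however, recompute the hash of a known ID and thereby link it to the stored pseudonym, which is why the key has to be kept separately or deleted.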

3.7.2 Data Aggregation

The Processing Measure “Data Aggregation” is understood as the compiling of information from one or several Data Sets with the intent to prepare a combined Derived Data Set that does not include all information from the original Data Sets. A set of Processing Functionalities that ensure data protection requirements for the aggregation of data is described in the following.

Remove subcategories of a Data Set

Subcategories of a Data Set can be removed. If a Data Field contains a category

and there are more subcategories specifying this category even further, these

subcategories can be removed. If the Data Set for example contains a location

category “City” and the subcategory “district” or “street”, the latter two

subcategories can be removed so that the Data Set is aggregated to City-level.

Create an average value for a defined period of time

If Data Sets contain measured values for an interval of e.g. 5 minutes, but due to Data Protection guidelines only measured values for e.g. an hour should be processed in the platform, an average value for the desired period of time can be calculated (either before writing the Data Set into the Database or before further processing in the analysis module).
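As a minimal sketch, 5-minute readings can be averaged per hour as shown below. The function name `hourly_average` and the representation of a reading as a (timestamp, value) pair are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def hourly_average(readings: Iterable[Tuple[datetime, float]]) -> Dict[datetime, float]:
    """Group readings by the hour they fall into and average each group."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, value in readings:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


readings = [
    (datetime(2018, 3, 13, 10, 0), 22.0),
    (datetime(2018, 3, 13, 10, 5), 23.0),
    (datetime(2018, 3, 13, 10, 10), 24.0),
    (datetime(2018, 3, 13, 11, 0), 20.0),
]
averages = hourly_average(readings)
```

Only the hourly averages would then be written into the Database or passed on to the analysis module; the individual 5-minute values are discarded.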

Aggregate (sub-) categories

Categories of a Data Set can be aggregated to a “higher” category level. If a Data

Set contains for example the category “district”, the districts “1” and “2” can be

aggregated to the new category “1/2” to provide a higher aggregation level.
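The two category-based functionalities (removing subcategories and aggregating categories to a higher level) could be sketched as follows. The function names `drop_subcategories` and `merge_categories` and the example records are hypothetical.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def drop_subcategories(records: Iterable[Record], subcategories: Iterable[str]) -> List[Record]:
    """Remove subcategory fields so that only the higher category level remains."""
    dropped = set(subcategories)
    return [{k: v for k, v in record.items() if k not in dropped} for record in records]


def merge_categories(records: Iterable[Record], field: str, mapping: Dict[str, str]) -> List[Record]:
    """Replace category values according to the mapping, e.g. '1' and '2' become '1/2'."""
    return [{**record, field: mapping.get(record[field], record[field])} for record in records]


records = [
    {"city": "Munich", "district": "1", "street": "Example Street"},
    {"city": "Munich", "district": "2", "street": "Sample Road"},
    {"city": "Munich", "district": "3", "street": "Test Lane"},
]
city_level = drop_subcategories(records, ["district", "street"])
merged = merge_categories(records, "district", {"1": "1/2", "2": "1/2"})
```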

3.7.3 Golden Rules

– Consultation: inform the Data Supplier about changes. The Data Analyst or the Use Case Responsible may also be consulted to clarify their concrete ideas for the desired Data Analysis Queries, in case these are compliance-critical or difficult to realize.

3.8 Quality Gate 2

The second quality gate ensures the final testing and approval of the Smart Data Use Case. It contains the specification and implementation reviews, a Use Case integration test and a pilot phase for testing the External System with a small number of people. A subsequent test by the Data Analyst allows the Data Analysis Queries to be adjusted based on data and experiences from the pilot phase. Finally, the Use Case is completed by a final approval and the Go Live of the Use Case.

3.8.1 Specification and Implementation Review

As soon as the implementation of the technical solutions inside and outside of the Smart Data Platform is completed, the Use Case Responsible reviews the specification and implementation and passes her or his comments and feedback to the responsible developers (Data Steward, Service Developer and Platform Administrator).

3.8.2 Use Case Integration Test

The Use Case integration test by the Strategic Manager is a last laboratory test of all external components under stress conditions. Minor shortcomings can be corrected by the developers. The Data Analysis Queries can still be in a prototype status, as only a basic test with mocked data can be performed in the laboratory.

3.8.3 Go Live Pilot Phase

A pilot phase within a fixed time frame tests the External System with a small number of test users. Depending on the analysis requirements, the Data Analysis Queries can be substantiated by means of real data from the test users. The Use Case Responsible is in charge of the pilot testing.

3.8.4 Analysis Test

The Data Analyst test allows the Use Case Responsible and other Analysts who are interested in findings from Smart Data to test the Data Analysis Queries in the platform’s Analysis Dashboard. They can raise usability issues and request adjustments from the organizing Strategic Manager.

3.8.5 Final Approval and Go Live

3.8.6 Golden Rules

– Refer to existing test scenarios of your IT department or IT outsourcing partner.

4 Data Model

__________________________________________________________________________________

Remark: The following descriptions of the Data Gatekeeper’s Data Model should not be seen as a fixed and “ready to use” blueprint. Although a Data Model is important and necessary to implement all stakeholders’ requirements and processes in a Data Platform, the present Data Model, while covering all main elements needed, is “preliminary” and “project specific” and requires a more detailed description in possible future updates of the Data Gatekeeper. It should be seen as one possible example of how a Data Gatekeeper’s Data Model can be designed. The Data Model strongly depends on the particular Use Case and focus of an individual Data Gatekeeper implementation. The design of the Data Model as well as its implementation in an existing Data Platform requires specialised know-how and design matching from both IT Architects and IT Programmers.

__________________________________________________________________________________

This chapter contains the design and description of the Data Model, which summarizes the considerations and specifications from the previous chapters as a formal relationship glossary. It is arranged as a minimal set of information necessary for the implementation of data handling (described in the Data Gatekeeper Concept). The Data Gatekeeper (DGK) Registry is a software instance within the Smart Data Platform that assures the process requirements of the Data Gatekeeper’s Data Model implementation within the Data Platform. The DGK Registry, for instance, is only one possible realization of many but can serve as a template for IT responsibles like the Service Developer or the Data or Platform Administrator.

The Data Model is a scheme that translates the Use Case wordings and options into logical IT-oriented structures. The goal of the Data Model is a comprehensive structure, including all available and selectable options, that characterizes the components needed within the Smart Data Platform and their interplay with the Data Gatekeeper concept. Data Schemes elaborated in the scope of a Use Case (see section 3.6.1) need to comply with the Data Model and, together with the Data Schemes of all other Use Cases, realize the concrete Data Model of the Smarter Together Platform. Several organisational aspects are not included in the Data Model. For instance, only users with access to the system are represented.

The Data Model in Figure 6 shall ensure standards in syntax and allow a semantic

representation of all compliance and licensing aspects.

Figure 6: Data Model: Inside Smart Data Platform and External

Figure 7: Data Model: Smart Data Systems

Figure 6 and Figure 7 show data entities relevant for the Data Gatekeeper Concept in

UML class diagram notation. They represent a conceptual model including classes, their

relationships including cardinalities (partially annotated) as well as attributes of the

class entities. The two Data Models are connected and interacting (see appendix,

7.6).

Figure 6 illustrates the Data Model inside the Smart Data Platform and external (Figure 7). There are two kinds of Data Sets. Raw Data Sets are obtained from an external Data Source and stored in the platform’s Database. Derived Data Sets refer to a Raw Data Set and are dynamically calculated by the system. A Derived Data Set needs an Algorithm and one or several Raw Data Sets to obtain data from. The external Data Sources stream data at a particular update rate to their Data Set counterpart. This update rate may be a long time period, e.g. once per 5 years in the case of a database, or a short time interval, e.g. once per minute in the case of a sensor. A Data Set holds various Data Objects (see glossary) and has various Data Fields.
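The relationship between Raw Data Sets, Derived Data Sets and the Algorithm can be sketched with Python dataclasses. This is an illustrative reading of the description above, not the Data Model of Figure 6 itself; the class layout and the conversion algorithm `to_fahrenheit` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

DataObject = Dict[str, float]


@dataclass
class RawDataSet:
    """Data obtained from an external Data Source and stored in the Database."""
    name: str
    data_fields: List[str]
    data_objects: List[DataObject] = field(default_factory=list)


@dataclass
class DerivedDataSet:
    """Refers to one or several Raw Data Sets and is calculated on access."""
    name: str
    sources: List[RawDataSet]
    algorithm: Callable[[List[RawDataSet]], List[DataObject]]

    def data_objects(self) -> List[DataObject]:
        # Nothing is stored: the Algorithm runs each time the set is accessed.
        return self.algorithm(self.sources)


def to_fahrenheit(sources: List[RawDataSet]) -> List[DataObject]:
    """Hypothetical Algorithm: convert Celsius readings of the first source."""
    return [{"temp_f": obj["temp_c"] * 9 / 5 + 32} for obj in sources[0].data_objects]


raw = RawDataSet("External Temperature", ["temp_c"], [{"temp_c": 20.0}])
derived = DerivedDataSet("External Temperature (F)", [raw], to_fahrenheit)
```

Because the Derived Data Set is computed on access, new Data Objects arriving in the Raw Data Set are reflected without any additional update step.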

Each Data Set can further be assigned to one or several Smart Data Use Cases. A Use

Case is characterized by a title and a description. The further existing documentation

on the Use Cases (see section 3.2 ff.) is not relevant and not stored in the Smart Data

Platform.

A Data Set refers to exactly one Data Classification. The attribute “Is Open Data” determines whether the corresponding Data Set is under an open data licence, which has an impact on processing permissions. The attribute “Time-to-live” can either be null, if there is no duty to delete this Data Set after a particular time period, or be set to a time period after which it should be deleted. The time period starts at the time of creation of the corresponding Data Object.

Several levels of privacy indicate the general Classification of all Data Objects

belonging to this Data Set. The different Classification attributes I, P, R and S can each

have a different level from 1 to 4, depending on the protection needs of the data.
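The attributes of a Data Classification described above could be represented as in the following sketch. The class and attribute names, the validation and the `is_expired` helper are assumptions for illustration; only the attributes “Is Open Data”, “Time-to-live” and the levels I, P, R and S (1 to 4) are taken from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass(frozen=True)
class DataClassification:
    is_open_data: bool
    time_to_live: Optional[timedelta]  # None means no duty to delete
    levels: Dict[str, int]             # Classification attributes I, P, R, S

    def __post_init__(self) -> None:
        if set(self.levels) != {"I", "P", "R", "S"}:
            raise ValueError("levels must contain exactly the keys I, P, R and S")
        if any(not 1 <= level <= 4 for level in self.levels.values()):
            raise ValueError("each level must be between 1 and 4")

    def is_expired(self, created_at: datetime, now: datetime) -> bool:
        """The time-to-live starts at the creation time of the Data Object."""
        if self.time_to_live is None:
            return False
        return now >= created_at + self.time_to_live


classification = DataClassification(
    is_open_data=True,
    time_to_live=timedelta(days=30),
    levels={"I": 1, "P": 1, "R": 2, "S": 1},
)
created = datetime(2018, 3, 13)
```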

Figure 7 shows the Data Model of the Smart Data Systems. Relevant entities are the

Transparency Dashboard, the Analysis Dashboard, the Access Authorization, the DGK

Registry, the Smart Data Platform API and the External System.

The Data Gatekeeper (DGK) Registry contains Aggregation rules, Pseudonymization

rules and Restrictions and is also related to the Access Authorization. The Access

Authorization is determined by the DGK Registry and grants access to both the Analysis

Dashboard and the Smart Data Platform API, which is accessed by External Systems in

order to access Data Sets and Data Objects from the platform.
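How the Access Authorization determined by the DGK Registry might guard the Smart Data Platform API is sketched below. All class and method names are hypothetical, and the registry is reduced to a simple list of grants; Aggregation rules, Pseudonymization rules and Restrictions are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DGKRegistry:
    """Toy registry: maps each user or External System to the Data Sets it may access."""
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, user: str, data_set: str) -> None:
        self.grants.setdefault(user, set()).add(data_set)

    def is_authorized(self, user: str, data_set: str) -> bool:
        return data_set in self.grants.get(user, set())


class SmartDataPlatformAPI:
    """Returns Data Objects only if the registry authorizes the caller."""

    def __init__(self, registry: DGKRegistry, data_sets: Dict[str, List[dict]]) -> None:
        self.registry = registry
        self.data_sets = data_sets

    def get_data_set(self, user: str, name: str) -> List[dict]:
        if not self.registry.is_authorized(user, name):
            raise PermissionError(f"{user} may not access {name}")
        return self.data_sets[name]


registry = DGKRegistry()
registry.grant("smart-home-app", "Humidity")
api = SmartDataPlatformAPI(registry, {"Humidity": [{"rel_humidity": 45.0}]})
```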

All Users relevant for the Smart Data Platform and their possible interactions with System

Components are further visualised in Figure 8.

User: Interactions with System Components

Use Case Responsible:
– Owns the Smart Data Use Case
– Agrees on the Data Classification
– Decides on the Access Authorization for his or her Use Case

Data Analyst:
– Can read the Analysis Dashboard

Platform Administrator:
– Operates the DGK Registry

Data Supplier:
– Owns the external Data Source belonging to a Raw Data Set
– Sets the Data Classification

Figure 8: Users relevant for the Smart Data Platform

4.1 Exemplary Data Model

Figure 9 is an example of a Data Model of a concrete Smart Data Platform with two

Use Cases.

Figure 9: Data Model of a specific Smart Data Platform with two Use Cases

– Both Use Cases have two Data Sets. The Use Case “Smart Home” uses the Data Sets “Internal Temperature” and “Humidity”. The Data Set “Internal Temperature” contains Raw Data provided by an external temperature sensor as well as a so-called Feel-good factor, which is calculated by an algorithm using raw data provided by the two external Data Sources, a thermometer and a hygrometer sensor, and a calculation rule. The Data Set “Humidity” is a Raw Data Set provided by the hygrometer sensor and does not contain Derived Data. Every Data Set has a Data Classification. In this data model, for example, the internal temperature is 23 degrees Celsius on 13 March 2018. This Data Set is classified as open data, has a time-to-live of one month and has specific levels of privacy.

– In the Use Case “Lamppost” the two Data Sets used are “External Temperature”

and “Air Pollution”. The Data Set for “External Temperature” consists of Raw

Data provided by a temperature sensor. The Data Set “Air Pollution” consists of

both Raw Data, provided by an external source, and Derived Data. The Raw

Data of this Data Set is generated by an Air Pollution Sensor once per minute

and the Derived Data is calculated by the algorithm “Pollution calculation”. In

contrast to the Derived Data Set “Feel-good Factor” from the Use Case “Smart

Home”, the Derived Data Set “Air Pollution Index” uses only one Raw Data Set

in the algorithm “Pollution Calculation” in order to determine the Derived Data

Set.

5 Smart Data Infrastructure

The Smart Data Platform converts available raw data into Smart Data and serves as a collaborative platform for smart services that allows an efficient data exchange between City stakeholders. It offers innovative data analysis methodologies and needs to meet many data integration and analysis requirements while at the same time ensuring data privacy and security.

The following chapter first summarizes the technical requirements for the platform and then describes the basic architecture. The core features data integration, connection with External Systems and data analysis are then presented in more detail, before the Transparency Dashboard is introduced as an important feature for data processing transparency towards citizens.

Overall, this chapter offers an exemplary platform architecture based on the Use Case of Munich and aims to serve as a template for IT responsibles such as the Service Developer and the Data and Platform Administrators. In addition, its structure can be used by the Strategic Committee as a basis for an announcement for a Smart Data Platform.

Before main sections follow, an executive summary briefly summarizes key issues of the

chapter.

5.1 Basic Architecture

The basic principle of the recommended Smart Data Platform is to develop services

and applications for various new types of data and Use Cases. It offers an open

interface to enable an IT ecosystem around its deployment. This includes the possibility

to exchange raw and aggregated data as well as results from data analytics. It also

offers the possibility for third parties to develop their own applications making use of

open interfaces. Most technical innovations are being made within the analytics and

evaluation modules.

The platform furthermore enables the integration of different City domains, such as

Water, Energy, Mobility, Buildings and others, and helps to obtain synergies to enhance

the efficiency of urban infrastructure.

The Smart Data Platform offers mechanisms to extract, transform, load (ETL), store and

analyze data, as a basis for new services with dedicated user interfaces.

Figure 10 shows the layer architecture of the Smart Data Platform and an exemplary bundle of potential Use Cases (at the bottom).

Figure 10: Smart Data Platform Layer Architecture

The Smart Data Platform provides the following core features:

External Systems realize Smart Data Use Case functionalities such as water asset

management, building energy monitoring & controlling or mobility functionalities.

These systems deliver data in various formats and demand flexible and robust

integration techniques.

Data Integration/ETL is performed to obtain data from External Systems and to load

data into an integrated schema. Data integration is important to get a complete view

of the problem context.

Datawarehouse stores data in a persistent manner and enables the composition of

new data. The Datawarehouse must be scalable and deliver results in a timely manner.

Analytics Modules include various analysis algorithms that provide insights into

complex processes and dependencies. Algorithms must be efficient in computing

results.

APIs/Services provide access to data and functionality in a standardized manner. This

helps to establish an ecosystem based on data and services.

Web Dashboards provide means for visualisation and presentation of results to a

variety of users.

Figure 11: Smart Data Platform Layer Architecture and Data Gatekeeper Registry

5.2 Data Integration

A core feature of the platform is the storage of Data Sets as well as the import and just-in-time integration from technical interfaces provided by Data Suppliers (see glossary). The platform replicates all data streamed from external Data Sources (see glossary) and allows data management, Linkage and Aggregation by means of Derived Data Sets (see glossary). Some processing steps are performed automatically by the Smart Data Platform. For others, manual implementations are necessary.

As described in chapter 3, based on the agreements of the Data Supplier, the Use Case Responsible and other involved stakeholders, the Data Classification and Categorisation are configured by the Platform Administrator. The Admin Tool of the DGK Registry component allows the Platform Administrator to create the Data Sets and their Classification, while other manual interventions and code adjustments may be needed. There are no automated processes for the creation of Use Cases and related Data Sets or for Analysis Queries. The present Data Gatekeeper concept thus cannot be seen as a process description implementing any automated process to assure data privacy or to eliminate any data privacy leaks. All necessary steps are described, but the implementation still needs to be done in a stepwise and manual approach.

The configuration is considered by the DGK Registry component accordingly. The

Data classification is matched to the different Data Sets during the data integration.

In case Pre-Processing has been configured (cp. 3.7 on Data Processing) before Raw

Data Sets are stored, it is necessary to anonymize or pseudonymize and aggregate

the data as specified. Figures xx illustrate the core functionalities of the Smart Data

Platform that are coordinated by the Admin Tool.

Figure 12: Platform Core Functionalities and their Administration

The Smart Data Platform foresees two ways of accessing the integrated data. The first

way is to access the visualized analyses of data in the analysis dashboard. The second

way is to request data over an API for e.g. integration in applications. In both cases of

data access, the user needs to have the access rights to receive certain data. The

Smart Data Platform grants access to certain Data Analysts or External Systems

connected via API with the implemented multi-client-capability.

5.3 Connection with External Systems

One important functionality of an integrated Smart Data Platform is the easy access

of various Smart Data applications and Systems that provide smart services to citizens

or other Smart Data users.

The linkage of the External Systems developed by the Service Developer is done by an

API. In order to obtain data the API user needs to register for an API Gateway. Users

are granted access to specific APIs in line with data Classification and their specified

access rights.

Access to a specific API will be granted after approval by the Use Case stakeholders,

especially the data provider. The endpoint of the API is secured following current

security standards. Data can be accessed from the API as specified in the API’s

documentation.

Short Summary of “Flow of Data, Technical Implementation and underlying Architecture”

The technical description and implementation of all previously discussed and defined parameters is the last step before the Data Gatekeeper rules can be executed in the Data Platform. This work cannot be executed automatically: it requires special IT skills as well as a deep understanding of the Use Case, but it is an integral part of the end-to-end process of the Data Gatekeeper implementation. The Data Model, for example, describes all logical sequences that need to be followed by the Smart Data Platform, so that e.g. the Use Case specific data Classification is met during the complete analysis process. The programmer or architect of the Smart Data Platform gets precise information about how the IT system needs to behave to fulfil a dedicated Use Case. For that purpose, the Data Gatekeeper Registry is a special program entity in the underlying Smart Data Platform that enables a structured input of the Data Gatekeeper rules, which then influence the other data handling algorithms of the Smart Data Platform accordingly.

Also the input and output interfaces to transfer the raw data into or from the Smart

Data Platform based on the given Data Gatekeeper rules as well as the underlying

data exchange formats and security rules need to be described, implemented and

tested in order to guarantee a frictionless functioning.

5.4 Analysis Dashboard

Another important feature of a Smart Data Platform is the existence of various analysis capabilities for users interested in statistical findings. These capabilities are offered not only to specific users but to all users of a particular Use Case, e.g. the Use Case Responsible and Data Analysts. The two user interfaces Analysis Dashboard and Platform Statistics Dashboard address those users.

The different data stored in the platform’s Database is used by the analysis module of

the platform to calculate certain data interpretations. The Data Set combinations and

their visualisations can be browsed by Data Analysts in the Analysis Dashboard user

interface. Figure 11 shows an exemplary analysis user interface for the analysis of

passenger amounts in public transportation.

Figure 13: Exemplary analysis of Smart Home Use Case (Source : TUM)

To use the Analysis Dashboard, the user needs to log in. He or she is only able to access the visualized analyses covered by his or her access rights.

As a boundary condition, the analysis of data and the Data Analyst access rights must be in line with the Data Classification, which is ensured by the Use Case Creation Process and the Data Usage Agreement between Data Owner and City.

5.5 Platform Statistics Dashboard

The Platform Statistics Dashboard enhances the capabilities provided to the Data Analysts by means of analyzing the usage statistics of the External Systems and the corresponding API accesses registered in the Smart Data Platform. Targeted analyses of user workflows can support the continuous improvement of Use Cases.

5.6 Transparency Dashboard

Another important feature of this particular Smart Data Platform approach is the Transparency Dashboard, which allows citizens to get an overview of all data Classification, processing and analysis undertaken for a particular Smart Data Use Case. The opportunity to gain insights into all processing and analysis steps the data is used for builds trust in the City as Smart Data operator and addresses the strategic goal of building a positive and transparent image.

Figure 14: Example of a Transparency Dashboard Prototype for Smarter Together Munich (Source: VMZ)

The Transparency Dashboard allows external Smart Data Users, data providers, Use Case Responsibles and other project stakeholders to provide the list of available data and their “fact sheet”, including the current Categorisation, on a website, also giving information on how to access the data as a third party.

The Transparency Dashboard is realized as an open website for interested citizens. It is built as a static website that is administrated by a standard content management system. Contents are based on the Use Case Concept Document and the Data Classification Sheets.

Short Summary of “Analysis and Transparency Dashboard”

The appropriate visualization of the Use Case input and output parameters is one key

element to assure transparency and to gain acceptance of the stakeholders and end

users. The Analysis Dashboard summarises the main Use Case specific parameters that

are used to define the Data Gatekeeper rules. The most important task of the Analysis

Dashboard however is to offer reasonable and comprehensible presentations of the

analysis, taking into account the possible restrictions that have been imposed by the

Data Gatekeeper rules.

The Transparency Dashboard is a Web-page that is designed for citizens. It can

contain an abstract of the Use Cases, a list of Data Sources, analysis goals and applied

data privacy rules. Its main task is to inform a broader public about the background

of the Use Cases, the possibilities to get access to related open data, to describe

which sensors have been used or to offer the possibility to get in touch with the City

stakeholders. It is one possible IT-tool for a City to demonstrate transparency in Smart

City or Digitalisation projects.

6 Conclusion and Outlook

Why

The present Data Gatekeeper Concept was created to establish a possible new view

on how to handle data in terms of data security and data privacy within an innovative

Smart City approach. Dealing with a Data Gatekeeper concept in a Smart City

environment automatically leads the stakeholders to consider a huge complexity and

variety of items, ranging from legal and compliance aspects and an adequate

organisation to various fields of expertise and, finally, IT integration.

How to see it

The concept should therefore not only be looked at as a technical model of how to

handle data. It also offers a lot of discussion points of what a new view on a Smart City

could be in the future.

Looking at the resulting complexity and number of triggered items when discussing a

Data Gatekeeper approach, the present concept should be looked at as a starting

point, underlining its character as a “living document” or, going forward, as a “Blueprint”.

How to work with it

Participation in the concept, feedback and the sharing of additional experiences from other cities will therefore help the Data Gatekeeper to evolve step by step into a more robust and practical approach. It is mainly up to the participating cities to trial this approach, discuss it and improve it step by step over time.

Although this concept is a first step, it can already be used in City administrations to

rethink or align the way in which the future City building material “Smart Data” needs to

be handled.

First conclusion

The complex items discussed in this concept show that data and digitalization processes, as well as the involvement of citizens in an appropriate Participation process, will play an outstanding role in any strategic approach to a future Smart City concept. A smooth transformation of processes and a cultural change among people are needed.

What it is for

The Data Gatekeeper, with its focus on terms like “data handling”, including “data security” and “data privacy”, is the filter for correctly collecting the Smart Data that are needed to help cities worldwide face their enormous challenges in areas such as traffic, air pollution and the housing situation and, last but not least, to help guarantee the economic attractiveness of the City for investors.

Why use it

The problems mentioned can be reduced by using European or worldwide solution approaches that take into consideration new administrative procedures (“think and act beyond traditional borders”), cooperative exchange of experiences and harmonised political courses of action. As an example, it does not seem reasonable that every European City would describe and develop its own, different Data Gatekeeper

approach. Only the intensive exchange of experiences and innovative ideas

will finally bring appropriate answers.

What is the benefit of it

New and smarter Data Sources are needed to illustrate the status quo and to help analyse future scenarios. This is a key asset in a modern City administration. Furthermore, establishing new Data Sources such as innovative sensors and actuators, which add new information on how the City and its inhabitants behave over time, is a crucial task and benefit for any City planning process. These aspects are often extremely challenging for a City, especially when dealing with an organizational setup of traditional department silos.

Cities will also need convenient and innovative planning tools for the future. These tools will only help to solve the above-mentioned challenges when more usable Smart Data are provided and more intelligent analysis methodologies are incorporated.

Some ideas for intelligent planning and simulation of very complex processes in the above-mentioned areas are already evolving or are currently scheduled. Intelligent “Digital Twins” of a City, for example, could enable innovative and predictive planning simulation. Such approaches could be a first step towards projecting which practical solutions would be realisable in a “real City” environment.

All of these potential benefits for a future Smart City depend mainly on two things:

First is the availability of the right Use Cases. The creation of a valuable Use Case is a challenging and time-consuming process that should not be underestimated.

Second is the availability of data that can be considered smart enough (and usable for a City) to develop the required analyses that help to solve the initial problems.

New problems

New technologies lead to even more data, and over time citizens might become more and more skeptical when a City talks about sensors and innovative data collection and analysis. Data privacy and the awareness of the potential misuse of data are becoming a focus area for many involved citizens and companies. For that reason, a Smart City approach without reasonable information and Participation processes seems very hard to execute over time.

On the other hand, discussions about growing criminal data hacking, data leakages and the almost uncontrollable commercialization of personal and private data also affect cities in their new and growing role as Smart Data handlers. When it comes to Smart City solutions, the City needs to provide suitable technical and organizational answers to the above-mentioned challenges and the citizens’ concerns. A Smart City approach needs to position itself as a trustworthy and transparent data handler that is committed only to the wellbeing of the citizens and a balanced City itself.

Taking changing legal situations into consideration and adapting existing or planned processes to them might also create problems, as the initial conditions may have been changed by new regulations over time.

Final conclusion

Here is where the Data Gatekeeper comes into play again. It is a strong approach and vision for a holistic view on how Smart City data should be handled in the future. It is also a concrete tool that improves step by step and that cities can use, introduce and further integrate into their processes to underline their willingness to use Smart Data as an indispensable future raw material and at the same time be a transparent, trustworthy and fair data player.

But it is only when cities really adopt the idea of the current Data Gatekeeper

concept, pilot it, improve and enhance it over time, that it will become a robust and

trustworthy model that can be implemented in any given Smart City administration.

Actively communicating and openly exchanging the individual

experiences, hurdles and successes between the cities and their stakeholders will help

to accelerate this process.

A possible future “Data Gatekeeper Inside” label for European Cities is certainly a

strong vision that would emphasize the trusted position a Smart City should hold for all

the stakeholders in the complex network who are needed to help implement the

solutions that will eventually solve the upcoming challenges.

Glossary

Term Definition

Basic Use Case Description

Key questions for each Use Case, such as user goals (Data Analyst

as well as Smart Data Users), analysis purposes, user tasks, internal

and external stakeholders etc.

City In the scope of this concept, a City can mean a city such as Munich, an urban district of a city or a consortium of several cities. This depends on the local or regional granularity at which a Smart Data Platform is run and managed. Roles defined per City can therefore also mean roles per urban environment.

Categorisation Within this concept, Data Sets are first assigned to Categories specified in advance with regard to their protection needs (see Data Categories).

Categories / Categorisation Dimensions

The following Data Categories are distinguished in the Categorisation within this concept:

Category 1: open data

Category 2: non-personal data but not open

Category 3: non-personal data but restricted

Category 4: personal data

Classification Data Classification is understood as the assignment of data security levels to a Data Set stemming from a Data Source. This is necessary for compliance reasons. Within this concept, the Data Set is rated quantitatively (rating level 1–4) in 4 Classification Dimensions. Further, whether the Data Set is Open Data or not (refer to the respective Category) as well as a time-to-live for the respective Data Objects is assessed as part of the Classification.

Classification Dimensions

In this concept the following Classification Dimensions are used to

quantify the Classification for each Data Set:

• I for integrity and data protection (Level I1 – I4)

• P for processing and analysis (Level P1 – P4)

• R for redistribution and modification (Level R1 – R4)

• S for storage and deletion (Level S1 – S4)

Data Administrator Technical role on Data Supplier side ensuring the technical

provision of data to the Smart Data Platform. He or she provides

technical and data scheme information during the Smart Data

Creation Process. Further he or she helps out with support to

various Use Case roles and works on technical tasks related to

the data ownership of the Data Owner.

(example for task: Operations)

Data Analysis Query

Either the Use Case Responsible or several Use Case-specific Data Analysts, e.g. from City departments, request insights from Smart Data. In the scope of this concept, a Data Analysis Query is a concrete request for combined or simple data analysis raised by a Data Analyst or Use Case Owner. The Analysis Dashboard visualizes several Data Analysis Queries by means of Derived Data Sets. More precisely, it is a Query that processes one Data Set or concatenates several Data Sets.

(example for task: Operations)

Data Analyst

Various users with an interest in gaining insights from Smart Data, such as

employees of various city departments. Within the concept

they have access to dedicated Data Sets and related Data

Analysis Queries using the Data Analysis Dashboard and can

analyze them for granted purposes of use.

(example for task: Operations)

Data Field Label describing all single Data Values per data column

Data Model In the scope of this concept, the Data Model describes many possible flexible Data Models, e.g. consisting of two Use Cases and two related Data Sets or five Use Cases referring to six different Data Sets.

Data Object Row of Data Values with multiple Data Fields and corresponding

Data Values

Data Owner The Data Owner is a leading person on the Supplier side. He or she defines the general or specific purposes for which data may be used and plays a key role in decisions on a related Smart Data Use Case.

(example for task: Data Provision)

Data Protection Officer

The Data Protection Officer ensures compliance with Legal Regulations such as data privacy. He or she is legally responsible for the whole data platform including all Use Cases and ensures that the data are used only for purposes the Data Owner agrees with.

(example for task: Data Consumption)

Data Scheme Schematic structure of one or several Data Sets related to one

Use Case provided by the Data Administrator to the Platform

Administrator

Data Set Set of Data Objects that is hosted on the Smart Data Platform

and

Data Source External System providing a regular data stream that stems from

sensors or databases

Data Subject For personal data as well as for usage data related to humans,

the Data Subject is the person that owns the data and can

decide whether she or he wants to allow a respective purpose

of use e.g. by a Use Case or by the City administration. The Data

Subject is granted many rights on his or her personal data.

Data Supplier The Data Supplier is a company or an organisation supplying a

Data Set to the Smart Data Platform. This regularly happens in the

scope of a Smart Data Use Case.

Data Value Atomic data value that is described by a Data Field

Data Interface A technical interface (e.g. REST webservice) regularly providing

a stream of datasets for a Data Set

Database Information system persisting data for business transactions and

analysis

External System External System that either is an interactive system used by Smart

Data Users or a system serving the Smart Data Platform as a Data

Source

Infrastructure Data The most common domain of Master Data in a City context. In

contrast to real-time data, Master Data have a low update

frequency.

Master Data Data describing fundamental and independent entities such as

the temperature sensor ID and its location (following ISO 8000). In

the context of the Data Gatekeeper concept, this means

metadata and data relevant for steering the usage of

databases or describing data of databases.

Participation Process of continuous involvement of citizens and various

stakeholders inside and outside the City. Co-Creation is one

possible method of Participation.

Platform Administrator

Technical Administrator for the Smart Data Platform inside or outside the City. He or she ensures the availability of the data platform to all stakeholders and implements technical adjustments, configurations and implementations in case of new Use Cases.

(example for task: Operations)

Platform Provider In case a City does not want to run a Smart Data Platform on its own infrastructure, a Platform Provider company or organisation could run a Smart Data Platform on behalf of the City.

Processing Functionality

Concrete data manipulation function, e.g. the replacement of Data Object identifiers

Processing Measure

Manipulation of single Data Objects and their Data Values in a Data Set, necessary for compliance reasons. In the scope of this concept, the conceptual procedures Anonymisation and Pseudonymisation are understood as Processing Measures that can be realized by various Processing Functionalities.

Real-Time Data Real-Time Data are data stemming from sensors with a small

update time interval (e.g. 1 min or 30 sec.).

Service Developer

The Service Developer designs and evaluates a human-

computer interaction concept for an External System if

necessary (e.g. not necessary for hardware solution like Lamp

Post). Therefore he or she is involved in the Requirements

Specification and other phases of the Use Case Creation

Process. Further he or she develops an app or constructs and

tests a hardware solution. His or her responsibility is further to

prevent security flaws on application level and to ensure the

availability of the Smart Data Use Case.

(example for task: Data Provision)

Smart City A Smart City is understood as an urban environment with digitally wired inhabitants in which a wide network of actors exists (cf. Walser & Haller, 2016, p. 19).

Smart Data Smart Data is understood as analysis solutions based on big data

with a clear purpose, semantics, data quality, security and data

privacy (based on Jähnichen 2015).

Smart Data Administration

All roles involved in Smart Data Management on the side of the City.

(example for task: Strategy)

Smart Data Board Committee of various leading persons within the City or the urban environment. They decide strategically on new Smart City Use Cases and Smart Data topics and serve as the body to which decisions are escalated.

(example for task: Strategy)

Smart Data Creation

Process that elaborates and realizes a single Smart Data Use Case, see Use Case Creation Process

Smart Data Owner Leading person (e.g. municipal board member) responsible and accountable for Smart Data in the City or urban environment. He or she is the internal sponsor and has the main budget responsibility.

(example for task: Strategy)

Smart Data Platform

Technical Platform that serves as basis for smart services and for the organisational Data Gatekeeper Concept.

Smart Data User User that uses a Smart Data app or another External System that

uses Data Sets from the Smart Data Platform

(example for task: Operations)

Strategic Manager Key project manager and responsible for the Smart Data

Platform and related strategic decisions, moderations and

negotiations. He or she elaborates Data Usage Agreements with

Data Suppliers.

(example for task: Strategy)

Transaction Data Data that changes dynamically, e.g. temperature values

from a sensor Data Set.

Use Case In the context of this concept a Smart Data Use Case is a

concrete Smart Data User scenario such as Smart Home or Smart

Lamp Posts.

Use Case Creation Process

Process that has to be started when a new Smart Data Use Case is to be elaborated and added to the Smart Data Platform (described in chapters 3 and 4). Synonym:

Use Case Responsible

The Use Case Responsible can be on the City side or on the side of another organisation or on side

• Request for Data Analysis Queries

• Responsible for subject-matter aspects of a Smart City Use Case

• Ensure running operation and provide support

• Analyze usage and use data for city-internal or external purposes according to usage agreements

(example for task: Operations)

Use Case Topic Area

Groups of Use Cases in the Smarter Together Project, e.g.

Smart Home, Mobility or Energy.
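
As an illustration of the Categorisation and Classification entries in this glossary, the following minimal Python sketch shows how the Classification of one Data Set could be recorded and validated. The class name `DataSetClassification`, its field names and the unit of `ttl_days` are assumptions made for this example only; the concept itself prescribes just the four Categories, the four Classification Dimensions (I, P, R, S) with levels 1 to 4, and a time-to-live.

```python
from dataclasses import dataclass

# Categories as listed in the glossary (1 = open data ... 4 = personal data).
CATEGORIES = {
    1: "open data",
    2: "non-personal data but not open",
    3: "non-personal data but restricted",
    4: "personal data",
}


@dataclass(frozen=True)
class DataSetClassification:
    """Hypothetical record holding the Classification of one Data Set."""

    data_set: str
    category: int        # 1..4, see CATEGORIES
    integrity: int       # I1..I4: integrity and data protection
    processing: int      # P1..P4: processing and analysis
    redistribution: int  # R1..R4: redistribution and modification
    storage: int         # S1..S4: storage and deletion
    ttl_days: int        # time-to-live of Data Objects (unit is an assumption)

    def __post_init__(self):
        # Reject values outside the ranges named in the glossary.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        for name in ("integrity", "processing", "redistribution", "storage"):
            level = getattr(self, name)
            if level not in (1, 2, 3, 4):
                raise ValueError(f"{name} level must be 1-4, got {level}")
        if self.ttl_days <= 0:
            raise ValueError("ttl_days must be positive")

    @property
    def is_open_data(self) -> bool:
        return self.category == 1

    def label(self) -> str:
        """Compact label built from the four dimension levels."""
        return (f"I{self.integrity}-P{self.processing}-"
                f"R{self.redistribution}-S{self.storage}")
```

A record could then be created with, for example, `DataSetClassification("lamp_post_temperature", 1, 2, 1, 3, 2, 30)`, where the Data Set name and all levels are invented values.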

Bibliography

BBC (2012). Google 'fails to meet EU rules' on new privacy policy.

URL: http://www.bbc.com/news/business-17192234

BSI (2011). Privacy Impact Assessment Guideline. Authors: Marie Caroline Oetzel, Sarah Spiekermann, Ingrid Grüning, Harald Kelter, Sabine Mull. Editor: Julian Cantella. Bundesamt für Sicherheit in der Informationstechnik, 53133 Bonn. URL: https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/ElekAusweise/PIA/Privacy_Impact_Assessment_Guideline_Kurzfasssung.pdf?__blob=publicationFile&v=1

Edwards, J., & Wolfe, S. (2005). Compliance: A review. Journal of Financial Regulation

and Compliance, 13(1), 48–59.

EU (2016). General Data Protection Regulation (GDPR): Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC. URL: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679

EU (2003). Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information. URL: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32003L0098

EU (1995). Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. URL: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:31995L0046

Holzmann, R. (2016). Betrug und Korruption im Experiment: Ansätze für ein

evidenzbasiertes Compliance-Management. Wiesbaden: Springer

Fachmedien.

Jacka, J. M., & Keller, P. J. (2009). Business Process Mapping: Improving Customer

Satisfaction. New Jersey: John Wiley & Sons

Jähnichen, S. (2015). Von Big Data zu Smart Data – Herausforderungen für die Wirtschaft. Smart Data Newsletter Volume 1/2015, Smart-Data-Begleitforschung c/o Loesch Hund Liepold Kommunikation GmbH, 10115 Berlin. URL: http://www.digitale-technologien.de/DT/Redaktion/DE/Downloads/Publikation/SmartData_NL1.pdf?__blob=publicationFile&v=5

Li, Y., Dai, W., Ming, Z., & Qiu, M. (2016). Privacy Protection for Preventing Data Over-Collection in Smart City. IEEE Transactions on Computers, 65(5), 1339-1350. doi:10.1109/TC.2015.2470247

OECD (2013). The OECD Privacy Framework. URL:

http://www.oecd.org/sti/ieconomy/oecd_privacy_framework.pdf

Scheuch, R., Gansor, T., & Ziller, C. (2012). Master Data Management: Strategie,

Organisation, Architektur. Heidelberg: dpunkt-Verlag.

Walser, K., & Haller, S. (2016). Smart Governance in Smart Cities. In: A. Meier & E.

Portmann (Eds.), Smart City: Strategie, Governance und Projekte (pp. 19-46).

Wiesbaden: Springer Vieweg.

Wang, Y., & Kobsa, A. (2008). Privacy-Enhancing Technologies. URL:

https://www.ics.uci.edu/~kobsa/papers/2008-IUI-Book-kobsa.pdf

Appendix

7.1 Legal Regulations

7.1.1 Principles on Privacy for Personal Data

An overview with wider explanation for each key principle listed in chapter 2.1 Legal

Regulations:

Consent and Necessity of Processing

„Personal data shall be processed lawfully […].“ GDPR, Article 5, 1(a)

There are only a few possibilities for collecting and processing personal data:

1) The Data Subject has given consent

2) It is necessary for the performance of a contract to which the Data Subject is

party

3) It is necessary for compliance with a legal obligation

4) It is necessary in order to protect the vital interests of the Data Subject or of

another person

5) It is necessary for the performance of a task carried out in the public interest or

in the exercise of official authority

6) It is necessary for the purposes of the legitimate interests pursued by the

controller or by a third party (except where such interests are overridden by the

interests or fundamental rights and freedoms of the data subject).

At least one of the above needs to apply for each purpose.
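
The rule that every purpose needs at least one of the six grounds listed above can be checked mechanically. The following Python sketch is one possible way to do so, not a prescribed implementation; the enum `LawfulBasis`, the function `purposes_without_basis` and the purpose names in the usage note are hypothetical.

```python
from enum import Enum


class LawfulBasis(Enum):
    """The six grounds for processing, numbered as in the list above."""

    CONSENT = 1
    CONTRACT = 2
    LEGAL_OBLIGATION = 3
    VITAL_INTERESTS = 4
    PUBLIC_TASK = 5
    LEGITIMATE_INTERESTS = 6


def purposes_without_basis(purposes):
    """Return the sorted names of purposes that have no lawful basis assigned.

    `purposes` maps a purpose name to a collection of LawfulBasis values.
    """
    return sorted(name for name, bases in purposes.items() if not set(bases))
```

Calling the function with a mapping such as `{"traffic_analysis": {LawfulBasis.PUBLIC_TASK}, "marketing": set()}` would report `marketing` as a purpose that must not be processed until a ground has been established.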

If the collection and processing of data is based on consent, it is necessary to comply

with the conditions for consent, listed in Article 7 (GDPR). Moreover, there are special

conditions regarding the processing of personal data of children, special Categories

of personal data and personal data relating to criminal convictions and offences (see

GDPR, Article 8-10).

References: GDPR, Article 6-10

Data minimisation

„Personal data shall be […] limited to what is necessary in relation to the purposes for

which they are processed.“ GDPR, Article 5, 1(c)

1) It is necessary to evaluate carefully what data are needed for the purpose,

especially when processing personal data.

2) The collected data need to be related to the purpose.

3) It must be ensured that only the needed data are collected and processed.

4) If there are several options to achieve the purpose, the least privacy-invasive

one should always be chosen.

5) Personal data must be erased without undue delay when the personal data are

no longer necessary in relation to the purposes.

References: GDPR, Article 5, 1(c)
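
Point 3 of the data minimisation principle can be supported technically by dropping every Data Field that the purpose does not need before a Data Object is stored. The sketch below assumes that a Data Object is represented as a Python dictionary; the function name `minimise` and the field names in the usage note are invented for illustration.

```python
def minimise(data_object, needed_fields):
    """Keep only the Data Fields that are needed for the stated purpose."""
    return {field: value for field, value in data_object.items()
            if field in needed_fields}
```

For a hypothetical sensor reading with the fields `sensor_id`, `temperature_c` and `device_owner`, calling `minimise(reading, {"sensor_id", "temperature_c"})` would drop `device_owner` before the reading is passed on.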

Purpose limitation and specification

„Personal data shall be collected for specified, explicit and legitimate purposes and

not further processed in a manner that is incompatible with those purposes.“ GDPR,

Article 5, 1(b)

1) Data is only allowed to be collected for legitimate purposes (see principle

‘lawfulness’).

2) The purposes for which the data are collected need to be specified in

advance.

3) Data collected for specified purposes may only be used for exactly these purposes, unless:

o the new purposes are compatible with the initial ones (see Article 6, 4).

4) For purposes that are not compatible with the initial ones, the data must not be processed unless the consent of the Data Subject is obtained again.

References: GDPR, Article 5, 1(b)

Storage limitation

„Personal data shall be kept in a form which permits identification of Data Subjects

for no longer than is necessary for the purposes for which the personal data are

processed.“ GDPR, Article 5, 1(e)

As soon as personal data are no longer needed for the purpose for which they have been processed, they should be erased. Personal data may be stored for longer insofar as they will be processed solely for:

– archiving purposes in the public interest

– scientific or historical research purposes

– statistical purposes (see GDPR, Recitals 156, 162)

References: GDPR, Article 5, 1(e)
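
The storage limitation principle corresponds to the time-to-live that the Classification assigns to Data Objects. A minimal Python sketch of such a retention check is given below. It assumes that each Data Object carries a `collected_at` timestamp, which is a hypothetical field name, and it does not model the exceptions for archiving, research and statistical purposes.

```python
from datetime import datetime, timedelta, timezone


def expired(collected_at, ttl, now=None):
    """Return True if a Data Object has outlived its time-to-live."""
    now = now or datetime.now(timezone.utc)
    return now - collected_at > ttl


def purge(data_objects, ttl, now=None):
    """Return only the Data Objects that are still within their time-to-live."""
    return [obj for obj in data_objects
            if not expired(obj["collected_at"], ttl, now)]
```

Run periodically with, for example, `ttl=timedelta(days=30)`, such a job would remove personal data once the retention period assumed for the purpose has passed.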

Transparency

„Personal data shall be processed […] in a transparent manner in relation to the

Data Subject.“ GDPR, Article 5, 1(a)

The collection and processing of personal data should be handled in a transparent way. The controller is therefore obliged to:

1) provide the Data Subject at the time when personal data are obtained, with

information (where applicable) about:

o the identity and the contact details of the controller and of the

controller's representative

o the contact details of the data protection officer

o the purposes of the processing for which the personal data are intended

as well as the legal basis for the processing

o the legitimate interests pursued by the controller or by a third party

o the recipients or Categories of recipients of the personal data

o the fact that the controller intends to transfer personal data to a third

country or international organisation

o the reference to the appropriate or suitable safeguards and the means

by which to obtain a copy of them or where they have been made

available

2) facilitate the exercise of Data Subject rights

3) provide information on action taken on a request (see Articles 15-22) to the Data Subject without undue delay; if the response is further delayed, e.g. because of complexity, the Data Subject needs to be informed.

4) provide information, any communication and any actions taken (under Articles 15-22 and 34) free of charge.

The controller is allowed to ask for further information to confirm the identity of

the requester if there are any doubts.

References: GDPR, Article 12-14; 19; 34

Data Subject’s Participation

„Personal data shall be processed […] in a transparent manner in relation to the

Data Subject.“ GDPR, Article 5, 1(a)

To guarantee a transparent handling of personal data, the Data Subject has the right

of access, the right to rectification, to erasure, to restriction of processing, to data

portability and to object, meaning in detail that the Data Subject has the right to:

1) obtain confirmation as to whether or not personal data concerning him or her

are being processed, and, where that is the case, access to the personal data

and the following information (where applicable):

o the purposes of the processing

o the Categories of personal data concerned

o the recipients or Categories of recipient to whom the personal data have

been or will be disclosed

o the envisaged period for which the personal data will be stored

o the rights of the Data Subject

o any available information as to their source, if not collected from the

Data Subject

o the existence of automated decision-making

2) obtain without undue delay the rectification of inaccurate personal data

concerning him or her.

3) obtain from the controller the erasure of personal data concerning him or her

without undue delay.

4) obtain from the controller restriction of processing, when:

o the processing is unlawful and the Data Subject opposes the erasure of

the personal data and requests the restriction of their use instead

o the accuracy of the personal data is contested

o the personal data are no longer necessary in relation to the purposes,

but they are required by the data subject for the establishment, exercise

or defence of legal claims

o the Data Subject has objected to processing until it is verified whether

the legitimate grounds of the controller override those of the Data

Subject

5) receive the personal data concerning him or her, which he or she has provided

to a controller and to transmit those data to another controller, when:

o the processing is based on consent or on a contract and is carried out by automated means

o it does not affect the rights and freedoms of others

6) object on grounds relating to his or her particular situation, at any time to

processing of personal data concerning him or her.

7) not be subject to a decision based solely on automated processing,

including profiling.

References: GDPR, Article 15-18; 20-22

Accuracy

„Personal data shall be accurate and, where necessary, kept up to date.“ GDPR,

Article 5, 1(d)

When processing personal data, it is necessary to verify (regularly) that the data is

accurate. Every reasonable step must be taken to ensure that personal data that are

inaccurate are erased or rectified immediately.

References: GDPR, Article 5, 1(d); 16

Security, confidentiality and integrity

„Personal data shall be processed in a manner that ensures appropriate security of

the personal data […].“ GDPR, Article 5, 1(f)

To ensure security for personal data, including protection against unauthorised or

unlawful processing and against accidental loss, destruction or damage, the

controller and the processor are responsible for the implementation of appropriate

technical and organisational measures to ensure a level of security appropriate to the

risk.

1) The state of the art, costs of implementation, scope, context, and the risk of

varying likelihood and severity for the rights and freedoms of natural persons

should be considered when deciding about appropriate measures, like:

o the pseudonymisation and encryption of personal data

o the ability to ensure the ongoing confidentiality, integrity, availability and

resilience of processing systems and services

o the ability to restore the availability and access to personal data after an

incident

o a process for regularly testing, assessing and evaluating the effectiveness

of technical and organisational measures for ensuring the security of the

processing

2) When assessing the appropriate level of security, the risks that are presented by

processing personal data need to be considered. Especially risks regarding

accidental or unlawful destruction, loss, alteration and unauthorised disclosure

of personal data as well as access to transmitted, stored or otherwise processed

personal data.

3) Adherence to an approved code of conduct (see Article 40) or an approved

certification mechanism (see Article 42) may be used to demonstrate

compliance with this principle.

4) Steps need to be taken, to ensure that any natural person who has access to

personal data does not process them except on instructions from the controller.

References: GDPR, Article 32, 35-36
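
Pseudonymisation, named above as one appropriate measure, can be realized by replacing Data Object identifiers with a keyed hash. The following Python sketch uses HMAC-SHA256 from the standard library as one possible technique; it is an assumption of this example and not the method prescribed by the concept. The function names and the `id_field` parameter are hypothetical. Because anyone holding the secret key can recompute the mapping, this is pseudonymisation and not anonymisation.

```python
import hashlib
import hmac


def pseudonymise_id(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_objects(data_objects, id_field, secret_key):
    """Return copies of the Data Objects with the identifier field replaced."""
    return [
        {**obj, id_field: pseudonymise_id(obj[id_field], secret_key)}
        for obj in data_objects
    ]
```

The same identifier always maps to the same pseudonym under one key, so Data Objects can still be joined across Data Sets, while a different key yields unrelated pseudonyms.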

Obligations and accountability

Processing personal data implies several responsibilities and statutory obligations of

the controller and processor, who need to:

1) erase personal data without undue delay when:

o the Data Subject withdraws consent

o the Data Subject objects to the processing

o the data have been unlawfully processed

o erasure is required for compliance reasons

o the personal data are no longer necessary in relation to the purposes

except when:

o processing is necessary for exercising the right of freedom of expression

and information

o for compliance with a legal obligation

o for reasons of public interest in the area of public health

o for archiving purposes in the public interest, scientific or historical

research purposes or statistical purposes

o for the establishment, exercise or defence of legal claims

2) notify the supervisory authority of a personal data breach without undue delay (where feasible, not later than 72 hours after becoming aware of it), unless the personal data breach is unlikely to result in a risk to the rights and freedoms of natural persons.

3) notify the Data Subject of a personal data breach without undue delay, when

the personal data breach is likely to result in a high risk to the rights and freedoms

of natural persons.

4) where a type of processing is likely to result in a high risk to the rights and

freedoms of natural persons, prior to the processing:

o carry out an assessment of the impact of the envisaged processing

operations on the protection of personal data (see Article 35).

o seek the advice of the data protection officer, where designated, when

carrying out a data protection impact assessment (see Article 36).

5) designate a data protection officer in any case where:

o the processing is carried out by a public authority or body

o the core activities of the controller or the processor consist of processing

operations which, by virtue of their nature, their scope and/or their

purposes, require regular and systematic monitoring of data subjects on

a large scale.

o the core activities of the controller or the processor consist of processing

on a large scale of special categories of data and personal data relating

to criminal convictions and offences.

References: GDPR, Article 24 - 31, 33 -37
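The notification duties in items 2) and 3) can be sketched as a small helper. This is only an illustration under stated assumptions: the risk labels `unlikely`, `risk` and `high` and both function names are hypothetical, and the 72 hours are counted from the moment the controller becomes aware of the breach.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical labels for the outcome of the risk assessment.
UNLIKELY, RISK, HIGH = "unlikely", "risk", "high"

NOTIFICATION_WINDOW = timedelta(hours=72)


def authority_deadline(aware_at: datetime) -> datetime:
    """Latest time for notifying the supervisory authority (item 2)."""
    return aware_at + NOTIFICATION_WINDOW


def required_notifications(risk_level: str) -> list:
    """Return who has to be notified for the assessed risk level."""
    if risk_level == UNLIKELY:
        return []  # breach unlikely to result in a risk: no notification
    if risk_level == RISK:
        return ["supervisory authority"]
    if risk_level == HIGH:
        return ["supervisory authority", "data subject"]
    raise ValueError(f"unknown risk level: {risk_level!r}")


aware_at = datetime(2019, 2, 22, 9, 30, tzinfo=timezone.utc)
print(authority_deadline(aware_at))  # 2019-02-25 09:30:00+00:00
print(required_notifications(HIGH))
```

Whether a breach counts as "unlikely", "risk" or "high risk" remains a case-by-case assessment; the helper only encodes the consequences of that assessment.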

7.1.2 Objectives and Principles of Opening Administrative Data

The objectives of Directive 2003/98/EC are ‘to facilitate the creation of community-wide

information products and services based on public sector documents, to enhance an

effective cross-border use of public sector documents by private companies for

added-value information products and services and to limit distortions of competition

on the Community market.’ (EU, 2003)

The directive contains a set of rules governing the re-use and the practical means of

facilitating re-use of existing documents held by public sector bodies of the Member

states. It does not contain an obligation to allow re-use of documents. The decision

whether or not to authorize re-use will remain with the Member States or the public

sector body concerned.

The directive builds on the existing access regimes in the Member States. It should be

implemented and applied in full compliance with the principles relating to the

protection of personal data in accordance with the GDPR as well as with provisions of

international agreements, community and national law.

The principles of the directive cover the aspects ‘requests for re-use’, ‘conditions for re-use’

and ‘non-discrimination and fair trading’ and are listed in full below:

Requests for re-use:

1) Public sector bodies should, through electronic means where possible and

appropriate, process requests for re-use and make the document available for

re-use to the applicant.

2) Where no time limits or other rules regulating the timely provision of documents

have been established, requests should be processed within a timeframe of not

more than 20 working days after their receipt.

3) In case of a negative decision, the public sector bodies shall communicate the

grounds for refusal to the applicant on the basis of the relevant provisions of the

access regime in that Member State.

4) A negative decision should contain a reference to the means of redress.

Reference: 2003/98/EC, Article 4
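The 20-working-day timeframe from item 2) can be computed as follows. This is a minimal sketch that assumes working days are Monday to Friday and ignores public holidays; the function name `add_working_days` is hypothetical.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date that lies `working_days` working days after `start`.

    Assumption: Saturday and Sunday are the only non-working days.
    """
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current


# A request received on Friday, 22 February 2019:
print(add_working_days(date(2019, 2, 22), 20))  # 2019-03-22
```

A real implementation would additionally need the public holiday calendar of the respective Member State.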

Conditions for re-use:

1) Available formats:

o Documents should be made available by public sector bodies in any

pre-existing format or language, through electronic means where

possible and appropriate.

o The directive does not require public sector bodies to create or adapt documents in

order to comply with the request.

o Public sector bodies are not required to continue the production of a

certain type of documents.

2) Principles governing charging:

o Where charges are made, the total income from supplying and allowing

re-use of documents should not exceed the cost of collection,

production, reproduction and dissemination, together with a reasonable

return on investment.

3) Transparency:

o Any applicable conditions and standard charges for the re-use of

documents held by public sector bodies should be pre-established and

published, through electronic means where possible and appropriate.

o If requested, the public sector body should indicate the calculation basis

for the published charge.

4) Licences:

o Public sector bodies may allow for re-use of documents without

conditions or may impose conditions, where appropriate through a

licence.

o Conditions should not unnecessarily restrict possibilities for re-use and

should not be used to restrict competition.

5) Practical arrangements:

o Member States should ensure that practical arrangements are in place

that facilitate the search for documents available for re-use.

o Examples of such arrangements are asset lists, preferably accessible

online.

References: 2003/98/EC, Article 5-9

Non-discrimination and fair trading:

1) Non-discrimination:

o Any applicable conditions for the re-use of documents should be

non-discriminatory for comparable categories of re-use.

o If documents are re-used by a public sector body as input for its

commercial activities which fall outside the scope of its public tasks, the

same charges and other conditions should apply as to other users.

2) Prohibition of exclusive arrangements:

o The re-use of documents should be open to all potential actors in the

market.

o Contracts or other arrangements between the public sector bodies

holding the documents and third parties should not grant exclusive rights.

o Where an exclusive right is necessary for the provision of a service in the

public interest, the validity of such an exclusive right should be reviewed

regularly (at least every 3 years).

o Exclusive arrangements should be transparent and made public.

References: 2003/98/EC, Article 10-11

7.2 Checklist Templates for the Smart Data Creation

This chapter contains a library of checklists that help to carry out the recommended steps of

the Smart Data Use Case Creation Process in the exemplary way.

The checklists implement the instructions given in chapter 3 in order to simplify the subject and

help the reader understand and reproduce the described approach. If some of the steps in the

following subchapters are unclear, it is suggested to (re)read the corresponding

section in chapter 3, where each step is described in more detail.

The following chapter is structured in table form. The left column lists the keywords and

the right column is left open for answers. Table lines with blue background indicate a

headline for the following table lines. Table lines with green background indicate an

action and include an exemplary role that performs the task. A long (under)line

requires a name or another term to be filled in there.

7.2.1 Requirement Specification (Epic)

Strategic Committee: Discuss and Decide on Initiative for new Use Case

Basic Use Case Description

An identification number

A unique name

Analysis purpose and motivation/business

model

Task assignments

(how to achieve the goal)

Analysis results (output)

Needed Data

Time period

Design ideas of the analysis

Exceptions

Specific requirements

Project Stakeholders and Use Case specific Roles

Definition of contact persons Company C:

Mr. Max Mustermann

Address XY

Tel.:

E-Mail: example@company_c.com

Assignment of required roles Use Case Responsible:

Mr./Mrs. __________________________________

Data Owner:

Mr./Mrs. __________________________________

Data Protection Officer:

Mr./Mrs. __________________________________

Data Administrator:

Mr./Mrs. __________________________________

Service Developer:

Mr./Mrs. __________________________________

Definition of further required tasks or roles

Assignment of new task(s)/role(s)

Technical Requirements

The kind of Data Set and the location

Data format(s)

Interfaces for the import and export of the

data (API access)

Time and rate of exchange

Specific requirements (modalities)

Suggestion for Data Processing and Exchange

Data Categories open data

non-personal data but not open

non-personal data but restricted

personal data

Type of contract for data usage License

License with side letter

Individual Agreement

Measures anonymization

pseudonymization

aggregation

access restriction/protection

______________________

7.2.2 Data Categorisation

Data Categorization

Data Category Attribute 1: Temperature open data

non-personal data but not open

non-personal data but restricted

personal data

Data Category Attribute 2: Humidity open data

non-personal data but not open

non-personal data but restricted

personal data

Data Category Attribute 3: ________________ open data

non-personal data but not open

non-personal data but restricted

personal data

7.2.3 Data Usage Agreement and Licensing

Strategic Manager: Create draft for Data Usage Agreement

Strategic Manager: Initiate and moderate Data Usage Agreement meeting

Strategic Manager: Initiate and moderate Use Case Agreement meeting

Data Administrator: Approve Data Format and Technical Feasibility

Data Protection Officer: Legal Check Requirement Specification

Contract Specifications

Type of contract for data usage License

License with side letter

Individual Agreement

Licensing type ___________________________

Data Protection Officer: Legal Check Data Usage Agreement

7.2.4 Data Classification

Data Classification

Data Set Attribute 1: Temperature

Open Data yes no

Time-to-live ____________________ (Day / Month / Year)

Classification: Integrity and data protection (I) I1 I2 I3 I4

Classification: Processing and analysis (P) P1 P2 P3 P4

Classification: Redistribution and modification (R) R1 R2 R3 R4

Classification: Storage and deletion (S) S1 S2 S3 S4

Data Set Attribute 2: Humidity

Open Data yes no

Time-to-live ____________________ (Day / Month / Year)

Classification: Integrity and data protection (I) I1 I2 I3 I4

Classification: Processing and analysis (P) P1 P2 P3 P4

Classification: Redistribution and modification (R) R1 R2 R3 R4

Classification: Storage and deletion (S) S1 S2 S3 S4

Data Protection Officer: Legal Check Data Classification

Aggregate Classification

Calculation Rule: Take the maximum

of each I, of each P, of each R and

each S.

Take into account the (different)

time-to-live Attribute(s)

I ___

P ___

R ___

S ___

Time-to-live: ____________________

(Day / Month / Year )
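The calculation rule for the aggregate classification can be sketched in a few lines. The class and function names and the example levels are hypothetical. The checklist only says to take the time-to-live attributes into account; the sketch assumes that the earliest time-to-live governs the whole Data Set.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AttributeClassification:
    name: str
    i: int  # integrity and data protection level (1-4)
    p: int  # processing and analysis level (1-4)
    r: int  # redistribution and modification level (1-4)
    s: int  # storage and deletion level (1-4)
    time_to_live: Optional[date] = None  # None = no time-to-live given


def aggregate(attributes):
    """Take the maximum of each I, P, R and S over all attributes."""
    if not attributes:
        raise ValueError("at least one attribute is required")
    ttls = [a.time_to_live for a in attributes if a.time_to_live is not None]
    return {
        "I": max(a.i for a in attributes),
        "P": max(a.p for a in attributes),
        "R": max(a.r for a in attributes),
        "S": max(a.s for a in attributes),
        # Assumption: the earliest time-to-live applies to the aggregate.
        "time_to_live": min(ttls) if ttls else None,
    }


# Example levels are made up for illustration only.
attributes = [
    AttributeClassification("Temperature", i=1, p=1, r=2, s=1,
                            time_to_live=date(2020, 12, 31)),
    AttributeClassification("Humidity", i=2, p=1, r=1, s=3,
                            time_to_live=date(2019, 12, 31)),
]
result = aggregate(attributes)
print(result["I"], result["P"], result["R"], result["S"])  # 2 1 2 3
print(result["time_to_live"])  # 2019-12-31
```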

7.2.5 Quality Gate 1

Use Case Responsible: Elaborate final Use Case concept

Strategic Manager: Final Concept Approval (Q1)

Strategic Manager: Sign Data Usage Agreement

7.2.6 Data Sets and Technical Implementation

Specification for Technical Implementation

Data Scheme

Compatible with Data Model yes no

Existence of all necessary Attributes yes no

Sampling rate

Data Administrator: Create Data Scheme and Interface Specification

Service Developer: Create Technical Specification

Use Case Responsible: Technical Specification Approval

Platform Administrator: Implementation and Data Set Creation

Data Administrator: Implementation of Data Interface

7.2.7 Data Processing

Processing Measures and Functions

Measures anonymization

pseudonymization

aggregation

access restriction/protection

______________________

Function for measure 1:

______________________

(if measure 1 is anonymization or

pseudonymization)

Removing person-related Data Fields

Removing ID digits

Creating hash values

______________________

Function for measure 2:

______________________

(if measure 2 is aggregation)

Remove subcategories of a Data Set

Create an average value for a defined

period of time

Aggregate (sub-) Categories

______________________

Platform Administrator: Implement Processing Functions

7.2.8 Quality Gate 2

Use Case Responsible: Specification and Implementation Review

Strategic Manager: Use Case Integration Test

Use Case Responsible: Go Live Pilot Phase

Strategic Manager: Analysis User Test

Strategic Manager: Final Use Case Approval and Go Live

7.3 RACI

The figure shows the activities described in the respective sections and the roles that are

in charge of the activities in RACI matrix notation (cf. Jacka & Keller, 2009).

Figure: RACI matrix. Column headers: Section; Task; Smart Data Board; Smart Data Owner City; Strategic Smart Data Responsible; Data Protection Officer City; Platform Administrator; Analysis User; Use Case Responsible; Conceptual and Technical Developer; Data Owner Supplier; Technical Data Steward Supplier; Data Protection Officer Supplier. Each row gives the section, the task and the non-empty RACI entries.

3.2 Discuss and Decide on Initiative for new Use Case C R C C

3.2 Basic Use Case Description C C R C C C

3.2 Specification Project Stakeholders and Roles R I

3.2 Technical Requirements I C R C

3.2 Suggestion for Data Processing and Exchange R I C

3.2 Initiate and moderate Use Case Agreement meeting I I R I C C C C I C

3.2 Approve Data Format and Technical Feasibility C C R

3.2 Legal Check Requirement Specification I R I C

3.3 Data Categorization per Data Set Attribute I C C R C

3.3 Data Classification per Data Set Attribute I C C R

3.3 Legal Check Data Classification I I R I I

3.4 Create draft for Data Usage Agreement R C C C

3.4 Initiate and moderate Data Usage Agreement meeting R C I C I C C

3.4 Legal Check Data Usage Agreement I R

3.5 Elaborate final Use Case concept C C C R C C

3.5 Final Concept Approval (Q1) C R I I C

3.5 Sign Data Usage Agreement A R C A C

3.6 Create Data Scheme and Interface Specification C I C R

3.6 Create Technical Specification C I R C

3.6 Technical Specification Approval C I R I

3.6 Implementation and Data Set Creation R I C

3.6 Implementation of Data Interface C I R

3.6 Develop App or Hardware solution C C R

3.7 Derive necessary Processing Measures and Functions C R C C C C

3.7 Implement Processing Functions R I

3.8 Specification and Implementation Review C C R C C

3.8 Use Case Integration Test R C I C C C

3.8 Go Live Pilot Phase C C R C C

3.8 Analysis User Test R C C C C

3.8 Final Use Case Approval and Go Live C R C I I C I C I

7.4 Categories

Open data

For data belonging to the category ‘open data’ the main focus is on topics

concerning the prerequisites for passing on the data and the respective user agreements,

licensing, and further agreed conditions.

Data in this category is not subject to any privacy restrictions, meaning that it contains no

personal or person-related data. This data has the lowest level of

protection need of the four categories.

It is necessary to ensure or check that data in this category:

- doesn’t contain personal data

- doesn’t contain data that can be traced back to a person

- doesn’t contain data that, when combined with other data, can be traced

back to a person

- is proven to be open data

It is also necessary to check:

- the conditions under which the data may be processed and/or distributed

- which licenses the data is subject to

Non-personal data but not open

Data in this category cannot be unequivocally classified as public data, but is not

subject to any privacy restrictions. This means it does not contain any protected

information about persons and consequently cannot be used to violate

individual personality rights.

Because the data have not (yet) been released or marked as open data, there

might be restrictions regarding their use, processing, distribution or storage.

Topics that need to be considered within this category therefore include usage and

distribution rights and agreements as well as licenses and further restrictions.

It is necessary to ensure or check that data in this category:

- doesn’t contain personal data

- doesn’t contain data that can be traced back to a person

- doesn’t contain data that, when combined with other data, can be traced

back to a person

- is not marked as open data

- has a lawful owner

It is also necessary to check:

- the conditions under which the data may be processed and/or distributed

- which restrictions the data is subject to

Non-personal data but restricted

Data belonging to this category cannot be unequivocally classified as personal data,

but contains a (potential) risk that under special circumstances this data can be

attributed to persons and therefore needs to be treated accordingly.

This means that data in this category does not contain any direct personal information.

It is, however, possible that, when the data is linked or evaluated with other information,

individual information can be extracted, such as the behavior of small groups or even

conclusions about an individual. Furthermore, data in this category can be

designed in such a way that there is a certain (individually specified) risk of misuse,

even without any person-related information being involved.

When processing this kind of data, it needs to be verified and ensured that the

processing does not allow any kind of traceability to individuals or micro groups. If this

is not possible, the data is either not allowed to be processed within the Smart Data

Platform or needs to be treated as personal data, in which case compliance with all

privacy regulations needs to be ensured.

This means that in this category the kind of protection that needs to be applied

has to be specified individually for the data.

It is necessary to consider the following questions for data in this category:

- Does the data contain personal data?

- Is it possible to trace data back to a person? (Is there enough different

data?)

- Is it possible to trace data back to a person, when being processed with

other data?

- What is the risk that the data can be used to violate personal rights?

- Which measures should be taken to minimize the risk?

- What are the conditions for (a lawful) processing of the data?

Personal data

Personal data is highly sensitive and confidential data and hence has the highest

protection need. When processing personal data, the responsible person is obliged to

create detailed written documentation of the reasons for the processing as well as

possible risks and all measures taken to ensure data privacy and security. If it is not

absolutely necessary, personal data should not be processed.

In some cases, however, it may be necessary to process personal data, which is

possible if the principles in chapter Legal Regulations are taken into account and

adhered to.

There are different ways to transform (personal) data within a platform

so that it can no longer be attributed to an individual. Measures that can or need

to be taken can be found in chapter Data Processing.

It is necessary to consider the following questions for data in this category:

- Is it (originally) personal data?

- Is the processing legal?

- Is the personal data really necessary - are there other possible solutions?

- Is the data collection proportionate to the result/benefit?

- Is the data marked accordingly?

- What measures can/need to be taken?

- Are there any specific conditions that were agreed on?
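The four categories can be read as a decision sequence from the strictest to the most open category. The following sketch is one possible interpretation of the descriptions above, not a prescribed procedure; the function name `categorise` and its boolean inputs are hypothetical and would in practice be the result of the checks listed for each category.

```python
from enum import Enum


class Category(Enum):
    OPEN = "open data"
    NOT_OPEN = "non-personal data but not open"
    RESTRICTED = "non-personal data but restricted"
    PERSONAL = "personal data"


def categorise(*, contains_personal_data: bool, traceable_to_person: bool,
               misuse_risk: bool, proven_open: bool) -> Category:
    """Assign a data attribute to one of the four categories.

    traceable_to_person covers tracing on its own or in combination
    with other data; misuse_risk covers risks without any personal link.
    """
    if contains_personal_data:
        return Category.PERSONAL
    if traceable_to_person or misuse_risk:
        return Category.RESTRICTED
    if proven_open:
        return Category.OPEN
    return Category.NOT_OPEN


print(categorise(contains_personal_data=False, traceable_to_person=False,
                 misuse_risk=False, proven_open=True).value)  # open data
```

The order of the checks matters: data is only treated as open once both personal content and traceability have been ruled out.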

7.5 Classification

Integrity and data protection

The Classification “I” stands for “integrity and data protection”. It ensures that data is

processed lawfully and securely with regard to those aspects.

Level 1 (I1): No further measures need to be taken.

The data neither contains personal information nor allows tracing back to an individual.

Furthermore, the data is not subject to any license or agreement condition that

requires specific treatment regarding integrity and data protection.

Level 2 (I2): Data need to be anonymized or pseudonymised (see footnote 5) before further

processing.

The data contain information that could be used to identify a person when

analyzed together with other data and therefore need to be anonymized or pseudonymised

accordingly.

Level 3 (I3): Data need to be aggregated and combined with more or similar data of

this kind in order to be able to process and analyze it. The necessary amount must be

identified individually per Use Case, documented in writing and coordinated.

Level 4 (I4): Data needs to be treated as personal data and therefore it is necessary

to comply with data protection laws and legal requirements, described in chapter

Legal Regulations.
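The measures behind levels I2 and I3 can be illustrated with two small helpers. This is a sketch under assumptions, not the platform's implementation: the function names are hypothetical, the keyed SHA-256 hash (HMAC) with a discarded random key is one possible way to obtain a pseudonym that cannot be recomputed later, and `min_group_size` stands for the per-Use-Case amount mentioned for I3.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict
from statistics import mean


def pseudonymise(ids, key=None):
    """I2: replace each identifier by a keyed SHA-256 hash.

    If no key is passed, a random key is generated and never returned,
    so the mapping cannot be recomputed after the call.
    """
    key = key or secrets.token_bytes(32)
    return [hmac.new(key, str(i).encode("utf-8"), hashlib.sha256).hexdigest()
            for i in ids]


def aggregate_mean(records, min_group_size):
    """I3: average values per group, dropping groups that are too small."""
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    return {g: mean(v) for g, v in groups.items() if len(v) >= min_group_size}


# Made-up sensor readings per district.
readings = [("district-1", 20.0), ("district-1", 22.0), ("district-2", 19.0)]
print(aggregate_mean(readings, min_group_size=2))  # {'district-1': 21.0}
```

Within one call, equal identifiers receive equal pseudonyms, so records can still be linked to each other without revealing the original identifier.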

Processing and analysis

The Classification “P” stands for “processing and analysis”. It ensures that data is

processed lawfully and securely with regard to those aspects.

Level 1 (P1): No restrictions.

The data neither contains personal information nor allows tracing back to an individual.

Furthermore, the data is not subject to any license or agreement condition that

requires specific treatment regarding processing and analysis.

Level 2 (P2): Data can only be processed and analyzed with other data when it is

ensured that the processing (still) does not make it possible to trace data back to an

individual.

In this case, it is necessary to identify which data or what kind of data would make it possible to trace

data back to a person. This needs to be described in detail, documented in writing and

coordinated.

Level 3 (P3): Data can only be processed and analyzed with preassigned data within

the Use Case.

5 Pseudonymisation is only applicable when it is ensured that there is no way to trace it back to its initial form, meaning the pseudonymisation needs to be irreversible.

In this case, the conditions for further processing and analysis need to be set

individually per Use Case and embedded in the terms of use or the license. The

processing and analysis possibilities need to be coordinated, arranged and implemented

with the responsible stakeholders.

Level 4 (P4): Data are not allowed to be processed or analyzed with other data.

The data contain personal data or comparably sensitive information and therefore must

not be used for any analysis other than the initial purpose for which they were

collected. If the data are needed for another analysis, the principle ‘Consent

and Necessity of Processing’ must be met for the new purpose.

Redistribution and modification

The Classification “R” stands for “redistribution and modification”. It ensures that data

is processed lawfully and securely with regard to those aspects.

Level 1 (R1): No restrictions.

The data neither contains personal information nor allows tracing back to an individual.

Furthermore, the data is not subject to any license or agreement condition that

requires specific treatment regarding redistribution and modification.

Level 2 (R2): Data can be transmitted to external users, only under specific conditions,

which are specified in the usage agreement.

These conditions need to be set individually per Use Case and embedded in the terms

of use or the license. It is also necessary to coordinate those conditions with the

operator of the Smart Data Platform, in order to ensure that the distribution or

modification is technically possible and to determine what is necessary to implement it.

Level 3 (R3): Data are only allowed to be transmitted internally and to the data

provider.

The provisions for the data need to be defined individually with the operators of the

Smart Data Platform in the context of the Use Case.

Level 4 (R4): Data are only allowed to be transmitted to specified internal users.

The provisions for the data need to be defined individually with the operators of the

Smart Data Platform in the context of the Use Case. The data contain personal data or

comparably sensitive information; access protection must therefore ensure that they

are accessible only to certain persons and are only

transmitted to specific internal persons.
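The four redistribution levels can be expressed as a simple permission check. The sketch below is one reading of the level descriptions; the recipient labels, the function name `may_transmit` and the two flags are hypothetical, and treating the data provider as a permitted recipient under R2 is an assumption.

```python
INTERNAL, DATA_PROVIDER, EXTERNAL = "internal", "data_provider", "external"


def may_transmit(r_level, recipient, *, agreement_conditions_met=False,
                 specified_user=False):
    """Return True if data with the given R level may go to the recipient."""
    if recipient not in (INTERNAL, DATA_PROVIDER, EXTERNAL):
        raise ValueError(f"unknown recipient: {recipient!r}")
    if r_level == 1:
        return True  # R1: no restrictions
    if r_level == 2:
        # R2: external users only under the conditions of the usage agreement
        return recipient != EXTERNAL or agreement_conditions_met
    if r_level == 3:
        # R3: internal transmission and transmission to the data provider only
        return recipient in (INTERNAL, DATA_PROVIDER)
    if r_level == 4:
        # R4: specified internal users only
        return recipient == INTERNAL and specified_user
    raise ValueError(f"unknown R level: {r_level!r}")


print(may_transmit(2, EXTERNAL))                                 # False
print(may_transmit(2, EXTERNAL, agreement_conditions_met=True))  # True
print(may_transmit(4, INTERNAL, specified_user=True))            # True
```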

Storage and deletion

The Classification “S” stands for “storage and deletion”. It ensures that data is

processed lawfully and securely with regard to those aspects.

Level 1 (S1): No restrictions.

The data neither contains personal information nor allows tracing back to an individual.

Furthermore, the data is not subject to any license or agreement condition that

requires specific treatment regarding storage and deletion.

Level 2 (S2): Data need to be stored in-house, but are not subject to any restriction

regarding deletion.

The implementation of the provisions needs to be defined with the operators of the

Smart Data Platform. In special cases or under certain circumstances, access

protection will be necessary, so that only specific (predefined)

persons can access the data.

Level 3 (S3): Data need to be stored in-house and the deletion limit needs to be

observed.

The implementation of the provisions needs to be defined with the operators of the

Smart Data Platform. In special cases or under certain circumstances, access

protection will be necessary, so that only specific (predefined)

persons can access the data.

If there is a deletion limit, the limited availability of this data needs to be mentioned

when transferring the data. Data that are already anonymized can be stored without

any restrictions, unless defined differently in the contract or license.

Level 4 (S4): Data contain personal data or comparably sensitive information and

therefore need to be stored at a preassigned, secure place, if necessary with specific

access protection, and the deletion limit needs to be observed.

The storage and deletion of this kind of legally sensitive data need to be defined

individually with the operators of the Smart Data Platform in the context of the Use

Case.
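The storage levels can likewise be captured in a small lookup table with a deletion check. This is a sketch under assumptions: the table keys, the function name `must_be_deleted` and the rule that data is due for deletion once the current date is later than the deletion date are illustrative choices, not requirements stated above.

```python
from datetime import date
from typing import Optional

# Storage location and deletion limit per S level, as described above.
STORAGE_RULES = {
    1: {"location": "no restriction", "deletion_limit": False},
    2: {"location": "in-house", "deletion_limit": False},
    3: {"location": "in-house", "deletion_limit": True},
    4: {"location": "preassigned secure place", "deletion_limit": True},
}


def must_be_deleted(s_level: int, delete_by: Optional[date], today: date) -> bool:
    """Return True if the deletion limit applies and has passed.

    Assumption: the data may be kept up to and including `delete_by`.
    """
    rule = STORAGE_RULES[s_level]  # raises KeyError for unknown levels
    if not rule["deletion_limit"] or delete_by is None:
        return False
    return today > delete_by


print(must_be_deleted(3, date(2019, 12, 31), date(2020, 1, 1)))  # True
print(must_be_deleted(2, date(2019, 12, 31), date(2020, 1, 1)))  # False
```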

7.6 Data Model