Smart and Inclusive Solutions for a Better Life in Urban Districts
Smart Data Platform Munich Deliverable D4.4.1
Version 2
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 691876
SMARTER TOGETHER – Smart Data Platform Munich – D4.4.1 – Version 2.0 – 03/05/2019 2
REVISION CHART AND HISTORY LOG Versions
Version number Date Organisation name Comments
V0.01 24.09.2017 VMZ Berlin Table of Contents Proposal
V0.02 08.11.2017 VMZ Berlin Chapter 1
V0.03 14.12.2017 VMZ Berlin Chapter 2
V0.04 19.10.2018 VMZ Berlin Chapter 2.2.4
V0.05 25.01.2018 LHM/Montag/Glock Update Deliverable
V0.06 25.03.2018 VMZ Berlin Chapter 2.5
V0.07 12.04.2018 VMZ Berlin Restructuring of Document
V0.08 12.04.2018 ALG Layout and quality check
V1.0 16.04.2018 SPL Final version ready for submission
V1.1 23.01.2019 VMZ Berlin Adding additional content, Chapter 3, Chapter 4
V1.2 05.02.2018 VMZ Berlin Adding additional content, Chapter 5, Chapter 6
V1.3 12.02.2019 LHM/Montag/Glock Final update of deliverable for V2.0
V1.4 25.02.2019 ALG Quality check
V2 25.02.2019 VMZ Berlin Comments integrated
Deliverable quality review
Quality check Date Status Comments
Technical Manager 25/04/2019 Ok -
Quality Manager 26/02/2019 Ok
Project Coordinator 03/05/2019 Ok
This report reflects only the author’s view, neither the European Commission nor INEA is responsible for
any use that may be made of the information it contains.
Table of contents
1. Introduction
1.1 Purpose of the Document
1.2 Structure of the Document
2. Context and Background
2.1 Origins and Development History
2.2 SMARTER TOGETHER Project Requirements towards the Platform
2.3 Synchronisation with Data Gatekeeper
2.4 From City Intelligence Platform to Smart Data Platform
3. Architecture and System Components
3.1 Architecture Overview
3.2 System Components
3.2.1 Data Warehouse
3.2.2 Analytics Module
3.2.3 Data Gatekeeper Registry
3.2.4 Single Sign-on Service
3.2.5 Application Programming Interfaces
3.2.6 API Catalogue
3.2.7 Analysis Dashboard
3.2.8 Transparency Dashboard
4. Data Processing Framework
4.1 Data Classification
4.2 DGK Data Model and Processing Rules
4.3 Functional Library for Data Processing
5. Data Exchange in Use Cases
5.1 Energy
5.2 Intelligent Lampposts
5.3 Mobility
5.4 KPIs
6. Conclusion
7. Annex
List of figures
Figure 1: Trends covered by City Intelligence Platform
Figure 2: Example analysis of bus prioritisation performance in the city of Böblingen
Figure 3: New and existing components of the Smart Data Platform
Figure 4: Smart Data Platform Architecture Overview
Figure 5: Impacts of Data Gatekeeper registry on Smart Data Platform
Figure 6: JSON data format example for smart home data
Figure 7: JSON data format example for intelligent lamppost data
Figure 8: Screenshot of the API catalogue
Figure 9: Analysis Dashboard Start Page_Pilot Area with POIs
Figure 10: Screenshot of Transparency Dashboard (browser view)
Figure 11: Screenshot of Transparency Dashboard (mobile device view)
Figure 12: Role assignment for SDP users/clients
Figure 13: Screenshot of roles list (extract)
Figure 14: Screenshot of role assignment for Global_CompositeRole
Figure 15: Screenshot of role assignment for Energy_CompositeRole
Figure 16: Data Gatekeeper Data Model
Figure 17: SDP Workflow of Use Case Energy
Figure 18: Screenshot 1 Energy analyses
Figure 19: Screenshot 2 Energy analyses
Figure 20: Workflow Use Case Lampposts
Figure 21: Screenshots from Munich Smart City App with Intelligent Lampposts
Figure 22: Screenshot Intelligent Lampposts analyses
Figure 23: Workflow Use Case Mobility
Figure 24: Screenshot Mobility analyses
Figure 25: Workflow Use Case KPIs
Figure 26: Screenshot KPIs analyses
List of tables
Table 1: Data Classification Criteria
Table 2: Technical Implementation of Data Classification
Table 3: SDP Functional Library
Glossary
API Application Programming Interface
CIP City Intelligence Platform
CSS Cascading Style Sheets
CSV Comma-separated Values
DGK Data Gatekeeper
DWD German Meteorological Office
HTML Hypertext Markup Language
HTTPS Hypertext Transfer Protocol Secure
IAM Identity and Access Management
ID Identity
IoT Internet of Things
JSON JavaScript Object Notation
JWT JSON Web Token
KPI Key Performance Indicator
MQTT Message Queuing Telemetry Transport
POI Point of Interest
RDBMS Relational DataBase Management System
REST Representational State Transfer
SDP Smart Data Platform
SQL Structured Query Language
SSO Single Sign-On
SVG Scalable Vector Graphics
SMARTER TOGETHER BENEFICIARIES
N° Organisation name Short name Country
1 Lyon Confluence SPL France
2 Lyon Métropole GLY France
3 HESPUL Association HES France
4 Toshiba TSF France
5 Enedis END France
6 Enertech ETC France
7 City of Munich MUC Germany
8 Bettervest BET Germany
9 G5-Partners G5 Germany
10 Siemens Germany SIDE Germany
11 Spectrum Mobil STA Germany
12 Securitas SCU Germany
13 City of Vienna VIE Austria
14 BWS Gemeinnutzige BWSG Austria
15 Wiener Stadtwerke WSTW Austria
16 Kelag Wärme KWG Austria
17 Siemens Austria SIAT Austria
18 Sycube Informationstechnologie SYC Austria
19 Austrian Post POST Austria
20 Fraunhofer FHG Germany
21 Austrian Institute of Technology AIT Austria
22 Energy Cities ENC France
23 Gopa COM GPC Belgium
24 University of St Gallen UNISG Switzerland
25 Technical University of Munich TUM Germany
26 Deutsches Institut fuer Normung DIN Germany
27 Algoé ALG France
28 City of Santiago de Compostela STC Spain
29 City of Sofia SOF Bulgaria
30 City of Venice VEN Italy
31 SA Régionale d’HLM de Lyon HLM France
32 Wavestone WAV France
33 WEG Radolfzeller str. 40-46 RZL Germany
34 WEG Wiesenthauerstr. 16 WHR Germany
EXECUTIVE SUMMARY
The Smart Data Platform Munich (SDP) is the central data management platform for the SMARTER TOGETHER lighthouse city of Munich. It enables the implementation of all project use cases that require the integration, analysis and exchange of data. The Smart Data Platform is based on the City Intelligence Platform (CIP), a product developed by Siemens AG, and has been modified to meet the specific project requirements of SMARTER TOGETHER. Synchronizing the SDP with the Data Gatekeeper concept was the core challenge. The Data Gatekeeper (DGK) is a comprehensive blueprint that describes how to handle data (including data privacy and data security aspects) in a Smart City context.
The role of the Smart Data Platform in the lighthouse city of Munich is to provide open and easily implementable interfaces for integrating data from the various use cases into the platform; to store and refine data in the database layer; to analyse, predict, simulate, combine and visualize data for the platform's dashboards; and to provide data to third-party applications and services over standardized Application Programming Interfaces (APIs). All these data handling processes need to be in line with the data privacy rules and multi-client access authorizations defined by the project's Data Gatekeeper concept.
Although the architecture and the standard APIs of the platform are defined, the system is flexible enough to accommodate the specific requirements of different use cases and provides an open ecosystem for data integration, analysis and distribution. The platform has evolved into the final system that will be used in the operating phase of the project. To present the outcomes of this process, this document provides in-depth documentation of the platform, its technical components and its functionalities.
The main platform building blocks and system elements are installed and running. The general functionality was presented in autumn 2017 in a proof-of-concept demonstrator that included all main system elements (raw data input, data correlation, API definition, smart data output). The main use-case-related elements of the platform were realised during 2018. The final work and use-case-related adaptations of the Smart Data Platform were completed by the end of February 2019.
A comprehensible and detailed technical description of all structural and architectural elements of the Smart Data Platform is included in this document.
1. Introduction
The introduction of this report summarizes the purpose of this deliverable and explains how its content is structured into chapters.
1.1 Purpose of the Document
This document provides detailed documentation of the Smart Data Platform. It gives the reader a short but comprehensive description of the technical components, dashboards and technologies used, as well as of the data processing rules and mechanisms that follow from the use case and data privacy requirements. After reading this documentation, it should be clear how the platform has been implemented in terms of components and functionalities, why certain technical decisions were taken in coordination with the project stakeholders, and what added value the platform provides to SMARTER TOGETHER.
The document outlines the evolution from the City Intelligence Platform to the Smart Data Platform as the project requirements were implemented, and introduces the system and its interfaces for the project's operation phase. To this end, it covers two major topics: the system architecture and its technical components at backend and frontend level, and the data processing rules and functionalities based on the data privacy classification of each data set as defined by the Data Gatekeeper rules and standards.
1.2 Structure of the Document
Chapter 1 describes the purpose and structure of the document.
Chapter 2 provides a short development history of Siemens' City Intelligence Platform and describes how it evolved into the Smart Data Platform in the context of SMARTER TOGETHER. The delta between the CIP and the SDP results from specific project requirements, which necessitate a customized technical solution in terms of components and data processing functionalities. The major requirement, described in a separate subchapter, is the synchronization of the platform's data management framework with the regulations defined by the Data Gatekeeper concept.
Chapter 3 focuses on the system architecture and the technical components. It presents the architectural approach and a system overview, followed by dedicated subchapters on the platform's main components, including the Data Warehouse, the Data Gatekeeper Registry and the two frontends of the platform, namely the Analysis Dashboard and the Transparency Dashboard.
Chapter 4 covers the data processing framework. It describes the general flow of data through the platform: from data integration over two standardized APIs, through the storage and management of data in the Data Warehouse and the analysis and fusion of different data for visualization in a dashboard, to the provision of enhanced data to third-party applications and systems over an API provided by the platform.
Chapter 5 takes up the general data processing framework and the system functionalities and illustrates them using the use cases implemented in the platform as practical examples.
Chapter 6 provides a summary presenting the current status of the implementation, open issues to be discussed and an outlook on the operation phase of the project.
2. Context and Background
This chapter outlines how the system evolved from the City Intelligence Platform to the Smart Data Platform used in SMARTER TOGETHER. It focuses on the project requirements that led to adaptations of the existing product.
2.1 Origins and Development History
The Smart Data Platform is based on the CIP, which Siemens developed in the context of different research projects over five years until it reached product status at the end of 2016.
The CIP is an open platform, developed and implemented by Siemens Corporate Technology. It has been designed to meet requirements from global trends like urbanization, climate change and pollution, the impacts of which create significant new challenges for urban infrastructure.
It is expected that, within the next few years, a huge amount of new data will become available. Examples in the mobility domain include Floating Car Data, Car-to-X technology and crowd-sourced data, all of which require tools to access, store and evaluate the data and to create services and user interfaces for various user groups.
In this context, the CIP supports fast, research-oriented development and feasibility tests, in particular in new technology fields. The CIP is thus intended to complement existing, well-established Siemens product platforms, such as the modular Sitraffic platform in the field of road traffic control, management and parking guidance (Sitraffic Scala, Concert and Guide), with more than 100 implementations worldwide.
The principle behind the CIP implementation is to develop services and applications for various new types of data and use cases. It offers open interfaces to enable an IT ecosystem around its deployment. This includes the possibility to exchange raw and aggregated data as well as results from data analytics, and it allows third parties to develop their own applications using these open interfaces. Most technical innovations are made within the analytics and evaluation modules, whose task is to aggregate and analyse the available data in a way that enables new IT solutions within the relevant infrastructure domains.
Figure 1 : Trends covered by City Intelligence Platform
In the analytics module of the CIP, different data analyses based on research projects have been implemented and visualized via diagrams or maps, as shown in Figure 2. In this example, analyses based on real-time and historic data were implemented in a research project to help better understand bus prioritization performance in the city of Böblingen, Germany.
Figure 2 : Example analysis of bus prioritisation performance in the city of Böblingen
After the use of the CIP in multiple pilots, the platform reached product status at the end of 2016. It was therefore transferred from a Siemens development unit to an operating unit, VMZ Berlin, a wholly owned subsidiary of Siemens. VMZ Berlin continues to operate and enhance the CIP and, in parallel, adapts the existing system into the Smart Data Platform used in SMARTER TOGETHER, taking into account the specific project requirements described in the following subchapters.
2.2 SMARTER TOGETHER Project Requirements towards the Platform
Various requirements from different SMARTER TOGETHER project domains need to be taken into account when evolving the City Intelligence Platform into the Smart Data Platform. These requirements consist of technological, administrative or use-case-specific demands on the implementation of the system. Similar requirements can also be expected outside the SMARTER TOGETHER project framework, so the transferability of the platform to other contexts is always taken into consideration.
First of all, the Smart Data Platform (SDP) needs to be able to cover a variety of use cases. The use cases implemented in SMARTER TOGETHER range from the mobility domain and energy topics to intelligent lamppost sensor solutions. They require an open platform ecosystem that is capable of providing technological and data management solutions covering a broad field of topics. The SDP is thus required to be an open, secure and citywide IT ecosystem: it should act as a virtual data backbone that collects city data from different domains as the basis for a holistic view of city data, and it should be operated under the control of the public authority to ensure the security and quality of the data.
The project requirements affect different technological layers of the Smart Data Platform as well as administrative data management regulations.
Data integration
As the central data platform for the lighthouse city of Munich, the SDP integrates all data needed for the use cases of SMARTER TOGETHER. Data integration covers the import of static data from different databases outside the platform, georeferenced objects for analyses and visualization in maps, real-time backend-to-backend interfaces and real-time sensor interfaces.
The standard APIs of the platform need to be secure, well documented and suitable for communication with various data sources. They are described in chapter 3.2.5.
Data processing and storage
The integrated data needs to be stored securely in the platform's Data Warehouse. The platform's data analysis services build on this database layer to combine and visualize data from different sources and/or to provide it to third-party applications via the SMARTER TOGETHER API.
Depending on the confidentiality of the data, the platform is required to provide anonymization and aggregation mechanisms that are applied before the data is entered into the database. It may also be required that data is automatically deleted from the database after a defined period.
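The mechanisms named above can be illustrated with a minimal Python sketch. It is not the platform's implementation: the record fields (`device_id`, `value`, `timestamp`), the keyed hash used to replace identifiers, the hourly averaging and the retention check are all assumptions chosen for the example. Note that a keyed hash is a pseudonymization technique; full anonymization would need stronger guarantees.

```python
import hashlib
import hmac
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical secret; a real deployment would keep it outside the code.
SECRET_KEY = b"example-key"


def pseudonymize(identifier: str) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_hourly(records):
    """Average 'value' per full hour; identifiers are dropped entirely."""
    buckets = defaultdict(list)
    for rec in records:
        hour = rec["timestamp"].replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(rec["value"])
    return {hour: sum(vals) / len(vals) for hour, vals in buckets.items()}


def apply_retention(records, now, max_age):
    """Keep only records that are no older than max_age."""
    return [rec for rec in records if now - rec["timestamp"] <= max_age]


# Hypothetical example records.
records = [
    {"device_id": "meter-1", "value": 2.0, "timestamp": datetime(2019, 1, 1, 10, 5)},
    {"device_id": "meter-2", "value": 4.0, "timestamp": datetime(2019, 1, 1, 10, 40)},
    {"device_id": "meter-1", "value": 6.0, "timestamp": datetime(2019, 1, 1, 11, 15)},
]
hourly = aggregate_hourly(records)
```

In this sketch, aggregation runs before storage and the retention check would run periodically, which mirrors the order described in the text.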
Data analysis and visualization
One major objective of the Smart Data Platform is to create added value for city decision-makers by calculating and visualizing relevant analyses based on the integrated data. These analyses need to be able to combine data from different sources and across different domains. Based on the available data, the Munich task leaders and the other project stakeholders responsible for each use case jointly discussed which analyses would be helpful for future decision-making and how they should be visualized to be understandable and valuable for the end users.
In this context, the SDP is required not only to perform and visualize these analyses but also to verify which data may actually be combined under data privacy regulations and which users may use specific analyses based on their access authorizations defined in the Data Gatekeeper (see chapter 2.3).
Data distribution
The Smart Data Platform not only integrates, stores, combines and visualizes data; it is also the central data provision point for third-party systems that want to request data for their apps or services. The central access point for the different data sets is the SMARTER TOGETHER API.
For data distribution, it is crucial that only data sets that may be openly distributed are generally accessible over the SMARTER TOGETHER API. Other data sets may be restricted to third-party users who have signed a licensing contract. It is therefore important that the SMARTER TOGETHER API Store is able to distinguish between different users, manage different access authorization rights and provide additional user validity checks such as the use of certificates or tokens.
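The distinction between openly distributable and licence-restricted data sets can be sketched as a simple access check. This is a hedged illustration only: the data set names, the token values, the `ApiClient` structure and the `may_access` function are hypothetical and do not describe the actual API Store.

```python
from dataclasses import dataclass, field


@dataclass
class ApiClient:
    """A third-party user known to the (hypothetical) API store."""
    name: str
    licensed_datasets: set = field(default_factory=set)


# Hypothetical registry: True means the data set may be openly distributed.
DATASETS = {"lamppost_weather": True, "energy_consumption": False}

# Hypothetical token store, keyed by token value.
CLIENTS = {
    "tok-123": ApiClient("partner-app", {"energy_consumption"}),
    "tok-456": ApiClient("public-app"),
}


def may_access(token: str, dataset: str) -> bool:
    """Allow access only for known tokens and for open or licensed data sets."""
    client = CLIENTS.get(token)
    if client is None or dataset not in DATASETS:
        return False
    return DATASETS[dataset] or dataset in client.licensed_datasets
```

For example, `may_access("tok-456", "energy_consumption")` is rejected because that client holds no licence for the restricted data set, while `may_access("tok-123", "energy_consumption")` is allowed. Unknown tokens are always rejected in this sketch, which is an assumption rather than a documented platform rule.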
In summary, various requirements need to be taken into account in the development of the Smart Data Platform. A major aspect is the confidentiality classification of each integrated data set, which determines the data processing and distribution rules applied to that data. A major objective of the Smart Data Platform is to make what happens to each data set inside the platform as transparent as possible and to guarantee data privacy where it is needed. The regulatory framework for data management in SMARTER TOGETHER is the Data Gatekeeper concept described in the next subchapter.
Use case data from the following SMARTER TOGETHER tasks of WP 4 (Tasks 4.2 to 4.6) is integrated into and analysed within the Smart Data Platform:
▪ Information exchange with the Task Citizens and Stakeholder Engagement (Task 4.2)
▪ Data exchange with the Task Low Energy Districts (Task 4.3)
▪ Data exchange with the Task Integrated Infrastructure and Services (Task 4.4)
▪ Data exchange with the Task Sustainable Mobility (Task 4.5)
▪ Data exchange with the Smart City App (part of Task 4.4)
▪ Information exchange with the Task Monitoring and Evaluation (Task 4.6)
2.3 Synchronisation with Data Gatekeeper
The Data Gatekeeper in SMARTER TOGETHER is a strategic and organisational concept that ensures data privacy, data quality and other compliance aspects. It describes a data management and processing framework as a transferable and generic blueprint, and it provides requirements for the implementation of the Smart Data Platform in the lighthouse city of Munich.
The Data Gatekeeper (DGK) defines strategic guidelines for the Smart Data Platform as mandatory external requirements, which need to be considered before the implementation process starts. These include applicable data protection laws and legal restrictions, fundamental technical recommendations for data security, and city-specific strategies regarding open data access and data transparency, which, for example, resulted in the implementation of the SDP's Transparency Dashboard.
The DGK furthermore defines how smart data use cases are created and administratively handled in SMARTER TOGETHER. It specifies which stakeholders should be involved in the use case creation process and who is responsible for defining what will happen to certain data sets inside the platform. These requirements affect the administration processes and mechanisms of the Smart Data Platform rather than the direct technical implementation.
The Data Gatekeeper also sets the framework for the classification of data sets that are integrated into the SDP. Each classification has its own impact on how the data can be processed within the platform. For example, as the data classification requires that certain data sets be pseudonymized or aggregated before further processing, the SDP provides corresponding mechanisms. All these data processing rules, which are based on the metadata model of the DGK, are described in detail in chapter 4.
2.4 From City Intelligence Platform to Smart Data Platform
As described above, the implemented system is based on the technology of the existing City Intelligence Platform. However, there are new project-specific requirements for the platform, of which the implementation of the Data Gatekeeper framework is the most crucial. The following figure illustrates which components of the Smart Data Platform were already part of the CIP and which have been introduced due to SMARTER TOGETHER project requirements.
Figure 3 : New and existing components of the Smart Data Platform
The existing components include the standard integration APIs of the CIP, which offer the option to transfer data over a JSON/REST API or over the MQTT protocol. These APIs are described in detail in chapter 3.2.5.
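To make the JSON/REST option more tangible, the following Python sketch builds (but does not send) an HTTP request carrying one sensor reading as JSON. Everything specific in it is an assumption for illustration: the base URL, the `/measurements` path, the payload field names and the bearer-token header are hypothetical and are not taken from the platform's API specification.

```python
import json
import urllib.request


def build_rest_request(base_url, payload, token):
    """Build (but do not send) a JSON POST request for one measurement."""
    return urllib.request.Request(
        url=f"{base_url}/measurements",  # hypothetical endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed authentication scheme
        },
        method="POST",
    )


# Hypothetical payload; the real field names are defined by the platform's APIs.
payload = {
    "sensorId": "lamppost-17",
    "timestamp": "2019-01-15T10:00:00Z",
    "temperature": 3.5,
}
request = build_rest_request("https://sdp.example.org/api", payload, "example-token")
```

The request could then be sent with `urllib.request.urlopen(request)`. With MQTT, the same JSON body would instead be published to a topic using an MQTT client library.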
Furthermore, the Data Warehouse of the CIP, which consists of one database layer for raw data and another for aggregated or pseudonymized data, is used in the Smart Data Platform. The Data Warehouse is described in chapter 3.2.1.
The other components have been newly introduced in the SMARTER TOGETHER context. The Data Gatekeeper Registry, described in chapter 3.2.3, is the component that translates the requirements of the Data Gatekeeper, such as the data classification, into data processing rules within the platform. Such a rule can be the requirement to aggregate raw data for data privacy reasons or to limit external access to one of the platform's APIs. The data processing rules configured in the DGK Registry are the subject of chapter 4.
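The translation from a data classification into processing rules can be pictured as a lookup table. The sketch below is an assumption-laden illustration: the classification labels (`open`, `restricted`, `personal`) and the rule fields are hypothetical, since the actual classification criteria and rules are defined in chapter 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingRule:
    """Processing consequences of one classification (hypothetical fields)."""
    aggregate_raw_data: bool
    pseudonymize: bool
    external_api_access: bool


# Hypothetical classification labels mapped to hypothetical rules.
REGISTRY = {
    "open": ProcessingRule(False, False, True),
    "restricted": ProcessingRule(True, False, True),
    "personal": ProcessingRule(True, True, False),
}


def rule_for(classification: str) -> ProcessingRule:
    """Look up the rule; unknown classifications fall back to the strictest one."""
    return REGISTRY.get(classification, REGISTRY["personal"])
```

Falling back to the strictest rule for unknown classifications is a design choice made for this sketch so that a misconfigured data set is never exposed by default.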
The API catalogue (chapter 3.2.6) is the access point for third-party systems and applications to retrieve data from the SDP, for example intelligent lamppost data. The API catalogue is linked to the platform's single sign-on service (chapter 3.2.4), which manages the access authorizations for the different APIs based on the DGK Registry.
The Smart Data Platform also provides two frontends in the SMARTER TOGETHER project. The first is the Analysis Dashboard (chapter 3.2.7), which visualizes the thematic analyses defined in each use case, platform statistics and the project-specific KPIs (Key Performance Indicators). The Analysis Dashboard is linked to the single sign-on service to check whether its users are allowed to access certain analyses.
The second frontend is the Transparency Dashboard described in chapter 3.2.8, which is a website for the general public providing information about the different data sets that are integrated into the platform and their data privacy classification.
3. Architecture and System Components
In this chapter, the architectural approach of the Smart Data Platform and the main technical components as well as the used technologies will be illustrated.
3.1 Architecture Overview
The design of the Smart Data Platform architecture needs to take different requirements into account. Data sources from different project use cases need to be integrated in an easy way, data needs to be processed in the context of data analysis, and enhanced data must be provided to third party systems over APIs that are simple to understand and provide added value for the applications and systems accessing the Smart Data Platform. The backend system architecture can be layered into the domains of data integration, data management, analysis and data provision. Data for the project use cases are transferred from different data sources to the platform via the two standard interface technologies. However, the exact data fields and communication intervals of each interface always need to be agreed between data providers, platform operator and use case stakeholders to ensure that all data required for the different analysis use cases is available in the platform.
In the data management and data analysis layer, the Smart Data Platform provides different types of databases for storing the data. Different database operations build on the raw database layer to aggregate and pseudonymize data if required by the Data Gatekeeper framework and/or to process data within the analytics module of the Smart Data Platform.
The data provision layer provides different APIs to allow the analysis dashboard as well as external systems to access raw or aggregated data from the platform. The APIs are thematically divided for the different use cases (e.g. Intelligent Lamppost API) and are summarized in an API catalogue user interface for developers. Retrieving data over these APIs requires an access authorization, which is managed by the platform operator via a single sign-on service.
Figure 4 : Smart Data Platform Architecture Overview
3.2 System Components
In this subchapter, the technology stack of the main components will be outlined including backend components such as the data warehouse, frontend components of the dashboards as well as the standard APIs of the Smart Data Platform.
3.2.1 Data Warehouse
For the data warehouse, two different types of databases are provided to manage and store the data integrated in the different use cases: a relational database management system (RDBMS) using SQL (Structured Query Language) and a document-oriented NoSQL database. Which database type is chosen depends on the specific data storage and processing requirements of each use case. If the schema of certain data (tables and field types) is well defined before integrating it into the Smart Data Platform, it is more likely that a relational database will be applied, whereas a NoSQL database may be suited for use cases where the initial data requirements are difficult to ascertain, as this kind of database provides more flexibility. Both databases are in line with the data processing requirements posed by the Data Gatekeeper. The relational database used in the Smart Data Platform is Oracle DB; the NoSQL database is MongoDB.
3.2.2 Analytics Module
The analytics module is the system layer that manages the different database operations which calculate, aggregate, combine and enhance the raw data from the different data sources into smart data ready to be presented in the analysis dashboard or to be provided over the platform APIs.
Besides implementing sensible and correct algorithms, a focus of the analytics module is on high-performance database operations to minimize the response time when a user requests a certain analysis in the analysis dashboard.
3.2.3 Data Gatekeeper Registry
The Data Gatekeeper Registry is the central component that translates the requirements of the Data Gatekeeper, such as the data classification, into a data processing system within the platform. This might include pseudonymizing raw data for data privacy reasons or controlling the external access to one of the platform's APIs.
The Data Gatekeeper Registry is a data processing framework to be implemented manually by the platform administrator rather than a coherent and automated piece of software. More precisely, the Data Gatekeeper Registry is realized within different other components such as the data warehouse or the single sign-on service. The following figure illustrates in which steps of data processing in the Smart Data Platform the Data Gatekeeper Registry comes into play.
After integrating raw data into the Smart Data Platform, it might be required by the data provider or data privacy regulations that only aggregated or pseudonymized data may be stored. This requirement is defined in the data classification for each data source. In this case, the raw data will be deleted after aggregation/pseudonymization. The next step where the Data Gatekeeper Registry affects the platform processes is between the data warehouse and the analytics module. The data classification might prohibit certain data sets from being combined (e.g. when the combination of data might lead to an identification of individuals), which restricts the range of possible analyses. The Data Gatekeeper Registry also sets the framework for which data can be published over an API to an external system and which system is allowed to access which data at all. Furthermore, it defines which user can view which analyses in the analysis dashboard, as a certain person may be allowed to view analyses of use case 1 but not of use case 2.
Figure 5 : Impacts of Data Gatekeeper registry on Smart Data Platform
3.2.4 Single Sign-on Service
The Smart Data Platform uses a single sign-on service to manage the different user roles and access rights (Identity and Access Management, IAM). The single sign-on service manages the access rights for the communication with the Smart Data Platform over the APIs as well as the access rights to certain analysis domains in the analysis dashboard. The users and authorizations are configured in the graphical user interface of the single sign-on service. The service also provides authorization tokens (JSON Web Tokens) to the external systems; communication with the Smart Data Platform can only be established with a valid token. The single sign-on service implemented in the platform is the open-source tool Keycloak. This single sign-on service also provides the functionalities to manage the data processing framework of the Data Gatekeeper Registry by connecting different operations, such as a database query, with the different roles of a user or a system component.
3.2.5 Application Programming Interfaces
The Smart Data Platform provides two standard APIs for integrating data in the context of the various use cases: the MQTT (Message Queuing Telemetry Transport) protocol and JSON REST. These two interfaces, of which JSON REST will be the more likely solution in most use cases, are described in the following.
The MQTT protocol is suitable for use cases that include sensor data that will be sent directly from the sensor to the Smart Data Platform. The communication is triggered by a tool locally installed on the communication unit of the sensor; the tool is provided by the Smart Data Platform operator. The tool includes the application for the JSON export, the required Java version as well as preconfigured certificates for a secured data transfer. Over the MQTT communication protocol, the tool automatically exports the current JSON data sets to the Smart Data Platform in configurable intervals. After the export, the data sets are removed from the sensor unit. MQTT is an open messaging protocol for machine-to-machine communication and the Internet of Things (IoT); it has been standardized by OASIS and published as ISO/IEC 20922.
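The export behaviour described above (collect data sets locally, publish them as JSON, then remove them from the sensor unit) can be sketched as follows. This is a minimal illustration, not the operator's export tool: the class name, the topic and the `publish` callable are assumptions, and the callable merely stands in for whatever MQTT client library is used.

```python
import json
from typing import Callable, Dict, List


class ExportBuffer:
    """Holds measured data sets on the sensor unit until they are exported."""

    def __init__(self, topic: str) -> None:
        self.topic = topic  # hypothetical topic name
        self.pending: List[Dict] = []

    def add(self, data_set: Dict) -> None:
        self.pending.append(data_set)

    def export(self, publish: Callable[[str, str], bool]) -> int:
        """Publish each pending data set as JSON and drop the ones that were sent.

        `publish(topic, payload)` stands in for the MQTT client call and
        must return True on success.
        """
        remaining = []
        sent = 0
        for data_set in self.pending:
            if publish(self.topic, json.dumps(data_set)):
                sent += 1
            else:
                remaining.append(data_set)  # keep for the next export interval
        self.pending = remaining
        return sent


# Usage with a stand-in publisher that records messages instead of sending them.
outbox = []
buffer = ExportBuffer("smarthome/measurements")
buffer.add({"temperature": 21.5, "humidity": 40, "battery": 87})
buffer.export(lambda topic, payload: outbox.append((topic, payload)) or True)
```

Keeping unsent data sets in the buffer is a design choice of this sketch so that a failed transfer does not lose measurements; the source does not describe how the tool handles failures.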
The JSON REST interface is more suitable for backend-to-backend communication. Via POST requests, the current data sets are sent from the data provider backend to the Smart Data Platform. The respective endpoints are provided by the platform operator. To secure the communication, each request has to contain a valid JSON Web Token in the HTTP header. This access token is issued by the single sign-on service and can be retrieved with a preceding request. The token has a defined validity period and must be renewed once it expires. The JSON data format is predefined by the platform operator but can be modified in consultation with the data provider in order to meet the use case requirements.
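The two-step exchange (first obtain a token, then POST the data sets with the token in the header) might look roughly like this from the data provider's side. Both URLs, the client-credentials grant, the `Bearer` scheme and the field names are assumptions for illustration; the real endpoints and the data format are defined by the platform operator. The sketch only builds the requests and does not send them.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://sso.example.org/token"          # hypothetical endpoint
DATA_URL = "https://sdp.example.org/api/lampposts"   # hypothetical endpoint


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """First POST: ask the single sign-on service for an access token."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",  # assumed OAuth2 grant type
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(TOKEN_URL, data=form, method="POST")


def build_data_request(token: str, data_sets: list) -> urllib.request.Request:
    """Second POST: send the data sets with the token in the HTTP header."""
    return urllib.request.Request(
        DATA_URL,
        data=json.dumps(data_sets).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Each request would be sent with urllib.request.urlopen(request).
request = build_data_request("example-token", [{"sensor": "pm10", "value": 12.4}])
```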
The following two examples show the JSON data formats of the use cases smart home data and intelligent lampposts. For the smart home data use case, the MQTT export tool has been implemented whereas in the intelligent lampposts use case the JSON REST interface is applied.
In case of the smart home data, the MQTT export tool sends data sets to an endpoint of the Smart Data Platform whenever new measured values for humidity, temperature and battery level are available. This means that there is no fixed transmission interval. A simplified example of a smart home data set is shown in the figure below.
Figure 6: JSON data format example for smart home data
For the case of intelligent lamppost data, the JSON REST interface is applied. The process to send data includes one POST request to retrieve a valid token and a second POST request with the received token in the header to send the actual data sets. Below, a simplified data set for intelligent lamppost data is illustrated.
Figure 7: JSON data format example for intelligent lamppost data
3.2.6 API Catalogue
The API catalogue is a user interface for developers who want to access data of the Smart Data Platform over an API (JSON REST APIs). The API catalogue lists the different APIs that the Smart Data Platform provides. Every accessing system has to register for each of these APIs, and access is granted by the single sign-on service according to data classifications and agreements among the project stakeholders. The requests towards the Smart Data Platform or the API catalogue respectively are secured via HTTPS and access authorization via OAuth2 with JSON Web Tokens. The external system can access the data only if a valid token is included in the request. The data fields of the interface and the request parameters are designed to meet the specific use case requirements, if possible in coordination with the requesting external system / app. Via a “try it out” button, responses with real data can be requested as a practical help for developers to get a feel for how the API works. Furthermore, an example data format and response codes are provided. The figure below shows a screenshot of the SMARTER TOGETHER API catalogue.
Figure 8 : Screenshot of the API catalogue
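Since requests to the catalogue's APIs are only accepted with a valid token, a client may want to check locally whether its JSON Web Token is about to expire before calling an API. The sketch below reads the standard `exp` claim from the token payload without verifying the signature; the function names and the 30-second margin are assumptions, and signature verification remains the task of the platform.

```python
import base64
import json
import time
from typing import Optional


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT (no signature verification)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def needs_renewal(token: str, now: Optional[float] = None, margin_s: int = 30) -> bool:
    """Return True if the token expires within `margin_s` seconds."""
    now = time.time() if now is None else now
    return jwt_payload(token)["exp"] - now <= margin_s
```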
3.2.7 Analysis Dashboard
The analysis dashboard is the central user interface to view the different visualized use case specific analyses. It is firstly linked to the analytics module that generates use case specific data sets based on the data warehouse technologies described above. Secondly, it is linked to the single sign-on service, which manages the different user access rights for the different analyses as certain users are only allowed to see certain analyses.
For the visualization of the analyses (charts, diagrams), the Smart Data Platform uses the open source software D3.js, which is a JavaScript library for generating dynamic and interactive data visualizations in web browsers. It makes use of the widely implemented SVG, HTML5, and CSS standards. Furthermore, the cross-platform JavaScript library jQuery is used to simplify the client-side scripting of HTML. jQuery is a free and open-source software using the MIT License.
The analysis dashboard is divided into four main domains: “Home”, “Thematic Analyses”, “Platform Statistics” and “Key Performance Indicators”. On the “Home”-screen, a map of the pilot area in Munich is shown containing different types of points of interest (POIs) such as intelligent lampposts or mobility stations. When clicking on one of these POIs a details window opens showing master data of this POI (e.g. type of POI, address, name etc.) as well as dynamic data (e.g. last measured PM10 value of lamppost sensor).
Under “Thematic Analyses” the dashboard user accesses the use case specific analyses, such as the historical data of pollution sensor measurements as a line chart or the usage of mobility stations. “Thematic Analyses” is subdivided into the three topics of “Mobility”, “Intelligent Lampposts” and “Energy”.
“Platform Statistics” enables the user to visualize different statistics on the Smart Data Platform performance, such as the number of data sets sent over the data integration and data provision APIs, including indications of API downtimes.
Lastly, the domain “Key Performance Indicators” summarizes and visualizes the key performance indicators for the different project use cases that are needed for the project performance monitoring.
Figure 9 : Analysis Dashboard Start Page_Pilot Area with POIs
3.2.8 Transparency Dashboard
The Transparency Dashboard is a public website that aims at showing citizens which data are collected in SMARTER TOGETHER Munich and how they are being used in the project use cases. It is the “window” into the Smart Data Platform Munich and illustrates transparently which data sources are integrated, how data protection issues are addressed and what the different data are used for.
The Transparency Dashboard shows
▪ which data in the platform are publicly available
▪ what happens with the data within the Smart Data Platform
▪ how these data can be used / accessed
The information on the Transparency Dashboard is divided into the following domains: general project information, energy, mobility, intelligent lamppost and data usage with further information and subsites.
The Transparency Dashboard is a responsive website with no database in the background. No special technologies are necessary apart from a content management system (CMS) to dynamically add or modify text and information. The CMS used for the Transparency Dashboard is WordPress.
The figures below show screenshots of the Transparency Dashboard (http://transparency.smartdataplatform.info) in browser and mobile device resolution.
Figure 10 : Screenshot of Transparency Dashboard (browser view)
Figure 11 : Screenshot of Transparency Dashboard (mobile device view)
4. Data Processing Framework
Whereas chapter 3 presents the different system components, which together form the overall software system, this chapter outlines the framework in which the different data are processed, managed, combined and published. Only together can the software components and the data processing mechanisms constitute a smart data platform.
4.1 Data Classification
As different data sets need to be handled differently in the Smart Data Platform according to the specific requirements towards data security and privacy, each integrated data source receives a data classification along multiple criteria. The data classification criteria were defined by the SMARTER TOGETHER project partners in Munich involved in the design of the Smart Data Platform. The criteria are the following: Integrity and Data Privacy, Data Processing and Data Analysis, Transfer and Allocation of Data, Back-up and Deletion of Data.
For each criterion, one of the four scale levels has to be chosen per data source so that the data processing mechanisms in the Smart Data Platform can be adjusted accordingly. The scale levels have to be chosen by the data owner in consultation with the project stakeholders.
Scale Level 1 Scale Level 2 Scale Level 3 Scale Level 4
Integrity and data privacy
I1: No further data privacy measures necessary
I2: Data has to be anonymized / pseudonymized before further processing
I3: Data has to be anonymized / pseudonymized as well as aggregated before further processing
I4: Data may be stored in SDP´s data warehouse but can only be processed after individually defined criteria
Data processing and data analysis
D1: Data may be analysed and correlated with any other data
D2: Data may only be correlated with other data if no information hereby occurs with which individuals or individual groups can be identified
D3: Data may only be correlated or analysed according to the specific criteria defined for the individual use case
D4: Data may not be correlated or analysed at all
Transfer and allocation of data
T1: Data may be transferred to any user internally and externally
T2: Data may only be transferred to external users under specific conditions
T3: Data may only be transferred to the original data provider
T4: Data may only be provided to predefined internal users
Back-up and deletion of data
B1: Data may be stored at any physical location and as long as desired
B2: Data may be internally stored as long as desired
B3: Data may be stored internally with a time limit after which the data must be deleted
B4: Data may only be stored at predefined physical locations with access protection. A time limit must be provided if needed
Table 1: Data Classification Criteria
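The classification in Table 1 amounts to one scale level per criterion for each data source. A minimal sketch of how such a record could be represented and validated is given below; the class, its field names and the helper method are assumptions for illustration, only the level codes (I1 to I4, D1 to D4, T1 to T4, B1 to B4) are taken from the table.

```python
from dataclasses import dataclass

# Allowed scale levels per criterion, as listed in Table 1.
SCALE_LEVELS = {
    "integrity": ("I1", "I2", "I3", "I4"),
    "processing": ("D1", "D2", "D3", "D4"),
    "transfer": ("T1", "T2", "T3", "T4"),
    "backup": ("B1", "B2", "B3", "B4"),
}


@dataclass(frozen=True)
class DataClassification:
    """One scale level per criterion for a single data source."""
    integrity: str
    processing: str
    transfer: str
    backup: str

    def __post_init__(self) -> None:
        for criterion, allowed in SCALE_LEVELS.items():
            value = getattr(self, criterion)
            if value not in allowed:
                raise ValueError(f"{criterion}: {value!r} is not one of {allowed}")

    def may_be_analysed(self) -> bool:
        # D4 means the data may not be correlated or analysed at all.
        return self.processing != "D4"


# Example: a source that must be pseudonymized and is deleted after a time limit.
classification = DataClassification("I2", "D3", "T2", "B3")
```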
In the following table, it is listed how these different criteria and their scale levels are technically implemented in the Smart Data Platform.
Data Classification Scale Level Technical Implementation in Smart Data Platform
Integrity and Data Privacy
I1: No further data privacy measures necessary
Raw data is stored in SDP´s data warehouse without anonymization / pseudonymisation measures and provided to analytics module
I2: Data has to be anonymized / pseudonymized before further processing
If data has not been anonymised / pseudonymised by the data provider before integration into the SDP, a data converter is applied to anonymise / pseudonymise the data before storing it in the SDP's databases. The anonymization / pseudonymisation criteria are defined together with the data provider.
I3: Data has to be anonymized / pseudonymized as well as aggregated before further processing
In addition to anonymization / pseudonymisation measures, the data will be automatically aggregated to a level to be agreed with the data provider and other use case stakeholders
I4: Data may be stored in SDP´s data warehouse but can only be processed after individually defined criteria
Data are stored in the data warehouse but are only processed according to use case specific and individual processing rules to be defined by data provider and other use case stakeholders
Data Processing and Data Analysis
D1: Data may be analysed and correlated with any other data
Data stored in database is provided to analytics module and 3rd party systems without any restrictions
D2: Data may only be correlated with other data if no information hereby occurs with which individuals or individual groups can be identified
If this scale level is chosen, data is only correlated with other data if this does not produce data relating to individuals. In case of any doubt, all use case stakeholders have to agree to the specific analysis and data visualisation. Since all analyses and data visualisations have to be developed individually, unintended data correlations cannot occur.
D3: Data may only be correlated or analysed according to the specific criteria defined for the individual use case
If this scale level is chosen, analyses are only developed based on this data if all use case stakeholders agree to this analysis and data visualisation. Since all analyses and data visualisations have to be developed individually, unintended data correlations cannot occur.
D4: Data may not be correlated or analysed at all
No analyses will be developed based on this data
Transfer and allocation of data
T1: Data may be transferred to any user internally and externally
The data is accessible over the API catalogue of the SDP to all users. However, a registration at the Smarter Together API is necessary
T2: Data may only be transferred to external users under specific conditions
The API catalogue and the connected Identity and Access Management distinguish between different users. Only approved users with the username and password can access the data over the Smarter Together API. On the Transparency Dashboard it is described under which conditions access to the data can be provided.
T3: Data may only be transferred to the original data provider
The API catalogue and the connected Identity and Access Management distinguish between different users. Only the original data provider can access the data
T4: Data may only be provided to predefined internal users
The API catalogue and the connected Identity and Access Management distinguish between different users. Only internal users with the permission of the original data provider can be unlocked for the specific API. Even the original data provider needs to be unlocked for the API
Back-up and deletion of data
B1: Data may be stored at any physical location and as long as desired
The data is stored at a location free of choice for SDP operator without any further requirements for the lifetime of the project and beyond.
B2: Data may be internally stored as long as desired
The data is stored in the data warehouse of the SDP for the lifetime of the project and beyond. If the data is transferred to a 3rd party system over the Smarter Together API, the handling of the data is the responsibility of the operator of that 3rd party system. In this case, the handling of the data should be determined in a data licensing contract
B3: Data may be stored internally with a time limit after which the data must be deleted
The data are stored in the data warehouse of the SDP with a specific time-to-live attribute, which manages the automated deletion of the data after a validity period to be specified by the data provider
B4: Data may only be stored at predefined physical locations with access protection. A time limit must be provided if needed
Data providers who have chosen this scale level are briefed about the IT security measures and server locations of the SDP. If desired, the data provider can choose a certain preferred server location. The data are stored with a specific time-to-live attribute, which manages the automated deletion of the data after a validity period to be specified by the data provider.
Table 2 : Technical Implementation of Data Classification
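The time-to-live attribute mentioned for scale levels B3 and B4 can be pictured as a per-record validity period after which the record is removed. The sketch below is a plain illustration of that idea; the record layout and the function name are assumptions, and the source does not state how the SDP databases implement the deletion (document databases such as MongoDB offer TTL indexes for this purpose).

```python
from datetime import datetime, timedelta, timezone


def purge_expired(records: list, now: datetime) -> list:
    """Keep only records whose validity period has not yet elapsed.

    Each record is a dict with a `stored_at` datetime and a `ttl` timedelta
    (hypothetical field names).
    """
    return [r for r in records if r["stored_at"] + r["ttl"] > now]


stored = datetime(2019, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "stored_at": stored, "ttl": timedelta(days=30)},
    {"id": 2, "stored_at": stored, "ttl": timedelta(days=5)},
]
# On 10 January only record 1 is still within its validity period.
kept = purge_expired(records, datetime(2019, 1, 10, tzinfo=timezone.utc))
```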
Although each data classification entails a specific implementation in the Smart Data Platform as presented in the table above, this does not mean that these specific data processing mechanisms are triggered automatically. The data classification for each data source helps the SDP operator to know how to manually handle, restrict and configure the data sets, but this manual configuration is a necessary step in the current status of the Smart Data Platform. If, for example, a data owner requires that his or her data be aggregated to a certain level before further processing, he or she can choose the data classification accordingly offline. However, this does not mean that the aggregation is triggered automatically but that the SDP operator enters into a discussion with the data owner and the use case owner to define up to which level the data should be aggregated. After this offline discussion, the aggregation algorithm is implemented or configured accordingly.
In the Smart Data Platform, this configuration is handled via the user interface of the single sign-on service. Each system component or external actor is defined as a user/client in the single sign-on service to which certain roles are assigned. These roles determine which users/clients are allowed to perform which database or system operations. Per user/client, these roles have to be assigned manually by the platform operator. The roles are defined and set by the SDP operator and represent the different system operations available. The roles are in principle divided into a three-level hierarchy, as the figure below shows. A user/client can be assigned a 1st level composite role, which includes several 2nd level composite sub-roles which, in turn, contain the single 3rd level system operation roles.
Figure 12: Role assignment for SDP users/clients
The use of the analysis dashboard of the Smart Data Platform will be briefly illustrated as an example to better understand this process. An analysis dashboard user needs to register before being able to see any visualization of an analysis. After registration, the user receives a basic default role. More advanced roles with which certain analyses can be requested have to be additionally assigned to this user by the platform administrator. If the platform administrator is in doubt whether a newly registered user may be unlocked for more advanced roles, he or she will check with the data and use case owner beforehand. If this user is allowed to see all thematic analyses, he or she can for example be assigned a “Global_CompositeRole” which includes certain composite sub-roles (see figures below).
Figure 13: Screenshot of roles list (extract)
For each composite role, sub-roles can be added or removed until they represent the exact scope and limitations of the role, as shown here for the example of the Global_CompositeRole.
Figure 14: Screenshot of role assignment for Global_CompositeRole
The sub roles, in turn, include the different system operation roles that can also be added and removed. In the screenshot below, we see an extract of the third level system operation roles assigned to the “Energy_CompositeRole” as an example.
Figure 15: Screenshot of role assignment for Energy_CompositeRole
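The three-level hierarchy described above can be read as a small tree in which composite roles expand into sub-roles and finally into system operation roles. The following sketch resolves a role into the operations it ultimately grants. The role names "Global_CompositeRole" and "Energy_CompositeRole" appear in this chapter; "Mobility_CompositeRole" and all third-level names are hypothetical, and the actual configuration is done in the user interface of the single sign-on service rather than in code.

```python
# Composite roles map to their sub-roles; names without an entry are
# third-level system operation roles (hypothetical examples).
ROLE_TREE = {
    "Global_CompositeRole": ["Energy_CompositeRole", "Mobility_CompositeRole"],
    "Energy_CompositeRole": ["energy_read_aggregated", "energy_read_kpi"],
    "Mobility_CompositeRole": ["mobility_read_station_usage"],
}


def effective_operations(role: str, tree: dict = ROLE_TREE) -> set:
    """Expand a role recursively into the system operation roles it contains."""
    children = tree.get(role)
    if children is None:
        return {role}  # a leaf is itself a system operation role
    operations = set()
    for child in children:
        operations |= effective_operations(child, tree)
    return operations


# A user holding the first-level role is granted all operations beneath it.
granted = effective_operations("Global_CompositeRole")
```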
This configuration process of role assignment per user and the involved data processing settings in accordance with the data classification is, as mentioned above, a manual process to be performed by the Smart Data Platform operator.
This manual configuration could in future be replaced by an automated process in which a smart city platform decides “itself” how certain data should be handled and which system operations should be allowed by which components. This could be possible if all data sets directly carry a classification attribute which automatically defines and triggers the role composition and the involved access restrictions for all system components and platform users. If a larger number of use cases evolve and resemble each other in such a way that they require similar system operations, it could be helpful to automate this process. For the research context of SMARTER TOGETHER with a limited number of use cases that are very different from each other, it was decided that the effort for automation would have been too large for the added value it would have provided.
4.2 DGK Data Model and Processing Rules
This report has documented how data needs to be handled in the SDP according to the different data classifications and other requirements set by the Data Gatekeeper concept. One important outcome of the Data Gatekeeper is the data model that documents the different role schemes of various stakeholders for handling data sets. The data model is shown in the figure below and illustrates the different components and roles in the SDP as well as the flow of data through the system.
Figure 16: Data Gatekeeper Data Model
The data model consists of rectangles representing system components and system actors as well as of arrows representing system interactions such as events or information flow. The data model is structured in four sectors: “Inside Smart Data Platform”, “User Roles”, “External” and “Smart Data Systems”.
In order to explain the data model, the flow of data provided by an external data source will be illustrated through the different processing steps of the SDP towards a third party system requesting data from the SDP over the SMARTER TOGETHER API catalogue. The initial data set is provided by a third party data source such as a backend, which is administered and operated by the Supplier's Technical Data Steward. Over an API or file export, the data is imported into the data warehouse of the SDP. This raw data set is immediately processed according to the specific data classification and/or aggregation/pseudonymization mechanisms set in the Data Gatekeeper Registry for this data source. If, for example, an aggregation mechanism is necessary, the raw data is processed accordingly and, from that point on, stored in the data warehouse as a “Derived Data Set”. The initial raw data may be deleted after that process if required by the data classification. The data classification and aggregation framework is set by the data owner after discussion with other stakeholders, especially the “owner” of that specific use case.
The derived data set is then used for visualisation purposes in the analysis dashboard and/or for publication towards third party systems over the SMARTER TOGETHER API. A user of the analysis dashboard is only allowed and able to see the visualisation of that derived data set if he or she is granted access by the access authorization service (SSO). Which users are allowed to see which visualized analyses is administered in the DGK Registry and decided by the data provider and the use case owner. In addition, the access to the SMARTER TOGETHER API is restricted by this framework. External systems or apps can only request the derived data set over the API catalogue if they are unlocked for this specific part of the API. The decision on which external systems can access which data over the SMARTER TOGETHER API catalogue is taken by the owner of the data and the use case owner.
4.3 Functional Library for Data Processing
As some of the data sets to be integrated in the SDP need to be processed according to the specific needs of each data classification, the platform provides a toolset of data processing mechanisms. Data providers and use case owners can choose from this toolset how specific data sets should be processed within the SDP. The platform offers different mechanisms for anonymizing / pseudonymizing data sets. In addition, different options are available for aggregating certain data to meet data privacy requirements. The data analysis and data access functionalities, in turn, do not provide different options to choose from. For these topics, the table rather informs which functions are implemented in the SDP.
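As a rough illustration of the anonymization / pseudonymization mechanisms listed in the table that follows (removing data fields, removing ID digits, creating hash values), a minimal sketch is given here. The function names are assumptions, and HMAC-SHA256 is chosen only as one example of a keyed hash function; the source does not name the algorithm used in the SDP.

```python
import hashlib
import hmac


def remove_fields(data_set: dict, fields: set) -> dict:
    """Drop the given data fields, e.g. those holding person-related data."""
    return {key: value for key, value in data_set.items() if key not in fields}


def truncate_id(value: str, digits: int, mask: bool = False) -> str:
    """Cut off the last `digits` characters of an ID, optionally masking with X."""
    kept = value[:-digits] if digits else value
    return kept + "X" * digits if mask else kept


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a value by a keyed hash (HMAC-SHA256 assumed for illustration)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# The ID example from the table: "D42197334" with three digits removed.
short_id = truncate_id("D42197334", 3)
masked_id = truncate_id("D42197334", 3, mask=True)
```

The same `remove_fields` helper also covers the aggregation by removing subcategories, for example dropping "district" and "street" so that a data set is aggregated to city level.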
Anonymization / Pseudonymization of person-related data
After the integration of raw data into the Smart Data Platform over the JSON REST interface, data can be anonymised / pseudonymised before it is written into the platform's database.
Functionality 1: Removing person-related data fields. For a full anonymization of the transferred data sets, data fields containing person-related data can be removed completely before the data is written into the database. These data fields need to be specified by the data provider and other use case stakeholders.
Functionality 2: Removing ID digits. Before raw data sets are written into the database, the person-related IDs in specific data fields can be reduced by a number of digits to be specified by the data provider and other use case stakeholders. If the data set contains a person-related ID such as “D42197334”, it can be written into the database as e.g. “D42197” or “D42197XXX”. The number of digits cut off needs to ensure that no data can be traced back to a specific person.
Functionality 3: Creating hash values. For data pseudonymization purposes, a hash value can be created for a person-related ID before the data is written into the database. The content of a specific data field is replaced by a new “string” created with the algorithm of the hash function. The created hash value can only be linked to the original data value with the key that was used for creating the hash value. This key can either be deleted or safely stored in a different system.

Data Aggregation
Another set of functions to meet data protection requirements is the aggregation of data, either before writing it into the platform's database or before providing data to the platform's analysis module.
Functionality 1: Remove subcategories of a data set. If a data field contains a category and there are subcategories specifying this category even further, these subcategories can be removed. For example, if the data set contains a location category “city” and the subcategories “district” or “street”, the latter two can be removed so that the data set is aggregated to city level.
Functionality 2: Create an average value for a defined period of time. If data sets contain measured values for an interval of e.g. 5 minutes, but due to data protection guidelines only measured values for e.g. an hour should be processed in the platform, an average value for the desired period of time can be calculated (either before writing the data set into the database or before further processing in the analysis module).
Functionality 3: Aggregate (sub-)categories. Categories of a data set can be aggregated to a “higher” category level. If a data set contains for example the category “district”, the districts “1” and “2” can be aggregated to the new category “1/2” to provide a higher aggregation level.

Access to data
The Smart Data Platform foresees two ways of accessing the data. The first way is to access the visualized analyses of data in the analysis dashboard. The second way is to request data over a JSON REST API, e.g. for integration in applications. In both cases, the user needs to have the access rights to receive certain data. For some data, it might for example be necessary to sign a licensing contract with the data provider. Once all requirements to access certain data are fulfilled and approved “offline”, the Smart Data Platform will grant access to certain analyses or APIs with the implemented multi-client capability.
Functionality 1: Access to visualized analyses with log-in credentials. On the analysis dashboard the user needs to log in and will only be able to access the visualized analyses connected to his access rights, which are to be approved by the use case stakeholders.
Functionality 2: Access over JSON REST API. Access to a specific API will be granted after approval by the use case stakeholders, especially the data provider. The endpoint of the API will be secured via OAuth2 with JSON Web Token (JWT). The access token will be exchanged in the HTTP headers. The access token contains a signed JSON file including the defined access rights of the user.
Table 3 : SDP Functional Library
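The anonymization, pseudonymization and category aggregation functions of Table 3 can be illustrated with a few small helpers. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the choice of HMAC-SHA256 as the keyed hash is an assumption for illustration, since the deliverable does not name the hash algorithm used in the SDP.

```python
from __future__ import annotations

import hashlib
import hmac


def remove_fields(record: dict, fields_to_drop: set) -> dict:
    """Drop person-related fields (or subcategory fields) from a record."""
    return {k: v for k, v in record.items() if k not in fields_to_drop}


def truncate_id(value: str, digits_to_cut: int, mask: bool = False) -> str:
    """Cut the last characters of an ID, optionally masking them with 'X'."""
    if not 0 < digits_to_cut <= len(value):
        raise ValueError("digits_to_cut must be between 1 and len(value)")
    kept = value[:-digits_to_cut]
    return kept + "X" * digits_to_cut if mask else kept


def keyed_hash(value: str, key: bytes) -> str:
    """Replace an ID by a keyed hash (HMAC-SHA256 is an assumed choice)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_category(value: str, mapping: dict) -> str:
    """Map a category to a higher-level category; unmapped values are kept."""
    return mapping.get(value, value)
```

With the ID from the table, `truncate_id("D42197334", 3)` yields “D42197” and `truncate_id("D42197334", 3, mask=True)` yields “D42197XXX”. The same `remove_fields` helper covers both the removal of person-related fields and the removal of subcategories such as “district” or “street”. A mapping like `{"1": "1/2", "2": "1/2"}` reproduces the district example.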
5. Data Exchange in Use Cases
This chapter illustrates how the platform processes different data in the four main use cases of the Munich smart data pilot. Now that the general system and the data processing framework have been explained, this chapter briefly shows the workflow of the SDP in the concrete project use cases.
5.1 Energy
The Use Case “Energy” is about collecting data that measure temperature and air humidity in flats. A measurable direct effect on energy efficiency and energy consumption is not part of this use case.
In the Energy Use Case, two data sources are integrated: first, smart home sensors of the provider Securitas, which measure temperature and humidity in different apartments and buildings, and second, outdoor weather data (temperature and humidity) from a weather station of DWD (German Meteorological Office) located in Munich. In the Securitas backend, all single measurements of the different sensors are bundled in a database. This database is exported by a tool provided to Securitas by the Smart Data Platform operator. The tool sends all JSON files with the current measurements over the MQTT protocol to the SDP, where the raw data is stored in the data warehouse.
The DWD weather data is accessible to third parties (in this case the SDP) as .csv files from the DWD weather data server. As only historical data is needed for the further analyses in the SDP, the file transfer is scheduled only once a day.
The two data sources are used for analyses visualized in the analysis dashboard under the menu item thematic analyses > energy. Only users who are specifically unlocked for the energy analyses can open this menu item.
Figure 17 : SDP Workflow of Use Case Energy
The raw data of these two data sources are processed, analysed, and combined so that they can be presented in the analysis dashboard of the SDP. The screenshots below show two different visualisations of the analysed data. In the first screenshot the temperature measurements of different smart home sensors are visualised throughout a single day. For the same day the outdoor temperature is included based on the DWD weather data for comparison.
Figure 18: Screenshot 1 Energy analyses
Furthermore, a comfort analysis has been implemented. This analysis presents the measurements of a smart home sensor, along the dimensions of indoor temperature, absolute humidity and relative humidity, that lie within the so-called comfort field for indoor climate. Based on the DWD weather data, the measurements can be filtered, for example, to a day with certain outdoor weather conditions such as the coldest day of the year.
Figure 19: Screenshot 2 Energy analyses
5.2 Intelligent Lampposts
The use case of Intelligent Lampposts contains data from different sensors of multiple sensor operators, which are installed at different lampposts across the pilot area. The use case can be divided into the two sub-topics of environmental sensors and traffic sensors.
For all sensor use cases, the data is first sent from the sensor modules to the respective operator backends, where the data is bundled and processed. Over different JSON REST interfaces provided by the sensor operators, the data is transferred (pull) to the Smart Data Platform, where the raw data is stored. Additionally, the static data of the lampposts (address, location, height etc.) is imported as a JSON file provided by the geo data server of Munich.
The different sensor data are processed and combined by the analytics module of the SDP and the resulting data is stored so that it can be requested by the analysis dashboard for visualisation.
In this use case, the lamppost data is also provided to a 3rd party system over the SMARTER TOGETHER API.
Figure 20: Workflow Use Case Lampposts
The JSON REST API over which the Munich Smart City App accesses the lamppost data is provided by the Smart Data Platform via an API catalogue. With a GET request, the app can retrieve data from three different endpoints. The Munich Smart City App has to register for each of these APIs, and access is granted by the single sign-on service according to data classification and agreements among the project stakeholders. Each GET request is secured via HTTPS, and access is authorized via OAuth2 with JSON Web Tokens. The app can access the data only if a valid token is included in the GET request. The data fields of the interface and the request parameters were designed to meet the specific use case requirements for displaying the values in the Smart City App.
Over the “authorize” button, the user, in this case the Munich Smart City App, requests access to the specific API. Once the access is granted real data can be retrieved. The different request parameters that are possible for the API are listed in the API documentation.
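How a client such as the Munich Smart City App might attach the access token to a GET request can be sketched as follows. This is an illustration only: the endpoint URL and the claim names in the usage note are hypothetical, and sending the token as an `Authorization: Bearer` header is the common OAuth2 convention, assumed here because the deliverable only states that the token is exchanged in the HTTP headers. The payload decoding is for inspection on the client side and does not verify the signature. Verification remains the task of the server.

```python
import base64
import json
import urllib.request


def build_get_request(url: str, access_token: str) -> urllib.request.Request:
    """Prepare a GET request carrying the token in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

A request object built with `build_get_request("https://sdp.example.invalid/lampposts", token)` (a placeholder URL) could then be sent with `urllib.request.urlopen`. The decoded payload would contain whatever access rights the platform encodes in the token, for instance a hypothetical `scope` claim.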
The Munich Smart City App uses this data to display the positions of the different Intelligent Lampposts in the pilot area and to show the different measured values of the different sensors installed at the lampposts. Both the current measurements as well as historical value diagrams, e.g. the development of values in the last week, are shown to the app user.
Figure 21: Screenshots from Munich Smart City App with Intelligent Lampposts
Different analyses are calculated for the Intelligent Lampposts use case and visualized in the analysis dashboard, where they can be accessed by dashboard users who are unlocked for the topic of Intelligent Lampposts. The screenshot below shows an example of an Intelligent Lampposts analysis. In this diagram, the user can choose freely among all environmental sensors to display the different measured values (temperature, relative humidity, wind speed, air pressure, wind direction, SO2, CO, NO2, Ozone, PM2.5, PM10) over the specified period of time. Different aggregation levels can be chosen to display single measurements as well as hourly, daily and monthly average values. In order to compare different value types, e.g. temperature with SO2, the diagram can display these values on two y-axes.
Figure 22: Screenshot Intelligent Lampposts analyses
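The hourly, daily and monthly aggregation levels mentioned for this analysis, and the average-value function of Table 3, amount to grouping timestamped measurements into time buckets and averaging each bucket. The following is a minimal sketch. The function name and the `(timestamp, value)` input format are assumptions for illustration and not the SDP's internal data format.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# strftime patterns that truncate a timestamp to the chosen aggregation level
LEVELS = {
    "hourly": "%Y-%m-%d %H:00",
    "daily": "%Y-%m-%d",
    "monthly": "%Y-%m",
}


def average_by_level(measurements, level):
    """Average (timestamp, value) pairs per hour, day or month."""
    pattern = LEVELS[level]
    buckets = defaultdict(list)
    for timestamp, value in measurements:
        # All measurements with the same truncated timestamp share a bucket.
        buckets[timestamp.strftime(pattern)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}
```

Measurements taken every 5 minutes would thus collapse into one value per hour when `level="hourly"` is chosen, which is the case described for data protection driven aggregation.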
5.3 Mobility
In the Mobility use case, different data for the mobility offerings at the pilot area's mobility stations are integrated in the SDP. For this use case, the data is not transferred to the SDP over an API but as .csv files regularly provided by the Mobility use case owner to the SDP operator. These .csv files are initially generated by the operators of the different mobility offerings at the mobility stations, such as MVG for bike sharing or STATTAUTO for carsharing. The data is processed for visualisation in the analysis dashboard.
Figure 23: Workflow Use Case Mobility
The analyses implemented for Mobility are essentially usage statistics of the different mobility services in the pilot area. For each mobility station, it can be analysed how the use of the services has evolved over the last days, weeks or months. Furthermore, the usage data of the different services is combined with weather data from the environmental sensors integrated in the Intelligent Lamppost use case to investigate how different weather conditions influence the use of, for example, bike sharing.
Figure 24: Screenshot Mobility analyses
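Combining usage data with weather data, as described for the Mobility analyses, essentially means joining two series on the date and comparing usage between weather conditions. The sketch below shows one simple way to do this. The function name, the per-day input dictionaries and the temperature threshold are hypothetical and serve only to illustrate the idea, not the analyses actually implemented in the SDP.

```python
from collections import defaultdict
from statistics import mean


def usage_by_weather(rentals_per_day, temperature_per_day, threshold_c):
    """Compare mean daily rentals on cold and warm days.

    Both inputs are dictionaries keyed by date. Days without a weather
    reading are skipped because they cannot be assigned to a group.
    """
    groups = defaultdict(list)
    for day, rentals in rentals_per_day.items():
        temperature = temperature_per_day.get(day)
        if temperature is None:
            continue
        label = "warm" if temperature >= threshold_c else "cold"
        groups[label].append(rentals)
    return {label: mean(values) for label, values in groups.items()}
```

The same join could be extended to other measured values of the environmental sensors, such as humidity or wind speed, by replacing the grouping rule.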
5.4 KPIs
The purpose of the KPI use case is to present the key performance indicators of the SMARTER TOGETHER project in a bundled way in the analysis dashboard, which serves as the central visualisation interface.
For each reporting period of SMARTER TOGETHER, a .csv file with all KPIs for all topics to be monitored is provided to the SDP by the monitoring task leader. Based on these numbers, the overall KPI tables are generated, and based on those tables, different diagrams of KPI development are visualized in the SDP. An exception is the topic of Energy. In this case, the KPIs are not provided by the task leader but are calculated automatically by the SDP, based on data integrated over a JSON REST API from an external database called E-Manager, where all energy-related developments are documented by the owner of the use case Energy.
Figure 25: Workflow Use Case KPIs
For each of the KPI topics of Participation, Energy, Integrated Infrastructure, Mobility, Monitoring, Replication and Other, the respective KPIs are presented in an overview table. For many KPIs, individual diagrams show how the value has developed, and further diagrams show the current status as the percentage of the target value that has already been achieved. All tables and diagrams, including the diagrams of the other use cases described above, can be exported as .csv or .svg files.
Figure 26: Screenshot KPIs analyses
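The status diagrams described for the KPI use case show how much of each target value has been achieved, i.e. the current value divided by the target value, expressed in percent. A minimal sketch of this calculation on a KPI .csv file is given below. The column names `kpi`, `current` and `target` and the semicolon delimiter are assumptions for illustration. The actual layout of the KPI file is not specified in this deliverable.

```python
import csv
import io


def kpi_progress(csv_text: str) -> dict:
    """Return the achieved share of the target value per KPI, in percent."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    progress = {}
    for row in reader:
        target = float(row["target"])
        if target == 0:
            continue  # a percentage is undefined without a target value
        progress[row["kpi"]] = round(100.0 * float(row["current"]) / target, 1)
    return progress
```

Rows with a target of zero are skipped in this sketch. How such KPIs are handled in the SDP is not documented here.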
6. Conclusion
The Smart Data Platform, as an evolution of the former City Intelligence Platform, was designed and implemented to meet all requirements coming from the SMARTER TOGETHER use cases regarding the integration, processing, analysis and exchange of data. The use case Intelligent Lampposts exemplifies the full platform workflow: integrating different lamppost data from various data providers, processing, combining and analysing the data, and, finally, visualising the analyses in a user interface and providing data to a 3rd party over a dedicated Smarter Together API.
A wide range of different APIs has been implemented between data providers and the Smart Data Platform. In half of the cases, the JSON REST APIs have been designed and provided by the SDP, and in the other cases proprietary APIs (also JSON REST) have been provided by data providers for implementation by the SDP. The implementation of the Smart Data Platform in SMARTER TOGETHER has demonstrated that a wide range of technical interfaces and very different data types can be integrated into one system and brought together meaningfully for cross-topic analyses and use cases.
It has turned out that the user interface for visualisation of analyses, the analysis dashboard, is a crucial component of the Smart Data Platform, as this dashboard makes all data processing mechanisms visible and represents an important added value for actual users. Even if the backend services require more development effort, the analysis dashboard is the most important visible part of the platform within SMARTER TOGETHER. As a basis for the actual development process, it can be summarized that considerable effort has to go into designing, discussing and redesigning the analyses. It has proved helpful to work with several loops of screen designs as a development basis, agreed by all involved stakeholders, especially the later users of the analyses. As the challenge lies in creating meaningful analyses based on large amounts of raw data, defining filter options for the user interface, and deciding on a suitable visualisation, this process has turned out to be a very manual one. Based on the experience within the use cases of SMARTER TOGETHER, it seems impracticable to design automated standard analyses.
It can also be summarized that the system architecture and the design of the individual components cannot stand on their own but have to reflect, in their functionalities, the specific data processing framework defined by the various use case stakeholders. The data model and data classification framework of the DGK had to be translated into the specific technical implementations in the Smart Data Platform described in chapter 4. As documented, this is still mainly a manual configuration and not an automated process.
Now that all use cases have been fully implemented, the Smart Data Platform is ready for the operational phase from February 2019 until the end of the project. Apart from small improvements, the efforts for the SDP thus switch from development to hosting and operation of the implemented APIs, services and applications to guarantee a stable flow of data through the system. Furthermore, the Smart Data Platform will be used for the upcoming monitoring and replication tasks.
7. Annex
The concept description “Data Gatekeeper” (DGK) is added as an annex (but as a separate document) to this deliverable. Although the DGK is an independent concept that is applicable to any data platform environment, the Smart Data Platform frequently refers to the DGK when integrating SW building blocks, data classification and the complete data model.
The project partner Fraunhofer IAO designed the DGK concept in close cooperation with Landeshauptstadt München.
Smart and Inclusive Solutions for a Better Life in Urban Districts
Data Gatekeeper
Munich Final Documentation
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 691876
SMARTER TOGETHER - Data Gatekeeper Munich - 22/02/2019 2
Table of Contents
List of Figures
Preamble
Executive Summary
1. Introduction
1.1 General Approach of the Data Gatekeeper
1.2 Structure of the Document
1.3 Use Case Example “Smart Home”
2. Strategic Guidelines
2.1 Legal Regulations
2.1.1 Privacy for Personal Data
2.1.2 Opening Administrative Data
2.1.1 Use Case Reflection
2.1.2 Golden Rules
2.2 Internal Compliance
2.2.1 Use Case Reflection
2.2.2 Golden Rules
2.3 Smart Data Infrastructure Basics
2.3.1 Data Protection Concept
2.3.2 Data Security
2.3.3 Hardware Location
2.3.4 Data Integration, Processing and Analysis
2.3.5 Use Case Reflection
2.3.6 Golden Rules
2.4 Smart Data Strategy
2.4.1 Drivers for Smart Data
2.4.2 The Formulation of a City-specific Strategy
2.4.3 Use Case Reflection
2.4.4 Golden Rules
2.5 Responsibilities
2.5.1 Strategy
2.5.2 Participation
2.5.3 Compliance
2.5.4 Operations
2.5.5 Data Provision
2.5.6 Data Consumption
2.5.7 Use Case Reflection
2.5.8 Golden Rules
2.6 Participation
2.6.1 Use Case Reflection
2.6.2 Golden Rules
3. Smart Data Creation Guidelines
3.1 Requirement Specification
3.1.1 User Goals
3.1.2 Use Case Project Information
3.1.3 Specification Checklist
3.1.4 Agreement Meeting and Approval
3.1.5 Use Case Reflection
3.1.6 Golden Rules
3.2 Data Categorisation
3.2.1 Categories
3.2.2 Use Case Reflection
3.2.3 Golden Rules
3.3 Data Usage Agreement and Licensing
3.3.1 Standard Licences
3.3.2 Individual Agreement or Side Letter
3.3.3 Use Case Reflection
3.3.4 Golden Rules
3.4 Data Classification
3.4.1 Integrity and Data Protection
3.4.2 Processing and Analysis
3.4.3 Redistribution and Modification
3.4.4 Storage and Deletion
3.4.5 Use Case Reflection
3.4.6 Golden Rules
3.5 Quality Gate 1
3.5.1 Elaborate Final Use Case Concept
3.5.2 Final Concept Approval (Q1)
3.5.3 Sign Data Usage Agreement
3.5.4 Use Case Reflection
3.5.5 Golden Rules
3.6 Data Sets and Technical Implementation
3.6.1 Create Data Scheme and Interface Specification
3.6.2 Create Technical Specification
3.6.3 Technical Specification Approval
3.6.4 Implementation and Data Set Creation
3.6.5 Implementation of Data Interface
3.6.6 Develop App or Hardware Solution
3.6.7 Use Case Reflection
3.6.8 Golden Rules
3.7 Data Processing
3.7.1 Anonymisation and Pseudonymisation
3.7.2 Data Aggregation
3.7.3 Golden Rules
3.8 Quality Gate 2
3.8.1 Specification and Implementation Review
3.8.2 Use Case Integration Test
3.8.3 Go Live Pilot Phase
3.8.4 Analysis Test
3.8.5 Final Approval and Go Live
3.8.6 Golden Rules
4. Data Model
4.1 Exemplary Data Model
5. Smart Data Infrastructure
5.1 Basic Architecture
5.2 Data Integration
5.3 Connection with External Systems
5.4 Analysis Dashboard
5.5 Platform Statistics Dashboard
5.6 Transparency Dashboard
6. Conclusion and Outlook
Glossary
Bibliography
7. Appendix
7.1 Legal Regulations
7.1.1 Principles on Privacy for Personal Data
7.1.2 Objectives and Principles of Opening Administrative Data
7.2 Checklist Templates for the Smart Data Creation
7.2.1 Requirement Specification (Epic)
7.2.2 Data Categorisation
7.2.3 Data Usage Agreement and Licensing
7.2.4 Data Classification
7.2.5 Quality Gate 1
7.2.6 Data Sets and Technical Implementation
7.2.7 Data Processing
7.2.8 Quality Gate 2
7.3 RACI
7.4 Categories
7.5 Classification ........................................................................................................................ 138
7.6 Data Model .......................................................................................................................... 142
List of Figures
Figure 1: General Approach for Smart Data on Strategic, Organisational and Technical level .......... 16
Figure 2: Overview of the concept and the structure of the document .......... 18
Figure 3: IT-Components of the Use Case .......... 38
Figure 4: Categorisation of the Use Case .......... 72
Figure 5: Classification of the Use Case .......... 79
Figure 6: Data Model: Inside Smart Data Platform and External .......... 92
Figure 7: Data Model: Smart Data Systems .......... 93
Figure 8: Users relevant for the Smart Data Platform .......... 95
Figure 9: Data Model of a specific Smart Data Platform with two Use Cases .......... 95
Figure 10: Smart Data Platform Layer Architecture .......... 98
Figure 11: Smart Data Platform Layer Architecture and Data Gatekeeper Registry .......... 99
Figure 12: Platform Core Functionalities and their Administration .......... 101
Figure 13: Exemplary analysis of Smart Home Use Case .......... 103
Figure 14: Example of a Transparency Dashboard Prototype for Smarter Together Munich .......... 104
Preamble
The idea to create a Data Gatekeeper concept was born during the early preparation
phase of the EU project “Smarter Together” in Munich. All team members were
convinced that, when dealing with subjects like Smart City, Digitalisation or Smart Data
Analysis in a City environment, the question of how to handle data correctly will
become a fundamental leitmotif in the future. In contrast to an institution that collects,
interprets and sells data for purely monetary business purposes, a City has various
motivations and underlying benefit models that determine how and why it handles Smart
Data.
Very quickly it became obvious that “handling data correctly” includes several
aspects that need to be looked at. Not only the technical and legal aspects, but also
ethical considerations of how a City wants to use data in a “digitalized future” need to be
discussed. In a more digitalized world, owning and handling data increasingly becomes,
from a City Administration perspective, a game-changing lever for the
steering and planning ability of a complex future City. Doing this in a transparent
yet efficient way from the beginning is a key factor in convincing and involving all
potential stakeholders and the citizens in and around a City, who are required to reach
a new level of a future City.
This concept should be regarded as a living document rather than a
complete and conclusive final paper. It is intended as a helping hand
mainly, but not exclusively, for a City Administration’s IT-Strategy and City Planning
management whose tasks are to start, design or implement innovative data driven
Smart City and Digitalization concepts and projects.
In order to discuss the Data Gatekeeper approach as comprehensively as possible, this
document offers several functional viewpoints to cover as many evolving
aspects as possible. These viewpoints should be regarded as impulses to better
understand the interdisciplinary nature of such a concept.
In a first step we look at the topic from a legal and City strategic perspective. In parallel,
and to reflect more practical experience, we continuously describe and discuss a given
real Use Case of the Smarter Together project, offering “Golden Rules” that can be
interpreted as potential best practice Data Gatekeeper experiences. Last but not
least, the IT perspective and the underlying, more technical implementation models are
described in order to offer possible approaches to integrating a Data
Gatekeeper concept into an existing IT-landscape or into a Smart City Data Platform.
A short overview of the underlying Data Model and IT-structures is also provided.
Capitalization is used for references to chapters or the Glossary.
Executive Summary
Urban contribution to and shaping of the development of Smart Cities is important for
keeping up with the digitized industry. Digitalisation is the main driver for the analysis of
urban challenges, as it generates more and more data and accelerates
information-exchange processes. When processing citizens’ data in a Smart City
context, the transparency that recent data privacy debates call for needs to be
ensured.
The role of a Smart City in the creation of Smart Data is to position itself as a trustworthy
partner, install IT-solutions that allow efficient data exchange and analysis, and comply
with national and international law and regulations. Moreover, the sufficient
maintenance and governance of data as well as the prevention of security vulnerabilities
require continuous effort in order to ensure data quality, data privacy and other
compliance issues.
Therefore, this document proposes a holistic concept that contains the strategic
anchoring, the creation of Use Cases as well as the underlying Smart Data Platform
concept, as a Smart City is an interdisciplinary challenge. The so-called “Data
Gatekeeper concept” is a generic concept to be realized in a City, in a
consortium of local towns or in urban environments in a practice-oriented and
comprehensible manner.
A systematic approach was applied in this document to guide cities through the
implementation of organisational structures and a Smart Data Platform. These
guidelines are illustrated in a practical manner with a paradigmatic Use Case
reflection. The example used for the Use Case reflection is “Smart Home”. Data such as
temperature and humidity as well as the battery status is submitted to the Smart Data
Platform in order to analyse the benefits of refurbishment. The data is made
available to the voluntarily participating residents via an app for information and
comparison with other participants.
The Data Model is the essential bridge between the Data Gatekeeper as
organisational concept for Smart Data and the Smart Data Platform. It translates
considerations and specifications of the Strategic and Smart Data Creation Guidelines
into logical IT-oriented structures including all available and selectable options that
characterize the needed components within the Smart Data Platform.
The latter allows an efficient data exchange between City stakeholders and offers
innovative data analysis methodologies. Its architecture is designed including the
Transparency Dashboard as an additional offering for data processing transparency
towards citizens for all Use Cases.
Driving motivations for a City to create Smart Data may be to increase quality of life,
to improve environmental sustainability through fewer emissions or, in the case of the
“Smart Home” Use Case, to support renovation planning. Drivers on the side of companies can
be new revenue opportunities or improvements in efficiency, productivity or decision
making. Cost savings, e.g. in energy, or time savings, e.g. in public transportation
and traffic, are examples of benefits for citizens.
The recommended tasks of a Smart City are to
1. make use of appropriate IT-solutions that convert available data into Smart
Data and introduce a collaborative data platform which allows an efficient
data exchange between the City stakeholders,
2. maintain and govern data sufficiently, so that no data becomes obsolete and no
privacy flaws occur,
3. position itself as a trustworthy partner in the context of data collection
and handle data security and privacy in conformity with society by
o involving many stakeholders by means of co-creation activities,
o building a positive transparency image, e.g. with a publicly accessible
website,
o setting up responsibilities and organisational data governance structures
in order to ensure that rules including approval steps are strictly followed
by the stakeholders,
o complying with national and international law and regulations at any
time,
4. define a Smart Data Strategy including a long-term vision as well as a related
financial budget to ensure backing and sponsorship of the City’s mayor and
municipal council,
5. raise awareness and educate the citizens, as well as internal and external Data
Suppliers, about how and where data should be collected, which analyses and
information evaluations should be generated or expected from the raw data
collected, and to whom the evaluated data will be made available.
The development of Smart Cities is important for keeping up with industrial
digitalization and for not ceding dominance to individual companies. The threshold of
entry into a digitized city is lower if relationships with suppliers already exist and one
can build on existing structures. Actively innovating is important to avoid costly
catching up on missed developments.
Introduction
The expression “Smart City” is used in discussions on urban projects like e-mobility,
energy efficiency, City planning or environmental protection. The term Smart City can
be defined as the provision of all kinds of new innovative services as well as existing
services in a more efficient manner and includes the usage of information and
communication technologies in cities.
“Smart Data” is the key for these innovations. Within the given context the term is
defined as digital information that is filtered, processed, formatted and optimized from
all available data sources (e.g. sensors) in a way that is ready to use and allows more
precise and focused analytics of the use cases. So what makes Smart Data smart is
that it allows a much faster extraction of valid and meaningful information than
“conventional” data.
Driving objectives are e.g. to find answers to challenges like monitoring aging
infrastructure, handling the increase of traffic, improving the urban administration’s
effectiveness and efficiency as well as raising citizens’ quality of life in metropolitan
areas.
Today, a main driver that influences the way problems can be analyzed and
eventually be solved is digitalization. The process is called “digital transformation”
which takes place in industries (“Industry 4.0”), governments (“Smart/Future Cities”) as
well as in the society as a whole.
Especially in very privacy-aware countries such as Germany, social media
websites and internet companies currently tend to have an image of overcollecting
data. Additionally, over the past years many data privacy issues have been raised by
national and European governmental and judicial bodies. They have complained
especially about non-transparent policies (cf. BBC, 2012). If actors and stakeholders in
a Smart City context also collect and process larger amounts of citizens’ data, they
need to ensure outstanding transparency from the very beginning.
Digital transformation is often used to specify and accelerate collaboration processes
between involved stakeholders and their underlying processes of exchanging
information. Therefore, the role of a City in the context of digitalization is rapidly
changing as more and more data from citizens, authorities and private organisations
is being generated.
This implies three main requirements for a future-proof City:
– First, determine how cities position themselves as trustworthy partners in the
context of data collection and how to handle data security and data privacy
issues. An open and transparent position will define how citizens accept or
reject “Smart Data” concepts and proposed “Smart City” solutions in the future.
– Second, make use of appropriate IT-solutions that convert available data (incl.
potentially large amounts of data) into Smart Data and introduce a
collaborative data platform that allows an efficient data exchange between
the City stakeholders and offers innovative data analysis methodologies which
allow for more integrated City planning activities (“Smart City Platform”). Smart
Data in that context can be interpreted as data that is filtered and optimized
from all available data sources in a way that allows more precise and focused
analytics of the originally described topics or use cases.
– Third, cities need to ensure they comply with national and international law and
regulations at any time. Even when not taking into account public visibility,
non-compliance leads to increasing financial penalties, especially with regard to
new EU regulations like the GDPR, and growing mistrust from the public.
To address all three requirements adequately, it is not sufficient to just install
a technical solution or instruct an IT service provider to operate a Smart Data Platform.
If data is not maintained and governed sufficiently, it becomes obsolete. With regard
to technical solutions, security vulnerabilities occur due to outdated software releases. A
Smart Data Platform as a technical solution for a Smart City must therefore be
complemented by a strategic and organisational counterpart ensuring data quality, data
privacy and other compliance issues. A holistic socio-technical solution can be
understood as a “managed data control area” which ensures that the City’s own
trustworthiness rules, given compliance regulations as well as essential data security and
data privacy rules are met at any time.
The present document aims at describing a concept for operating a management
system involving such a ”data control area” including underlying guiding principles.
The concept named “Data Gatekeeper” describes an implementation of
organisational structures as well as a Smart Data Platform for cities, town consortiums,
City districts or urban environments in a practically oriented comprehensible manner.
The described Data Gatekeeper mainly consists of the following aspects:
– Transparent and traceable definition of compliance rules that all input and
output data have to follow strictly. These rules cover the relevant data security
legislations, data privacy aspects as well as individual policies that a City and
other relevant stakeholders might have defined in a co-creation manner as
being indispensable for a secure and reliable Smart Data Platform operation.
– Organisational data governance and workflow structures ensuring that every
stakeholder strictly follows these rules, including the required approval steps.
Furthermore, clear responsibilities shall define who is allowed to change
City-specific components of these rules.
– Well documented principles, Data Schemes and workflows as part of the
IT-architecture that allow a Data Gatekeeper to be implemented in a given City
environment, including the coupling of an IT-data platform.
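The first two aspects can be illustrated with a minimal sketch. The rule names, the field names and the `check_record` helper are hypothetical and not part of the Data Gatekeeper specification; the sketch only shows how each input or output record could be checked against explicitly defined rules before it passes the gate.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical compliance rule: a name plus a predicate over a record."""
    name: str
    is_satisfied: Callable[[dict], bool]


# Illustrative rules only; a City would define its own rules in co-creation.
RULES = (
    Rule("has_use_case_id", lambda r: bool(r.get("use_case_id"))),
    Rule("no_direct_identifier", lambda r: "resident_name" not in r),
    Rule("consent_given", lambda r: r.get("consent") is True),
)


def check_record(record: dict, rules: Sequence[Rule] = RULES) -> List[str]:
    """Return the names of all violated rules (an empty list means 'pass')."""
    return [rule.name for rule in rules if not rule.is_satisfied(record)]


ok = {"use_case_id": "smart-home", "consent": True, "temperature_c": 21.5}
bad = {"use_case_id": "smart-home", "resident_name": "A. Example"}
print(check_record(ok))
print(check_record(bad))
```

Returning the list of violated rule names, rather than a plain yes/no, keeps the decision traceable, which matches the demand for transparent and traceable rules.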
Therefore, the Data Gatekeeper concept plays an important role not only in helping to
protect sensitive data against potential misuse, but also in underlining that reliable
and transparent ways of arranging IT-based Smart City concepts and of handling Smart
Data in general are crucial future success factors for any City.
The following subsections give a brief overview of the general approach of the Data
Gatekeeper concept and outline the document structure.
1.1 General Approach of the Data Gatekeeper
The Data Gatekeeper Concept is a strategic and organisational concept ensuring
data quality, data privacy and other compliance issues. The whole concept aims to
be a generic concept to be realized either in a City, in a consortium of local towns or
urban environment as a tool for the upkeep of a secure data handling reputation.
Data Privacy
In general there are two approaches to addressing data privacy leaks: “a priori” and
“a posteriori” (cf. Li et al., 2016). Compliance management brings both together: it covers
prevention, detection and the elaboration of response strategies (cf. Holzmann, 2016, p.37).
The Data Gatekeeper concept implements an “a priori” approach specifically for
Smart Data in cities.
If the privacy impacts of a particular product or service are to be evaluated, a so-called
Privacy Impact Assessment (PIA) is necessary (BSI 2011). A PIA must always be
conducted on a concrete Use Case level and pursues the goals of ensuring legal
conformity, determining risks and effects, and evaluating protection and
mitigation measures. The quality gates described in the Use Case Creation Process of the
Data Gatekeeper concept address this issue.
In case a City provides administrative data itself for a particular Smart City Use Case,
all guidelines and recommendations for opening governmental data apply (Manske,
Knobloch 2017).
In addition to the above-mentioned aspects, the impact of fundamental elements of the
new GDPR (General Data Protection Regulation) is briefly discussed in this document.
Data Quality
Main principles in the following general approach stem from the concept of Data
Governance, which is an organisational function to ensure data quality (cf. Scheuch,
Gansor & Ziller, 2012). Data quality can be seen as a sub-aspect of compliance, as
some privacy principles and regulations specifically demand that personal data
be accurate, correct, precise and current (cf. EU, 2016; Wang & Kobsa, 2008).
The Data Gatekeeper Concept is a three-tier approach addressing strategy,
organisation and technology. On the strategic level, prerequisites regarding Legal
Regulations and City-internal compliance are taken into account as a basis for the
Smart Data Strategy, responsibilities and participation strategy (cf. Figure 2 and
structure of the document). Especially the City-specific Smart Data Strategy must be
defined once, taking into account all relevant stakeholders including a political
mandate from the municipal council of the City in order to have a clear commitment
on budget and planned activities.
Furthermore, a participation strategy that enhances citizens’ involvement is part of the
concept.
On an organisational level, Smart Data responsibilities and operations are the next steps to
be established according to the Smart Data Strategy (cf. Chapter 2). Those responsible for
Smart Data should realize a continuous improvement based on citizens’ feedback,
analysis reports and Smart Data Platform statistics. Following the Smart
Data Creation processes (cf. Chapter 3), digital smart services based on an
evolving Data Model can then be created in line with the requirements of citizens,
companies and City departments. An important task of the smart services is to ensure
processing transparency, e.g. by a Transparency Dashboard providing data Classification
details publicly to data owners and citizens.
The main focus of the organisational level is on the Use Case lifecycle processes. The Data
Gatekeeper Concept focuses on the most important Use Case lifecycle process,
which is the Use Case Creation Process. Further relevant processes are the
Use Case Update Process (for instance in case a compliance shortcoming has been
identified) and the Use Case Deletion Process. As two Smart Data Use Cases may
share a single Data Set, lifecycle processes of Data Sets are also relevant, even though
they are mostly run as part of a particular Use Case Creation or Update Process.
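Because Data Sets can be shared, deleting a Use Case must not remove a Data Set that another Use Case still needs. The following is a minimal sketch of this bookkeeping, assuming a hypothetical in-memory registry. The class, its method names and the second Use Case ("mobility") are illustrative assumptions and are not taken from the Data Gatekeeper Data Model.

```python
from collections import defaultdict


class UseCaseRegistry:
    """Hypothetical registry tracking which Use Cases reference which Data Sets."""

    def __init__(self) -> None:
        # Maps a Data Set name to the set of Use Cases that reference it.
        self._users = defaultdict(set)

    def create_use_case(self, use_case: str, data_sets: list) -> None:
        for data_set in data_sets:
            self._users[data_set].add(use_case)

    def delete_use_case(self, use_case: str) -> list:
        """Remove the Use Case and return the Data Sets nobody uses any more."""
        orphaned = []
        for data_set in list(self._users):
            self._users[data_set].discard(use_case)
            if not self._users[data_set]:
                del self._users[data_set]
                orphaned.append(data_set)
        return sorted(orphaned)


registry = UseCaseRegistry()
registry.create_use_case("smart-home", ["flat_climate", "weather"])
registry.create_use_case("mobility", ["weather", "traffic"])
removable = registry.delete_use_case("smart-home")
print(removable)
```

In this sketch only the Data Set used exclusively by the deleted Use Case is reported as removable; the shared one stays registered for the remaining Use Case.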
On the technical tier a data architecture is necessary with a specific Data Model
based on the Data Gatekeeper Data Model holding relevant master and transaction
data. Operational smart services finally ensure the optional distribution of data to
various other systems or to a citizen’s smartphone via a service interface.
The following figure illustrates the general approach as a holistic framework for Smart
Data in Cities ensuring transparency and Participation as well as compliance and data
privacy for personal data. The parts of the Strategic Guidelines are highlighted blue.
Figure 1: General Approach for Smart Data on strategic, organisational and technical level
[Figure not reproduced. Diagram labels: Smart Data Strategy; Legal Regulations; Internal Compliance; Transparency; Participation; Communication; Political Support & Drivers; Challenges & Problems (Problem Awareness, Requirements Specification); Solutions & Benefits (Benefit Models, Efficiency Increase, Benefit for Stakeholders); Data Provision; Data Consumption; Stakeholder; Operations; Coordination; Responsibilities; Smart Data Lifecycle (Collect Data, Process Data, Analyze Data); Use Case Creation, Change & Deletion; Change Processes; Data Model (Data Sets, Use Cases, Data Rules, Process Rules, Classification, Compliance, Roles, Decider); Smart Data Infrastructure (Smart Data Platform, ICT Architecture, Processing Transparency, Monitoring).]
Due to research project limitations, this document focuses on only some
aspects of the necessary holistic approach, which would also include continuous monitoring
of compliance flaws according to the concept of compliance management. The focus lies in
particular on the strategic parts to be defined once per City and on the Use Case Creation
Process. Furthermore, necessary key aspects of the Smart Data Architecture are
described, such as the Data Model and the smart services with the Transparency
Dashboard.
1.2 Structure of the Document
In general, the document is structured according to the previously described general
approach. All capitalized terms are described in a Glossary, which is
appended at the end of the document. Figure 2 gives an overview of the structure of the
document at hand.
Figure 2: Overview of the concept and the structure of the document
The document is divided into the following chapters:
The first chapter serves as Introduction, containing a brief description of the motivation
and approach. It also summarizes the structure of the Data Gatekeeper concept.
Chapter 2 describes the underlying Strategic Guidelines. These guidelines comprise
mandatory axioms, which are given boundary conditions that cannot be neglected.
The chapter consists of a description of the given laws and other legal or local
restrictions. Furthermore, aspects regarding a Smart Data Strategy are described as
well as Smart Data Infrastructure, Tasks and Responsibilities and Participation. The
information on these topics supports the responsible project stakeholder in taking into
consideration the principal starting conditions before detailing any Gatekeeper
implementation.
[Figure 2 not reproduced. It maps the document structure: Chapter 1 Introduction (1.1 General Approach, 1.2 Structure of the Document, 1.3 Use Case “Smart Home”); Chapter 2 Strategic Guidelines (2.1 to 2.6); Chapter 3 Smart Data Creation Guidelines (3.1 to 3.8, including Requirement Specification, Data Categorization, Data Usage Agreement and Licensing, Data Classification, Quality Gate 1, Data Sets and Technical Implementation, Data Processing, Quality Gate 2); Chapter 4 Data Model (Exemplary Data Model: “Smart Home”); Chapter 5 Smart Data Platform and Services (5.1 to 5.6: Basic Architecture, Data Integration, Connection with External Systems, Analysis Dashboard, Platform Statistics Dashboard, Transparency Dashboard); Chapter 6 Conclusion and Outlook.]
Chapter 3 describes the Smart Data Creation Guidelines which summarize the
individual variables and preconditions that need to be defined, thought of or brought
together before starting any concrete Data Gatekeeper procedure. The
chapter thus describes the preparation work that is required to articulate an individual Use
Case. Furthermore, each subsection contains a description of an exemplary Use
Case.
Chapter 4 contains the design and description of the Data Model that summarizes the
considerations and specifications from the previous chapters. The Data Model serves
as a steering basis for the Smart Data Platform. The goal of the Model is to visualize a
comprehensive structure including all available and selectable options that
characterize the Data Gatekeeper concept.
Chapter 5 discusses the Smart Data Infrastructure. The chapter contains a description
of necessary technical requirements and gives an overview of the architecture and
Dashboards of an exemplary Smart Data Platform.
The last chapter, Chapter 6, summarizes the findings and provides a Conclusion and Outlook.
1.3 Use Case Example “Smart Home”
The Use Case “Smart Home” was chosen as a representative example in this Data
Gatekeeper concept. It consists of an adapted workflow description of a real Use
Case that was implemented during the Smarter Together project in the City of Munich.
The description contains all relevant elements of the real Use Case implementation.
For simplicity, however, some stakeholder names and minor process details are
not mentioned here.
The main stakeholders involved in this Use Case are:
– Flat inhabitants (to provide raw data and benefit from feedback of measurements),
– External service provider (to initially collect, monitor and store raw data from flats),
– Operator of the underlying Smart Data Platform (to collect additional data and
execute predefined analysis),
– University (to define required information / Data Sets as well as the scientific
analysis scheme),
– City including the required City departments and subsidiary companies. One of the
subsidiary companies, which offers refurbishment consulting and coordinates related
refurbishment projects for the City of Munich, defines the overall Use Case. A “Smart
City Team” of the City IT-department coordinates the complete Use Case from a
technical viewpoint with all relevant City and external stakeholders.
The Use Case “Smart Home” was developed in parallel to the creation and definition
of the Data Gatekeeper and the Smart Data Platform. Therefore we chose an
iterative approach with stakeholders being invited to various workshops. In these
workshops we elaborated the necessary steps and described the common goals, all
mandatory data inputs as well as the required Analysis Dashboards. Although most of
the stakeholders were familiar with the content of a Use Case in general, it took a
stepwise approach to bring all new requirements and thoughts concerning the Data
Gatekeeper and Smart Data Platform into this new Use Case description.
Goal of the Use Case
The Use Case “Smart Home” is about measuring temperature and humidity in flats in
the Munich project area.
The goal of the Use Case is to measure the efficiency of refurbishment activities before
and after the refurbishment. On top of this it offers inhabitants additional information
on their ventilation behaviour via the so-called “feel-good-app” which indicates the
impact of ventilation activities.
Data Sources required for this Use Case are:
– Ongoing temperature and air humidity of individual flats (deriving from external
service provider)
– Address of the flats (deriving from external service provider)
– External weather conditions (deriving from German weather forecast institute,
DWD)
– Information about flat refurbishment status (deriving from City of Munich internal
Database)
For this purpose, about 400 starter sets, each including two temperature/air-humidity
sensors and a base station, were offered free of charge to the flat inhabitants of the
project area. The interested inhabitants were asked to register using a web-based
process. This registration process included a clear description of the planned analysis goals
as well as the individual participant’s consent to the use of their privacy related
data for the data collection and analysis process. Access to an individual
password-protected WEB-APP was then granted to each participating user. The information on
the personalised WEB-APP indicates the individual data measured in the personal flat
as well as the measured average data from all participating flats.
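The comparison shown in the app can be sketched as follows. The sample readings and flat identifiers are invented for illustration, and the actual WEB-APP of the external service provider may compute its averages differently.

```python
from statistics import mean

# Invented sample readings: flat id -> temperature readings in degrees Celsius.
readings = {
    "flat-1": [20.0, 21.0, 22.0],
    "flat-2": [18.0, 19.0],
    "flat-3": [23.0, 24.0, 25.0],
}


def flat_vs_average(flat_id: str, data: dict) -> tuple:
    """Return (mean of the given flat, mean of all per-flat means)."""
    per_flat = {fid: mean(values) for fid, values in data.items()}
    # Averaging the per-flat means gives each flat equal weight, regardless
    # of how many readings it reported.
    return per_flat[flat_id], mean(list(per_flat.values()))


own, overall = flat_vs_average("flat-1", readings)
print(f"own flat: {own:.1f} °C, all flats: {overall:.1f} °C")
```

Showing only the aggregated value of the other flats lets a participant compare results without seeing any other participant's individual measurements.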
The complete process including supply of the starter set, raw data transfer, as well as
the design and provisioning of the WEB-APP was owned by an external service
provider participating in the Smarter Together project, but not by the City
Administration itself.
All relevant data from each flat was transmitted to a central Database of the external
service provider (using the starter kit base-station and an existing Internet access in
each flat).
Only the part of the transmitted raw data which is relevant for the above mentioned
scientific examination was then transmitted from the external service provider to the
Smart Data Platform (which is the central project platform to store, analyse and
distribute all project related data). Another project partner (University) then got access
to the transmitted data and executed their evaluations, also taking into account
additional information deriving from other Data Sources like external temperature and
air humidity and refurbishing status of the affected flats/buildings.
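The forwarding step described above, in which only the analysis-relevant part of the raw data leaves the external service provider, can be illustrated with a minimal sketch. The field names and the sample record are hypothetical, as the actual record schema of the project is not documented here.

```python
# Hypothetical field names: the real record schema of the project is not documented here.
FIELDS_FOR_ANALYSIS = ("flat_id", "timestamp", "temperature_c", "humidity_pct")


def minimise(raw_record: dict) -> dict:
    """Return only the fields needed for the analysis and drop everything else."""
    return {key: raw_record[key] for key in FIELDS_FOR_ANALYSIS if key in raw_record}


# Invented sample record as it might be stored by the external service provider.
raw = {
    "flat_id": "flat-017",
    "timestamp": "2018-11-05T07:30:00",
    "temperature_c": 20.5,
    "humidity_pct": 48.0,
    "resident_name": "Jane Doe",   # not needed for the evaluation
    "email": "jane@example.org",   # not needed for the evaluation
}

# Only this reduced record would be transmitted to the Smart Data Platform.
forwarded = minimise(raw)
```

Using an explicit allow-list (instead of a list of fields to remove) means that any field added later at the source is not forwarded unless it is deliberately approved.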
Findings (GoldenRules) for the Data Gatekeeper:
– As the first and most important step, an underlying business and/or benefit
model has to be defined for each Use Case. It must clearly and comprehensively
define the individual contribution as well as the individual benefit of the
planned outcome for each involved stakeholder, participant and partner as well
as for the public.
– If possible, the business model should pass a professional reality check in
advance in order to avoid useless investments and to repair potential weak
points from the beginning. A main driver of complexity in that phase can be
the necessity to collect and analyse data that is considered “privacy
related”. The more privacy related data is involved, the more complex the
discussions and policies that need to be considered subsequently (see also
“Golden Rules” in chapter “Legal Regulations”).
– If possible, avoid working with any privacy related data in any Use Case from
the beginning. If this cannot be avoided, restrict the privacy related data
to an absolute minimum.
Brief Summary of “Introduction”
Smart City concepts and Digitalisation are big drivers and solution pillars for Cities to
tackle some of their main arising real-world problems. Being able to provide and
analyse Smart Data is one of the levers a City needs to control in the future. One of
the challenging problems in that context for cities will be to find ways to correctly
handle the required raw-data as well as the Smart Data deriving from analysis
processes. Data handling requires not only compliance with the latest EU data privacy
laws, but also consideration of the City’s data handling policies, development of the
desired “data image” of the City, and sensitisation of the City’s organisation on how
to behave correctly in that context. Last but not least, it requires ways to implement
the correct processes in the existing IT infrastructure. The Data Gatekeeper mainly
addresses the IT strategy and City planning management of City administrations. It is
written as a living document for cities. It offers a structured view on several of
these aspects, from a legal perspective to a description of a potential technical
implementation, including concrete experiences and golden rules from a project Use
Case. The concept always focuses on the goal of raising awareness of how data can be
handled in a smart and digitalised City of the future.
Strategic Guidelines
In order to establish Smart Data applications and Use Cases in a City, a strategic
baseline is needed. The following chapter describes aspects that need to be
considered as well as measures that shall be applied initially once per City when a
common Smart Data organisation and architecture is envisioned.
On the one hand, the following strategic concept translates externally given boundary
conditions like EU-law to strategic guidelines for an establishment of architecture and
applications supervised by the City administration. On the other hand, strategic
decisions on for instance the management of citizen data are needed in order to
define the image of the City as it is perceived in public. The following chapter on
strategic guidelines is a general approach on questions and aspects each City should
take into account when establishing a Smart Data Platform and a coherent
organisation. The following subsection therefore answers two main questions with
respective sub aspects:
– What are the prerequisites for using Smart Data?
o Legal Regulations
o Internal Compliance
o Smart Data Infrastructure
– What do we want to define strategically? How to proceed?
o Smart Data Strategy
o Roles and Responsibilities
o Participation
Before the sections describe these aspects in detail, an executive summary briefly
summarizes the key issues of the chapter.
2.1 Legal Regulations
When considering Smart Data in the context of Cities or urban environments, legal
issues in two main dimensions are relevant. First, data privacy for the storage and
processing of personal data or data that refer to personal data is highly relevant.
Second, if City-internal data is also used and provided to the public within the scope
of Smart Data, the opening of administrative data may be an issue.
Therefore, when selecting Data Sets for Use Cases, a distinction must be made
between personal and public data so that each can be treated according to its
protection needs.
“‘personal data’ means any information relating to an identified or identifiable natural
person (‘Data Subject’); an identifiable natural person is one who can be identified,
directly or indirectly, in particular by reference to an identifier such as a name, an
identification number, location data, an online identifier or to one or more factors
specific to the physical, physiological, genetic, mental, economic, cultural or social
identity of that natural person” (GDPR, Article 4).
All other information is public data, which can be made accessible to the general public
and freely used, reused and redistributed by anyone. In contrast to public data, open
data meet a format standard and are released in a way that permits re-use.
Sensitive data can be seen as a special category of personal data: “Processing
of personal data revealing racial or ethnic origin, political opinions, religious or
philosophical beliefs, […] shall be prohibited.” (GDPR, Article 9, 1).
An exception applies, for instance, if the “processing relates to personal data which
are manifestly made public by the Data Subject;” (GDPR, Article 9, 2 (e)). When the
Data Subject communicates and spreads personal data to the public (e.g. via social
media), the person remains traceable, but the data is treated as public data.
This indicates that the assignment of Data Sets to personal or open data is neither
obvious nor final, and it is highly recommended to consult a data protection officer.
In chapter 3.2.1, an example of the distinction between four fundamental
Classification Categories of data types is presented, each with different protection
needs. The next sections provide an overview of personal and open data, especially
with regard to current EU law.
2.1.1 Privacy for Personal Data
As legal compliance is highly relevant for a City’s Smart Data Strategy, the following
subsections describe the principles for handling personal data as laid down by law.
They are mandatory and refer to the EU General Data Protection Regulation (GDPR) of
2016, which results from the European data protection reform (see footnote 1). After a
two-year transition period, the General Data Protection Regulation became applicable
on May 25th, 2018 and replaced the Data Protection Directive (Directive 95/46/EC) from
1995. The regulation updates the principles of the old directive to guarantee people’s
right to personal data protection.
The General Data Protection Regulation applies to everyone established in the European
Union who processes personal data through the use of IT systems or through storage in
filing systems, as well as to everyone who processes personal data of persons in the
European Union (GDPR, Articles 2 and 3).
This section aims to raise awareness of how data protection law affects the handling of
personal data. It is nevertheless recommended to consult a Data Protection Officer, as
further regulations exist, for instance on (artistic) copyright or e-privacy.
1 The European data protection reform was adopted by the European Parliament and the Council of the European Union on April 27th, 2016. The data protection reform package includes the General Data Protection Regulation and the Data Protection Directive for the police and criminal justice sector. More details are provided at https://gdpr-info.eu.
The required aspects of privacy for personal data are outlined below as key principles,
supported by quotes from the GDPR:
Personal Data shall be…
1) “… processed lawfully […].” (GDPR, Article 5, 1(a)). Processing is lawful only if
the Data Subject (see Glossary) has given consent or if the processing is necessary
o for the performance of a contract with the Data Subject or of a task carried
out in the public interest or in the exercise of official authority
o for compliance with a legal obligation
o to protect the vital interests of the Data Subject or of another person
o for the purposes of the legitimate interests pursued by the controller or by
a third party
2) “… limited to what is necessary in relation to the purposes for which they are
processed.” (GDPR, Article 5, 1(c)), i.e. limited
o to what is actually needed
o to the least privacy-invasive option
o to the duration of the processing for which they are necessary
3) “… collected for specified, explicit and legitimate purposes and not further
processed in a manner that is incompatible with those purposes.” (GDPR, Article
5, 1(b)). Processing for any other purpose requires new consent.
4) “… kept in a form which permits identification of Data Subjects for no longer
than is necessary for the purposes for which the personal data are processed.”
(GDPR, Article 5, 1(e)), unless they are processed solely for archiving purposes in
the public interest, scientific or historical research purposes or statistical
purposes (see GDPR, Recitals 156 and 162)
5) “… processed […] in a transparent manner in relation to the Data Subject.”
(GDPR, Article 5, 1(a)) by providing information to the Data Subject (e.g. the
purposes of the processing), by providing information on the actions taken on a
request (free of charge) and by facilitating the exercise of the Data Subject’s rights.
The Data Subject has the right…
o to obtain confirmation as to whether or not personal data are being processed,
and to access the personal data and related information
o to obtain the rectification of inaccurate personal data
o to obtain from the controller the erasure of personal data
o to obtain from the controller restriction of processing (e.g. when the
processing is unlawful or the accuracy is contested etc.)
o to receive the provided personal data and to transmit those data to
another controller (e.g. when the processing is based on consent and this
does not adversely affect the rights and freedoms of others)
o to object on grounds relating to his or her particular situation
o not to be subject to a decision based solely on automated processing,
including profiling.
6) “… accurate and, where necessary, kept up to date.” (GDPR, Article 5, 1(d))
Every reasonable step must be taken to ensure that personal data that are
inaccurate are erased or rectified without delay.
7) “… processed in a manner that ensures appropriate security of the personal
data […].” (GDPR, Article 5, 1(f))
The controller and the processor are responsible for the implementation of
appropriate technical and organisational measures to ensure a level of security
appropriate to the risk. They
o consider the state of the art, costs of implementation, scope, context,
and the risk of varying likelihood and severity for the rights and freedoms
of natural persons.
o consider especially risks regarding accidental or unlawful destruction,
loss, alteration and unauthorised disclosure of personal data as well as
access to transmitted, stored or otherwise processed personal data.
o may use an approved code of conduct or certification mechanism to
demonstrate compliance with this principle.
o ensure that any natural person who has access to personal data does
not process them except on instructions from the controller.
8) processed in line with the responsibilities and statutory obligations of the
controller and the processor, who need to
o erase the data (e.g. when they were unlawfully processed, for compliance
reasons, or when the Data Subject withdraws consent)
o notify the supervisory authority of a personal data breach no later than
72 hours after becoming aware of it
o notify the Data Subject when the breach is likely to result in a high risk to
the rights and freedoms of natural persons, carry out an assessment of the
impact and seek the advice of the data protection officer
o designate a data protection officer (e.g. when the processing is carried
out by a public authority or body)
An overview with a more detailed explanation of each key principle is given in the
appendix.
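The 72-hour notification obligation listed under principle 8 is simple date arithmetic. The following minimal sketch, which is an illustration and not part of the project implementation, computes the latest notification time from the moment the controller became aware of a breach. The function names and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Notification window for personal data breaches (72 hours, as listed above).
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time at which the supervisory authority should be notified."""
    return aware_at + NOTIFICATION_WINDOW


def is_overdue(aware_at: datetime, now: datetime) -> bool:
    """True if the notification window has already passed at time `now`."""
    return now > notification_deadline(aware_at)


# Invented example: the controller became aware of a breach at 09:00 UTC.
aware = datetime(2019, 2, 22, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
```

Timezone-aware timestamps are used so that the deadline does not shift when the systems involved run in different time zones.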
2.1.2 Opening Administrative Data
Opening administrative data means making generally accessible documents held by
the public sector freely available to everyone to use and republish. This is seen as a
fundamental instrument for extending the right to knowledge, which is a basic
principle of democracy.
According to the EU (2003), differences in the rules and practices of the
member states concerning the use and re-use of public sector data/information
prevent the full economic use of this resource. Therefore, a general framework for the
conditions governing re-use of public sector documents is needed in order to ensure
fair, proportionate and non-discriminatory conditions for the re-use of such
information. This idea is supported by the directive 2003/98/EC on the re-use of public
sector information (see objectives in the appendix 7.1.2).
The general principle of the directive states that:
‘Member States shall ensure that, where the re-use of documents held by public sector
bodies is allowed, these documents shall be re-usable for commercial or non-
commercial purposes […]’ (EU, 2003)
The directive includes principles covering the aspects ‘requests for re-use’, ‘conditions
for re-use’ and ‘non-discrimination and fair trading’ (see appendix).
2.1.3 Use Case Reflection
The Use Case “Smart Home” also deals with personal data such as the names and
addresses of the participating inhabitants in the project area. When defining the Use
Case, the new EU data protection law (the so-called “General Data Protection
Regulation”, GDPR) for the handling of personal data and its underlying principles,
such as data minimisation as described above, were considered as a first step.
Finally, the open data strategies of the EU have to be considered.
Within the Use Case “Smart Home”, specific user agreements have to be signed
between the participants and the responsible project partner (Smart Home Partner)
and between the other involved project partners (City, University…) in order to obtain
and secure permission to transfer and process the personal data required for a
meaningful analysis in the context of the measured sensor data.
For the analysis of the ventilation behaviour of a flat inhabitant in comparison to the
energetic refurbishment status of the flat, it is necessary to know the address of each
participant (but not the individual family name).
The involved stakeholders (e.g. the City and the University) need to define the data
required for their analyses in advance, as the new EU law permits collecting and
analysing data only for a predefined and specified purpose. Collecting and storing
other data for potential future evaluation is not permitted without further ado. In
any case, the people concerned need to have insight into the data and agree to its
analysis.
Temperature and air humidity do not seem to be privacy related data. If measured
inside a home, however, temperature can give away whether a person is at home or
not. Such data might not seem directly privacy related, but it can be used to derive
private information. This shows that even in apparently non-critical cases data
collection and handling have to be discussed and defined very carefully. Permission
to use the address for further analysis, however, required a written agreement with
the flat inhabitant. The consent process therefore had to clearly describe and limit
the goals of the analysis and the further usage of the analysed data, and to make this
transparent to the flat inhabitant during the registration process.
2.1.4 Golden Rules
~ Analyse precisely whether any potentially privacy related data is required.
Knowing this from the beginning is crucial for all discussions with Data Security
Officers and for organising and defining any data processing
~ Define the required data transmission interfaces and data formats; in particular,
discuss the project’s policies for secure end-to-end internet transmission with
the data stakeholders.
~ Make information about the measures, goals and kind of data collected in a
Smart Data Platform (SDP) available and easily accessible to all inhabitants,
giving transparent access to the project, e.g. via a Transparency Dashboard,
a webpage that informs all participants about the status of all data used
within the project
~ Investigate whether licenses (legal or commercial) are required for the collected data
2.2 Internal Compliance
Compliance requires the ‘observance of the requirements of the general law and of
rules and regulations imposed by any regulatory bodies to which a firm is subject.’
(Edwards & Wolfe, 2005, p.55)
Compliance stands for conforming to a rule, usually with respect to laws. Besides
binding regulatory compliance it is usually necessary to adhere to further rules,
standards, policies, regulations and other requirements. So there are external rules as
well as internal rules, but not all of them need to be mandatory.
Internal Compliance comprises optional regulations and seals, which are specified by
the City (authorities) itself, e.g. for publicity reasons and for the public image of the
City. When defining a Smart Data Strategy, it is important to know these internal
policies and take them into account.
Internal compliance typically includes rules and requirements with respect to:
– IT strategy
– IT governance
– IT security
– Information security
– Privacy levels
– Risk management
– Objects and their protection needs
– Necessary provisions in order to comply with a certain standard
In general, when setting up a Smart Data Strategy it is recommended to know the
internal compliance regulations of the City or of the involved organisations in order
to build on them and integrate them into technical and organisational decisions.
2.2.1 Use Case Reflection
In the next step the Use Case “Smart Home” was checked against the Munich IT-Strategy
in general and, in particular, against the guidelines for IT security and information
security. In this step the relevant people, such as IT strategists and data security
officers, were involved in the process so that the City-specific guidelines and
regulations as well as the processes required for handling personal data, such as risk
declarations and conformity declarations, were considered.
As all data generated within the Use Case “Smart Home” is stored in external
databases of project partners (Smart Home Platform and Smart Data Platform), no
further processes have to be considered within the City of Munich to guarantee the
safety of the data. The main task of the City was therefore to provide all relevant
documents covering risk management and IT security to the defined partners and to
check whether all these guidelines were covered accordingly. More detailed
information on the IT infrastructure can be found in the following chapter “Smart
Data Infrastructure”.
In the Smarter Together project, the Use Case Responsible is a subsidiary of the City of
Munich and works mainly for the planning office of the City. Its main task is to advise
citizens and companies on energy refurbishment issues.
Although the Smart Data Platform was built in a project environment that is connected
to the existing City IT network only via defined interfaces, all data security aspects
and data classification rules (see also examples in chapter “Data Classification”)
have to be considered, and an underlying risk analysis has to be carried out based on
the existing compliance rules of the City of Munich. The handling of privacy related
information as well as the terms of use of collected data were discussed and finalised
in close cooperation with the partners and the Data Protection Officer.
2.2.2 Golden Rules
~ A stakeholder-specific and comprehensible description of the City compliance
rules that apply to this Use Case must be defined and signed
~ Involvement of the Data Protection Officer of the City (assuring e.g. data
access security in a data center as well as data handling risk minimisation, in
contrast to the Data Security Office taking care of the legal data privacy
aspects) to understand, tighten and agree to the underlying Use Case
requirements from the beginning. (This Golden Rule applies accordingly in the
chapter “Smart Data Infrastructure”, but it is recommended to also involve the
Data Protection Officer as early as possible in the process, once the relevant
stakeholders dealing with the technical equipment are defined.)
Brief Summary of “Strategic Guidelines”
One fundamentally new aspect of future data handling principles is based on the EU
legislation “General Data Protection Regulation” (GDPR), applicable since May 2018.
It contains clear rules on how EU cities (amongst others) need to behave when it
comes to data collection, data analysis and data privacy. The problems that can occur
when dealing with data are often underestimated and might create severe legal
implications during or after the implementation of a Use Case. Therefore, although
the described legal rules seem simple and logical at first sight, any data driven Use
Case (whether data collection, data analysis or opening administrative data) must be
checked and approved by a City’s Data Protection Officer.
A second aspect is to know and strictly follow the City’s Internal Compliance rules.
They might not be as momentous as an EU law. Nevertheless, a City that claims and
introduces innovative new “Smart City” or “Digitalisation” approaches needs to be
able to refer to proven and reasonable rules of its own that are transparent and
known to all involved stakeholders, including the citizens. In most cases, public
awareness of and strict compliance with these rules form the basis of a City’s
positive data handling image, which is seen as a basic necessity when planning or
performing Smart City or digitalisation projects.
2.3 Smart Data Infrastructure Basics
It is expected that, within the next years, a huge amount of new data will become
available. Examples include Floating Car Data, Car to X-Technology and Crowd
Sourced data in the mobility domain, which requires tools to access, store and
evaluate the data and to create services and user interfaces for various user groups.
Therefore a Smart Data Platform has to meet several challenges.
The following subsection summarizes strategic technical aspects that are relevant for
setting up an urban Smart Data solution by the City administration. As the key topic, a
data protection concept is recommended. Guidelines for data security are given for
safeguarding data with adequate technical means. Furthermore, a strategic decision
on hardware location is necessary because it affects which Legal Regulations apply.
Finally, the main recommendation is to establish one central Smart Data Platform per
City or urban environment that implements data integration, processing and analysis.
2.3.1 Data Protection Concept
Due to the complexity of data protection issues and the many prevailing
interdependencies between these topics, it is important to either base the Smart Data
Strategy on the existing data protection concept or to define such a concept in the
scope of the City’s Smart Data Strategy. The concept should be comprehensive and
seamlessly integrated within the organisational Smart Data concept. The most
important fields of actions to consider when developing a data protection strategy
are:
– Backup and recovery: the safeguarding of data by making offline copies of the
data to be restored in the event of disaster or data corruption. The backup process
must also be able to respond to external demands, for example when users request
the deletion of their data.
– Remote data movement: the real-time or near-real-time moving of data to a
location outside the primary storage system or to another facility to protect
against physical damage to systems and buildings. The two most common
forms of this technique are remote copy and replication. These techniques
duplicate data from one system to another, in a different location.
– Storage system security: applying best practices and security technology to the
storage system to augment server and network security measures.
– Data Lifecycle Management (DLM): the automated movement of critical data
to online and offline storage. Important aspects of DLM are placing data
considered to be in a final state into read-only storage, where it cannot be
changed, and moving data to different types of storage depending on its age.
– Information Lifecycle Management (ILM): a comprehensive strategy for valuing,
cataloging and protecting information assets. It is tied to regulatory compliance
as well. ILM, while similar to DLM, operates on information, not raw data.
Decisions are driven by the content of the information, requiring policies to take
into account the context of the information.
All these methods should be deployed together to form a proper data protection
strategy.
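The age-based movement of data described under Data Lifecycle Management can be sketched as a simple tiering rule. This is a minimal illustration under stated assumptions: the tier names and the age thresholds are hypothetical and would have to be defined in the City’s own data protection concept.

```python
from datetime import date

# Hypothetical thresholds: real values depend on the City's data protection concept.
ONLINE_MAX_AGE_DAYS = 90
NEARLINE_MAX_AGE_DAYS = 365


def storage_tier(created: date, today: date, is_final: bool) -> str:
    """Choose a storage tier from the age of a Data Set and its final-state flag."""
    age_days = (today - created).days
    if age_days > NEARLINE_MAX_AGE_DAYS:
        return "offline-archive"   # old data is moved to offline storage
    if is_final:
        return "read-only"         # final data is protected against changes
    if age_days > ONLINE_MAX_AGE_DAYS:
        return "nearline"          # older, still changeable data
    return "online"                # recent, actively used data


# Invented example: a Data Set that is 30 days old and still being updated.
tier = storage_tier(date(2019, 1, 1), date(2019, 1, 31), is_final=False)
```

Running such a rule periodically over all Data Sets would automate the movement between storage types. The order of the checks is a design choice of this sketch: age takes precedence over the final-state flag.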
2.3.2 Data Security
In general, data security is the degree of protection to safeguard an IT system. This
covers all the processes and mechanisms by which information and services are
protected from unintended or unauthorized access, change or destruction.
Data security, including logical security (authorisation, authentication, encryption and
passwords) along with physical security (locked doors, surveillance or access control),
has traditionally been associated with large enterprise applications or environments
with sensitive data. In reality, given the increased reliance on information and
growing data privacy awareness, data security is a concern for all environments.
Logical security includes securing networks with firewalls and running anti-spyware
and virus-detection programs on servers and network-addressed storage systems. No
storage security strategy is complete without making sure that applications,
databases, file systems and server operating systems are secure to prevent
unauthorized or disruptive access to the stored data. Implement storage system based
volume or logical unit number mapping and masking as a last line of defense for the
stored data.
2.3.3 Hardware Location
The location of the servers of a data platform is important because it determines
which regional, national and supranational data privacy regulations apply.
One option is that a platform is exclusively hosted in the premises of the Platform
Provider. Usually a Platform Provider has its own computer center where all platform
components could be hosted. This option would mean that the national data privacy
regulations of the Platform Provider are applicable as well as supranational data
privacy regulations beyond national borders as for example in the case of European
data privacy regulations. The drawback of this locally hosted option would be that the
reliability level is lower than a cloud hosting solution with redundant instances in
different computer centers of cloud hosting providers.
Another possibility would be a cloud hosted solution where the servers of the computer
centers are either located in one specific country or a distributed system with server
locations in different countries. Which data privacy policies apply in this case is
determined by the server locations of the cloud hosting provider and needs to be
considered carefully. Some cloud hosting providers make their server locations
transparent and even allow selecting the specific location of servers on which the
platform would be hosted. In general, it has to be stated that different levels of
reliability can be configured in each cloud hosting service. For research projects it is
usually a lower level than for business projects as a result of cost-benefit considerations.
Cloud hosting services also allow for a dynamic adjustment of server utilisation based
on load-balancing principles in times of higher number of accesses.
2.3.4 Data Integration, Processing and Analysis
In order to establish an organisational and technical Smart Data approach for an
urban environment or City it is recommended to establish a central technical Smart
Data Platform once per City. The platform shall not only store Data Sets, but also import
and integrate them just-in-time from technical interfaces of Data Sources provided by
Data Suppliers (see glossary).
Requirements for such a central infrastructure can briefly be summarized as follows:
– Data Storage and Integration
– Capability for Pre- and Post-Processing for Anonymisation and
Pseudonymisation
– Data management, Aggregation and Linkage
– Several User Interfaces for Data Visualisation
– Connection of analysis modules with the data management system
– Precalculation of coordinated analyses for faster response-time within the
application
– Data Categorisation and Rights Management
– Use Case independent and consistent data provision
– Formulation of specific data formats/shapes for later analysis
– Implementation of control mechanisms in order to uphold data quality
These central technical capabilities are the basis for the organisational concept
described in the following. For more details regarding the Smart Data Platform’s
requirements and architecture refer to chapter 5.
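The pseudonymisation capability listed among the requirements above can be illustrated with a keyed hash. This is a minimal sketch and not the method used in the project: the key, the sample addresses and the normalisation step are hypothetical. Note that a keyed hash yields pseudonymised, not anonymised, data, because whoever holds the key can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier (e.g. a flat address) by a stable keyed hash."""
    # Normalise so that trivial spelling differences map to the same pseudonym.
    normalised = identifier.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


# Hypothetical key: in practice it would be generated randomly and kept secret
# by the party that performs the pseudonymisation.
key = b"example-key-kept-by-the-data-supplier"

p1 = pseudonymise("Musterstrasse 1, Munich", key)
p2 = pseudonymise("  musterstrasse 1, munich ", key)
p3 = pseudonymise("Musterstrasse 2, Munich", key)
```

Because the same address always yields the same pseudonym, Data Sets from different Data Sources can still be linked per flat without the address itself being stored on the platform.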
2.3.5 Use Case Reflection
The Use Case “Smart Home” runs (like all Smart City Use Cases) in a secure and trusted
area controlled by the Use Case owner in order to guarantee all aspects of the above
mentioned framework conditions. The IT infrastructure needed for the Use Case
“Smart Home”, however, was not integrated into the City’s existing IT infrastructure.
For reasons of simplicity and flexibility (during the project phase), all elements were
installed and controlled in the stakeholders’ IT departments and in their specifically
built project environments. All collaboration policies and data handling rules between
all involved stakeholders (excluding only the flat inhabitants) were included in the
General Agreement of the “Smarter Together” project. Having one existing, common
agreement between all stakeholders shortened the individual discussions and avoided
the work needed to set up custom-made agreements. A common agreement between
all stakeholders also underlined the transparency of the Use Case handling in
general.
In Smarter Together the following IT-components have been built up for the Use Case:
Figure 3: IT-Components of the Use Case
– Smart Home-Platform (SHP) for storing Master Data of the participants and
measurement data (temperature and humidity); the server is also the Database
for the so-called “feel good app” operated by the Smart Home Partner
– Smart Data Platform (SDP) for importing and storing the anonymised
measurement data and other Data Sources for energy Master Data of
buildings, for analysis, and for exporting data for external use
– Secure interfaces for importing data, referencing actual geo data from the geo
portal of Munich and providing data for external applications e.g. Smart City
App
All the platforms mentioned above were built by the respective project
partners (external service provider, operator of the Smart Data Platform, City of
Munich subsidiary company). In the next steps all partners discussed and agreed on
the following parameters and definitions:
– All platforms are operated autonomously by the partners (SaaS or Cloud Service),
without installing Smarter Together specific server components on site at the
City of Munich (only interfaces to existing systems such as the Munich geo
portal are used).
– Communication only via defined and secure data interfaces (JSON REST) using
secure data transfer protocols (HTTPS)
– All interfaces, APIs and data handling processes used must be replicable and
based on open standards
– The software designs of all platforms must be scalable and “cloud-ready”
– The system architecture, incl. the Data Gatekeeper Data Model, is designed as an
open “Blueprint” for future reuse.
– All components can be used as a playground during the development phase
(prototyping …)
After defining the above mentioned parameters the companies started implementing
their components and functionalities based on this defined Use Case.
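The interface parameters agreed above (JSON over REST, transported via HTTPS only) can be sketched as follows. This is only an illustration under stated assumptions: the endpoint URL, the payload field names and the function name are hypothetical and do not describe the actual interfaces of the project partners. The sketch only builds the request and rejects non-HTTPS endpoints; it does not send anything.

```python
import json
from urllib.parse import urlparse
from urllib.request import Request


def build_measurement_request(endpoint: str, measurement: dict) -> Request:
    """Build a JSON POST request and refuse endpoints that are not HTTPS."""
    if urlparse(endpoint).scheme != "https":
        raise ValueError("only HTTPS endpoints are allowed")
    body = json.dumps(measurement).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical endpoint and payload, for illustration only.
    request = build_measurement_request(
        "https://sdp.example.org/api/measurements",
        {"flat": "pseudonym-123", "temperature_c": 21.5, "humidity_pct": 48},
    )
    print(request.get_method(), request.full_url)
```

Keeping the transport check in one small function means every component that exports data can reuse the same rule, which fits the requirement that interfaces be replicable.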
2.3.6 Golden Rules
~ Define and sign common technical processes and policies between involved
stakeholders also taking into consideration existing partner infrastructures, rules
and know-how
~ Discuss and plan the seamless integration of new Smart City-related
infrastructures with your IT-department (if required). An overarching new Smart
City platform or application will normally not be physically integrated onto an
already existing City IT-Server. Plan for a reasonably long test phase, running all
new Smart City-related equipment on dedicated new servers, cloud
environments or in a Smart City “sandbox”.
~ It is not necessary to build up the whole smart IT-infrastructure on your own.
Consider starting smart and using existing infrastructure and functionality
from your partners (an early “Make or Buy” decision saves a lot of later
discussions about deep integration into an existing and complex
IT-infrastructure)
~ All components must have user management for access control, defined
interfaces (e.g. JSON REST) and secure data transfer (HTTPS) between all
components and data repositories
2.4 Smart Data Strategy
When dealing with internal and external data, it is most important to ensure the backing
and sponsorship of the City’s mayor and municipal council. This can be achieved by
defining a Smart Data Strategy and requesting a related financial budget. As throughout
the whole concept, the term “City” can also refer to a group of local
cities or towns or to a particular City district. “Strategy” in this context means having
a basic plan for realizing a long-term goal. The term itself is more widely used at
executive level and helps to involve City leaders and to secure their broad support.
On the other hand, a short-term goal should be defined for each Use Case. Benefit
models clarify the intended (not necessarily monetary) profit that all participants,
partners and the public gain from data provision, as well as the benefits stakeholders
gain from data consumption. Establishing the drivers of the Use Cases is part of the
Requirements Specification and forms the basis of the benefit models, which can be
determined in the agreement meetings but are not covered further in this document
(see footnote 2).
The following sections outline potential motivations and drivers for Smart Data. Finally,
a general approach on how to proceed in order to set up a City’s strategy for Smart
Data is described and enriched with examples.
2 A good external reference is st gallen ST deliverable of the « Smarter Together » project.
[Figure: overview diagram of the concept. Labels: Participation; Transparency; Data Provision; Data Consumption; Problem Awareness; Requirements Specification; Collect Data; Benefit Models; Efficiency Increase; Benefit for Stakeholders; Challenges & Problems; Solutions and Benefits; Legal Regulations; Internal Compliance; Smart Data Strategy; Political Support & Drivers; Operations; Responsibilities; Smart Data Lifecycle; Data Model; Smart Data Infrastructure]
2.4.1 Drivers for Smart Data
Why does a City want to become smart? Generally, the City’s field of action may
increase and multiple problems can be addressed in a more elaborate way.
A service and data platform may support the development of a City
district. Other potential reasons why cities get involved in Smart Data and decide to
become active as a Smart Data Platform Provider are:
– Benefit from existing data of Data Suppliers (e.g. a company that delivers flat
temperatures)
– Benefit from Smart Data related to City infrastructure and link several data sets
– Enrich the citizens’/users’ data with other administrative infrastructure data and
use it for subject-matter insight analysis (e.g. on usage of cycle tracks)
More generally, there are drivers for value creation that exist in every City. Those drivers
and ideas indicate how economic value can be generated in a Smart City and help
to create a viable business model with regard to the respective boundary conditions
of every City.
Drivers on the side of the public sector can be:
– Development of the tourism sector and economic growth
– Attract new industries to settle in town
– Sell valuable add-on City information to commercial 3rd party service providers
– (Environmental) sustainability through fewer emissions
– Efficiency improvements within City departments
– Cost savings by offering digital services
– Better and faster decision making
– Tackling problems with increasing traffic volume
Drivers on the side of companies can be:
– New markets and new revenue opportunities
– Efficiency improvements
– Growth in productivity
– Better and faster decision making and more precise planning
Drivers on the side of citizens can be:
– Cost savings in e.g. energy
– Time savings regarding e.g. public transportation and traffic
– Empowerment and engagement
– Value addition, e.g. obtaining additional information from the City
– Increase general attractiveness of City and quality of life
2.4.2 The Formulation of a City-specific Strategy
In order to gain first insights for defining a Smart Data Strategy, it is
recommended to perform a stakeholder analysis and interview key stakeholders
relevant for Smart Data. The targets that are part of the Smart Data Strategy should
be derived from general City targets and should be clearly formulated as a 3-5 item
bullet list. Further, a reference to internal compliance prerequisites, for instance an
IT strategy, should be included.
When defining a strategy for data handling it is important to consider the following
three steps:
– Target definition: A general question for a City operating a Smart Data Platform
is why it should do so and to what extent this supports its core operations. A
reference to the general City strategy may be helpful.
– Strategic Mission Statement: Derived from long-term targets, more concrete dos
and don’ts should be formulated. The sooner possible realisation options are
eliminated, the less effort they cost.
– Plan for Realisation: In order not to lose sight of any aspects, at least rough
milestones at a yearly or quarterly level should be defined.
A City could, for example, define as its targets to cooperate with industry in providing
Smart Data applications and to position itself as a trustworthy partner for citizens’
services. However, those two targets are difficult to harmonize. Hence, for a Smart City
strategy it is all the more important to also define targets for general directives on
desired practices that go beyond compliance with legal regulations. For instance, it
could be legally compliant to use (anonymized) citizens’ data from companies, but not
desired because of a potentially negative image of the City administration. Therefore,
citizens’ trust in their City and the City’s public image could play a role in a strategic
target related to Smart Data and related compliance issues.
Possible areas of target definitions are:
– Exceed existing Legal Regulations (e.g. comply with a seal)
– External partners and internal stakeholders (corporations or only public
organisation)
– Opening administrative data to citizens as part of Smart Data Use Cases
(Legal Regulations see section 2.1)
– New organisational roles and efforts to be spent on employees
– Conditions and licensing (e.g. only use particular license contract)
– Data Classification (only process data with particular category)
– IT infrastructure (e.g. two databases for different purposes, use a particular
system or not)
In the context of this concept, a new Internal Compliance directive could be issued,
e.g. requiring City data services to comply with a particular seal or regulation even if
this is not prescribed by law. Importantly, this directive should be communicated
to employees and optionally to the public.
As an example, a City could decide not to provide any fee-based services or not to
allow the development of smart apps by external companies. A typical directive
related to personal data could be a commitment to citizens to comply with GDPR or
with a seal that demands more than GDPR.
2.4.3 Use Case Reflection
For Smarter Together, the City-specific strategy in the City of Munich is generally
oriented towards the definitions and implementations of the existing E-government and
Open-government activities (eo-gov, www.muenchen.de/eogovernment) as well as
the currently evolving and adapting general Digitalisation Guidelines. One of these
specific strategies is to produce and provide open data for the use of the City as
well as for the benefit of the broader community. Another aspect is to use as little
personal data as possible and to be as transparent as possible towards partners and
citizens. The City of Munich aims to underline its goal and image of collecting and
providing data only for the benefit of the City and its citizens.
To achieve the goal of producing and providing mainly open data, it is necessary to
check which basic data is needed and whether it can be used as open data. The
discussion of, and the contractually binding agreements on, which resulting data should
be classified as open were integrated into the early Use Case definitions.
City-related data needs to be collected, treated and analysed for the wellbeing of
the citizens. The increasing digitalization of all kinds of City-related processes creates
valuable databases that need neutral and non-discriminatory handling and control.
The City of Munich sees its role as managing such data and acting as a neutral
“data gatekeeper” instance.
In this context the City of Munich compliance guidelines provide for the City acting as a
transparent and fair data handling partner with clearly defined and published rules on
how and why the collected data is used to contribute to the City goals (see e.g.
https://www.muenchen.de/rathaus/Stadtrecht/Informationsfreiheitssatzung.html).
One concrete driver-element of the broader Smarter Together Task “Low Energy
Districts” in Munich is to analyse temperature and air humidity data from dedicated
flats in the project area (being the “Smart Home” Use Case). The analysis of this data
has two goals for the City:
– One is to examine how inhabitants of energy-refurbished and non-energy-refurbished
flats behave over time concerning ventilation of their flats in
conjunction with the flat temperature and air humidity. The outcome of this
analysis contributes to improving the efficiency of future refurbishment projects in
the City.
– The other is to inform the inhabitants (via a dedicated WEB-APP) when an
optimal ratio between flat temperature and air humidity is reached and how
energy consumption can be decreased and mould formation can be avoided
by adjusting individual ventilation behavior. Besides the possibility to
contribute individually to an innovative EU research project, this aspect is the
main driver and motivation for flat inhabitants to contribute their data to
the project.
The underlying concrete motivation of the Smart Home Use Case for the City of Munich
is a measurable contribution to the overall City goal of decreasing CO2 emissions and
energy consumption in the City.
2.4.4 Golden Rules
~ Know your City strategy, e.g. in the context of IT, data transparency, E-Gov and
open data policies, and also with regard to comparable backgrounds that build
or confirm the overall City image.
~ Apply each specific Use Case to the benefit of the City strategy
Short Summary of “Smart Data Infrastructure Basics and Strategy”
Although it is clear to any IT-department at first sight how to set up a new data
infrastructure, a Smart Data Platform, including a Data Gatekeeper
approach, requires some additional strategic considerations first. The driver and
challenge of such an additional IT-approach is not the technology itself. Most
questions arise when discussing the City’s initial drivers for a Smart City approach.
What are the underlying Business Models or Benefit Models for a City? What are the
expected outcomes of the various analyses? Who provides data, who owns it and
who administers it? Which departments need to contribute their available data
(often seen as private department treasures) and who might get access to it? Last but
not least, it needs to be discussed whether the Smart Data Platform incl. the Data
Gatekeeper will be installed on one additional City-wide and centralised IT-platform.
The discussion about which technical approach should be chosen (e.g. cloud
versus on-premise) also often arises in this context. Hence, one general activity before
starting any Smart City, digitalization or Data Gatekeeper project should be to design
and approve a clear strategic and infrastructure guideline for Smart City & digitalization
that drafts the main pillars and drivers going forward.
2.5 Responsibilities
City districts, towns or counties are organised in hierarchical structures with particular
roles and corresponding Responsibilities. These structures can be used to
implement a Smart Data Strategy and to harmonize existing roles within the
administration with new tasks and Responsibilities arising from the provision of smart
services.
Required Roles and Responsibilities in the context of a Data Gatekeeper strongly
depend on various aspects of a real City, e.g. how the City is organised, which
department-overarching structures are in place, which know-how is available, how
many people are available to staff a Smart City & Data Gatekeeper team, which
concrete collaboration policies are established in a City, etc.
The following discussion of roles and Responsibilities offers a broad view on which roles
need to be planned in an ideal City. Since every City is different, various
aspects should be considered and their implementation impacts need to be discussed
before any final decision on roles is taken. The goal is not to implement as many
theoretical roles as possible, but rather to find a pragmatic way to integrate the
main required Responsibilities into a City’s concrete situation, so that the
integration complexity of an innovative Smart City & Data Gatekeeper project
is not underestimated over time.
Fundamentally, Smart City is understood as a long-term function which, once
established, brings benefits for cities as well as their citizens without temporal
limitation. It is therefore not regarded as a program or project, since both terms
by definition imply an end point. With that in mind, it is crucial for
municipalities to maintain an overview of the whole transformation process and gain a
solid understanding of upcoming tasks and Responsibilities.
The following concept describes Responsibilities that should all be fulfilled by either the
City or the external stakeholders in order to establish a complete Smart Data
compliance organisation. In exceptional cases it may even be possible for one person
to fulfil multiple Responsibilities, as long as the dual control principle is not affected.
The following Responsibilities are categorized in the fields of Strategy, Participation,
Compliance, Operations, Data Provision and Data Consumption.
2.5.1 Strategy
On a strategic level it is recommended to assign Responsibilities for decision making
and managing.
Strategic Committee
A decision making body is consulted for any strategic decisions and serves as the
escalation level for solving operative conflicts. It acts as the deciding instance for
new Smart Data concepts or Use Cases.
Possible members of the administration for major strategic issues and organisational
design changes are:
– Mayor
– Responsible member of municipal council
– Main City-internal Data Suppliers
– External Data Suppliers, e.g. companies
– Data Protection Officer
[Figure: concept diagram. Labels: Participation; Transparency; Smart Data Strategy; Data Provision; Data Consumption; Communication; Political Support & Mandate; Stakeholder; Operations; Coordination; Responsibilities; Smart Data Lifecycle; Data Model; Smart Services; Analyze Data; Process Data; Change Processes; Smart Data Platform; ICT Architecture; Data Sets; Use Cases; Data Rules; Process Rules; Processing Transparency; Problem Awareness; Requirements Specification; Collect Data; Monitoring; Classification; Compliance Roles; Decider; Use Case Creation, Change & Deletion]
Tasks:
– Escalation of Decisions
– Approve new Smart Data concepts, campaigns or Use Cases
– Decide strategically on new Smart City Use Cases and Smart Data topics
Strategic Manager
Moreover, an additional person responsible for the data strategy is needed. This
stakeholder has to implement the committee’s decisions on what kind of data the City
would like to collect and to publish through the Smart City Data Platform. The Strategic
Manager (who could be the City’s Chief Digital Officer, CDO) must maintain an overview
(e.g. in the form of a list) that records the City’s internal and external data ownership.
She or he can be part of the central decision making body.
He or she should be accompanied by someone fulfilling (or should himself or herself
fulfil) an internal strategic and communicative role that also elaborates concepts for new
Smart Data Use Cases or new features of the Smart Data Platform. He or she should
further moderate across City-internal departments and promote Smart City Use Cases
for new digital City Use Cases.
Tasks:
– Leading person (e.g. CDO or municipal board member) responsible for the
Smart Data Platform
– Main responsibility for sponsoring and the Smart Data budget
– Elaborates Data Set usage agreements with suppliers
– Moderates and conceptualizes Use Cases
2.5.2 Participation
It is important to take responsibility for the management of public relations work and
other communication tasks.
Communication and Participation Manager
This stakeholder is in charge of all external communication, which means informing the
public and drawing public attention to the Use Cases as an essential part of
the transparency principle. This communication raises awareness and increases the
interest and Participation of users, whose integration into the Use Case can be
accomplished by Participation events. Organising and moderating these events is the
main task of a Communication and Participation Manager.
Tasks:
– Communicate Use Cases
– Public relations work
– Organize Participation events
– Continuously accompany the design of the Use Case concept and development
– Moderate and provide users
2.5.3 Compliance
In the field of compliance it is recommended to take responsibility for data protection
on the side of the City and, if necessary, additionally on the side of the broader Use
Case (that could include external stakeholders).
Data Protection Officer
This role is key for the concept and must be staffed. She or he must hold an adequate
legal education, because she or he has to prove that all legal requirements and
privacy laws are met in each particular case. However, as this cannot be carried out by a
non-technical person alone, she or he has to have good moderation skills
and collaborate with the operational roles of Platform Administrator, Data Analyst and
Service Developer. This might be challenging because of the multiple sources of data in a
Smart City environment, which taken together might jeopardize the privacy of its citizens.
Full transparency on data and its future use should be prioritized in any case.
Personal data must be handled cautiously, and sensitive data that could conflict with
privacy issues must be anonymized. The person in this role therefore acts in the
interest of the citizens’ privacy and of the law. The role approves
the workflow in principle but has no access to any privacy-related raw data.
With regard to the Use Case Creation Process, the Data Protection Officer receives
the completed form (see appendix). On the basis of the form, she or he will be able to
understand the intended procedure and verify it accordingly. If there are no conflicts
or privacy issues, she or he will approve the Use Case and the form can be forwarded
to the Use Case Responsible for the final approval. If there are conflicts or privacy
issues, she or he needs to describe the problem(s) and either decide what measures
need to be taken or suggest possible alternatives in order to ensure that the Use Case
is legally acceptable. The Use Case then needs to be revised and checked again
against data protection standards. This process continues until she or
he can guarantee that the Use Case complies with all relevant data protection law and
approves it on behalf of the City.
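The review loop described above can be sketched as a small routine. This is a hedged illustration, not the project's implementation: the class `UseCaseForm`, the status strings, the callbacks `find_issues` and `revise`, and the `max_rounds` limit are all assumptions introduced here (the text itself only states that the process continues until approval).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UseCaseForm:
    """Hypothetical stand-in for the completed Use Case form."""
    title: str
    processing_steps: List[str]
    status: str = "submitted"
    review_log: List[str] = field(default_factory=list)


def data_protection_review(
    form: UseCaseForm,
    find_issues: Callable[[UseCaseForm], List[str]],
    revise: Callable[[UseCaseForm, List[str]], None],
    max_rounds: int = 5,  # assumption: a safety limit, not part of the concept
) -> UseCaseForm:
    """Repeat check and revision until no privacy issues remain."""
    for _ in range(max_rounds):
        issues = find_issues(form)
        if not issues:
            # No conflicts: the form can go to the Use Case Responsible.
            form.status = "approved_by_dpo"
            return form
        # Conflicts: record the described problems and request a revision.
        form.review_log.extend(issues)
        form.status = "revision_required"
        revise(form, issues)
    return form


def find_issues(form: UseCaseForm) -> List[str]:
    """Toy check: measurements must be anonymised before storage."""
    if "anonymise measurements" not in form.processing_steps:
        return ["measurements are stored without anonymisation"]
    return []


def revise(form: UseCaseForm, issues: List[str]) -> None:
    """Toy revision: add the missing anonymisation step."""
    form.processing_steps.insert(0, "anonymise measurements")


if __name__ == "__main__":
    result = data_protection_review(
        UseCaseForm("Smart Home", ["store measurements"]), find_issues, revise
    )
    print(result.status, result.review_log)
```

Keeping the described problems in `review_log` mirrors the requirement that the Data Protection Officer documents the issues before the Use Case is revised.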
Tasks:
– Decide on necessary data processing steps in order to prevent possible legal
violations
– Ensure compliance with Legal Regulations, e.g. on data privacy
– Responsible for legal aspects and pre-check of final approval of data provision
– Ensure compliance of internal and external data usage for the whole data
platform and specific Use Cases
2.5.4 Operations
Smart Data Operations comprise the Responsibilities for platform administration, analysis
and service development. These tasks can either be assigned to a newly founded
department or outsourced to an external party.
Platform Administrator
Data Security and reliability of the Smart Data Platform and related systems have to be
guaranteed. In this concept, this is ensured by two roles. The Platform Administrator,
for example, connects new Data Sets technically to the platform and provides
support to the person responsible for data on the supplier side. He or she administers
the technical systems Analysis Dashboard, Transparency Dashboard and Platform Statistics
Dashboard (see chapter 5).
Tasks:
– Ensure a working data platform for all stakeholders
– Collaborate with Data Protection Officer for Data Security and privacy issues
– Maintain access control following the “need to know” principle
– Provide internal and external support
Data Analyst
Any person concerned with insights from data or aggregated data inside or
outside the City administration could become a Data Analyst. Typical eligible persons
may be employees of, for instance, the mobility or other subject-matter departments in
a City or district administration. Analysts may use any created Data Sets and
visualisations provided that a respective purpose of use has been granted by the Data
Subject (see glossary) or owner.
Tasks:
– Access various Data Sets and analyze them for granted purposes of use
– Provide Data Analysis results to the Use Case Responsibles
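The rule that analysts may only use Data Sets for granted purposes of use can be sketched as a simple check. The names used here (`DataSet`, `granted_purposes`, the purpose strings) are hypothetical and only illustrate the idea; they are not the Data Gatekeeper Data Model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class DataSet:
    """Hypothetical Data Set with the purposes of use granted for it."""
    name: str
    granted_purposes: FrozenSet[str]


def may_analyse(data_set: DataSet, purpose: str) -> bool:
    """Allow analysis only if this purpose of use has been granted."""
    return purpose in data_set.granted_purposes


def accessible_data_sets(data_sets: Iterable[DataSet], purpose: str) -> List[str]:
    """Names of all Data Sets an analyst may use for the given purpose."""
    return [d.name for d in data_sets if may_analyse(d, purpose)]


if __name__ == "__main__":
    catalogue = [
        DataSet("flat_climate", frozenset({"refurbishment_research"})),
        DataSet("cycle_track_usage", frozenset({"mobility_planning"})),
    ]
    print(accessible_data_sets(catalogue, "refurbishment_research"))
```

A check of this kind defaults to denying access: a Data Set without any granted purpose is never returned.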
Service Developer
The person in this role is responsible for preventing data security flaws
on the application/hardware level (compare section 2.3.2). If the hardware or
software app involves user interaction, e.g. with citizens, he or she needs additional
competences in user experience, usability engineering and interaction design.
Otherwise an additional person responsible for conceptual user interface design and
evaluation may be needed.
Tasks:
– Develop App or Hardware solution
– Prevent security flaws on application level
– Ensure availability of Use Case and consult City
2.5.5 Data Provision
In the field of data provision, the roles of Data Owner and Data Administrator arise.
They are described in the following.
Data Owner
The Data Owner is responsible for the Use Case’s Data Sets. He or she is the project
manager on the side of the supplier organisation (e.g. a company) and hence involved
as a stakeholder.
He or she bears the responsibility and is therefore required to sign the form, containing
all the information listed in the previous chapter, at the end and thus approve the Use
Case. Therefore, this role should be taken by an executive.
Tasks:
– Main Responsible for data
– Defines the general or specific purposes of use for which data may be used
Data Administrator
This responsibility primarily involves technical tasks. She or he may
for instance be responsible for an application programming interface (API) provided
to the Platform Administrator. He or she ensures that the data is provided according
to the contract and monitors this process.
Due to her or his technical know-how she or he should be involved in the Use Case
Creation Process, especially when discussing the technical parts and filling them
in on the form.
If another person is already responsible for this, it is in any case necessary that
someone with technical expertise and insight verifies the details given (in the form) to
ensure that they are correct and that the Use Case can be implemented as described.
Tasks:
– Technically supply data to the City data platform
– Provide technical information in Use Case Creation Process
– Supply support to City and Use Case developers regarding data
– Finally approve data provision
2.5.6 Data Consumption
In the field of data consumption, persons from City departments, companies or citizens
can elaborate their own Use Case, become a Use Case Responsible, request
particular smart services and thus profit from the results of the Use Case – the Smart
Data.
Use Case Responsible
An internal responsible person is needed for every Use Case, whereby it is possible that
one person is the owner of more than one Use Case. She or he is in charge of the Use
Case and is therefore required to sign the completed form.
The tasks should be taken by a person involved with the topic and the other
stakeholder(s), such as a City department manager.
Tasks:
– Responsible for subject-matter aspects of a Smart City Use Case
– Ensure running operation and provide support
– Analyze usage and use data for City-internal or external purposes according to
usage agreements
If the Use Case that is to be run on the Smart Data Platform is a City-internal project,
the responsibilities may be taken over by persons within the municipality. In general,
a person can have the same or another responsibility in different Use
Cases. However, especially once many Use Cases are run on the platform, it is
sufficient to staff the Data Owner from the respective administrative departments.
Especially for municipalities, which are often understaffed and often lack financial
resources or technical competences, it is recommended to fall back on external Data
Suppliers and solutions.
If data for a Use Case is provided by a City-external supplier, there must be an
additional data compliance approver from the external organisation who supervises
the published data and thus guarantees that it complies with relevant data protection
law from the other organisation’s perspective.
The glossary contains explicit examples on how these listed tasks may be allocated to
specific roles within the Smart Data Platform.
2.5.7 Use Case Reflection
Although the previous chapter discusses the Responsibilities extensively and in
detail, it must be clear that not all of these theoretical tasks, roles and
Responsibilities can be realised without restriction in every City. Each City initially has
its own (grown) rules and hierarchical realities. Nevertheless it is important to know that
any new or innovative Smart City project approach will challenge the existing structures.
(Perhaps in some cities it is even one of the main tasks of a Smart City project to reflect
on and challenge existing out-of-date processes, propose new ways of thinking or
break up structural silos within the organisation.)
In most cases a Smart City is an overarching project approach that requires support
and commitment from various (if not all) established departments in parallel.
Top-down support and backing is therefore the first and most important lever to obtain
involvement and real commitment among the various operative departments of the City.
In Munich an interdisciplinary team staffed by all relevant departments of the City was
established from the very beginning, creating the needed common rules and goals
for the project collaboration. As all participants are experienced and highly motivated
specialists with a good knowledge of the standard City processes, all described
Responsibilities in the City were covered. In some cases (also in order to break up silos)
a combination of Responsibilities was assigned to individual staff members.
The Use Case „Smart Home“ needs to be integrated into the existing processes and
policies of the whole City. In some cases, however, new roles had to be defined (and
staffed) to cover new processes within the project, such as the co-creation process (see
the dedicated description in this document).
Within the Use Case “Smart Home” the following main roles were defined:
– Flat inhabitant
– External Service Provider
– University
– Operator of the Smart Data Platform
– City
Understanding this chapter and its impact was a very important prerequisite for
the definition of the Use Case.
First, the stakeholders and their project-specific roles and Responsibilities, as well as their
motivation to participate, had to be discussed and clarified. Even in such a rather
simple process the number of relevant stakeholders was remarkable:
The Flat Inhabitant is the owner of the raw data and shares them with the project on a
voluntary basis. He/she has to agree to share the raw data and any other
information needed for the scientific analysis.
The External Service Provider owns the process of collecting the data, defines the
contract incl. all legal aspects with the inhabitant, designs and runs the web app and
transmits the data needed for the scientific analysis to the operator of the project
database. A data processing supply contract with the operator of the data platform
has to be negotiated, as well as a Classification of the transferred data that determines
the usage within the Smart Data Platform (SDP).
The University, as scientific analyst of the data, had to sign an agreement that describes
the procedure and goals of the analysis and determines the legal usage of the
analysed data.
The Operator of the SDP had to sign a contract that defines the legal and technical
usage of the transferred and analysed data as well as the complete service of the
platform.
The City (forming one internal Smart City team including participants from all relevant
departments and subsidiaries) acts as responsible and very visible overall project
partner. It had to supervise and steer the complete end-to-end process. The City is
interested in the analysed data for its future Smart City replication activities. Moreover, in
any digitalization project it is very important for the image of the City to comply with
all laws and City-internal rules and to communicate with the citizens and
project stakeholders as transparently as possible.
2.5.8 Golden Rules
~ A clear definition of all stakeholders in advance, including their motivation and
roles, simplifies the discussion and interaction during the project.
~ The formation of an interdisciplinary “Smart City Team” including all relevant
City departments, strongly backed by top-down support (e.g. by
installing an upper management Smart City steering committee incl. the mayor
or his/her deputy), is one of the (perhaps even the) most powerful levers to
overcome potential and established communication barriers between the grown
departments of a City. Without open and honest inter-department
communication based upon common goals, it is unlikely that an innovative
Smart City driven project can be executed successfully.
2.6 Participation
Before starting any Smart City project that requires data to be generated and
evaluated, the responsible project lead / stakeholder has to describe which raw
data needs to be retrieved or collected and where it should come from. In this context
it needs to be determined which data evaluation or information analysis is requested,
in which format the outcome needs to be made available, and for whom.
Such an overall description of goals and items is called a „Use Case“.
In the future, Use Case descriptions in the context of a Smart City will increasingly
depend on data input coming from sensors. These sensors might be placed in private
or public space to automatically provide raw data and even aggregated data
(depending on the sensor). Sensor data can comprise, for example, traffic counts or measurements
of air pollution. Such data allows, for example, more focused urban development planning in
dedicated focus areas or supports other Use Case goals.
[Figure: overview of the Data Gatekeeper concept. Recoverable labels: Smart Data Strategy, Political Support & Drivers, Legal Regulations, Internal Compliance, Responsibilities, Participation, Transparency, Operations, Smart Data Lifecycle, Data Model, Smart Data Infrastructure, Challenges & Problems, Solutions and Benefits, Problem Awareness, Requirements Specification, Collect Data, Data Provision, Data Consumption, Benefit Models, Efficiency Increase, Benefit for Stakeholders]
The installation of sensors in public space (e.g. to detect available parking places in a
certain district) and the new public services that may result from it can cause a high
degree of uncertainty or scepticism, e.g. among citizens, especially if they are not well
informed about the nature and goals of the sensor installation. Therefore, and
irrespective of necessary legal requirements or City-internal policies, it is
recommended to actively involve the affected citizens at an early stage of a Smart
City project in order to achieve the best possible local Participation. This can be
done by co-creation workshops for basic elements of a Smart City (mobility solutions,
energy efficiency, etc.) or more detailed co-creation workshops, e.g. for the use of
sensors. It is also helpful to organize new Participation formats like barcamps or to bring
individual questions to specific target groups in order to get new and fresh ideas as
input for the co-creation workshops. Young people and students, for example, often take
part in govjams, and the local ICT community is interested in hackathons.
Even in case target groups are not directly or personally affected by a data collection,
data analysis or data visualisation (e.g. in Use Cases that deal with “local
meteorological data”, “anonymous traffic counting” or “general utilisation of mobility
hubs”) the organisation of appropriate information measures is key to allow for a
successful Smart City project.
A so-called “co-creation process” can be helpful to concretise or challenge goals
addressed by the Smart City project. In the context of the Data Gatekeeper concept
the co-creation process is understood as a number of successive workshops with
relevant stakeholders from different areas of expertise, ranging from potential
data providers, City administration personnel and industry partners to citizens and
local residents. The content of these workshops goes beyond information, discussion and
consultation, and includes collective idea generation sessions and the collaborative
development of concrete solutions/recommendations for the implementation. It is
necessary to receive input about which solutions are acceptable, but also to get an
idea about which kinds of data collection are not tolerated by local stakeholders. The
overall workshop goal is to uncover diverse stakeholders’ views on data security and
related controversies. Furthermore, it is important to determine which scenarios
consisting of appropriate technology, local measures and planned visualisation of
data shall be further developed and which are not suitable for the specific context.
Thus, the workshops need to be individually designed according to a local context
(topic, location, legal frameworks, timeframe, planning practices etc.). The outcome
of the stakeholder involvement should be an increased mutual understanding,
substantial contribution to and acceptance of a planned Smart City project, or a part
of a project.
2.6.1 Use Case Reflection
For the Use Case „Smart Home“ it is important to involve the participants very early in
the process, because Smart City projects are not only about technical integration but
also about social integration and collaboration with the citizens.
Because the required functionality of the Use Case Smart Home was
already fixed, it did not make sense to perform any co-creation process
beyond the normal information process. In the following example we therefore describe
a comparable Participation process as part of the co-creation process that was
conducted in the affected City district of Munich. In this other Use Case (dealing with
the sensors in the intelligent lampposts) we offered several “creative workshops” and
barcamp sessions to define desired functionalities.
Several months before the planned implementation phase, the local citizens of the
project area were guided towards the topic of “Smart Home” within the following
activities:
1) Public events introducing the topic broadly and sensitising potentially concerned
groups, e.g. inhabitants of the complete project City district and, specifically,
inhabitants of the block of flats to be potentially refurbished
2) Additional meetings with interested citizens in the project meeting-hall
demonstrating the functionality at a demo wall showing all needed
components of the Smart Home
3) Info booth and discussions with citizens at the local district festival conducting
the campaign “become a smart home pioneer”
In more detail, the local citizens of the project area were guided towards the topic of
“Smart Home” within the following process:
1) Meetings and discussions in the district lab presenting the system with slides and
demo equipment
2) Additional meetings with interested citizens demonstrating the functionality at
a demo wall showing all needed components and the GUI (feel-good-app) of
the Smart Home, including the following aspects:
o Live demo of Smart Home
o What is a Smart Home and what is my benefit?
o Presentation of additional Smart Home components (security …)
3) Info booth and discussions with citizens at the local district festival conducting
the campaign “become a smart home pioneer”
o Discussions and demo of the Smart Home
o Photo-session for the children “Smart Home Pioneer”
o Presentation, discussion and documentation of the outcomes
The following describes the Co-Creation Workshops that were organised by Smarter Together
Munich for the innovative lamppost sensors approach. In close co-operation with the
City's project stakeholders, the Technical University of Munich (TUM) defined and
conducted the series of workshops described below.
The professorship for Participatory Technology Design is based at the Munich Center
for Technology in Society and the TUM Department of Architecture. In the context of
Smarter Together it has been in charge of enabling instances of co-creation in support
of the tasks mobility, energy and technology. Customized co-creation
processes were therefore designed for several Smarter Together solutions in order to make sure
that possible public concerns and requirements are incorporated into the
planned infrastructural enlargements. In principle everybody is invited to contribute
to the co-creation processes; however, they are not designed for the involvement of
a large number of people and hence are no substitute for broad public
engagement. Rather, the goal of the co-creation process is to establish an ongoing
collaboration between City officials, concerned residents and other relevant civil
society actors throughout the ideation, planning and implementation phases of the
respective solution.
A significant element for the success of co-creation processes is the openness of all
responsible stakeholders to disclose certain components of the planned projects to
additional stakeholders. Co-creation does not merely mean collecting and evaluating ideas,
but allowing for the co-design of concrete aspects within a project.
In the following example a co-creation process is described that was conducted in
Munich:
The local citizens of the project area were guided towards the topic of “sensors”
within the following process:
1) Public events introducing the topic broadly and sensitizing potentially concerned
groups:
o Panel discussion on the use of gamification principles for Participation
and sustainable urban planning
o Moderated exhibition: What are „smart services“? Do I need them? And
what do they need from me?
2) First co-creation workshop to make the smart lamppost topic accessible,
dissolve strong hierarchies of expertise and stimulate collaboration
o Short Input on the planned measures
o Playful group activity on the questions: What is a sensor and what can it
do?
o Presentation, discussion and documentation of the outcomes of the
game
3) Second co-creation workshop to develop ideas for useful sensor-based services
o Take home task: Find an everyday situation where a sensor-based service
could be helpful
o Group activity based on previous workshop to define desirable and
undesirable services (i.e. impact on privacy or quality of stay, usefulness,
etc.)
o Presentation, discussion and documentation of the outcomes
4) Third co-creation workshop to develop recommendations to be incorporated
into the planned open call for sensors
o Site visits for “reality check” of imagined services
o Expert-input from legal scholar on data protection and privacy in smart
cities
o Collective formulation of recommendations concerning 1) technical
requirements, 2) functional requirements, 3) “no go” criteria, 4) data
ownership, open data and licensing as well as 5) transparency
requirements.
All data collection that local citizens define as “no go” during the workshops shall be
bindingly described as such in the related Use Cases. All other data-related items can
be included in the Use Cases, provided that potential boundary
conditions for future implementations are maintained. An example of a boundary
condition is that dedicated sensors may be used to collect data only if
dedicated technical or organisational guidelines can be guaranteed and are
documented in a transparent way. Continuous communication/collaboration on
identified and not yet identified issues is highly recommended.
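The rule above can be sketched as a simple review step. The following Python snippet is only an illustration under stated assumptions: the `DataItem` structure, the function `review_items` and all example item names (including the "no go" entries) are hypothetical and do not come from the Munich workshops.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """One data-related item proposed for a Use Case (hypothetical structure)."""
    name: str
    boundary_conditions: list[str] = field(default_factory=list)
    conditions_documented: bool = False


def review_items(items, no_go):
    """Split proposed items into rejected, accepted and pending lists.

    - Items on the citizens' "no go" list are always rejected.
    - Items with boundary conditions are accepted only once those
      conditions are documented in a transparent way.
    """
    rejected, accepted, pending = [], [], []
    for item in items:
        if item.name in no_go:
            rejected.append(item.name)
        elif item.boundary_conditions and not item.conditions_documented:
            pending.append(item.name)
        else:
            accepted.append(item.name)
    return rejected, accepted, pending


# Illustrative values only, not workshop results.
no_go = {"video recording", "audio recording"}
items = [
    DataItem("video recording"),
    DataItem("anonymous traffic counting",
             boundary_conditions=["no storage of raw images"],
             conditions_documented=True),
    DataItem("parking space occupancy",
             boundary_conditions=["sensor cannot identify number plates"]),
    DataItem("air pollution measurement"),
]
rejected, accepted, pending = review_items(items, no_go)
```

Keeping a separate "pending" list reflects the idea that an item is not refused outright but waits until its boundary conditions are guaranteed and documented.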
2.6.2 Golden Rules
~ A Smart City without a reasonable social integration or Participation process is
not really smart
~ The design of processes needs the involvement and acceptance of the employees,
partners and citizens
~ An innovative and professional Participation or Co-Creation process sometimes
cannot yet be performed by existing City stakeholders. The special skills required
for such a process are often only available from dedicated agencies or
University departments and need to be built up step by step by City
employees.
Short Summary of “Responsibilities and Participation”
At first glance, discussions about “Responsibilities” or “Participation” do not seem to
have much to do with the described Data Gatekeeper item. But the more one
thinks about the implications of what a Data Gatekeeper approach within any Use
Case really means for an innovative and data driven Smart City environment, the
more the question arises of how a city needs to change or adapt old behavior, also
from an organizational perspective, to be able to respond to the upcoming challenges
in the “new data driven world”. Any Smart City approach comprising the
described Data Gatekeeper approach always includes both technical and social
integration aspects.
Often the established roles in a given City organization and the traditional interworking
between departments are not designed to support “data driven benefit models”.
Interdisciplinary teams need to be set up and well supported in their new roles from a
“top down” perspective. The City’s own role in enabling structured and honest citizen
Participation processes often has to be designed from scratch too.
What matters is not the theoretical question of what a perfect team should look like, but the
pragmatic approach of challenging outdated collaboration traditions in a city and
allowing, trialling and optimizing more “Use Case driven collaboration methodologies” for a
future, data driven Smart City.
Smart Data Creation Guidelines
Smart data results from a Smart Data Platform by collecting and processing data from
several separate Use Cases. Therefore, the main aspect of Smart Data creation is the
creation of Smart Data Use Cases. The following chapter describes a detailed process
that needs to be initiated at the beginning of every Use Case creation and completed
in full once per Smart Data Use Case.
The following subsections describe the workflow of the creation of a concrete Smart
Data Use Case, which includes the following necessary steps:
– Create a requirement specification
– Classify each Data Set
– Define a data usage contract/agreement
– First Quality Gate and approval by all Use Case stakeholders
– Describe Data Sets and consider technical implementation issues
– Choose adequate data processing methods
– Second Quality Gate with technical acceptance by all stakeholders
A checklist for each step is attached to the document.
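The workflow above can be sketched as an ordered list of steps in which the two Quality Gates only open once every stakeholder has approved. This is a minimal sketch, not the platform's implementation: the class `UseCaseWorkflow` and the stakeholder names in the usage part are assumptions made for illustration.

```python
# Step names follow the list above; the boolean marks a Quality Gate.
STEPS = [
    ("Create a requirement specification", False),
    ("Classify each Data Set", False),
    ("Define a data usage contract/agreement", False),
    ("First Quality Gate and approval by all Use Case stakeholders", True),
    ("Describe Data Sets and consider technical implementation issues", False),
    ("Choose adequate data processing methods", False),
    ("Second Quality Gate with technical acceptance by all stakeholders", True),
]


class UseCaseWorkflow:
    """Tracks the progress of one Use Case through the ordered steps."""

    def __init__(self, stakeholders):
        self.stakeholders = set(stakeholders)
        self.position = 0  # index of the next open step

    @property
    def current_step(self):
        return STEPS[self.position][0] if self.position < len(STEPS) else None

    @property
    def finished(self):
        return self.position == len(STEPS)

    def complete(self, approvals=()):
        """Close the current step; a gate needs approval from every stakeholder."""
        if self.finished:
            raise RuntimeError("workflow already finished")
        name, is_gate = STEPS[self.position]
        missing = self.stakeholders - set(approvals)
        if is_gate and missing:
            raise PermissionError(f"{name}: missing approval from {sorted(missing)}")
        self.position += 1


# Usage with hypothetical stakeholder names.
wf = UseCaseWorkflow(["City", "University", "Operator"])
for _ in range(3):
    wf.complete()
try:
    wf.complete(approvals=["City"])  # the gate stays closed
    gate_blocked = False
except PermissionError:
    gate_blocked = True
wf.complete(approvals=["City", "University", "Operator"])
```

Modelling the gates as ordinary steps with an extra approval condition keeps the sequence in one place, so that each Use Case passes through it exactly once.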
3.1 Requirement Specification
As soon as the responsible bodies have initiated the Use Case creation, the
requirements specification starts as the first activity of the Use Case Creation Process.
When creating a Use Case, several aspects need to be considered and specified
beforehand. On the one hand, classical requirements engineering questions are
relevant, such as user goals, preconditions and other Use Case
descriptions. On the other hand, first considerations for data Classification are
required. The following subsections give an overview of these aspects, including a brief
specification checklist for validation and an agreement meeting.
3.1.1 User Goals
A Use Case defines procedures and interactions between actors and/or systems to
achieve a specified goal. Therefore, a Use Case description includes information
about the goal that should be achieved and the activities and steps that need to be
taken in order to achieve it. Furthermore, it specifies the technical requirements
regarding the system used.
Every Use Case needs a detailed description of the analysis purpose(s), its underlying
motivation and goals as well as the requested results, including their presentation.
Different roles may have different goals within the Use Case.
3.1.2 Use Case Project Information
In addition, it contains comprehensive information of all needed steps and
requirements to get a solid overview of the Use Case’s project in order to be able to
understand the context completely.
[Figure: Chapter 3, Smart Data Creation Guidelines. Process steps 3.1 to 3.8 with the labels Requirement Specification, Data Categorization, Data Usage Agreement and Licensing, Data Classification, Data Sets and Technical Implementation, Data Processing, Quality Gate 1 and Quality Gate 2]
In effect, this means that a Use Case defines all required data (sources) and specifies
the data streams and processing steps up to the final requested result(s). It also means
that a Use Case is always project based.
Knowing the objectives of the data analysis in detail and at an early stage
makes it easier to determine whether data from other sources are needed.
A further benefit is the ability to ensure data security and data privacy. To
achieve that, the privacy Classification level (see chapter 3.4) needs to be specified
in advance for every Use Case according to its protection needs. On the
one hand, this level shows whether special protection is necessary due to privacy risks; on
the other hand, it reflects further conditions that were agreed on.
Describing the exact purpose and approach from the beginning simplifies discussions
about data handling, data Classification, data exchange, technical interface
requirements as well as possible data licensing, data security and privacy. Detailed
documentation further allows errors to be detected in advance and makes it possible
to respond accordingly. Moreover, when personal data is handled, the law prescribes
detailed documentation, as described in chapter 2.1.
3.1.3 Specification Checklist
A Use Case should therefore contain:
– A basic Use Case description including:
o An identification number
o A unique name
o Analysis purpose and motivation/business model
(what should be analyzed and why)
o Task assignments: How to achieve the goal
(indicating every single step and the persons involved)
o Analysis results (output)
o A precise description of the data that is needed
o The time period
o Where applicable: Design ideas of the analysis
o Where applicable: Exceptions or special cases
o Where applicable: Specific requirements or special conditions
– Drivers for the Use Case (see 2.4 strategy):
o Benefits from data provision of all participants, partners and the public
o Benefits from data consumption of all participants, partners and the
public
– Information on Project Stakeholders and Use Case specific Responsibilities:
o Definition of contact persons
o Assignment of required tasks
o Where applicable: Definition of further required tasks
o Where applicable: Assignment of new task(s)
– Technical Requirements:
o The kind of Data Set (e.g. data base/sensor) and the location
o Data format(s)
o Interfaces for the import and export of the data (API access)
o Time and rate of exchange
o Where applicable: A description of further specific requirements or
special conditions (modalities)
– Suggestion for Data Processing and Exchange
o Type of contract for data sharing/usage
o A Categorisation (see chapter 3.2) of the data involved
o Potentially necessary Pre- and Post-Processing measures for
Anonymisation and Pseudonymisation
All this information should be set down in a written agreement. The creation and use
of a standardized form is recommended, which should be mandatory for
every Use Case.
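A standardized form of this kind lends itself to an automatic completeness check. The sketch below is one possible way to do this; all field names are hypothetical shorthand for the checklist entries above, and the optional "where applicable" entries are deliberately left out of the required fields.

```python
# Field names are hypothetical; they mirror the mandatory checklist entries above.
REQUIRED_FIELDS = {
    "description": ["id", "name", "purpose", "task_assignments",
                    "analysis_results", "data_description", "time_period"],
    "drivers": ["benefits_data_provision", "benefits_data_consumption"],
    "stakeholders": ["contact_persons", "task_assignment"],
    "technical": ["data_set_kind_and_location", "data_formats",
                  "interfaces", "exchange_time_and_rate"],
    "processing": ["contract_type", "categorisation"],
}


def missing_fields(form):
    """Return 'section.field' names that are absent or empty in the form."""
    missing = []
    for section, fields in REQUIRED_FIELDS.items():
        content = form.get(section, {})
        for name in fields:
            if not content.get(name):
                missing.append(f"{section}.{name}")
    return missing


# A first draft with illustrative values (the identifier and format are made up).
draft = {
    "description": {
        "id": "UC-001",
        "name": "Smart Home",
        "purpose": "Measure the efficiency of refurbishment activities",
    },
    "technical": {"data_formats": ["CSV"]},
}
gaps = missing_fields(draft)
```

A non-empty result could serve as the agenda for the agreement meeting, since it lists exactly the points that still need to be specified.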
3.1.4 Agreement Meeting and Approval
As soon as a first draft of the specification checklist has been filled out, the responsible
person or bodies initiate and moderate a Use Case agreement meeting. The relevant
stakeholders are defined within the Tasks and Responsibilities (chapter 2.5).
All kinds of data analysis queries have to be specified and confirmed by all
stakeholders in a discussion so that legal requirements can be checked in the next steps.
A clear benefit model, taking into consideration all stakeholders (data providers,
analysis beneficiaries, others), needs to be defined.
In a next step the technical feasibility needs to be approved. Here, especially the
proposed data formats and interfaces are evaluated. Finally, a legal check of the
Requirement Specification is carried out.
In the scope of this concept, basic information on data classification as well as the allowed
purposes of use (incl. approval of open data, restrictive use or exclusive use of data
for defined target groups etc.) is defined. It needs to be determined which
granularity of information should be made open to the public by means of the
Transparency Dashboard (chapter 5.6). This conveys transparency to the
public, which helps to establish the city or urban district as a trustworthy partner.
3.1.5 Use Case Reflection
For any Use Case, including „Smart Home“, it is important to define clear and
complete requirements from the very beginning. Otherwise there is a risk of incomplete
solutions and expensive rework afterwards. All requirements and possible data-usage
restrictions from all stakeholders should be defined at this point.
The overall goal of the use case “Smart Home” is to measure the efficiency of
refurbishment activities before and after the refurbishment and give inhabitants
additional information on their ventilation behaviour via the so-called “feel-good-
app” which indicates the impact of ventilation activities.
Therefore about 400 starter sets, each including 2 temperature/air-humidity sensors and
a base station, were offered “free of cost” to the flat inhabitants of the project area.
The requirement specification of the Use Case “Smart Home” was basically divided
into the following processes:
– Data import from the City’s own “Energy Master Database” into the Smart Data
Platform: energy-related master data on buildings in the district are
prepared and imported into the SDP to obtain the master information needed on flats
and buildings whose owners have been advised and which might
be refurbished. The classification (usage restrictions of the data) was defined.
– Data import from the Service Provider’s „Smart Home Platform“ into the Smart
Data Platform: sensor data (temperature and humidity) are collected
in selected flats and transferred via the base
station to the Smart Home Provider, which prepares the data, makes
them anonymous and transfers them to the SDP. The classification (usage
restrictions of the data) was defined.
– Data import from „Deutscher Wetterdienst” (DWD, German Weather Forecast
Service) into the Smart Data Platform to receive the external temperature and
air humidity conditions on a regular basis.
– Analysis of the required indicators for the EU reporting: the SDP imports data and
provides several functions for maintaining, analysing, presenting and
exporting data; the analysis algorithm is based on specifications worked out by
the Technical University of Munich (TUM).
3.1.6 Golden Rules
~ Glossary definition at the beginning – always keep in mind that a Use Case is
not necessarily a familiar tool for the involved City departments and other
stakeholders. The Use Case content should translate between the customer’s
requirements and the future IT-solution.
~ Clearly understanding and describing the underlying department needs, as well
as the requested analysis demands, might take several steps for all involved
people. A reasonable time schedule should be considered for this work.
~ The more precisely all requirements are defined from the beginning, the less
work and “misunderstanding” arise during the implementation phase.
~ A clear and detailed description of all required analysis results (“what should the
output for the involved departments look like, what is the intended output
format, etc.”) is a key element for a successful implementation.
3.2 Data Categorisation
Data categorisation is understood as the assignment of data privacy and security
levels to a Data Set stemming from one Data Source. A categorisation can also
include economic or other important aspects that might restrict or open a data set.
This is necessary due to compliance reasons that result from legal or City-internal rules,
which are listed above (see chapter 2.1 and 2.2).
One possible procedure consists of the following two steps:
- First, Data Sets are assigned to previously specified
categories according to their protection needs. The protection needs of each
Data Set result from legal requirements and demands laid down by the Data
Owner or Data Provider (Categorisation).
- Second, the Data Set is rated in 4 Classification Dimensions (levels: 1–4).
Labelling data before it enters the Smart Data Platform ensures legal and safe
processing.
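The two-step labelling can be illustrated with a small data structure that is validated before a Data Set is admitted. This is a sketch under assumptions: the names `DataSetLabel` and `admit` are hypothetical, and because the four Classification Dimensions are not named in this section, they are kept as an unnamed tuple of four levels.

```python
from dataclasses import dataclass

LEVELS = range(1, 5)  # categories and dimension levels both run from 1 to 4


@dataclass(frozen=True)
class DataSetLabel:
    """Label attached to a Data Set before it enters the platform."""
    data_set: str
    category: int                          # step 1: Categorisation (1-4)
    dimensions: tuple[int, int, int, int]  # step 2: Classification levels (1-4 each)

    def __post_init__(self):
        if self.category not in LEVELS:
            raise ValueError("category must be between 1 and 4")
        if len(self.dimensions) != 4:
            raise ValueError("exactly four Classification Dimensions are expected")
        if any(level not in LEVELS for level in self.dimensions):
            raise ValueError("each dimension level must be between 1 and 4")


def admit(label):
    """Gate at the platform entrance: only validly labelled Data Sets pass."""
    return isinstance(label, DataSetLabel)


# Illustrative label for an open weather Data Set.
label = DataSetLabel("weather data", category=1, dimensions=(1, 1, 1, 1))

try:
    DataSetLabel("sensor data", category=5, dimensions=(1, 1, 1, 1))
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```

Making the label immutable (`frozen=True`) means that a later change of Category or Classification has to be expressed as a new label, which keeps such changes visible.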
__________________________________________________________________________________
Remark: In this Data Gatekeeper concept we chose a two-step approach to raise
awareness of how data Classification can be done in detail. We first describe 4
possible Categories, followed by 4 Classification types, each with an additional 4-step
level rating approach. Please keep in mind that this procedure is an example and not
the only possible approach. Any reasonable existing methodology can be fine, as long
as it is used consistently within a City or a project.
If the available raw data is not usable for an analysis, e.g. in the context of
the existing data privacy laws and/or city policies, applying pre- and post-processing
algorithms (like anonymisation or pseudonymisation) to the data (e.g. at the data
provider or within the Smart Data Platform) might allow changes in the Category and
Classification of the data. A solid anonymisation algorithm is often a prerequisite for
the analysis of data that originally was not allowed to be used (e.g. due to privacy
constraints). Applying this in close cooperation with the data security
/ data protection officer can be a very powerful method to overcome potential
privacy barriers without violating or even touching any privacy law or policy.
Introducing e.g. innovative anonymisation algorithms into a Smart Data Platform can
change the complete categorisation and classification scheme that originally was
required by the data providers. The present version of the Data Gatekeeper concept
does not examine these potential dynamic aspects in depth. Possible Categorisation
and Classification changes facilitated by robust and reliable algorithms of a Smart
Data Platform need to be examined in more detail in future extensions of the Data
Gatekeeper concept.
__________________________________________________________________________________
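As an illustration of such a pre-processing step, the following sketch aggregates readings per group and suppresses groups that contain too few flats. It is an assumed, deliberately simple technique and not the algorithm used by the Smart Data Platform; the function name, the threshold `k` and the sample readings are hypothetical. Small-group suppression alone does not guarantee anonymity, so any such step would still need to be agreed with the data protection officer.

```python
from collections import defaultdict
from statistics import mean


def aggregate_with_suppression(readings, k=5):
    """Average readings per group and drop groups with fewer than k flats.

    readings: iterable of (group, flat_id, value) tuples.
    Returns {group: mean value}; no flat identifiers leave the function.
    """
    values = defaultdict(list)
    flats = defaultdict(set)
    for group, flat_id, value in readings:
        values[group].append(value)
        flats[group].add(flat_id)
    return {
        group: round(mean(vals), 2)
        for group, vals in values.items()
        if len(flats[group]) >= k
    }


# Hypothetical temperature readings: (building, flat, degrees Celsius).
readings = [
    ("building A", "flat 1", 21.0),
    ("building A", "flat 2", 22.0),
    ("building A", "flat 3", 23.0),
    ("building B", "flat 9", 19.5),  # only one flat: suppressed for k=3
]
result = aggregate_with_suppression(readings, k=3)
```

The output contains only group averages, which is the kind of transformation that might justify a lower Category than the raw per-flat data.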
Following this exemplary approach (see Annex Data Categorization and
Classification), the following section describes the four Categories from open data to
personal data. Depending on the individual use case or the Categorisation scheme
used in other projects, the process description can vary.
3.2.1 Categories
In the given context, a distinction is made between four fundamental Categories of
data types, claiming different protection needs:
- category 1: open data
The main focus lies on the prerequisites for passing on the
data and on the respective user agreements, licensing and further agreed
conditions. Data in this category is not subject to any privacy restrictions,
meaning that it contains no personal or person-related data. This data
has the lowest protection need of the four Categories.
- category 2: non-personal data but not open (organisation internal)
Data in this category cannot be unequivocally classified as public data, but is
not subject to any privacy restrictions. This means it does not contain any
information about persons that requires protection and consequently cannot be used to
violate individual personality rights.
Because the data have not (yet) been released/marked as open data,
there might be restrictions regarding their use, processing, distribution or
storage. Topics that need to be considered within this category therefore
include usage and distribution rights and agreements as well as licenses and
further restrictions.
- category 3: non-personal data but restricted
Data belonging to this category cannot be unequivocally classified as personal
data, but carries a (potential) risk that under special circumstances it
can be attributed to persons (or to other restricted areas with a certain degree of
risk potential) and therefore needs to be treated accordingly.
This means that data in this category does not contain any direct personal or
privacy-related information. It is however possible that, when linked or evaluated
with other information, the data allows individual information to be extracted,
such as the behaviour of small groups, or even conclusions about an individual.
Furthermore, data in this category can be structured in such a way that
there is a certain (individually specified) risk of misuse, even without any
person-related information being involved.
When processing this kind of data, it needs to be verified and ensured that the
processing does not allow any kind of traceability to individuals or micro groups.
If this is not possible, the data is either not allowed to be processed within the
Smart Data Platform or it needs to be treated as personal data, and hence
compliance with all privacy regulations needs to be ensured.
- category 4: Personal data
Personal data is highly sensitive and confidential data and hence has the
highest protection need. When processing personal data, the responsible or
assigned stakeholder is obliged to create a detailed written documentation of
the reasons for the processing including possible risks and all taken measures to
ensure data privacy and security. If it is not absolutely necessary for the
fulfilment of a use case, personal data should not be processed. In some cases,
however, it may be necessary to process personal data, which is possible if the
principles in chapters 2.1 and 2.2 are taken into account and adhered to. There
are different ways to transform (personal) data within a platform
so that it can no longer be attributed to an individual. The measures that can
or need to be taken are described in chapter 3.7.
For more details see Checklist Template in the Annex.
3.2.2 Use Case Reflection
All data used within the Use Case was first analysed and categorised to clarify the
terms of licensing and terms of use. Data providers often do not initially have a clear
picture of which data Classification they need to choose. They may also be
uncertain about how their data should be classified across the whole end-to-end
process and would therefore restrict its usage to a minimum. For that reason, we
consider it helpful to offer a first orientation on how data can be categorised. We
used the Categorisation discussion as preparation for the subsequent data
Classification. The Classification is a much more precise tool than the Categorisation,
but it requires a basic understanding of how data processing in the Smart Data
Platform is organised.
Therefore the following three steps have to be done:
– Analyse Categorisation of data
– Check terms of licensing (see also chapter 3.3)
– and / or check compliance (terms of use)
Within the Use Case Smart Home, Categorisation, licence checking and compliance
checking were needed for:
– Master Data from the Geo Portal Munich
– Master Data of buildings and refurbishment information deriving from the City's
“Energy Master Database”
– Measurement data from flats (sensors)
– Weather data from DWD3
3 Deutscher Wetterdienst – National German Weather Service (www.dwd.de)
The analysis results for the data are as follows:
Data Source | Categorisation Type | Licence | Terms of use
Master Data from Geo Portal Munich | Restricted (cat 3) | no | yes
Energy Master Data | Restricted (cat 3) | no | yes
Measurement data from flats | Restricted (cat 3) | no | yes
Weather data from DWD | Open (cat 1) | yes | yes
Figure 4: Categorisation of the Use Case
3.2.3 Golden rules
~ The more detailed the analysis of the provided data, the easier the IT
implementation is, and misunderstandings can be avoided at an early stage of
the project.
~ Categorisation of data often helps the data provider to get a better
understanding of the real value of the offered data. This simplifies the subsequent
Classification of the data and creates a basis for trustful collaboration between
the involved stakeholders.
~ Arranging and actively using pre- and post-processing algorithms for data (e.g.
at the data provider or within the Smart Data Platform) might allow changes in
the category and Classification of data. This tool is often a prerequisite for
analysing data that originally was not allowed to be used (e.g. due to privacy
constraints).
3.3 Data Usage Agreement and Licensing
[Figure: process overview of the Smart Data Creation Guidelines (Chapter 3), steps 3.1
to 3.8: Requirement Specification, Data Categorization, Data Usage Agreement and
Licensing, Data Classification, Quality Gate 1, Data Sets and Technical Implementation,
Data Processing, Quality Gate 2]
There are different options to share data and make it available for others to use. In all
cases it is necessary to specify the conditions of use in a Data Usage Agreement. Either
the data are already subject to an existing license, for instance a Creative Commons
license, or an individual agreement is necessary. An individual agreement made in
addition to a standard licence is called a Side Letter.
3.3.1 Standard Licences
In the usual licensing procedure, no negotiation takes place between supplier and
user in the individual case. Hence, the use of open data licenses or sample user
contracts (templates) minimizes effort. Moreover, further conditions can be agreed
in an additional agreement if necessary (side letter agreement).
Common standard open data licenses in Germany and beyond are briefly
summarized in the following:
Creative Commons (CC)
URL: https://creativecommons.org/share-your-work/licensing-types-examples/
- Internationally known and widespread public copyright licenses that enable
the free distribution of an otherwise copyrighted work
- Several types of CC licenses
- Current version: 4.0
  A German translation was provided in January 2017
  There is no porting for German law yet
- There is a porting of the older version 3.0 for German law
Datenlizenz Deutschland
URL: https://www.govdata.de/lizenzen
- Current version: 2.0
- Available in two versions:
  „Datenlizenz Deutschland – Namensnennung – Version 2.0"
  Obligates the data user to state (the name of) the data provider
  „Datenlizenz Deutschland – Zero – Version 2.0"
  Allows unrestricted use
Geodatennutzungsverordnung (GeoNutzV)
URL: http://www.gesetze-im-internet.de/geonutzv/
- Applicable for geodata, geodata services and metadata
- Must be adapted by the data-holding body to the respective circumstances
3.3.2 Individual Agreement or Side Letter
In case there is no existing license or an additional agreement is desired, it is necessary
to set up a Data Usage Agreement contract. The agreement should contain detailed
information on the conditions of use, the delivery arrangements and all the tasks in
detail with respective Responsibilities and rights of the two parties. Topics like further
distribution and the conditions for every state of the data should be addressed as well.
Such a Data Usage Agreement should either be created by a lawyer or at least be
reviewed and approved by one.
If the data is not already under a license, it is necessary to draft the Data Usage
Agreement as a contract defining the rules and conditions of the data usage and
processing. If the agreement is a Side Letter to an existing license, it only needs to
describe additional usage conditions or exchange arrangements.
Distinguished by the kind of mutual compensation of the two parties, common types
of user agreements are:
– Fee-based agreements
– Exchange agreements (e.g. data exchange)
– Individual additional agreement (e.g. additionally to an existing license)
3.3.3 Use Case Reflection
In the Use Case “Smart Home”, some of the data usage agreements between the
consortium partners were already defined within the General Agreement of the EU
project “Smarter Together”. Details concerning the concrete Classification and
licensing of Data Sets (e.g. data privacy concerns or commercial aspects), however,
had to be settled on an individual basis. One external licence that we had to consider
and investigate in more detail was that of the DWD weather forecast data (which is
open data). We also needed to agree on how to restrict the usage of the flat
inhabitants’ personal data (“Address”). This was done by the above-mentioned
“written individual agreement” that every participating flat inhabitant had to sign (on
a voluntary basis) before starting to use the offered service.
3.3.4 Golden Rules
~ Make sure that for each Data Source that exchanges data with the project
data base (e.g. Smart Data Platform), there is a written agreement with the
data provider of how the data is categorised.
~ Make sure that for every Data Source that exchanges data with the project
data base (e.g. Smart Data Platform), there is a clear written agreement of the
data usage licenses (e.g. monetary conditions or other restrictions required)
3.4 Data Classification
Remark (see also Data Categorisation): In this Data Gatekeeper concept a two-step
approach was chosen to raise awareness of how data Classification can be done
in detail. Four possible Categories and four Classification types were described, each
with an additional 4-step level rating approach.
Please keep in mind that this procedure is not the only possible approach. Any
reasonable existing methodology can be fine, as long as it is used continuously within
a City or a project.
Also note that the subdivision of the respective Categories (Classification) must be
realised by means of previously defined pre- or post-processing steps (see Data
Processing) of the respective derived Data Sources, as it is an instrument for
Anonymisation and Pseudonymisation.
__________________________________________________________________________________
In this concept, Classification was made in the following four Dimensions:
I for integrity and data protection (Level I1 – I4)
P for processing and analysis (Level P1 – P4)
R for redistribution and modification (Level R1 – R4)
S for storage and deletion (Level S1 – S4)
There are four levels for each Classification Dimension. The level indicates the
type of treatment required for the data on the Smart Data Platform, starting with
level 1 for open data, with the lowest protection needs, up to level 4 for personal
data, with the highest protection needs.
Thus, in total there are 16 different Classification settings. A Data Set will therefore
always have four Classification Data Fields, one for every Classification Dimension.
A Data Set can be level 1 regarding integrity and data protection but level 2 regarding
redistribution and modification (e.g. I1-P1-R2-S1). The exact meaning of each
Classification setting is described in the following subchapters.
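The four-field Classification code can be handled as a small structured value. The following is a minimal sketch only, assuming the dimension letters I, P, R and S listed above and a hyphen-separated code format; the class and function names are hypothetical and not part of the Smart Data Platform.

```python
import re
from dataclasses import dataclass

# Assumed code format: one level (1-4) per dimension, in the order I, P, R, S.
_CODE_PATTERN = re.compile(r"^I([1-4])-P([1-4])-R([1-4])-S([1-4])$")


@dataclass(frozen=True)
class Classification:
    """Hypothetical container for the four Classification levels of a Data Set."""

    integrity: int       # I: integrity and data protection
    processing: int      # P: processing and analysis
    redistribution: int  # R: redistribution and modification
    storage: int         # S: storage and deletion

    def code(self) -> str:
        """Return the compact code, e.g. 'I1-P1-R2-S1'."""
        return (
            f"I{self.integrity}-P{self.processing}"
            f"-R{self.redistribution}-S{self.storage}"
        )

    def highest_level(self) -> int:
        """Return the strictest level found in any of the four dimensions."""
        return max(
            self.integrity, self.processing, self.redistribution, self.storage
        )


def parse_classification(code: str) -> Classification:
    """Parse a code such as 'I1-P1-R2-S1'; raise ValueError if it is malformed."""
    match = _CODE_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not a valid Classification code: {code!r}")
    i, p, r, s = (int(group) for group in match.groups())
    return Classification(i, p, r, s)
```

Keeping the four levels as separate fields, rather than a single overall rating, reflects that a Data Set may be unrestricted in one dimension and restricted in another.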
In addition to the guidelines shown in chapter 2, mainly resulting from legal
requirements, the Classification is a further orientation guidance by specifying in which
area a Use Case can be set up.
The Categorisation hence helps (potential) Use Case creators by setting rules and
requirements that need to be complied with. It offers further guidance by giving an
overview of the different existing data Categories, enabling the creator to see whether
a Use Case idea is legal and realisable. Use Cases and data that do not meet these
basic requirements or which do not comply with the basic legal regulations are not
allowed on the Smart Data Platform.
Within one Use Case, depending e.g. on the analysis goals, there can be several
Categorisations of Data Sets simultaneously. They however must be clearly separated
from each other in order to not produce misunderstandings in the data processing.
The following information relevant for processing is stored within the Smart Data
Platform:
– whether the Data Set is Open Data or not (refer to the respective Category)
– a time-to-live for each Data Object (may be unset, or set upon decision by
Data Protection Officers)
– the 4-level rating for the Classification Dimensions Integrity & Data Protection (I),
Processing & Analysis (P), Redistribution & Modification (R) and Storage &
Deletion (S)
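As an illustration of the stored information, the sketch below models it as one metadata record with an optional time-to-live. It is an assumption-laden example: the field names, the use of `timedelta` for the time-to-live and the expiry rule are hypothetical and not taken from the platform's Data Model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DataSetMetadata:
    """Hypothetical record of the processing-relevant information of a Data Set."""

    is_open_data: bool
    integrity: int       # I level, 1-4
    processing: int      # P level, 1-4
    redistribution: int  # R level, 1-4
    storage: int         # S level, 1-4
    # Time-to-live applied to each Data Object; None means "unset".
    time_to_live: Optional[timedelta] = None

    def __post_init__(self) -> None:
        for name in ("integrity", "processing", "redistribution", "storage"):
            level = getattr(self, name)
            if level not in (1, 2, 3, 4):
                raise ValueError(f"{name} level must be between 1 and 4, got {level}")

    def is_expired(self, created_at: datetime, now: datetime) -> bool:
        """Return True if a Data Object created at created_at has outlived its TTL."""
        if self.time_to_live is None:
            return False  # no time-to-live set, so the object never expires
        return now - created_at > self.time_to_live
```

A deletion job could call `is_expired` for every Data Object and remove those for which it returns True; whether deletion is handled this way on the platform is not stated in this document.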
The four Classification Dimensions and their possible settings are detailed in the
following.
3.4.1 Integrity and Data Protection
The Classification I stands for Integrity and Data Protection. It ensures that data is
processed lawfully and securely with regard to those aspects.
Level 1 (I1): No further measures need to be taken.
Level 2 (I2): Data need to be anonymized before further processing.
Level 3 (I3): Data need to be anonymized before further processing and, if
needed, aggregated beforehand.
Level 4 (I4): Data need to be treated as personal data and therefore it is
necessary to comply with data protection laws and legal requirements,
described in chapter 3.4.4.
3.4.2 Processing and Analysis
The Classification P stands for Processing and Analysis. It ensures that data is
processed lawfully and securely with regard to those aspects.
Level 1 (P1): No restrictions.
Level 2 (P2): Data can only be processed and analyzed with other data when
it is ensured that by processing, it (still) won’t be possible to trace data back to
an individual.
Level 3 (P3): Data can only be processed and analyzed with preassigned data
within the Use Case.
Level 4 (P4): Data are not allowed to be processed or analyzed with other data.
3.4.3 Redistribution and Modification
The Classification R stands for Redistribution and Modification. It ensures that data is
processed lawfully and securely with regard to those aspects.
Level 1 (R1): No restrictions.
Level 2 (R2): Data can be transmitted to external users, only under specific
conditions, which are specified in the usage agreement.
Level 3 (R3): Data are only allowed to be transmitted internally and to the data
provider.
Level 4 (R4): Data are only allowed to be transmitted to specified internal users.
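The four redistribution levels can be read as a simple decision rule. The sketch below is one possible interpretation, not the platform's rule base: the recipient groups, the `agreement_permits` flag and the assumption that specified internal users also count as internal users are all hypothetical.

```python
# Hypothetical recipient groups used only for this illustration.
RECIPIENTS = ("external", "data_provider", "internal", "specified_internal")


def may_transmit(r_level: int, recipient: str, agreement_permits: bool = False) -> bool:
    """Decide whether data with redistribution level R1-R4 may go to a recipient.

    agreement_permits states whether the usage agreement covers this
    transmission; it only matters for external users at level R2.
    """
    if recipient not in RECIPIENTS:
        raise ValueError(f"unknown recipient group: {recipient!r}")
    if r_level == 1:
        return True  # R1: no restrictions
    if r_level == 2:
        # R2: external users only under the conditions of the usage agreement
        return recipient != "external" or agreement_permits
    if r_level == 3:
        # R3: internal transmission and transmission to the data provider only
        return recipient in ("internal", "specified_internal", "data_provider")
    if r_level == 4:
        # R4: specified internal users only
        return recipient == "specified_internal"
    raise ValueError(f"redistribution level must be between 1 and 4, got {r_level}")
```

For example, `may_transmit(2, "external")` is refused unless the usage agreement permits it, whereas `may_transmit(4, "internal")` is always refused.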
3.4.4 Storage and Deletion
The Classification S stands for Storage and Deletion. It ensures that data is processed
lawfully and securely with regard to those aspects.
Level 1 (S1): No restrictions.
Level 2 (S2): Data need to be stored in-house, but are not subject to any
restriction regarding deletion.
Level 3 (S3): Data need to be stored in-house and the deletion limit needs to be
observed.
Level 4 (S4): Data contain personal data or comparably sensitive information
and therefore need to be stored at a preassigned, secure place, if necessary
with specific access protection, and the deletion limit needs to be observed.
3.4.5 Use Case Reflection
The previously categorised data are classified in detail to determine the rules for
processing and maintaining them within the SDP.
In this step the Classification of the following data will be determined in close
cooperation with data owners:
– Master Data from geo portal Munich
– Energy Master Data of buildings and refurbishment
– Measurement data from flats (sensors)
– Weather data from DWD
All data in the SDP deriving from the flats are classified by the data owner or the data
provider in order to comply with the above-mentioned legal aspects and to technically
define and restrict the complete data handling on the SDP accordingly (use of
anonymization and aggregation …). If, for example, only 2 flats in a building/address
with 20 flats participate in the project, it could be determined relatively easily
which flat owner contributed which data. This would theoretically make it possible to
use algorithms that infer the personal behaviour of a flat inhabitant. To avoid this kind
of potential misuse from the outset, anonymization or aggregation procedures for
the collected data also needed to be taken into account.
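The small-group risk described above can be reduced by publishing aggregates only when enough flats contribute to them. The following sketch shows this idea under stated assumptions: the threshold `min_flats=5`, the record layout and the function name are hypothetical and would have to be agreed with the data owner and the Data Protection Officers.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# One reading: (building_id, flat_id, measured value). Layout is an assumption.
Reading = Tuple[str, str, float]


def aggregate_by_building(
    readings: Iterable[Reading], min_flats: int = 5
) -> Dict[str, float]:
    """Average readings per building, suppressing buildings with too few flats.

    A building is only reported if at least min_flats distinct flats
    contributed, so that a value cannot easily be traced to one household.
    """
    per_building: Dict[str, Dict[str, List[float]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for building_id, flat_id, value in readings:
        per_building[building_id][flat_id].append(value)

    result: Dict[str, float] = {}
    for building_id, flats in per_building.items():
        if len(flats) < min_flats:
            continue  # too few participating flats: suppress the aggregate
        values = [value for flat_values in flats.values() for value in flat_values]
        result[building_id] = mean(values)
    return result
```

With this rule, a building in which only 2 flats participate yields no output at all, instead of an average that effectively describes two households.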
The results of the Classification are as follows:
Data Source | Integrity and Data Protection | Processing and Analysis | Redistribution / Modification | Storage / Deletion
Master Data from geo portal Munich | Tbd² | Tbd² | Tbd² | Tbd²
Energy Master Data | Tbd² | Tbd² | Tbd² | Tbd²
Measurement data from flats | 4 | 2 | 4 | 4
Weather data from DWD | 1 | 1 | 1 | 1
Figure 5: Classification of the Use Case
__________________________________________________________________________________
Remark: ²: some of the project specific classifications couldn’t be finalised during the
description of this concept.
__________________________________________________________________________________
3.4.6 Golden Rules
~ Classification of data needs to be discussed in close cooperation with the data
owner and the chief data officer before transferring any data into the Smart
Data Platform. Using more than 4 levels of Classification increases the risk of
excessive complexity and of losing track of the details. All impacts and
individual conditions need to be written down and agreed with the platform
operator and the party that implements the rule base (algorithms) in the Smart
Data Platform.
~ Arranging and actively using pre- and post-processing algorithms for data (e.g.
at the data provider or within the Smart Data Platform) might allow changes in
the category and Classification of data. This “tool” is often the only reasonable
way to allow the analysis of data that originally was not allowed to be used (e.g.
due to privacy constraints). The same golden rule was already mentioned
in “Data Categorisation”.
Short Summary of “Data Creation Guidelines incl. Data
Categorisation and Classification”
The preparation and creation of the Use Case is the interdisciplinary translating process
between all participating stakeholders, from business orientation to IT-implementation.
It is one of the most important preparation parts where the essential elements of the
Data Gatekeeper concept are brought together and where its required
implementation details are recorded. It stepwise sharpens and concretises all of the
original requirements and guideline parameters in a way that in the end a
programmer can perfectly understand them and implement them in the Smart Data
Platform.
The Use Case anticipates the complete end-to-end process. It describes the
challenges that need to be analysed e.g. by the operative department, it lists the Data
Sources required, it defines the output parameters of the analysis. It determines the
data security and data privacy aspects as well as the underlying business or benefit
models for the City, a City department and other stakeholders. Last but not least,
it raises the value of the input and output data by categorising and
classifying it accordingly. The discussion about the Classification of the data often is a
very fruitful and important step to express the fundamental Data Gatekeeper related
rules of a Use Case.
3.5 Quality Gate 1
The Smart Data and Use Case Creation Process has several quality gates that ensure
lawful processing of the data as well as data security and privacy.
The first quality gate in the process comprises the elaboration of the final Use Case
concept, its approval and the signing of the Data Usage Agreement.
3.5.1 Elaborate Final Use Case Concept
As soon as the Requirements Specification, the Data Categorisation and Classification
as well as the Data Usage Agreement and the related meetings and negotiations are
finished, the responsible instance can elaborate the final Use Case concept. It is
recommended to summarise the agreements in a document that includes the previously
provided specification checklist. For a checklist template on activities up to the
Quality Gate, refer to the Appendix (7.2.2f.).
3.5.2 Final Concept Approval (Q1)
The Data Owner now approves the final Use Case Concept in agreement with the
Data Supplier. Approval by the decision-making body concerning the related
budget is required too. Other important stakeholders need to be informed about
decisions and about any required rectifications and shortcomings.
3.5.3 Sign Data Usage Agreement
If the approval has been successful, the data owner on the side of the City and the
Data Supplier sign the Data Usage Agreement as negotiated in previous steps.
3.5.4 Use Case Reflection
Quality Gate 1 checks the compliance and completeness of the Use Case with respect
to all Legal Regulations, internal rules and all definitions that have been
determined so far. It is the basis for the implementation of the Use Case in the different
systems and platforms. Incomplete or unclear definitions can cause high
costs in terms of rework.
Within the Use Case “Smart Home” we prepared the written Use Case together with
the involved parties. Every party had the possibility to look at the complete Use Case
description before a final approval. Also we organized a final meeting between the
involved stakeholders “Use Case owner”, “Analytics draftsman” and “Data platform
operator” in order to go through all requirements again and answer all remaining
questions.
We recommend creating a checklist for every single step that has been done so far
and having it signed by at least your Chief Data Officer, Data Security Officer, Use Case
owner and Data Platform Provider.
As a minimum check you should verify:
– Used data and corresponding terms of use
– Classification and corresponding processing rules
– Use Case completeness and workflow (incl. requirement spec for Analytics)
– Interface definition (all systems involved)
– Functionality and GUI of the components (depending on the implementation
process) such as Analysis Dashboard, SC-App …
3.5.5 Golden Rules
~ Clearly identify the issue and the benefit of the Use Case. Get the commitment
on this from the involved partners
~ Invite all relevant Use Case stakeholders to comment on the final version of the
Use Case (give them a reasonable time schedule), invite them for a final
approval and Q&A session (if needed) before releasing the Use Case for the
concrete implementation work.
~ Keep the checklist for the quality gate as simple as possible but insist on a signed
and agreed quality gate 1
3.6 Data Sets and Technical Implementation
In the previous chapters primarily organisational tasks and measures were described.
The following section describes an example for the technical layer, focusing on related
procedures with respect to the previously described organisational part. As described
in section 2.3 it is recommended to establish a central Smart Data Platform
infrastructure for Integration, Processing and Data Analysis. The following section
describes necessary steps of the Data Platform and potential other Use Case-specific
systems.
3.6.1 Create Data Scheme and Interface Specification
The first main task of the Data Supplier is to create a Data Scheme based on the Data
Model outlined in chapter 4. Furthermore, she or he can consult with the Platform
Administrator on the Data Scheme and the planned interface of the external Data
Source for data exchange.
Key questions arising for the Data Supplier can be:
– Is the data format compatible with the interface and the Smart Data Platform?
– Does the Data Set have all the necessary attributes to be correctly processed
in the Smart Data Platform? Does this necessitate further steps on the Supplier side?
– What sampling rate is needed - are different time granularities needed?
The Platform Administrator can assist in answering these questions by providing
established standard solutions that are in use for several other Smart Data Use Cases.
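The second key question, whether a Data Set has all necessary attributes, lends itself to an automated check on the Supplier side. The sketch below is purely illustrative: the attribute names and types in `REQUIRED_ATTRIBUTES` are hypothetical and do not represent the Data Model of chapter 4.

```python
from typing import Dict, List

# Hypothetical required attributes and their expected Python types.
REQUIRED_ATTRIBUTES: Dict[str, type] = {
    "source_id": str,
    "timestamp": str,
    "value": float,
}


def find_scheme_problems(
    record: dict, required: Dict[str, type] = REQUIRED_ATTRIBUTES
) -> List[str]:
    """Return human-readable problems; an empty list means the record fits."""
    problems: List[str] = []
    for name, expected_type in required.items():
        if name not in record:
            problems.append(f"missing attribute: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    return problems
```

Running such a check against a few sample records before the interface is implemented can surface missing attributes early, when they are still cheap to fix.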
3.6.2 Create Technical Specification
The Service Developer, who designs and develops the main Use Case hardware
component or software application, has been involved in the previous requirements
engineering and concept steps. As a first main responsibility, he or she now elaborates a
technical specification for the External System that uses the API⁴ of the Smart Data
Platform.
As the conceptual approach is now translated into technical concepts for the first time,
a question that may arise is whether it is necessary to add human-generated content
to the Data Set. Do users generate data on their own while using e.g. an App?
Where is it stored safely? Either the Smart Data Platform or another service run by the
conceptual and technical developer could take over this task.
__________________________________________________________________________________
Remark: The following descriptions (3.6.3. – 3.6.6.) are kept rather short, as most
activities follow a standard IT implementation process.
__________________________________________________________________________________
3.6.3 Technical Specification Approval
As soon as the technical specification for e.g. the interaction of users (citizens), External
System integration, Smart Data Platform (API, Analysis Dashboard) and Data Source
Interfaces has been finished, the Use Case Responsible can approve it in consultation
with the assigned responsible person or forum (see chapter Responsibilities).
3.6.4 Implementation and Data Set Creation
Now the technical implementation of the Use Case content can start. Most important
is the creation of the Data Set with all required attributes in the Smart Data Platform or,
in case it already exists, checking its compatibility with the given Use Case. This is done
by the Platform Administrator. He or she further ensures the linkage of the Analysis
Dashboard and several Derived Data Sets and their visualisation to Data Analysts.
3.6.5 Implementation of Data Interface
The Data Supplier now implements the previously planned Interface of the external Data
Source and tests the solution together with the Platform Administrator.
4 Application Programming Interface
3.6.6 Develop App or Hardware solution
Now the External System, which depending on the Use Case is a smartphone app or
a hardware solution, can be developed and tested by the Service Developers.
3.6.7 Use Case Reflection
The definition of the technical specifications mainly used established standard IT-
processing tools. One important aspect however was the discussion of potential data
formats, protocols and the sandbox-testing of APIs and data flows. As most of the
project partners did not know each other personally before the project, it was
important to agree to a common process, to arrange meetings with specialists and to
arrange test flows before the final technical implementation took place.
3.6.8 Golden Rules
In this phase of the project normally other people than during the preparation and
Use Case definition phase are involved. Although these people will have very
dedicated and specific IT-related work to do (e.g. installing and testing SW and
HW, programming algorithms and GUIs, etc.), it is very helpful to inform them about
the complete end-to-end process and background of the Use Case. For that
reason it is recommended to generate a standard documentation (e.g. slides),
explaining the Use Case and providing the most important facts and expectations
of the relevant stakeholders.
3.7 Data Processing
__________________________________________________________________________________
Remark: The following descriptions (3.7 – 3.7.3.) cover standard practice in many IT
projects. Nevertheless, some background information is provided within the scope of this
document to facilitate a better understanding of some Data Gatekeeper aspects for
readers who are not familiar with these IT-related details.
__________________________________________________________________________________
As the next activity, the realisation of the previously defined pre- or post-processing
steps of the respective derived Data Sources is recommended. These designated
processing steps now have to be substantiated by adequate processing functionalities,
so that the Classification is realised by means of processing. There is no standard
processing; however, an exemplary procedure that covers different aspects
relevant for a Smart Data Platform is as follows:
1) extract, transform and load process
2) adequate processing of data according to the following functionalities
3) load of data to the Smart Data Platform
In the scope of this concept, the conceptual procedures Anonymisation /
Pseudonymisation are understood as a processing measure. The technical and more
concrete realisation of this measure can be done using different processing
functionalities, for instance the removal of a technical identifier (ID). Function
libraries, for example, are collections of software tools that can be applied to format
any available raw data or raw information.
Technical key concepts relevant for processing are the Raw Data Set (see glossary),
which stores data obtained from an external Data Source (via a technical interface), and
the Derived Data Set, which uses an Algorithm to calculate modified Data
Objects (see glossary) from the Raw Data Set.
There are two general possibilities (mechanisms/measures) for transforming data
within the platform so that it can no longer be attributed to an
individual.
– Pre-Processing: After the integration of raw data into the Smart Data Platform,
data can be anonymised / pseudonymised before it is written into the platform's
Database.
– Post-Processing: The data is stored in the platform's Database as a Raw Data Set
and dynamically processed when accessed by a Derived Data Set. The
platform's Analysis Dashboard, in turn, accesses the Derived Data Set.
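The difference between the two mechanisms is where the processing function runs. The following sketch contrasts them with in-memory lists standing in for the platform's Database; the class names and the `process` callable are hypothetical and only illustrate the principle.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
ProcessingFunction = Callable[[Record], Record]


class PreProcessingStore:
    """Pre-processing: records are transformed before they reach the database."""

    def __init__(self, process: ProcessingFunction) -> None:
        self._process = process
        self.database: List[Record] = []

    def write(self, record: Record) -> None:
        # Only the processed form is ever stored; the raw record is discarded.
        self.database.append(self._process(record))


class PostProcessingStore:
    """Post-processing: raw records are stored and transformed on access."""

    def __init__(self, process: ProcessingFunction) -> None:
        self._process = process
        self.raw_data_set: List[Record] = []

    def write(self, record: Record) -> None:
        self.raw_data_set.append(dict(record))  # keep an unmodified copy

    def derived_data_set(self) -> List[Record]:
        # The derived view is computed dynamically each time it is accessed.
        return [self._process(record) for record in self.raw_data_set]
```

Pre-processing keeps sensitive values out of storage altogether, whereas post-processing retains the raw data and therefore depends on access to the Raw Data Set being restricted accordingly.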
In the scope of this concept, the two main tasks of the Platform Administrator are to
derive the necessary processing measures and functions and to implement processing
functions. In consultation with all stakeholders responsible for data security and
protection (possibly the Data Analyst and the Use Case Responsible), he or she is in
charge of ensuring compliance with internal and external data privacy requirements.
The following subsections describe the two Data Processing Measures
Anonymisation/Pseudonymisation for person-related data and Data Aggregation.
Beyond these, further measures may become necessary and be implemented and used
in the future.
3.7.1 Anonymisation and Pseudonymisation
Anonymisation of person-related data means processing it with the aim of irreversibly
preventing the identification of the individual to whom it relates. It must also be
impossible to identify any individual from the data by any further processing of that
data or by processing it together with other information which is available or likely to
be available.
Except for methods offering formal guarantees, it is difficult to predict whether a
particular technique will be 100% effective in protecting the identity of Data Subjects.
However, it is possible to minimise the risks to Data Subjects when
anonymising data. Identification means not only the possibility of retrieving a person's name
and/or address, but also potential identifiability by singling out, linkability and
inference.
Pseudonymisation of person-related data means replacing any identifying
characteristics of data with a pseudonym, or, in other words, a value which does not
allow the Data Subject to be directly identified.
Although pseudonymisation has many uses, it should be distinguished from
anonymisation: in many cases it provides only limited protection for the identity of Data Subjects,
because it still allows identification by indirect means. Where a pseudonym
is used, it is often possible to identify the Data Subject by analysing the underlying or
related data.
Pseudonymisation thus means that personal data is processed in such a way that the
data can no longer be attributed to a specific Data Subject without the use of
additional information. In order to pseudonymise a Data Object, the “additional
information” must be kept separately and be subject to technical and organisational
measures to ensure non-attribution to an identified or identifiable person. In sum, it is a
privacy-enhancing technique where directly identifying data is held separately and
securely from processed data to ensure non-attribution.
The following Data Processing Functionalities realize Anonymisation and
Pseudonymisation Measures:
Removing person-related Data Fields
For a full anonymisation of the transferred Data Sets, Data Fields containing
person-related data can be removed completely before the Data Sets are written into the Database.
These Data Fields need to be specified by the data provider and other Use Case
stakeholders.
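A minimal sketch of this functionality, assuming records are handled as dictionaries; the function name `remove_fields` and the field names are illustrative assumptions, not part of the platform:

```python
from typing import Dict, Iterable


def remove_fields(record: Dict[str, object], person_related: Iterable[str]) -> Dict[str, object]:
    """Return a copy of the record without the listed person-related Data Fields."""
    blocked = set(person_related)
    return {name: value for name, value in record.items() if name not in blocked}


# The list of fields to drop would be agreed with the data provider beforehand.
raw = {"name": "Jane Doe", "address": "Example Street 1", "temperature": 23}
cleaned = remove_fields(raw, ["name", "address"])
```

The function returns a new dictionary, so the incoming record is left untouched and only the cleaned copy needs to be written to the Database.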
Removing ID digits
Before raw Data Sets are written into the Database, the person-related IDs in
specific Data Fields can be shortened by a number of digits to be specified by the data
provider and other Use Case stakeholders. If the Data Set contains a person-related
ID such as “D42197334”, it can be written into the Database as e.g. “D42197”
or “D42197XXX”. The number of digits cut off must be large enough that no data can
be traced back to a specific person.
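The ID example above can be reproduced with a few lines of code. The function name `truncate_id` and its parameters are assumptions made for this sketch:

```python
def truncate_id(identifier: str, digits_to_cut: int, mask: bool = False) -> str:
    """Cut the last digits off an ID, optionally replacing them with 'X'."""
    if digits_to_cut < 0 or digits_to_cut > len(identifier):
        raise ValueError("digits_to_cut must be between 0 and len(identifier)")
    kept = identifier[: len(identifier) - digits_to_cut]
    return kept + "X" * digits_to_cut if mask else kept


short_id = truncate_id("D42197334", 3)               # "D42197"
masked_id = truncate_id("D42197334", 3, mask=True)   # "D42197XXX"
```

Whether three digits are enough depends on how many persons share the remaining prefix; that number has to be agreed with the data provider and is not something the code can decide.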
Creating hash values
A hash value is a value of fixed length that is computed from data and serves as a
compact fingerprint of it; with a suitable hash function, two different inputs are very
unlikely to produce the same value. Hash values represent large amounts of data as much
smaller values, which is why they are used with digital signatures: a hash value can be
signed more efficiently than the larger original value.
For data pseudonymisation purposes, a hash value can be created for a
person-related ID before the data is written into the Database. The content of a specific Data
Field is replaced by a new string computed by the hash function.
The created hash value can only be linked back to the original Data Values with the
correct key that was used for creating the hash value. This key can either be
deleted or safely stored in a different system.
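One way to realise a keyed hash is HMAC with SHA-256 from the Python standard library. The source does not name a specific algorithm, so this choice and the function name `pseudonymise_id` are assumptions for illustration only:

```python
import hashlib
import hmac
import secrets


def pseudonymise_id(identifier: str, key: bytes) -> str:
    """Replace an ID by its keyed hash (HMAC-SHA-256, hex encoded)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# The key is generated once and then deleted or stored in a separate system.
key = secrets.token_bytes(32)
pseudonym = pseudonymise_id("D42197334", key)
```

The same ID always maps to the same pseudonym under one key, so records can still be linked to each other. Without the key, the pseudonym cannot be recomputed from a guessed ID.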
3.7.2 Data Aggregation
The Processing Measure “Data Aggregation” is understood as the compiling of
information from one or several Data Sets with the intent to prepare a combined
Derived Data Set that does not include all information from original Data Sets. A set
of Processing Functionalities to ensure data protection requirements for the
aggregation of data is described in the following.
Remove subcategories of a Data Set
Subcategories of a Data Set can be removed. If a Data Field contains a category
that is specified further by subcategories, these
subcategories can be removed. If the Data Set, for example, contains a location
category “City” and the subcategories “district” or “street”, the latter two
can be removed so that the Data Set is aggregated to City level.
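The location example can be sketched as follows, assuming the category hierarchy is known as an ordered list; the function name `drop_subcategories` and the sample values are hypothetical:

```python
from typing import Dict, Sequence


def drop_subcategories(record: Dict[str, object], hierarchy: Sequence[str], keep: str) -> Dict[str, object]:
    """Remove all fields that lie below the category `keep` in the hierarchy."""
    deeper = set(hierarchy[hierarchy.index(keep) + 1:])
    return {k: v for k, v in record.items() if k not in deeper}


row = {"city": "Munich", "district": "District A", "street": "Example Street", "value": 7}
city_level = drop_subcategories(row, ["city", "district", "street"], keep="city")
```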
Create an average value for a defined period of time
If Data Sets contain measured values for an interval of e.g. 5 minutes, but due to
Data Protection guidelines only measured values for e.g. an hour should be
processed in the platform, an average value for the desired period of time can be
calculated (either before writing the Data Set into the Database or before further
processing in the analysis module).
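A small sketch of such an averaging step, using only the Python standard library. The function name `hourly_average` and the sample values are illustrative assumptions:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def hourly_average(samples: Iterable[Tuple[datetime, float]]) -> Dict[datetime, float]:
    """Group timestamped values by full hour and return the mean per hour."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, value in samples:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in buckets.items()}


samples = [
    (datetime(2018, 3, 13, 10, 0), 22.0),
    (datetime(2018, 3, 13, 10, 5), 23.0),
    (datetime(2018, 3, 13, 10, 10), 24.0),
    (datetime(2018, 3, 13, 11, 0), 20.0),
]
averages = hourly_average(samples)
```

Only the hourly means would be passed on; the individual 5-minute values are no longer recoverable from the result.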
Aggregate (sub-) categories
Categories of a Data Set can be aggregated to a “higher” category level. If a Data
Set contains for example the category “district”, the districts “1” and “2” can be
aggregated to the new category “1/2” to provide a higher aggregation level.
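The district example can be expressed as a simple mapping from original to aggregated category labels. The function name `merge_categories` is an assumption for this sketch:

```python
from collections import Counter
from typing import Dict, Iterable, List


def merge_categories(values: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """Replace category labels according to the mapping; unmapped labels stay unchanged."""
    return [mapping.get(value, value) for value in values]


districts = ["1", "2", "3", "1"]
merged = merge_categories(districts, {"1": "1/2", "2": "1/2"})
counts = Counter(merged)
```

After merging, districts “1” and “2” can only be analysed together as “1/2”.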
3.7.3 Golden Rules
– Consultation: inform the Data Supplier about changes. The Data Analyst or Use
Case Responsible may also be consulted to find out their concrete ideas for
their desired Data Analysis Queries, in case these are compliance-critical or
difficult to realize.
3.8 Quality Gate 2
The second quality gate ensures the final testing and approval of the Smart Data Use
Case. It contains the specification and implementation reviews, a Use Case
integration test and a pilot phase for testing the External System with a small number
of people. A subsequent test by the Data Analyst allows the Data Analysis
Queries to be adjusted based on data and experiences from the pilot phase. Finally, the Use Case is
completed by a final approval and the Go Live of the Use Case.
3.8.1 Specification and Implementation Review
As soon as the implementation of technical solutions inside and outside of the Smart
Data Platform is completed, the Use Case Responsible reviews the specification and
implementation and passes her or his comments and feedback to the respective
responsible developers (Data Steward, Service Developer and Platform Administrator).
3.8.2 Use Case Integration Test
The Use Case integration test by the Strategic Manager is a last laboratory test of all
external components under stress conditions. Minor shortcomings can be corrected
by the developers. The Data Analysis Queries can still be in a prototype status, as
only a basic test can be performed with mocked data in the laboratory.
3.8.3 Go Live Pilot Phase
A pilot phase within a fixed time frame tests the External System with a small
number of test users. Depending on the analysis requirements, the Data Analysis Queries
can be substantiated by means of real data from the test users. The Use Case Responsible
is in charge of the pilot testing.
3.8.4 Analysis Test
The Data Analyst test allows the Use Case Responsible and other Analysts who are
interested in findings from Smart Data to test the Data Analysis Queries in the platform’s
Analysis Dashboard. They can raise usability issues and request adjustments from the
organizing Strategic Manager.
[Figure: Smart Data Creation Guidelines, Chapter 3 process overview of sections 3.1 to 3.8 with Quality Gate 1 and Quality Gate 2. Recoverable step labels: Requirement Specification, Data Usage Agreement and Licensing, Data Categorization, Data Classification, Data Sets and Technical Implementation, Data Processing]
3.8.5 Final Approval and Go Live
3.8.6 Golden Rules
– Refer to existing test scenarios of your IT department or IT outsourcing partner.
Data Model
__________________________________________________________________________________
Remark: The following descriptions of the Data Gatekeeper’s Data Model should not
be seen as a fixed and “ready to use” blueprint. A Data Model is important
and necessary to implement all stakeholders’ requirements and processes in a Data
Platform. The present Data Model covers all main elements needed, but it is
“preliminary” and “project specific” and requires a more detailed
description in possible future updates of the Data Gatekeeper. It should be seen
as one possible example of how a Data Gatekeeper’s Data Model can be designed.
The Data Model strongly depends on the particular use case and focus of an individual
Data Gatekeeper implementation. The design of the Data Model as well as its
implementation in an existing Data Platform requires specialised know-how and
design alignment from both IT Architects and IT Programmers.
__________________________________________________________________________________
This chapter contains the design and description of the Data Model, which summarizes
the considerations and specifications from the previous
chapters as a formal relationship glossary. It is arranged as a minimal set of information
necessary for the implementation of data handling (described in the Data
Gatekeeper Concept). The DGK Registry, for instance, is only one possible realization of
many but can serve as a template for the IT responsibles such as the Service Developer,
Data or Platform Administrator. (The Data Gatekeeper (DGK) Registry is a
software instance within the Smart Data Platform that assures that the process requirements
for implementing the Data Gatekeeper’s Data Model are met within the Data Platform.)
The Data Model is a scheme that translates the Use Case wordings and options into
logical IT-oriented structures. The goal of the Data Model is a comprehensive structure
including all available and selectable options that characterize the components needed
within the Smart Data Platform and the perfect interplay with the Data Gatekeeper
concept. Data Schemes elaborated in the scope of a Use Case (see section 3.6.1)
need to comply with the Data Model and, together with the Data Schemes of
all other Use Cases, realize the concrete Data Model of the Smarter Together Platform. Several
organisational aspects are not included in the Data Model. For instance, only users with
access to the system are represented.
The Data Model in Figure 6 shall ensure standards in syntax and allow a semantic
representation of all compliance and licensing aspects.
Figure 6: Data Model: Inside Smart Data Platform and External
Figure 7: Data Model: Smart Data Systems
Figure 6 and Figure 7 show the data entities relevant for the Data Gatekeeper Concept in
UML class diagram notation. They represent a conceptual model including classes, their
relationships including cardinalities (partially annotated) as well as the attributes of the
class entities. The two Data Models are connected and interact (see appendix,
7.6).
Figure 6 illustrates the Data Model inside the Smart Data Platform and external (Figure
7). There are two kinds of Data Sets. Raw Data Sets are obtained from an external Data
Source and stored in the platform’s Database. Derived Data Sets refer to a Raw Data
Set and are dynamically calculated by the system. A Derived Data Set needs an
Algorithm and one or several Raw Data Sets to obtain data from. The external Data
Sources stream data at a particular update rate to their Data Set counterpart. This update
rate may be a long time period, e.g. once every 5 years in the case of a database, or a
short time interval, e.g. once per minute in the case of a sensor. A Data Set holds various
Data Objects (see glossary) and has various Data Fields.
Each Data Set can further be assigned to one or several Smart Data Use Cases. A Use
Case is characterized by a title and a description. The further existing documentation
on the Use Cases (see section 3.2 ff.) is not relevant and not stored in the Smart Data
Platform.
Each Data Set refers to exactly one Data Classification. The attribute “Is Open Data”
determines whether the corresponding Data Set is under an open data licence, which
has an impact on processing permissions. The attribute “Time-to-live” can either be null if
there is no duty to delete this Data Set after a particular time period, or be set to a
time period after which it should be deleted. The time period starts at the time of
creation of the corresponding Data Object.
Several levels of privacy indicate the general Classification of all Data Objects
belonging to this Data Set. The different Classification attributes I, P, R and S can each
have a different level from 1 to 4, depending on the protection needs of the data.
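The classification attributes described above could be represented roughly as in the following sketch. It is not the platform's schema: the class and attribute names (`DataClassification`, `level_i`, `expired`, and so on) are assumptions derived from the text, and the meaning of the letters I, P, R and S is deliberately left open here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DataClassification:
    is_open_data: bool
    time_to_live: Optional[timedelta]  # None means: no duty to delete
    level_i: int = 1
    level_p: int = 1
    level_r: int = 1
    level_s: int = 1

    def __post_init__(self) -> None:
        # Each classification attribute has a level from 1 to 4.
        for level in (self.level_i, self.level_p, self.level_r, self.level_s):
            if not 1 <= level <= 4:
                raise ValueError("classification levels range from 1 to 4")


@dataclass
class DataObject:
    value: float
    created_at: datetime


@dataclass
class RawDataSet:
    name: str
    classification: DataClassification  # exactly one per Data Set
    objects: List[DataObject] = field(default_factory=list)

    def expired(self, now: datetime) -> List[DataObject]:
        """Data Objects whose time-to-live, counted from creation, has elapsed."""
        ttl = self.classification.time_to_live
        if ttl is None:
            return []
        return [o for o in self.objects if o.created_at + ttl <= now]


classification = DataClassification(is_open_data=True, time_to_live=timedelta(days=30), level_p=2)
temperature = RawDataSet("Internal Temperature", classification)
temperature.objects.append(DataObject(23.0, datetime(2018, 3, 13)))
due = temperature.expired(now=datetime(2018, 4, 20))
```

Keeping the time-to-live on the classification and the creation time on each Data Object mirrors the rule that the period starts when the individual Data Object is created.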
Figure 7 shows the Data Model of the Smart Data Systems. Relevant entities are the
Transparency Dashboard, the Analysis Dashboard, the Access Authorization, the DGK
Registry, the Smart Data Platform API and the External System.
The Data Gatekeeper (DGK) Registry contains Aggregation rules, Pseudonymization
rules and Restrictions and is also related to the Access Authorization. The Access
Authorization is determined by the DGK Registry and grants access to both the Analysis
Dashboard and the Smart Data Platform API, which is accessed by External Systems in
order to access Data Sets and Data Objects from the platform.
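The relationship between the DGK Registry, the Access Authorization and the API can be illustrated with a very small sketch. All names (`DGKRegistry`, `grant`, `is_authorized`, `api_get`) and the in-memory storage are hypothetical and only show the idea that every access is checked against the registry first:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DGKRegistry:
    # Maps a user or External System to the Data Sets it may access.
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, user: str, data_set: str) -> None:
        self.grants.setdefault(user, set()).add(data_set)

    def is_authorized(self, user: str, data_set: str) -> bool:
        return data_set in self.grants.get(user, set())


def api_get(registry: DGKRegistry, user: str, data_set: str,
            store: Dict[str, List[float]]) -> List[float]:
    """Return the Data Objects of a Data Set only if the registry allows it."""
    if not registry.is_authorized(user, data_set):
        raise PermissionError(f"{user} may not access {data_set}")
    return store[data_set]


registry = DGKRegistry()
registry.grant("smart_home_app", "Humidity")
store = {"Humidity": [40.0, 42.5], "Air Pollution": [25.0]}
humidity = api_get(registry, "smart_home_app", "Humidity", store)
```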
All Users relevant for the Smart Data Platform and their possible interactions with System
Components are further visualised in Figure 8.
User | Interactions with System Components
Use Case Responsible | Owns the Smart Data Use Case; agrees on the Data Classification; decides on the Access Authorization for his Use Case
Data Analyst | Can read the Analysis Dashboard
Platform Administrator | Operates the DGK Registry
Data Supplier | Owns the external Data Source belonging to a Raw Data Set; sets the Data Classification
Figure 8: Users relevant for the Smart Data Platform
4.1 Exemplary Data Model
Figure 9 is an example of a Data Model of a concrete Smart Data Platform with two
Use Cases.
Figure 9: Data Model of a specific Smart Data Platform with two Use Cases
– Both Use Cases have two Data Sets. The Use Case “Smart Home” uses the Data
Sets “Internal Temperature” and “Humidity”. The Data Set “Internal
Temperature” contains Raw Data which is provided by an external temperature
sensor as well as a so-called Feel-good Factor calculated by an algorithm using
raw data provided by the two external Data Sources, a thermometer and
a hygrometer sensor, and a calculation rule. The Data Set “Humidity” is a Raw
Data Set provided by the hygrometer sensor and does not contain Derived
Data. Every Data Set has a Data Classification. In this data model, for example,
the internal temperature is 23 degrees Celsius on 13 March 2018. This Data
Set is classified as open data, has a time-to-live of one month and has specific
levels of privacy.
– In the Use Case “Lamppost” the two Data Sets used are “External Temperature”
and “Air Pollution”. The Data Set for “External Temperature” consists of Raw
Data provided by a temperature sensor, the Data Set “Air Pollution” consists of
both Raw Data, provided by an external source, and Derived Data. The Raw
Data of this Data Set is generated by an Air Pollution Sensor once per minute
and the Derived Data is calculated by the algorithm “Pollution calculation”. In
contrast to the Derived Data Set “Feel-good Factor” from the Use Case “Smart
Home”, the Derived Data Set “Air Pollution Index” uses only one Raw Data Set
in the algorithm “Pollution Calculation” in order to determine the Derived Data
Set.
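The two Derived Data Sets of this example differ only in the number of Raw Data Sets they read. The sketch below shows that pattern. The actual calculation rules are not given in the document, so `feel_good_factor` and `pollution_index` are invented placeholder formulas, and the helper `derive` is a hypothetical name:

```python
from typing import Callable, Dict, List, Sequence

RawStore = Dict[str, List[float]]


def derive(raw: RawStore, inputs: Sequence[str], rule: Callable[..., float]) -> List[float]:
    """Compute a Derived Data Set on access from one or several Raw Data Sets."""
    columns = [raw[name] for name in inputs]
    return [rule(*values) for values in zip(*columns)]


def feel_good_factor(temp_c: float, humidity_pct: float) -> float:
    # Placeholder rule: best score at 21 degrees Celsius and 50 % humidity.
    score = 1.0 - abs(temp_c - 21.0) / 20.0 - abs(humidity_pct - 50.0) / 100.0
    return max(0.0, round(score, 2))


def pollution_index(raw_value: float) -> float:
    # Placeholder rule: scale the raw sensor value by an assumed reference of 50.
    return round(raw_value / 50.0, 2)


raw = {
    "Internal Temperature": [21.0, 25.0],
    "Humidity": [50.0, 40.0],
    "Air Pollution": [25.0, 100.0],
}
feel_good = derive(raw, ["Internal Temperature", "Humidity"], feel_good_factor)
air_index = derive(raw, ["Air Pollution"], pollution_index)
```

Because the derived values are computed on access, a change to a calculation rule takes effect without rewriting the stored Raw Data Sets.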
Smart Data Infrastructure
The Smart Data Platform converts available raw data into Smart Data and serves as a
collaborative platform for smart services that allows an efficient data exchange
between City stakeholders. It offers innovative data analysis methodologies and
needs to meet many data integration and analysis requirements while at the same
time ensuring data privacy and security.
The following chapter first summarizes the technical requirements for the platform and then
describes the basic architecture. The core features data integration, connection with
External Systems and data analysis are then presented in more depth, before the
Transparency Dashboard is introduced as an important feature for data processing
transparency towards citizens.
In total, this chapter offers an exemplary platform architecture based on the Use Case
of Munich and aims to serve as a template for the IT responsibles such as the Service
Developer, the Data and Platform Administrators. On the other hand, its structural
content can be used by the Strategic Committee as an announcement for a Smart Data
Platform.
Before main sections follow, an executive summary briefly summarizes key issues of the
chapter.
5.1 Basic Architecture
The basic principle of the recommended Smart Data Platform is to develop services
and applications for various new types of data and Use Cases. It offers an open
interface to enable an IT ecosystem around its deployment. This includes the possibility
to exchange raw and aggregated data as well as results from data analytics. It also
offers the possibility for third parties to develop their own applications making use of
open interfaces. Most technical innovations are being made within the analytics and
evaluation modules.
The platform furthermore enables the integration of different City domains, such as
Water, Energy, Mobility, Buildings and others, and helps to obtain synergies to enhance
the efficiency of urban infrastructure.
The Smart Data Platform offers mechanisms to extract, transform, load (ETL), store and
analyze data, as a basis for new services with dedicated user interfaces.
Figure 10 shows the layer architecture of the Smart Data Platform and an exemplary
bundle of potential Use Cases (at the bottom).
Figure 10: Smart Data Platform Layer Architecture
The Smart Data Platform provides the following core features:
External Systems realize Smart Data Use Case functionalities such as water asset
management, building energy monitoring & controlling or mobility functionalities.
These systems deliver data in various formats and require flexible and robust
integration techniques.
Data Integration/ETL is performed to obtain data from External Systems and to load
data into an integrated schema. Data integration is important to get a complete view
of the problem context.
Datawarehouse stores data in a persistent manner and enables the composition of
new data. The Datawarehouse must be scalable and deliver results in a timely manner.
Analytics Modules include various analysis algorithms that provide insights into
complex processes and dependencies. Algorithms must be efficient in computing
results.
APIs/Services provide access to data and functionality in a standardized manner. This
helps to establish an ecosystem based on data and services.
Web Dashboards provide means for visualisation and presentation of results to a
variety of users.
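The Data Integration/ETL feature, which has to cope with External Systems delivering different formats, can be sketched with the Python standard library. The function names, the two formats and the target schema (`source`, `sensor`, `value`) are assumptions chosen for illustration:

```python
import csv
import io
import json
from typing import Dict, List


def extract(payload: str, fmt: str) -> List[Dict[str, object]]:
    """Parse a JSON or CSV payload delivered by an External System."""
    if fmt == "json":
        return json.loads(payload)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt}")


def transform(rows: List[Dict[str, object]], source: str) -> List[Dict[str, object]]:
    """Map source-specific rows onto one integrated schema."""
    return [
        {"source": source, "sensor": str(row["sensor"]), "value": float(row["value"])}
        for row in rows
    ]


def load(warehouse: List[Dict[str, object]], rows: List[Dict[str, object]]) -> None:
    """Append the integrated rows to the (here: in-memory) data warehouse."""
    warehouse.extend(rows)


warehouse: List[Dict[str, object]] = []
load(warehouse, transform(extract('[{"sensor": "t1", "value": 23}]', "json"), "smart_home"))
load(warehouse, transform(extract("sensor,value\np1,41.5\n", "csv"), "lamppost"))
```

After both loads, rows from the two sources share the same fields and can be analysed together.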
Figure 11: Smart Data Platform Layer Architecture and Data Gatekeeper Registry
5.2 Data Integration
A core feature of the platform is the storage of Data Sets as well as the import and
just-in-time integration from technical interfaces provided by Data Suppliers (see glossary).
The platform replicates all data streamed from external Data Sources (see glossary)
and allows data management, Linkage and Aggregation by means of Derived Data
Sets (see glossary). Some processing steps are performed automatically by the Smart
Data Platform. For others, manual implementations are necessary.
As described in chapter 3, based on the agreements of the Data Supplier, Use Case
Responsible and other involved stakeholders, the Data Classification and
Categorisation are configured by the Platform Administrator. The Admin Tool of the DGK
Registry component allows the Platform Administrator to create the Data Sets and their
Classification, while other manual interventions and code adjustments may be needed.
There are no automated processes for the creation of Use Cases and related Data Sets
or for Analysis Queries. The present Data Gatekeeper concept thus cannot be seen
as a process description implementing any automated process to assure data privacy
or to eliminate any data privacy leaks. All necessary steps are described, but
the implementation still needs to be done in a stepwise and manual approach.
The configuration is considered by the DGK Registry component accordingly. The
Data Classification is matched to the different Data Sets during the data integration.
If Pre-Processing has been configured (cf. section 3.7 on Data Processing), the data
must be anonymized or pseudonymized and aggregated as specified before Raw
Data Sets are stored. Figures xx illustrate the core functionalities of the Smart Data
Platform that are coordinated by the Admin Tool.
Figure 12: Platform Core Functionalities and their Administration
The Smart Data Platform foresees two ways of accessing the integrated data. The first
way is to access the visualized analyses of data in the analysis dashboard. The second
way is to request data over an API for e.g. integration in applications. In both cases of
data access, the user needs to have the access rights to receive certain data. The
Smart Data Platform grants access to certain Data Analysts or External Systems
connected via API with the implemented multi-client-capability.
5.3 Connection with External Systems
One important functionality of an integrated Smart Data Platform is easy access
for the various Smart Data applications and systems that provide smart services to citizens
or other Smart Data users.
External Systems developed by the Service Developer are linked via an
API. In order to obtain data, the API user needs to register for an API Gateway. Users
are granted access to specific APIs in line with the Data Classification and their specified
access rights.
Access to a specific API will be granted after approval by the Use Case stakeholders,
especially the data provider. The endpoint of the API is secured following current
security standards. Data can be accessed from the API as specified in the API’s
documentation.
Short Summary of “Flow of Data, Technical
implementation and underlying Architecture”
The technical description and implementation of all previously discussed and defined
parameters is the last step before the Data Gatekeeper rules can be executed in the
Data Platform. This work cannot be executed automatically: it requires special IT skills
as well as a deep understanding of the Use Case, but it is an integral part of the
end-to-end process of the Data Gatekeeper implementation. The Data Model, for example,
describes all logical sequences that need to be followed by the Smart Data Platform,
so that e.g. the Use Case specific data Classification will be met during the complete
analysis process. The programmer or architect of the Smart Data Platform gets precise
information about how the IT-system needs to behave to fulfil a dedicated Use Case.
For that purpose, the Data Gatekeeper Registry is a special program entity in the
underlying Smart Data Platform that enables a structured input of the Data
Gatekeeper rules, which then influence the other Smart Data Platform data handling
algorithms accordingly.
Also the input and output interfaces to transfer the raw data into or from the Smart
Data Platform based on the given Data Gatekeeper rules as well as the underlying
data exchange formats and security rules need to be described, implemented and
tested in order to guarantee a frictionless functioning.
5.4 Analysis Dashboard
Another important feature of a Smart Data Platform is its range of analysis
capabilities for users interested in statistical findings. These are intended not only for
specific users but for all users of a particular Use Case, e.g. the Use Case Responsible and
Data Analysts. The two user interfaces Analysis Dashboard and Platform Statistics Dashboard address
those users.
The different data stored in the platform’s Database is used by the analysis module of
the platform to calculate certain data interpretations. The Data Set combinations and
their visualisations can be browsed by Data Analysts in the Analysis Dashboard user
interface. Figure 11 shows an exemplary analysis user interface for the analysis of
passenger numbers in public transportation.
Figure 13: Exemplary analysis of Smart Home Use Case (Source : TUM)
To use the Analysis Dashboard, the user needs to log in. He or she is only able to access
the visualized analyses covered by his or her access rights.
As a boundary condition, the analysis of data and the Data Analyst access rights must be
in line with the Data Classification, which is ensured by the Use Case Creation Process
and the Data Usage Agreement between Data Owner and City.
5.5 Platform Statistics Dashboard
The Platform Statistics Dashboard extends the capabilities provided to the Data
Analysts by analyzing the usage statistics of the External Systems and the
corresponding API accesses registered in the Smart Data Platform. Targeted analyses
of user workflows can support the continuous improvement of Use Cases.
5.6 Transparency Dashboard
Another important feature of this particular Smart Data Platform approach is the
Transparency Dashboard, which allows citizens to get an overview on all data
Classification, processing and analysis undertaken for a particular Smart Data Use
Case. The opportunity to gain insights into all processing and analysis steps the data
is used for builds trust in the City as Smart Data operator and addresses the strategic
goal of building a positive and transparent image.
Figure 14: Example of a Transparency Dashboard Prototype for Smarter Together Munich
(Source: VMZ)
The Transparency Dashboard allows external Smart Data Users, data providers, Use
Case Responsibles and other project stakeholders to provide the list of available data
and their “fact sheet”, including the current Categorisation, on a website, also giving
information on how to access the data as a third party.
The Transparency Dashboard is realized as an open website for interested citizens. It is
built as a static website that is administered with a standard content management system.
Contents are based on the Use Case Concept Document and Data Classification
Sheets.
Short Summary of “Analysis and Transparency Dashboard”
The appropriate visualization of the Use Case input and output parameters is one key
element to assure transparency and to gain acceptance of the stakeholders and end
users. The Analysis Dashboard summarises the main Use Case specific parameters that
are used to define the Data Gatekeeper rules. The most important task of the Analysis
Dashboard however is to offer reasonable and comprehensible presentations of the
analysis, taking into account the possible restrictions that have been submitted by the
Data Gatekeeper rules.
The Transparency Dashboard is a Web-page that is designed for citizens. It can
contain an abstract of the Use Cases, a list of Data Sources, analysis goals and applied
data privacy rules. Its main task is to inform a broader public about the background
of the Use Cases, the possibilities to get access to related open data, to describe
which sensors have been used or to offer the possibility to get in touch with the City
stakeholders. It is one possible IT-tool for a City to demonstrate transparency in Smart
City or Digitalisation projects.
Conclusion and Outlook
Why
The present Data Gatekeeper Concept was created to establish a possible new view
on how to handle data in terms of data security and data privacy within an innovative
Smart City approach. Dealing with a Data Gatekeeper concept in a Smart City
environment automatically leads the stakeholders to consider a huge complexity and
variety of items, ranging from legal and compliance aspects through an adequate
organisation and various kinds of expertise to IT integration.
How to see
The concept should therefore not only be looked at as a technical model of how to
handle data. It also offers a lot of discussion points of what a new view on a Smart City
could be in the future.
Looking at the resulting complexity and number of triggered items when discussing a
Data Gatekeeper approach, the present concept should be looked at as a starting
point, underlining its character as a “living document” or, going forward, as a
“Blueprint”.
How to work with
Participation in the concept, feedback and the sharing of additional experiences from
other cities will therefore help the Data Gatekeeper to expand step by step into an
increasingly robust and practical approach. It is mainly the opportunity of the
participating cities to trial this approach, discuss it and improve it step by step over time.
Although this concept is a first step, it can already be used in City administrations to
rethink or align the way the future City building material “Smart Data” needs to
be handled.
First conclusion
The complex items discussed in this concept show that data and
digitalization processes, as well as involving citizens in an appropriate Participation
process, will play an outstanding role in any strategic approach to any future
Smart City concept – a smooth transformation of processes and a cultural change among
people are needed.
What’s it for
The Data Gatekeeper, with its focus on terms like “data handling”, including “data
security” and “data privacy”, is the filter to correctly collect the Smart Data that is
needed to help cities worldwide face their enormous challenges in areas like
traffic, air pollution and the housing situation and, last but not least, to help guarantee the
economic attractiveness of the City for investors.
Why using it
The mentioned problems can be reduced by using European or worldwide solution
approaches that take into consideration new administrative procedures (“think and act
beyond traditional borders”), cooperative exchange of experiences and harmonised
political courses of action. As an example, it does not seem reasonable that every
European City would describe and develop its own, different Data Gatekeeper
approach. Only the intensive exchange of experiences and innovative ideas
will finally bring appropriate answers.
What is the benefit of it
New and smarter Data Sources are needed to illustrate the status quo and to help
analyse future scenarios – this is a key asset in a modern City administration.
Furthermore, establishing new Data Sources such as innovative sensors and
actuators, which add new information on how the City and its inhabitants
behave over time, is a crucial task and benefit for any City planning process. These
aspects are often extremely challenging for a City, especially when dealing with the
organizational set-up in traditional department silos.
Cities will also need convenient and innovative planning tools for the future. These
tools will only help to solve the above-mentioned huge challenges if more usable
Smart Data is provided and more intelligent analysis methodologies are
incorporated.
Some ideas for the intelligent planning and simulation of very complex processes in
the areas mentioned above are already evolving or are currently scheduled. For
example, intelligent “Digital Twins” of a City could enable innovative and predictive
planning simulations. Such approaches could be a first step towards projecting which
practical solutions would be realisable in a “real City” environment.
All of these potential benefits for a future Smart City depend mainly on two things:
First is the availability of the right Use Cases. The creation of a valuable Use Case is a
challenging and time-consuming process that should not be underestimated.
Second is the availability of data that can be considered smart enough (and
usable for a City) to develop the required analyses that help solve the initial
problems.
New problems
New technologies lead to even more data, and over time citizens might become more
and more skeptical when a City talks about sensors and innovative data collection
and analysis. Data privacy and the awareness of the potential misuse of data are
becoming a focus area for many involved citizens and companies. For that reason, a
Smart City approach without reasonable information and Participation processes
seems very hard to execute over time.
On the other hand, discussions about growing criminal data hacking, data leakages
and the almost uncontrollable commercialization of personal and private data also
affect cities in their new and growing role as Smart Data handlers. When it comes to
Smart City solutions, the City needs to provide suitable technical and organizational
answers to the challenges mentioned above and to the citizens’ concerns. A Smart City
approach needs to position itself as a trustworthy and transparent data handler that
is committed only to the wellbeing of the citizens and to a balanced City itself.
Also, taking changing legal situations into consideration and adapting existing or
planned processes to them might create problems, as the initial starting conditions
might have been changed by new regulations over time.
Final conclusion
This is where the Data Gatekeeper comes into play again. It is a strong
approach and vision for a holistic view of how Smart City data should be handled in
the future. It is also a concrete tool, improving step by step, that cities can use,
introduce and further integrate into their processes to underline their willingness to use
Smart Data as an indispensable future raw material and at the same time to be a
transparent, trustworthy and fair data player.
But only when cities really adopt the idea of the current Data Gatekeeper
concept, pilot it, and improve and enhance it over time will it become a robust and
trustworthy model that can be implemented in any given Smart City administration.
Actively communicating and openly exchanging individual
experiences, hurdles and successes between the cities and their stakeholders will help
to accelerate this process.
A possible future “Data Gatekeeper Inside” label for European Cities is certainly a
strong vision that would emphasize the trusted position a Smart City should hold for all
the stakeholders in the complex network that is needed to help implement the
solutions that will eventually solve the upcoming challenges.
Glossary
Term Definition
Basic Use Case
Description
Key question for each Use Case, such as user goals (Data Analyst
as well as Smart Data Users), analysis purposes, user tasks, internal
and external stakeholders etc.
City In the scope of this concept a City can mean a City such as
Munich, an urban district of a City or a consortium of
several Cities. This depends on the local or regional granularity at which a
Smart Data Platform is run and managed. Roles defined
per City can therefore also mean roles per urban environment.
Categorisation Within this concept Data Sets are first assigned to
different Categories, specified in advance, with regard to their
protection needs (see Data Categories).
Categories /
Categorisation
Dimensions
The following Data Categories are distinguished in the
Categorisation within this concept:
category 1: open data
category 2: non-personal data but not open
category 3: non-personal data but restricted
category 4: personal data
Classification Data Classification is understood as the assignment of data
security levels to a Data Set stemming from a Data Source. This is
necessary for compliance reasons. Within this concept the
Data Set is rated quantitatively (rating level 1–4) in 4
Classification Dimensions. Further, whether the Data Set is Open
Data or not (refer to respective Category) as well as a time-to-live
for the respective Data Objects is assessed as part of the
Classification.
Classification
Dimensions
In this concept the following Classification Dimensions are used to
quantify the Classification for each Data Set:
• I for integrity and data protection (Level I1 – I4)
• P for processing and analysis (Level P1 – P4)
• R for redistribution and modification (Level R1 – R4)
• S for storage and deletion (Level S1 – S4)
Data Administrator Technical role on Data Supplier side ensuring the technical
provision of data to the Smart Data Platform. He or she provides
technical and data scheme information during the Smart Data
Creation Process. Further he or she helps out with support to
various Use Case roles and works on technical tasks related to
the data ownership of the Data Owner.
(example for task: Operations)
Data Analysis
Query
Either the Use Case Responsible himself or herself or several Use Case-specific
Data Analysts, e.g. from City departments, ask for
insights from Smart Data. In the scope of this concept a
Data Analysis Query is a concrete demand for combined or
simple data analysis raised by a Data Analyst or Use Case Owner.
The Analysis Dashboard visualizes several Data Analysis Queries
by means of Derived Data Sets. More precisely, it is a Query that
processes one Data Set or concatenates several Data Sets.
(example for task: Operations)
Data Analyst
Various users with interest in gaining insights from Smart Data as
e.g. employees of various city departments. Within the concept
they have access to dedicated Data Sets and related Data
Analysis Queries using the Data Analysis Dashboard and can
analyze them for granted purposes of use.
(example for task: Operations)
Data Field Label describing all single Data Values per data column
Data Model In the scope of this concept the Data Model describes many
possible flexible Data Models, e.g. consisting of two Use Cases
and two related Data Sets or five Use Cases referring to six different
Data Sets.
Data Object Row of Data Values with multiple Data Fields and corresponding
Data Values
Data Owner The Data Owner is a leading person on the Supplier side. He or she
defines the general or specific purposes for which data may be used
and plays a key role in decisions of a related Smart Data
Use Case.
(example for task: Data Provision)
Data Protection
Officer
The Data Protection Officer ensures compliance with Legal
Regulations such as data privacy. He or she is legally
responsible for the whole data platform including all Use Cases
and ensures that the data is only used for purposes the Data
Owner agrees with.
(example for task: Data Consumption)
Data Scheme Schematic structure of one or several Data Sets related to one
Use Case provided by the Data Administrator to the Platform
Administrator
Data Set Set of Data Objects that is hosted on the Smart Data Platform
and
Data Source External System providing a regular data stream that stems from
sensors or databases
Data Subject For personal data as well as for usage data related to humans,
the Data Subject is the person that owns the data and can
decide whether she or he wants to allow a respective purpose
of use e.g. by a Use Case or by the City administration. The Data
Subject is granted many rights on his or her personal data.
Data Supplier The Data Supplier is a company or an organisation supplying a
Data Set to the Smart Data Platform. This regularly happens in the
scope of a Smart Data Use Case.
Data Value Atomic data value that is described by a Data Field
Data Interface A technical interface (e.g. REST webservice) regularly providing
a stream of datasets for a Data Set
Database Information system persisting data for business transactions and
analysis
External System External System that either is an interactive system used by Smart
Data Users or a system serving the Smart Data Platform as a Data
Source
Infrastructure Data The most common domain of Master Data in a City context. In
contrast to real-time data, Master Data have a low update
frequency.
Master Data Data describing fundamental and independent entities, e.g.
the temperature sensor id and its location (following ISO 8000). In
the context of the Data Gatekeeper concept, this means
metadata and data relevant for steering the usage of
databases or describing data of databases.
Participation Process of continuous involvement of citizens and various
stakeholders inside and outside the City. Co-Creation is one
possible method of Participation.
Platform
Administrator
Technical Administrator for the Smart Data Platform inside or
outside the City. He or she ensures the availability of the data
platform to all stakeholders and realizes technical adjustments,
configurations and implementations in case of new Use Cases.
(example for task: Operations)
Platform Provider In case a City does not want to run a Smart Data Platform on
its own infrastructure, a Platform Provider company or organisation
could run a Smart Data Platform on behalf of the City.
Processing
Functionality
Concrete data manipulation function, e.g. the replacement
of Data Object identifiers
Processing
Measure
Manipulation of single Data Objects and their Data Values in a
Data Set, necessary for compliance reasons. In the scope of this
concept the conceptual procedures
Anonymisation/Pseudonymisation are understood as
Processing Measures that can be realized by various Processing
Functionalities.
Real-Time Data Real-Time Data are data stemming from sensors with a small
update time interval (e.g. 1 min or 30 sec.).
Service Developer
The Service Developer designs and evaluates a human-
computer interaction concept for an External System if
necessary (e.g. not necessary for hardware solution like Lamp
Post). Therefore he or she is involved in the Requirements
Specification and other phases of the Use Case Creation
Process. Further he or she develops an app or constructs and
tests a hardware solution. His or her responsibility is further to
prevent security flaws on application level and to ensure the
availability of the Smart Data Use Case.
(example for task: Data Provision)
Smart City A Smart City is understood as an urban environment with digitally
wired inhabitants in which a wide net of actors exists (cf. Walser & Haller,
2016, p. 19).
Smart Data Smart Data is understood as analysis solutions based on big data
with a clear purpose, semantic, data quality, security and data
privacy (based on Jähnichen 2015).
Smart Data
Administration
All roles involved in the Smart Data Management on side of the
City.
(example for task: Strategy)
Smart Data Board Committee of various leading persons within the City or the urban
environment. They decide strategically on new Smart City Use
Cases and Smart Data topics and serve as the escalation level for decisions.
(example for task: Strategy)
Smart Data
Creation
Process that elaborates and realizes a single Smart Data Use
Case, see Use Case Creation Process
Smart Data Owner Leading person (e.g. municipal board member) responsible and
accountable for Smart Data in the City or urban environment. He
or she is the internal sponsor and has main budget responsibility.
(example for task: Strategy)
Smart Data
Platform
Technical Platform that serves as basis for smart services and for
the organisational Data Gatekeeper Concept.
Smart Data User User that uses a Smart Data app or another External System that
uses Data Sets from the Smart Data Platform
(example for task: Operations)
Strategic Manager Key project manager and responsible for the Smart Data
Platform and related strategic decisions, moderations and
negotiations. He or she elaborates Data Usage Agreements with
Data Suppliers.
(example for task: Strategy)
Transaction Data Data that changes dynamically, e.g. temperature values
from a sensor Data Set.
Use Case In the context of this concept a Smart Data Use Case is a
concrete Smart Data User scenario such as Smart Home or Smart
Lamp Posts.
Use Case Creation
Process
Process that has to be started in case a new Smart Data Use Case
shall be elaborated and added to the Smart Data Platform
(described in chapter 3 and 4). Synonym:
Use Case
Responsible
The Use Case Responsible can be on the City side or on the side of
another organisation. He or she:
• requests Data Analysis Queries
• is responsible for subject-matter aspects of a Smart City Use Case
• ensures running operation and provides support
• analyzes usage and uses data for city-internal or external purposes
according to usage agreements
(example for task: Operations)
Use Case Topic
Area
Groups of Use Cases in the Smarter Together Project such as
Smart Home, Mobility or Energy.
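The glossary entries for Categories, Classification and Classification Dimensions can be read as a small data structure. The following Python sketch is only one possible reading, not the platform's actual data model: the class names, the field names and the unit of the time-to-live (days) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Category(IntEnum):
    """The four Data Categories listed in the glossary."""
    OPEN_DATA = 1
    NON_PERSONAL_NOT_OPEN = 2
    NON_PERSONAL_RESTRICTED = 3
    PERSONAL_DATA = 4


@dataclass(frozen=True)
class Classification:
    """Rating levels 1 to 4 in the four Classification Dimensions."""
    integrity: int       # I: integrity and data protection
    processing: int      # P: processing and analysis
    redistribution: int  # R: redistribution and modification
    storage: int         # S: storage and deletion
    time_to_live_days: Optional[int] = None  # assumed unit: days

    def __post_init__(self) -> None:
        # Reject any level outside the range 1..4 named in the glossary.
        for name in ("integrity", "processing", "redistribution", "storage"):
            level = getattr(self, name)
            if not 1 <= level <= 4:
                raise ValueError(f"{name} level must be 1-4, got {level}")

    def label(self) -> str:
        """Compact label built from the dimension letters and levels."""
        return (f"I{self.integrity} P{self.processing} "
                f"R{self.redistribution} S{self.storage}")


@dataclass(frozen=True)
class DataSet:
    """A Data Set with its Category and Classification (hypothetical shape)."""
    name: str
    category: Category
    classification: Classification

    @property
    def is_open_data(self) -> bool:
        return self.category is Category.OPEN_DATA
```

Validating the levels when the object is created keeps an out-of-range rating from entering the platform unnoticed.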
Bibliography
BBC (2012). Google 'fails to meet EU rules' on new privacy policy.
URL: http://www.bbc.com/news/business-17192234
BSI (2011). Privacy Impact Assessment Guideline Authors: Marie Caroline Oetzel, Sarah
Spiekermann, Ingrid Grüning, Harald Kelter, Sabine Mull, Editor: Julian Cantella,
Bundesamt für Sicherheit in der Informationstechnik, 53133 Bonn. URL:
https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/ElekAusweise/PIA/Privacy_Impact_Assessment_Guideline_Kurzfasssung.pdf?__blob=publicationFile&v=1
Edwards, J., & Wolfe, S. (2005). Compliance: A review. Journal of Financial Regulation
and Compliance, 13(1), 48–59.
EU (2016). General Data Protection Regulation (GDPR): Regulation (EU) 2016/679 of
the European Parliament and of the Council of 27 April 2016 on the protection
of natural persons with regard to the processing of personal data and on the
free movement of such data, and repealing Directive 95/46/EC. URL:
http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679
EU (2003). Directive 2003/98/EC of the European Parliament and of the Council of 17
November 2003 on the re-use of public sector information. URL:
http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32003L0098
EU (1995). Directive 95/46/EC of the European Parliament and of the Council of 24
October 1995 on the protection of individuals with regard to the processing of
personal data and on the free movement of such data. URL:
http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:31995L0046
Holzmann, R. (2016). Betrug und Korruption im Experiment: Ansätze für ein
evidenzbasiertes Compliance-Management. Wiesbaden: Springer
Fachmedien.
Jacka, J. M., & Keller, P. J. (2009). Business Process Mapping: Improving Customer
Satisfaction. New Jersey: John Wiley & Sons
Jähnichen, S. (2015). Von Big Data zu Smart Data – Herausforderungen für die
Wirtschaft. Smart Data Newsletter Volume 1/2015, Smart-Data-Begleitforschung
c/o Loesch Hund Liepold Kommunikation GmbH, 10115 Berlin. URL:
http://www.digitale-technologien.de/DT/Redaktion/DE/Downloads/Publikation/SmartData_NL1.pdf?__blob=publicationFile&v=5
Li, Y., Dai, W., Ming, Z., & Qiu, M. (2016). Privacy Protection for Preventing Data Over-
Collection in Smart City. IEEE Transactions on Computers, 65(5), 1339-1350.
doi:10.1109/TC.2015.2470247
OECD (2013). The OECD Privacy Framework. URL:
http://www.oecd.org/sti/ieconomy/oecd_privacy_framework.pdf
Scheuch, R., Gansor, T., & Ziller, C. (2012). Master Data Management: Strategie,
Organisation, Architektur. Heidelberg: dpunkt-Verlag.
Walser, K., & Haller, S. (2016). Smart Governance in Smart Cities. In: A. Meier & E.
Portmann (Eds.), Smart City: Strategie, Governance und Projekte (pp. 19-46).
Wiesbaden: Springer Vieweg.
Wang, Y., & Kobsa, A. (2008). Privacy-Enhancing Technologies. URL:
https://www.ics.uci.edu/~kobsa/papers/2008-IUI-Book-kobsa.pdf
Appendix
7.1 Legal Regulations
7.1.1 Principles on Privacy for Personal Data
An overview with wider explanation for each key principle listed in chapter 2.1 Legal
Regulations:
Consent and Necessity of Processing
“Personal data shall be processed lawfully […].” GDPR, Article 5, 1(a)
There are only a few possibilities for collecting and processing personal data:
1) The Data Subject has given consent
2) It is necessary for the performance of a contract to which the Data Subject is
party
3) It is necessary for compliance with a legal obligation
4) It is necessary in order to protect the vital interests of the Data Subject or of
another person
5) It is necessary for the performance of a task carried out in the public interest or
in the exercise of official authority
6) It is necessary for the purposes of the legitimate interests pursued by the
controller or by a third party (except where such interests are overridden by the
interests or fundamental rights and freedoms of the data subject).
At least one of the above needs to apply for each purpose.
If the collection and processing of data is based on consent, it is necessary to comply
with the conditions for consent, listed in Article 7 (GDPR). Moreover, there are special
conditions regarding the processing of personal data of children, special Categories
of personal data and personal data relating to criminal convictions and offences (see
GDPR, Article 8-10).
References: GDPR, Article 6-10
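The rule that at least one of the six grounds must apply for each purpose can be checked mechanically once purposes and their legal bases are recorded. The following Python sketch is only an illustration: the enum member names, the function purposes_without_basis and the example purposes are assumptions made here and are not taken from the GDPR text or the Data Gatekeeper concept.

```python
from enum import Enum


class LawfulBasis(Enum):
    """The six grounds for processing, numbered as in the list above."""
    CONSENT = 1
    CONTRACT = 2
    LEGAL_OBLIGATION = 3
    VITAL_INTERESTS = 4
    PUBLIC_TASK = 5
    LEGITIMATE_INTERESTS = 6


def purposes_without_basis(purposes: dict[str, set[LawfulBasis]]) -> list[str]:
    """Return, in sorted order, the purposes with no recorded lawful basis."""
    return sorted(name for name, bases in purposes.items() if not bases)


# Hypothetical register of purposes for one Use Case.
register = {
    "street lighting control": {LawfulBasis.PUBLIC_TASK},
    "app usage statistics": set(),
}
missing = purposes_without_basis(register)  # purposes still lacking a basis
```

A purpose reported by this check would have to be given a lawful basis, or dropped, before any personal data is collected for it.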
Data minimisation
“Personal data shall be […] limited to what is necessary in relation to the purposes for
which they are processed.” GDPR, Article 5, 1(c)
1) It is necessary to evaluate carefully what data are needed for the purpose,
especially when processing personal data.
2) The collected data needs to be related to the purpose.
3) It must be ensured that only the needed data are collected and processed.
4) If there are several options to achieve the purpose, the least privacy-invasive
one should always be chosen.
5) Personal data must be erased without undue delay when the personal data are
no longer necessary in relation to the purposes.
References: GDPR, Article 5, 1(c)
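In technical terms, point 3 often comes down to dropping every Data Field that the purpose does not need before a Data Object is stored or passed on. A minimal Python sketch, assuming a Data Object is represented as a dictionary of Data Fields and Data Values; the function name minimise and the example fields are hypothetical:

```python
def minimise(data_object: dict, needed_fields: set[str]) -> dict:
    """Keep only the Data Fields that are needed for the stated purpose."""
    return {field: value
            for field, value in data_object.items()
            if field in needed_fields}


# Hypothetical sensor reading that also carries a personal identifier.
reading = {"sensor_id": "LP-17", "temperature": 21.5, "user_email": "a@example.org"}
stored = minimise(reading, {"sensor_id", "temperature"})
```

Listing the needed fields explicitly, rather than listing the fields to remove, means that a field newly added at the Data Source is excluded by default.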
Purpose limitation and specification
“Personal data shall be collected for specified, explicit and legitimate purposes and
not further processed in a manner that is incompatible with those purposes.” GDPR,
Article 5, 1(b)
1) Data may only be collected for legitimate purposes (see principle
‘lawfulness’).
2) The purposes for which the data are collected need to be specified in
advance.
3) The data collected for specified purposes may only be used for exactly
these purposes, unless:
o the new purposes are compatible with the initial ones (see Article 6, 4).
4) For purposes that are not compatible with the initial ones, the data may not
be processed unless the consent of the Data Subject is obtained again.
References: GDPR, Article 5, 1(b)
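The four points above amount to a short decision rule. The Python sketch below assumes that the compatibility of a new purpose has already been assessed by the responsible persons and recorded as a set; the function name may_process and its parameters are hypothetical.

```python
def may_process(requested_purpose: str,
                specified_purposes: set[str],
                compatible_purposes: set[str],
                renewed_consent: bool = False) -> bool:
    """Decide whether data collected for the specified purposes may be used."""
    if requested_purpose in specified_purposes:
        return True  # exactly a purpose that was specified in advance
    if requested_purpose in compatible_purposes:
        return True  # recorded as compatible with the initial purposes
    # Any other purpose requires the consent of the Data Subject again.
    return renewed_consent
```

The legal assessment of compatibility itself is not automated here; the code only enforces its recorded outcome.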
Storage limitation
“Personal data shall be kept in a form which permits identification of Data Subjects
for no longer than is necessary for the purposes for which the personal data are
processed.” GDPR, Article 5, 1(e)
As soon as personal data are no longer needed for the purpose for which they have
been processed, they should be erased. Personal data may be stored longer insofar as
they will be processed solely for:
– archiving purposes in the public interest
– scientific or historical research purposes
– statistical purposes (see GDPR, Recitals 156 and 162)
References: GDPR, Article 5, 1(e)
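On a platform, storage limitation is typically enforced with a retention period per Data Set, such as the time-to-live mentioned in the glossary entry on Classification. The following Python sketch assumes that each Data Object carries a collection timestamp; the function name, the purpose labels and the set EXEMPT_PURPOSES are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed labels for the three exemptions listed above.
EXEMPT_PURPOSES = {"archiving", "research", "statistics"}


def is_due_for_erasure(collected_at: datetime,
                       time_to_live: timedelta,
                       now: datetime,
                       purpose: str = "") -> bool:
    """Return True if a Data Object has outlived its time-to-live."""
    if purpose in EXEMPT_PURPOSES:
        return False  # may be stored longer for these purposes only
    return now - collected_at > time_to_live
```

A periodic job could call this check for every stored Data Object and erase those for which it returns True.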
Transparency
“Personal data shall be processed […] in a transparent manner in relation to the
Data Subject.” GDPR, Article 5, 1(a)
The collection and processing of personal data should be handled in a transparent
way, therefore the controller is obligated to:
1) provide the Data Subject at the time when personal data are obtained, with
information (where applicable) about:
o the identity and the contact details of the controller and of the
controller's representative
o the contact details of the data protection officer
o the purposes of the processing for which the personal data are intended
as well as the legal basis for the processing
o the legitimate interests pursued by the controller or by a third party
o the recipients or Categories of recipients of the personal data
o the fact that the controller intends to transfer personal data to a third
country or international organisation
o the reference to the appropriate or suitable safeguards and the means
by which to obtain a copy of them or where they have been made
available
2) facilitate the exercise of Data Subject rights
3) provide information on action taken on a request (see Articles 15 - 22) to the
Data Subject without undue delay; if the response is further delayed, e.g. because of
complexity, the Data Subject needs to be informed.
4) provide information, as well as any communication and any actions taken
(under Articles 15 - 22 & 34), free of charge.
The controller is allowed to ask for further information to confirm the identity of
the requester if there are any doubts.
References: GDPR, Article 12-14; 19; 34
Data Subject’s Participation
“Personal data shall be processed […] in a transparent manner in relation to the
Data Subject.” GDPR, Article 5, 1(a)
To guarantee a transparent handling of personal data, the Data Subject has the right
of access, the right to rectification, to erasure, to restriction of processing, to data
portability and to object, meaning in detail that the Data Subject has the right to:
1) obtain confirmation as to whether or not personal data concerning him or her
are being processed, and, where that is the case, access to the personal data
and the following information (where applicable):
o the purposes of the processing
o the Categories of personal data concerned
o the recipients or Categories of recipient to whom the personal data have
been or will be disclosed
o the envisaged period for which the personal data will be stored
o the rights of the Data Subject
o any available information as to their source, if not collected from the
Data Subject
o the existence of automated decision-making
2) obtain without undue delay the rectification of inaccurate personal data
concerning him or her.
3) obtain from the controller the erasure of personal data concerning him or her
without undue delay.
4) obtain from the controller restriction of processing, when:
o the processing is unlawful and the Data Subject opposes the erasure of
the personal data and requests the restriction of their use instead
o the accuracy of the personal data is contested
o the personal data are no longer necessary in relation to the purposes,
but they are required by the data subject for the establishment, exercise
or defence of legal claims
o the Data Subject has objected to processing until it is verified whether
the legitimate grounds of the controller override those of the Data
Subject
5) receive the personal data concerning him or her, which he or she has provided
to a controller, and transmit those data to another controller, when:
o the processing is based on consent or on a contract and is carried
out by automated means
o it doesn’t affect the rights and freedoms of others
6) object on grounds relating to his or her particular situation, at any time to
processing of personal data concerning him or her.
7) not be subject to a decision based solely on automated processing,
including profiling.
References: GDPR, Article 15-18; 20-22
Accuracy
“Personal data shall be accurate and, where necessary, kept up to date.” GDPR,
Article 5, 1(d)
When processing personal data, it is necessary to verify (regularly) that the data is
accurate. Every reasonable step must be taken to ensure that personal data that are
inaccurate are erased or rectified immediately.
References: GDPR, Article 5, 1(d); 16
Security, confidentiality and integrity
“Personal data shall be processed in a manner that ensures appropriate security of
the personal data […].” GDPR, Article 5, 1(f)
To ensure security for personal data, including protection against unauthorised or
unlawful processing and against accidental loss, destruction or damage, the
controller and the processor are responsible for the implementation of appropriate
technical and organisational measures to ensure a level of security appropriate to the
risk.
1) The state of the art, costs of implementation, scope, context, and the risk of
varying likelihood and severity for the rights and freedoms of natural persons
should be considered when deciding about appropriate measures, like:
o the pseudonymisation and encryption of personal data
o the ability to ensure the ongoing confidentiality, integrity, availability and
resilience of processing systems and services
o the ability to restore the availability and access to personal data after an
incident
o a process for regularly testing, assessing and evaluating the effectiveness
of technical and organisational measures for ensuring the security of the
processing
2) When assessing the appropriate level of security, the risks that are presented by
processing personal data need to be considered, especially risks regarding
accidental or unlawful destruction, loss, alteration and unauthorised disclosure
of personal data as well as access to transmitted, stored or otherwise processed
personal data.
3) Adherence to an approved code of conduct (see Article 40) or an approved
certification mechanism (see Article 42) may be used to demonstrate
compliance with this principle.
4) Steps need to be taken to ensure that any natural person who has access to
personal data does not process them except on instructions from the controller.
References: GDPR, Article 32, 35-36
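Pseudonymisation, the first measure named in point 1, can be realised for example by replacing an identifier with a keyed hash. The Python sketch below uses HMAC-SHA256 from the standard library; this particular technique, the function name pseudonymise_id and the example key are assumptions for illustration and are not prescribed by the source.

```python
import hashlib
import hmac


def pseudonymise_id(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with its keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical key; in practice it has to be generated randomly and
# stored separately from the pseudonymised Data Set.
key = b"example-key-kept-outside-the-data-platform"
pseudonym = pseudonymise_id("citizen-4711", key)
```

The same identifier always maps to the same pseudonym, so Data Objects can still be linked for analysis. Whoever holds the key can re-create that mapping, so the result is pseudonymised rather than anonymised data.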
Obligations and accountability
Processing personal data implies several responsibilities and statutory obligations of
the controller and processor, who need to:
1) erase personal data without undue delay when:
o the Data Subject withdraws consent
o the Data Subject objects to the processing
o the data have been unlawfully processed
o erasure is required for compliance reasons
o the personal data are no longer necessary in relation to the purposes
except when processing is necessary:
o for exercising the right of freedom of expression
and information
o for compliance with a legal obligation
o for reasons of public interest in the area of public health
o for archiving purposes in the public interest, scientific or historical
research purposes or statistical purposes
o for the establishment, exercise or defence of legal claims
2) notify the supervisory authority of a personal data breach, without undue delay
(not later than 72 hours), unless the personal data breach is unlikely to result in
a risk to the rights and freedoms of natural persons.
3) notify the Data Subject of a personal data breach without undue delay, when
the personal data breach is likely to result in a high risk to the rights and freedoms
of natural persons.
4) Where a type of processing is likely to result in a high risk to the rights and
freedoms of natural persons, the controller should, prior to the processing:
o carry out an assessment of the impact of the envisaged processing
operations on the protection of personal data (see Article 35).
o seek the advice of the data protection officer, where designated, when
carrying out a data protection impact assessment (see Article 36).
5) designate a data protection officer in any case where:
o the processing is carried out by a public authority or body
o the core activities of the controller or the processor consist of processing
operations which, by virtue of their nature, their scope and/or their
purposes, require regular and systematic monitoring of data subjects on
a large scale.
o the core activities of the controller or the processor consist of processing
on a large scale of special Categories of data and personal data relating
to criminal convictions and offences.
References: GDPR, Article 24-31, 33-37
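The notification duties in points 2 and 3 depend on the assessed risk and on a 72-hour window. The Python sketch below counts the window from the moment the controller became aware of the breach, as Article 33 GDPR does; the three risk labels and both function names are hypothetical.

```python
from datetime import datetime, timedelta

NOTIFICATION_WINDOW = timedelta(hours=72)


def authority_deadline(became_aware_at: datetime) -> datetime:
    """Latest time for notifying the supervisory authority."""
    return became_aware_at + NOTIFICATION_WINDOW


def required_notifications(risk_level: str) -> set[str]:
    """Map an assessed risk level to the parties that must be notified."""
    if risk_level == "unlikely":  # breach unlikely to result in a risk
        return set()
    if risk_level == "risk":
        return {"supervisory authority"}
    if risk_level == "high":      # likely to result in a high risk
        return {"supervisory authority", "data subject"}
    raise ValueError(f"unknown risk level: {risk_level}")
```

The risk assessment itself stays a human decision; the code only derives the resulting duties and the deadline.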
7.1.2 Objectives and Principles of Opening Administrative Data
The objectives of the directive are ‘to facilitate the creation of community-wide
information products and services based on public sector documents, to enhance an
effective cross-border use of public sector documents by private companies for
added-value information products and services and to limit distortions of competition
on the Community market.’ (EU, 2003)
The directive contains a set of rules governing the re-use and the practical means of
facilitating re-use of existing documents held by public sector bodies of the member
states. It does not contain an obligation to allow re-use of documents. The decision
whether or not to authorize re-use will remain with the Member States or the public
sector body concerned.
The directive builds on the existing access regimes in the Member States. It should be
implemented and applied in full compliance with the principles relating to the
protection of personal data in accordance with the GDPR as well as with provisions of
international agreements, community and national law.
The full list of principles of the directive, covering the aspects ‘requests for re-use’,
‘conditions for re-use’ and ‘non-discrimination and fair trading’:
Requests for re-use:
1) Public sector bodies should, through electronic means where possible and
appropriate, process requests for re-use and make the document available for
re-use to the applicant.
2) Where no time limits or other rules regulating the timely provision of documents
have been established, requests should be processed within a timeframe of not
more than 20 working days after their receipt.
3) In case of a negative decision, the public sector bodies shall communicate the
grounds for refusal to the applicant on the basis of the relevant provisions of the
access regime in that Member State.
4) A negative decision should contain a reference to the means of redress.
Reference: 2003/98/EC, Article 4
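The 20-working-day timeframe in point 2 can be turned into a concrete due date. The Python sketch below treats Monday to Friday as working days and ignores public holidays, which is a simplifying assumption; the function name add_working_days is hypothetical.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date reached after the given number of working days."""
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# A request received on Monday, 4 February 2019.
due = add_working_days(date(2019, 2, 4), 20)
```

For a real deadline, the public holidays applicable to the public sector body concerned would have to be excluded as well.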
Conditions for re-use:
1) Available formats:
o Documents should be made available by public sector bodies in any
pre-existing format or language, through electronic means where
possible and appropriate.
o It does not require public sector bodies to create or adapt documents in
order to comply with the request.
o Public sector bodies are not required to continue the production of a
certain type of documents.
2) Principles governing charging:
o Where charges are made, the total income from supplying and allowing
re-use of documents should not exceed the cost of collection,
production, reproduction and dissemination, together with a reasonable
return on investment.
3) Transparency:
o Any applicable conditions and standard charges for the re-use of
documents held by public sector bodies should be pre-established and
published, through electronic means where possible and appropriate.
o If requested, the public sector body should indicate the calculation basis
for the published charge.
4) Licences:
o Public sector bodies may allow for re-use of documents without
conditions or may impose conditions, where appropriate through a
licence.
o Conditions should not unnecessarily restrict possibilities for re-use and
should not be used to restrict competition.
5) Practical arrangements:
o Member States should ensure that practical arrangements are in place
that facilitate the search for documents available for re-use.
o An example of such arrangements is asset lists, preferably accessible
online.
References: 2003/98/EC, Article 5-9
Non-discrimination and fair trading:
1) Non-discrimination:
o Any applicable conditions for the re-use of documents should be non-
discriminatory for comparable Categories of re-use.
o If documents are re-used by a public sector body as input for its
commercial activities which fall outside the scope of its public tasks, the
same charges and other conditions should apply as to other users.
2) Prohibition of exclusive arrangements:
o The re-use of documents should be open to all potential actors in the
market.
o Contracts or other arrangements between the public sector bodies
holding the documents and third parties should not grant exclusive rights.
o Where an exclusive right is necessary for the provision of a service in the
public interest, the validity of such an exclusive right should be reviewed
regularly (at least every 3 years).
o Exclusive arrangements should be transparent and made public.
References: 2003/98/EC, Article 10-11
7.2 Checklist Templates for the Smart Data Creation
This chapter contains a library of checklists that help to complete the recommended
steps of the Smart Data Use Case Creation Process when it is carried out in the
exemplary way.
The checklists implement the instructions given in chapter 3 in order to simplify the
subject and help to understand and reproduce the described approach. If any of the
steps in the following subchapters are unclear, it is suggested to (re)read the
corresponding section in chapter 3, where each step is described in more detail.
The following chapter is structured in table form. The left column lists the keywords and
the right column is left open for answers. Table lines with blue background indicate a
headline for the following table lines. Table lines with green background indicate an
action and include an exemplary role that performs the task. A long (under)line
requires a name or another term to be filled in there.
7.2.1 Requirement Specification (Epic)
Strategic Committee: Discuss and Decide on Initiative for new Use Case
Basic Use Case Description
An identification number
A unique name
Analysis purpose and motivation/business model
Task assignments (how to achieve the goal)
Analysis results (output)
Needed Data
Time period
Design ideas of the analysis
Exceptions
Specific requirements
Project Stakeholders and Use Case specific Roles
Definition of contact persons
    Company C:
    Mr. Max Mustermann
    Address XY
    Tel.:
    E-Mail: example@company_c.com
Assignment of required roles
    Use Case Responsible: Mr./Mrs. __________________________________
    Data Owner: Mr./Mrs. __________________________________
    Data Protection Officer: Mr./Mrs. __________________________________
    Data Administrator: Mr./Mrs. __________________________________
    Service Developer: Mr./Mrs. __________________________________
Definition of further required tasks or roles
Assignment of new task(s)/role(s)
Technical Requirements
The kind of Data Set and the location
Data format(s)
Interfaces for the import and export of the data (API access)
Time and rate of exchange
Specific requirements (modalities)
Suggestion for Data Processing and Exchange
Data Categories
    open data
    non-personal data but not open
    non-personal data but restricted
    personal data
Type of contract for data usage
    License
    License with side letter
    Individual Agreement
Measures
    anonymization
    pseudonymization
    aggregation
    access restriction/protection
    ______________________
7.2.2 Data Categorisation
Data Categorization
Data Category Attribute 1: Temperature
    open data
    non-personal data but not open
    non-personal data but restricted
    personal data
Data Category Attribute 2: Humidity
    open data
    non-personal data but not open
    non-personal data but restricted
    personal data
Data Category Attribute 3: ________________
    open data
    non-personal data but not open
    non-personal data but restricted
    personal data
7.2.3 Data Usage Agreement and Licensing
Strategic Manager: Create draft for Data Usage Agreement
Strategic Manager: Initiate and moderate Data Usage Agreement meeting
Contract Specifications
Type of contract for data usage License
Strategic Manager: Initiate and moderate Use Case Agreement meeting
Data Administrator: Approve Data Format and Technical Feasibility
Data Protection Officer: Legal Check Requirement Specification
License with side letter
Individual Agreement
Licensing type ___________________________
Data Protection Officer: Legal Check Data Usage Agreement
7.2.4 Data Classification
Data Classification
Data Set Attribute 1: Temperature
    Open Data: yes / no
    Time-to-live: ____________________ (Day / Month / Year)
    Classification: Integrity and data protection (I): I1 / I2 / I3 / I4
    Classification: Processing and analysis (P): P1 / P2 / P3 / P4
    Classification: Redistribution and modification (R): R1 / R2 / R3 / R4
    Classification: Storage and deletion (S): S1 / S2 / S3 / S4
Data Set Attribute 2: Humidity
    Open Data: yes / no
    Time-to-live: ____________________ (Day / Month / Year)
    Classification: Integrity and data protection (I): I1 / I2 / I3 / I4
    Classification: Processing and analysis (P): P1 / P2 / P3 / P4
    Classification: Redistribution and modification (R): R1 / R2 / R3 / R4
    Classification: Storage and deletion (S): S1 / S2 / S3 / S4
Data Protection Officer: Legal Check Data Classification
Aggregate Classification
Calculation Rule: Take the maximum over all Attributes for each of I, P, R and S.
Take into account the (possibly different) time-to-live of the individual Attribute(s).
I ___
P ___
R ___
S ___
Time-to-live: ____________________ (Day / Month / Year)
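The calculation rule for the aggregate classification can be sketched as follows. This is an illustrative example only: the class and function names are hypothetical, and the handling of the time-to-live is an assumption (the earliest time-to-live of all Attributes is taken for the whole Data Set), since the checklist only asks to take differing values into account.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class AttributeClassification:
    """Classification of one Data Set Attribute (levels 1-4 per dimension)."""
    name: str
    i: int  # Integrity and data protection
    p: int  # Processing and analysis
    r: int  # Redistribution and modification
    s: int  # Storage and deletion
    time_to_live: Optional[date] = None  # None = no time-to-live given


def aggregate_classification(
    attributes: List[AttributeClassification],
) -> AttributeClassification:
    """Take the maximum level per dimension over all attributes."""
    if not attributes:
        raise ValueError("at least one attribute is required")
    ttls = [a.time_to_live for a in attributes if a.time_to_live is not None]
    return AttributeClassification(
        name="aggregate",
        i=max(a.i for a in attributes),
        p=max(a.p for a in attributes),
        r=max(a.r for a in attributes),
        s=max(a.s for a in attributes),
        # Assumption: the earliest time-to-live governs the whole Data Set.
        time_to_live=min(ttls) if ttls else None,
    )


temperature = AttributeClassification("Temperature", 1, 2, 1, 3, date(2020, 12, 31))
humidity = AttributeClassification("Humidity", 2, 1, 3, 1, date(2020, 6, 30))
print(aggregate_classification([temperature, humidity]))
```

With the two example Attributes, the aggregate is I2, P2, R3, S3 with a time-to-live of 30 June 2020.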
7.2.5 Quality Gate 1
Use Case Responsible: Elaborate final Use Case concept
Strategic Manager: Final Concept Approval (Q1)
Strategic Manager: Sign Data Usage Agreement
7.2.6 Data Sets and Technical Implementation
Specification for Technical Implementation
Data Scheme
Compatible with Data Model: yes / no
Existence of all necessary Attributes: yes / no
7.2.7 Data Processing
Sampling rate
Data Administrator: Create Data Scheme and Interface Specification
Service Developer: Create Technical Specification
Use Case Responsible: Technical Specification Approval
Platform Administrator: Implementation and Data Set Creation
Data Administrator: Implementation of Data Interface
Processing Measures and Functions
Measures
    anonymization
    pseudonymization
    aggregation
    access restriction/protection
    ______________________
Function for measure 1: ______________________
(if measure 1 is anonymization or pseudonymization)
    Removing person-related Data Fields
    Removing ID digits
    Creating hash values
    ______________________
Function for measure 2: ______________________
(if measure 2 is aggregation)
    Remove subcategories of a Data Set
    Create an average value for a defined period of time
    Aggregate (sub-) Categories
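Three of the functions listed above (removing person-related Data Fields, creating hash values, and creating an average value for a defined period of time) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the field names, the secret key and the one-hour period are hypothetical, and the keyed hash (HMAC-SHA256) is one possible choice, not the method prescribed by the platform. Whether a given hash satisfies the required irreversibility has to be assessed per Use Case.

```python
import hashlib
import hmac
from collections import defaultdict
from datetime import datetime
from statistics import mean


def remove_person_related_fields(record: dict, person_fields: set) -> dict:
    """Return a copy of the record without the person-related Data Fields."""
    return {key: value for key, value in record.items() if key not in person_fields}


def keyed_hash(value: str, secret_key: bytes) -> str:
    """Replace an identifier by a keyed SHA-256 hash (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def hourly_average(readings):
    """Average (timestamp, value) readings per full hour."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


record = {"sensor_id": "s1", "owner_name": "Max Mustermann", "temperature": 21.5}
print(remove_person_related_fields(record, {"owner_name"}))
print(keyed_hash("s1", b"hypothetical-secret-key"))
print(hourly_average([
    (datetime(2019, 2, 22, 10, 5), 20.0),
    (datetime(2019, 2, 22, 10, 35), 22.0),
    (datetime(2019, 2, 22, 11, 10), 19.0),
]))
```

Averaging per hour is only an example of a defined period; the appropriate period and the minimum number of values per bucket depend on the Use Case.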
7.2.8 Quality Gate 2
Use Case Responsible: Specification and Implementation Review
Strategic Manager: Use Case Integration Test
Use Case Responsible: Go Live Pilot Phase
Strategic Manager: Analysis User Test
Strategic Manager: Final Use Case Approval and Go Live
______________________
Platform Administrator: Implement Processing Functions
7.3 RACI
The figure shows the activities described in the respective sections and the roles that are
in charge of these activities in RACI matrix notation (cf. Jacka & Keller, 2009).
7.4 Categories
Open data
For data belonging to the category ‘open data’, the main focus is on the prerequisites
for passing on the data and on the respective user agreements, licensing, and further
agreed conditions.
Data in this category is not subject to any privacy restrictions, meaning that it contains
no personal or person-related data. This data has the lowest protection need of the
four Categories.
RACI matrix (columns): Section, Task, Smart Data Board, Smart Data Owner City, Strategic Smart Data Responsible, Data Protection Officer City, Platform Administrator, Analysis User, Use Case Responsible, Conceptual and Technical Developer, Data Owner Supplier, Technical Data Steward Supplier, Data Protection Officer Supplier
3.2 Discuss and Decide on Initiative for new Use Case C R C C
3.2 Basic Use Case Description C C R C C C
3.2 Specification Project Stakeholders and Roles R I
3.2 Technical Requirements I C R C
3.2 Suggestion for Data Processing and Exchange R I C
3.2 Initiate and moderate Use Case Agreement meeting I I R I C C C C I C
3.2 Approve Data Format and Technical Feasibility C C R
3.2 Legal Check Requirement Specification I R I C
3.3 Data Categorization per Data Set Attribute I C C R C
3.3 Data Classification per Data Set Attribute I C C R
3.3 Legal Check Data Classification I I R I I
3.4 Create draft for Data Usage Agreement R C C C
3.4 Initiate and moderate Data Usage Agreement meeting R C I C I C C
3.4 Legal Check Data Usage Agreement I R
3.5 Elaborate final Use Case concept C C C R C C
3.5 Final Concept Approval (Q1) C R I I C
3.5 Sign Data Usage Agreement A R C A C
3.6 Create Data Scheme and Interface Specification C I C R
3.6 Create Technical Specification C I R C
3.6 Technical Specification Approval C I R I
3.6 Implementation and Data Set Creation R I C
3.6 Implementation of Data Interface C I R
3.6 Develop App or Hardware solution C C R
3.7 Derive necessary Processing Measures and Functions C R C C C C
3.7 Implement Processing Functions R I
3.8 Specification and Implementation Review C C R C C
3.8 Use Case Integration Test R C I C C C
3.8 Go Live Pilot Phase C C R C C
3.8 Analysis User Test R C C C C
3.8 Final Use Case Approval and Go Live C R C I I C I C I
It is necessary to ensure or check that data in this category:
- doesn’t contain personal data
- doesn’t contain data that can be traced back to a person
- doesn’t contain data that, when combined with other data, can be traced
back to a person
- is proven to be open data
It is also necessary to check:
- the conditions under which the data may be processed and/or distributed
- which licenses the data is subject to
Non-personal data but not open
Data in this category cannot be unequivocally classified as public data, but is not
subject to any privacy restrictions. This means that it does not contain any information
about persons that requires protection and consequently cannot be used to violate
individual personality rights.
Because the data has not (yet) been released or marked as open data, there
might be restrictions regarding its use, processing, distribution or storage.
Topics that need to be considered within this category therefore include usage and
distribution rights and agreements as well as licenses and further restrictions.
It is necessary to ensure or check that data in this category:
- doesn’t contain personal data
- doesn’t contain data that can be traced back to a person
- doesn’t contain data that, when combined with other data, can be traced
back to a person
- is not marked as open data
- has a lawful owner
It is also necessary to check:
- the conditions under which the data may be processed and/or distributed
- which restrictions the data is subject to
Non-personal data but restricted
Data belonging to this category cannot be unequivocally classified as personal data,
but carries a (potential) risk that under special circumstances it can be
attributed to persons and therefore needs to be treated accordingly.
This means that data in this category does not contain any direct personal information.
It is, however, possible that the data, when linked or evaluated with other information,
allows individual information to be extracted, such as the behavior of small groups, or
even conclusions about an individual. Furthermore, data in this category can be
structured in such a way that there is a certain (individually specified) risk of misuse,
even without any person-related information being involved.
When processing this kind of data, it needs to be verified and ensured that the
processing does not allow any kind of traceability to individuals or micro groups. If this
is not possible, the data is either not allowed to be processed within the Smart Data
Platform or it needs to be treated as personal data, and hence compliance with all
privacy regulations needs to be ensured.
This means that for this category the kind of protection that needs to be applied to
the data has to be specified individually.
It is necessary to consider the following questions for data in this category:
- Does the data contain personal data?
- Is it possible to trace the data back to a person? (Is there enough different
data?)
- Is it possible to trace the data back to a person when it is processed with
other data?
- What is the risk that the data can be used to violate personal rights?
- Which measures should be taken to minimize the risk?
- What are the conditions for (a lawful) processing of the data?
Personal data
Personal data is highly sensitive and confidential data and hence has the highest
protection need. When processing personal data, the responsible person is obliged to
create detailed written documentation of the reasons for the processing as well as of
possible risks and all measures taken to ensure data privacy and security. Personal
data should not be processed unless it is absolutely necessary.
In some cases, however, it may be necessary to process personal data. This is
possible if the principles in chapter Legal Regulations are taken into account and
adhered to.
There are different ways to transform (personal) data within a platform
so that it can no longer be linked to an individual. Measures that can or need
to be taken can be found in chapter Data Processing.
It is necessary to consider the following questions for data in this category:
- Is it (originally) personal data?
- Is the processing legal?
- Is the personal data really necessary, or are there other possible solutions?
- Is the data collection proportionate to the result/benefit?
- Is the data marked accordingly?
- What measures can/need to be taken?
- Are there any specific conditions that were agreed on?
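The four Categories described in this section can be read as a simple decision order. The sketch below is a simplification for illustration only: the three yes/no inputs and all names are hypothetical, and in practice the answers come from the checks and questions listed per Category and are subject to the legal check by the Data Protection Officer.

```python
from enum import Enum


class DataCategory(Enum):
    OPEN_DATA = "open data"
    NON_PERSONAL_NOT_OPEN = "non-personal data but not open"
    NON_PERSONAL_RESTRICTED = "non-personal data but restricted"
    PERSONAL = "personal data"


def categorize(
    contains_personal_data: bool,
    traceable_to_person: bool,
    proven_open_data: bool,
) -> DataCategory:
    """Pick the Category with the highest applicable protection need.

    traceable_to_person: the data could be attributed to a person or micro
    group, on its own or when combined with other data.
    """
    if contains_personal_data:
        return DataCategory.PERSONAL
    if traceable_to_person:
        return DataCategory.NON_PERSONAL_RESTRICTED
    if proven_open_data:
        return DataCategory.OPEN_DATA
    return DataCategory.NON_PERSONAL_NOT_OPEN


print(categorize(False, False, True).value)
print(categorize(False, True, True).value)
```

The order of the checks reflects the protection need: a risk of traceability takes precedence over an open data marking.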
7.5 Classification
Integrity and data protection
The Classification “I” stands for “integrity and data protection”. It ensures that data is
processed lawfully and securely with regard to those aspects.
Level 1 (I1): No further measures need to be taken.
Data neither contains personal information nor allows backtracing to an individual.
Furthermore, the data is not subject to any license or agreement condition that
requires specific treatment regarding integrity and data protection.
Level 2 (I2): Data needs to be anonymized or pseudonymized (see footnote 5) before
further processing.
Data contains information that could be used to identify a person when
analyzed with other data and therefore needs to be anonymized or pseudonymized
accordingly.
Level 3 (I3): Data needs to be aggregated and combined with more or similar data of
this kind before it can be processed and analyzed. The necessary amount must be
identified individually per Use Case, documented in writing and coordinated.
Level 4 (I4): Data needs to be treated as personal data, and it is therefore necessary
to comply with the data protection laws and legal requirements described in chapter
Legal Regulations.
Processing and analysis
The Classification “P” stands for “processing and analysis”. It ensures that data is
processed lawfully and securely with regard to those aspects.
Level 1 (P1): No restrictions.
Data neither contains personal information nor allows backtracing to an individual.
Furthermore, the data is not subject to any license or agreement condition that
requires specific treatment regarding processing and analysis.
Level 2 (P2): Data can only be processed and analyzed with other data when it is
ensured that, as a result of the processing, it (still) won’t be possible to trace the data
back to an individual.
In this case, it is necessary to identify which data, or what kind of data, would make it
possible to trace the data back to a person. This needs to be described in detail,
documented in writing and coordinated.
Level 3 (P3): Data can only be processed and analyzed with preassigned data within
the Use Case.
In this case, the conditions for further processing and analysis need to be set
individually per Use Case and embedded in the terms of use or the license. The
processing and analysis possibilities need to be coordinated, arranged and implemented
with the responsible stakeholders.
5 Pseudonymisation is only applicable when it is ensured that there is no way to trace the data back to its initial form, meaning the pseudonymisation needs to be irreversible.
Level 4 (P4): Data is not allowed to be processed or analyzed with other data.
Data contains personal data or comparably sensitive information and is therefore
not allowed to be used for any analysis other than the initial purpose for which it was
collected. If this data is needed for other analyses, it is necessary that the principle
‘Consent and Necessity of Processing’ is met for the new purpose.
Redistribution and modification
The Classification “R” stands for “redistribution and modification”. It ensures that data
is processed lawfully and securely with regard to those aspects.
Level 1 (R1): No restrictions.
Data neither contains personal information nor allows backtracing to an individual.
Furthermore, the data is not subject to any license or agreement condition that
requires specific treatment regarding redistribution and modification.
Level 2 (R2): Data can be transmitted to external users only under specific conditions,
which are specified in the usage agreement.
These conditions need to be set individually per Use Case and embedded in the terms
of use or the license. It is also necessary to coordinate those conditions with the
operator of the Smart Data Platform, in order to ensure that the distribution or
modification is technically possible and to determine what is necessary to implement it.
Level 3 (R3): Data is only allowed to be transmitted internally and to the data
provider.
The provisions for the data need to be defined individually with the operators of the
Smart Data Platform in the context of the Use Case.
Level 4 (R4): Data is only allowed to be transmitted to specified internal users.
The provisions for the data need to be defined individually with the operators of the
Smart Data Platform in the context of the Use Case. Data contains personal data or
comparably sensitive information, and it therefore needs to be ensured through
access protection that it is only accessible to certain persons and is only
transmitted to specific internal persons.
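The redistribution levels R1 to R4 can be expressed as a small permission check. The following sketch is illustrative only; the recipient types and the function name are hypothetical. Two points are assumptions that the text does not state explicitly: that R2 data may be passed to internal users and to the data provider without further conditions, and that specified internal users count as internal recipients at R3.

```python
from enum import Enum


class Recipient(Enum):
    EXTERNAL_USER = "external user"
    DATA_PROVIDER = "data provider"
    INTERNAL_USER = "internal user"
    SPECIFIED_INTERNAL_USER = "specified internal user"


def may_transmit(
    r_level: int,
    recipient: Recipient,
    usage_agreement_conditions_met: bool = False,
) -> bool:
    """Return True if data of the given R level may be sent to the recipient."""
    if r_level == 1:
        return True  # R1: no restrictions
    if r_level == 2:
        if recipient is Recipient.EXTERNAL_USER:
            # R2: external transmission only under the agreed conditions
            return usage_agreement_conditions_met
        return True  # assumption: non-external recipients are permitted at R2
    if r_level == 3:
        # R3: only internally and to the data provider
        return recipient is not Recipient.EXTERNAL_USER
    if r_level == 4:
        # R4: only to specified internal users
        return recipient is Recipient.SPECIFIED_INTERNAL_USER
    raise ValueError(f"unknown R level: {r_level}")


print(may_transmit(2, Recipient.EXTERNAL_USER, usage_agreement_conditions_met=True))
print(may_transmit(4, Recipient.DATA_PROVIDER))
```

Such a check only covers the transmission decision; the technical access protection required for R4 has to be defined with the operators of the Smart Data Platform.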
Storage and deletion
The Classification “S” stands for “storage and deletion”. It ensures that data is
processed lawfully and securely with regard to those aspects.
Level 1 (S1): No restrictions.
Data neither contains personal information nor allows backtracing to an individual.
Furthermore, the data is not subject to any license or agreement condition that
requires specific treatment regarding storage and deletion.
Level 2 (S2): Data needs to be stored in-house, but is not subject to any restriction
regarding deletion.
The implementation of the provisions needs to be defined with the operators of the
Smart Data Platform. In special cases or under certain circumstances, access
protection will be necessary, so that only specific persons (defined in advance)
can access the data.
Level 3 (S3): Data needs to be stored in-house and the deletion limit needs to be
observed.
The implementation of the provisions needs to be defined with the operators of the
Smart Data Platform. In special cases or under certain circumstances, access
protection will be necessary, so that only specific persons (defined in advance)
can access the data.
If there is a deletion limit, the limited availability of this data needs to be mentioned
when transferring the data. Data that is already anonymized can be stored without
any restrictions, unless defined differently in the contract or license.
Level 4 (S4): Data contains personal data or comparably sensitive information and
therefore needs to be stored at a preassigned, secure place, if necessary with specific
access protection, and the deletion limit needs to be observed.
The storage and deletion of this kind of legally sensitive data needs to be defined
individually with the operators of the Smart Data Platform in the context of the Use
Case.
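The storage and deletion levels can be combined with the time-to-live from the Data Classification checklist in a simple check. This is a minimal sketch with hypothetical names; it assumes that the time-to-live is recorded as a calendar date and that the deletion limit is exceeded on the day after that date.

```python
from datetime import date
from typing import Optional

# Storage requirement per S level, as described in the text above.
STORAGE_RULES = {
    1: "no restrictions",
    2: "in-house storage",
    3: "in-house storage",
    4: "preassigned, secure place",
}


def deletion_limit_applies(s_level: int) -> bool:
    """S3 and S4 require the deletion limit to be observed."""
    if s_level not in STORAGE_RULES:
        raise ValueError(f"unknown S level: {s_level}")
    return s_level >= 3


def deletion_due(s_level: int, time_to_live: Optional[date], today: date) -> bool:
    """Return True if the data has to be deleted by now."""
    if not deletion_limit_applies(s_level) or time_to_live is None:
        return False
    # Assumption: the data may be kept up to and including the time-to-live date.
    return today > time_to_live


print(STORAGE_RULES[3], deletion_due(3, date(2019, 6, 30), date(2019, 7, 1)))
```

A scheduled job on the platform could run such a check per Data Set; how deletion is carried out technically remains to be agreed with the operators.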