WP.1 Project Management D1.3 Data Management Plan (1)



Grant Agreement Number: 693171

Acronym: RECAP

Project Full Title: Personalised public services in support of the implementation of the CAP

Start Date: 01/05/2016

Duration: 30 months

Project URL: www.recap-h2020.eu

DOCUMENT HISTORY:

Version | Issue Date | Stage | Changes | Contributor
1.0 | 24/10/2016 | Draft | Draft for review | DRAXIS
2.0 | 25/10/2016 | Draft | Review Feedback | LAAS
3.0 | 31/10/2016 | Final | Final version | DRAXIS

Deliverable Number & Name: D1.3 Data Management Plan (1)

Work Package Number & Name: WP.1 Project Management

Date of Delivery: 31/10/2016 Contractual: 31/10/2016 Actual: 31/10/2016

Nature: Report

Dissemination Level: Public

Lead Beneficiary: DRAXIS

Responsible Author: Ifigeneia-Maria Tsioutsia (DRAXIS)

Contributions from: Stavros Tekes (DRAXIS), Christodoulos Keratidis (DRAXIS), Polimachi Simeonidou (DRAXIS), Ioannis Papoutsis (NOA), Alice Mauchline (UREAD), Joseba Aranguren (INI), Manolis Tsantakis (ETAM), Gintare Kucinskiene (LAAS)

© RECAP Consortium, 2016

This deliverable contains original unpublished work except where clearly indicated otherwise. Acknowledgement of previously published material and of the work of others has been made through appropriate citation, quotation or both. Reproduction is authorised provided the source is acknowledged.

Disclaimer

Any dissemination of results reflects only the author's view and the European Commission is not responsible for any use that may be made of the information it contains.


Table of Contents

Executive Summary
1. Methodology
1.1 Data Summary
1.2 FAIR data
1.2.1 Making data findable, including provisions for metadata
1.2.2 Making data openly accessible
1.2.3 Making data interoperable
1.2.4 Increase data re-use
1.3 Allocation of resources
1.4 Data security
1.5 Ethical aspects
1.6 Other issues
2. DMP Components in RECAP
2.1 DMP Components in WP1 – Project Management (DRAXIS)
2.2 DMP Components in WP2 – Users’ needs analysis & co-production of services
2.3 DMP Components in WP3 – Service integration and customisation
2.3.1 System Architecture
2.3.2 Website content farmer
2.3.3 User uploaded photos
2.3.4 Website content inspectors
2.3.5 E-learning material
2.3.6 CC laws and rules
2.3.7 Information extraction and modeling from remotely sensed data
2.4 DMP Components in WP4 – Deployment and operation
2.5 DMP Components in WP5 – Dissemination & Exploitation


Executive Summary

The purpose of this deliverable is to present the first Data Management Plan (DMP) of the RECAP project. It is a collective product of work between the coordinator and the rest of the consortium partners. The scope of the DMP is to describe the data management life cycle for all datasets to be collected, processed or generated in all Work Packages during the 30 months of the RECAP project. FAIR Data Management is strongly promoted by the Commission and, since RECAP is a data-intensive project, due attention has been given to this task. However, the DMP is a living document: more detailed information will be made available through updates as the implementation of the RECAP project progresses and when significant changes occur. This document is the first of three versions of the Data Management Plan to be produced during the RECAP project.

The deliverable is structured in the following chapters:

Chapter 1 describes the methodology used.

Chapter 2 describes the DMP Components.


1. Methodology

The methodology used to compile D1.3 is based on the updated “Guidelines on FAIR Data Management in Horizon 2020” [1], version 3.0, released on 26 July 2016 by the European Commission Directorate-General for Research & Innovation. The RECAP DMP addresses the following issues:

Data Summary

FAIR data

Making data findable, including provisions for metadata

Making data openly accessible

Making data interoperable

Increase data re-use

Allocation of resources

Data security

Ethical aspects

Other issues

The RECAP project coordinator (DRAXIS) provided all work package leaders and the other partners, in good time, with a template covering the 10 issues listed above, along with instructions for completing it.

1.1 Data Summary

The Data Summary addresses the following issues:

Outline the purpose of the collected/generated data and its relation to the objectives of the RECAP project.

Outline the types and formats of data already collected/generated and/or foreseen for generation at this stage of the project.

Outline the reusability of the existing data.

Outline the origin of the data.

Outline the expected size of the data.

Outline the data utility.

RECAP proposes a methodology for improving the efficiency and transparency of the compliance monitoring procedure through a cloud-based Software as a Service (SaaS) platform, which will make use of large volumes of publicly available data provided by satellite remote sensing and of user-generated data provided by farmers through mobile devices. Therefore, the majority of the data will fall into the following categories:

Remote Sensing Imagery (VHR)

Free Satellite Data (Sentinel, LandSat)

Copernicus and GEOSS-DataCore open products

User Photos (geo-referenced and dated photos from users’ smartphones)

User data (data related to a farmer’s plants)

Compliance data (user actions related to compliance requirements)

[1] European Commission (26 July 2016), Guidelines on FAIR Data Management in Horizon 2020, Version 3.0.

At this stage of the project this list is by no means exhaustive, but it provides the basis from which the RECAP project has developed the user requirements in relation to the RECAP platform.

One of the main concepts of the project is to involve farmers in the data collection and contribution process. The idea is to make this as simple as possible for them. In that way RECAP will be able to collect a large amount of information and data on farmers’ activities and habits related to compliance. By collecting and organising these data and combining them with remote sensing imagery, RECAP will also be able to gain more insight into the auditing process, identify misconduct and mistreatment, recognise good practices and trace back what went wrong and what thrived. Privacy issues will be taken into account in order to ensure that no personal or sensitive data of any farmer are disclosed.

Data sharing and accessibility for verification and re-use will be available through the RECAP project platform, which is open to anyone. The use of open standards and architecture will also allow other uses of these data and their integration with other related applications. Data obtained by RECAP will be openly available under open data licenses for use by:

All the public control and paying agencies who are in charge of payments, oppositions, compensation and recovery of support granted under the CAP.

The farmers associations that will use parts of the system to support their farmers in complying with the Cross Compliance Scheme.

The agricultural consultants that will use the data in order to provide services to their farmers in complying with the Cross Compliance Scheme.

The research partners in RECAP (UREAD and NOA) which will use the data and the results for further scientific and research purposes.

Within RECAP, all personal data used in the project will be protected. When possible, the data collected in the project will be available to third parties in contexts such as scientific scrutiny and peer review. As documented in D1.1 – Project Management Handbook, the external reviewers of deliverables will sign a confidentiality declaration, which includes the following statement: “I hereby declare that I will treat all information, contained within the above mentioned deliverable and which has been disclosed to me through the review of this deliverable, with due confidentiality.”

Finally, it is expected that the RECAP project will result in a number of publications in scientific, peer-reviewed journals. Project partners are encouraged to collaborate with each other and jointly prepare publications relevant to the RECAP project. Scientific journals that provide open access (OA) to all their publications will be preferred, as required by the European Commission.


1.2 FAIR data

1.2.1 Making data findable, including provisions for metadata

This point addresses the following issues [1]:

Outline the discoverability of data (metadata provision).

Outline the identifiability of data and refer to a standard identification mechanism.

Outline the naming conventions used.

Outline the approach towards search keywords.

Outline the approach for clear versioning.

Specify standards for metadata creation (if any).

This point refers to existing suitable standards of the discipline, and outlines how and what metadata will be created. Therefore, at this stage, the available data standards (if any) accompany the description of the data that will be collected and/or generated, including a description of how the data will be organised during the project, covering for example naming conventions, version control and folder structures. As far as the metadata are concerned, the way the consortium will capture and store this information should be described. For instance, for data records stored in a database with links to each item, metadata can pinpoint their description and location. There are various disciplinary metadata standards [2]; however, the RECAP consortium has identified a number of available best practices and guidelines for working with Open Data, mostly from organisations or institutions that support and promote Open Data initiatives, and these will be taken into account. These include:

Open Data Foundation [3]

Open Knowledge Foundation [4]

Open Government Standards [5]

[2] http://www.dcc.ac.uk/resources/metadata-standards
[3] http://www.opendatafoundation.org/
[4] https://okfn.org/
[5] http://www.opengovstandards.org/

Furthermore, data will be interoperable, adhering to standards for data annotation and data exchange that are compatible with available software applications related to agriculture. Standards that will be taken into account in the project are:


INSPIRE: Infrastructure for Spatial Information in the European Community. Addresses spatial data themes needed for environmental applications [6].

IACS: Integrated Administration and Control System. IACS is the most important system for the management and control of payments to farmers made by the Member States in application of the Common Agricultural Policy [7].

AGROVOC: This is the most comprehensive multilingual thesaurus and vocabulary for agriculture nowadays. It is owned and maintained by a community of institutions all over the world and curated by the Food and Agricultural Organisation of the United Nations (FAO).

Dublin Core and ISO/IEC 11179 Metadata Registry (MDR): This addresses issues in the metadata and data modelling space.
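As an illustration of how a dataset description could be captured with Dublin Core, the following minimal Python sketch serialises a handful of Dublin Core elements to XML. Only the Dublin Core element names and namespace are standard; the wrapping `metadata` element, the function name `dublin_core_record` and all field values are assumptions made for this example and are not taken from the RECAP platform.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialise a dict of Dublin Core elements to an XML string.

    The wrapping <metadata> element is an assumption of this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical values, for illustration only.
example = {
    "title": "User uploaded photos",
    "creator": "RECAP Consortium",
    "subject": "cross compliance",
    "date": "2016-10-31",
    "format": "image/jpeg",
    "identifier": "Data_WP3_1_User uploaded photos",
}
record_xml = dublin_core_record(example)
```

A record of this kind could be stored next to each dataset so that title, date and format remain searchable independently of the data itself.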

1.2.2 Making data openly accessible

This point addresses the following issues [1]:

Specify which data will be made openly available and, if some data is kept closed, explain the reason why.

Specify how the data will be made available.

Specify what methods or software tools are needed to access the data, whether documentation about the software is necessary, and whether it is possible to include the relevant software (e.g. in open source code).

Specify where the data and associated metadata, documentation and code are deposited.

Specify how access will be provided in case there are any restrictions.

1.2.3 Making data interoperable

This point will describe the assessment of the data interoperability specifying what data and metadata vocabularies, standards or methodologies will be followed in order to facilitate interoperability. Moreover, it will address whether standard vocabulary will be used for all data types present in the data set in order to allow inter-disciplinary interoperability.

1.2.4 Increase data re-use

This point addresses the following issues [1]:

Specify how the data will be licensed to permit the widest reuse possible.

Specify when the data will be made available for re-use.

Specify if the data produced and/or used in the project is useable by third parties, especially after the end of the project.

Describe the data quality assurance processes.

Specify the length of time for which the data will remain re-usable.

1.3 Allocation of resources

This point addresses the following issues [1]:

Estimate the costs for making the data FAIR and describe the method of covering these costs.

Identify responsibilities for data management in the project.

Describe costs and potential value of long term preservation.

[6] http://inspire.jrc.ec.europa.eu/
[7] http://ec.europa.eu/agriculture/direct-support/iacs/index_en.htm

1.4 Data security

This point will address data recovery as well as secure storage and transfer of sensitive data.

1.5 Ethical aspects

This point will cover the context of the ethics review, the ethics section of the DoA and the ethics deliverables, including references and related technical aspects.

1.6 Other issues

Other issues will refer to other national/funder/sectoral/departmental procedures for data management that are used.


2. DMP Components in RECAP

2.1 DMP Components in WP1 – Project Management (DRAXIS)

DMP Component: Issues to be addressed

Data Summary: Contact details of project partners and advisory board. Databases containing all the necessary information regarding the project partners and Advisory Board members. The project partners’ data is stored in a simple table in the RECAP wiki, with the following fields:

Name

Email

Phone

Skype id

The advisory board members’ data is described by the following fields:

Name

Description

Affiliation

Organisation

Country

Proposed by

Additional fields will be added as the project progresses.

Making data findable, including provisions for metadata: N/A

Making data openly accessible: The databases will not be publicly available. They will only be accessible through the RECAP wiki, and only members of the consortium will have access to that material. Administration of the RECAP wiki will be restricted to the RECAP Coordinator (DRAXIS), and the databases will be updated when new data become available.

Making data interoperable: N/A

Increase data re-use: N/A

Allocation of resources: Preserving the contact details of the project partners and advisory board members for the entire duration of the project will facilitate internal communication.

Data security: The data will be preserved and shared with the members of the consortium through the RECAP wiki. The data is collected for internal use in the project and is not intended for long-term preservation. The work package leader keeps a quarterly backup on a separate disk.

Ethical aspects: N/A

Other issues: N/A


2.2 DMP Components in WP2 – Users’ needs analysis & co-production of services

DMP Component: Issues to be addressed

Data Summary: Data are collected to establish user needs for the initial requirements (D2.2: Report of user requirements in relation to the RECAP platform) and for the co-production phase (D2.4: Report on co-production of services); where applicable, results will also be used to produce peer-reviewed papers. The collection of data from end users is an integral part of the RECAP project, and co-production of the final product will help to ensure that a useful product is created. Questionnaire data (including written responses (.docx and .xlsx) and recordings (.mp3)) comprise the majority of the data. The work package leader may also collect previous inspection and BPS reports. The origin of the data is from:

Paying Agency partners in the RECAP project,

Farmers in the partner countries,

Agricultural consultants and accreditation bodies in the partner countries.

Written responses are likely to be fairly small in size (<1 GB over the course of the project). Recordings are larger files and are likely to total 10 - 20 GB over the course of the project. The data will be useful to the work package 3 leader for the production of the RECAP platform, to other partner teams throughout the project, and to the wider research community when results are published.

Making data findable, including provisions for metadata: When data is published in peer-reviewed papers it will be available to anyone who wishes to use it. As it contains confidential and sensitive information, the raw data will not be made available. Naming conventions: e.g. Data_<WPno>_<serial number of dataset>_<dataset title> (example: Data_WP1_1_User generated content). Data is stored on University of Reading servers and labelled with the work package, country of origin and the type of data. Data can be searched by country, WP number or data type. There are unlikely to be multiple versions of data collected; for example, each interview will be conducted on a single occasion. This data contains sensitive personal information so it cannot be made public. Data included in published papers will be anonymised and summarised by region or other suitable grouping criteria (e.g. farm type or farmer age), following the journal standards, to make it possible to include it in meta-analysis.

Making data openly accessible: The data contains sensitive personal data and therefore cannot legally be made public in complete form. Anonymised, summarised data will be available in any published papers.


Making data interoperable: Raw data cannot be made freely available because it contains sensitive personal information. Data included in published papers will be anonymised and will follow the standards of the journal to ensure that it can be used in meta-analysis.

Increase data re-use: Any data published in papers will be immediately available for meta-analysis. However, it is not legal to release sensitive personal data such as the questionnaire responses. Data quality is assured by asking partners to fill out paper questionnaires in their own languages. These are then translated and stored in spreadsheets. Separately, the interviews are recorded, translated and transcribed. This ensures accurate data recording and translation.

Allocation of resources: The cost of publishing papers in open access format is the key cost in this part of the project. During the project, money from the RECAP budget will be used to cover journal fees (approximately £1000/paper). Papers are likely to be published after the completion of the project; in this case the university has a fund to which we can apply in order to cover the costs of open access publishing. Data is stored on University of Reading servers.

Data security: University of Reading servers are managed by the university IT services. They are regularly backed up and secure.

Ethical aspects: N/A

Other issues: N/A
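The naming convention quoted in the WP2 table, Data_<WPno>_<serial number of dataset>_<dataset title>, can be sketched as a pair of helper functions. This is only an illustration: the function names and the regular expression are assumptions introduced here, and the sketch follows the quoted example in writing the work package as "WP" followed by its number.

```python
import re

# Pattern for Data_<WPno>_<serial number of dataset>_<dataset title>,
# where <WPno> is written as "WP" plus a number, as in the quoted example.
NAME_PATTERN = re.compile(r"^Data_WP(\d+)_(\d+)_(.+)$")


def build_dataset_name(wp_no, serial, title):
    """Compose a dataset name from its three parts."""
    return f"Data_WP{wp_no}_{serial}_{title}"


def parse_dataset_name(name):
    """Split a dataset name into (work package, serial number, title)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid dataset name: {name!r}")
    wp_no, serial, title = match.groups()
    return int(wp_no), int(serial), title


example_name = build_dataset_name(1, 1, "User generated content")
```

Parsing names back into their parts is what would make a search by WP number, as described in the table, straightforward to implement.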

2.3 DMP Components in WP3 – Service integration and customisation

2.3.1 System Architecture

DMP Component: Issues to be addressed

Data Summary: A report describing the RECAP platform in detail, containing information such as component descriptions and dependencies, API descriptions, information flow diagram, internal and external interfaces, hardware requirements and testing procedures. This will be the basis upon which the system will be built.

Making data findable, including provisions for metadata: The report will become both discoverable and accessible to the public once it has been delivered to the EU and the consortium decides to publish it. The report will contain a table stating all versions of the document, along with who contributed to each version, what the changes were and the date the new version was created.

Making data openly accessible: The data will be available in D3.1: System architecture. The dissemination level of D3.1 is public. It will be available through the RECAP wiki for the members of the consortium and, when the project decides to publicise deliverables, it will be uploaded along with the other public deliverables to the project website or anywhere else the consortium decides.

Making data interoperable: N/A

Increase data re-use: Engineers who want to build similar systems could use this as an example.


Allocation of resources: N/A

Data security: The Architecture report will be stored securely on the DRAXIS premises and will be shared with the rest of the partners through the RECAP wiki.

Ethical aspects: N/A

Other issues: N/A

2.3.2 Website content farmer

DMP Component: Issues to be addressed

Data Summary: Various data such as users’ personal information, farm information, farm logs, reports and shapefiles containing farm location will be generated via the platform. All of these data will be useful for the self-assessment process and the creation of meaningful tasks for the farmers. The data described above will be saved in the RECAP central database. All user actions (login, logout, account creation, visits on specific parts of the app) will be logged and kept in the form of a text file. This log will be useful for debugging purposes. Reports containing information on user devices (which browsers and mobile phones) as well as the number of mobile downloads (taken from the Play Store for Android downloads and the App Store for Mac downloads) will be useful for marketing and exploitation purposes, as well as for decisions regarding the supported browsers and operating systems.

Making data findable, including provisions for metadata: Every action on the website will produce meaningful metadata, such as the time and date of data creation or data amendments and the owners of the actions that took place. Metadata will assist the discoverability of the data and related information. Only the administrator of the app will be able to discover all the data generated by the platform. The database will not be discoverable to other network machines operating on the same LAN or VLAN as the DB server, or on other networks. Therefore only users with access to the server (RECAP technical team members) will be able to discover the database.

Making data openly accessible: Only registered users and administrators will have access to the data. The data produced by the platform is sensitive private data and cannot be shared with others without the user’s permission. No open data will be created as part of RECAP. The database will only be accessible by the authorised technical team.

Making data interoperable: N/A

Increase data re-use: N/A

Allocation of resources: N/A

Data security: All platform generated data will be saved on the RECAP database server. Encryption will be used to protect sensitive user data such as emails and passwords. All data will be transferred via SSL connections to ensure secure exchange of information. If updates are needed, the old data will be overwritten, all actions will be audited in detail and a log containing the changed text will be kept for security reasons. The system will be backed up daily and the backups will be kept for 3 days. All backups will be hosted on a remote server to avoid disaster scenarios. All servers will be hosted behind firewalls inspecting all incoming requests against known vulnerabilities such as SQL injection, cookie tampering and cross-site scripting. Finally, IP restriction will enforce the secure storage of data.

Ethical aspects: All farmer generated data will be protected and will not be shared without the farmer’s consent.

Other issues: N/A
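The backup policy stated under Data security (daily backups, kept for 3 days) can be sketched as a small retention check. This is a minimal illustration, not the RECAP implementation: the function name `expired_backups` is hypothetical, and the sketch assumes that a backup exactly 3 days old is still kept and only older ones are removed.

```python
from datetime import date, timedelta

RETENTION_DAYS = 3  # from the plan: backups are kept for 3 days


def expired_backups(backup_dates, today, retention_days=RETENTION_DAYS):
    """Return the backup dates that fall outside the retention window.

    Assumption of this sketch: a backup exactly `retention_days` old is kept.
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in backup_dates if d < cutoff)


# Six consecutive daily backups ending on an example date.
today = date(2016, 10, 31)
daily_backups = [today - timedelta(days=n) for n in range(6)]
to_delete = expired_backups(daily_backups, today)
```

With the example dates above, the two oldest backups (26 and 27 October) fall outside the window and would be removed from the remote backup server.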

2.3.3 User uploaded photos

DMP Component: Issues to be addressed

Data Summary: RECAP users will be able to upload photos from a farm. These photos will be timestamped and geolocated and will be saved in the RECAP DB or a secure storage area. The purpose of the images is to demonstrate compliance or non-compliance. The most common file type expected is jpg.

Making data findable, including provisions for metadata: Metadata related to the location and time at which the photo was taken, as well as a name, description and tag for the photo, will be saved. These metadata will help the discoverability of the photos within the platform. Farmers will be able to discover photos related to their farms (uploaded either by them or by the inspectors) and Paying Agencies will be able to discover all photos to which they have been granted access. The images folder will not be discoverable by systems or persons on the same or other servers in the same LAN/VLAN as the storage/database server.

Making data openly accessible: Some photos might be used openly within the RECAP platform as good practice examples, but only if the farmer allows it. Otherwise, and only if the farmer gives their consent, the photos will be accessible by the relevant RECAP users only.

Making data interoperable: Photos will be saved in jpeg format.

Increase data re-use: Farmers will be able to download photos and use them in any way they want. Inspectors and paying agencies will have limited ability to reuse the data, depending on the access level given by the farmer. This will be defined later in the project.

Allocation of resources: Preserving photos for a long time will offer both farmers and the paying agencies the opportunity to check field conditions of previous years and use them as examples to follow or avoid.

Data security User generated photos will be saved on the RECAP server. SSL connections will be established so that all data are transferred securely. In case of necessary updates, the old data will be overwritten and all actions will be audited in detail and a log will be kept, containing the changed text for security reasons. The system will be daily backed up and backups will be kept for 3 days. All backups will be hosted on a remote server to avoid disaster scenarios.

All servers will be hosted behind firewalls that inspect all incoming requests against known vulnerabilities such as SQL injection, cookie tampering and cross-site scripting. Finally, IP restriction will reinforce the secure storage of data.
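The backup policy stated in this section (daily backups kept for 3 days) can be expressed as a simple retention rule. The following is a minimal sketch under the assumption that "kept for 3 days" means the three most recent daily backups are retained; the function name `backups_to_delete` and the sample dates are hypothetical.

```python
from datetime import date, timedelta

RETENTION_DAYS = 3  # from the plan: backups are kept for 3 days


def backups_to_delete(backup_dates, today, retention_days=RETENTION_DAYS):
    """Return, oldest first, the backup dates that fall outside the retention window.

    Assumption: a backup taken `retention_days` or more days before `today`
    is expired, so today's backup and the two before it are kept.
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in backup_dates if d <= cutoff)


# Six consecutive daily backups, ending on an invented date.
today = date(2017, 3, 10)
daily = [today - timedelta(days=n) for n in range(6)]

expired = backups_to_delete(daily, today)
kept = sorted(set(daily) - set(expired))
print("delete:", [d.isoformat() for d in expired])
print("keep:  ", [d.isoformat() for d in kept])
```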

Ethical aspects All user generated data will be protected and will not be shared without the farmer’s consent.

Other issues N/A

2.3.4 Website content inspectors

DMP Component Issues to be addressed

Data Summary Inspection results will be generated by the inspectors through the system. The inspection results will be available through the farmer’s electronic record and will be saved in the RECAP central database.

Making data findable, including provisions for metadata

Metadata such as date, time, associated farmer and inspector, and inspection type will be saved along with the inspection results to enhance their discoverability. Inspectors will be able to discover all inspection results, whereas farmers will only be able to discover the results for their own farms. The administrator of the app will be able to discover all the inspection results generated by the platform. The database will not be discoverable to other network machines operating on the same LAN or VLAN as the DB server, or on other networks. Therefore, only users with access to the server (RECAP technical team members) will be able to discover the database.
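The discoverability rules above (inspectors and the administrator see all results, farmers only their own) amount to a role-based filter. The sketch below shows one possible reading of those rules; the role labels, class names and identifiers are assumptions made for illustration and are not part of the RECAP specification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class InspectionResult:
    farmer_id: str
    inspector_id: str
    inspection_type: str
    recorded_at: datetime


@dataclass(frozen=True)
class User:
    user_id: str
    role: str  # assumed labels: "inspector", "farmer", "administrator"


def discoverable_results(user, results):
    """Return the inspection results a given user may discover."""
    if user.role in ("inspector", "administrator"):
        return list(results)
    if user.role == "farmer":
        # Farmers only see results associated with themselves.
        return [r for r in results if r.farmer_id == user.user_id]
    return []  # unknown roles discover nothing


# Invented sample data.
results = [
    InspectionResult("farmer-1", "insp-1", "on-the-spot", datetime(2017, 4, 2, 10, 0)),
    InspectionResult("farmer-2", "insp-1", "remote", datetime(2017, 4, 3, 11, 0)),
]

print(len(discoverable_results(User("insp-1", "inspector"), results)))
print(len(discoverable_results(User("farmer-1", "farmer"), results)))
```

Returning an empty list for unrecognised roles is a deliberately conservative default, in line with the statement that inspection results contain sensitive private data.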

Making data openly accessible Inspection results contain sensitive private data and can only be accessed by inspectors and associated farmers. These data cannot be shared with others without the user’s permission. No open data will be created as part of RECAP. The database will only be accessible by the authorised technical team.

Making data interoperable It will be possible to export inspection results in PDF format and use them in other systems that local governments use to manage farmers’ payments.

Increase data re-use RECAP will be integrated with third party applications, currently being used by the local governments, in order to reuse information already inserted in those systems.

Allocation of resources N/A

Data security All platform-generated data will be saved on the RECAP database server. All data will be transferred via SSL connections to ensure secure exchange of information. When updates are necessary, the old data will be overwritten; all actions will be audited in detail and a log containing the changed text will be kept for security reasons. The system will be backed up daily and the backups will be kept for 3 days. All backups will be hosted on a remote server to guard against disaster scenarios. All servers will be hosted behind firewalls that inspect all incoming requests against known vulnerabilities such as SQL injection, cookie tampering and cross-site scripting. Finally, IP restriction will reinforce the secure storage of data.

Ethical aspects Inspection results will be protected and will not be shared without the farmer’s consent.

Other issues N/A

2.3.5 E-learning material

DMP Component Issues to be addressed

Data Summary As part of RECAP, videos and presentations will be created to educate farmers and inspectors on current best practices. Some will be available for users to view whenever they want, and others will be available only via live webinars. The e-learning material will mainly be created by the paying agencies, and existing material from other similar systems may be reused.

Making data findable, including provisions for metadata

Metadata such as video format, duration, size, time of views and number of participants in live webinars will be saved along with the videos and presentations in order to enhance the discoverability of the material. All registered users will be able to discover the e-learning material either via a search capability or via a dedicated area that lists all available sources. The database and the storage area will not be discoverable to other network machines operating on the same LAN or VLAN as the DB server, or on other networks. Therefore, only users with access to the server (RECAP technical team members) will be able to discover the database and the storage area.

Making data openly accessible The e-learning material will only be accessible through the RECAP platform. All RECAP users will have access to that material. The database will only be accessible by the authorised technical team.

Making data interoperable N/A

Increase data re-use N/A

Allocation of resources N/A

Data security Videos and PowerPoint presentations will be saved on the RECAP database server. All data will be transferred via SSL connections to ensure secure exchange of information. The system will be backed up daily and the backups will be kept for 3 days. All backups will be hosted on a remote server to guard against disaster scenarios.

Ethical aspects N/A

Other issues N/A

2.3.6 CC laws and rules

DMP Component Issues to be addressed

Data Summary Cross-compliance laws and inspection lists with checkpoints will be used both by inspectors during inspections and by farmers to perform a form of self-assessment. The lists will be provided to us by the Paying Agencies in various formats (Excel, Word) and will be transformed into electronic form.

Making data findable, including provisions for metadata

All registered users will have access to the laws and the inspection checklists via the RECAP platform. Metadata related to the different versions of the checklists and the latest updates of the laws, along with dates and times, will also be saved. These metadata will make the most up-to-date content easy to discover.
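To illustrate how version metadata can surface the most up-to-date content, the sketch below picks the latest version of each checklist by its update timestamp. It is an assumption about one possible approach; `ChecklistVersion`, `latest_versions` and the sample identifiers are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChecklistVersion:
    checklist_id: str     # hypothetical identifier of a checklist
    version: int
    updated_at: datetime  # date and time saved as metadata


def latest_versions(versions):
    """Map each checklist identifier to its most recently updated version."""
    latest = {}
    for v in versions:
        current = latest.get(v.checklist_id)
        if current is None or v.updated_at > current.updated_at:
            latest[v.checklist_id] = v
    return latest


# Invented sample data.
versions = [
    ChecklistVersion("checklist-a", 1, datetime(2016, 11, 1, 9, 0)),
    ChecklistVersion("checklist-a", 2, datetime(2017, 2, 15, 9, 0)),
    ChecklistVersion("checklist-b", 1, datetime(2017, 1, 10, 9, 0)),
]

for checklist_id, v in sorted(latest_versions(versions).items()):
    print(checklist_id, "version", v.version, "updated", v.updated_at.date())
```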

Making data openly accessible N/A

Making data interoperable N/A

Increase data re-use N/A

Allocation of resources N/A

Data security All content related to CC laws and inspections will be securely saved on the RECAP database server. All data will be transferred via SSL connections to ensure secure exchange of information. The system will be backed up daily and the backups will be kept for 3 days. All backups will be hosted on a remote server to guard against disaster scenarios.

Ethical aspects N/A

Other issues N/A

2.3.7 Remotely sensed data

DMP Component Issues to be addressed

Data Summary Satellite-based spectral indices and remote sensing classification products will be generated to establish an alerting mechanism for breaches of cross-compliance. The products will be used in WP4. Processing of open satellite data for monitoring CAP implementation is at the core of RECAP. Data will be available in raster and vector formats, accessible through a GeoServer application on top of a PostGIS database. Historical Landsat-based spectral indices may be used to support time-series analysis. The origin of the data will be:

USGS for Landsat (http://glovis.usgs.gov/) and

ESA for Sentinel, delivered through the Hellenic National Sentinel Data Mirror Site (http://sentinels.space.noa.gr/)

Sentinel-2 images are about 4 GB each and Landsat images around 1 GB each, both compressed. Assuming 4 pilot cases and a need for at least one image per month over a year, this accounts for 240 GB of image data in total. Indices and classification products will account for roughly an additional 10%, hence a total of about 250 GB of data is foreseen. Data and products will be useful for the Paying Agencies, the farmers themselves and the farmer consultants. They will be ingested by the RECAP platform and disseminated to project stakeholders, and their usefulness will be demonstrated during the pilot cases.
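The storage estimate above can be reproduced with a short calculation. The sketch assumes one Sentinel-2 and one Landsat image per pilot case per month for one year, which is one reading that yields the 240 GB figure; the constant and function names are illustrative. An exact 10% uplift gives 264 GB, so the total of 250 GB quoted above is best read as a rounded estimate.

```python
SENTINEL2_GB = 4      # approximate compressed size of one Sentinel-2 image
LANDSAT_GB = 1        # approximate compressed size of one Landsat image
PILOT_CASES = 4
IMAGES_PER_YEAR = 12  # at least one image per month
DERIVED_SHARE = 0.10  # indices and classification products: about 10% extra


def image_volume_gb(pilots=PILOT_CASES, images_per_year=IMAGES_PER_YEAR):
    """Raw image volume, assuming one image of each sensor per pilot per month."""
    return pilots * images_per_year * (SENTINEL2_GB + LANDSAT_GB)


def total_volume_gb(image_gb, derived_share=DERIVED_SHARE):
    """Image volume plus the share added by derived products."""
    return image_gb * (1 + derived_share)


images = image_volume_gb()
print("image data (GB):", images)
print("with derived products (GB):", round(total_volume_gb(images)))
```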

Making data findable, including provisions for metadata

The image data and the processed products will be available to all stakeholders through a PostGIS database. Registered users will have unlimited access to the products for the duration of the project. Data are stored on the National Observatory of Athens (NOA) servers and labelled with the work package, country of origin and type of data. GeoServer and PostGIS provide a built-in keyword search tool, which will be used, along with the Postgres MVCC versioning tool. INSPIRE metadata will be created for all EO-based geospatial products generated during the lifetime of the project.

Making data openly accessible Spectral indices and EO-based classification objects will be made available. Commercial VHR satellite imagery used in the context of the pilots will not be made openly available, owing to the associated restrictions of the satellite data vendor. Data and products will be made accessible through an API on top of a Postgres database. No special software is needed to access the data: a user can write scripts to access and query the database and retrieve relevant datasets. The data and associated metadata will be deposited on NOA’s servers.
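As a hedged illustration of the kind of script a user might write, the sketch below builds a standard WFS 2.0.0 `GetFeature` request, which GeoServer supports. It assumes that vector products are published through GeoServer's WFS interface; the plan does not specify the API, so the endpoint address and the layer name `recap:ndvi_2017` are purely hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the plan does not give the service address.
GEOSERVER_WFS_URL = "https://geoserver.example.org/geoserver/wfs"


def wfs_getfeature_url(layer, max_features=100, base_url=GEOSERVER_WFS_URL):
    """Build a WFS 2.0.0 GetFeature URL that asks for GeoJSON output."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": layer,                   # layer to query (assumed name)
        "count": max_features,                # upper limit on returned features
        "outputFormat": "application/json",   # GeoJSON output in GeoServer
    }
    return f"{base_url}?{urlencode(params)}"


url = wfs_getfeature_url("recap:ndvi_2017", max_features=10)
print(url)
```

The resulting URL can be fetched with any HTTP client, which is consistent with the statement that no special software is needed to access the data.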

Making data interoperable PostGIS and GeoServer are widely accessible tools for managing geospatial information. INSPIRE, the typical standard for geospatial data, will be used for metadata descriptors. No standard vocabulary will be used and no ontology mapping is foreseen.

Increase data re-use The PostGIS database created in RECAP will be licensed under the Open Data Commons Open Database License (ODbL). The EO-based geospatial products generated in RECAP will be made available for re-use during the project’s lifetime and will remain re-usable for at least two years after the project’s conclusion. No particular data quality assurance process is followed, and no relevant warranties will be provided.

Allocation of resources Costs for maintaining a database of the EO-based products generated to serve the pilot demonstrations are negligible. Publication fees (approximately €1000 per paper) are, however, foreseen. Data are stored on NOA’s servers. The cost of long-term preservation of the products generated for the pilots is minimal. However, if this is to scale up and go beyond the demonstration phase, making data FAIR will incur significant costs. Generating FAIR spectral indices and EO-based classification products for large geographical regions and with frequent updates has potential for cross-fertilization of different fields (e.g. precision farming, CAP compliance, environmental monitoring, disaster management).

Data security NOA servers are managed by the IT department. They are regularly backed up and secure.

Ethical aspects N/A

Other issues N/A

2.4 DMP Components in WP4 – Deployment and operation

DMP Component Issues to be addressed

Data Summary The purpose of the WP4 data is to identify all training needs for the pilot cases, to complete the training and to carry out pilot testing in the 5 locations: Spain, Greece, Lithuania, UK and Serbia. The WP4 data will also serve to monitor the effective conduct of the pilots and to provide feedback for enhancing the final RECAP platform. The data collected and generated in WP4 are needed to develop the platform and test it for the delivery of public services that will enable improved implementation of the Common Agricultural Policy (CAP), increasing the efficiency and transparency of public authorities, offering personalised services to farmers and stimulating the development of new added-value services by agricultural consultants, and also to develop personalised public services that help farmers comply better with CAP requirements. Where possible, online and/or electronic archives will be used. The main documents and formats used to collect and generate the necessary data will be the templates agreed in the D1.4: Pilot Plan. These will include templates for documents such as questionnaires, interviews, cooperation agreements, invitation letters to participate in the pilots, agendas and minutes of meetings, attendance sheets, application forms and informed consent forms. Semi-structured interviews with individuals will be collected and stored as digital audio recordings (e.g. MP3) only if the interviewees give their permission. If they decline, interview notes will be typed up according to agreed formats and standards. All transcripts will be in Microsoft Word (.doc/.docx).
In the D4.1: Pilot Plan/Impact Assessment Plan, the WP4 metadata, procedures and file formats for note-taking, recording, transcribing and storing visual data from participatory techniques, as well as semi-structured interview, questionnaire and focus group discussion data, will be developed and agreed. In other Work Packages, some existing general data are already being used to develop different tasks and deliverables, for example the compliance requirements in each country. In WP4, some existing data from the different pilot partners will likewise be re-used or made available to the project in the necessary format.

Generally, the research objectives require qualitative data that are not available from other sources. Some data can be used to situate and triangulate the findings of the proposed research, and will supplement the data collected as part of the proposed research. However, qualitative and attitudinal data are generally rare or of insufficient quality to address the research questions. The research objectives also require quantitative analysis of public data. The WP4 data will mainly originate from:

Partners of the project

Pilot partners

Public national/regional authorities of the pilot countries

Agricultural consultancy services of the pilot countries

Different farmers from the different pilot countries

These data will be collected through different templates, questionnaires, interviews, meetings and focus groups. The origin of the data and how they will be generated will be detailed in the D1.4-Pilot Plan. First, the WP4 data will be useful for the research purposes of the project, and therefore for its partners and for the improvement of the RECAP platform that will be developed in WP3. The WP4 data and the results and outputs of the project will also be useful for the regional/national CAP authorities in the pilot countries, for the agricultural consultancy services and, of course, for the farmers and farmers’ cooperatives.

Making data findable, including provisions for metadata

Files will be named according to the convention “data_name of the file_WPnº_TaskNº”.
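A small helper can keep file names consistent with this convention. The sketch below rests on several assumptions that the plan does not settle: that the leading `data` is a literal prefix (rather than, for example, a date), that spaces in the file name are replaced by hyphens, and that work package and task are given as plain numbers. The function name `build_file_name` is hypothetical.

```python
def build_file_name(name, wp_number, task_number):
    """Compose a file name following the pattern data_<name>_WP<n>_Task<n>.

    Assumptions: "data" is a literal prefix and whitespace in the name is
    replaced by hyphens so that underscores only separate the fields.
    """
    cleaned = "-".join(name.split())
    if not cleaned:
        raise ValueError("file name must not be empty")
    return f"data_{cleaned}_WP{wp_number}_Task{task_number}"


print(build_file_name("farmer questionnaire", 4, 2))
```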

Making data openly accessible The WP leader intends to use Hadoop [8], which supports multiple types of data, both structured and unstructured, and can generate value from them remarkably quickly. Another major benefit of Hadoop is that it is resilient to failure: when data are sent to an individual node, they are also replicated to other nodes in the cluster, so that in the event of failure another copy is available for use. Other NoSQL [9] technologies (e.g. MongoDB [10]) may also be used to store unstructured data where this is considered to improve efficiency. The new breed of NoSQL databases is designed to expand transparently to take advantage of new nodes, and they are usually designed with low-cost commodity hardware in mind.

Making data interoperable N/A

Increase data re-use WP4 data will start to be collected and generated in spring 2017, and all specifications and periods of use and re-use will be established in deliverable D4.1 Pilot Plan.

Allocation of resources N/A

[8] http://hadoop.apache.org/
[9] https://en.wikipedia.org/wiki/NoSQL
[10] https://www.mongodb.com/

Data security The WP4 data will need to be backed up regularly. To guard against virus problems, this will include regular sharing by email with the technological partners and the coordinator, so that up-to-date versions are stored on the servers of different institutions. Qualitative data will be backed up and secured by the partner responsible for WP4 on a regular basis, and metadata will include clear labelling of versions and dates. Some of the collected data are potentially sensitive, so a data protection system will be established, including the use of passwords and safe backup hardware.

Ethical aspects A letter explaining the purpose, approach and dissemination strategy of the pilot phase (including plans for sharing data), and an accompanying consent form (also covering data sharing), will be prepared and translated into the relevant languages by the pilot partners. A clear verbal explanation will also be provided to each interviewee and focus group participant. Commitments to confidentiality will be maintained by ensuring that recordings are not made public, that transcripts are anonymised, and that details that could identify participants are removed from transcripts or concealed in write-ups. Due to the highly focused nature of the pilot phase, many participants may be easily identifiable despite the efforts to ensure anonymity or confidentiality. In such cases, participants will be shown sections of transcript and/or report text in order to ensure the confidentiality of their interview data.

Other issues N/A

2.5 DMP Components in WP5 – Dissemination & Exploitation

DMP Component Issues to be addressed

Data Summary Data collection is necessary for the elaboration of the Dissemination and Communication Strategy, the establishment and management of the Network of Interest, the Market assessment and the Business plan. The data comprise: lists of communication recipients in Excel files, containing organisations/bodies and their e-mail addresses (parts of the lists were developed in previous projects of the WP leader; the rest were developed through desk research); Project User Group contact details (name and e-mail address), not yet fully specified and finalised; and information regarding direct/indirect competitors and data regarding Paying Agencies, agri-consultants and farmers (name/organisation and e-mail address), not yet fully specified and finalised.

Making data findable, including provisions for metadata

The publicly available deliverable “Dissemination and Communication Strategy” will facilitate the discoverability of the data it contains.

Making data openly accessible Data concerning e-mail addresses will not be openly available, as they are personal data. Deliverables publicly posted on the RECAP website will make all relevant data available. No particular methods or software tools are needed to access the data.

Data are stored on ETAM’s server. Deliverables are posted on the RECAP website.

Making data interoperable N/A

Increase data re-use N/A

Allocation of resources Data management responsibilities have been allocated to two members of the WP project team.

Data security Automated backup of files.

Ethical aspects N/A

Other issues N/A