

D. 13.1 Initial list of requirements after stakeholder interviews including priority domains

MD-Paedigree - FP7-ICT-2011-9 (600932)


Model Driven Paediatric European Digital Repository

Call identifier: FP7-ICT-2011-9 - Grant agreement no: 600932

Thematic Priority: ICT - ICT-2011.5.2: Virtual Physiological Human

Deliverable 13.1

Initial list of requirements after stakeholder

interviews including priority domains

Due date of delivery: 30th November 2013

Actual submission date: 3rd December 2013

Start of the project: 1st March 2013

Ending Date: 28th February 2017

Partner responsible for this deliverable: HES-SO Valais

Version: 1.0


Dissemination Level: Public

Document Classification

Title Initial list of requirements after stakeholder interviews including priority domains

Deliverable 13.1

Reporting Period Month 1 to 9

Authors Ranveer Joyseeree, Henning Müller

Work Package 13

Security

Nature Report

Keyword(s) Analysis of user requirements, priority list

Document History

Name Remark Version Date

Deliverable 13.1 1.0 29.11.2013

List of Contributors

Name Affiliation

Ranveer Joyseeree HES-SO Valais

Prof. Dr. Henning Müller HES-SO Valais

David Manset Gnubila

Sebastien Gaspard Gnubila

Patrick Ruch HES-SO Geneva

Emilie Pasche HES-SO Geneva

Omiros Metaxas Athena

Harry Dimitropoulos Athena

Edwin Morley-Fletcher Lynkeus

List of reviewers

Name Affiliation

Prof. Rod Hose University of Sheffield

Bruno Dallapiccola OPBG

Abbreviations

MD-PAEDIGREE Model Driven Paediatric Digital Repository


Contents

1 Executive summary
2 Introduction
3 Assets available for development of MD-PAEDIGREE
4 Determining priority domains
    4.1 Analysis of user requirements in Questionnaire 1
        4.1.1 Use of MD-PAEDIGREE does not add to clinical workload
        4.1.2 Secure sharing of data between users
        4.1.3 Ability to search repository using the following criteria:
            4.1.3.1 Search by sex
            4.1.3.2 Search by age
            4.1.3.3 Search by image modality
            4.1.3.4 Search by anatomical structure
            4.1.3.5 Search by pathology
            4.1.3.6 Search by keywords
            4.1.3.7 Search by image similarity
        4.1.4 Support for search in multiple languages
        4.1.5 Easy upload of images and associated metadata
        4.1.6 Support for free-text or unstructured text reports
        4.1.7 Curation of uploaded data
        4.1.8 Automatic data anonymization or pseudonymization
        4.1.9 Automatic annotation of anatomical regions and detection of modalities
        4.1.10 Learning from usage pattern of users
        4.1.11 Support for data modelling and simulation in 3D
        4.1.12 Unified repository for clinical and research systems
        4.1.13 Rating system to assess the quality of stored cases
        4.1.14 Additional requirements identified through questionnaire
    4.2 Use cases identified by the Infostructure group
    4.3 Analysis of user requirements in Questionnaire 2
        4.3.1 Rating the priority of use-case: 'Decision Support - Search through external information'
        4.3.2 Complementary sources of information
        4.3.3 General user comments
5 Drawing a priority list of user requirements and use cases
    5.1 Specification sheet - in descending order of priority score
    5.2 Priority list of use-case scenarios
6 Conclusion
A Questionnaire 1
B Questionnaire 2
C Analysis of clinical and research workflows
    C.1 Clinical workflow
        C.1.1 General description
        C.1.2 Data acquired
        C.1.3 Difficult decisions
    C.2 Research workflow
        C.2.1 Cardiomyopathy
        C.2.2 Risk of Cardio-Vascular Disease in Obese Children and Adolescents
        C.2.3 JIA
        C.2.4 NND
    C.3 Outcome of analysis of workflows


1 Executive summary

This document presents an initial list of MD-PAEDIGREE requirements for the information management structure of the project after consultation with stakeholders via a survey. It also presents priority domains within this list. The requirements list will be updated at regular intervals throughout the project, and this document does not include all dependencies of the full system requirements. Rather, it gives an initial overview of the views of the stakeholders.

In addition, one of the key points made is that some of the planned components of the proposed system are intended to be re-used from past projects of members of the MD-PAEDIGREE community wherever possible. This means that several parts of MD-PAEDIGREE can be assembled easily and quickly since they already exist, have proven their worth in past projects, and are well understood.

In light of the above, it is very likely that a specific requirement can be addressed reasonably quickly in the likely event that expertise for its implementation is present within MD-PAEDIGREE. A question remaining to be answered after the survey is: which user requirement should be implemented first?

Answering this question is vital to the progression of the project since the quality of the early prototypes provided to the end users will have a major impact on the likelihood of success of the MD-PAEDIGREE infostructure. In fact, it is necessary to make any initial prototype of the system useful to as many end users as possible, as early as possible. In other words, if the prototype implements the most frequently desired functions of MD-PAEDIGREE, it will quickly have a strong positive impact on the work carried out by stakeholders and could, thus, quickly be integrated into usual clinical and/or research workflows.

Thus, the direction of integration and development of MD-PAEDIGREE needs to be dictated by the most pressing user needs and their dependencies. To determine what those needs are, two questionnaires were prepared in the first months of the project. An analysis of the results is provided in this report. A complete initial list of user requirements is then presented in descending order of priority.


2 Introduction

Model-Driven Paediatric Digital Repository (MD-PAEDIGREE) is a clinically-driven project with strong roots in the Virtual Physiological Human (VPH) initiative [1]. It aims to allow experienced partners in the VPH community, a leading European medical-applications powerhouse, highly qualified small and medium enterprises (SMEs), and world-renowned clinical centres to work together in order to develop a system that facilitates access to paediatric biomedical information and enables the creation of reusable, adaptable, multi-scale models for personalized and predictive patient care.

Four pathologies have been selected for modeling. These are:

• Cardiomyopathy

• Risk of Cardio-Vascular Disease in Obese Children and Adolescents

• Juvenile Idiopathic Arthritis (JIA)

• Neurological and Neuro-muscular Diseases (NND)

In addition, the core capabilities and characteristics of the repository have been clearly defined in the Description of Work (DoW) of the project. A list of key features is provided below:

• Implementation of the Service Oriented Knowledge Utilities (SOKU) vision

• Effective data access and formulation

• Distributed processing and GPU support

• Intelligent mining, modeling, reasoning and simulation

• Personalized predictive medicine

• Compliance with Model-Based Drug Development guidelines

• Deployment at point of care and for research

At the core of MD-PAEDIGREE will lie an innovative model-driven infostructure. It will not be built from scratch. Instead, the considerable knowledge and experience acquired from the Health-e-Child [2] and Sim-e-Child [3] projects will be put to use in its development. In addition, some of the partners involved already possess tools or have experience with techniques that may be integrated into the infostructure in order to extend its functionality. These tools and techniques are covered in more detail later in this document.

To ensure the success of MD-PAEDIGREE in a clinical environment, there is a need to quickly set up a working prototype of the system such that it can be deployed early for use by clinicians. This will not only give the clinicians plenty of time to incorporate MD-PAEDIGREE into their daily routine before its full release, it will also allow them to provide feedback that will help drive the iterative improvement of the system over time. Moreover, to ensure frequent use of the prototype, one needs to equip it with the most highly desired features from the outset. Finding such popular functionalities entails gathering all user requirements and subsequently ranking them in terms of importance. Performing surveys is, therefore, a necessity. To provide feedback on user requirements and to help develop an effective system that caters to all four disease areas, experts from each of them were asked to take part in the survey.

On the 13th and 14th of March 2013, the kick-off meeting for MD-Paedigree was held in Rome, Italy. During that event, stakeholders in the four pathology and infostructure sub-groups met separately and simultaneously to discuss the next steps in their respective areas of the project in detail. A comprehensive initial list of user requirements was drawn from the content of these discussions.

A questionnaire was prepared based on the initial list (shown in Appendix A). Henceforth, it will be referred to as Questionnaire 1. Its objective was two-fold. First, it would allow the grading of the identified user requirements in order of perceived importance such that the most desired functionalities of the infostructure are identified and potentially implemented first. Second, it provided the opportunity for stakeholders to add and grade requirements that were not yet identified. To gather responses for the questionnaire, a face-to-face stakeholder interview session was organized on the 2nd of May 2013 between HES-SO Valais members and some clinical partners involved in MD-PAEDIGREE.


Upon analyzing the questionnaire, we discovered that some important information had not been included. A follow-up questionnaire was therefore prepared (shown in Appendix B). Henceforth, it will be referred to as Questionnaire 2. It was disseminated during the MD-Paedigree half-yearly meeting in Genoa, Italy, which took place on the 14th and 15th of October 2013.

In this deliverable, a brief overview of the technical assets at the disposal of the Infostructure group is first presented. The order in which the assets will be integrated into the MD-PAEDIGREE infostructure depends on the demands of the primary users of the system. To determine the functionalities most in demand, we then proceed to analyze the responses to Questionnaire 1. Then, a list of use cases identified by the Infostructure group is presented. It becomes apparent that, before the user requirements can be fully mapped onto the use cases, information that was lacking from the responses to Questionnaire 1 still needed to be collected. The missing information, gathered using Questionnaire 2, is then presented. A list of user requirements, arranged in descending order of priority, is subsequently shown. This is followed by a similarly ordered list of use cases.

3 Assets available for development of MD-PAEDIGREE

At the disposal of the Infostructure group is a series of tools and expertise that can be integrated into MD-PAEDIGREE in an order dictated by the needs expressed by the end users. Below, in Table 1, is an overview of such tools and expertise.

Task                                            Known method of implementation
International access to medical records         epSOS [4]
Setting up foundations of infostructure         Health-e-Child/Sim-e-Child/PCDR digital repository [5]
Parallel processing                             Athena Distributed Processing Engine [6]
Ensuring consistency of data                    Data Curation and Validation (DCV) tool
Analysis of complex data                        madIS tool [7]
Image search interface                          Khresmoi [8]
Semantic web technologies                       THESEUS-MEDICO [9] and SemanticHealthNet [10]
Personalisation interface                       PAROS [11]
Predictive and statistical simulation models    AITION [12]
Interactive data filtering and similarity
search over a grid of clinical centres          CaseReasoner [13]
Design of clinical trials                       Clinical trial design tool [14]
Indexing and searching of data collections      p-medicine [15]
Data mining                                     DebugIT [16]
Infostructure components                        VPH-Share [17]
Matching publications with data                 OpenAIRE/OpenAIRE+ [18]

Table 1: Known methods of implementing some essential tasks

In addition to the information in the table, data protection and privacy will be major considerations in the project and might prove to be a major stumbling block. However, awareness of the Health Insurance Portability and Accountability Act (HIPAA) regulations [19] and the upcoming General Data Protection Regulation (GDPR) [20] that govern this subject exists within the group. This awareness will be put to good use to ensure that the repository complies with data protection regulations.

In the same vein, data anonymization techniques that can be used to protect the identity of patients are known. For instance, ultrasound images have the identity of patients inscribed on them. Solutions exist, however, to blank specific areas of those images [21]. The same goes for DICOM files [22].

4 Determining priority domains

To make the best use of limited resources and to set up a working system as soon as possible, prioritizing the implementation of the varying needs of stakeholders is necessary. This section deals with the analysis of the responses of stakeholders to Questionnaire 1.


For each user requirement identified in Questionnaire 1, each respondent chooses one of four options between highest and lowest priority, which adds 4, 3, 2 or 1 respectively to the score of that requirement. If a respondent leaves the requirement unrated, it is taken to mean that the requirement is not seen as a priority, and it therefore receives a score of zero from that respondent. A normalized score varying from zero to one is then derived from the summed score and is used to quantify the level of priority of each requirement. A normalized score of one would represent the case where all respondents rate a requirement as being of the highest level of priority.

Requirement       A certain requirement
Priority level    Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency         a        b           c          d       e
Normalized score  s

Table 2: Example of a presentation table for the response to a questionnaire entry

Table 2 illustrates how the level of priority of each requirement is quantified. The normalized priority score, $s$, is calculated as follows:

$$s = \frac{4a + 3b + 2c + 1d + 0e}{4\,(a + b + c + d + e)}$$

Having found a measure for the level of priority of a given requirement, we proceed to the assessment of the priority score for each user requirement identified prior to the May 2nd meeting and present in Questionnaire 1.
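As a rough illustration of the scoring rule above, the following Python sketch computes the normalized priority score from the five response counts. The function name `normalized_priority_score` and its parameter names are assumptions made for this example and do not come from the project; the sample counts are those reported in this deliverable for the 'Search by pathology' requirement.

```python
def normalized_priority_score(highest, quite_high, quite_low, lowest, ungraded):
    """Return the normalized priority score s in [0, 1].

    Weights 4, 3, 2, 1 and 0 are applied to the counts a, b, c, d and e,
    and the sum is divided by the maximum attainable score (4 per respondent).
    """
    respondents = highest + quite_high + quite_low + lowest + ungraded
    if respondents == 0:
        raise ValueError("at least one respondent is required")
    weighted = 4 * highest + 3 * quite_high + 2 * quite_low + 1 * lowest + 0 * ungraded
    return weighted / (4 * respondents)


# Response counts reported for 'Search by pathology': 17, 5, 1, 0, 0.
score = normalized_priority_score(17, 5, 1, 0, 0)
print(round(score, 3))  # 0.924
```

Treating ungraded answers as zero-weight respondents keeps the denominator equal to the total number of participants, so a requirement that many people skip is penalized rather than ignored.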

4.1 Analysis of user requirements in Questionnaire 1

4.1.1 Use of MD-PAEDIGREE does not add to clinical workload

Requirement     Use of MD-PAEDIGREE does not add to the workload of users
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       6        14          3          0       0
Score           0.782

Table 3: Data and calculated score for workload-related requirement

If the use of MD-PAEDIGREE considerably increases the amount of time and/or effort that a clinician spends from initial contact with the patient to final diagnosis, it will not be deemed useful and will never come into widespread use. User-friendliness of MD-PAEDIGREE will aid in reducing the extra workload brought about by the use of the proposed system and will, therefore, increase its appeal among clinicians. As can be seen from Table 3, this requirement has a high priority rating.

To help MD-PAEDIGREE achieve this objective, an analysis of the usual clinical workflow was carried out. It is illustrated by Figure 1 in Appendix C.1. Moreover, an overview of the difficult decisions taken by clinicians in their daily routine is presented in Appendix C.1.3. It will serve as a guide in the near future for the Infostructure group in finding ways for MD-PAEDIGREE to facilitate clinical decision-making, thereby making it extremely useful and getting it widely adopted into usual clinical routine.

4.1.2 Secure sharing of data between users

Requirement     Secure sharing of data between users
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       14       7           0          1       1
Score           0.848

Table 4: Data and calculated score for security-related requirement


Security is a high-priority requirement, as Table 4 demonstrates. During the requirement-gathering interview, it became apparent that it is also a very delicate issue and is tough to implement. The less anonymous the data are, the more the MD-PAEDIGREE system needs to protect them from unauthorized access. The discussions surrounding security also tended to focus on the exact type of consent that patients need to provide when making their data available for research. In line with the concept of 'enhanced privacy', giving feedback (if wanted) to the patients at the end of studies that used their data was also considered.

4.1.3 Ability to search repository using the following criteria:

4.1.3.1 Search by sex

Requirement     Search by sex
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       10       9           4          0       0
Score           0.815

Table 5: Data and calculated score for searching by sex

Deemed an important feature, as observed from Table 5, the ability to search the contents of the repository according to the gender of the patient will help narrow down the number of results for a particular query quite considerably and will thus make the MD-PAEDIGREE algorithm return more relevant results.

4.1.3.2 Search by age

Requirement     Search by age
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       12       9           2          0       0
Score           0.859

Table 6: Data and calculated score for searching by age

The ability to search the repository using the age criterion will allow the narrowing of the search results according to the specific needs of clinicians. Given its high score in Table 6, it also needs to be one of the first features to be implemented.

4.1.3.3 Search by image modality

Requirement     Search by image modality
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       12       3           5          1       2
Score           0.739

Table 7: Data and calculated score for searching by image modality

Being able to restrict search to specific image modalities such as magnetic resonance (MR) or computed tomography (CT) is seen as relatively less important, as can be judged from Table 7.


4.1.3.4 Search by anatomical structure

Requirement     Search by anatomical structure
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       14       3           3          1       2
Score           0.783

Table 8: Data and calculated score for searching by anatomical structure

The purpose of this requirement is to find out whether restricting search to a certain anatomical structure or organ, such as the brain or the heart, would be useful for clinicians. Table 8 shows that it is seen as a fairly important feature. It will help the users of MD-PAEDIGREE to retrieve cases pertaining to only one particular organ.

4.1.3.5 Search by pathology

Requirement     Search by pathology
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       17       5           1          0       0
Score           0.924

Table 9: Data and calculated score for searching by pathology

Searching MD-PAEDIGREE and then only displaying results connected to a precise pathology, such as arthritis or cerebral palsy, can help clinicians restrict what they get out of the system to only relevant cases. That feature will need to be implemented quite early on in the MD-PAEDIGREE development phase due to its high rating in Table 9.

4.1.3.6 Search by keywords

Requirement     Search by keywords
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       13       8           2          0       0
Score           0.870

Table 10: Data and calculated score for searching by keywords

It is essential that clinicians are able to query the MD-PAEDIGREE infostructure using keywords as they usually do with online search engines such as Google or PubMed. Table 10 reflects that conclusion.

4.1.3.7 Search by image similarity

Requirement     Search by image similarity
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       6        8           8          0       1
Score           0.696

Table 11: Data and calculated score for searching by image similarity

The idea behind this requirement is that one could provide a reference image based on which similar images in the MD-PAEDIGREE infostructure could be retrieved. The relatively low score of the image search capability is explained by the belief among clinicians that such a feature is unlikely to give reliable results. If it is proven to work, however, they would like to see it as an essential feature of MD-PAEDIGREE. Work on image retrieval has already been carried out in the Khresmoi project with promising results.
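To make the metadata-based search criteria of this section more concrete, the sketch below filters an in-memory list of cases by sex, age range, image modality, anatomical structure, pathology and keyword. It is only an illustration under stated assumptions: the `Case` record, its field names and the `search_cases` function are hypothetical and do not describe the actual MD-PAEDIGREE data model, and image-similarity search is deliberately left out because it requires image analysis rather than metadata filtering.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Case:
    """Hypothetical repository record; field names are illustrative only."""
    case_id: str
    sex: str
    age_years: float
    modality: str
    anatomy: str
    pathology: str
    keywords: List[str] = field(default_factory=list)


def search_cases(cases, sex=None, min_age=None, max_age=None,
                 modality=None, anatomy=None, pathology=None, keyword=None):
    """Return the cases that satisfy every criterion that was supplied."""
    def matches(case):
        if sex is not None and case.sex != sex:
            return False
        if min_age is not None and case.age_years < min_age:
            return False
        if max_age is not None and case.age_years > max_age:
            return False
        if modality is not None and case.modality != modality:
            return False
        if anatomy is not None and case.anatomy != anatomy:
            return False
        if pathology is not None and case.pathology != pathology:
            return False
        if keyword is not None and keyword.lower() not in (k.lower() for k in case.keywords):
            return False
        return True

    return [case for case in cases if matches(case)]


cases = [
    Case("c1", "F", 7, "MR", "heart", "cardiomyopathy", ["dilated"]),
    Case("c2", "M", 12, "CT", "brain", "cerebral palsy", ["gait"]),
    Case("c3", "F", 15, "MR", "knee", "arthritis", ["JIA", "swelling"]),
]
hits = search_cases(cases, sex="F", modality="MR", min_age=10)
print([case.case_id for case in hits])  # ['c3']
```

Criteria that are left as `None` are simply ignored, so each additional criterion can only narrow the result set, which matches the way the requirements above describe search restriction.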


4.1.4 Support for search in multiple languages

Requirement     Support for search in multiple languages
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       4        3           5          10      1
Score           0.489

Table 12: Data and calculated score for searching in multiple languages

Since MD-PAEDIGREE is a pan-European project with partners speaking various languages, the ability to search the repository in several languages was identified as a potentially useful feature. However, Table 12 shows that its implementation does not seem to be a top priority.

4.1.5 Easy upload of images and associated metadata

Requirement     Easy upload of images and associated metadata
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       15       7           1          0       0
Score           0.902

Table 13: Data and calculated score for the easy uploading of images and their associated metadata

In line with the notion of minimizing the impact on clinical workflow of using MD-PAEDIGREE, the ease of uploading data to the repository would be very important. It has been identified as a high-priority requirement, as Table 13 demonstrates.

To achieve such ease of upload, we gathered a list of the characteristics of the data that MD-PAEDIGREE is expected to handle. That is covered in Appendix C.3.

4.1.6 Support for free-text or unstructured text reports

Requirement     Support for free-text or unstructured text reports
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       6        10          7          0       0
Score           0.739

Table 14: Data and calculated score for support of free-text and unstructured text reports

Numerical and image data are often accompanied by complementary unstructured text reports, the availability of which would be very useful for later study. MD-PAEDIGREE should, therefore, be able to handle such text inputs. The questionnaire participants accordingly assigned a fairly high priority to such a feature, as Table 14 shows.

4.1.7 Curation of uploaded data

Requirement     Curation of uploaded data
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       12       7           3          0       1
Score           0.815

Table 15: Data and calculated score for curation of uploaded data

Once data have been uploaded into the system, they must be checked for inconsistencies; otherwise, queries will return incorrect results. Curation of uploaded data is thus an important requirement for MD-PAEDIGREE.


According to the questionnaire results, shown in Table 15, such a feature needs to be developed for MD-PAEDIGREE with great urgency.

4.1.8 Automatic data anonymization or pseudonymization

Requirement     Automatic data anonymization or pseudonymization
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       15       5           2          1       0
Score           0.870

Table 16: Data and calculated score for automatic anonymization and pseudonymization

Removing identifying information from patient data may be non-trivial and time-consuming. If MD-PAEDIGREE had a built-in automatic anonymization or pseudonymization tool, clinicians would just need to upload complete datasets to the system without spending any time on removing personal details of patients. The clinicians have indicated that the development of such a tool within the system is of high priority, as can be observed in Table 16.
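One common way to pseudonymize a record automatically is to replace the patient identifier with a keyed hash and to drop directly identifying fields. The sketch below illustrates that general idea only; it is not the tool planned for MD-PAEDIGREE, and the field names (`patient_id`, `name`, `address`, `phone`) and the function `pseudonymize` are assumptions made for this example. A real deployment would also need key management and a review of indirect identifiers.

```python
import hashlib
import hmac

# Hypothetical set of directly identifying fields to strip from a record.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(record, secret_key):
    """Return a copy of `record` with direct identifiers removed and the
    patient identifier replaced by a keyed, non-reversible pseudonym."""
    digest = hmac.new(secret_key, record["patient_id"].encode("utf-8"), hashlib.sha256)
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    # The same patient and key always yield the same pseudonym, so records
    # belonging to one patient can still be linked after upload.
    cleaned["pseudonym"] = digest.hexdigest()[:16]
    return cleaned


record = {"patient_id": "12345", "name": "Jane Doe", "age_years": 9, "pathology": "JIA"}
safe = pseudonymize(record, secret_key=b"example-key-not-for-production")
print(sorted(safe))  # ['age_years', 'pathology', 'pseudonym']
```

Using a keyed hash rather than a plain hash means that someone who knows a patient identifier cannot recompute the pseudonym without also knowing the key.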

4.1.9 Automatic annotation of anatomical regions and detection of modalities

Requirement     Automatic annotation of anatomical regions and detection of modalities
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       2        9           8          2       2
Score           0.576

Table 17: Data and calculated score for automatic annotation of anatomical regions and detection of modalities

For this requirement, which was requested during the kick-off meeting, the clinicians reiterated their interest but were cautious regarding its performance. They prefer manual annotation and detection techniques until the automatic methods are shown to be reliable. This general sentiment is echoed in the relatively low priority score in Table 17.

4.1.10 Learning from usage pattern of users

Requirement     Learning from usage pattern of users
Priority level  Highest  Quite High  Quite Low  Lowest  Ungraded
Frequency       5        8           6          3       1
Score           0.641

Table 18: Data and calculated score for learning from usage pattern of users

The basic idea behind this feature is that MD-PAEDIGREE learns from the queries formulated by the user and the hits viewed in order to display more relevant future hits. One major concern is the bias in the display of search hits that not everyone would be willing to accept. Another tool covered by this requirement is the auto-completion of search queries based on system-wide and user-specific utilization of the repository. Table 18 indicates that the implementation of this requirement should be high in the list of priorities.
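The auto-completion idea mentioned above can be sketched as a frequency-ranked prefix lookup over previously issued queries. This is a minimal illustration, assuming a simple in-memory query log; the class `QuerySuggester` and its methods are hypothetical names and not part of any MD-PAEDIGREE component.

```python
from collections import Counter


class QuerySuggester:
    """Suggest completions for a prefix, ranked by how often each past query was issued."""

    def __init__(self):
        self._counts = Counter()

    def record(self, query):
        """Log one issued query (case-insensitive, surrounding whitespace ignored)."""
        normalized = query.strip().lower()
        if normalized:
            self._counts[normalized] += 1

    def suggest(self, prefix, limit=5):
        """Return up to `limit` logged queries starting with `prefix`,
        most frequent first, ties broken alphabetically."""
        normalized = prefix.strip().lower()
        matches = [(q, n) for q, n in self._counts.items() if q.startswith(normalized)]
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[:limit]]


suggester = QuerySuggester()
for past_query in ["cardiomyopathy", "cardiomyopathy", "cardiac MR", "cerebral palsy"]:
    suggester.record(past_query)
print(suggester.suggest("card"))  # ['cardiomyopathy', 'cardiac mr']
```

Keeping one such log per user and one for the whole system would correspond to the user-specific and system-wide variants described in the text, and exposing the ranking rule openly is one way to address the concern about hidden bias in displayed hits.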

4.1.11 Support for data modelling and simulation in 3D

Requirement | Support for data modelling and simulation in 3D
Priority level | Highest | Quite High | Quite Low | Lowest | Ungraded
Frequency | 9 | 7 | 4 | 3 | 0
Score | 0.739

Table 19: Data and priority score for implementing data modelling and simulation in 3D

Enabling data modelling and 3D simulation would allow clinical researchers to manipulate patient data within the MD-PAEDIGREE repository. In that way, they could make even better use of the available information and be in a better position to make calls on the treatment of the patient. Table 19 shows that this requirement has a fairly high priority.

4.1.12 Unified repository for clinical and research systems

Requirement | Unified repository for clinical and research systems
Priority level | Highest | Quite High | Quite Low | Lowest | Ungraded
Frequency | 10 | 9 | 3 | 1 | 0
Score | 0.804

Table 20: Data and calculated score for implementing a unified repository for both clinical and research systems

Having a unified repository for both clinical and research systems would allow both sides to use the same data. That poses a security problem, as the research side is not authorized to see as much as the clinical side. It was suggested that one way to avoid the problem is to provide different layers of access to the data for the two sides. An analysis of the questionnaire responses to this question suggests that the implementation of such a system is of high priority.
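The suggested layers of access could, for instance, be realized as field-level filtering by role. The following sketch is an assumption made for illustration: the role names, the field names and the choice of which fields each side may see are hypothetical and are not specified in this deliverable.

```python
# Hypothetical access layers: which fields of a stored case each side may see.
ACCESS_LAYERS = {
    "clinical": {"case_id", "name", "hospital_record_no", "age", "sex", "pathology", "images"},
    "research": {"case_id", "age", "sex", "pathology", "images"},
}


def view_for(role: str, case: dict) -> dict:
    """Return only the fields of `case` that the given role is allowed to see."""
    allowed = ACCESS_LAYERS[role]  # raises KeyError for an unknown role
    return {field: value for field, value in case.items() if field in allowed}


case = {
    "case_id": 42,
    "name": "Jane Doe",
    "hospital_record_no": "H-7781",
    "age": 9,
    "sex": "F",
    "pathology": "JIA",
}
print(view_for("research", case))
```

Both sides read the same stored case, and only the projection differs, which is what keeps the repository unified.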

4.1.13 Rating system to assess the quality of stored cases

Requirement | Rating system to assess the quality of stored cases
Priority level | Highest | Quite High | Quite Low | Lowest | Ungraded
Frequency | 6 | 12 | 4 | 1 | 0
Score | 0.750

Table 21: Data and priority score for implementing a rating system to assess the quality of stored cases

A rating feature would allow users of the repository to flag cases of interest such that they become more prominent in the future. Such a feature would ensure that clinicians get alerted to the existence of highly interesting cases that they can learn from and that can influence their choice of treatment for their patients. Table 21 points to the fact that clinicians assigned a high priority to its inclusion in the system.

4.1.14 Additional requirements identified through questionnaire

Some requirements were missing from the questionnaire and were added by the participating clinicians. They are listed below with their respective indicated priority level in brackets:

• Ability to include dietary information of patients in the repository. (high)

• Ability to add details regarding treatment in MD-PAEDIGREE. (high)

• Data extraction from .c3d files. (highest)

• Online viewers for data. (high)

• Patient-specific simulation and prediction. (highest)

• Patient-specific workflow. (highest)

• Search for similar cases based on information in gait analysis time series. (highest)

• Ability to classify motion-capture curves (e.g. joint angles) and to export classifications (code available for next 6 requirements at KU Leuven). (high)

• Ability to place annotations on curves and to export annotated graphs. (low)

• Ability to send and receive data via the RESTful API. (lowest)

• Ability to create/upload/run custom-made processing algorithms e.g. Python and Javascript. (high)

• Ability to edit layout for data presentation. (low)

• Ability to create custom work-flows for processing blocks (e.g. http://noflojs.org/). (low)

• Ability to find similarity in all data (3D Kinematics, Kinetics, muscle activity) instead of just in images. (highest)

• Ability to download data easily from MD-PAEDIGREE. (high)

• Support for management of standard non-image data formats used in project (e.g. genetic SNP). (highest)

• API for easy access to heterogeneous data collection for data modeling/research. (highest)

• Annotation and extraction of structures: support for uploading and viewing annotation and extraction outcomes. (highest)

• Ability to search using clinical features (e.g. ANA, RF, uveitis, morning stiffness). (high)

• Ability to upload and visualize simulation results. (highest)

• Ability to run simulations on data. (highest)

• Information from and for patient advocacy groups. (highest)

For these functionalities, a next step will be to estimate how easily they can be integrated; in connection with the priorities, this will help to decide on the steps in the integration process.

4.2 Use cases identified by the Infostructure group

Independently of the outcome of questioning the clinical partners, the Infostructure subgroup has come up with five use cases relating to the capabilities of MD-PAEDIGREE. These use cases will guide the development of the functionality and of the desired parts from a technical point of view. In no particular order, they are as follows:

• Data protection throughout the system.

• Data collection, curation, annotation and representation.

• Decision Support — Search through internal data.

• Decision Support — Search through external information.

• Modeling and simulation.

The next step in the process of finding the priority domains for MD-Paedigree is to map the user requirements and priorities to the above list of use-case scenarios. Information was found to be missing regarding the use-case 'Decision Support - Search through external information'. To fill this gap in information, Questionnaire 2 (found in Appendix B) was set.

4.3 Analysis of user requirements in Questionnaire 2

As was the case with Questionnaire 1, Questionnaire 2 is analyzed in order to find out the priority assigned to the listed requirements by the system users. Since it was set in order to address the missing information regarding the use of external sources of information in the usual clinical workflow, its primary focus was on this aspect. The clinical partners were first asked to rate the priority of the use-case scenario 'Decision Support - Search through external information'. They were then asked to list the sources of external information in order of frequency of use. The responses are summarized below.

4.3.1 Rating the priority of use-case: ’Decision Support — Search through external information’

Requirement | Preferred position in priority list of 5 use-cases
Position | 1 | 2 | 3 | 4 | 5
Frequency | 2 | 0 | 1 | 1 | 4
Average position | 3.625 (approximately equal to 4)

Table 22: Responses regarding the position of use-case in priority list of use cases

Table 22 shows that the preferred position of the use-case is fourth in the priority list.
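The average position in Table 22 is the frequency-weighted mean of the positions chosen by the eight respondents. The short check below reproduces it; the variable names are illustrative only.

```python
# Position -> number of respondents who chose it (Table 22).
position_frequency = {1: 2, 2: 0, 3: 1, 4: 1, 5: 4}

respondents = sum(position_frequency.values())
average_position = sum(pos * n for pos, n in position_frequency.items()) / respondents

print(respondents)              # 8
print(average_position)         # 3.625
print(round(average_position))  # 4
```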

4.3.2 Complementary sources of information

Clinicians often search for extra information when assessing whether they have made the right call on the treatment of a patient. They use a variety of information sources and the ones listed by those who filled out the questionnaire are provided below in descending frequency of use:

• Offline records: hospital records (for example: previous visits, discharge letters) and departmental knowledge repositories. Internal guidelines.

• Other clinicians: colleagues from the same field, specialists, experts.

• Online search engines: Google, Google Scholar, ScienceDirect, PubMed, Scopus, Orphanet, GeneReviews (now called: Genetic Testing Registry), ISI Web of Knowledge, and other disease-specific sites (for example: gut microbiota studies for metagenomics).

4.3.3 General user comments

The following comments were made regarding the requirement-gathering process:

• Data should be anonymized and uploaded at the location of collection rather than being sent to a central location for this to be performed.

• A repository, accessible project-wide, for data processing tools, research tools and code would be useful. A facility for discussions, reporting errors and improvements on tools and code would be useful. Such a system could be like MATLAB file exchange.

• Ability to search using parameters of gait analysis time series is really important for NND and should be higher up the list.

• The use cases do not reflect the power of the models. In particular, the decision support does not leverage them enough for improved pediatrics. It would be nice to be able to search similar cases in terms of simulation results, simulation parameters, etc.

• With the increased importance of patient partnering, working towards a system that includes information from and for patient advocacy groups would be extremely good and innovative.

5 Drawing a priority list of user requirements and use cases

Since enough information has been gathered to prioritize the user requirements and the identified use-cases, we proceed to create the complete list of them in order of perceived priority. This list is presented as a specification sheet, provided in the next section. In addition, the user requirements are mapped onto the use-case scenarios identified by the Infostructure group; this mapping is presented in the section following the specification sheet. Note, however, that such a ranking does not take into account dependencies between the various functionalities or their ease of implementation.

5.1 Specification sheet - in descending order of priority score

1. Ability to search using pathology as a criterion.

2. Easy upload of images and associated metadata.

3. • Ability to search using keywords.

• Automatic pseudonymization or anonymization of uploaded data.

5. Ability to search using age as a criterion.

6. Secure data-sharing between users.

7. • Ability to search using sex as a criterion.

• Automatic curation of uploaded data.

9. Unified repository for clinical and research systems.

10. • Use of MD-PAEDIGREE does not add to the workload of users.

• Ability to search using anatomical structure as a criterion.

12. A rating system to assess the quality of cases stored in the infostructure.

13. • Ability to search using image modality as a criterion.

• Support for unstructured text and free-text reports.

• Support for data modelling and simulation in 3D.

16. Ability to search using image similarity as a criterion.

17. Learning from usage pattern of users to provide more relevant search results.

18. Automatic annotation of anatomical regions and detection of modalities.

19. Support for search in multiple languages.

20. • Data extraction from .c3d files.

• Patient-specific simulation and prediction.

• Patient-specific workflow.

• Search for similar cases based on information in gait analysis time series.

• Ability to find similarity in all data (3D Kinematics, Kinetics, muscle activity) instead of just in images.

• Support for management of standard non-image data formats used in project (e.g. genetic SNP).

• API for easy access to heterogeneous data collection for data modeling/research.

• Annotation and extraction of structures: support for uploading and viewing annotation and extraction outcomes.

• Ability to upload and visualize simulation results.

• Ability to run simulations on data.

• Ability to incorporate information from and for patient advocacy groups.

31. • Ability to include dietary information of patients in the repository.

• Ability to add details regarding treatment in MD-PAEDIGREE.

• Ability to download data easily from MD-PAEDIGREE.

• Ability to search using clinical features (e.g. ANA, RF, uveitis, morning stiffness).

• Online viewers for data.

• Ability to create/upload/run custom-made processing algorithms e.g. Python and Javascript.

• Ability to classify motion-capture curves (e.g. joint angles) and to export classifications (code available at KU Leuven).

38. • Ability to place annotations on curves and to export annotated graphs.

• Ability to edit layout for data presentation.

• Ability to create custom work-flows for processing blocks (e.g. http://noflojs.org/).

41. Ability to send and receive data via the RESTful API.

Note: Skepticism around the value/quality of searching by image similarity has lowered its rank in the list above. During the preparation of the proposal, it was considered a high priority, as it is a demonstrably functioning and valuable tool in concrete scenarios.
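The numbering used in the specification sheet, where tied requirements share a position and the following position is skipped (for example 3, 3, 5), corresponds to what is commonly called standard competition ranking. A minimal sketch of that scheme follows; the requirement names and scores in the example are made up for illustration and are not values from the questionnaire.

```python
def competition_rank(scores: dict) -> dict:
    """Rank items by descending score; ties share a rank and the next rank is skipped."""
    ordered = sorted(scores.items(), key=lambda item: -item[1])
    ranks = {}
    previous_score = None
    current_rank = 0
    for index, (name, score) in enumerate(ordered, start=1):
        if score != previous_score:
            # A new score value takes the position it occupies in the sorted list.
            current_rank = index
            previous_score = score
        ranks[name] = current_rank
    return ranks


# Hypothetical scores, chosen only to show the tie handling.
example_scores = {"req A": 0.90, "req B": 0.85, "req C": 0.85, "req D": 0.80}
print(competition_rank(example_scores))
# {'req A': 1, 'req B': 2, 'req C': 2, 'req D': 4}
```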

5.2 Priority list of use-case scenarios

Based on the results of analyzing the requirements questionnaire, the use cases identified by the Infostructure subgroup need to be assigned a level of priority depending on how important they are for clinicians. They will be implemented in descending order of priority. As a reference, the user requirements falling under each use case are also added to the list, and their respective positions in the priority list of requirements are indicated in brackets.

1. Decision Support - Search through internal data

a) Ability to search using pathology as a criterion (1)

b) Ability to search using keywords (3)

c) Ability to search using age as a criterion (5)

d) Ability to search using sex as a criterion (7)

e) Ability to search using anatomical structure as a criterion (10)

f) Ability to search using image modality as a criterion (13)

g) Ability to search using image similarity as a criterion (16)

h) Learning from usage pattern of users to provide more relevant search results (17)

i) Support for search in multiple languages (19)

j) Search for similar cases based on information in gait analysis time series (20)

k) Ability to find similarity in all data (3D Kinematics, Kinetics, muscle activity) instead of just in images (20)

l) Ability to search using clinical features (e.g. ANA, RF, uveitis, morning stiffness) (31)

2. Data collection, curation, annotation and representation

a) Easy upload of images and associated metadata (2)

b) Automatic pseudonymization or anonymization of uploaded data (3)

c) Automatic curation of uploaded data (7)

d) Unified repository for clinical and research systems (9)

e) Use of MD-PAEDIGREE does not add to the workload of users (10)

f) A rating system to assess the quality of cases stored in the infostructure (12)

g) Support for unstructured text and free-text reports, including descriptions of details such as dietary habits, and treatment (13)

h) Automatic annotation of anatomical regions and detection of modalities (18)

i) Data extraction from .c3d files (20)

j) Support for management of standard non-image data formats used in project (20)

k) API for easy access to heterogeneous data collection for data modeling/research (20)

l) Annotation and extraction of structures: support for uploading and viewing annotation and extraction outcomes (20)

m) Ability to incorporate information from and for patient advocacy groups (20)

n) Ability to include dietary information of patients into the repository (31)

o) Ability to add details regarding treatment in MD-PAEDIGREE (31)

p) Ability to download data easily from MD-PAEDIGREE (31)

q) Online viewers for data (31)

r) Ability to classify motion-capture curves (e.g. joint angles) and to export classifications (code available at KU Leuven) (38)

s) Ability to place annotations on curves and to export annotated graphs (38)

t) Ability to edit layout for data presentation (38)

u) Ability to create custom workflows for processing blocks (e.g. http://noflojs.org/) (38)

v) Ability to send and receive data via the RESTful API (41)

3. Data Protection

a) Secure data-sharing between users (6)

4. Decision Support - Search through external information

5. Modeling and Simulation

a) Support for data modeling and simulation in 3D (13)

b) Patient-specific simulation and prediction (20)

c) Patient-specific workflow (20)

d) Ability to run simulations on data (20)

e) Ability to upload and visualize simulation results (20)

f) Ability to create/upload/run custom-made processing algorithms e.g. Python and Javascript (31)

6 Conclusion

Having found what the most pressing needs of the clinical and research users of MD-PAEDIGREE are, the infostructure group can now focus on getting an initial prototype of the repository ready and connecting data acquisition tools to this repository. The prototype should incorporate highly rated requirements from the outset, while the less popular ones should be added incrementally over time in order of priority or based on dependencies between functionalities. In this way, the end users will immediately be able to put the digital repository to good use and, over time, will embed it in their daily clinical and/or research workflow. Widespread use and acceptance by the clinicians is one of the major objectives of MD-PAEDIGREE. The requirements analysis will continue throughout the first years of the project; this is thus a living document that will be updated with requirements that come up once data acquisition has started and clinicians have gained experience with the first versions of the infostructure.

Appendix

A Questionnaire 1

Questionnaire - MD Paedigree Requirements

To help the infostructure group draw a list of requirements for the MD Paedigree infostructure and prioritize the development of the most useful features, please fill in the following questionnaire. The infostructure team thanks you in advance for your participation. Please send any queries to Ranveer Joyseeree: [email protected].

Full name

E-mail address

Institution

Disease group(s)

Work package(s)

Existing Requirements

Please rate the following requirements in terms of the level of priority to be assigned to them. They were identified at the kick-off event in Rome during the meetings of the infostructure and the four disease sub-groups. There are five levels of priority per requirement ranging from highest priority (leftmost boxes) to lowest priority (rightmost boxes). Please mark your choice of level of priority with a tick mark (X).

1. Use of MD Paedigree does not add to the workload of users.

Highest Lowest

2. Secure data sharing between users.

Highest Lowest

3. Ability to search repository using the following criteria:

Sex: Highest Lowest

Age: Highest Lowest

Image modality: Highest Lowest

Anatomical structure: Highest Lowest

Pathology: Highest Lowest

Keywords: Highest Lowest

Image similarity: Highest Lowest

4. Support for search in multiple languages.

Highest Lowest

5. Easy upload of images and associated metadata.

Highest Lowest

6. Support for free-text or unstructured text reports.

Highest Lowest

7. Curation of uploaded data.

Highest Lowest

8. Automatic pseudonymization or anonymization of data

Highest Lowest

9. Automatic annotation of anatomical regions and detection of modalities

Highest Lowest

10. Learning from the usage pattern of users to provide more relevant search results

Highest Lowest

11. Support for data modelling and simulation in 3D

Highest Lowest

12. Unified repository for clinical and research systems

Highest Lowest

13. A rating system to assess the quality of cases stored in the infostructure

Highest Lowest

Open-ended questions

Please answer the following questions in a clear and concise manner.

14. Please describe your usual clinical and/or research work-flow especially with regards to the use of image and associated non-image data for the treatment of patients.

15. Please list the difficult decisions that need to be made in the clinical workflow and state the type of information that can aid the decision-making process.

16. Please state the modality and, if known, the file format/extension for each type of image and non-image data that you would like to upload to the infostructure.

17. Please list the methods you employ to gather complementary information about cases at hand, e.g. asking colleagues, searching on Google or PubMed and so on.

Other required or desired features in MD Paedigree

Please use this section to add and rate further requirements that have not been covered so far in this questionnaire.

18.

Highest Lowest

19.

Highest Lowest

20.

Highest Lowest

21.

Highest Lowest

22.

Highest Lowest

23.

Highest Lowest

24.

Highest Lowest

25.

Highest Lowest

26.

Highest Lowest

27.

Highest Lowest

28.

Highest Lowest

29.

Highest Lowest

Comments

30. Please add any further comments here.

B Questionnaire 2

Analysis of MD Paedigree requirements

Please contact the Infostructure sub-group for any queries regarding requirements:

Ranveer Joyseeree ([email protected])

1 Specification sheet - in descending order of priority score

1. Ability to search using pathology as a criterion.

2. Ability to search using keywords.

3. Secure data-sharing between users.

4. • Ability to search using age as a criterion.

• Easy upload of images and associated metadata.

• Automatic pseudonymization or anonymization of uploaded data.

7. Ability to search using sex as a criterion.

8. • Automatic curation of uploaded data.

• Unified repository for clinical and research systems

10. • Support for unstructured text and free-text reports.

• Learning from usage pattern of users to provide more relevant search results.

• A rating system to assess the quality of cases stored in the infostructure.

13. Ability to search using image similarity as a criterion.

14. • Ability to search using anatomical structure as a criterion.

• Support for data modelling and simulation in 3D.

16. Ability to search using image modality as a criterion.

17. Automatic annotation of anatomical regions and detection of modalities.

18. Support for search in multiple languages.

19. • Ability to include dietary information of patients in the repository.

• Ability to add details regarding treatment in MD Paedigree.

• Data extraction from .c3d files.

• Search for similar cases based on information in gait analysis time series.

23. • Online viewers for data.

Note: Skepticism around the value of searching by image similarity has lowered its rank in the list above. It also needs to be one of the top priorities as it is a demonstrably functioning and valuable tool.

2 Use cases

Independently of the outcome of questioning the clinical partners, the infostructure subgroup has come up with five use cases relating to the capabilities of MD Paedigree. In no particular order, they are as follows:

• Data protection throughout the system.

• Data collection, curation, annotation and representation.

• Decision Support - Search through internal data.

• Decision Support - Search through external information.

• Modeling and simulation.

3 Prioritized use cases

Based on the results of analyzing the requirements questionnaire, the above use cases need to be assigned a level of priority depending on how important they are for clinicians. They will be implemented in descending order of priority.

1. Decision Support - Search through internal data

a) Ability to search using pathology as a criterion (1)

b) Ability to search using keywords (2)

c) Ability to search using age as a criterion (4)

d) Ability to search using sex as a criterion (7)

e) Learning from usage pattern of users to provide more relevant search results (10)

f) Ability to search using image similarity as a criterion (13)

g) Ability to search using anatomical structure as a criterion (14)

h) Ability to search using image modality as a criterion (16)

i) Support for search in multiple languages (18)

j) Ability to search using parameters of gait analysis time series (19)

2. Data Protection

a) Secure data-sharing between users (3)

3. Data collection, curation, annotation and representation

a) Easy upload of images and associated metadata (4)

b) Automatic pseudonymization or anonymization of uploaded data (4)

c) Automatic curation of uploaded data (8)

d) Unified repository for clinical and research systems (8)

e) Support for unstructured text and free-text reports, including descriptions of details such as dietary habits, and treatment (10)

f) A rating system to assess the quality of cases stored in the infostructure (10)

g) Automatic annotation of anatomical regions and detection of modalities (17)

h) Automatic extraction and presentation of contents of gait analysis data contained in .c3d files (19)

i) Availability of online viewers for data (23)

4. Modeling and Simulation

a) Support for data modeling and simulation in 3D (14)
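The ordering of the list above can be sketched in code. This is a minimal illustration that assumes the numbers in parentheses are ranks derived from the questionnaire and that each use case is placed according to its best-ranked requirement; both points are a reading of the list, not something stated explicitly. Only a subset of the requirements is included, and the function name `prioritize` is hypothetical.

```python
from collections import defaultdict

# (use case, requirement, rank) triples; a subset of the list above.
# Treating the parenthesized numbers as questionnaire ranks is an assumption.
REQUIREMENTS = [
    ("Modeling and Simulation", "Support for data modeling and simulation in 3D", 14),
    ("Data Protection", "Secure data-sharing between users", 3),
    ("Decision Support - Search through internal data", "Search using keywords", 2),
    ("Decision Support - Search through internal data", "Search using pathology", 1),
    ("Data collection, curation, annotation and representation",
     "Easy upload of images and associated metadata", 4),
    ("Data collection, curation, annotation and representation",
     "Availability of online viewers for data", 23),
]


def prioritize(requirements):
    """Group by use case; order use cases by best rank, requirements by rank."""
    grouped = defaultdict(list)
    for use_case, requirement, rank in requirements:
        grouped[use_case].append((rank, requirement))
    # A use case is as urgent as its best-ranked (lowest-numbered) requirement.
    ordered = sorted(grouped.items(), key=lambda item: min(item[1]))
    return [(use_case, sorted(items)) for use_case, items in ordered]


for use_case, items in prioritize(REQUIREMENTS):
    print(use_case)
    for rank, requirement in items:
        print(f"  ({rank}) {requirement}")
```

Under these assumptions the sketch reproduces the order of the four use cases as listed: Decision Support, Data Protection, Data collection, then Modeling and Simulation.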

3.1 Feedback on use case: Decision Support - Search through external information.

As a reminder, sources for external information mentioned in questionnaire were:

• Online search engines: Google, PubMed, Scopus, Orphanet, GeneReviews, ISI Web of Knowledge, and other disease-specific sites (e.g. gut microbiota studies for metagenomics).

• Other clinicians: colleagues from the same field and specialists.

• Offline records: hospital records (e.g. previous visits, discharge letters) and departmental knowledge repositories.

1. Position in previous list where this use case should be found: 1 2 3 4 5

2. Please list, in descending order of priority, the information sources you would like to be able to consult through MD Paedigree.

4 Missing requirements

Please state any requirement that is missing in the previous lists and indicate what level of priority each should be assigned.

3.

Highest Lowest

4.

Highest Lowest

5.

Highest Lowest

6.

Highest Lowest

7.

Highest Lowest

5 General comments

8. Please provide any comment that may be of help to the Infostructure sub-group.

C Analysis of clinical and research workflows

C.1 Clinical workflow

Figure 1: Illustration of the usual clinical workflow adopted by medical professionals.

C.1.1 General description

Figure 1 illustrates a general workflow adopted by clinicians involved with MD-PAEDIGREE. The clinicians first form an initial diagnostic impression of the illness afflicting the patient. The patient then undergoes data acquisition procedures if he/she is suitable for them. The data are then processed and subsequently used to confirm or reject the initial diagnosis based on the evidence found.

C.1.2 Data acquired

Table 23 shows the main data types acquired for the four disease areas.

Pathology: types of data acquired

Cardiomyopathy:
- demographic data
- echocardiography
- MRI

Risk of Cardio-Vascular Disease in Obese Children and Adolescents:
- demographic data
- patient history
- lab tests
- proteomic and metagenomic tests

Juvenile Idiopathic Arthritis (JIA):
- data obtained from gait analysis using an optoelectronic system integrated with force platforms and electromyographic (EMG) sensors
- ultrasound, MRI, and X-ray
- historical data, e.g. previous medicine side effects, physical examinations, inflammatory indexes such as ESR and CRP; search for biomarkers of side effects

Neurological and Neuromuscular Diseases (NND):
- gait analysis data
- tissue samples for genetic studies and lab tests
- CT, MR, DXA, EMG, X-ray on demand
- video of gait
- energy expenditure information, and results of physical examinations before and after treatment, including scores on severity of disease

Table 23: Type of data acquired

For each disease area, the file types and formats mentioned in the questionnaire are provided in Table 24.

Pathology: file types and formats

Cardiomyopathy:
- echocardiography and MRI data as DICOM
- echo data from Philips CVIS database
- free text as .txt, .doc, and .pdf files
- meshes (e.g. VTK format)
- 2D images (.jpg, .png)
- numerical data as .csv and .xls files

Risk of Cardio-Vascular Disease in Obese Children and Adolescents:
- CT, MR, US as DICOM files
- proteomic and metagenomic data as .xls and .txt
- free text as .doc and .pdf files
- SNP genetic data
- big set of clinical variables collected in the form of questionnaires

Juvenile Idiopathic Arthritis (JIA):
- MRI, ultrasound, X-ray, DXA files
- gait analysis data as .c3d files
- non-volumetric image data as .avi videos, .tiff files
- EMG data as .txt
- OpenSim models (.osim)
- numerical data as .csv and .xls files
- surface geometries/meshes (.vtp, .vtk, .stl)
- FEA models (.odb, .db)
- Microsoft Access files as .mdb
- other: .x2d, .vsk files

Neurological and Neuromuscular Diseases (NND):
- MRI DICOM and DXA files
- gait analysis .c3d files
- .txt EMG data
- non-volumetric image data as .avi videos, .tiff files
- time-series .mox files
- OpenSim files containing joint angles, joint torques and muscle lengths
- landmark points (.vtk, .stl), mesh data (.vtk), mask data (.mhd)
- anamnesis data stored in databases
- xml format
- MATLAB .mat files

Table 24: File types and formats for acquired data
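To illustrate how the variety of formats in Table 24 might be handled at upload time, the following is a minimal sketch that sorts incoming files into broad categories by file extension. The extensions come from Table 24; the category labels, the `CATEGORY_BY_EXTENSION` map and the `categorize` function are hypothetical names introduced here, not part of the MD-PAEDIGREE design. The .txt extension is left out on purpose because the table uses it for free text, EMG data and proteomic data alike.

```python
from pathlib import Path

# Extensions are taken from Table 24; the category labels are hypothetical
# names chosen for this sketch, not terms defined by the project.
CATEGORY_BY_EXTENSION = {
    ".c3d": "gait analysis",
    ".osim": "musculoskeletal model",
    ".vtk": "mesh or geometry",
    ".vtp": "mesh or geometry",
    ".stl": "mesh or geometry",
    ".mhd": "image mask",
    ".jpg": "2D image",
    ".png": "2D image",
    ".tiff": "2D image",
    ".avi": "video",
    ".csv": "numerical data",
    ".xls": "numerical data",
    ".doc": "free text",
    ".pdf": "free text",
}


def categorize(filename):
    """Return the category for a file name, or 'unknown' if not mapped."""
    extension = Path(filename).suffix.lower()
    return CATEGORY_BY_EXTENSION.get(extension, "unknown")


for name in ["trial01.C3D", "left_ventricle.vtk", "report.pdf", "emg.txt"]:
    print(name, "->", categorize(name))
```

Extension-based sorting is only a first pass: formats such as DICOM are better identified from file contents, and ambiguous extensions need additional metadata.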

Once the data are acquired, they are usually post-processed to include all required metadata and then uploaded to the hospital data repository. They are then used to form the final diagnosis of the patient’s illness.

C.1.3 Difficult decisions

Clinicians are often confronted with situations where a treatment decision is difficult to make because of the complexity of the case or non-conclusiveness of information at their disposal. If MD-PAEDIGREE could help make such decisions easier, it would be of much higher value to clinicians in their daily routine. Identification of examples of such difficult decisions will help in the development of MD-PAEDIGREE such that it better suits clinical needs. Table 25 lists some situations with difficult decisions for the disease areas.

Pathology: critical decisions

Cardiomyopathy:
- decision over transplant in borderline cases
- assessment of drug efficacy
- treatment decision after multi-variate analysis of data

Risk of Cardio-Vascular Disease in Obese Children and Adolescents:
- assessment of cardiovascular risk which drives the decision-making process over transplants
- use of ventricular assist devices (VAD)

Juvenile Idiopathic Arthritis (JIA):
- assessment of whether disease is active or inactive
- decision over treatment while contending with lack of standardized way to proceed for given observations after analysis of gathered data

Neurological and Neuromuscular Diseases (NND):
- assessment of the correlation between observed genotype and phenotype
- type of treatment to be administered, e.g. orthopedic surgery versus neurological surgery, and exact focus of treatment, e.g. which muscle to treat
- timing of clinical gait analysis (CGA) when botox is used; it is difficult to plan CGA right before consultations
- decision over treatment of different types of impairments

Table 25: Difficult decisions made during clinical work

C.2 Research workflow

MD-PAEDIGREE has a significant research component. This section illustrates some example research workflows for the four disease areas.

C.2.1 Cardiomyopathy

Siemens is a major research contributor within MD-PAEDIGREE in the area of cardiomyopathy, and the workflow that the firm adopts is used as an example here. The primary concern is performing image analysis and simulation on acquired data and uploading the simulation results. The detailed workflow is presented below:

1. Segmentation and tracking of myocardium, left/right ventricle, heart valves from 3D+t MRI/US images.

2. Store point clouds/meshes as VTK polydata files on hard disk for further processing.

3. a) Compute left/right ventricular volume curve over time for one heart cycle and extract parameters.

b) Convert segmentation to volumetric model of myocardium for model simulations.

4. If available, use pressure data (left/right ventricle/atrium) for patient-specific hemodynamics.

5. If available, use clinical ECG data (e.g. from images or pdf files) to personalize EP model (compute parameters and save them).

6. Use extracted data and models from steps 1-4 to personalize electromechanical model.
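Step 3a above can be illustrated with a short sketch. The source does not say which parameters are extracted from the volume curve; the sketch assumes the standard ones (end-diastolic volume, end-systolic volume, stroke volume and ejection fraction). The function name `volume_curve_parameters` and the sample values are hypothetical and are not project data.

```python
def volume_curve_parameters(volumes_ml):
    """Extract basic parameters from a ventricular volume curve.

    volumes_ml: ventricular volumes (in ml) sampled over one heart cycle.
    The choice of parameters is an assumption; the workflow description
    only says that parameters are extracted.
    """
    if not volumes_ml:
        raise ValueError("volume curve is empty")
    edv = max(volumes_ml)  # end-diastolic volume: largest volume in the cycle
    esv = min(volumes_ml)  # end-systolic volume: smallest volume in the cycle
    if edv <= 0:
        raise ValueError("volumes must be positive")
    stroke_volume = edv - esv
    return {
        "edv_ml": edv,
        "esv_ml": esv,
        "stroke_volume_ml": stroke_volume,
        "ejection_fraction": stroke_volume / edv,
    }


# Made-up illustrative curve for one cycle, not measured data.
curve = [120.0, 110.0, 85.0, 60.0, 50.0, 65.0, 95.0, 115.0]
print(volume_curve_parameters(curve))
```

In practice the curve would come from the tracked segmentation in step 1, with one volume per time frame of the 3D+t sequence.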

C.2.2 Risk of Cardio-Vascular Disease in Obese Children and Adolescents

For this disease area, the primary research task is the analysis of all available heterogeneous data sources in order to evaluate the risk factors associated with obesity. The data sources in question are:

1. Imaging data.

2. Data associated with the patient’s genome.

3. Extracted information from above data.

4. General clinical variables.

C.2.3 JIA

1. Acquisition of clinical (descriptive, text etc.), structural (imaging, geometry) and functional (gait analysis) data.

2. Processing of above data in a multi-scale model framework to achieve one or more of:

a) Registration of the heterogeneous data collected.

b) Segmentation of regions of interest.

c) Development of inverse-dynamics, musculoskeletal, and finite-element models, the output of which will include loading data that will act as a kind of biomarker for the individual subject.

3. Comparison of obtained models with models already on record.

4. Personalized prediction of disease course based on information obtained from the previous steps.

C.2.4 NND

For NND, the research workflow was found to be very similar to JIA.

C.3 Outcome of analysis of workflows

Analyzing clinical and research workflows has generated a wealth of information that may give a valuable insight into the way MD-PAEDIGREE can be made to fit into the daily routine of its intended users.

At the very least, the Infostructure group within the MD-PAEDIGREE community will now be aware of the types and formats of data that the digital repository needs to accommodate. Section C.1.3 has also helped identify areas where MD-PAEDIGREE should aim to make a big impact. Regarding research, the Infostructure group will now be aware of the wide array of simulation capabilities that the proposed system needs to incorporate.