International Journal of Medical Informatics 78S (2009) S3–S12
journal homepage: www.intl.elsevierhealth.com/journals/ijmi

SHARE road map for HealthGrids: Methodology

Mark Olive a, Hanene Rahmouni a, Tony Solomonides a,∗, Vincent Breton b, Yannick Legré c, Ignacio Blanquer d, Vicente Hernandez d

a Biomedical Informatics Group, Bristol Institute of Technology, UWE Bristol, Coldharbour Lane, Bristol BS16 1QY, UK
b Laboratoire de Physique Corpusculaire, CNRS-IN2P3, Campus de Cézeaux, 63177 Aubière Cedex, France
c HealthGrid, 36 rue Charles de Montesquieu, 63430 Pont du château, France
d Universidad Politécnica de Valencia, Camino de Vera s/n, E-46022 Valencia, Spain

∗ Corresponding author. Tel.: +44 1173283149. E-mail address: [email protected] (T. Solomonides).

Article info

Article history:
Received 26 February 2008
Received in revised form 15 September 2008
Accepted 15 October 2008

Keywords: HealthGrid; e-Health; Grid applications; Biomedical informatics; Innovative medicine

Abstract

The SHARE project (http://www.eu-share.org) was asked to identify the key developments needed to achieve wide adoption and deployment of HealthGrids throughout Europe. The project was asked to organise these as milestones on a road map, so that all technical advances, social actions, economic investments and ethical or legal initiatives necessary for HealthGrids would be seen together in a single coherent document. The full road map includes an extensive analysis of several case studies exploring their technical requirements and a full discussion of the ethical, legal, social and economic issues which may impede early deployment, and concludes with an attempt to reconcile the tensions between technological developments and regulatory frameworks. This paper has been restricted to the technical aspects of the project.

SHARE built on the work of the ‘HealthGrid’ initiative, so we begin by reviewing work carried out in various European HealthGrid projects and reporting on joint work with numerous European collaborators. Following many successful HealthGrid projects, HealthGrid published a ‘White Paper’ which establishes the foundations, potential scope and prospects of an approach to health informatics based on a grid infrastructure. The White Paper demonstrates the ways in which the HealthGrid approach supports many modern trends in medicine and healthcare, such as evidence-based practice, integration across levels (from molecules and cells, through tissues and organs, to the whole person and community) and the promise of individualised healthcare. SHARE was funded by the European Commission to define a research roadmap for a ‘HealthGrid for Europe’, to be seen as the preferred infrastructure for biomedical and healthcare projects in the European Research Area.

© 2008 Elsevier Ireland Ltd. All rights reserved.
1386-5056/$ – see front matter © 2008 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.ijmedinf.2008.10.003

1. Introduction

A ‘grid’ – not the grid – is now understood to mean an Internet-like infrastructure which extends the concept of the Internet in several significant ways:

• like the Internet, a grid would provide access to information services but in addition would provide pooled storage, processing power and collaboration in so-called ‘virtual organisations’ (VOs);

• use of a grid will be reciprocal: while a user subscribes and takes advantage of services provided by a grid, the user’s resources are pooled and are available to all grid subscribers;

• the process is transparent: the grid allocates resources and provides an interface to services which give the appearance that the user is accessing just one powerful machine.



Major IT companies have agreed to develop web services as the technology to enable the deployment of services on the Internet. This technology has also been adopted by the Open Grid Forum, which is the acknowledged body for proposing and developing standards for grid technology.

Moreover, web service technology provides the bridge between the grid world and the Semantic Web, which is about common formats for the interchange of data and about language for recording how the data relate to real-world objects. Although many current grid infrastructures do not offer a web service interface to their services, we will concentrate our ‘state of the art’ on web services because it is the relevant technology for the future. We will then go on to discuss the status of existing grid infrastructures, the technologies they use and the services they offer.

The initial idea behind web services was to enable the World Wide Web increasingly to support real applications and to provide a means of communication among them. Thus, a set of standards and protocols has been proposed to allow interaction between distant machines over a network. These interactions are made possible through the use of standardised interfaces which describe the available operations in a service, the nature and form of messages exchanged (requests and responses), and the physical location of the service on the network. A language, the Web Services Description Language (WSDL), has been devised to describe such interfaces.

One of the subtle advantages of a so-called Service Oriented Architecture (SOA) is that it leads to loosely coupled components that may be substituted by better services so long as they comply with the interface specification. Thus in a grid, services can be offered in a way that approximates an ideal marketplace, a market with near perfect information.
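The loose coupling described above can be illustrated with a minimal Python sketch. It is not taken from any HealthGrid middleware: the interface `DetectionService`, the two providers and the function `request_second_opinion` are hypothetical names introduced only to show that a client written against an interface keeps working when one provider is substituted for another.

```python
from typing import Protocol


class DetectionService(Protocol):
    """Hypothetical interface contract: every provider must offer this operation."""

    def detect(self, image_id: str) -> dict: ...


class BaselineDetector:
    """One illustrative provider of the service."""

    def detect(self, image_id: str) -> dict:
        return {"image_id": image_id, "provider": "baseline", "suspicious": False}


class ImprovedDetector:
    """A substitute provider that complies with the same interface."""

    def detect(self, image_id: str) -> dict:
        return {"image_id": image_id, "provider": "improved", "suspicious": False}


def request_second_opinion(service: DetectionService, image_id: str) -> dict:
    # The client depends only on the interface, never on a concrete provider.
    return service.detect(image_id)


first = request_second_opinion(BaselineDetector(), "img-1")
second = request_second_opinion(ImprovedDetector(), "img-1")
```

The client code is identical in both calls; only the provider changes, which is the property that makes substitution by a better service possible.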

Grid technology has been identified as one of the key technologies to enable and support the ‘European Research Area’. The impact of this concept is expected to reach far beyond eScience, to eBusiness, eGovernment, and eHealth. However, a major challenge is to take the technology out of the laboratory to the citizen. A HealthGrid is an environment in which data of medical interest can be stored and made easily available to different actors in the healthcare system: physicians, allied professions, healthcare centres, administrators and, of course, patients and citizens in general. Such an environment has to offer all appropriate guarantees in terms of data protection, respect for ethics and observance of regulations; it has to support the notion of ‘duty of care’ and may have to deal with ‘freedom of information’ issues. Working across member states, it may have to support negotiation and policy bridging.

Early grid projects, while encompassing potential applications to the life sciences, did not address the specificities of an e-infrastructure for health, such as the deployment of grid nodes in clinical centres and in healthcare administrations, the connection of individual physicians to the grid and the strict regulations ruling the access to personal data. However, a community of researchers did emerge with an awareness of these issues and an interest in tackling them.


2. The HealthGrid initiative

Many pioneering projects in the application of grid technologies to health and biomedical research have been completed, and the technology to address high-level requirements in a grid environment has been under development and making good progress. Because these projects had a finite lifetime while the ambition for HealthGrids required a sustained effort over a much longer period, and because there was an obvious need for these projects to cross-fertilise, the ‘HealthGrid initiative’, represented by the HealthGrid association (http://www.healthgrid.org), was initiated to bring the necessary long-term continuity. Its goal has been to encourage and support collaboration between autonomous projects in such a way as to ensure that requirements really are met, and that the wheel, so to speak, is not re-invented repeatedly at the expense of other necessary work.

Writing about the HealthGrid initiative very soon after its inception, this community identified a number of objectives [1]:

• Identification of potential business models for medical grid applications.

• Feedback to the grid development community on the requirements of the pilot applications deployed by the European projects.

• Development of a systematic picture of the broad and specific requirements of physicians and other health workers when interacting with grid applications.

• Dialogue with clinicians and those involved in medical research and grid development to determine potential pilots.

• Interaction with clinicians and researchers to gain feedback from the pilots.

• Interaction with all relevant parties concerning legal and ethical issues identified by the pilots.

• Dissemination to the wider biomedical community on the outcome of the pilots.

• Interaction and exchange of results with similar groups worldwide.

• The formulation and specification of potential new applications in conjunction with the end user communities.

Apart from research, where the value of grid computing is well established, a HealthGrid may be deployed to support the full range of healthcare activities, from screening and diagnosis, through treatment planning, to epidemiology and public health. For example, anticipating that population trends, air pollution and global warming may lead, through extremes of heat, to increased risks to the elderly, we may deploy a monitoring service to track conditions and medical episodes in hot summers. Patients’ medical data would have to be stored in a local database in each healthcare centre. These databases would have to be federated and would essentially share the same logical model. Secure access to data would have to be granted to authorised individuals or services, which would therefore need to be authenticated: only partial views of local data would be available to external services and patient data would have to be anonymised or at least pseudonymised.
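As an illustration of the last requirement, the following Python sketch shows one possible way for a healthcare centre to expose a partial, pseudonymised view of a local record. It is a sketch under stated assumptions, not a description of any deployed HealthGrid service: the field names, the key `SITE_KEY` and the choice of a keyed hash (HMAC-SHA256) as the pseudonymisation technique are all assumptions made for the example.

```python
import hashlib
import hmac

# Illustrative secret held only by the local healthcare centre (assumption).
SITE_KEY = b"example-site-secret"

# Fields that an external service is allowed to see (assumption).
EXTERNAL_VIEW = ("age_band", "episode", "temperature_c")


def pseudonymise(patient_id: str, key: bytes = SITE_KEY) -> str:
    """Replace an identifier with a keyed hash, truncated for readability."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def external_view(record: dict, key: bytes = SITE_KEY) -> dict:
    """Return a partial view of a local record with a pseudonym instead of the identifier."""
    view = {field: record[field] for field in EXTERNAL_VIEW if field in record}
    view["pseudonym"] = pseudonymise(record["patient_id"], key)
    return view


record = {
    "patient_id": "local-0001",
    "name": "Jane Example",
    "age_band": "80-89",
    "episode": "heat exhaustion",
    "temperature_c": 38.5,
}
shared = external_view(record)
```

Because the same identifier always maps to the same pseudonym under a given key, episodes for one patient can still be linked within a centre, while the name and local identifier never leave it.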


Given the source of the concept of grid in the physical sciences, many of these requirements were not a central concern to grid developers in general. Indeed, even today, when these requirements have been fed through to the middleware services community, they are not a priority for mainstream grid developers. Thus HealthGrid has been actively involved in the definition of requirements relevant to the development and deployment of grids for health and was among the first to identify the need for a specialist middleware layer, between the generic grid infrastructure and middleware and the medical or health applications.

Among data related requirements, the need for suitable access to biological and medical image data arose in several early projects, but for the most part these are present in other fields of application also. Looking to security requirements, most of these are special to the medical field: anonymous or private login to public and private databases; guaranteed privacy, including anonymisation, pseudonymisation and encryption as necessary; legal requirements, especially in relation to data protection; and dynamic negotiation of security and trust policies while applications remain live. Most administrative requirements are common to medicine and eScience, although the flexibility of ‘virtual grids’, i.e. the ability to define sub-grids with restrictions on data storage and data access and also on computing power, is more obviously required in healthcare. Medical applications also require access to small data subsets, like image slices and model geometry. At the (batch) job level, medical applications need an understanding of job failure and the means to retrieve the situation.

Requirements of this kind have been addressed in a number of projects. Among the examples we shall mention are MammoGrid [3], Health-e-Child [4] and Virtual Physiological Human (VPH) [5]. The first constructed a grid database of standardised mammogram files and associated patient data to enable radiologists in Cambridge, UK, and Udine, Italy, to request second opinion and computer-aided detection services. The project also enabled a further study of an epidemiological nature on breast density as a risk factor. In the larger Health-e-Child ‘integrated project’ radiologists are working with oncologists, cardiologists and rheumatologists to identify early imaging signs of conditions that may have a strong genetic component. The knowledge thus obtained should reduce the need for genetic maps to be obtained on an indiscriminate basis. VPH requires collaborative research to create a methodological and technological framework that, once established, will enable the investigation of the human body as a single complex system. VPH will provide a framework within which observations made in laboratories and hospitals may be collected, catalogued, organised, shared and combined in any possible way. It should also allow experts to collaboratively analyse observations and develop systemic hypotheses that involve the knowledge of multiple scientific disciplines, and to interconnect predictive models defined at different scales, with different methods, and with different levels of detail, into systemic networks that provide concretisation to those verifiable systemic hypotheses.

In another European project, Wide In Silico Docking On Malaria (WISDOM), tens of millions of molecular docking experiments have been carried out to help identify potential antigens for the malaria parasite. The experiment uses large-scale virtual screening techniques to select molecular fragments for further investigation in the development of pharmaceuticals for neglected diseases. The economic dynamics in this area are telling: only about 1% of drugs developed in the last quarter century have been aimed at tropical diseases, and yet these are major killers in the third world, with mortality in excess of 14 million per annum. One of the relevant applications which are well suited to the grid is molecule ‘docking’ and this has been successfully applied, e.g. to proteins of the malaria parasite. This work continues with increasingly promising results and has now been extended to other diseases also, most notably avian influenza strain H5N1.

Meanwhile, the results of several major studies of the interface between bioinformatics and medical informatics had been published with a remarkable promise of synergy between the two disciplines, leading to what had already begun to be referred to as ‘personalised medicine’ [6,7]. From the point of view of HealthGrid, this made clear the need to unify the field and to put its various elements in perspective: how would they – improved evidence bases, imaging, genetic information, pharmacology, epidemiology – fit together, and what was their relative importance in the unfolding programme of work?

3. The White Paper: from grid to HealthGrid

Thus, the next step for the HealthGrid community was to try to systematise the concepts, requirements, scope and possibilities of grid technology in the life sciences. The White Paper [8] defines the concept of a HealthGrid more precisely than before:

HealthGrids are grid infrastructures comprising applications, services or middleware components that deal with the specific problems arising in the processing of biomedical data. Resources in HealthGrids are databases, computing power, medical expertise and even medical devices. HealthGrids are thus closely related to eHealth.

The ultimate goal for eHealth in Europe would be the creation of a single HealthGrid, i.e. a grid comprising all eHealth resources, incorporating a ‘principle of subsidiarity’ of independent nodes of the HealthGrid as a means of implementing all the legal, ethical, regulatory and negotiation requirements. We may anticipate, however, the development path to proceed through specific HealthGrids with perhaps rudimentary inter-grid interaction/interoperational capabilities. Thus, we may identify a need to map future research and advice on research policy, so as to bring diverse initiatives to the point of convergence.

HealthGrid applications address both individualised healthcare – diagnosis and treatment – and epidemiology with a view to public health. Individualised healthcare is improved by the efficient and secure combination of immediate availability of personal clinical information and widespread availability of advanced services for diagnosis and therapy. Epidemiology HealthGrids combine the information from a wide population to extract knowledge that can lead to the discovery of new correlations between symptoms, diseases, genetic features and other clinical data. With this broad range of application in mind, the issues below are identified as key features of our analysis.

• Business case, trust and continuity issues: HealthGrids are data- and collaboration-grids, but health institutions are reluctant to let information flow outside institutional boundaries. Large-scale deployment, which would make an attractive business opportunity, requires ‘security’, using the word inclusively, to be scaled up to a very high level of confidence. Data storage remains the responsibility of a hospital, yet business opportunities may arise from data sharing and processing applications; this degree of federation of databases introduces additional complexity.

• Biomedical issues: Management of distributed databases and data mining capabilities are important tools for many biomedical applications in fields such as epidemiology, drug design or even diagnosis. Expert system services running on the grid must be able to interrogate large distributed databases to explore sources of diseases, risk populations, evolution of diseases or suitable proteins to fight against specific diseases. Research communities in biocomputing or biomodelling and simulation have a strong need for resources that can be provided through the grid. Compliance with medical information standards is necessary for accessing large databases. There are many consolidated and emerging standards that must be taken into account, including those for complex and multimedia information.

• Security issues: These flow naturally from the nature of medical data and from business requirements. Security in grid infrastructures is currently adequate for research platforms, but not for real healthcare applications. Biomedical information must be carefully managed to maintain its integrity and to avoid privacy leakages. Secure transmission must be complemented with secure storage, with strictly controlled authenticated and authorised access. Automatic pseudonymisation or anonymisation is necessary for a ‘production’ HealthGrid.

• Management issues: The central concept of a ‘virtual organisation’ (VO) at the heart of eScience, which gave rise to grids, is very apt for HealthGrid, but additional flexibility is needed to structure and to control VOs in the large, including, for example, the meta-level of a VO of VOs. The management of resources has to be more precise and dynamic, depending on many criteria such as urgency, medical protocols, users’ authorisation or other administrative policies.
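The management requirement in the last item, allocation that depends on urgency and on users’ authorisation, can be sketched as a small priority queue. This is an illustrative Python sketch only: the urgency levels, the class `JobQueue` and the job names are hypothetical, and real grid schedulers apply far richer policies.

```python
import heapq
import itertools

# Hypothetical urgency levels; a lower number is served first.
URGENCY = {"emergency": 0, "clinical": 1, "research": 2}


class JobQueue:
    """Orders authorised jobs by urgency, then by order of submission."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps submission order

    def submit(self, name: str, urgency: str, authorised: bool) -> bool:
        if not authorised:
            return False  # unauthorised requests never reach the queue
        heapq.heappush(self._heap, (URGENCY[urgency], next(self._counter), name))
        return True

    def next_job(self) -> str:
        return heapq.heappop(self._heap)[2]


queue = JobQueue()
queue.submit("docking-batch", "research", authorised=True)
queue.submit("radiotherapy-plan", "emergency", authorised=True)
queue.submit("screening-review", "clinical", authorised=True)
rejected = queue.submit("unknown-user-job", "emergency", authorised=False)
order = [queue.next_job() for _ in range(3)]
```

In this sketch the emergency job is served before the clinical and research jobs even though it was submitted later, and the unauthorised request is refused regardless of its declared urgency.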

Current examples of HealthGrids span a wide range. At one end, we find the classic ‘high-throughput’ approach of numerical simulation of organs obtained from a patient’s data and used to aid understanding or to improve the design of medical devices; this leads to patient-customised approaches, at least at research level, in areas such as radiotherapy, craniofacial surgery and neurosurgery [9]. Other HealthGrids deal with large-scale information processing, such as medical imaging. Breast cancer imaging has been the focus of several successful grid projects and eHealth projects suitable for migration to a HealthGrid. These efforts have concentrated on federating and sharing the data and the implementation of semi-automatic processing tools that could improve the sensitivity and specificity of breast cancer screening programs. Much effort has been invested to reduce the information needed to be exchanged and to protect privacy of the information. The concept of a patient-centric grid for health has also been explored [10]. The main aim of this approach is to make the information available to the whole health community (patient, relatives, physicians, nursing staff), considering access rights and language limitations.

Bioinformatics is the area where grid technologies are most straightforwardly introduced. The main challenge faced by bioinformatics is the development and maintenance of an infrastructure for the storage, access, transfer and simulation of biomedical information and processes. Current efforts on biocomputation are coherent with the aims of grid technologies. Work on the integration of clinical and genetic distributed information, and the development of standard vocabularies, will ease the sharing of data and resources.

A very helpful account of the state of the art at the end of 2006 was given by Groen and Goldstein [11], with an expanded version promised in a text book to follow. Among the many initiatives and links they mention, it is useful to single out the Open Grid Forum’s Life Sciences Grid Research Group (cf. [12]). This emphasis reflects SHARE’s anticipation that HealthGrids will be deployed in the service of biomedical research before they are adopted in healthcare as such.

4. The SHARE project: from White Paper to Road Map

In the White Paper, the HealthGrid community expressed its commitment to engage with and support modern trends in medical practice, especially ‘evidence-based medicine’ as an integrative principle, to be applied across the dimensions of individual through to public health, diagnosis through treatment to prevention, from molecules through cells, tissues and organs to individuals and populations. In order to do this, it had to address the question of how to collect, organise, and distribute the ‘evidence’; this might be ‘gold standard’ evidence, i.e. peer reviewed knowledge from published research, or it might be more tentative, yet to be confirmed knowledge from practice, and, in addition, would entail knowledge of the individual patient as a whole person. The community also had to address the issues of law, regulation and ethics, and issues about crossing legal and cultural boundaries, finding ways to express these in terms that translate to technology: security, trust, encryption, pseudonymisation. Then it had to consider how the services of the HealthGrid middleware would satisfy these requirements; and, if it was to succeed in the real world, how to make the business case for HealthGrid to hard-pressed health services across Europe while they are struggling with their own modernisation programmes [2].

The vision of health that informs the thinking of the White Paper and the work of HealthGrid since its publication has been defined in the ‘Action Plan for a European e-Health Area’ [13] as follows:

“. . . the application of information and communications technologies across the whole range of functions that affect the health sector. e-Health tools or ‘solutions’ include products, systems and services that go beyond simply Internet-based applications. They include tools for both health authorities and professionals as well as personalised health systems for patients and citizens. Examples include health information networks, electronic health records, telemedicine services, personal wearable and portable communicable systems, health portals, and many other information and communication technology-based tools assisting prevention, diagnosis, treatment, health monitoring, and lifestyle management.”

The ‘vertical integration’ implicit in this visionary statement can be translated into more concrete terms by mapping it to its human subjects, their pathologies and the implicit disciplines. The relationships between the different ontological and epistemological levels and the various modalities of data have been captured by Fernando Martin-Sánchez et al. (cf. [6]) in the schematic diagram (Fig. 1).

Fig. 1 – Levels of biosocial organisation, knowledge disciplines, pathologies and informatics (with permission Fernando Martín Sánchez).

In the light of the White Paper and its impact, the EC has funded a ‘specific support action’ project, SHARE, to explore exactly what it would mean to realise the vision of the White Paper, investigate the issues that arise and define a roadmap for research and technology which would lead to wide deployment and adoption of HealthGrids in the next 10 years. To be more precise, based on the assumption that HealthGrid will be the infrastructure of choice for biomedical and eHealth applications within the next 10 years, the two objectives of the project are

• a roadmap for research and technology to allow a wide deployment and adoption of HealthGrids both in the shorter term (3–5 years) and in the longer term (up to 10 years); and

• a complementary and integrated roadmap for e-Health research and technology development (RTD) policy relating to grid deployment, as a basis for improving coordination among funding bodies, health policy makers and leaders of grid initiatives, avoiding legislative barriers and other foreseeable obstacles.

Thus the project had to address two questions: What research and development needs to be done now? and What are the right initiatives in eHealth RTD policy relating to grid deployment? These questions carry with them all that is implied in terms of coordination of strategy, programme funding and support for innovation.

In summary, therefore, the project has sought to define a comprehensive and detailed European research and development roadmap, covering both technology and policy aspects, to guide and promote beneficial EU-wide uptake of HealthGrid technologies, and their applications into health research and into healthcare service provision.

5. Technical road map: step one

SHARE has defined a technical roadmap and a separate ethical, legal and socio-economic (ELSE) issue conceptual map, both informed by experiences in HealthGrid projects, merging them into an initial integrated roadmap. We present here the technical road map component, which comprises seven milestones representing the key technological, deployment and standard challenges for future HealthGrid research. Two technical milestones have been defined for the implementation, development and testing of HealthGrid services, two milestones as examples of grid standards that must be defined for the medical domain, and three deployment milestones increasing in complexity and scope (Fig. 2).

MT1. Before deployment can begin, the first step should be to test grid middleware with medical applications for scalability and robustness. This should begin at an early stage, as any deficiencies in this area will only hamper deployment and the development of a reference distribution. It is anticipated that this will be an ongoing activity, with different generations of grid operating systems offering newer, faster and more stable capabilities.

A key issue for this milestone will be the robustness of grid solutions based on web services. Scalability, particularly regarding medical applications, is still a concern for grid middleware based on the Open Grid Services Architecture (OGSA) such as GT4 [14] and GRIA [15]. Middleware such as gLite and Unicore, on the other hand, have been deployed on large-scale infrastructures in Europe and have demonstrated their scalability and robustness, but are still awaiting migration to web services.

MD1. The first deployment step will be the rollout of a computational grid production environment demonstrator for medical research. This would seem to be an achievable goal in the reasonably near future given that there have already been successful deployments of computational grid applications (the WISDOM data challenges, for example) on general purpose grid infrastructures such as DEISA [16] and EGEE [17]. Apart from such ‘innovative medicine’ applications, there are examples of computational grid applications in healthcare delivery: vascular surgery, radiotherapy planning, optimal drug delivery, monitoring the effects of a heat wave or modelling an epidemic of antibiotic resistant bacteria in a nosocomial setting. However, convincing healthcare management of the benefits of deploying a computational grid on a hospital or clinic IT infrastructure, which would not be composed of dedicated grid nodes and may already be working near capacity, is a real concern. Many medical centres may simply not have the necessary bandwidth or storage capabilities to make best use of grid technology, or may not have appropriate equipment to capture data in digital form.


Fig. 2 – A diagrammatic representation of the technical road map, with milestones and external influences.

The management and configuration of a grid is rather complex and may require significant investment in manpower and training.

Installation of grid nodes behind hospital firewalls is incompatible with the present security model, which requires inbound connections. Secure services for data management are under development that could allow firewall rules to be relaxed, enabling connections to grid infrastructures. New architectures and designs should be defined that will minimise the volume of data leaving hospital borders. Ideally, the grid node would be located outside the firewall with only anonymised or pseudonymised data being stored on the grid. However, even with the most stringent pseudonymisation and de-identification techniques there is still some risk of unauthorised re-identification by a person with sufficient knowledge from other sources. There are therefore legal and ethical implications of storing even anonymised personal data on the grid, and further investigation will be required to determine if this is possible with current national and European policies and legislation.

MT2. Starting with MD1 and ending when production deployment begins, this milestone is the development of a reference distribution of grid services, using standard web service technology and allowing the secured manipulation of distributed data. An important consideration for this milestone will be the level of security required when dealing with distributed medical data.

Key grid infrastructures such as EGEE have been developed for scientific communities (such as high energy physics), and while they provide an appropriate level of complexity for those using and administering grid nodes in that environment, the installation and management of grid nodes in hospitals and medical research centres will need to be considerably more user friendly, with an easy-to-use interface.

Emerging web services technologies such as the WS-Addressing standard and the WSRF specification must be respected and promoted in the development of HealthGrids, and already have several widely used implementations. They provide a standard and interoperable way of implementing state in web services, and separate state information from operations. Industry partners have recognised this, but have concerns about the level of tool support for these standards [18].
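The separation of state from operations can be illustrated with a minimal sketch. It is not an implementation of WSRF or WS-Addressing: the classes `EndpointReference`, `ResourceHome` and `ImageTransferService`, and the example URL, are hypothetical names introduced only to show the pattern of a stateless service acting on state that is identified by a reference passed with each call.

```python
from dataclasses import dataclass, field
import uuid


@dataclass(frozen=True)
class EndpointReference:
    """Hypothetical stand-in for an endpoint reference (address plus key)."""
    service_url: str   # address of the stateless web service
    resource_key: str  # identifies the stateful resource behind it


@dataclass
class ResourceHome:
    """Holds resource state separately from the operations acting on it."""
    _resources: dict = field(default_factory=dict)

    def create(self, service_url: str, initial_state: dict) -> EndpointReference:
        key = uuid.uuid4().hex
        self._resources[key] = dict(initial_state)
        return EndpointReference(service_url, key)

    def lookup(self, epr: EndpointReference) -> dict:
        return self._resources[epr.resource_key]


class ImageTransferService:
    """Stateless operations: every call names its resource via a reference."""

    def __init__(self, home: ResourceHome):
        self.home = home

    def add_bytes(self, epr: EndpointReference, count: int) -> None:
        self.home.lookup(epr)["bytes_sent"] += count

    def get_property(self, epr: EndpointReference, name: str):
        return self.home.lookup(epr)[name]


home = ResourceHome()
service = ImageTransferService(home)
first = home.create("https://grid.example.org/transfer", {"bytes_sent": 0})
second = home.create("https://grid.example.org/transfer", {"bytes_sent": 0})
service.add_bytes(first, 512)
print(service.get_property(first, "bytes_sent"))   # 512
print(service.get_property(second, "bytes_sent"))  # 0
```

Because the service object keeps no per-client state, two transfers served from the same address remain independent; only the resource key in the reference distinguishes them.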

The EPSRC funded IBHIS project [19] found that web service description languages and registries are not yet mature enough, particularly WSDL, which describes how to access web services, and UDDI, which provides a registry for service discovery. For example, the UDDI registry was only searchable using keywords, but to identify all relevant data sources ontology-based searching would be required. These technologies are currently not flexible enough: they cannot be used for semantic queries or descriptions (e.g. a description of the function a service provides, or the meaning of parameter names) or for non-functional descriptions such as quality of service and performance levels. The development of WSDL and UDDI is ongoing, with OWL-S extensions to UDDI to facilitate semantic searching, upcoming versions of UDDI promising to address other limitations, and WSDL 2.0 promising to support semantic descriptions and include non-functional requirements.
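The difference between keyword and ontology-based discovery can be shown with a small sketch. The registry entries and the ontology fragment below are invented for illustration and do not reflect the UDDI data model or any IBHIS component; the point is only that expanding a concept into its narrower terms finds services that a literal keyword match misses.

```python
# Hypothetical registry: service name -> free-text description.
REGISTRY = {
    "svc-a": "mammogram archive for breast screening",
    "svc-b": "breast MRI image store",
    "svc-c": "chest radiograph repository",
}

# Hypothetical ontology fragment: concept -> narrower concepts.
NARROWER = {
    "breast imaging": ["mammogram", "breast mri"],
    "medical imaging": ["breast imaging", "chest radiograph"],
}


def keyword_search(term):
    """Return services whose description contains the literal term."""
    term = term.lower()
    return {name for name, text in REGISTRY.items() if term in text.lower()}


def expand(concept):
    """Collect a concept together with all of its narrower concepts."""
    found = {concept}
    for child in NARROWER.get(concept, []):
        found |= expand(child)
    return found


def ontology_search(concept):
    """Keyword-search every term that the ontology places under the concept."""
    hits = set()
    for term in expand(concept):
        hits |= keyword_search(term)
    return hits


print(sorted(keyword_search("breast imaging")))   # []
print(sorted(ontology_search("breast imaging")))  # ['svc-a', 'svc-b']
```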

Considerable development is occurring in the area of web services:

• A Negotiation Description Language (NDL) can be used to construct supply chains of services to achieve the desired goal. NDL could be extended to ‘many to many’ negotiations, where all participants in the chain can obtain information about suppliers and react accordingly. NDLs can also contain information other than technical conformance, such as non-functional legal and cost conformances.

• An ISO-9126 compliant automated just-in-time service quality assessment tool was being developed by members involved with IBHIS to select the ‘best’ service. This assessment is not trivial as many of the quality characteristics being weighed can conflict, so that a rise in one would result in the decline of another (e.g. as usability increases, security may decrease).
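Why the choice of a ‘best’ service depends on how conflicting characteristics are weighed can be seen in a simple weighted-sum sketch. This is an assumption made for illustration, not the method of the IBHIS tool; the candidate ratings, the weights and the function names are all hypothetical.

```python
def score(ratings, weights):
    """Weighted mean of the quality ratings (each rating lies in 0..1)."""
    total_weight = sum(weights.values())
    return sum(ratings[name] * w for name, w in weights.items()) / total_weight


def best_service(candidates, weights):
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: score(candidates[name], weights))


# Hypothetical ratings: service-x is easier to use, service-y is more secure.
CANDIDATES = {
    "service-x": {"usability": 0.9, "security": 0.4, "efficiency": 0.7},
    "service-y": {"usability": 0.5, "security": 0.9, "efficiency": 0.6},
}

clinical_weights = {"usability": 1, "security": 3, "efficiency": 1}
research_weights = {"usability": 3, "security": 1, "efficiency": 1}

print(best_service(CANDIDATES, clinical_weights))  # service-y
print(best_service(CANDIDATES, research_weights))  # service-x
```

The same two candidates are ranked in opposite orders under the two weightings, which is the conflict between usability and security noted in the second bullet.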

Security is not an option but a strict requirement for HealthGrids. Security is an issue at all technical levels: networks need to provide protocols for secure data transfer, the grid infrastructure needs to provide secure mechanisms for access, authentication, and authorisation, as well as sites for secure data storage. The grid operating system needs to provide access control to individual files stored on the grid. High level services need to properly manage legal issues related to the protection of medical data.

The security offered by the existing infrastructures does not yet allow the manipulation of medical data. Important progress is being made in terms of fine-grained access control and data encryption. Some prototype services are under development but they are not yet fully deployed. A specific security feature implemented by hospitals is a restriction in the access to the Internet. Installation of grid elements behind the hospital firewall is incompatible with the present security model where outbound connection is not allowed through this firewall. Ideally, the grid node would be located outside the firewall with only anonymised or pseudonymised data being stored on the grid. However, given the legal and ethical implications of storing any personal data on the grid, even if it is fully anonymised, further investigation will be required to determine if this is possible with current national and European policies and legislation. For example, even with the most stringent pseudonymisation and de-identification techniques there is still some risk of unauthorised re-identification by a person with sufficient information from other sources.
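As an illustration of pseudonymisation before data leave the hospital, the sketch below replaces a patient identifier with a keyed hash (HMAC-SHA256) and exports only an agreed set of fields. Keyed hashing is one common technique chosen here as an assumption; it is not described in the SHARE road map. The field names, the key and the record are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym; the key never leaves the hospital."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def export_record(record: dict, secret_key: bytes) -> dict:
    """Keep only fields cleared for export and swap in the pseudonym."""
    allowed = {"age_band", "finding"}  # hypothetical export policy
    exported = {k: v for k, v in record.items() if k in allowed}
    exported["pseudonym"] = pseudonymise(record["patient_id"], secret_key)
    return exported


hospital_key = b"example-key-held-inside-the-firewall"
record = {
    "patient_id": "H-000123",
    "name": "Jane Example",
    "age_band": "50-54",
    "finding": "microcalcification",
}
print(sorted(export_record(record, hospital_key)))
# ['age_band', 'finding', 'pseudonym']
```

The fields that remain (here an age band and a finding) are exactly the kind of residual information that, combined with knowledge from other sources, leaves the re-identification risk described above.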

Revocation of credentials and how to provide temporary access to data is still an open issue, and an important one for HealthGrids. There are a number of situations where users would temporarily require access to data that they would not normally have access to, such as a visiting expert to a breast cancer unit being shown an unusual case. Certificate authorisation servers have been developed in both ‘pull’ mode (VOLDAP, GridSite LDAP, and VOMS-httpd), in which sites periodically pull a list of valid members from a central service, and ‘push’ mode (VOMS attribute certificates), in which users obtain a short-lived attribute certificate that they present to sites to prove their membership. However, both of these would leave a window where revoked or expired credentials could be used to gain unauthorised access. Several HealthGrid projects have suggested that the data itself should have a ‘lifetime’: users with temporary access should not be able to access the data (or a copy of the data) once their credentials have expired. This would, for example, be in accord with the fifth principle of the UK Data Protection Act, 1998; a form of Digital Rights Management (DRM) has been proposed as the means to provide this restriction.

MD2. Although several prototype data grids for medical research have been demonstrated by HealthGrid projects, developing and maintaining a production quality data grid will require a number of issues relating to the distributed storage of medical data to be resolved. In European grid infrastructures, the distributed storage of medical images has been hampered by the limited data management services available, so continued improvement in this area will be important for the adoption of grids by the medical community. High speed links between data providers and consumers will be a prerequisite, particularly given the high volume of data predicted.
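Returning to the revocation window noted under MT2 for ‘pull’ and ‘push’ membership services, a short calculation makes its size concrete. The model is deliberately simplified and is an assumption of this sketch: a site in pull mode refreshes its member list at fixed intervals counted from time zero, and a site in push mode accepts an attribute certificate until it expires. Times are in seconds and all values are hypothetical.

```python
import math


def pull_mode_exposure(revoked_at, refresh_interval):
    """Seconds a site still trusts a revoked member before its next pull."""
    next_refresh = math.ceil(revoked_at / refresh_interval) * refresh_interval
    return next_refresh - revoked_at


def push_mode_exposure(revoked_at, issued_at, lifetime):
    """Seconds an already issued attribute certificate remains usable."""
    expires_at = issued_at + lifetime
    return max(0, expires_at - revoked_at)


# Membership revoked 4000 s into the day.
print(pull_mode_exposure(revoked_at=4000, refresh_interval=3600))         # 3200
print(push_mode_exposure(revoked_at=4000, issued_at=0, lifetime=43200))   # 39200
```

In both modes the exposure shrinks only by refreshing more often or issuing shorter-lived certificates, which is why a ‘lifetime’ attached to the data itself has been proposed.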


Many legal and ethical issues will need to be resolved, such as the ownership of patient data, ethical control of information, the patient’s right to access or be informed about data that concerns them, as well as local, national, and European level legislation governing the use of patient data and IT.

Another important concern for this milestone will be the integration of heterogeneous data from multiple sources. While mechanisms for data integration have been demonstrated by previous projects, biomedical data can be exceptionally varied, including images with associated metadata and free form text or hand written notes from patient records. There is also the issue of how to deal with missing, inaccurate or obsolete data.

MS1 and MS2. The use of computer-based tools for clinical research has led to the definition of standards for the exchange of data in many areas. However, such standards are in many cases not universal, with different disciplines and countries adopting different standards. The exchange of data between bioinformatics and medical informatics is an area where standards are particularly limited.

Medical imaging is an exemplary case, in which the adoption of DICOM for the acquisition, connection and storage of medical images has been accepted worldwide. Medical records are another area where standardisation would have clear benefits, with HL7 being the favoured standard for the exchange of data. However, previous standards such as CEN/TC251 EN13606 focused more on the storage and structuring of clinical records and have prevented a wider uptake of HL7. The adoption of both DICOM and HL7 has increased due to initiatives such as IHE (Integrating the Healthcare Enterprise), which promotes the coordinated use of DICOM and HL7 by publishing best practice guidelines. A particularly important consideration for both of these standards is their compatibility with grid technologies, and how they could be implemented on a HealthGrid. Both DICOM and HL7 developers are just starting to study the interface between their standards and web services technology.

An observation on collaboration through grids. Many successful examples of computational and data grids have also entailed a significant extra dimension, that of collaboration. This has been found to be of value in both research and model healthcare delivery applications. An exemplary case is that of MammoGrid [20], a project in which Italian and English radiologists were able to collaborate in diagnostic studies and, even more successfully, in epidemiology [21]. The project sought to replicate the ‘workflow’ of a real breast screening department by means of communication and exchange of selected images through a VPN-like secure grid. In this project, a proprietary standard, SMF™, was used to normalise mammograms so that they could be compared as like-with-like. Day to day activities were imitated as much as possible. Thus, where further tests are conducted as part of the recall process, readers can see what their decisions amounted to in practice. Double reading accords opportunities for readers to compare their decisions with those of their colleagues for the same set of cases (double reading is not blind, and readers may make notes for each other). Co-located readers can walk down the hall to ask a colleague what they meant by this annotation or whether they had seen and ignored or missed this feature. Commonly, disagreements between readers are arbitrated by a third reader or by discussion. A grid-enabled virtual meeting between readers could provide the medium where these issues can be discussed, but calls for a very high degree of acceptance of the technology on the part of the users. Problems may not be resolved as quickly, and the process may well become more formal. Even questions of clarification may have connotations of professional competence.

The visibility of readers’ decisions serves to alert readers to occasions where colleagues are not aligned with a local understanding about what constitutes an appropriately recallable presentation. Physicians and other healthcare workers “typically assess the adequacy of medical information on the basis of the perceived credibility of the source” [22]. In other words, healthcare workers develop a sense of the trustworthiness of the people (and machines) they work with and evaluate the meaning, status and quality of data accordingly. How can a reader who lacks knowledge of the (local) conditions of a mammogram’s production read that mammogram confidently (i.e. in accordance with their sense of professional competence)? How can an ‘unknown’ reader be trusted to have read mammograms in an accountably acceptable manner? One project does not answer all these questions, of course, but MammoGrid was able to establish a context within which realistic workflow, in a simulated routine screening,

MD3. After the issues with the distributed storage and querying of medical data have been resolved, the next task will be to deploy services that can build relationships between data items and provide appropriate representations to medical researchers. Particularly given that there have been no successful deployments of knowledge grids for medical research to date, this will pose a significant challenge. The data concerned can be extremely varied in nature, structure, format and volume. Depending on the area of research, the synthesis of knowledge from data could require sophisticated data mining, integrated disease modelling and medical image processing applications, and may also involve the use of techniques from artificial intelligence to derive relationships between data from different sources and in different contexts.

Fig. 3 – Complexity vs time: adoption curve for HealthGrid technologies.

The development of medical ontologies and the mapping between ontologies will be particularly important for the successful deployment of knowledge grids. The standardisation of interfaces can dramatically increase interoperability between biomedical resources, and services that operate on standardised data formats can more easily be integrated into complete bioinformatics experiments, because data need not be restructured between one service and the next. The construction of standardised data formats can be improved by defining a domain ontology that covers the concepts used within a given domain. These ontologies will allow relationships between concepts and nuances in meaning to be captured, greatly enhancing the opportunities for communication, knowledge sharing and reuse, and machine reasoning.
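How an agreed set of concepts removes the need to restructure data between services can be sketched as follows. The two sites, their local field names and the shared concept names are hypothetical; a real deployment would map to identifiers from an established ontology rather than to the plain strings assumed here.

```python
# Hypothetical mappings from each site's local field names to shared concepts.
LOCAL_TO_SHARED = {
    "site_a": {"pt_sex": "Sex", "dx": "Diagnosis"},
    "site_b": {"gender": "Sex", "diagnosis_text": "Diagnosis"},
}


def to_common_model(source, record):
    """Rename local fields to shared concepts; refuse unmapped fields."""
    mapping = LOCAL_TO_SHARED[source]
    unmapped = sorted(set(record) - set(mapping))
    if unmapped:
        raise KeyError(f"no shared concept for fields: {unmapped}")
    return {mapping[name]: value for name, value in record.items()}


a = to_common_model("site_a", {"pt_sex": "F", "dx": "C50"})
b = to_common_model("site_b", {"gender": "F", "diagnosis_text": "C50"})
print(a == b)  # True
```

Rejecting unmapped fields, rather than passing them through, is a design choice of this sketch: it makes gaps in the mapping visible instead of silently producing records that downstream services cannot interpret.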

An ontology is the systematic description of a given phenomenon: it often includes a controlled vocabulary and relationships, captures nuances in meaning and enables knowledge sharing and reuse. From an agreed ontology it is possible to define a common data model that describes the format of the data used by all the services. Another benefit of ontologies is that they can also help to provide useful high-level functionalities based on machine reasoning. The field of machine reasoning would be further enhanced if the functionalities of services themselves were described in an ontology, and not only the data they operate on. Examples of such functionalities are automatic service discovery, invocation and composition. The Semantic Web Activity at the World Wide Web Consortium is dedicated to these topics. In particular the HealthCare and Life Sciences Interest Group (HCLSIG) is relevant for HealthGrids. An ontology is only as valuable as the general support it has, as its primary purpose is to facilitate a common understanding of terms. As an ontology is adopted more widely, this increases the possibilities for a resource that supports it. Within the life science and healthcare sector, the highest degree of general support is arguably enjoyed by the Open Biomedical Ontologies (OBO).

Semantic Web technologies are just emerging in the field of medical research and healthcare. Open issues include how to integrate biomedical data using ontologies, how to combine different initiatives and how to employ advanced, semantic reasoning techniques for analysing medical data. The majority of the biomedical applications currently using ontologies deal with decision support, namely assisting health professionals in disease diagnosis, staging or therapy planning via preliminary detection services. Breast cancer diagnosis and treatment is one of the most advanced domains, with the development of a Breast Cancer Imaging Ontology. Anatomy is another field where ontological approaches have been explored. The interest is focused on the representation of anatomical terminology and classification of surgical procedures, extraction of heart anatomical features, etc. The GALEN model aims at developing advanced terminology systems for clinical information systems. The ontology for the GALEN model is designed to be re-usable and application independent. It is intended to serve not only for the classification of surgical procedures but also for a wide variety of other applications: electronic healthcare records (EHCRs), clinical user interfaces, decision support systems, knowledge access systems, and natural language processing.

Ontology approaches are also under development in the cardiological domain. For instance, NOESIS aims at developing a platform for wide scale integration and visual representation of medical intelligence for research and cure of cardiac and cardiovascular diseases.

In most cases, biomedical ontologies function as terminology vocabularies, containing the domain knowledge required to build the classes, rules and relationships according to which the several concepts interact with each other. The Unified Medical Language System (UMLS) facilitates the development of computer systems that behave as if they “understand” the language of biomedicine and health. Developers use UMLS to build or enhance systems that create, process, retrieve, and integrate biomedical and health data and information. A very good example is the NCI Thesaurus, which is a public domain description logic-based terminology produced by the American National Cancer Institute. This thesaurus implements rich semantic interrelationships between the nodes of its taxonomies. The semantic relationships in the thesaurus are intended to facilitate translational research and to support NCI bioinformatics infrastructure.

Summary points

What was already known on this topic

• Several successful projects have been completed in the field of grid computing for health (“HealthGrids”). These associated themselves with an effort to synthesise the potential and the prospects of HealthGrids through the HealthGrid association.

• HealthGrid published its White Paper in 2005 setting out a vision of what can be accomplished through appropriate adaptations of grid computing in biomedical research and healthcare applications.

• The SHARE project was specifically funded to develop a “road map” for HealthGrids which would see the realisation of the vision in the following decade or so and so also address the aspirations of the European Union’s “Action Plan for a European e-Health Area”.

What this study has added

• A state of the art report led to an outline road map with only the most basic of milestones identified. Development then followed separate paths roughly along the lines of functional and non-functional requirements, thought of as technical and standards on one hand and as ethical, legal, social and economic (‘ELSE’) issues on the other.

• These ideas were concurrently tested in two detailed use cases, innovative medicine and epidemiology, and against three actual projects. These three distinct streams fed into the development of the second road map, a significantly more sophisticated and differentiated document.

• An agreed road map was finally generated after several presentations to expert groups and workshops.

• In developing the road map it became necessary to differentiate between data, computing and collaboration grids. These vary in their technical requirements and to some extent in the ELSE issues they give rise to.

• While it is clear that HealthGrids have immense potential in biomedical research, it is equally clear that applications in healthcare will present significant challenges to regulatory regimes. In the latter context, creative compromises will be necessary but do appear possible; moreover, the technology to automate certain aspects of regulatory compliance has the potential to be incorporated in HealthGrids.

6. Conclusion

In total, SHARE predicts that the journey from a sustainable computing grid to a generalised knowledge grid should take from 7 to 15 years. However, the transition to data grid may not be as simple as the success of projects such as BIRN and BRIDGES would suggest, and the transition to knowledge grid and its later generalisation will be breaking new ground. It is therefore possible that this timescale will be multiplied by several times; it has been suggested that a more realistic timeframe might be 15 to 25 to 50 years (Fig. 3).

The distinction made in innovation studies between ‘visionaries’, ‘early adopters’, ‘early majority’ and ‘late majority’ is reflected here in Fig. 3. Even for early adopters, infrastructure interoperability and distributed data management are necessary; on demand access, ‘user friendliness’ and quality of service are at the first point of inflection, before rapid expansion, followed by sophisticated AI tools in the later stages, where a second inflection occurs and the technologies become routinely accepted.

Certain specific features of the community, such as issues of patient ownership of her/his data and the tension between hospitals’ IT policies and the requirements of grids, will continue to prove troublesome unless addressed with political will. Another non-functional obstacle is the drag on technology transfer between EC projects. For example, there is a need for HealthGrid projects to begin thinking about data curation and digital libraries, but researchers and providers have not come together to explore this need. We hope to publish a further article to cover these softer issues in relation to the technological advances we have canvassed here.

Acknowledgements

The authors acknowledge the European Commission for funding the project SHARE: EU-FP6-2005-IST 027694. The work reported here has been carried out in collaboration with many colleagues in the HealthGrid community (www.healthgrid.org). Thanks are also due to numerous other colleagues in Europe, the US and Asia for many helpful discussions.

References

[1] V. Breton, A.E. Solomonides, R.H. McClatchey, A perspective on the healthgrid initiative, in: Second International Workshop on Biomedical Computations on the Grid held at the Fourth IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), Chicago, USA, April, 2004.

[2] V. Breton, I. Blanquer, V. Hernandez, Y. Legré, A.E. Solomonides, Challenges and opportunities in healthgrids, in: Proceedings of HealthGrid 2006, vol. 120, IOS Press Studies in Health Technology and Informatics, 2006.

[3] C. del Frate, et al., Challenges and opportunities in healthgrids, in: Proceedings of HealthGrid 2006, vol. 120, IOS Press Studies in Health Technology and Informatics, 2006.

[4] J. Freund, et al., Health-e-Child: an integrated biomedical platform for grid-based paediatric applications in challenges and opportunities of healthgrids, in: Proceedings of HealthGrid 2006, vol. 120, IOS Press Studies in Health Technology and Informatics, 2006.

[5] STEP Consortium, Seeding the Europhysiome: A Roadmap for the Virtual Physiological Human. http://www.europhysiome.org and http://www.biomedtown.org/biomed town/STEP/Reception/step presentations/RoadMap/vph roadmap v2b.pdf.

[6] F. Martin-Sanchez, V. Maojo, G. Lopez-Campos, Integrating genomics into health information systems, in: Methods of Information in Medicine, vol. 41, 2002, pp. 25–30.

[7] BIOINFOMED Project, Synergy between Medical Informatics and Bioinformatics: Facilitating Genomic Medicine for Future Healthcare, BIOINFOMED Study Report White Paper, 2003.

[8] V. Breton, K. Dean, T. Solomonides, From grid to healthgrid, in: Proceedings of HealthGrid 2005, vol. 112, IOS Press Studies in Health Technology and Informatics, 2005.

[9] GEMSS Project, Grid Enabled Medical Simulation Services. www.gemss.de.

[10] R.D. Stephens, A.J. Robinson, C.A. Goble, myGrid: personalised bioinformatics on the information grid, Bioinformatics 19 (Suppl. 1) (2003).

[11] P. Groen, D. Goldstein, Grid computing, health grids and EHR systems, Virtual Medical Worlds, vol. 26, December 2006. http://www.hoise.com/vmw/07/articles/vmw/LV-VM-01-07-36.html.

[12] Open Grid Forum, Life Sciences Grid Research Group. http://forge.gridforum.org/projects/lsg-rg.

[13] European Commission, e-Health - making healthcare for European citizens: an action plan for a European e-Health Area, COM(2004) 356 final. This ‘Communication from the Commission’ is at: http://europa.eu.int/informationsociety/doc/qualif/health/COM 2004 0356 F EN ACTE.pdf.

[14] The Globus Alliance, Globus Toolkit 4. http://www.globus.org/.

[15] GRIA, Grid Resources for Industrial Applications. http://www.gridstart.org/factsheets/GRIA factsheet.pdf.

[16] DEISA, Distributed European Infrastructure for Supercomputing Applications. http://www.deisa.org/.

[17] EGEE, Enabling Grids for E-sciencE. http://www.eu-egee.org/.

[18] SHARE Technology Baseline Report. http://eu-share.org/about-share/deliverables-and-documents.html.

[19] D. Budgen, M. Turner, I. Kotsiopoulos, F. Zhu, M. Russell, M. Rigby, K. Bennett, P. Brereton, J. Keane, P. Layzell, From grid to healthgrid, in: Proceedings of HealthGrid 2005, vol. 112, IOS Press Studies in Health Technology and Informatics, 2005.

[20] R. Warren, A.E. Solomonides, C. del Frate, et al., MammoGrid: a prototype distributed mammographic database for Europe, Clinical Radiology 62 (November (11)) (2007) 1044–1051.

[21] R. Warren, D. Thompson, C. del Frate, et al., A comparison of some anthropometric parameters between an Italian and a UK population: “proof of principle” of a European project using MammoGrid, Clinical Radiology 62 (November (11)) (2007) 1052–1060.

[22] A. Cicourel, The integration of distributed knowledge in collaborative medical diagnosis, in: J. Galegher, R. Kraut, C. Egido (Eds.), Intellectual Teamwork: Social and Technological Foundations of Cooperative Work, Lawrence Erlbaum Associates, 1990.