
Semantic Interoperability for Health Network

Deliverable 4.3: Ontology / Information models covering the public health use cases

[Version 1, March 12, 2014]

Call: FP7-ICT-2011-7

Grant agreement for: Network of Excellence (NoE)

Project acronym: SemanticHealthNet

Project full title: Semantic Interoperability for Health Network

Grant agreement no.: 288408

Budget: 3.222.380 EURO

Funding: 2.945.364 EURO

Start: 01.12.2011 - End: 31.05.2015

Website: www.semantichealthnet.eu

Coordinators:

The SemanticHealthNet project is partially funded by the European Commission.


D4.3 Ontology / Information models covering the public health use cases Page 2 of 44

Document description

Deliverable: D4.3

Publishable summary:

SemanticHealthNet (SHN) assumes that several standards and proprietary implementations for representing the content of electronic health records (EHRs) will co-exist for a long time. Thus, the project focuses on providing an integrative semantic abstraction on top of them that is able to act as a mediator. As practical exemplars, SHN has set the focus on chronic heart failure and cardiovascular prevention, which drive the development of semantic resources by work package four (WP4). In the two previous WP4 deliverables, we provided the basis of the proposed interoperability approach: an ontological framework and an initial set of semantic patterns obtained by following a bottom-up approach. As a result of our second deliverable (4.2) we created the Heart Failure Summary (HFS), a minimal dataset that contains the essential information needed to optimize heart failure management. In this deliverable, we describe an extension of the underlying semantic architecture and we focus on the semantic interoperability challenges that exist when clinical practice data (e.g. EHRs) are used by public health systems. We exemplify it by extending the HFS with two risk factors for cardiovascular diseases (CVDs) that are of interest for public health studies, viz. tobacco and alcohol use. Thus, the HFS should be understood as a global resource that can be used by both stakeholder communities. A dedicated instrument (e.g. a questionnaire) for public health reporting on heart failure would be very similar to the HFS, from a modelling point of view. Public health use cases require access to heterogeneous information sources that include data described at different levels of detail, generated within heterogeneous contexts (e.g. smoking cessation clinic record vs. primary care record) and serving different requirements. We have demonstrated how semantic patterns can be used to improve semantic interoperability across them.
Semantic patterns are based on a reference ontology model (i.e. the SHN ontological framework) and facilitate the mapping of clinical model data into their semantic representation. Some query exemplars are provided in chapter 4 in order to demonstrate (1) that data from heterogeneous models can be homogeneously queried and (2) that data can be retrieved within their context, which helps public health systems to interpret them correctly. There are also limits to semantic interoperability when codes such as Ex-smoker are used to refer both to someone who quit last month but had smoked for 30 years and to a person who quit 20 years ago after having smoked for only two years. Thus, there are limits to comparability and semantic interoperability that stem from fundamental differences in clinical processes and facts about the world and that cannot be resolved at the level of information systems. At best, the degree of uncertainty/vagueness can be estimated.

Status:

Version:

Public: x No □ Yes

Deadline:

Contact: Catalina Martínez-Costa [email protected] Stefan Schulz [email protected]

Editors: Catalina Martínez-Costa, Stefan Schulz


Table of contents

1 Introduction and objectives
  1.1 Background
  1.2 Objectives
  1.3 The public health perspective
  1.4 Monitoring Risk Factors for preventing Cardiovascular Disease
2 Methodology and General Principles
  2.1 SHN semantic infrastructure
    2.1.1 Ontological framework
    2.1.2 Ontology Content Patterns – Semantic Patterns –
    2.1.3 OWL DL representation
3 Extending the Heart Failure Summary (HFS) to support Public Health
  3.1 Tobacco Use Summary
    3.1.1 Smoking Tobacco Model
    3.1.2 Model for other forms of tobacco use (Snuff)
  3.2 Alcohol Use Summary
4 Dealing with heterogeneous tobacco use representations
  4.1 Objectives
  4.2 Description of the models
  4.3 Mapping of clinical model data to their semantic representation by using semantic patterns as bridge
  4.4 Homogeneous query of data from the three clinical models: Tobacco Use (openEHR), Meaningful Use (HL7 C-CDA) and Tobacco Detailed Use (DCM-HL7v3)
5 Summary and conclusions


1 Introduction and objectives

1.1 Background

SemanticHealthNet (SHN) faces the challenge of improving semantic interoperability of clinical information. It assumes that several standards and proprietary implementations for representing the content of electronic health records (EHRs) will co-exist for a long time. Thus, the project focuses on providing an integrative semantic abstraction on top of them, representing a homogeneous view that is able to mediate across the heterogeneous underlying representations.

The approach targets the whole range of health-related information across all medical domains. As practical exemplars, SHN has set the focus on chronic heart failure and cardiovascular prevention, which drive the development of semantic resources by work package four (WP4).

The two previous deliverables had two main focuses. The first was to clarify the distinction between information entities and clinical entities in order to determine what has to be represented by information models and what by ontologies (with a focus on SNOMED CT, as an ontology-based clinical terminology). As a result we proposed an ontology engineering approach based on a top-level ontology, description logics and Semantic Web standards, and investigated how the semantic equivalence of iso-semantic expressions could be ascertained when this distinction is properly made. The second focus was the application of this approach to the semantic representation of the Heart Failure Summary (HFS), a minimal dataset that contains the essential information needed to optimize heart failure management. As a result, recurring modelling issues were identified and a set of semantic patterns was provided by following a bottom-up approach.

In this deliverable, we address the semantic interoperability challenges that exist when clinical practice data (e.g. EHRs) are used by public health systems. We exemplify it by focusing on two risk factors for cardiovascular diseases (CVDs) that are of interest for public health studies, viz. tobacco and alcohol use. Public health and clinical practice requirements are not the same, and EHR data are usually not appropriate for many public health tasks. The Heart Failure Summary should be understood as a global resource that can be used by both stakeholder communities. A dedicated instrument (e.g. a questionnaire) for public health reporting on heart failure would be very similar to the HFS, from a modelling point of view. Therefore, we decided to extend the HFS with detailed information about both risk factors, which is described in the following sections. Furthermore, we describe an extension of the underlying semantic architecture that will be applied to the public health use case.

1.2 Objectives

The goal of this work is to demonstrate that the proposed semantic interoperability solution can be applied to improve semantic interoperability between public health systems and clinical practice records (EHRs). In order to do so we will extend the HFS with a set of good-quality representations of the additional information required from a public health perspective. Using this extension as a working example, we will provide semantic patterns and apply them to that information in order to make it semantically interoperable.

The main purpose of using semantic patterns is to facilitate the mapping of existing structured data into an ontology-based representation, in a way that can be interpreted by computers independently of the degree of granularity in which it has been provided. These patterns are based on a formal model of meaning (reference model), constituted by a set of OWL DL ontologies under a highly constrained top-level ontology, which assists the modelling task.

The set of semantic patterns we create will be produced together with the formal representation of their meaning in OWL DL. Some patterns will be instantiated with fictitious data for demonstration. The patterns produced will be based on a subset of top-level patterns, which can be reused in different use cases.

We will exemplify the use of semantic patterns by applying them to encode clinical data rendered from heterogeneous sources. This will demonstrate that patient information can be homogeneously retrieved for a particular public health use case, independently of the underlying clinical model representation.
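As a rough illustration of what homogeneous retrieval means in practice, the following Python sketch maps three differently structured records onto one shared representation and then queries them uniformly. It is only a sketch under stated assumptions: the record layouts, the field names and the `TobaccoStatus` class are hypothetical and are not taken from any EHR standard, and the SHN approach itself relies on OWL DL patterns rather than hand-written mapping functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TobaccoStatus:
    """Shared, source-independent representation (hypothetical)."""
    patient_id: str
    source: str
    status: str  # "current", "former" or "never"


def from_primary_care(rec: dict) -> TobaccoStatus:
    # Assumed layout: a single coded label.
    labels = {"Smoker": "current", "Ex-smoker": "former", "Never smoked": "never"}
    return TobaccoStatus(rec["id"], "primary care", labels[rec["smoking_code"]])


def from_cessation_clinic(rec: dict) -> TobaccoStatus:
    # Assumed layout: a quit date is recorded once the patient has stopped.
    status = "former" if rec["quit_date"] is not None else "current"
    return TobaccoStatus(rec["id"], "cessation clinic", status)


def from_survey(rec: dict) -> TobaccoStatus:
    # Assumed layout: two yes/no questionnaire answers.
    if not rec["ever_smoked"]:
        status = "never"
    elif rec["smokes_now"]:
        status = "current"
    else:
        status = "former"
    return TobaccoStatus(rec["id"], "survey", status)


def select(statuses: list, status: str) -> list:
    """Query the shared representation without knowing the source layout."""
    return [s for s in statuses if s.status == status]


statuses = [
    from_primary_care({"id": "p1", "smoking_code": "Ex-smoker"}),
    from_cessation_clinic({"id": "p2", "quit_date": "2013-11-02"}),
    from_survey({"id": "p3", "ever_smoked": True, "smokes_now": True}),
]
former = select(statuses, "former")
print([s.patient_id for s in former])  # ['p1', 'p2']
```

The query in `select` never inspects source-specific fields, which is the property the semantic patterns are meant to provide at the ontology level.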

1.3 The public health perspective

The focus of public health is to improve health and quality of life through the prevention and treatment of disease and through the promotion of healthy behaviours. While the clinical approach focuses on the treatment of individual patients, public health considers the characteristics of groups of people. Knowledge of the characteristics of the population can be derived not only from the aggregation of clinical records but also from other sources such as social care records, personal health records, service payments, health surveys, etc. Each of these information systems attends to different information requirements and contexts. Thus, homogeneous access to them would significantly improve and facilitate the work of public agencies. In the following, we will concentrate on public health aspects of cardiovascular diseases (CVD) and risks.

As reported in deliverable 2.1, there is considerable evidence that more than 50% of the recent reduction in CVD is due to decreases in risk factors. CVD risk is frequently the result of multiple interacting risk factors and is usually expressed in terms of developing CVD (incidence), experiencing an event (e.g. heart attack), or dying.

The European guidelines on cardiovascular disease prevention (ESC) [1] provide a set of recommendations to reduce CVD risk factors. Many of the recommendations relate to behavioural factors, such as: (i) advising smokers to quit and offering assistance; (ii) avoiding exposure to passive smoking; (iii) facilitating lifestyle change with cognitive-behavioural strategies; (iv) a healthy diet; (v) reducing weight in overweight or obese patients; (vi) physical exercise; etc. Others relate to the use of prophylactic drug therapies such as statins or antihypertensive treatment.

Although there is overwhelming evidence that most of the reduction in CVD is due to a decrease in risk factors, there are still major evidence gaps (e.g. how to help citizens to achieve lifestyle changes). Closing them will depend on “real-world evidence”, i.e. evidence from outside the artificial environments of clinical trials, which requires large numbers of EHRs and other applications to be analysed. The ESC report highlights this fact and the lack of knowledge about people who do not usually participate in clinical trials, as well as about the long-term outcomes of interventions.

[1] http://www.escardio.org/guidelines-surveys/esc-guidelines/Pages/cvd-prevention.aspx


In addition, the ESC report emphasizes the need to reach beyond the clinic into citizens’ daily lives in order to understand how to modify CVD risk factors. This requires the collaboration of clinical care and public health, the overlap of which varies within and across communities. Patients with behavioural risk factors such as tobacco use, risky alcohol use, etc. present themselves to primary care services every day. A coordinated approach between public health and primary care could be a synergistic opportunity to address these behaviours. Among the challenges of such collaborations [2] are barriers such as communication problems, different practice cultures, policy, and funding mechanisms. Within the communication challenge, there are human and technical barriers. Here, we will focus on the technical challenges related to the meaningful communication and reuse of patient data.

The outcomes of public health measures are measured in terms of processes (e.g. numbers taking up a smoking cessation service), intermediate risk outcomes (e.g. six-month quit rates) and disease outcomes (CVD event rates; CVD and lung cancer incidence and death rates). The patient-level data that come out of clinical processes are difficult to re-use for public health processes: they are unstructured to a large extent; coded data are often biased toward billing purposes; and data of interest for public health, such as tobacco use behaviour, are often too coarse-grained, incomplete, or unreliable [3]. However, there is an increasing interest in the analysis of aggregated clinical data [4]. Here, we focus on the reuse of data from EHRs for public health purposes, assuming that at least parts of the EHRs are sufficiently standardized and quality-assured. This can be the case with summaries like the HFS. In order to support public health goals the underlying data models have to be expanded to incorporate environmental, psychosocial, and other non-medical data elements, which are typically not represented in the EHR in a way that is sufficient for public health.

Since the objective of this deliverable is to show how the EHR can be used to satisfy public health goals, we have expanded the Heart Failure Summary (HFS), produced in the second project year and reported in deliverable 4.2. As working examples we have selected two behavioural factors, tobacco and alcohol use, which are risk factors for developing CVDs.

Our work consists of providing the new data elements and value restrictions for modelling both risk factors, as well as their semantic pattern-based representation, which allows them to be semantically interoperable across heterogeneous systems based on different standard or proprietary EHR representations.

1.4 Monitoring Risk Factors for preventing Cardiovascular Disease

There is a wide range of data items that may be needed by population health experts to promote CVD prevention. Table 1-1 shows a summary of possible information sources and examples of their use by public health systems, extracted from deliverable 2.1.

A public health use case will usually involve more than one information source. As an example, in order to compare the effectiveness of smoking cessation services we might use three different sources of information to obtain quit rates: (1) national lifestyle surveys; (2) EHRs; and (3) smoking cessation services.

[2] http://www.health.gov.on.ca/en/common/ministry/publications/reports/capacity_review06/phealth_pcare.pdf

[3] http://www.ncbi.nlm.nih.gov/pubmed/10912559

[4] http://www.sciencedirect.com/science/article/pii/S1532046407000603


| Common public health data sources | Example of use |
|---|---|
| Scientific literature | Effect of statin prescribing on CVD risk factor |
| National surveys | Health surveys with age, sex, ethnicity, household income, education level, life style etc. |
| EHRs | CVD events, cholesterol measure, blood pressure measure, height, weight, smoking, alcohol use, etc. |
| Disease registers | CVD incidence |
| Death certificates and other public administrative data | CVD deaths |
| Specific health services, e.g. smoking cessation service | Smoking prevalence and quit rates, intention to quit, etc. |
| Web and mobile technologies | Daily weight measurement at home, ‘wellbeing’ applications for physical activity monitoring |

Table 1-1 Common public health data sources

WP4 has focused on the representation of tobacco and alcohol consumption in the EHR to support public health goals. WP1 clinicians provided us with a dataset produced within the context of the SICA-HF [5] EU FP7 project, related to heart failure patients in a clinical setting. In this dataset the information about smoking and alcohol use consisted of a yes/no answer. In order to make it suitable for public health purposes we have enriched this representation and added it to the heart failure summary, bringing it into a single combined resource of interest for both the clinical care and the public health stakeholder communities. Both CVD risk factors are described using deliverable 2.1 as the source.

Tobacco use, and in particular smoking, is strongly associated with CVD and lung cancer. When a smoker quits there is a measurable improvement in his/her health. When large groups quit, e.g. after a ban on smoking in public places, mass effects such as significant drops in heart attack rates are observed. As smoking is highly addictive, the process of quitting is complex and not easy to measure. As previously described, there are different sources of information on smoking quit rates and on the characteristics of smokers who quit. This information is either gathered directly from those who use smoking cessation services or estimated from population surveys for those who quit on their own.

The health risks from smoking are proportional to the amount smoked and the period of exposure. The ideal risk measure is often termed “pack years”, which takes numeric values. The smoking data in EHRs are often much cruder. For example, the code “ex-smoker” may refer to someone who quit last month but had been smoking for 30 years, as well as to another person who quit 20 years ago after smoking for only two years.
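The difference between the two “ex-smoker” cases can be made concrete with a small calculation. The sketch below uses the commonly cited definition of pack years (packs smoked per day multiplied by years of smoking, with 20 cigarettes per pack). The function name and the assumed consumption of 20 cigarettes per day are illustrative and are not taken from this deliverable.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = 20) -> float:
    """Packs smoked per day multiplied by the number of years smoked."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day / cigarettes_per_pack * years_smoked


# Two patients who would both be coded as "ex-smoker" (consumption is assumed).
long_exposure = pack_years(cigarettes_per_day=20, years_smoked=30)
short_exposure = pack_years(cigarettes_per_day=20, years_smoked=2)
print(long_exposure, short_exposure)  # 30.0 2.0
```

Under these assumptions the two patients differ by a factor of 15 in cumulative exposure, information that a bare status code cannot convey.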

Both primary care and public health services are paid to help smokers to quit. National health bodies need to monitor the quality of these services and consider how they should evolve as the characteristics of smokers change over time.

[5] http://www.sica-hf.com/


Many of the phenomena described with regard to tobacco can also be observed in the way alcohol use is represented in the medical record [6]. Whereas certain protective cardiovascular effects are discussed for light and moderate alcohol use, heavy drinkers are threatened by several cardiovascular risks such as hypertension, stroke, and alcoholic cardiomyopathy. Whereas light tobacco consumption is the exception rather than the rule, light alcohol consumption is the normal case in Western societies, so that the distinction between "drinker" and "non-drinker" is difficult and rather irrelevant from a medical point of view, whereas "social drinker" and "heavy drinker" are two points on a continuum. Heavy drinking is certainly more socially stigmatising than social drinking, so that the EHR may be especially unreliable in identifying this risk group.

In order to extend the heart failure summary with the tobacco and alcohol use risk factors we have used as input the requirements derived from the above content analysis. In addition, we have used two other sources of requirements: (i) an analysis made by the Swedish National Board of Health and Welfare within the project “Common information structure – the application in quality registers” of the health care for adults with chronic heart failure and (ii) a nurse-run HF clinic where patient lifestyle/risk factors are captured.

The current heart failure summary should be seen as an exemplar, which aggregates a wide range of phenomena that are typical for the representation of precise or approximate facts, hypotheses, plans, etc. The inclusion of the alcohol and tobacco use cases will give additional insights into aspects such as: (i) the highly approximate and speculative character of reporting information about recreational drugs in the EHR; (ii) deficiencies in the representation of relevant entities in standard terminologies; (iii) imprecision regarding numeric values; (iv) undocumented background assumptions and procedures in history taking.

Table 1-2 lists the tobacco and alcohol use data items that have been selected as relevant to support public health systems in preventing CVDs.

| Tobacco Use data items | Alcohol Use data items |
|---|---|
| Status | Status |
| Form | Form |
| Typical Smoked Amount | Typical Alcohol Consumption |
| Pattern of Use | Pattern of Use |
| Date Ceased | Binge Drinking Pattern |
| Pack Years | Date Commenced |
| | Date Ceased |

Table 1-2 Tobacco and alcohol use data items considered of interest for public health

[6] http://www.ncbi.nlm.nih.gov/pubmed/10912559


2 Methodology and General Principles

In the first WP4 deliverable (4.1), we outlined the shared logical framework on which our approach is based. Later, deliverable 4.2 provided an overview of the main representational units of the framework and a set of patterns elaborated by following a bottom-up approach for modelling the heart failure summary. There, we stated that many of the patterns provided could be generalised and converted into top-level patterns that can be composed and specialised. Below, we describe the approach and the progress made in recent months concerning the use of patterns to facilitate the modelling of clinical information based on a formal model of meaning.

2.1 SHN semantic infrastructure

The proposed semantic infrastructure consists of an ontological framework that constitutes the model of meaning and a set of content ontology patterns (henceforth semantic patterns) that use this framework as reference. The framework aims at providing an unambiguous representation of clinical information by providing a set of top-level concepts (clinical entities and information entities) and an unambiguous way of relating them. It consists of three kinds of ontologies: (i) top-level; (ii) information entity; and (iii) medical domain ontologies, expressed in OWL 2 DL. How this framework interacts with semantic patterns, and how the latter interact with clinical models or structured clinical information in general, is explained in the following.

2.1.1 Ontological framework

Semantic patterns are based on a model of meaning which consists of the following three ontologies:

• A top-domain ontology, BioTopLite [7] (prefix btl:), providing a set of canonical top-level categories and relationships, such as btl:Condition, btl:InformationObject, btl:Quality, btl:Process, and btl:hasPart, btl:bearerOf, respectively. We use this top-level ontology especially because it introduces a very clear distinction between information objects and other entities (e.g. clinical entities) and allows their unambiguous binding, which is guided by the top-level categories.

• SNOMED CT (prefix sct:), a huge clinical terminology partially built on formal-ontological principles. We use subsets of it as an OWL 2 DL ontology. Selected SNOMED CT content is placed under top-level classes provided by BioTopLite. SNOMED CT does not limit itself to representing medical entities like “myocarditis”; it also provides codes for complex statements such as “possible myocarditis”, which combine a domain term with epistemic information about its specific use by the author of the statement. This is typical of SNOMED CT concepts that belong to the “Situation with explicit context” hierarchy, which can be seen as an information model inside SNOMED CT. Here interoperability requires additional effort (e.g. the code for “myocarditis” bound to a specific data element within an information model on diagnosis that includes diagnostic certainty attributes should be found to be equivalent to the code “possible myocarditis”). It requires re-modelling them guided by the semantic categories of the underlying model of meaning and not primarily by their labels, as these can be ambiguous: “myocarditis” could be interpreted as a disorder or as a morphology.

• An EHR information entity ontology (prefix shn:) for representing pieces of information like diagnostic statements, plans, orders, etc. They are outcomes of clinical actions like observations, investigations, or evaluations. All classes of this ontology are represented as subclasses of the top-level class btl:InformationObject. Those SNOMED CT concepts that represent statements rather than clinical entities should also be placed under this category after their re-modelling (e.g. “possible myocarditis”).

[7] http://purl.org/biotop/biotoplite.owl
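The equivalence problem described for “possible myocarditis” can be pictured with a small Python sketch. It assumes a hand-made lookup table (`PRECOORDINATED`, hypothetical) that decomposes a pre-coordinated label into a condition and a certainty value. Real SNOMED CT identifiers are deliberately not used, and in the SHN approach this equivalence would be established by description logic reasoning rather than by a lookup.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    condition: str
    certainty: str


# Hypothetical decomposition of pre-coordinated labels; not real SNOMED CT content.
PRECOORDINATED = {
    "possible myocarditis": Statement("myocarditis", "possible"),
}


def normalise(code: str, certainty: Optional[str] = None) -> Statement:
    """Return the same Statement whether the certainty is carried by the
    code itself or by a separate attribute of the information model."""
    if code in PRECOORDINATED:
        decomposed = PRECOORDINATED[code]
        if certainty is not None and certainty != decomposed.certainty:
            raise ValueError("certainty attribute contradicts the code")
        return decomposed
    return Statement(code, certainty if certainty is not None else "unspecified")


from_terminology = normalise("possible myocarditis")
from_info_model = normalise("myocarditis", certainty="possible")
print(from_terminology == from_info_model)  # True
```

Both routes end in the same normal form, which is what allows the two encodings to be compared and queried together.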

Information entities refer to (types of) clinical entities by means of the relation btl:represents, which can be further specialized by shn:isAboutSituation (e.g. shn:InformationItem shn:isAboutSituation sct:SmokingSituation) and shn:isAboutQuality (e.g. shn:ObservationResult shn:isAboutQuality sct:DoseForm). These refer, respectively, to a patient's clinical situation [8] (e.g. a smoking situation) or to an indirectly observed quality of some material object (e.g. a pharmaceutical product) or process (e.g. a medication administration process, referring to its route quality).
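How a query on the general relation btl:represents can pick up statements made with its specialisations may be sketched as follows. The identifiers are the ones used in the examples above. The flat triple list and the helper functions are simplifying assumptions, since in the actual framework these are OWL 2 DL axioms evaluated by a reasoner.

```python
from typing import Optional

# Specialisation hierarchy of relations, as described in the text.
SUBPROPERTY_OF = {
    "shn:isAboutSituation": "btl:represents",
    "shn:isAboutQuality": "btl:represents",
}

# Example statements from the text, simplified to plain triples.
TRIPLES = [
    ("shn:InformationItem", "shn:isAboutSituation", "sct:SmokingSituation"),
    ("shn:ObservationResult", "shn:isAboutQuality", "sct:DoseForm"),
]


def specialises(prop: str, ancestor: str) -> bool:
    """True if prop equals ancestor or is a (transitive) specialisation of it."""
    current: Optional[str] = prop
    while current is not None:
        if current == ancestor:
            return True
        current = SUBPROPERTY_OF.get(current)
    return False


def referents(subject: str, prop: str) -> list:
    """Objects linked to subject via prop or any of its specialisations."""
    return [o for s, p, o in TRIPLES if s == subject and specialises(p, prop)]


print(referents("shn:InformationItem", "btl:represents"))
# ['sct:SmokingSituation']
print(referents("shn:InformationItem", "shn:isAboutQuality"))
# []
```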

2.1.2 Ontology Content Patterns – Semantic Patterns –

Semantic patterns act as a bridge between structured data and their semantic representation based on the above ontological framework, and can be used to guide the mapping process between the two. The main rationale for semantic patterns is to facilitate recurring content modelling tasks. Examples of recurrent questions are: Who does what, when and where? What is the location of something (an object or a process)? What is the time frame of something, e.g. the duration of a process (like a clinical situation) or the life of a material entity? In their attempt to provide clinical information within a standardized structure that addresses the requirements of some particular use case [9], semantic patterns are not too different from clinical archetypes and their use in templates.

Semantic patterns are based on the above ontological framework and are employed to insulate users (more precisely, those who semantically annotate clinical models) from the underlying ontology formalisms. Semantic patterns are characterized by the following:

- They are based on the Closed World Assumption (CWA), under which everything that is not known to be true is taken to be false, whereas under the Open World Assumption (OWA) everything that is not known remains undetermined. As an example, the question “Was the heart failure caused by a heart attack?” looks for that statement in our model. If the statement is not present, the two approaches interpret the result differently: false under a closed world approach and unknown (not computable) under an open world approach.

- Patterns can be specialized following a paradigm similar to object-oriented design and frame systems, in which the properties defined in a parent class are inherited by all its child classes, which can further constrain them. For instance, if we specialise a diagnosis pattern into one specific to breast cancer diagnoses, we might want to restrict its severity values to those of a cancer staging system such as the TNM system. Thus, semantic patterns can be organized in hierarchies from more general ones (top-level patterns) to more specific ones.

- They can be composed with other patterns to cover larger use cases [10] in a semantically consistent way. As an example, the disease expressed by the diagnosis pattern might be further specified by following its own semantic pattern, which includes properties for expressing the body site affected or the disease cause.

[8] S. Schulz, A. Rector, J. Rodrigues, C. Chute, B. Üstün, K. Spackman, Ontology-based convergence of medical terminologies. SNOMED CT and ICD-11. Proc. of eHealth2012. Vienna, Austria: OCG, 2012.

[9] E. Blomqvist, E. Daga, A. Gangemi, V. Presutti, Modelling and using ontology design patterns. [http://www.neon-project.org/web-content/media/book-chapters/Chapter-12.pdf]
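The difference between the closed and the open world reading of a missing statement can be illustrated with a minimal Python sketch. The fact base, the predicate names and the two helper functions are hypothetical and serve only as an illustration; they are not part of the SHN framework.

```python
# Hypothetical fact base: the statements recorded in the model (illustrative only).
facts = {("HeartFailure", "diagnosed_in", "Patient1")}

def ask_closed_world(triple):
    """CWA: a statement that is not recorded is taken to be false."""
    return triple in facts

def ask_open_world(triple):
    """OWA: a statement that is not recorded is unknown (None), not false."""
    return True if triple in facts else None

# "Was the heart failure caused by a heart attack?" is not recorded in the model.
query = ("HeartFailure", "caused_by", "HeartAttack")
cwa_answer = ask_closed_world(query)
owa_answer = ask_open_world(query)
```

Under these assumptions the closed world function answers False, while the open world function returns None to signal that the answer cannot be determined from the recorded statements.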

We propose to use semantic patterns as a “proxy” mediating between clinical models (e.g. archetypes) and the proposed formal model of meaning (i.e. the ontological framework). Thus, they should support the mapping of clinical data rendered according to existing EHR, messaging, or reporting standards and other structured representations into their semantic representation. In addition, they could be used to validate the internal semantic consistency of proposed clinical models, or even to guide the creation of clinical models.

Our assumption is that a broad range of clinical models can be represented by the specialisation and composition of a limited set of top-level semantic patterns. In deliverable 4.2, we demonstrated the creation and application of such patterns for representing information on heart failure in a bottom-up approach. We found that these patterns could in turn be described by means of specialisation and composition based on a set of higher-level patterns (top-level patterns).

Semantic patterns describe information acquired in a medical context about clinical issues (e.g. Diagnosis, Lab results, Examination results, Symptoms, etc.). Their use by clinical modellers should be supported by tools that provide a user-friendly view of them, independently of their internal representation. Ideally, their selection should be guided by a tool that suggests a pattern based on a list of “keywords” entered by the modeller (e.g. lab result, cholesterol, diagnosis, etc.).

Here we represent them as Subject-Predicate-Object (SPO) triples, enhanced by a cardinality attrib-

ute. The Subject and the Object correspond to sets of classes of the ontology framework. However,

the predicate does not correspond to an object property but to an OWL DL expression, as we will

describe later in Section 2.1.3.

In the following, we describe some of the top-level patterns we will use to encode the tobacco and

alcohol use clinical models. They can be specialized and composed by following certain cardinality

and value restrictions.

Cardinality constraints restrict the number of times a predicate can be used with different object values (e.g. the cardinality “1..*” of the predicate ‘describes situation’ means that the predicate must have at least one object value, with the maximum unbounded).

[10] A. Gangemi, Ontology Design Patterns for Semantic Web Content. In Proceedings of the Fourth International Semantic Web Conference, 2005, pp. 262-276.


The value restrictions constrain the value of the Object part of the triple. They limit the possible val-

ues for some predicate, allowing another pattern as object part of a triple (e.g. a specific disease or

the pattern that further describes a disease).

Note that instantiation at this (meta-)level means that the instances are object classes, not domain

individuals.

Top-level pattern I_CS_PT (cf. Table 2-1):

The information about clinical situation pattern (I_CS_PT) can be used to represent some piece of information about a particular clinical situation of the patient. Clinical situations, as described in [11], correspond to SNOMED CT clinical findings. More precisely, they represent the period of a patient’s life during which a certain clinical condition is present. The pattern consists of the following SPO triples, enumerated in the left column of the table:

- S1: Type of clinical situation in focus (e.g. heart failure, smoking, etc.)

- S2: Process performed to acquire the information. (e.g. diagnostic measures, physical exami-

nation, history taking, etc.)

- S3: Any epistemic and contextual aspect that qualifies or modifies the information related

with the clinical situation in focus (e.g. severity, temporal context, finding context (absence,

certainty, etc.), etc.)

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1..* | shn:ClinicalSituation (CS_PT) |
| S2 | shn:InformationItem | 'results from process' | 1..* | btl:Process (CP_PT) |
| S3 | shn:InformationItem | 'has attribute' | 0..* | shn:InformationAttribute |

Table 2-1 Semantic pattern I_CS_PT

In the above table, the Object part of the triple may contain the name of another pattern in brackets,

which means that the main pattern can be composed with that one.

Examples of the application of this top-level pattern are Diagnoses, Co-morbidities, Symptoms, Aller-

gies, etc. The following table shows its application for representing “Suspected heart failure diagno-

sis”. The Object part of the triples has been constrained by following the cardinality and value con-

straints indicated by the pattern.

| #N | Subject | Predicate | Object |
|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | sct:HeartFailure |
| S2 | shn:InformationItem | 'results from process' | sct:DiagnosticProcess |
| S3 | shn:InformationItem | 'has attribute' | sct:Suspected |

Table 2-2 Diagnosis example based on I_CS_PT pattern
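As an illustration of how the cardinality constraints of a pattern could be checked against an instantiation, the following Python sketch encodes the I_CS_PT triples of Table 2-1 and the example of Table 2-2. The class `PatternTriple` and the function `cardinality_violations` are hypothetical names introduced here; value restrictions are not checked, since that would require subsumption reasoning over the ontology.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PatternTriple:
    label: str
    subject: str
    predicate: str
    min_card: int
    max_card: Optional[int]  # None stands for an unbounded maximum ("*")
    obj: str

# I_CS_PT as listed in Table 2-1.
I_CS_PT = [
    PatternTriple("S1", "shn:InformationItem", "describes situation", 1, None, "shn:ClinicalSituation"),
    PatternTriple("S2", "shn:InformationItem", "results from process", 1, None, "btl:Process"),
    PatternTriple("S3", "shn:InformationItem", "has attribute", 0, None, "shn:InformationAttribute"),
]

def cardinality_violations(pattern, instance):
    """Return the labels of pattern triples whose cardinality the instance breaks.

    `instance` is a list of (predicate, object) pairs.
    """
    violations = []
    for triple in pattern:
        count = sum(1 for predicate, _ in instance if predicate == triple.predicate)
        too_few = count < triple.min_card
        too_many = triple.max_card is not None and count > triple.max_card
        if too_few or too_many:
            violations.append(triple.label)
    return violations

# "Suspected heart failure diagnosis" as in Table 2-2.
suspected_hf = [
    ("describes situation", "sct:HeartFailure"),
    ("results from process", "sct:DiagnosticProcess"),
    ("has attribute", "sct:Suspected"),
]
violations = cardinality_violations(I_CS_PT, suspected_hf)
```

In this sketch the example of Table 2-2 yields no violations, whereas an instance that lacked the 'describes situation' triple would be reported under S1.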

Top-level pattern I_NCS_PT (cf. Table 2-3):

The information about no clinical situation pattern (I_NCS_PT) is defined as a specialisation of

I_CS_PT and can be used to represent the absence of a particular patient clinical situation. It consists

of the following SPO triples:

[11] S. Schulz, A. Rector, J. Rodrigues, C. Chute, B. Üstün, K. Spackman, Ontology-based convergence of medical terminologies. SNOMED CT and ICD-11. Proc. of eHealth2012. Vienna, Austria: OCG, 2012.


- S1: Clinical situation in focus (e.g. heart failure, smoking, etc.)

- S2: Process performed to acquire the information. (e.g. diagnostic, physical examination, his-

tory taking, etc.)

- S3: Any epistemic and contextual aspect that qualifies or modifies the information related

with the clinical situation in focus (e.g. severity, certainty, temporal context, etc.).

- S4: The information attribute that represents the absence of the clinical situation in focus. This triple is coloured in grey since it further specialises the triple S3 (shn:InformationItem ‘has attribute’ 0..* shn:InformationAttribute) of the parent pattern (I_CS_PT) in order to represent the absence aspect. Thus, the predicate ‘has situation context’ specialises the predicate ‘has attribute’ and constrains its value to sct:Absent, which is placed as a subclass of shn:InformationAttribute in the ontology framework. The cardinality “1..1” indicates that this triple is obligatory when this pattern is instantiated. Note that this is a meta-description, so it has no existential import at the level of individuals.

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1..* | shn:ClinicalSituation (CS_PT) |
| S2 | shn:InformationItem | 'results from process' | 1..* | btl:Process (CP_PT) |
| S3 | shn:InformationItem | 'has attribute' | 0..* | shn:InformationAttribute |
| S4 | shn:InformationItem | 'has situation context' | 1..1 | sct:Absent |

Table 2-3 Semantic pattern I_NCS_PT (shaded triple indicates difference to parent (I_CS_PT))

Examples of application of this top-level pattern are Diagnosis of absence of X, No Symptom X, No

Allergy X, etc. The following table shows its application for representing “Diagnosis of no heart fail-

ure”. The Object part of the triples has been constrained by following the cardinality and value con-

straints indicated by the pattern.

| #N | Subject | Predicate | Object |
|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | sct:HeartFailure |
| S2 | shn:InformationItem | 'results from process' | sct:DiagnosticProcess |
| S4 | shn:InformationItem | 'has situation context' | sct:Absent |

Table 2-4 Diagnosis example based on I_NCS_PT pattern

Top-level pattern PH_CS_PT (cf. Table 2-5):

The past history of clinical situation pattern (PH_CS_PT) is defined as a specialisation of I_CS_PT and can be used to represent any clinical situation of the patient that occurred in the past and may or may not continue at present. It consists of the following SPO triples:

- S1: Clinical situation in focus (e.g. heart failure, smoking, etc.)

- S2: Process performed to acquire the information. (e.g. diagnostic, physical examination, his-

tory taking, etc.)

- S3: Any epistemic and contextual aspect that qualifies or modifies the information related with the clinical situation in focus (e.g. severity, certainty, absence, etc.). Note that temporal context is no longer among the possible predicate values, since it is expressed by triple S4.

- S4: The information attribute that represents the past temporal context. This triple is coloured in grey since it further specialises the triple S3 of the parent pattern (I_CS_PT). The predicate ‘has temporal context’ specialises the predicate ‘has attribute’ and constrains its value to sct:InThePast, which is placed as a subclass of shn:InformationAttribute in the ontology framework. The cardinality “1..1” indicates that this triple is obligatory when this pattern is instantiated, i.e. if OWL TBox statements are generated as instances of the pattern.

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1..* | shn:ClinicalSituation (CS_PT) |
| S2 | shn:InformationItem | 'results from process' | 1..* | btl:Process (CP_PT) |
| S3 | shn:InformationItem | 'has attribute' | 0..* | shn:InformationAttribute |
| S4 | shn:InformationItem | 'has temporal context' | 1..1 | sct:InThePast |

Table 2-5 Semantic pattern PH_CS_PT (shaded triple indicates difference to parent (I_CS_PT))

Examples of application of this top-level pattern are Past History of Disease, Past History of Allergy,

Past History of Symptom, etc. The following table shows its application for representing Past history

of heart failure diagnosis. The Object part of the triples has been constrained by following the cardi-

nality and value constraints indicated by the pattern.

| #N | Subject | Predicate | Object |
|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | sct:HeartFailure |
| S2 | shn:InformationItem | 'results from process' | sct:DiagnosticProcess |
| S4 | shn:InformationItem | 'has temporal context' | sct:InThePast |

Table 2-6 Diagnosis example based on PH_CS_PT pattern

Top-level pattern NPH_CS_PT (cf. Table 2-7):

The No Past History of Clinical Situation pattern (NPH_CS_PT) is defined as a specialisation of PH_CS_PT and can be used to represent the absence of some patient clinical situation in the past, independently of whether or not it exists at the present time. It consists of the following SPO triples:

- S1: Clinical situation in focus (e.g. heart failure, smoking, etc.)

- S2: Process performed to acquire the information. (e.g. diagnostic, physical examination, his-

tory taking, etc.)

- S3: Any epistemic and contextual aspect that qualifies or modifies the information related

with the clinical situation in focus (e.g. severity, certainty, temporal context, etc.)

- S4: The information attribute that represents the past temporal context (sct:InThePast).

- S5: The information attribute that represents the absence of the clinical situation in focus (sct:Absent). This triple is coloured in grey since it further specialises the triple S3 of the parent pattern (PH_CS_PT). The predicate ‘has situation context’ specialises the predicate ‘has attribute’ and constrains its value to sct:Absent, which is placed as a subclass of shn:InformationAttribute in the ontology framework. The cardinality “1..1” indicates that this triple is obligatory when this pattern is instantiated.

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1..* | shn:ClinicalSituation (CS_PT) |
| S2 | shn:InformationItem | 'results from process' | 1..* | btl:Process (CP_PT) |
| S3 | shn:InformationItem | 'has attribute' | 0..* | shn:InformationAttribute |
| S4 | shn:InformationItem | 'has temporal context' | 1..1 | sct:InThePast |
| S5 | shn:InformationItem | 'has situation context' | 1..1 | sct:Absent |

Table 2-7 Semantic pattern NPH_CS_PT (shaded triple indicates difference to parent (PH_CS_PT))

Examples of application of this top-level pattern are No Past History of Disease, No Past History of

Allergy, No Past History of Symptom, etc. The following table shows its application for representing

“No Past history of heart failure diagnosed”. The Object part of the triples has been constrained by

following the cardinality and value constraints indicated by the pattern.


| #N | Subject | Predicate | Object |
|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | sct:HeartFailure |
| S2 | shn:InformationItem | 'results from process' | sct:DiagnosticProcess |
| S4 | shn:InformationItem | 'has temporal context' | sct:InThePast |
| S5 | shn:InformationItem | 'has situation context' | sct:Absent |

Table 2-8 Diagnosis example based on NPH_CS_PT pattern
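The specialisation relations among the four information patterns described so far can be sketched in Python as follows. Each child pattern inherits the triples of its parent and adds one refining triple. The tuple layout and the helper `specialise` are assumptions made for this illustration; the triple values follow Tables 2-1, 2-3, 2-5 and 2-7.

```python
# Triples as (label, predicate, cardinality, object).
I_CS_PT = [
    ("S1", "describes situation", "1..*", "shn:ClinicalSituation"),
    ("S2", "results from process", "1..*", "btl:Process"),
    ("S3", "has attribute", "0..*", "shn:InformationAttribute"),
]

def specialise(parent, extra_triples):
    """Build a child pattern: inherited parent triples plus the refining ones."""
    return list(parent) + list(extra_triples)

# Absence of a clinical situation (Table 2-3).
I_NCS_PT = specialise(I_CS_PT, [("S4", "has situation context", "1..1", "sct:Absent")])
# Past history of a clinical situation (Table 2-5).
PH_CS_PT = specialise(I_CS_PT, [("S4", "has temporal context", "1..1", "sct:InThePast")])
# No past history: specialises PH_CS_PT with the absence triple (Table 2-7).
NPH_CS_PT = specialise(PH_CS_PT, [("S5", "has situation context", "1..1", "sct:Absent")])
```

The sketch only models inheritance of triples; it does not capture the fact that the added predicates are themselves specialisations of ‘has attribute’.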

Top-level pattern OB_CS_PT (cf. Table 2-9):

The observation result process quality pattern (OB_CS_PT) can be used to represent the result of an observation or assessment of some quality of a given clinical situation. It consists of the following SPO triples:

- O1: The quality observed / assessed (e.g. mass intake)

- O2: The clinical situation to which that quality belongs (e.g. tobacco smoking situation)

- O3: The result of the observation / assessment (e.g. 15 cigarettes per day)

- O4: The value of the result (e.g. 15)

- O5: The units in which the value is expressed (e.g. per day)

- O6: The scale in which the observed value is expressed (e.g. qualitative, quantitative (ratio, interval, ordinal, etc.))

- O7: The observation / assessment process performed to acquire the information.

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| O1 | shn:ObservationResult | 'describes quality' | 1..1 | btl:ProcessQuality |
| O2 | btl:Quality | 'is quality of' | 1..* | shn:ClinicalSituation |
| O3 | shn:ObservationResult | 'has observed value' | 1..1 | btl:ValueRegion |
| O4 | btl:ValueRegion | 'has value' | 0..1 | xml:datatype |
| O5 | btl:ValueRegion | 'has units' | 0..1 | shn:MeasurementUnits |
| O6 | btl:ValueRegion | 'has scale' | 0..1 | shn:Scale |
| O7 | shn:ObservationResult | 'results from process' | 1..* | shn:ObservationProcess or shn:AssessmentProcess |

Table 2-9 Semantic pattern OB_CS_PT

Examples of application of this top-level pattern are Clinical Test Result, Physical Examination Result,

measurement result, etc. The following table shows its application for representing “10 cigarettes /

day tobacco smoking assessment”. The Object part of the triples has been constrained by following

the cardinality and value constraints indicated by the pattern.

| #N | Subject | Predicate | Object |
|---|---|---|---|
| O1 | shn:ObservationResult | 'describes quality' | shn:MassIntake |
| O2 | btl:Quality | 'is quality of' | sct:CigaretteTobaccoSmokingSituation |
| O3 | shn:ObservationResult | 'has observed value' | btl:ValueRegion |
| O4 | btl:ValueRegion | 'has value' | 10 |
| O5 | btl:ValueRegion | 'has units' | sct:PerDay |
| O7 | shn:ObservationResult | 'results from process' | shn:Assessment |

Table 2-10 Observation result example based on OB_CS_PT

Top-level pattern CP_PT (cf. Table 2-11):

The Clinical Process pattern (CP_PT) is used to describe a clinical process. Unlike the previous patterns, it does not describe information about a clinical issue but the clinical process itself. It is used by the previous patterns through composition.

It consists of the following SPO triples:

- P1: The participants of the process, which play some role as agent (a participant that is active in the process) or as patient (a participant that is passive in the process). An example of an active process participant is the clinician in a diagnostic process, whereas passive process participants could be the patient, in some cases, or the devices involved. This predicate can be further specialised by the predicates ‘has information subject’ and ‘has information provider’. The first indicates the person or artefact that is the subject of the information (e.g. patient, patient relative, etc.). The second indicates the provider of the information (e.g. patient, clinician, etc.). Again, we have to distinguish between the meta-model (template) and the model (OWL ontology). For instance, the predicate ‘has participant’ has a minimal cardinality of zero. This does not mean that processes have no participants; it only means that at this place the OWL ontology does not require any explicit axiom on process participants.

- P2: The date/time at which the process takes place. It can be provided as a time interval or as single date/time points.

- P3: Information about where the process occurs (e.g. hospital).

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| P1 | btl:Process | ‘has participant’ | 0..* | btl:MaterialObject |
| P2 | btl:Process | ‘occurs at’ | 0..1 | btl:TemporalRegion |
| P3 | btl:Process | ‘happens at’ | 0..* | btl:MaterialObject or btl:InmaterialObject |

Table 2-11 Semantic pattern CP_PT

Examples of application of this top-level pattern are a history taking procedure, assessment, observa-

tion, etc. The following table shows its application for representing a history taking procedure carried

out in an outpatient clinic.

| #N | Subject | Predicate | Object |
|---|---|---|---|
| P1 | sct:HistoryTaking | ‘has information provider’ | shn:Clinician |
| P1 | sct:HistoryTaking | ‘has information subject’ | shn:Patient |
| P3 | sct:HistoryTaking | ‘happens at’ | sct:HospitalBasedOutpatientDepartment |

Table 2-12 History taking example based on CP_PT

Top-level pattern CS_PT (cf. Table 2-13):

The Clinical Situation pattern (CS_PT) is defined as a specialisation of CP_PT and can be used to describe a patient clinical situation. It is also used as a compositional pattern. It consists of the following SPO triples:

- C1: The participants of the clinical situation which play some role as agent or patient (e.g. pa-

tient)

- C2: The date/time in which the situation takes place. It can be provided as a time interval or

as single date/time points

- C3: The place where the process occurs (e.g. hospital).

- C4: The disposition, process, or material object which precedes the clinical situation. The predicate ‘follows’ can be further refined by ‘is caused by’. For instance, a cause of a heart failure situation can be a myocardial infarction.

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| C1 | shn:ClinicalSituation | ‘has participant’ | 0..* | btl:MaterialObject |
| C2 | shn:ClinicalSituation | ‘occurs at’ | 0..1 | btl:TemporalRegion |
| C3 | shn:ClinicalSituation | ‘happens at’ | 0..* | btl:MaterialObject or btl:InmaterialObject |
| C4 | shn:ClinicalSituation | ‘follows’ | 0..* | shn:ClinicalSituation (CS_PT) |

Table 2-13 Semantic pattern CS_PT


Examples of application of this top-level pattern are Heart failure caused by heart attack, Diabetes

mellitus, etc. The following table shows its application for representing a Heart failure situation

caused by heart attack.

| #N | Subject | Predicate | Object |
|---|---|---|---|
| C1 | sct:HeartFailure | ‘has participant’ | shn:HumanOrganism |
| C2 | sct:HeartFailure | ‘occurs at’ | 12/01/2012 |
| C4 | sct:HeartFailure | ‘follows’ | sct:HeartAttack |

Table 2-14 Heart failure situation example based on CS_PT

Finally, Figure 2-1 depicts the hierarchical graphical representation of the top-level patterns de-

scribed above. The clinical process and clinical situation semantic patterns can only occur as compo-

nents of other patterns since semantic patterns encode information entities. They will be used

through composition by other patterns such as I_CS_PT, I_NCS_PT, etc. The UML notation has been

used for representing specialisation and composition.

Figure 2-1 Hierarchy of top-level patterns

2.1.3 OWL DL representation

Representing the above top-level patterns in OWL 2 DL allows the precise formalization of the proposed ontological framework and the use of DL reasoning. DL reasoning serves two important goals.

On the one hand, it can be used for detecting equivalent clinical information from iso-semantic mod-

els. This includes the ability to compare different distributions of content between information mod-

els and ontologies/terminologies, and to test whether they are semantically equivalent. For instance,

there are two possible representations to encode a breast cancer diagnosis when using SNOMED CT:

(1) using one diagnosis information model element and the concept Breast cancer or (2) using two

information model elements for representing the disease diagnosed Cancer and the disease location

Breast structure. Given an appropriate representation, a DL reasoner should infer that both representations are semantically equivalent.
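The idea of detecting equivalent encodings can be illustrated with a deliberately simplified Python sketch. It is not a DL reasoner: it merely expands a pre-coordinated concept into its defining parts through a lookup table, so that both encodings of the breast cancer diagnosis can be compared. The `DEFINITIONS` table, the attribute names and the function `normalise` are hypothetical.

```python
# Hypothetical definition table: a pre-coordinated concept and its defining parts.
DEFINITIONS = {
    "BreastCancer": {("disease", "Cancer"), ("location", "BreastStructure")},
}

def normalise(elements):
    """Expand pre-coordinated concepts so that different encodings become comparable."""
    expanded = set()
    for attribute, value in elements:
        expanded |= DEFINITIONS.get(value, {(attribute, value)})
    return expanded

# (1) One information model element with the concept Breast cancer.
encoding_1 = [("disease", "BreastCancer")]
# (2) Two information model elements: disease Cancer and location Breast structure.
encoding_2 = [("disease", "Cancer"), ("location", "BreastStructure")]

equivalent = normalise(encoding_1) == normalise(encoding_2)
```

A DL reasoner reaches the corresponding conclusion from the class definitions in the ontology rather than from a hand-written table.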

On the other hand, DL reasoning can be used to provide an advanced exploitation of clinical infor-

mation by means of semantic query possibilities such as retrieving patients who use tobacco, inde-

pendently of the form of the tobacco (e.g. cigar, pipe, etc.) and of the type of consumption (e.g. snuff

or smoking).
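A query of this kind relies on subsumption. The following Python sketch shows the principle on a toy is-a hierarchy; the concept names and patient records are invented for the illustration and are not actual SNOMED CT content.

```python
# Hypothetical is-a hierarchy (child -> parent).
PARENT = {
    "CigarSmoking": "TobaccoSmoking",
    "PipeSmoking": "TobaccoSmoking",
    "TobaccoSmoking": "TobaccoUse",
    "SnuffUse": "TobaccoUse",
}

def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False

# Hypothetical patient records, each annotated with one concept.
records = {"p1": "CigarSmoking", "p2": "SnuffUse", "p3": "HeartFailure"}
tobacco_users = sorted(p for p, c in records.items() if is_a(c, "TobaccoUse"))
```

Both the cigar smoker and the snuff user are retrieved by the single query for tobacco use, although neither record mentions that concept explicitly.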

Table 2-15 and Table 2-16 depict the translation of the patterns into OWL DL, according to the proposed ontological framework. The first table depicts the translation of the predicates into OWL DL expressions. Following the triple-based representation of the patterns, the subject (SUBJ) and object (OBJ) correspond to ontology classes and the predicate to an OWL DL expression. These DL expressions use one or more object properties from our ontologies, together with different quantifiers, as a result of the underlying ontological model.

| Predicate | OWL DL expression |
|---|---|
| 'describes situation' | SUBJ subClassOf shn:isAboutSituation only OBJ |
| 'describes quality' | SUBJ subClassOf shn:isAboutQuality only OBJ |
| 'results from process' | SUBJ subClassOf btl:isOutcomeOf some OBJ |
| 'has attribute' | SUBJ subClassOf btl:hasInformationAttribute some OBJ |
| 'is quality of' | SUBJ subClassOf btl:inheresIn some OBJ |
| 'has observed value' | SUBJ subClassOf btl:Quality and btl:projectsOnto some OBJ |
| 'has value' | SUBJ subClassOf btl:isRepresentedBy only (shn:hasValue some OBJ) |
| 'has units' | SUBJ subClassOf btl:isRepresentedBy only (shn:hasInformationAttribute some OBJ) |
| 'has scale' | SUBJ subClassOf btl:isRepresentedBy only (shn:hasInformationAttribute some OBJ) |
| ‘has participant’ | SUBJ subClassOf btl:hasParticipant some OBJ |
| ‘occurs at’ | SUBJ subClassOf btl:projectsOnto some OBJ |
| ‘happens at’ | SUBJ subClassOf btl:isIncludedIn some OBJ |
| ‘follows’ | SUBJ subClassOf btl:isPrecededBy some OBJ |

Table 2-15 OWL DL representation of the top-level patterns predicates
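The translation defined in Table 2-15 amounts to filling a per-predicate template with the subject and object classes of a triple. A minimal Python sketch of this step, covering a subset of the predicates, could look as follows; the dictionary `TEMPLATES` and the function `to_owl` are hypothetical names, and the output is a plain string rather than a parsed OWL axiom.

```python
# Predicate templates copied from Table 2-15 (subset).
TEMPLATES = {
    "describes situation": "{s} subClassOf shn:isAboutSituation only {o}",
    "results from process": "{s} subClassOf btl:isOutcomeOf some {o}",
    "has attribute": "{s} subClassOf btl:hasInformationAttribute some {o}",
    "follows": "{s} subClassOf btl:isPrecededBy some {o}",
}

def to_owl(subject, predicate, obj):
    """Fill the OWL DL template of one pattern triple."""
    return TEMPLATES[predicate].format(s=subject, o=obj)

# Two triples of the "Suspected heart failure diagnosis" example (Table 2-2).
axioms = [
    to_owl("shn:InformationItem", "describes situation", "sct:HeartFailure"),
    to_owl("shn:InformationItem", "results from process", "sct:DiagnosticProcess"),
]
```

The first resulting string reads `shn:InformationItem subClassOf shn:isAboutSituation only sct:HeartFailure`, which mirrors the first row of the table with SUBJ and OBJ substituted.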

Note that the quantifiers used differ. To connect information entities with clinical entities we have used the universal quantifier ('only'), because we cannot assume that an instance of X exists wherever there is an instance of "information on X". We are aware that such models pose technical difficulties, as they allow statements that are not about anything at all. To the best of our knowledge, however, adverse reasoning consequences are avoided as long as we use very specific object properties (here shn:isAboutSituation instead of btl:represents). On the other hand, a false diagnosis is indeed an existing statement that is not about any individual clinical situation at all. A second possibility would be to allow hypothetical entities. More satisfactory, but more complicated, would be the use of a higher-order logic, so that the uncertainty can be targeted correctly at the statement or belief rather than at the underlying state of the world. However, this is likely to remain beyond the scope of easily used computational logics for the near future, so approximations using description logic are required.
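The practical difference between the two quantifiers can be seen in a small Python sketch that evaluates 'only' and 'some' over the set of situations an information item refers to. The individual names are invented, and the functions are only a set-based reading of the two restrictions, not an OWL implementation.

```python
def satisfies_only(successors, target_class):
    """'R only C': every R-successor is in C (vacuously true if there are none)."""
    return all(s in target_class for s in successors)

def satisfies_some(successors, target_class):
    """'R some C': at least one R-successor is in C."""
    return any(s in target_class for s in successors)

# Hypothetical individuals that are clinical situations.
clinical_situations = {"hf_episode_1"}

# A false diagnosis: the statement exists but refers to no actual situation.
false_diagnosis_refs = []
# A correct diagnosis refers to an existing heart failure situation.
true_diagnosis_refs = ["hf_episode_1"]

false_dx_only = satisfies_only(false_diagnosis_refs, clinical_situations)
false_dx_some = satisfies_some(false_diagnosis_refs, clinical_situations)
```

The false diagnosis satisfies the universal restriction but not the existential one, which is why 'only' does not force the existence of the clinical situation that the information is about.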

The second table (cf. Table 2-16) shows the result of applying the OWL DL translation of the predicates to each pattern. The OWL snippets can be understood as equivalent to the predicates. As can be observed, beyond the translation of the predicates, additional changes are required for some patterns, such as the one that refers to the past history of a clinical situation (PH_CS_PT) or those that include negation (I_NCS_PT, NPH_CS_PT).


| Pattern | OWL DL expression |
|---|---|
| I_CS_PT | shn:InformationItem and shn:isAboutSituation only shn:ClinicalSituation and btl:isOutcomeOf some btl:Process and shn:hasInformationAttribute some shn:InformationAttribute |
| PH_CS_PT | shn:InformationItem and shn:isAboutSituation only (btl:BiologicalLife and btl:hasPart some shn:ClinicalSituation) and btl:isOutcomeOf some btl:Process and shn:hasInformationAttribute some shn:InformationAttribute |
| I_NCS_PT | shn:InformationItem and shn:isAboutSituation only (shn:ClinicalSituation and not btl:hasPart some shn:ClinicalSituation) and btl:isOutcomeOf some btl:Process and shn:hasInformationAttribute some shn:InformationAttribute |
| NPH_CS_PT | shn:InformationItem and shn:isAboutSituation only (btl:BiologicalLife and not btl:hasPart some shn:ClinicalSituation) and btl:isOutcomeOf some btl:Process and shn:hasInformationAttribute some shn:InformationAttribute |
| OB_CS_PT | shn:ObservationResult and shn:isAboutQuality only (btl:ProcessQuality and btl:inheresIn some shn:ClinicalSituation and btl:projectsOnto some (btl:ValueRegion and btl:isRepresentedBy only (shn:hasInformationAttribute some shn:MeasurementUnits and shn:hasValue some xml:datatype and shn:hasInformationAttribute some shn:Scale))) |
| CP_PT | btl:Process and btl:hasParticipant some btl:MaterialObject and btl:projectsOnto some btl:TemporalRegion and btl:isIncludedIn some btl:MaterialObject |
| CS_PT | shn:ClinicalSituation and btl:hasParticipant some btl:MaterialObject and btl:projectsOnto some btl:TemporalRegion and btl:isIncludedIn some btl:MaterialObject and btl:isPrecededBy only shn:ClinicalSituation |

Table 2-16 OWL DL representation of the top-level patterns

Examples of how the above two tables are used to transform data modelled according to the semantic patterns into their OWL DL representation, how the result can be queried, and some of the implications with regard to negation are discussed at the end of Chapter 4.


3 Extending the Heart Failure Summary (HFS) to support Public Health

In the following, we describe the two models created for recording tobacco use and alcohol use, respectively. For each model, we provide its graphical representation as given by the openEHR HFS template [12], for readability purposes. This template uses a subset of openEHR archetypes, which provide textual descriptions for each of the data elements included. Figure 3-1 depicts the main structure of the template. We then apply the semantic patterns described in Chapter 2 to the modelling of the tobacco and alcohol use information. The proposed patterns do not cover aspects of information presentation, such as order or indentation, but the semantics of the information they represent.

Figure 3-1 Main structure of the OpenEHR Heart Failure Summary extended to support public health

3.1 Tobacco Use Summary

Figure 3-2 depicts the graphical representation of the Tobacco Use Summary Model. The model can

be subdivided into two sub-models, depending on how the tobacco is consumed (i.e. smoked or

snuff). Table 3-1 and Table 3-11 summarize the data elements and value restrictions for each one.

SNOMED CT terms were used when possible for encoding the value restrictions. Next, we analyse

each of the models in separate subsections.

[12] http://www.openehr.org/ckm/#showTemplate_1013.26.14_2


Figure 3-2 Graphical view of the OpenEHR Tobacco Use Template

3.1.1 Smoking Tobacco Model

The model consists of a set of findings, presumably obtained by asking the patient about a certain clinical situation (e.g. Do you smoke?) or about a certain quality of the clinical situation, such as the mass intake (e.g. How many cigarettes do you smoke per week?). In the first case, the situation is defined as the part of a patient's life in which intermittent smoking processes occur; in the second, we ask about the number of cigarettes smoked (i.e. mass intake) in a certain period. The answers to the questions can be quantitative (e.g. "I smoke 10 cigarettes per day") or qualitative (e.g. "I smoke tobacco daily").

In the following, we apply the top-level semantic patterns described in Chapter 2 to encode the clinical data of the Smoking Tobacco model. Table 3-1 shows the model data elements and their value constraints, as well as the semantic pattern applied for representing them (cf. #Pattern ID).

In order to decide which pattern to apply for each data element / value constraint combination, the meaning of the information that we aim to represent has to be analysed. The correct pattern can be chosen by answering a set of questions, which ideally would be posed by a modelling tool.

We have stated that the model consists of a set of findings about patient clinical situations and about their qualities. At first glance, the top-level pattern that best fits the representation of information about a clinical situation is the "information about clinical situation" pattern (I_CS_PT). However, on further analysis of the meaning of the clinical situation terms (e.g. smoker, ex-smoker, never smoked tobacco, heavy smoker, etc.), we have to consider aspects such as: Do they refer to a present or a past situation? Do they indicate its presence or its absence? If the term refers to a past situation (e.g. ex-smoker), the semantic pattern "past history about clinical situation" (PH_CS_PT) has to be used. If, in addition, the term states that the situation never existed in the past (e.g. never smoked tobacco), the "no past history of clinical situation" pattern (NPH_CS_PT) fits this purpose.


In order to represent qualities of certain clinical situations (e.g. the amount of tobacco smoked), the observation result pattern (OB_CS_PT) is used, which describes the result of an observation or assessment of some quality of a given clinical situation.
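The questions above can be read as a small decision procedure. The following Python sketch is only an illustration of that reading, not part of the SHN tooling: the function name `select_pattern` and its boolean parameters are hypothetical, and only the four patterns named in this subsection are covered.

```python
def select_pattern(is_quality: bool, in_past: bool = False, absent: bool = False) -> str:
    """Return the ID of the top-level pattern for one data element / value pair.

    Hypothetical helper: the three flags stand for the modelling questions
    (quality of a situation? past situation? absent situation?).
    """
    if is_quality:
        return "OB_CS_PT"   # result of observing a quality of a situation
    if in_past and absent:
        return "NPH_CS_PT"  # no past history of the situation
    if in_past:
        return "PH_CS_PT"   # past history of the situation
    if absent:
        # Absence of a present situation is outside the scope of this sketch.
        raise ValueError("not covered by this sketch")
    return "I_CS_PT"        # information about a clinical situation


for term, answers in [
    ("Smoker (finding)", {"is_quality": False}),
    ("Ex-smoker (finding)", {"is_quality": False, "in_past": True}),
    ("Never smoked tobacco (finding)", {"is_quality": False, "in_past": True, "absent": True}),
    ("Typical smoked amount (numeric)", {"is_quality": True}),
]:
    print(term, "->", select_pattern(**answers))
```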

| Data Element | Value | #Pattern ID |
|---|---|---|
| Smoking status | 77176002 Smoker (finding) | I_CS_PT |
| | 160616005 Trying to give up smoking (finding) | |
| | 8517006 Ex-smoker (finding) | PH_CS_PT |
| | 266919005 Never smoked tobacco (finding) | NPH_CS_PT |
| Form | 66562002 Cigarette smoking tobacco (substance) | I_CS_PT |
| | 26663004 Cigar smoking tobacco (substance) | I_CS_PT |
| | 84498003 Pipe smoking tobacco (substance) | I_CS_PT |
| Typical Smoked Amount | xml:double, units: /d, /wk | OB_CS_PT |
| | 365982000 Heavy smoker (over 20 per day) | I_CS_PT |
| | 230060001 Light cigarette smoker | I_CS_PT |
| | 56578002 Moderate smoker (20 or less per day) | I_CS_PT |
| Pattern of Use | 449868002 Smokes tobacco daily | I_CS_PT |
| | 230059006 Occasional cigarette smoker | I_CS_PT |
| Date Ceased | xml:Date | I_CS_PT |
| Pack Years | xml:double | OB_CS_PT |

Table 3-1 Data Elements and Value Restrictions of the Smoking Tobacco Model

Before describing how the semantic patterns are applied here, it is important to comment on the SNOMED CT concepts that have been selected for encoding the values of the model data elements. The first important observation is that some of the names of the SNOMED CT terms are misleading. In order to place the terms correctly within the pattern, it is important to know to which semantic category (SNOMED CT hierarchy axis) they belong (e.g. Finding, Procedure, Qualifier Value, etc.). As an example, the term Smoker does not refer to a person but to a smoking situation, because it is placed in the SNOMED CT Clinical finding hierarchy. An immediate conclusion is that the underlying model of meaning (i.e. the SNOMED CT concept model) has to be considered in order to use the concept consistently within the EHR and with regard to the information model, and thus to choose the right pattern for its modelling. For SNOMED CT, a desideratum would be to revise preferred terms and fully specified names in order to prevent misleading interpretations.

An additional complication arises when the model of meaning is incomplete or faulty. As an example, we find two terms in SNOMED CT for encoding the tobacco form: cigarette smoking tobacco and cigarette smoker, which are in the substance and the finding hierarchy, respectively. There is a risk that these terms are used as binding values within clinical models without considering their semantic types: the data element may refer to the finding, while the value is in fact a substance. Apart from the complication that derives from using these terms interchangeably, without taking their intended meaning into consideration, the SNOMED CT concept model does not provide any relationship between them. As a consequence, models that use them differently will not be semantically interoperable. However, this can be solved if both are semantically related, as we demonstrate here:

    sct:CigaretteTobaccoSmokingSituation subClassOf
        btl:hasParticipant some sct:CigaretteTobaccoSmokeSubstance
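As a rough illustration of why such a link restores interoperability, the Python sketch below treats the axiom as a lookup from a finding (situation) code to the substance code it has as participant, using the SNOMED CT codes of this model. Only the cigarette pair is stated as an axiom in the text; the cigar and pipe pairs are assumed here by analogy, and the names `HAS_PARTICIPANT`, `to_situation` and `same_form` are hypothetical. A real implementation would rely on a description logic reasoner rather than a dictionary.

```python
# Finding (situation) code -> substance code that participates in it.
# Cigarette pair: from the axiom above. Cigar and pipe pairs: assumed by analogy.
HAS_PARTICIPANT = {
    "65568007": "66562002",  # cigarette smoker -> cigarette smoking tobacco
    "59978006": "26663004",  # cigar smoker     -> cigar smoking tobacco
    "82302008": "84498003",  # pipe smoker      -> pipe smoking tobacco
}

# Inverse lookup: substance code -> situation code.
SITUATION_OF = {substance: situation for situation, substance in HAS_PARTICIPANT.items()}


def to_situation(code: str) -> str:
    """Normalise a 'Form' value to its finding (situation) code."""
    return SITUATION_OF.get(code, code)


def same_form(code_a: str, code_b: str) -> bool:
    """True if two bound values denote the same form of tobacco smoking."""
    return to_situation(code_a) == to_situation(code_b)


# One model binds the substance, another binds the finding:
print(same_form("66562002", "65568007"))  # cigarette substance vs cigarette smoker
print(same_form("66562002", "59978006"))  # cigarette substance vs cigar smoker
```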

For clarity, the left column of Table 3-2 lists the code and fully specified name (FSN) of the SNOMED CT concepts we use in the upcoming examples; the right column shows the names we use to refer to them in order to help the reader. The re-naming reflects the semantic category of the concepts.


| SNOMED CT code & FSN | Renaming suggestion |
|---|---|
| 77176002 Smoker (finding) | Tobacco smoking situation |
| 160616005 Trying to give up smoking (finding) | Effort of quitting tobacco smoking situation |
| 8517006 Ex-smoker (finding) | Past history tobacco smoking situation |
| 266919005 Never smoked tobacco (finding) | No past history tobacco smoking situation |
| 66562002 Cigarette smoking tobacco (substance) | Cigarette tobacco smoke substance |
| 26663004 Cigar smoking tobacco (substance) | Cigar tobacco smoke substance |
| 84498003 Pipe smoking tobacco (substance) | Pipe tobacco smoke substance |
| 65568007 Cigarette smoker (finding) | Cigarette tobacco smoking situation |
| 59978006 Cigar smoker (finding) | Cigar tobacco smoking situation |
| 82302008 Pipe smoker (finding) | Pipe tobacco smoking situation |
| 365982000 Heavy smoker (over 20 per day) | Heavy tobacco smoking situation |
| 230060001 Light cigarette smoker | Light cigarette tobacco smoking situation |
| 56578002 Moderate smoker (20 or less per day) | Moderate cigarette tobacco smoking situation |
| 449868002 Smokes tobacco daily | Daily tobacco smoking situation |
| 230059006 Occasional cigarette smoker | Occasional cigarette tobacco smoking situation |

Table 3-2 Re-naming of the SNOMED CT concepts used in the smoking tobacco model

In addition, Figure 3-3 shows the original SNOMED CT hierarchy of the above concepts and how it would look after re-modelling, taking into consideration the two aspects described above: misleading term names and an incomplete or faulty underlying model of meaning.

Figure 3-3 Re-modelling of the original Smoking Tobacco hierarchy

In the re-naming suggestion for the SNOMED CT concepts, we have not considered aspects such as "over 20 per day" or "20 or less per day", which are part of concepts such as heavy smoker and moderate smoker, respectively. On the one hand, this kind of knowledge is not formally represented in SNOMED CT and is only included in the name of a primitive concept. On the other hand, (heavy / light / moderate) smoker can probably not be defined universally. We consider that this kind of knowledge might vary across institutions or depend on study purposes and therefore should not be included in the terminology. It could be implemented as a separate set of rules that could be shared in an interoperability scenario. For instance, if in our local implementation a heavy cigarette smoker means ">= 10 cigarettes per day", we could represent it with the following general OWL DL axiom:

    shn:ObservationResult
      and shn:isAboutQuality some
        (shn:MassIntake
          and btl:inheresIn some sct:CigaretteTobaccoSmokingSituation
          and btl:projectsOnto some
            (btl:ValueRegion
              and btl:IsRepresentedBy only
                (shn:hasInformationObjectAttribute some sct:PerDay
                  and shn:hasValue some double[>= 10])))
    SubClassOf
      (shn:InformationItem
        and shn:isAboutSituation only sct:HeavyCigaretteTobaccoSmokingSituation)
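To make the idea of a shareable local rule more concrete, the following Python sketch mimics the effect of the axiom outside a reasoner. It is only an approximation under stated assumptions: the class `ObservationResult`, the function `inferred_situation` and the constant `HEAVY_PER_DAY` are hypothetical names, and the threshold of 10 per day is the local example from the text, not a universal definition.

```python
from dataclasses import dataclass
from typing import Optional

# Local, institution-specific threshold (the example value used in the text).
HEAVY_PER_DAY = 10.0


@dataclass(frozen=True)
class ObservationResult:
    """Flattened view of an observation result (hypothetical structure)."""
    quality: str     # e.g. "shn:MassIntake"
    situation: str   # the situation the quality inheres in
    unit: str        # e.g. "sct:PerDay"
    value: float


def inferred_situation(obs: ObservationResult) -> Optional[str]:
    """Return the situation class implied by the local rule, or None."""
    if (
        obs.quality == "shn:MassIntake"
        and obs.situation == "sct:CigaretteTobaccoSmokingSituation"
        and obs.unit == "sct:PerDay"
        and obs.value >= HEAVY_PER_DAY
    ):
        return "sct:HeavyCigaretteTobaccoSmokingSituation"
    return None


obs = ObservationResult("shn:MassIntake", "sct:CigaretteTobaccoSmokingSituation", "sct:PerDay", 12)
print(inferred_situation(obs))
```

Keeping the threshold in a single named constant reflects the point made above: the rule can be exchanged between institutions without touching the terminology.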


Taking into consideration what has been explained, we now apply the mentioned top-level patterns to the tobacco smoking model. Applying a pattern consists of constraining it by following the cardinality and value restrictions provided by the semantic pattern.

The most frequently used pattern is I_CS_PT. We use it for representing all data element / value combinations shown in the left column of Table 3-4. Two triples of the pattern are used, S1 and S2, which represent a clinical situation and the process performed to acquire that information, respectively. Their predicate cardinality has been constrained to 1, and their value range (the object part of the triple) to a subclass of clinical finding (i.e. clinical situation) and to an evaluation procedure according to SNOMED CT, respectively.

We constrain it as follows:

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1 | << 404684003 clinical finding (finding) |
| S2 | shn:InformationItem | 'results from process' | 1 | << 386053000 evaluation procedure (procedure) |
| S3 | shn:InformationItem | 'has attribute' | 0 | shn:InformationAttribute |

Table 3-3 Semantic pattern I_CS_PT constrained
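The cardinality part of Table 3-3 can be checked mechanically. The Python sketch below is a minimal illustration under two assumptions: a cardinality of 1 is read as "exactly one" and a cardinality of 0 as "not used", and the value-range constraints (subclass of clinical finding, evaluation procedure) are not checked because that would require a terminology service. The names `I_CS_PT` and `violations` are hypothetical.

```python
from collections import Counter

# Predicate -> (minimum, maximum) occurrences, following Table 3-3.
I_CS_PT = {
    "describes situation": (1, 1),   # S1
    "results from process": (1, 1),  # S2
    "has attribute": (0, 0),         # S3 (not used in this model)
}


def violations(triples, pattern=I_CS_PT):
    """List cardinality problems of (subject, predicate, object) triples."""
    counts = Counter(predicate for _, predicate, _ in triples)
    problems = []
    for predicate, (lowest, highest) in pattern.items():
        found = counts.get(predicate, 0)
        if not lowest <= found <= highest:
            problems.append(f"{predicate}: expected {lowest}..{highest}, found {found}")
    for predicate in counts:
        if predicate not in pattern:
            problems.append(f"{predicate}: not part of the pattern")
    return problems


smoker = [
    ("shn:InformationItem", "describes situation", "sct:TobaccoSmokingSituation"),
    ("shn:InformationItem", "results from process", "sct:Evaluation"),
]
print(violations(smoker))      # no problems expected
print(violations(smoker[:1]))  # the process triple is missing
```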

We have decided to constrain the value of the process used to acquire the information to an evaluation procedure. However, a more specific procedure concept could be used instead, such as assessment of lifestyle (procedure) or history taking (procedure). More contextual information about this process can be added, such as who provided the information, who is the subject of the information, the purpose, where it was acquired, etc. This can be done by means of the clinical process pattern (cf. CP_PT), which provides a set of triples that can be constrained for that purpose. Here, for simplicity, we limit it to providing the procedure term. Chapter 4 provides examples of this.

Table 3-4 shows the result of applying the I_CS_PT pattern to each of the data element / value combinations of the clinical model. Each row has three columns: (1) the data element / value combination that is the modelling focus; (2) the triple-based representation of this data element / value after constraining I_CS_PT; and (3) the corresponding pattern triple number.

The representation of (Date Ceased / xml:DateTime) is an example of pattern composition, in which the semantic pattern that is used to further describe a clinical situation (CS_PT) is used to describe the end date of the tobacco smoking situation.

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Smoking Status / Smoker (finding) | shn:InformationItem 'describes situation' sct:TobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Form / cigarette smoking tobacco (substance) | shn:InformationItem 'describes situation' sct:CigaretteTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Form / cigar smoking tobacco (substance) | shn:InformationItem 'describes situation' sct:CigarTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Form / pipe smoking tobacco (substance) | shn:InformationItem 'describes situation' sct:PipeTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Typical Smoked Amount / heavy smoker (over 20 per day) | shn:InformationItem 'describes situation' sct:HeavyTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Typical Smoked Amount / light cigarette smoker | shn:InformationItem 'describes situation' sct:LightCigaretteTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Typical Smoked Amount / moderate smoker (20 or less per day) | shn:InformationItem 'describes situation' sct:ModerateTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Pattern of Use / smokes tobacco daily | shn:InformationItem 'describes situation' sct:DailyTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Pattern of Use / occasional cigarette smoker | shn:InformationItem 'describes situation' sct:OccasionalCigaretteTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Date Ceased / xml:Date | shn:InformationItem 'describes situation' sct:TobaccoSmokingSituation | #S1 |
| | sct:TobaccoSmokingSituation 'has end time' xml:DateTime | #C2 |

Table 3-4 Pattern-based representation of the Tobacco smoking model (I_CS_PT)

We have used the observation result pattern (OB_CS_PT), which describes the result of an observation or assessment, for representing the typical smoked amount as a quantitative value and for the representation of pack years. The latter is defined as follows by the clinical model:

As before, we constrain the OB_CS_PT pattern by following the cardinality and value constraints for representing both data element / value pairs shown in the left column of Table 3-6. Table 3-5 shows how the pattern is constrained:

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| O1 | shn:ObservationResult | 'describes quality' | 1 | btl:ProcessQuality |
| O2 | btl:Quality | 'is quality of' | 1 | << 404684003 clinical finding (finding) |
| O3 | shn:ObservationResult | 'has observed value' | 1 | btl:ValueRegion |
| O4 | btl:ValueRegion | 'has value' | 1 | xml:Double |
| O5 | btl:ValueRegion | 'has units' | 1 | shn:MeasurementUnits |
| O6 | btl:ValueRegion | 'has scale' | 1 | shn:QuantitativeScale |
| O7 | shn:ObservationResult | 'results from process' | 1 | evaluation procedure (procedure) |

Table 3-5 Semantic pattern OB_CS_PT constrained

All pattern triples constrain their predicate cardinality to one. Triple O2 constrains its value to a subclass (<<) of clinical finding; O4 is constrained to an xml double datatype; O6 to a quantitative value scale; and O7 to an evaluation procedure.

Table 3-6 shows the result of applying the OB_CS_PT pattern to each of the data elements / value

combinations of the clinical model.

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Typical smoked amount / 10 cigarettes / day | shn:ObservationResult 'describes quality' shn:MassIntake | #O1 |
| | shn:MassIntake 'is quality of' sct:CigaretteTobaccoSmokingSituation | #O2 |
| | shn:ObservationResult 'has observed value' btl:ValueRegion | #O3 |
| | btl:ValueRegion 'has value' 10 | #O4 |
| | btl:ValueRegion 'has units' sct:PerDay | #O5 |
| Pack years / 2 | shn:ObservationResult 'describes quality' shn:MassIntake | #O1 |
| | shn:MassIntake 'is quality of' sct:TobaccoSmokingSituation | #O2 |
| | shn:ObservationResult 'has observed value' btl:ValueRegion | #O3 |
| | btl:ValueRegion 'has value' 2 | #O4 |
| | btl:ValueRegion 'has units' sct:PackYears | #O5 |

Table 3-6 Pattern-based representation of the Tobacco smoking model (instantiated pattern OB_CS_PT)
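The instantiation shown in Table 3-6 follows a fixed shape, so it can be generated from four inputs. The Python sketch below is one possible way to do this; the function name `observation_triples` is hypothetical, and the triples are plain tuples rather than RDF statements.

```python
def observation_triples(quality, situation, value, units):
    """Instantiate triples O1 to O5 of OB_CS_PT for one quantitative value."""
    result, region = "shn:ObservationResult", "btl:ValueRegion"
    return [
        (result, "describes quality", quality),   # O1
        (quality, "is quality of", situation),    # O2
        (result, "has observed value", region),   # O3
        (region, "has value", float(value)),      # O4
        (region, "has units", units),             # O5
    ]


# First row of Table 3-6: typical smoked amount of 10 cigarettes per day.
amount = observation_triples(
    "shn:MassIntake", "sct:CigaretteTobaccoSmokingSituation", 10, "sct:PerDay"
)
for triple in amount:
    print(triple)
```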

Finally, we represent Smoking status / Ex-smoker (finding) and Smoking status / Never smoked tobacco (finding). For the first one, we constrain the top-level semantic pattern for representing the past history of a clinical situation (PH_CS_PT) by limiting the cardinality of the triples that describe the situation and the process used to acquire the information to one, and their value ranges to a subclass of clinical finding and to evaluation procedure, respectively (cf. Table 3-7).


| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1..1 | << 404684003 clinical finding (finding) |
| S2 | shn:InformationItem | 'results from process' | 1..1 | evaluation procedure (procedure) |
| S3 | shn:InformationItem | 'has attribute' | 0 | shn:InformationAttribute |
| S4 | shn:InformationItem | 'has temporal context' | 1..1 | sct:InThePast |

Table 3-7 Semantic pattern PH_CS_PT constrained for representing Smoking status / Ex-smoker(finding)

Table 3-8 depicts the triple-based representation of Smoking status / Ex-smoker (finding) after applying the above pattern:

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Smoking status / Ex-smoker (finding) | shn:InformationItem 'describes situation' sct:TobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has temporal context' sct:InThePast | #S4 |

Table 3-8 Pattern-based representation of the Tobacco smoking model (PH_CS_PT)

For representing Smoking status / Never smoked tobacco (finding) we will constrain the top-level semantic

pattern NPH_CS_PT which describes the absence of a clinical situation in the past, as follows (cf. Table 3-9):

| #N | Subject | Predicate | Cardinality | Object |
|---|---|---|---|---|
| S1 | shn:InformationItem | 'describes situation' | 1..1 | << 404684003 clinical finding (finding) |
| S2 | shn:InformationItem | 'results from process' | 1..1 | evaluation procedure (procedure) |
| S3 | shn:InformationItem | 'has attribute' | 0 | shn:InformationAttribute |
| S4 | shn:InformationItem | 'has temporal context' | 1..1 | sct:InThePast |
| S5 | shn:InformationItem | 'has situation context' | 1..1 | sct:Absent |

Table 3-9 Semantic pattern NPH_CS_PT constrained for representing Smoking status / Never smoked tobacco

Table 3-10 shows the resulting triple-based representation after applying the above pattern:

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Smoking status / Never smoked tobacco (finding) | shn:InformationItem 'describes situation' sct:TobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has temporal context' sct:InThePast | #S4 |
| | shn:InformationItem 'has situation context' sct:Absent | #S5 |

Table 3-10 Pattern-based representation of the Tobacco smoking model (NPH_CS_PT)
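Because the three smoking status values share the same situation and differ only in their temporal and situation context, a consumer can recover a coarse status from the triples alone. The Python sketch below illustrates this reading for the representations in Tables 3-4, 3-8 and 3-10; the function name `smoking_status` and the returned labels are hypothetical, and no reasoning over subclasses is attempted.

```python
def smoking_status(triples):
    """Map a pattern-based representation to a coarse smoking status label."""
    objects = {predicate: obj for _, predicate, obj in triples}
    if objects.get("describes situation") != "sct:TobaccoSmokingSituation":
        return "unknown"
    past = objects.get("has temporal context") == "sct:InThePast"
    absent = objects.get("has situation context") == "sct:Absent"
    if past and absent:
        return "never smoked"   # NPH_CS_PT
    if past:
        return "ex-smoker"      # PH_CS_PT
    if absent:
        return "unknown"        # absence without temporal context: not covered here
    return "smoker"             # I_CS_PT


item = "shn:InformationItem"
smoker_item = [
    (item, "describes situation", "sct:TobaccoSmokingSituation"),
    (item, "results from process", "sct:Evaluation"),
]
ex_smoker_item = smoker_item + [(item, "has temporal context", "sct:InThePast")]
never_item = ex_smoker_item + [(item, "has situation context", "sct:Absent")]

for record in (smoker_item, ex_smoker_item, never_item):
    print(smoking_status(record))
```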

3.1.2 Model for other forms of tobacco use (Snuff)

The Snuff Tobacco Model is very similar to the Smoking Tobacco Model, and therefore the semantic patterns used and the way they are specialised and composed are practically the same. The only difference is the set of SNOMED CT concepts used, which in this case refer to snuff tobacco use instead of smoked tobacco use. Table 3-11 depicts the model data elements and value constraints, using SNOMED CT concepts when possible, as well as the top-level pattern applied for each data element / value combination that appears in the clinical model.

| Data Element | Value | #Pattern ID |
|---|---|---|
| Snuff use status | 228494002 snuff user (finding) | I_CS_PT |
| | 228493008 ex-snuff user (finding) | PH_CS_PT |
| | 228492003 never used snuff (finding) | NPH_CS_PT |
| Typical Snuff Consumption (tins) | xml:Double /wk | OB_CS_PT |
| Pattern of Use | daily tobacco snuff user | I_CS_PT |
| | occasional tobacco snuff user | I_CS_PT |

Table 3-11 Data Elements and Value Restrictions of the Snuff Tobacco Model

As with the Smoking Tobacco Model, we have re-named the SNOMED CT concepts used by the model, taking into consideration their underlying model of meaning. We have not found SNOMED CT concepts for encoding all the model values; the values without a concept can be recognised in Table 3-11 because no codes are provided. Table 3-12 depicts the result of the re-naming process.

| SNOMED CT code & FSN | Renaming suggestion |
|---|---|
| 228494002 snuff user (finding) | tobacco snuff situation |
| 228493008 ex-snuff user (finding) | past history tobacco snuff situation |
| 228492003 never used snuff (finding) | no past history tobacco snuff situation |

Table 3-12 Re-naming of the SNOMED CT concepts used in the snuff tobacco model

Since the semantic patterns used have been constrained as in the tobacco smoking model, we will

show directly their triple-based representation after their application (cf. Table 3-13).

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Snuff use status / snuff user (finding) | shn:InformationItem 'describes situation' sct:TobaccoSnuffSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Snuff use status / ex-snuff user (finding) | shn:InformationItem 'describes situation' sct:TobaccoSnuffSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has attribute' sct:InThePast | #S4 |
| Snuff use status / never used snuff (finding) | shn:InformationItem 'describes situation' sct:TobaccoSnuffSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has attribute' sct:InThePast | #S4 |
| | shn:InformationItem 'has attribute' sct:Absent | #S5 |
| Typical Snuff Amount / xml:double /week | shn:ObservationResult 'describes quality' shn:MassIntake | #O1 |
| | shn:MassIntake 'is quality of' sct:TinTobaccoSnuffSituation | #O2 |
| | shn:ObservationResult 'has observed value' btl:ValueRegion | #O3 |
| | btl:ValueRegion 'has value' 10 | #O4 |
| | btl:ValueRegion 'has units' sct:PerWeek | #O5 |
| Pattern of Use / daily tobacco snuff user | shn:InformationItem 'describes situation' shn:DailyTobaccoSnuffSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Pattern of Use / occasional tobacco snuff user | shn:InformationItem 'describes situation' shn:OccasionalTobaccoSnuffSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |

Table 3-13 Pattern-based representation of the Tobacco snuff model

Comparing the granularity of SNOMED CT regarding tobacco smoking with that for other forms of tobacco use, it is not surprising that the representation of e.g. tobacco snuff usage is more coarse-grained, as this kind of tobacco use is less relevant from a clinical and public health perspective. For instance, concepts analogous to "daily tobacco smoking" and "occasional tobacco smoking" are missing. We interpret the meaning of these concepts as process patterns that indicate the regularity of a behaviour, independently of its intensity. Such a pattern is therefore not the same as an observation result "per day", which represents an average value and should not describe the regularity. It is no contradiction to state that someone is an occasional smoker who smokes, on average, 2-3 cigarettes per day: there are occasions on which he smokes one pack, and then nothing at all for a whole week.
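A small worked example makes the difference between regularity and average intensity explicit. The figures in the Python sketch below are hypothetical: one pack of 20 cigarettes smoked on a single day, followed by a week without smoking.

```python
# Hypothetical diary: one pack (assumed to hold 20 cigarettes) on day 1,
# then nothing at all for a whole week.
daily_counts = [20, 0, 0, 0, 0, 0, 0, 0]

# Intensity: the average that an observation result "per day" would carry.
average_per_day = sum(daily_counts) / len(daily_counts)

# Regularity: on how many of the recorded days any smoking took place.
smoking_days = sum(1 for count in daily_counts if count > 0)
share_of_days = smoking_days / len(daily_counts)

print(f"average: {average_per_day} cigarettes per day")
print(f"smoking on {smoking_days} of {len(daily_counts)} days ({share_of_days:.1%})")
```

The average falls in the range of 2-3 cigarettes per day, although smoking occurs on only one day in eight, which is why the two kinds of information need separate representations.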

It would be advantageous if SNOMED CT provided concepts like "occasional behaviour" or "daily behaviour", which could then be post-coordinated with all kinds of drug or food intake, as well as with other behaviours, e.g. physical activity. This would help represent information on, e.g., snuff use at the granularity needed, without a further proliferation of primitive concepts.

Another difficult decision is how to deal with units such as "pack years", "cigarettes per day", (snuff) "tins per day", etc. On the one hand, one could consider them as alternative units for mass intake per time unit (dimension: mass · time⁻¹), adapted to the concrete context and using rough approximations instead of referring to exact masses such as 10 mg tar or 1 mg nicotine per cigarette. On the other hand, one could consider them as a number per time unit (dimension: time⁻¹), but then the model has to state exactly what is being counted (e.g. a single smoking event).


Units like "cigarettes per day" could also be seen as equivalents, e.g. for comparing cigar or pipe smokers with cigarette smokers. In this case, these units would not describe the cardinality of repetitive processes per time unit but would be proxies for an estimated amount of tar and nicotine.

Just as cigarettes, cigars and pipe fillings vary in their content of tar and nicotine, packages contain different numbers of units (cigarette packs may have 12 or 20 cigarettes). The same applies to snuff tins. Using these references alone, one has to accept large variations and systematic biases, which affect comparability. Ontologies cannot solve this problem, and therefore have to define the related concepts as primitive.
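The effect of the pack size on a derived unit such as pack years can be shown with a short calculation. The Python sketch below assumes the conventional reading of a pack year (one pack per day for one year) with the pack size as an explicit parameter; this is an assumption for illustration and not necessarily the definition used by the clinical model. The function name `pack_years` and the example figures are hypothetical.

```python
def pack_years(cigarettes_per_day: float, years: float, cigarettes_per_pack: int = 20) -> float:
    """Pack years = (cigarettes per day / cigarettes per pack) * years smoked."""
    return cigarettes_per_day / cigarettes_per_pack * years


# The same exposure (10 cigarettes per day for 4 years) under both pack sizes
# mentioned in the text.
with_20 = pack_years(10, 4, cigarettes_per_pack=20)
with_12 = pack_years(10, 4, cigarettes_per_pack=12)

print(f"20 per pack: {with_20:.2f} pack years")
print(f"12 per pack: {with_12:.2f} pack years")
```

Unless the pack size travels with the value, two systems exchanging the bare number would disagree about the underlying exposure by a constant factor, which is the systematic bias referred to above.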

3.2 Alcohol Use Summary

Our second risk factor under scrutiny is alcohol use. Figure 3-4 depicts the graphical representation of the Alcohol Use Summary Model, as provided by the openEHR template. Table 3-14 summarizes the data elements and value restrictions used in the clinical model. As before, SNOMED CT terms were used when possible for encoding the value restrictions.

Figure 3-4 Graphical view of the OpenEHR Alcohol Use Template

| Data Element | Value | #Pattern ID |
|---|---|---|
| Status | 219006 current drinker of alcohol (finding) | I_CS_PT |
| | 82581004 ex-drinker (finding) | PH_CS_PT |
| | 228274009 lifetime non-drinker (finding) | NPH_CS_PT |
| Form | 160589000 beer drinker (finding) | I_CS_PT |
| | 160591008 drinks wine (finding) | I_CS_PT |
| Typical Alcohol Consumption | xml:double /d | OB_CS_PT |
| | 160576006 moderate drinker - 3-6u/day (finding) | I_CS_PT |
| | 86933000 heavy drinker (finding) | I_CS_PT |
| | 228277002 light drinker (finding) | I_CS_PT |
| Pattern of Use | 228318004 regular drinker (finding) | I_CS_PT |
| | 228310006 drinks in morning to get rid of hangover (finding) | I_CS_PT |
| Binge Drinking Pattern | None | I_NCS_PT |
| | Less than once per month | I_CS_PT |
| | Monthly | I_CS_PT |
| | Weekly | I_CS_PT |
| | Daily | I_CS_PT |
| Date commenced | xml:Date | I_CS_PT |
| Date Ceased | xml:Date | I_CS_PT |
| Standard Drink Definition | xml:double /mg | OB_CS_PT |

Table 3-14 Data Elements and Value Restrictions of the Alcohol Use Model


As in the previous models, we have re-named the SNOMED CT concepts used by the model taking

into consideration their underlying model of meaning. We have not found SNOMED CT concepts for

encoding all the model values. They can be recognised in Table 3-14 because no codes are provided.

Table 3-15 depicts the result of the re-naming process.

| SNOMED CT code & FSN | Renaming suggestion |
|---|---|
| 219006 current drinker of alcohol (finding) | alcohol drinking situation |
| 82581004 ex-drinker (finding) | past history alcohol drinking situation |
| 228274009 lifetime non-drinker (finding) | no past history alcohol drinking situation |
| 160589000 beer drinker (finding) | beer alcohol drinking situation |
| 160591008 drinks wine (finding) | wine alcohol drinking situation |
| 160576006 moderate drinker - 3-6u/day (finding) | moderate alcohol drinking situation |
| 86933000 heavy drinker (finding) | heavy alcohol drinking situation |
| 228277002 light drinker (finding) | light alcohol drinking situation |
| 228318004 regular drinker (finding) | regular alcohol drinking situation |
| 228310006 drinks in morning to get rid of hangover (finding) | morning alcohol drinking situation follows alcohol drinking hangover situation |
| 228315001 binge drinker (finding) | heavy episodic drinking situation |

Table 3-15 Re-naming of the SNOMED CT concepts used in the alcohol use model

In addition, Figure 3-5 shows the original SNOMED CT hierarchy of the above concepts and how it would look after re-modelling, taking into consideration the two aspects described above: misleading term names and an incomplete or faulty underlying model of meaning.

Figure 3-5 Re-modelling of the original Alcohol use hierarchy

Finally, Table 3-16 shows the triple-based representation after the application of the patterns to the

alcohol use clinical model.

As already noted for tobacco use, the interoperability of data on alcohol consumption is affected by a series of issues:

- Ill-defined concepts, like "one drink", "glass", "less than once per month", etc.
- Great variety of alcohol content, even among similar products of the same kind such as beer (compare Swedish beer with Belgian beer).
- The question of whether to define references such as "glass per day" as a unit of measurement, or to count drinking events of a certain type ("glass of wine") and then refer to their number per time unit.
- Different process patterns, such as regular consumption, binge consumption, morning consumption, etc., which are, per se, ill-defined, but which could be introduced into the terminology at a relatively abstract level and then used for post-coordination with a broad range of different behaviours, in a similar way as recommended for the tobacco use cases.
- Like data on tobacco use, alcohol consumption data are not results of observations but of questions asked to the patient or to relatives, and are therefore subject to considerable bias compared with observable entities in a strict sense.
- Interpretations that are the result of an assessment, such as "moderate" or "heavy" drinker, cannot be formalized.
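The first two issues in the list can be illustrated numerically: the same count of "drinks per day" corresponds to different alcohol intakes depending on how a standard drink is defined locally. In the Python sketch below, the function name `grams_per_day` and both standard drink definitions (8 g and 14 g of pure alcohol) are hypothetical values chosen for illustration, and grams are used as the unit by assumption.

```python
def grams_per_day(drinks_per_day: float, grams_per_standard_drink: float) -> float:
    """Convert a count of standard drinks per day into grams of alcohol per day."""
    return drinks_per_day * grams_per_standard_drink


# Two sources report the same value, "3 drinks per day", but use different
# (hypothetical) standard drink definitions.
source_a = grams_per_day(3, grams_per_standard_drink=8.0)
source_b = grams_per_day(3, grams_per_standard_drink=14.0)

print(f"source A: {source_a} g/day")
print(f"source B: {source_b} g/day")
```

This is one reason why the model carries a Standard Drink Definition data element: without it, the reported counts cannot be compared across sources.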

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Status / current drinker of alcohol (finding) | shn:InformationItem 'describes situation' sct:AlcoholDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Status / ex-drinker (finding) | shn:InformationItem 'describes situation' sct:AlcoholDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has attribute' sct:InThePast | #S4 |
| Status / lifetime non-drinker (finding) | shn:InformationItem 'describes situation' sct:AlcoholDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has attribute' sct:InThePast | #S4 |
| | shn:InformationItem 'has attribute' sct:Absent | #S5 |
| Form / beer drinker (finding) | shn:InformationItem 'describes situation' sct:BeerAlcoholDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Form / drinks wine (finding) | shn:InformationItem 'describes situation' sct:WineAlcoholDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Typical Alcohol Consumption / xml:double /d | shn:ObservationResult 'describes quality' shn:MassIntake | #O1 |
| | shn:MassIntake 'is quality of' sct:AlcoholDrinkingSituation | #O2 |
| | shn:ObservationResult 'has observed value' btl:ValueRegion | #O3 |
| | btl:ValueRegion 'has value' some xml:double | #O4 |
| | btl:ValueRegion 'has units' sct:PerDay | #O5 |
| Pattern of Use / regular drinker (finding) | shn:InformationItem 'describes situation' sct:RegularAlcoholDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Pattern of Use / drinks in morning to get rid of hangover (finding) | shn:InformationItem 'describes situation' sct:MorningAlcoholDrinkingSituation | #S1 |
| | sct:MorningAlcoholDrinkingSituation 'follows' sct:AlcoholDrinkingHangoverSituation | #C4 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Binge Drinking Pattern / None | shn:InformationItem 'describes situation' sct:HeavyEpisodicDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has attribute' sct:Absent | #S4 |
| Binge Drinking Pattern / Less than once per month | shn:InformationItem 'describes situation' sct:HeavyEpisodicDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Binge Drinking Pattern / Monthly | shn:InformationItem 'describes situation' sct:HeavyEpisodicDrinkingSituation | #S1 |
| | shn:InformationItem 'describes situation' sct:MonthlyDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Binge Drinking Pattern / Weekly | shn:InformationItem 'describes situation' sct:HeavyEpisodicDrinkingSituation | #S1 |
| | shn:InformationItem 'describes situation' sct:WeeklyDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Binge Drinking Pattern / Daily | shn:InformationItem 'describes situation' sct:HeavyEpisodicDrinkingSituation | #S1 |
| | shn:InformationItem 'describes situation' sct:DailyDrinkingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Date commenced / xml:Date | shn:InformationItem 'describes situation' sct:AlcoholDrinkingSituation | #S1 |
| | sct:AlcoholDrinkingSituation 'has start time' xml:DateTime | #C2 |
| Date Ceased / xml:Date | shn:InformationItem 'describes situation' sct:AlcoholDrinkingSituation | #S1 |
| | sct:AlcoholDrinkingSituation 'has end time' xml:DateTime | #C2 |

Table 3-16 Pattern-based representation of the Alcohol use model


4 Dealing with heterogeneous tobacco use representations

4.1 Objectives

In Chapter 2 we introduced the subset of top-level semantic patterns that we then applied in Chapter 3 to model tobacco and alcohol use information. There we focused on the semantic representation of this information for the models produced as an extension of the heart failure summary. However, it is a known issue that clinical records do not provide information detailed enough to be used directly by public health systems [13].

In Chapter 1 we mentioned that we focus on EHRs or, more exactly, on a highly standardised summary (the HFS) within an EHR as a source of information for public health purposes, although we were aware that this is not the only possible source, and likely not the primary one. However, data sources specifically produced for public health purposes would probably look very similar. This may justify our reference to the HFS for the sake of simplicity, given that SemanticHealthNet, in its current phase, targets exemplars for feasibility testing and not yet ready-to-use artefacts.

Within this framework, we present an interoperability use case in which the EHR information about tobacco and alcohol use has been produced within different contexts and to meet different requirements. In addition, each dataset has been recorded using a different EHR standard (openEHR, HL7 CDA and HL7 v3), and all of them refer to SNOMED CT as terminology.

Our goal here is to demonstrate that (1) data from heterogeneous models can be queried homogeneously, and (2) data can be retrieved together with their context, which helps public health systems to interpret them correctly.

The three models are heterogeneous in the sense that they are syntactically different and may also provide different levels of detail. However, iso-semantic parts can be identified across them.

To achieve these two goals we use semantic patterns as a bridge between the three clinical models and the proposed ontological framework. The patterns also guide the mapping between clinical models and ontologies. In this way the information is represented semantically according to the proposed ontologies, and homogeneous queries can be performed across the different datasets.

4.2 Description of the models

The first model is part of the heart failure summary and is the one described in Chapter 3. It collects detailed information about tobacco consumption, obtained from different sources, and is considered an extension of the HFS aimed at the investigation of tobacco use in heart failure patients by public health systems. It has been represented following the openEHR specification.

Each of the models described in this section could have been represented according to a different EHR standard (e.g. ISO 13606) or EHR modelling approach (SIAMM). However, we do not focus on syntactic or structural aspects of representation but on formalizing the meaning of the information represented by clinical models, and on demonstrating that this formalization enables the homogeneous query of heterogeneous datasets.

[13] http://www.j-biomed-inform.com/article/S1532-0464%2807%2900060-3/abstract

The second model has been represented according to HL7 CDA and follows one of the templates defined as part of the Consolidated CDA (C-CDA) [14] solution, which provides a library of reusable CDA templates. The template comprises the data elements and vocabulary requirements needed to meet the EHR Certification Criteria in support of the U.S. Meaningful Use Stage 2 [15] and may be extended to cover additional information requirements. This CDA model is therefore very generic and only records a patient's smoking status within the social history section of the patient record. We do not consider any extension here.

The third model is part of a Detailed Clinical Model (DCM) used to encode information about the patient's smoking assessment and cessation process, and has been represented using HL7 v3. The smoking assessment and cessation process consists of the following seven steps: (i) Advice to stop smoking; (ii) Draft smoke profile; (iii) Increase motivation; (iv) Barriers inventory; (v) Patient education and advice; (vi) Make an appointment to stop; and (vii) After care. Here we show a snapshot of the model that records the smoking profile of the patient, which corresponds to the second step of the cessation workflow.

Table 4-1 shows an excerpt of some data elements and terminology value requirements for the DCM

and Meaningful Use models. The OpenEHR Tobacco Use data elements and value restrictions were

described in Table 3-1 in Chapter 3.

| Model | Data Element | Value |
|---|---|---|
| Tobacco Detailed Use (DCM) | Number per day: 365982000 finding of tobacco smoking consumption (finding) | 259032004 Quantity and units per day |
| Tobacco Detailed Use (DCM) | type of tobacco use: 77176002 smoker (finding) | <<77176002 smoker (finding) |
| Meaningful Use (C-CDA) | Smoking status | 449868002 Current every day smoker; 428041000124106 Current some day smoker; 8517006 Former smoker; 266919005 Never smoker; 428071000124103 Heavy Tobacco Smoker, etc. |

Table 4-1 Data elements and values (SNOMED CT) of an excerpt of DCM and C-CDA tobacco models

The Meaningful Use model provides only one data element for recording the tobacco smoking status.

The status value is constrained to a set of SNOMED CT codes to meet the certification criteria in sup-

port of Meaningful Use Stage 2 (e.g. Current every day smoker).

For the DCM we have represented the amount of tobacco consumed per day and the type of tobacco

used. Figure 4-1 and Figure 4-2 depict an excerpt of each clinical model represented in XML.

From the above we can state that it is not possible to impose a single model representation across diverse clinical communities (e.g. public health vs. primary care vs. specialised care) and clinical practices, and that the requirements will dictate the level of information detail needed. As we can see, each of the models provides a different level of information granularity, apart from being represented according to a different EHR standard. Table 4-2 shows a summary of the tobacco data item names of each model that will be used in the interoperability use case.

[14] "HL7 IG for CDAR2: IHE Health Story Consolidation, R1", Consolidated CDA (C-CDA): http://www.hl7.org/implement/standards/ (last accessed Jan. 2014)

[15] US Meaningful Use Stage 2: http://www.cms.gov/Regulations-and-Guidance/Legislation/EHRIncentivePrograms/Downloads/Stage2_Guide_EPs_9_23_13.pdf (last accessed Jan. 2014)

Given these limits, the immediate question is what degree of semantic interoperability we can offer, that is, to what degree we can make the above models semantically interoperable. At the beginning of this chapter we stated two goals: (i) to demonstrate that data from heterogeneous models can be queried homogeneously, and (ii) that data can be retrieved together with their context, which helps public health systems to interpret them correctly.

Assuming that we have provided the means for recording the contextual information previously described (e.g. information provider, site of care, etc.; cf. Section 4.1) and that it can be retrieved with the answer to each query, the next step is to demonstrate that data rendered according to each model can be queried homogeneously, and that we get an answer even when each model provides information at a different level of granularity.

| Tobacco Use (openEHR) | Meaningful Use (C-CDA) | Tobacco Detailed Use (DCM) |
|---|---|---|
| Status | Status | Number per day |
| Form | | Type of tobacco use |
| Typical Smoked Amount | | |
| Pattern of Use | | |
| Date Ceased | | |
| Pack Years | | Pack years |

Status Table 4-2 Summary of the data items of each of the three clinical models used for the interoperability use case


Figure 4-1 Excerpt of the DCM expressed in HL7 v3

<component typeCode="COMP">
  <!-- Smoking profile -->
  <organizer moodCode="EVN" classCode="CATEGORY">
    <templateId root="6619D143-F556-41ed-89D5-B74EAF1149C0" />
    <code code="365981007" displayName="finding of tobacco smoking behavior (finding)"
          codeSystemName="SCT" codeSystem="2.16.840.1.113883.6.96" />
    <component typeCode="COMP">
      <!-- Start smoking -->
      <observation moodCode="EVN" classCode="OBS">
        <templateId root="EB2B71AA-4125-4413-8D64-3AB2663217F1" />
        <code code="266929003" displayName="smoking started (finding)"
              codeSystemName="SCT" codeSystem="2.16.840.1.113883.6.96" />
        <value xsi:type="TS" value="" />
      </observation>
    </component>
    <component typeCode="COMP">
      <!-- Number per day -->
      <observation moodCode="EVN" classCode="OBS">
        <templateId root="C16CF761-C456-44a3-A8C1-CD90D1AA3152" />
        <code code="365982000" displayName="tobacco smoking consumption - finding"
              codeSystemName="SCT" codeSystem="2.16.840.1.113883.6.96" />
        <value xsi:type="PQ" unit="" value="" />
      </observation>
    </component>
    <component typeCode="COMP">
      <!-- Pack years -->
      <observation moodCode="EVN" classCode="OBS">
        <templateId root="F3434D44-7D7A-4f24-BA22-EFA0F604CE55" />
        <code code="8664-5" displayName="Cigarettes smoked total (pack per year) - Reported"
              codeSystemName="LOINC" codeSystem="2.16.840.1.113883.6.1" />
        <value xsi:type="PQ" unit="y" value="" />
      </observation>
    </component>
    <component typeCode="COMP">
      <!-- type of tobacco use -->
      <observation moodCode="EVN" classCode="OBS">
        <templateId root="613E318E-5796-4a98-A9F1-556D61029882" />
        <code code="77176002" displayName="smoker (finding)"
              codeSystemName="SCT" codeSystem="2.16.840.1.113883.6.96" />
        <value xsi:type="CD" displayName="" code=""
               codeSystemName="SoortTabakGebruik" codeSystem="" />
      </observation>
    </component>
  </organizer>
</component>

Figure 4-2 Excerpt of the HL7 CDA clinical model

<!-- Smoking status observation -->
<section>
  <!-- Social History Section templateId -->
  <templateId root="2.16.840.1.113883.10.20.22.2.17"/>
  <code code="29762-2" codeSystem="2.16.840.1.113883.6.1"
        displayName="Social history"/>
  <title>Social History</title>
  <text>…</text>
  <entry>
    <observation classCode="OBS" moodCode="EVN">
      <templateId root="2.16.840.1.113883.10.20.22.4.78"/>
      <code code="ASSERTION" codeSystem="2.16.840.1.113883.5.4"/>
      <statusCode code="completed"/>
      <effectiveTime>
        <!-- Date the person began smoking -->
        <low …/>
      </effectiveTime>
      <value xsi:type="CD" code="…" displayName="…" />
    </observation>
  </entry>
</section>


4.3 Mapping of clinical model data to their semantic representation by using semantic patterns as bridge

To query data rendered according to the previous models homogeneously, we have to make their underlying semantics explicit. The following steps lead to the semantic representation of the data: (i) select the semantic pattern to be applied; (ii) apply the pattern to the clinical model, satisfying the value range and cardinality constraints indicated by the pattern; (iii) instantiate the pattern with the concrete patient data; and (iv) transform the pattern-based representation of the data into OWL DL or RDF so that it can be queried.
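The four steps can be illustrated with a minimal Python sketch. It is not part of the SHN tooling: the `Pattern` class, the `?situation` placeholder, the `instantiate` and `serialise` helpers and the line-per-triple output are hypothetical simplifications, and `sct:PipeTobaccoSmokingSituation` is a class name assumed by analogy with the cigar and cigarette classes used in this deliverable.

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-in for a semantic pattern: triple templates in
# which "?situation" marks the variable part filled from the clinical model.
@dataclass(frozen=True)
class Pattern:
    name: str
    templates: tuple      # (subject, predicate, object) templates
    allowed: frozenset    # value range constraint for "?situation"

# Step (i): the pattern selected for the "Form" data element.
I_CS_PT = Pattern(
    name="I_CS_PT",
    templates=(
        ("shn:InformationItem", "describes situation", "?situation"),
        ("shn:InformationItem", "results from process", "sct:Evaluation"),
    ),
    allowed=frozenset({
        "sct:CigarTobaccoSmokingSituation",
        "sct:CigaretteTobaccoSmokingSituation",
        "sct:PipeTobaccoSmokingSituation",  # assumed name, by analogy
    }),
)

def instantiate(pattern, situation):
    """Steps (ii) and (iii): enforce the value range, then fill the pattern."""
    if situation not in pattern.allowed:
        raise ValueError(f"{situation} is outside the value range of {pattern.name}")
    return [
        tuple(situation if term == "?situation" else term for term in template)
        for template in pattern.templates
    ]

def serialise(triples):
    """Step (iv), reduced here to one text line per triple."""
    return "\n".join(f"{s} '{p}' {o} ." for s, p, o in triples)

triples = instantiate(I_CS_PT, "sct:CigaretteTobaccoSmokingSituation")
print(serialise(triples))
```

A real implementation would emit OWL DL axioms or RDF rather than plain text lines; the sketch only shows where the value range check and the instantiation fit into the sequence.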

Next, we demonstrate the first three steps for an excerpt of the tobacco use openEHR archetype. The fourth step, together with example queries, is demonstrated in the following section. Figure 4-4 shows the ADL representation of the archetype on the left and, on the right, the semantic patterns selected for the mapping, represented as tables. The archetype includes two data elements: smoking status (ELEMENT [at0002]) and smoking form (ELEMENT [at0028]). Three different patterns, each shown in a different table colour, have been used in the mapping process: the information about clinical situation pattern (I_CS_PT), the no past history of clinical situation pattern (NPH_CS_PT), and the clinical process pattern (CP_PT).

The selection of the semantic pattern could be supported by keywords such as “past situation”, “absence”, “observation result”, “test result”, “assessment” or “symptom”, together with the semantic category of each of the SNOMED CT terms used as values within the archetype. A suitable tool that implements the principles explained in Chapter 2 for each of the top-level patterns should assist the modeller in the selection process.

To map the smoking status data element (ELEMENT [at0002]) to a semantic pattern, its possible values (i.e. Current smoker and Never smoked) have to be considered. Current smoker, or tobacco smoking situation, corresponds to a clinical entity, whereas Never smoked corresponds to an information entity that refers to the concept Tobacco smoking situation. Each value of this data element thus corresponds to a different semantic category in the ontology (clinical entity vs. information entity), and therefore the same semantic pattern cannot be applied to both. This distinction is important for providing semantic interoperability when different modelling possibilities coexist (cf. Figure 4-3).

We assume that modelling cases such as the one shown in this figure will always exist, since they can hardly be avoided when proprietary database schemas or clinical models adapted to particular contexts are used. Therefore, a different pattern is used to represent each value, and the mapping is done at the “value level”. The two tabular pattern representations show how the mapping is done. The cell shaded in red shows the part of the pattern that is filled in with the value of the archetype data element.
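A mapping at the “value level” can be sketched in Python as a lookup from each recorded value to its own pattern. The dictionaries and the `map_status` helper are hypothetical; the triple shape used for NPH_CS_PT is an assumption modelled on the “lifetime non-drinker” row of Table 3-16 and not taken from a tobacco-specific table.

```python
# Hypothetical value-level mapping for the smoking status element (at0002).
# Each value selects its own pattern because the values belong to different
# semantic categories (clinical entity vs. information entity).
VALUE_TO_PATTERN = {
    "Current smoker": "I_CS_PT",
    "Never smoked": "NPH_CS_PT",
}

SITUATION = "sct:TobaccoSmokingSituation"

# Triple shapes per pattern; the NPH_CS_PT shape is an assumption modelled on
# the "lifetime non-drinker" row of Table 3-16.
PATTERN_TRIPLES = {
    "I_CS_PT": [
        ("shn:InformationItem", "describes situation", SITUATION),
        ("shn:InformationItem", "results from process", "sct:Evaluation"),
    ],
    "NPH_CS_PT": [
        ("shn:InformationItem", "describes situation", SITUATION),
        ("shn:InformationItem", "results from process", "sct:Evaluation"),
        ("shn:InformationItem", "has attribute", "sct:InThePast"),
        ("shn:InformationItem", "has attribute", "sct:Absent"),
    ],
}

def map_status(value):
    """Return (pattern name, triples) for one recorded status value."""
    if value not in VALUE_TO_PATTERN:
        raise ValueError(f"no pattern registered for value {value!r}")
    pattern = VALUE_TO_PATTERN[value]
    return pattern, PATTERN_TRIPLES[pattern]

for value in VALUE_TO_PATTERN:
    pattern, triples = map_status(value)
    print(value, "->", pattern, f"({len(triples)} triples)")
```

Keeping the lookup per value rather than per data element is what allows one archetype element to feed two different patterns.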


Figure 4-3 Two modelling options for past history of tobacco smoking situation

The smoking form data element (ELEMENT [at0028]) has three possible values (cigar, cigarette and pipe). In this case all three belong to the same semantic category (clinical entity), and the mapping to the semantic pattern can be done at the level of the data element. The pattern applied is I_CS_PT, and its variable part has been filled with an OR expression of the three values. When the model is instantiated, only one of them will be selected, as stated by the cardinality of the triple predicate (1).

Finally, the clinical process pattern (CP_PT) has been used to represent the contextual information related to the process of acquiring the smoking status and the related data. The root data element (EVALUATION [at0000]) has been mapped to this pattern, which the other applied patterns use by composition, as the dotted arrows depict. In this case we have indicated that the process used to acquire the information was history taking; the information provider, subject of information and site of care have to be instantiated when the model is filled with patient clinical data.


Figure 4-4 shows an excerpt of the mapping of an archetype to a set of semantic patterns.

Figure 4-4 Mapping the OpenEHR archetype with a set of semantic patterns


Having shown an example of “mapping” between a clinical model (and its instantiation) and a set of semantic patterns, we now provide the final pattern-based representation of some data exemplars from each of the three models. As clinical data exemplars we represent:

- openEHR: Patient X has a past history of heavy cigarette smoking; he typically smoked 10 cigarettes per day and stopped in March 2010 (part of the Heart Failure Summary record, obtained by asking the patient).

- HL7 CDA: Patient Y is a heavy cigarette smoker. According to Meaningful Use this means >=10 /day (part of the social history record of the patient).

- DCM in HL7 v3: Patient Z is a cigar smoker and typically smokes 15 cigars per day (part of the profile of the smoker recorded in a smoking cessation service).

The corresponding data element / value pairs (cf. Table 3-1 and Table 4-1), expressed with SNOMED CT terms, are:

- openEHR: Smoking status / ex-smoker (finding); Form / cigarette smoking tobacco (substance); Typical smoked amount / 10 per day; Date ceased / March 2010.

- HL7 CDA: Smoking status / heavy cigarette smoker (finding).

- HL7 v3: finding of tobacco smoking consumption (finding) / 15 per day; type of tobacco use / cigar smoker (finding).

As already mentioned, for effective public health use the data have to be retrieved together with their context. The following data items specify the context:

- The process used to acquire the information (e.g. history taking, observation assessment, etc.).

- The provider of the information: the agent who provided the information. This is usually the patient or the clinician, but may be someone else, or a software application or device.

- The subject of information: the person that is the subject of the information (e.g. patient, relative, etc.).

- The incentive: the incentive for acquiring the information (e.g. screening program for a certain age band).

- The location: the service location where the information was acquired (e.g. outpatient clinic, online platform, cessation clinic, etc.).
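As an illustration only, the context items listed above could be carried alongside each data instance in a small record type. The `AcquisitionContext` class and its `missing` helper are hypothetical names introduced for this sketch; the example values for Patient X are taken from the exemplar description ("obtained by asking the patient"), and items that the exemplar does not state are left unset rather than guessed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

@dataclass(frozen=True)
class AcquisitionContext:
    """Hypothetical container for the context items listed above."""
    process: Optional[str] = None    # e.g. history taking
    provider: Optional[str] = None   # e.g. patient, clinician, device
    subject: Optional[str] = None    # e.g. patient, relative
    incentive: Optional[str] = None  # e.g. screening program
    location: Optional[str] = None   # e.g. outpatient clinic

    def missing(self) -> List[str]:
        """Names of the context items that have not been recorded."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

# Patient X: information obtained by asking the patient (history taking).
ctx_x = AcquisitionContext(process="history taking",
                           provider="patient",
                           subject="patient")
print(ctx_x.missing())
```

Reporting the unset items explicitly lets a consumer of the data see which parts of the context are unknown instead of silently assuming defaults.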

The above list is not closed: additional contextual information, such as degree of trust or source of evidence used, might be required to support public health. Figure 4-4 and Chapter 2 (Table 2-12) provide an example of recording contextual information following the clinical process pattern (CP_PT).

Finally, Table 4-3, Table 4-4 and Table 4-5 depict the result of the application of the semantic top-

level patterns from Chapter 2 in order to represent the openEHR, HL7 CDA and HL7v3 DCM conform-

ing clinical data respectively.

The left and right columns of each table show the correspondences between the model data ele-

ments / value pairs and the pattern triples.

In the openEHR case, the smoking status and the form are both mapped to the I_CS_PT pattern, since the smoking status refers to a patient smoking situation and the form is part of the situation class definition, refining it. The typical amount smoked is mapped to the OB_CS_PT pattern, since it is an assessment result. The same table provides the resulting triples. Triples with minimum cardinality one are mapped to the model (e.g. shn:InformationItem 'describes situation' shn:ClinicalSituation). Value constraints have been applied to restrict the object part of the triple (e.g. shn:InformationItem 'describes situation' shn:ClinicalSituation) to the specific clinical situation (sct:TobaccoSmokingSituation).

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Smoking Status / ex-smoker (finding) | shn:InformationItem 'describes situation' sct:TobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| | shn:InformationItem 'has temporal context' sct:InThePast | #S4 |
| Form / cigarette smoking tobacco (substance) | shn:InformationItem 'describes situation' sct:CigaretteTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Typical smoked amount / 10 cigarettes / day | shn:ObservationResult 'describes quality' shn:MassIntake | #O1 |
| | shn:MassIntake 'is quality of' sct:CigaretteTobaccoSmokingSituation | #O2 |
| | shn:ObservationResult 'has observed value' btl:ValueRegion | #O3 |
| | btl:ValueRegion 'has value' 10 | #O4 |
| | btl:ValueRegion 'has units' sct:PerDay | #O5 |
| Date Ceased / xml:Date | shn:InformationItem 'describes situation' sct:TobaccoSmokingSituation | #S1 |
| | sct:TobaccoSmokingSituation 'has end time' "2010-03-01" | #C2 |

Table 4-3 OpenEHR: “Smoker, cigarette smoker, 10 cigarettes per day”; Correspondences and Pattern triples

Table 4-4 depicts the result of specialising the top-level content patterns and the correspondences

with regards to the HL7 CDA data. The smoking status, as in the openEHR case, is mapped to the in-

formation about clinical situation pattern. The HL7 CDA model defines heavy smoker as at least 10

cigarettes / day. However, the definition is particular to this HL7 implementation and might vary

across institutions or depend on research study purposes.

| Data Element / Value | Triple representation | #N |
|---|---|---|
| Smoking Status / Heavy Cigarette Tobacco Smoker (finding) | shn:InformationItem 'describes situation' sct:HeavyCigaretteSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |

Table 4-4 HL7 CDA “Heavy cigarette tobacco smoker (>=10)”; Correspondences and Pattern triples

Finally, Table 4-5 depicts the triple-based representation of the HL7 v3 DCM clinical data. The patterns I_CS_PT and OB_CS_PT have been used to represent the type of tobacco used and the number of cigars per day, respectively.

| Data Element / Value | Triple representation | #N |
|---|---|---|
| type of tobacco use / cigar smoker (finding) | shn:InformationItem 'describes situation' sct:CigarTobaccoSmokingSituation | #S1 |
| | shn:InformationItem 'results from process' sct:Evaluation | #S2 |
| Number per day / 15 cigar / day | shn:ObservationResult 'describes quality' shn:MassIntake | #O1 |
| | shn:MassIntake 'is quality of' sct:CigarTobaccoSmokingSituation | #O2 |
| | shn:ObservationResult 'has observed value' btl:ValueRegion | #O3 |
| | btl:ValueRegion 'has value' 15 | #O4 |
| | btl:ValueRegion 'has units' sct:PerDay | #O5 |

Table 4-5 HL7v3 DCM “Cigar tobacco smoker and smokes 15 cigar per day”; Correspondences and Pattern triples


4.4 Homogeneous query of data from the three clinical models: Tobacco Use (openEHR), Meaningful Use (HL7 C-CDA) and Tobacco Detailed Use (DCM- HL7v3)

Once the patient data obtained from the three heterogeneous systems have been mapped into their semantic pattern-based representation (cf. Table 4-3, Table 4-4 and Table 4-5), we can obtain their OWL DL representation by applying the correspondences provided by Table 2-15 and Table 2-16. This allows the data to be queried homogeneously and DL reasoning to be used.

Next, we formulate a set of queries that could be of interest to a public health system that aggregates data about patients diagnosed with heart failure in order to assess the influence of tobacco smoking on the development of heart failure.

(This use case is fictitious and it only aims at demonstrating the value of the approach proposed with regards to the exploitation of EHR information for public health purposes.)

Question exemplars: How many patients diagnosed with heart failure …

(Q1) … are currently heavy smokers (>10 / day)

(Q2) … quit smoking within the last 20 years

(Q3) … never used tobacco in any of its forms (smoke, snuff, etc.)

Table 4-6 depicts the DL representations of the above questions.

The first question (Q1) should retrieve patients Y and Z. Patient Z's data state explicitly that the patient smokes “15 cigar / day”; patient Y's data state that the patient is a heavy cigarette smoker. The term is used according to Meaningful Use, meaning “>=10 /day” (here >= 10 cigarettes / day). The query does not specify the smoking form, only that tobacco is smoked. Both patients are therefore taken to smoke more than 10 / day (cigarettes or cigars), and both information instances will be retrieved together with their contextual information.

The second question (Q2) should retrieve patient X, since he has a past smoking history and quit in March 2010 (within the last twenty years). The question asks for all past tobacco smokers, independently of the form, whose smoking end date is on or after March 1994 (within the last twenty years).
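The expected answers to Q1 and Q2 can be reproduced with a small Python sketch that mimics subsumption reasoning with a hand-written class hierarchy. Everything here is an illustrative assumption rather than the project's implementation: the `PARENT` hierarchy, the flattened record layout and the `q1`/`q2` helpers are hypothetical, and classifying a record as a heavy smoking situation is treated as satisfying Q1, mirroring the reading given in the text.

```python
from datetime import date

# Hypothetical subsumption hierarchy (child -> parent); class names follow the
# pattern tables of this chapter.
PARENT = {
    "sct:CigaretteTobaccoSmokingSituation": "sct:TobaccoSmokingSituation",
    "sct:CigarTobaccoSmokingSituation": "sct:TobaccoSmokingSituation",
    "sct:HeavyCigaretteSmokingSituation": "sct:CigaretteTobaccoSmokingSituation",
}
TOBACCO = "sct:TobaccoSmokingSituation"
HEAVY = "sct:HeavyCigaretteSmokingSituation"

def is_a(cls, ancestor):
    """True if cls equals ancestor or is one of its descendants."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False

# Flattened view of the three exemplars after their pattern-based mapping.
RECORDS = [
    {"patient": "X", "source": "openEHR",
     "situation": "sct:CigaretteTobaccoSmokingSituation",
     "per_day": 10, "in_the_past": True, "end": date(2010, 3, 1)},
    {"patient": "Y", "source": "HL7 CDA",
     "situation": HEAVY,
     "per_day": None, "in_the_past": False, "end": None},
    {"patient": "Z", "source": "HL7 v3 DCM",
     "situation": "sct:CigarTobaccoSmokingSituation",
     "per_day": 15, "in_the_past": False, "end": None},
]

def q1(records, threshold=10):
    """Current heavy smokers: classified as heavy, or amount above threshold."""
    return sorted(
        r["patient"] for r in records
        if is_a(r["situation"], TOBACCO)
        and not r["in_the_past"]
        and (is_a(r["situation"], HEAVY)
             or (r["per_day"] is not None and r["per_day"] > threshold))
    )

def q2(records, since=date(1994, 3, 1)):
    """Past smokers of any tobacco form who stopped on or after `since`."""
    return sorted(
        r["patient"] for r in records
        if is_a(r["situation"], TOBACCO)
        and r["in_the_past"]
        and r["end"] is not None and r["end"] >= since
    )

print("Q1:", q1(RECORDS))
print("Q2:", q2(RECORDS))
```

Because both queries test against the general tobacco smoking class, records of different granularity (a coded status only, or a coded form plus an amount) are matched by the same condition.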

Finally, the third query (Q3) will not retrieve any patient: two of them are current smokers, and the other one smoked in the past. We consider that if it has not been explicitly recorded that a patient never smoked, we cannot infer it (absence of information does not mean negation). Ontologies, in contrast to traditional databases, where complete information is assumed, assume incomplete information, and this affects how data can be queried: a statement that has not been explicitly negated is not considered false. This has to be considered in the formulation of queries, which constitutes a barrier for most users. However, based on our present experience, we think that query patterns built on principles similar to the semantic patterns described above could hide that complexity.
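The difference between the two assumptions can be made concrete with a short Python sketch. The statement labels, the two extra patients W and V and both helper functions are hypothetical and only serve to contrast a database-style reading (what is not recorded is false) with the reading adopted here (what is not recorded is unknown).

```python
from typing import Optional

SMOKING = {"past_tobacco_smoking", "current_tobacco_smoking"}
NEVER = "no_history_of_tobacco_smoking"

# Explicit statements per patient. X, Y and Z follow the exemplars of this
# chapter; W and V are hypothetical additions for illustration.
STATEMENTS = {
    "X": {"past_tobacco_smoking"},
    "Y": {"current_tobacco_smoking"},
    "Z": {"current_tobacco_smoking"},
    "W": set(),      # nothing recorded about tobacco
    "V": {NEVER},    # absence explicitly recorded
}

def never_smoked_closed_world(stmts) -> bool:
    """Database-style reading: no smoking statement means never smoked."""
    return not (stmts & SMOKING)

def never_smoked_open_world(stmts) -> Optional[bool]:
    """Three-valued reading: True/False only if recorded, else None (unknown)."""
    if NEVER in stmts:
        return True
    if stmts & SMOKING:
        return False
    return None

closed = sorted(p for p, s in STATEMENTS.items() if never_smoked_closed_world(s))
open_ = sorted(p for p, s in STATEMENTS.items() if never_smoked_open_world(s) is True)
print("closed world:", closed)
print("open world:", open_)
```

Under the closed-world reading patient W would be counted as a never-smoker merely because nothing was recorded, which is exactly the inference the text rules out.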


Q3 explicitly asks for those patients whose records state that they do not have a past history of smoking. We only count those in the result and do not treat absence of information as a result. Where we know that this information can be derived from other statements and that this holds for all patients, a specific data instance about absence has to be created, probably automatically whenever certain data are checked for presence. However, it is possible that the decision whether missing data for a specific patient characteristic can be interpreted as an absent condition can only be made individually. In this case, only mechanisms that make such a decision easier can be provided.

As can be observed in the rendering of the DL queries, they also follow the semantic patterns, in this case after their OWL DL transformation. Q1 follows the observation result semantic pattern, Q2 the past history situation pattern, and Q3 the no past history of clinical situation pattern. All three retrieve the data within context, which is encoded as part of the evaluation procedure (cf. Table 2-12).

#Q1
shn:ObservationResult
  and shn:isAboutQuality only (shn:MassIntake
    and btl:inheresIn some sct:TobaccoSmokingSituation
    and btl:projectsOnto some (btl:ValueRegion
      and btl:isRepresentedBy only (shn:hasInformationAttribute some sct:PerDay
        and shn:hasValue some int[>10])))

#Q2
shn:InformationItem
  and btl:isOutcomeOf some sct:Evaluation
  and shn:isAboutSituation only (btl:BiologicalLife
    and btl:hasPart some (sct:TobaccoSmokingSituation
      and btl:projectsOnto some (btl:PointInTime
        and btl:isRepresentedBy some dateTime[>="1994-03-01T00:00:00Z"])))
  and shn:hasInformationAttribute some sct:InThePast

#Q3
shn:InformationItem
  and shn:isAboutSituation only (btl:BiologicalLife
    and not btl:hasPart some sct:TobaccoSmokingSituation)
  and btl:isOutcomeOf some sct:Evaluation
  and shn:hasInformationAttribute some sct:InThePast

Table 4-6 DL Query examples

We have demonstrated that although data are expressed at different detail levels, they can be ac-

cessed homogeneously thanks to the underlying model of meaning (i.e. ontological framework).

We have formulated the above queries as DL queries to make it easier for the reader to understand what is being queried. However, query languages (QLs) based on DL are computationally expensive and therefore have limited scalability. Query languages based on RDF graphs, such as SPARQL, are more powerful and perform better, but they are agnostic with regard to OWL DL semantics and therefore generally do not allow DL reasoning. Combined solutions exist, however, that perform better while still making partial use of DL reasoning capabilities. In addition, QLs such as SPARQL are more expressive in terms of query functionality, which is closer to that of traditional database QLs such as SQL (ordering, counting, filtering, string matching, etc.).

Table 4-7 shows the rendering of Q2 in SPARQL using Turtle syntax. It uses the count operation to obtain the number of patients retrieved by the query (here: ?count = “2”) instead of each individual data instance, which is what the DL query would return (cf. Table 4-8).


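#Q2
# Standard namespaces. The shn:, btl: and sct: prefixes must additionally be
# declared with the namespaces of the project ontologies.
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

# Count the individuals typed with the anonymous class expression of Q2.
SELECT (COUNT(?s1) AS ?count) WHERE {
  ?s1 a [ a owl:Class ;
    owl:intersectionOf ( shn:InformationItem
      [ a owl:Restriction ;
        owl:onProperty btl:isOutcomeOf ;
        owl:someValuesFrom sct:Evaluation ]
      [ a owl:Restriction ;
        owl:onProperty shn:hasInformationAttribute ;
        owl:someValuesFrom sct:InThePast ]
      [ a owl:Restriction ;
        owl:onProperty shn:isAboutSituation ;
        owl:allValuesFrom [ a owl:Class ;
          owl:intersectionOf ( sct:TobaccoSmokingSituation
            [ a owl:Restriction ;
              owl:onProperty btl:projectsOnto ;
              owl:someValuesFrom [ a owl:Class ;
                owl:intersectionOf ( btl:PointInTime
                  [ a owl:Restriction ;
                    owl:onProperty btl:isRepresentedBy ;
                    owl:someValuesFrom [ a owl:Restriction ;
                      owl:onProperty shn:hasValue ;
                      # Datatype restriction: dateTime values from 1994-03-01 onwards
                      owl:someValuesFrom [ a rdfs:Datatype ;
                        owl:onDatatype xsd:dateTime ;
                        owl:withRestrictions (
                          [ xsd:minInclusive "1994-03-01T00:00:00-00:00"^^xsd:dateTime ] ) ] ] ] ) ] ] ) ] ] ) ] .
}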

Table 4-7 Q2 rendered in SPARQL, including COUNT option
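The difference between retrieving individual data instances and an aggregated count can be illustrated with a minimal Python sketch. It is only a stand-in for the selection logic of Q2 and performs no DL or SPARQL evaluation; the record layout, the field names and the patients are hypothetical and are not taken from the project data.

```python
from datetime import datetime, timezone

# Lower bound used in Q2 (inclusive).
THRESHOLD = datetime(1994, 3, 1, tzinfo=timezone.utc)

# Hypothetical, flattened stand-ins for information items about a patient's past.
records = [
    {"patient": "Patient_X", "situation": "TobaccoSmokingSituation",
     "attribute": "InThePast",
     "point_in_time": datetime(1994, 3, 1, tzinfo=timezone.utc)},
    {"patient": "Patient_Y", "situation": "TobaccoSmokingSituation",
     "attribute": "InThePast",
     "point_in_time": datetime(2001, 6, 15, tzinfo=timezone.utc)},
    {"patient": "Patient_Z", "situation": "TobaccoSmokingSituation",
     "attribute": "InThePast",
     "point_in_time": datetime(1987, 1, 10, tzinfo=timezone.utc)},
]


def matches_q2(record: dict) -> bool:
    """Past tobacco smoking situation with a point in time on or after the threshold."""
    return (
        record["situation"] == "TobaccoSmokingSituation"
        and record["attribute"] == "InThePast"
        and record["point_in_time"] >= THRESHOLD
    )


# Instance-level answer: every matching record is returned.
matching = [r for r in records if matches_q2(r)]

# Aggregated answer: only the number of distinct patients is returned.
patient_count = len({r["patient"] for r in matching})

print([r["patient"] for r in matching])  # ['Patient_X', 'Patient_Y']
print(patient_count)                     # 2
```

The instance-level list corresponds to what a DL query returns, whereas the aggregated number corresponds to the ?count variable of the SPARQL rendering.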

Finally, Table 4-8 shows an example data instance retrieved as an answer to Q2:

Individual: Instance_Tobacco_Smoking_Situation_Patient
  Types:
    shn:InformationItem
      and shn:isAboutSituation only
        (btl:BiologicalLife
          and btl:hasPart some
            (sct:TobaccoSmokingSituation
              and btl:projectsOnto some
                (shn:EndPointInTime
                  and btl:isRepresentedBy value "1994-03-01T00:00:00Z"^^dateTime)))
      and btl:isOutcomeOf value Instance_Evaluation_Process
      and shn:hasInformationAttribute some sct:InThePast

Individual: Instance_Evaluation_Process
  Types:
    sct:HistoryTaking
      and shn:informationProvider value Instance_Patient_X
      and shn:informationSubject value Instance_Patient_X
      and shn:siteOfCare value Instance_Outpatient_Consultation_01 …

Table 4-8 OWL DL rendering of an example data instance retrieved for Q2. There are two instances: one that represents the tobacco smoking situation and its end point in time, and one that provides some contextual information.


5 Summary and conclusions

In this document (SemanticHealthNet deliverable 4.3), we have addressed the semantic interoperability challenges that arise when EHRs are used for public health data analysis. Our objective has not been to address the interoperability challenges that stem from the use of different EHR standards and representations. We have explained that the granularity, completeness and data quality required for public health inquiries are often not provided by EHRs. Moreover, clinical information mostly has to be retrieved within its context in order to be interpreted safely: it makes a difference whether a piece of information is provided by a clinician, a patient, or a machine. In addition, data acquisition and recording can be highly biased when the data are not considered of primary importance by the clinician, as is often the case with data on recreational drugs such as tobacco and alcohol. All this contextual information is therefore required in order to judge how far the data can be trusted. Smoking and alcohol use data are usually measured only approximately, and data items such as pack years are often used in public health for assessing CVD risks. There are also limits to semantic interoperability when codes such as Ex-smoker are used to refer both to someone who quit last month after smoking for 30 years and to a person who quit 20 years ago after smoking for only two years. Thus, there are limits to comparability and semantic interoperability that stem from fundamental differences in clinical processes and in facts about the world, and that cannot be resolved at the level of information systems. At best, the degree of uncertainty/vagueness can be estimated.
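The limits of a code such as Ex-smoker can be made concrete with a short worked sketch using the common pack-year convention (packs smoked per day multiplied by years smoked, with 20 cigarettes per pack). The two patients and their daily consumption of 20 cigarettes are hypothetical assumptions chosen to mirror the example above; they are not data from the deliverable.

```python
from dataclasses import dataclass

CIGARETTES_PER_PACK = 20  # conventional pack size used in the pack-year definition


@dataclass(frozen=True)
class SmokingHistory:
    cigarettes_per_day: float
    years_smoked: float
    years_since_quitting: float


def pack_years(history: SmokingHistory) -> float:
    """Packs smoked per day multiplied by the number of years smoked."""
    return history.cigarettes_per_day / CIGARETTES_PER_PACK * history.years_smoked


# Two hypothetical patients who would both be coded as "Ex-smoker".
long_term = SmokingHistory(cigarettes_per_day=20, years_smoked=30,
                           years_since_quitting=1 / 12)
short_term = SmokingHistory(cigarettes_per_day=20, years_smoked=2,
                            years_since_quitting=20)

print(pack_years(long_term))   # 30.0
print(pack_years(short_term))  # 2.0
```

Under these assumptions the same code covers a fifteen-fold difference in cumulative exposure, which is the kind of information loss that cannot be repaired after the fact at the level of the information system.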

We have focused on the reuse of data from EHRs for public health purposes, assuming that at least parts of the EHRs are sufficiently standardized and quality-assured. This can be the case with summaries like the HFS. Since the objective of this deliverable is to show how the EHR can be used to satisfy public health goals, we have expanded the Heart Failure Summary (HFS) with two relevant behavioural factors, tobacco and alcohol use. Furthermore, we have described an extension of the underlying semantic architecture and applied it to the public health use case in order to provide a semantic representation of the tobacco and alcohol use data and to demonstrate its benefit for semantic interoperability.

The use of semantic patterns to support and guide the mapping of structured data into their semantic representation, introduced in D4.2, has been expanded and supported by more examples. The selection of the appropriate semantic pattern could be supported by keywords such as “past situation”, “absence”, “observation result”, “test result”, “assessment”, “symptom”, etc., together with the semantic category of each of the SNOMED CT concepts used within the clinical model (i.e. finding, substance, qualifier value, etc.).
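A keyword-based pre-selection of semantic patterns could look like the following minimal sketch. The lookup table is a hypothetical illustration: the pattern names are those used for Q1 to Q3, but the assignment of keywords to patterns is an assumption made for this example and not a mapping defined by the project. The SNOMED CT semantic category, which would further narrow down the candidates, is left out for brevity.

```python
from typing import List

# Hypothetical lookup table: keyword found in the label of a clinical model
# element -> candidate semantic pattern. The assignments are assumptions.
PATTERN_KEYWORDS = {
    "observation result": "observation result",
    "past situation": "past history situation",
    "absence": "no past history of clinical situation",
}


def candidate_patterns(element_label: str) -> List[str]:
    """Return the patterns whose keyword occurs in the label (case-insensitive)."""
    label = element_label.lower()
    return [pattern for keyword, pattern in PATTERN_KEYWORDS.items()
            if keyword in label]


print(candidate_patterns("Past situation: tobacco smoking"))
# ['past history situation']
print(candidate_patterns("Blood pressure"))
# []
```

Such a lookup only proposes candidates; the final choice of pattern would still have to be confirmed by the modeller.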

We have provided evidence that using SNOMED CT as the underlying domain ontology bears a considerable risk that users are misled by the textual description of a concept (e.g. bumetanide substance vs. product). In these cases, a semantic pattern could also help to guide the terminology binding. We have also shown that some SNOMED CT concepts are placed under the wrong semantic category (e.g. representing information, but placed under a clinical entity category), that some are underspecified (e.g. no explicit relationship between substance consumption and the related substance), and that some are not expressive enough for our use case (e.g. there is no concept for an occasional tobacco snuff user, although there is one for an occasional tobacco user).


Given the above, the use of SNOMED CT as domain terminology within the EHR requires some decisions to be made, such as (i) creating value sets to be used across medical specialties; (ii) clarifying the role of the terminology within the EHR by defining what kinds of entities should be represented (information vs. clinical entities); and (iii) adding axioms to hitherto underspecified concepts. To this end, the proposed approach does not aim to forbid the use of complex terms, but to place them correctly in the ontology so that they can be used consistently within information models. This contrasts with other approaches, which are normative and disallow the use of complex terms. The SemanticHealthNet approach is instead descriptive rather than normative, and it is suited to mediating across heterogeneous normative approaches, as we have demonstrated.

Although not perfect, SNOMED CT is increasingly being adopted by countries for their national healthcare systems. It is increasingly grounded on formal ontological principles and is being redesigned according to them. SemanticHealthNet is keeping pace with SNOMED CT’s progress, which, however, should not preclude the use of other medical terminologies. The fact that other terminological standards such as ICD, LOINC and ICNP are in the process of being harmonized with SNOMED CT via a common ontology demonstrates an increasing concern to clarify the meaning of medical terms by formal-ontological grounding, a route similar to that of SemanticHealthNet.

Given the state of the art of semantic technologies, the biggest challenge consists in finding intermediate, scalable solutions that leave the door open to upcoming progress in the field. This might influence the choice between a logic-based and a non-logic-based language, as well as the type of reasoning done, if any, which may depend on the technological state of the art and the semantic interoperability requirements. Moreover, several renderings of a rich representation might be required, each serving a different purpose, e.g. data validation vs. data query.

In the third project year, SemanticHealthNet WP 4 will open its scope to the semantic annotation of formalised clinical practice guidelines. It will also explore the relationships with existing modelling approaches (e.g. CIMI, SIAMM, etc.) and standards such as ContSys (CEN ISO/DIS 13940). The latter defines a system of concepts from an enterprise / clinical perspective, which represents both the content and the context of health care services under a process view and a generic clinical process model (cf. footnote 16). It seems to be complementary to the proposed approach: it pays attention to the modelling of healthcare and clinical processes, considering their workflows, whereas in the SemanticHealthNet approach as developed so far, the sequence of healthcare processes that end up producing the data has not been modelled in detail. Yet there is some overlap, e.g. concepts like “health condition”, “observed condition”, “observed condition value” or “health component specification”, for which we have found a parallel in our use of “clinical situation”, “observation result about a clinical situation quality”, “result of the observation of the clinical situation quality”. We will examine this standard in more depth in order to identify ways of harmonization.

16 http://www.standard.no/Global/PDF/Helse/CEN-TC251_N2013028_N13028_WG1_report_of_the_2nd_Madrid_Work.pdf