XML in Biomedical Informatics Jonathan Borden, M.D. Assistant Professor of Neurosurgery, Tufts University, New England Medical Center, Boston Chair, ASTM E31 Electronic Healthcare Records

XML in Biomedical Informatics

Jonathan Borden, M.D.
Assistant Professor of Neurosurgery, Tufts University, New England Medical Center, Boston
Chair, ASTM E31 Electronic Healthcare Records

The Goal

Answer questions like:

“Of all the patients I operated on for brain tumors between 1996 and 2000, matching severity of pathology and matching clinical status and who have the “P53” mutation, did PCV chemotherapy improve the cure rate at five years?”

Healthcare: The current situation

A disaster:
1.1 trillion $/year in the USA
30-40% overhead
Mostly paper based
Highly proprietary commercial systems
Tens of thousands of Americans die each year due to poor information/errors
Most of the information is rendered useless

Strategies

Define open standards
Capture information in an electronic form
Reduce errors related to information
Define distributed, web enabled, query models

Tactics

XML, schemas, query model
Semantic Web/URI graphs
Data analysis based on actual population rather than small, potentially biased, samples

Google for biomedical information

Why XML?

Widely implemented with excellent open source tools

Life of data is longer than life of application

Data driven, platform independent

Formal schema and query models

Reinventing medical informatics

Get the data format right and the rest will follow

Structured information has been the holy grail of medical informatics for the last 30+ years

XML is the culmination of 30+ years of work in structured information

Time to do something

XML Briefly

Simplification of SGML … markup language for the web

<element> content </element>
<element attribute="value">
  <child-element another="123"/>
</element>
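
As a minimal sketch, the fragment above can be read with any XML parser. Python's standard `xml.etree.ElementTree` is assumed here (the slides do not name a parser), and the element and attribute names are just the placeholders from the slide.

```python
import xml.etree.ElementTree as ET

# The slide's placeholder fragment, written with straight quotes so it parses.
doc = '<element attribute="value"><child-element another="123"/></element>'

root = ET.fromstring(doc)
child = root.find("child-element")

# Element names and attributes are available as plain Python values.
print(root.tag, root.attrib)
print(child.tag, child.get("another"))
```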

ASTM E31.25

XML DTDs for Healthcare
Emphasize Human Readability
Flexibility
Openhealth reference implementation http://www.openhealth.org/ASTM

Compatible with HL7 CDA

ASTM Healthcare DTDs

clinical.header compatible with HL7 CDA

clinical.body specific to document type
  operative.report
  radiology.report
  discharge.summary
  etc.

Healthcare Schema

Healthcare datatypes

<person>
  <person.name>
    <prefix>Ms.</prefix>
    <given>Susan</given>
    <given>Samantha</given>
    <family>Jones</family>
  </person.name>
  <id type="SSN">000-11-2233</id>

Healthcare datatypes

<patient>
  <person.name> … </person.name>
  <id authority="New England Medical Center">000112233</id>
</patient>
<provider>
  <person.name>
    <prefix>Dr.</prefix>
    <given>Amanda</given>
    <family>Smith</family>
  </person.name>
</provider>

Encounter

<encounter>
  <patient>…</patient>
  <provider>…</provider>
  <date.time>…</date.time>
  <location> … </location>
  <encounter.id>…</encounter.id>
</encounter>

Capturing encounters

Encounters are billable units of work
U.S. Govt pays ~50% of the bills
Payors often require associated clinical information prior to paying bill

-This information should be aggregated for statistical purposes-

Leveraging HIPAA: attachments are key!

Collect attachments

Integrating binary formats

MIME <-> XMTP
HL7 V2
X12 EDI
DICOM

Internet Telemedicine

The OceanMed project, 1998
Merchant vessel, e-mail access via satellite gateway
Digital camera
Web based physician access

XMTP

[Diagram labels: Ship, Gateway, SMTP, XMTP (MIME -> XML -> XSLT -> HTML), HTML]

XMTP Consult

36 year old male has itchy rash for 6 days

Hydrocortisone cream 1% to affected area t.i.d.

reply

How it works

Messages arrive in MIME format
MIME SAX parser ‘converts’ to XML by SAX events
XMTP employs XML object model, *not necessarily* serialization format -> grove processing

XMTP

From: [email protected]
To: [email protected]
Content-type: multipart/related; charset=iso-8859-1
---------
startDocument()
  startElement("MIME")
    startElement("From")
      characters("[email protected]")
    endElement("From")
    startElement("Content-Type", attribute("charset", "iso-8859-1"))
      characters("multipart/related")
    endElement("Content-Type")
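
The header-to-event mapping above could be sketched as follows. This is an illustration under stated assumptions, not the XMTP implementation: it uses Python's standard `email` parser in place of a MIME SAX parser, the function name `mime_to_events` and the `example.org` addresses are hypothetical, and a `text/plain` message is used to keep the body handling simple.

```python
from email import message_from_string

def mime_to_events(raw):
    """Return SAX-style event tuples for a simple MIME message (hypothetical helper)."""
    msg = message_from_string(raw)
    events = [("startDocument",), ("startElement", "MIME", {})]
    for name, value in msg.items():
        # "text/plain; charset=iso-8859-1" -> text "text/plain", attribute charset
        main, *params = [part.strip() for part in value.split(";")]
        attrs = dict(p.split("=", 1) for p in params if "=" in p)
        events.append(("startElement", name, attrs))
        events.append(("characters", main))
        events.append(("endElement", name))
    # The message body becomes a Body element, as in the grove on the next slide.
    events.append(("startElement", "Body", {}))
    events.append(("characters", msg.get_payload().strip()))
    events.append(("endElement", "Body"))
    events += [("endElement", "MIME"), ("endDocument",)]
    return events

# Hypothetical addresses; the originals are not recoverable from the slide.
raw = (
    "From: joe@example.org\n"
    "To: sue@example.org\n"
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "\n"
    "Hi Sue! See you in Boston, Joe\n"
)

events = mime_to_events(raw)
for event in events:
    print(event)
```

Because the output is a stream of events rather than serialized text, the same consumer can process MIME, HL7 or native XML input without caring which one it came from.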

The XMTP/MIME grove

Content-type: text/plain
From: [email protected]
To: [email protected]

Hi Sue! See you in Boston, Joe

<MIME>
  <Content-Type>text/plain</Content-Type>
  <From>[email protected]</From>
  <Body>Hi Sue! See you in Boston, Joe</Body>
</MIME>

Healthcare Groves

<patient>
  <person.name>
    <given>James</given>
    <given>Steven</given>
    <family>Smith</family>
    <suffix>3rd</suffix>
  </person.name>

startElement("patient")
  startElement("person.name")
    startElement("given"); characters("James"); ...

The HL7 Grove

MSH|PAT|Jones^James^Stephen^3rd|

startElement("patient")
  startElement("person.name")
    startElement("family")
      characters("Jones");
    endElement("family")
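
A rough sketch of the same idea for the delimited segment above. The segment on the slide is a simplified illustration rather than a complete HL7 V2 message, so the field position and the component order (family, given, given, suffix) are assumptions read off the example, and `hl7_name_events` is a hypothetical helper name.

```python
def hl7_name_events(segment):
    """Turn the slide's simplified pipe/caret segment into SAX-style events."""
    fields = segment.split("|")
    # Assumption: the third field carries the name, as in the slide's example.
    components = fields[2].split("^")
    # Assumed component order, inferred from Jones^James^Stephen^3rd.
    tags = ["family", "given", "given", "suffix"]
    events = [("startElement", "patient"), ("startElement", "person.name")]
    for tag, text in zip(tags, components):
        if text:  # skip empty components
            events += [("startElement", tag), ("characters", text), ("endElement", tag)]
    events += [("endElement", "person.name"), ("endElement", "patient")]
    return events

events = hl7_name_events("MSH|PAT|Jones^James^Stephen^3rd|")
for event in events:
    print(event)
```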

Regular Expressions

Pattern matching
“*TATA*”
bp ::= ‘G’ | ‘T’ | ‘A’ | ‘C’
tata ::= bp*, ‘T’, ‘A’, ‘T’, ‘A’, bp*
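
For illustration, the grammar above translates directly into a conventional regular expression. The sketch below uses Python's `re` module; the function names and sample sequences are made up for the example.

```python
import re

# bp   ::= 'G' | 'T' | 'A' | 'C'
# tata ::= bp*, 'T', 'A', 'T', 'A', bp*
TATA = re.compile(r"[GTAC]*TATA[GTAC]*")

def has_tata(seq):
    """True if the whole sequence is made of base pairs and contains TATA."""
    return TATA.fullmatch(seq) is not None

def tata_positions(seq):
    """Start offsets of every TATA occurrence; the lookahead allows overlaps."""
    return [m.start() for m in re.finditer(r"(?=TATA)", seq)]

print(has_tata("GGCTATAAAG"))
print(tata_positions("GGCTATAAAG"))
```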

XML DTD

<!ELEMENT foo (bar*)>
<!ELEMENT bar (baz?)>
<!ATTLIST bar bop CDATA #IMPLIED>
<!ELEMENT baz (#PCDATA)>

Tree Regular Expressions

foo[
  bar[
    @bop[int]
    baz[‘xxx’]
  ]
]

<foo>
  <bar bop="23">
    <baz>xxx</baz>
  </bar>
</foo>
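
The pattern above can be checked against a document with a few lines of code. This is a deliberately reduced sketch: it encodes the pattern as nested Python tuples (an assumed representation, not a standard one), matches children strictly in order, and leaves out the repetition and choice operators that a full tree regular expression language would provide.

```python
import xml.etree.ElementTree as ET

def matches(elem, pattern):
    """pattern = (tag, {attribute: type}, content).

    content is either a string (required text) or a list of child patterns.
    """
    tag, attrs, content = pattern
    if elem.tag != tag or set(elem.attrib) != set(attrs):
        return False
    for name, convert in attrs.items():
        try:
            convert(elem.get(name))  # e.g. int("23") succeeds, int("abc") fails
        except ValueError:
            return False
    if isinstance(content, str):
        return len(elem) == 0 and (elem.text or "").strip() == content
    children = list(elem)
    return len(children) == len(content) and all(
        matches(child, sub) for child, sub in zip(children, content)
    )

# foo[ bar[ @bop[int] baz['xxx'] ] ]
FOO = ("foo", {}, [("bar", {"bop": int}, [("baz", {}, "xxx")])])

good = ET.fromstring('<foo><bar bop="23"><baz>xxx</baz></bar></foo>')
bad_attr = ET.fromstring('<foo><bar bop="abc"><baz>xxx</baz></bar></foo>')
bad_text = ET.fromstring('<foo><bar bop="1"><baz>yyy</baz></bar></foo>')

print(matches(good, FOO), matches(bad_attr, FOO), matches(bad_text, FOO))
```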

Tree Regular Expressions

RELAX NG http://www.relaxng.org
<pattern name="foo">
  <element name="foo">
    <element name="bar">
      <attribute name="bop">
        <data type="int"/>
      </attribute>
      <element name="baz">
        <value>xxx</value>
      </element>

Simple building blocks

XML parsers
XSLT transform engines
HTTP clients and servers

The shape of information

[Diagram labels: “…..TATA…..”, gene, tata, snp, snp]

Pattern matching transform

How it works

[Diagram labels: Browser, Apache, XSLT, Servlet engine, xml:db, RDF]

Form generation

Form.xml

Defaults.xml

Formgen.xsl

XML + XSLT => XHTML

Workflow

Form created
Transform into ASTM XML format
XHTML editing (opnote-edit.xsl)
Sign finished product
Render as XHTML for viewing, printing
Email to Medical Records and Billing

Workflow

generate

edit

sign

Billing

repository

Document analysis

Like gene sequences, it turns out that …
Medical documentation is highly repetitive
With ‘hot spots’ of unique information
Schema defines template filled with values
Easily expanded into HTML for human consumption
Easily analyzed by software

Document analysis

RDF in Healthcare

<rdf:Description about="…/patient/12345">
  <lab:HIV>positive</lab:HIV>
  <lab:CD4>100</lab:CD4>
</rdf:Description>

<path:Biopsy about="…/patient/12345">
  <path:description>The brain demonstrates areas of PML including viral inclusion bodies</path:description>
</path:Biopsy>
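
To show how such descriptions reduce to statements, the sketch below reads the first description into (subject, property, value) triples with Python's standard XML parser rather than a real RDF toolkit. The `lab:` namespace URI is a hypothetical placeholder, since the slide does not give one, and the elided subject `…/patient/12345` is kept exactly as written.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# The lab: namespace URI is hypothetical; the slide leaves it undeclared.
doc = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:lab="http://example.org/lab#">
  <rdf:Description about="…/patient/12345">
    <lab:HIV>positive</lab:HIV>
    <lab:CD4>100</lab:CD4>
  </rdf:Description>
</rdf:RDF>
"""

def triples(xml_text):
    """Return one (subject, property, value) tuple per property element."""
    root = ET.fromstring(xml_text)
    out = []
    for desc in root.iter(RDF + "Description"):
        subject = desc.get("about")
        for prop in desc:
            out.append((subject, prop.tag, (prop.text or "").strip()))
    return out

statements = triples(doc)
for statement in statements:
    print(statement)
```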

RDF is...

A standard syntax to represent (edge labeled) directed graphs in XML

Edge Labeled Directed Graphs

[Graph: nodes foo, bar, baz, bop, bing; edge labels isa, has, plays, wants]

(isa, foo, bar)
(has, bar, baz)
(plays, baz, bop)
(wants, baz, bing)

Semantic Networks

A way to represent natural language circa 1970s

A format for organizing statements in a way that can be queried by computers

Semantic Networks

[Diagram: semantic network with nodes vertebrate, mammal, bird, canary, ostrich, freddie, hugo, heart, spine, hair, wings, fly, walk, doesn’t fly, yellow; edge labels isa, has, can]

Semantic Networks

“Can freddie fly?”
“Does hugo have wings?”
“Does freddie have a spine?”
“Of all the canaries, how many live in cages?”
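
A small sketch of how a semantic network answers the first three questions by walking `isa` links. The edges below are an assumed reading of the network on the previous slide (the drawing itself is not recoverable here), and the rule that the most specific statement wins is one common convention, not something the slides specify.

```python
# Assumed isa hierarchy: child -> parent.
ISA = {
    "mammal": "vertebrate", "bird": "vertebrate",
    "canary": "bird", "ostrich": "bird",
    "freddie": "canary", "hugo": "ostrich",
}

# Assumed statements attached to nodes; False records an explicit exception.
FACTS = {
    ("vertebrate", "has", "heart"): True,
    ("vertebrate", "has", "spine"): True,
    ("mammal", "has", "hair"): True,
    ("bird", "has", "wings"): True,
    ("bird", "can", "fly"): True,
    ("ostrich", "can", "walk"): True,
    ("ostrich", "can", "fly"): False,  # "doesn't fly" overrides the bird default
}

def ask(node, relation, value):
    """Walk up the isa chain; the most specific statement wins. None = unknown."""
    while node is not None:
        if (node, relation, value) in FACTS:
            return FACTS[(node, relation, value)]
        node = ISA.get(node)
    return None

print(ask("freddie", "can", "fly"))
print(ask("hugo", "has", "wings"))
print(ask("freddie", "has", "spine"))
```

The fourth question (canaries in cages) is an aggregate over instances, which needs a query over the whole graph rather than a walk from one node.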

XML form

<patient ID="Patient12345">
  <person.name>
    <given>Jonathan</given>
    <family>Borden</family>
  </person.name>
  <primary.care.physician>
    <provider ...

RDF Graph

[Graph labels: nodes Person12345, Jonathan, Borden; edges person.name, given, family, value, value; types Person, PersonName, Literal]

Semantic analysis

[Diagram labels: repository, instance, Class, Class, Class, Property; edges domain, type, type, subClass]

Semantic analysis

“Of all the patients I operated on for brain tumors between 1996 and 2000, matching severity of pathology and matching clinical status and who have the “P53” mutation, did PCV chemotherapy improve the cure rate at five years?”

First Order Predicate Logic

(for-all ?pat
  (exists ?surgeon (last-name ?surgeon "Borden"))
  (exists ?procedure
    (craniotomy ?procedure)
    (patient ?procedure ?pat)
    (surgeon ?procedure ?surgeon)
    (between (date ?procedure) "1996" "2000")
    (sequence ?procedure "p53")
    ...
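
The quantified expression above can be read as a filter over records: a patient belongs to the cohort if there exists a procedure satisfying every conjunct. The sketch below expresses that with Python's `any`. The `Procedure` record type and all four sample records are invented for illustration, and only the conjuncts visible on the slide are included.

```python
from dataclasses import dataclass

@dataclass
class Procedure:
    kind: str
    patient: str
    surgeon_last_name: str
    year: int
    sequence: str

# Entirely made-up records, for illustration only.
PROCEDURES = [
    Procedure("craniotomy", "pat-1", "Borden", 1997, "p53"),
    Procedure("craniotomy", "pat-2", "Borden", 2003, "p53"),
    Procedure("craniotomy", "pat-3", "Smith", 1998, "p53"),
    Procedure("biopsy", "pat-4", "Borden", 1999, "p53"),
]

def satisfies(pat, procedures):
    """(exists ?procedure ...): some procedure meets every conjunct for this patient."""
    return any(
        p.kind == "craniotomy"
        and p.patient == pat
        and p.surgeon_last_name == "Borden"
        and 1996 <= p.year <= 2000
        and p.sequence == "p53"
        for p in procedures
    )

patients = sorted({p.patient for p in PROCEDURES})
cohort = [pat for pat in patients if satisfies(pat, PROCEDURES)]
print(cohort)
```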

DAML+OIL

DARPA Agent Markup Language
Ontology Inference Layer
Adds description logic capabilities to RDF
An extension of RDF Schema
W3C WebOnt
“Semantic networks on the web using c. 2001 technology”

Simplified Healthcare Schema

<rdfs:Class rdf:ID="Provider">
  <rdfs:subClassOf rdf:resource="#Person"/>
</rdfs:Class>

Simplified Healthcare Schema

Healthcare Schema

XML Namespaces

Namespace name is a URI “http://…”
Namespace name may/should identify a resource directory (RDDL)
RDDL resource directory contains various schemata, descriptions, code etc. associated with namespace

Resource Directory Description Language (RDDL)

Proposed as a solution to what a namespace name URI ought to reference
Both human and machine readable
XHTML Basic + XLink resources
Parsers available two weeks after initial proposal
An XML-DEV project

RDDL

Proposed January 2001
Adopted by namespaces such as XML Schema, Schematron, RSS, Examplotron, XSLT Extension framework, SWAG

http://www.rddl.org/

DAML Schema resource

<rddl:resource id="DAML"
    xl:role="http://www.daml.org/2001/04"    -- Nature
    xl:arcrole="http://www.rddl.org/purposes#schema-validation"    -- Purpose
    xl:title="My DAML Ontology">
  <p>This is my DAML</p>
</rddl:resource>
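
A client that dereferences a namespace name could select resources by nature (`xl:role`) and purpose (`xl:arcrole`) roughly as follows. This is a sketch with Python's standard XML parser, not one of the RDDL parsers mentioned earlier. The wrapping `div`, the `xl:href` value `my-ontology.daml` and the function name are assumptions added so the example is self-contained.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"
RDDL = "{http://www.rddl.org/}"

# The div wrapper and the first xl:href are hypothetical additions.
doc = """
<div xmlns:rddl="http://www.rddl.org/" xmlns:xl="http://www.w3.org/1999/xlink">
  <rddl:resource id="DAML"
      xl:role="http://www.daml.org/2001/04"
      xl:arcrole="http://www.rddl.org/purposes#schema-validation"
      xl:href="my-ontology.daml"
      xl:title="My DAML Ontology">
    <p>This is my DAML</p>
  </rddl:resource>
  <rddl:resource
      xl:role="http://www.w3.org/1999/XSL/Transform"
      xl:arcrole="http://purl.org/rss/1.0"
      xl:href="toRSS.xsl"/>
</div>
"""

def find_resources(xml_text, nature=None, purpose=None):
    """Return the href of every rddl:resource matching the given nature/purpose."""
    root = ET.fromstring(xml_text)
    hits = []
    for res in root.iter(RDDL + "resource"):
        if nature is not None and res.get(XLINK + "role") != nature:
            continue
        if purpose is not None and res.get(XLINK + "arcrole") != purpose:
            continue
        hits.append(res.get(XLINK + "href"))
    return hits

print(find_resources(doc, nature="http://www.w3.org/1999/XSL/Transform"))
```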

XSLT resource

<rddl:resource
    xl:role="http://www.w3.org/1999/XSL/Transform"
    xl:arcrole="http://purl.org/rss/1.0"
    xl:href="toRSS.xsl">

Java resources

<rddl:resource
    xl:role="…application/java-archive"
    xl:arcrole="…purposes/software#xslt-extension"
    xl:href="thisNS-xslt-extension.jar">
  <p>The xslt extensions bound to this namespace are packaged in a JAR</p>
</rddl:resource>

Putting it all together

Biomedical information has many vocabularies - each in its own namespace

genetics “Bio ML”
pathology “SNOMED”
surgery “CPT”
medicine “ICD”
radiology “DICOM”

Putting it all together

Electronic medical record

genes

diagnoses

drugs

procedures

[Diagram labels: genetics, MRI, Path-specimen, person, Gene:p53, Left temporal tumor, SNOMED:glioblastoma]

DAML across schemas

The shape of ontologies

[Diagram labels: glioblastoma, p53, ..., Ring enhancing, enhancing, astrocytoma, p53]

Queries

Query as universal/existential quantification

DAML/RDF subgraph matching
XML Query model
Regular expression pattern matching
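
Subgraph matching can be sketched as unifying a list of triple patterns against a set of triples, where names beginning with `?` are variables. The code below is a naive backtracking matcher for illustration, not a DAML or RDF query engine, and the graph contents and property names (`gene`, `diagnosis`) are invented.

```python
def match(pattern, triple, binding):
    """Unify one (subject, property, object) pattern with one triple, or return None."""
    binding = dict(binding)
    for want, have in zip(pattern, triple):
        if want.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if binding.setdefault(want, have) != have:
                return None
        elif want != have:
            return None
    return binding

def query(patterns, triples, binding=None):
    """Yield every variable binding under which all patterns occur in the graph."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match(first, triple, binding)
        if extended is not None:
            yield from query(rest, triples, extended)

# Invented graph, for illustration only.
GRAPH = [
    ("pat-1", "gene", "p53"),
    ("pat-1", "diagnosis", "glioblastoma"),
    ("pat-2", "diagnosis", "glioblastoma"),
    ("pat-3", "gene", "p53"),
    ("pat-3", "diagnosis", "astrocytoma"),
]

results = list(query(
    [("?pat", "gene", "p53"), ("?pat", "diagnosis", "glioblastoma")],
    GRAPH,
))
print(results)
```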

Future directions

The technology is here …
Define schemas and ontologies
Standardize data formats
Collect data
Just do it!

[email protected]

Contact Information

Jonathan Borden, M.D.
Department of Neurosurgery
New England Medical Center
750 Washington Street
Boston, MA 02111
617-636-5859

www.openhealth.org/ASTM
www.openhealth.org/opnote (demo)
www.openhealth.org/RDF

[email protected]