Transitioning a 4 TB Health Care Security Auditing System to MongoDB

Michael Poremba, Director, Data Architecture at Practice Fusion


DESCRIPTION

Practice Fusion, the largest cloud-based electronic health records (EHR) system in the US, used by more than 100,000 health care providers managing over 100 million patient medical records, faced the need to move their four terabyte HIPAA audit reporting system off of a relational database. Practice Fusion selected MongoDB for their new HIPAA audit reporting system. Learn how the team designed and implemented a highly scalable system for storing protected health information in the cloud. This case study covers the move from a relational database to a document database; data modeling in JSON; sharding strategies; indexing; sharded cluster design supporting high availability and disaster recovery; performance testing; and data migration of billions of historical audit records.


Page 1: Michael Poremba, Director, Data Architecture at Practice Fusion

Transitioning a 4 TB Health Care Security Auditing System to MongoDB

Michael Poremba

Director, Data Architecture

Practice Fusion

Page 2: Michael Poremba, Director, Data Architecture at Practice Fusion

Introductions

Getting started

Page 3: Michael Poremba, Director, Data Architecture at Practice Fusion

Michael Poremba @ Practice Fusion

+ 20 years software engineering

+ Data architect / application architect

+ High-volume OLTP relational databases

+ Application performance and scalability

+ Domain experience: health care; financial services; IT management; content management and distribution; targeted advertising; telecom billing; manufacturing; insurance

Page 4: Michael Poremba, Director, Data Architecture at Practice Fusion

Practice Fusion

+ Cloud-based electronic health records (EHR)

+ Over 100,000 health care providers in US

+ Over 90,000,000 patient medical records

+ OLTP database: weekday peak ~ 40,000 transactions per second

+ 4 TB security auditing records ~ 50% of OLTP database storage

Page 5: Michael Poremba, Director, Data Architecture at Practice Fusion

HIPAA Security Audit Log

+ HIPAA: Health Insurance Portability and Accountability Act of 1996

+ Who did what to which patient’s medical record when?

+ Regulatory requirement: audit log must be kept and reviewed

+ Law enforcement and evidence in legal discovery

+ Save the audit log forever

+ Primary use cases:
    Audit report in EHR: Security audit log viewer
    Physician data analytics: Clinical quality measures (CQM)

Page 6: Michael Poremba, Director, Data Architecture at Practice Fusion
Page 7: Michael Poremba, Director, Data Architecture at Practice Fusion

HIPAA Security Auditing on MongoDB

Project anatomy & lessons learned

Page 8: Michael Poremba, Director, Data Architecture at Practice Fusion

Security Auditing – Legacy Architecture

[Architecture diagram. Components shown: Public; Load Balancer; App 1, App 2, ... App n; EHR (OLTP DB) holding ActivityFeed and ActivityFeedParameter (4..8); ETL; CQM (reporting); Audit Report]

Page 9: Michael Poremba, Director, Data Architecture at Practice Fusion

The Log Jam

+ Latency on SAN increased

+ Response time slowed for writes

+ Database connections held longer

+ Connection pool expanded

+ User interface locked up, waiting

+ Users tried to log in again

+ Login is heaviest user operation

+ [Repeat]

Found at: http://anchorhardwoods.com/wp-content/uploads/2011/08/log-jam.jpg

Page 10: Michael Poremba, Director, Data Architecture at Practice Fusion

Audit Service – New Architecture

[Architecture diagram. Components shown: Public; Load Balancer; App 1, App 2, ... App n; Audit Service; AMQ; Queue Listener; MongoDB Audit Log; Audit Report; ETL; CQM (reporting)]

Page 11: Michael Poremba, Director, Data Architecture at Practice Fusion

Benefits of New Architecture

+ Isolate auditing system from EHR OLTP database

+ Extract audit IO off of EHR SAN

+ New service interface for audit events

+ Scale out audit service

+ Scale out data store for auditing

Page 12: Michael Poremba, Director, Data Architecture at Practice Fusion

Project Objectives

+ New infrastructure for MongoDB and AMQ

+ Modernize audit service API

+ Modernize audit report UI

+ Convert ~200 audit write operations to new service API

+ Data warehouse ETL from MongoDB

+ Migrate 4 billion existing audit records

New Security Auditing System

Colette: program management

Ernest: services expert

Bhavik: test engineering

Michael: data architecture

Jeff: cluster architecture

Jay: MongoDB expert

Brett: AMQ expert

Bryan: infrastructure coordination

Carlos: data warehouse ETL

Page 13: Michael Poremba, Director, Data Architecture at Practice Fusion

Security Auditing – Application Requirements

+ Transaction volume: Sustain 1,000 new documents per second

+ Data volume: Scale to tens of billions of audit event records

+ High availability and disaster recovery, with a higher SLA than EHR

+ Quick UI response time for interactive audit report

+ Tamper prevention and detection
    No updates or deletes permitted on audit log
    Security alerts when audit log is altered

+ Leverage industry standards for health care security audit logging
    ~300 distinct auditable user actions
    Required and varying data elements

Page 14: Michael Poremba, Director, Data Architecture at Practice Fusion

Health Care Industry Standards for Audit Logging

[Diagram. Entities shown: AuditEvent, ParticipantObject, AuditSystem, User; cardinalities shown: 0..n, 1..1, 1..2]

+ ISO 27789:2013: Health Informatics – Audit trails for electronic health records

+ ASTM E2147-01(2013): Standard Specification for Audit and Disclosure Logs for Use in Health Information Systems

+ FHIR SecurityEvent – resource definition for auditing

Page 15: Michael Poremba, Director, Data Architecture at Practice Fusion

JSON Document Schema for Audit Events

{
  "_id" : <BinaryData(4)>,  // The audit event GUID
  "docHash" : <String; Required>,  // Tamper detection
  "audOrgGuid" : <BinaryData(4); Required>,  // Shard key
  "crtdDttmUtc" : <Date; Required>,  // Datetime record was inserted
  "evnt" : {  // Required subdocument
    "dttmUtc" : <Date; Required>,  // Date/time that event occurred
    "typ" : <String; Required>,  // Event record type; ~ 300 types
    "ptDataTyp" : <String; Required>,  // Standard set of patient data types
    "actn" : <String; Required>,  // Standard set of actions
    "sys" : <String; Required>  // Source system for audit event
  },
  "usr" : {  // Required subdocument
    "usrId" : <String; Required>,  // Human-readable ID
    "usrGuid" : <BinaryData(4); Required>,  // Machine-readable ID
    "dispNm" : <String; Required>,  // Display name for user
    "orgId" : <String; Required>,
    "orgNm" : <String; Required>
  },
  "altUsr" : {  // Optional subdocument for second user
    ...  // Subdocument contains same properties as "usr"
  },
  "pt" : {  // Optional subdocument
    "ptId" : <String; Required>,  // Human-readable ID for patient
    "ptPracGuid" : <BinaryData(4); Required>,  // Machine-readable ID for patient
    "dispNm" : <String; Required>,  // Display name for patient
    "orgId" : <String; Required>,
    "orgNm" : <String; Required>
  },
  "body" : {  // Optional subdocument
    ...  // Flattened list of attributes, specific to audit event subtype
  }
}

[Diagram. Entities shown: AuditEvent, ParticipantObject, AuditSystem, User; cardinalities shown: 0..n, 1..1, 1..2]
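Because MongoDB does not enforce this schema by itself, the application has to check the required fields before writing. The following is a minimal sketch of such a check, assuming plain JavaScript objects shaped like the schema above. The helper name `findMissingFields` and the two lookup tables are hypothetical and are not part of the original system.

```javascript
// Required fields as marked on the slide. "_id" is not marked Required there,
// so it is not checked here (assumption: the driver or service assigns it).
const REQUIRED_TOP_LEVEL = ["docHash", "audOrgGuid", "crtdDttmUtc", "evnt", "usr"];

// Required fields inside each subdocument. "altUsr" and "pt" are optional
// subdocuments, so their fields are only checked when the subdocument exists.
const REQUIRED_BY_SUBDOCUMENT = {
  evnt: ["dttmUtc", "typ", "ptDataTyp", "actn", "sys"],
  usr: ["usrId", "usrGuid", "dispNm", "orgId", "orgNm"],
  altUsr: ["usrId", "usrGuid", "dispNm", "orgId", "orgNm"],
  pt: ["ptId", "ptPracGuid", "dispNm", "orgId", "orgNm"],
};

function isMissing(value) {
  return value === undefined || value === null;
}

// Returns the dotted paths of all required fields that are absent.
function findMissingFields(auditEvent) {
  const missing = REQUIRED_TOP_LEVEL.filter((name) => isMissing(auditEvent[name]));
  for (const [subdoc, fields] of Object.entries(REQUIRED_BY_SUBDOCUMENT)) {
    const value = auditEvent[subdoc];
    if (isMissing(value)) continue; // optional and absent, or already reported above
    for (const field of fields) {
      if (isMissing(value[field])) missing.push(`${subdoc}.${field}`);
    }
  }
  return missing;
}
```

An audit service could reject any event for which `findMissingFields` returns a non-empty list before the event reaches the queue.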

Page 16: Michael Poremba, Director, Data Architecture at Practice Fusion

Schema Design – Lessons Learned

+ Prop nms strd per doc (property names are stored in every document)
    Long names add up for large collections (ours: 1 TB)
    Consider using abbreviated property names
    Up-vote this feature request: https://jira.mongodb.org/browse/SERVER-863

+ Know your application read/write patterns

+ Application responsible for data integrity

+ Be aware of data type behaviors
    Indexed string search is case sensitive
    Several binary data types for UUID; use type 4 (default type is specific to database driver)

Found at: http://www.milesfinchinnovation.com/blog/wp-content/uploads/2013/02/iStock_000019474446Medium.jpg
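To see why long names add up, recall that BSON stores each field name as a NUL-terminated string inside every document. The sketch below is a rough estimate under stated assumptions: the verbose names are hypothetical (the deck only shows the abbreviated ones), only four fields are compared, arrays are ignored, and the 4 billion document count is borrowed from the migration figure in the deck.

```javascript
// Sum the bytes spent on field names in one document, recursing into
// nested subdocuments. Each name costs its UTF-8 length plus one NUL byte.
function fieldNameBytes(doc) {
  let total = 0;
  for (const [name, value] of Object.entries(doc)) {
    total += Buffer.byteLength(name, "utf8") + 1;
    if (Object.prototype.toString.call(value) === "[object Object]") {
      total += fieldNameBytes(value);
    }
  }
  return total;
}

// Hypothetical verbose names versus the abbreviated names from the deck.
const verbose = {
  auditOrganizationGuid: "",
  createdDateTimeUtc: "",
  event: { dateTimeUtc: "", patientDataType: "" },
};
const abbreviated = {
  audOrgGuid: "",
  crtdDttmUtc: "",
  evnt: { dttmUtc: "", ptDataTyp: "" },
};

const savedPerDocument = fieldNameBytes(verbose) - fieldNameBytes(abbreviated);
const documentCount = 4e9; // assumption: roughly the number of migrated records
const savedGigabytes = (savedPerDocument * documentCount) / 1e9;

console.log(`${savedPerDocument} bytes per document, about ${savedGigabytes} GB overall`);
```

The estimate covers raw name bytes only and says nothing about compression or index size, so treat it as an order-of-magnitude guide.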

Page 17: Michael Poremba, Director, Data Architecture at Practice Fusion

Schema Design – Lessons Learned

Leverage native data types:

+ Date

+ Boolean

+ Numeric
    "1" + "1" → "11"
    "11" + "1" → "111"

+ UUID
    "8c290139-f4e3-49c1-9ba2-a883defc6a15"
    "8C290139-F4E3-49C1-9BA2-A883DEFC6A15"
    "8c29-0139-f4e3-49c1-9ba2-a883-defc-6a15"
    "8c290139f4e349c19ba2a883defc6a15"
    "{8c290139-f4e3-49c1-9ba2-a883defc6a15}"
    "{8C290139-F4E3-49C1-9BA2-A883DEFC6A15}"

Found at: http://www.industryweek.com/innovation/innovation-one-size-fits-one
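The string pitfalls above can be reproduced in a few lines. This sketch assumes Node.js; `normalizeUuid` and `uuidToBytes` are hypothetical helpers that show why one UUID stored as text can look like six different values, while its 16-byte binary form has a single representation.

```javascript
// The six spellings of the same UUID listed on the slide.
const variants = [
  "8c290139-f4e3-49c1-9ba2-a883defc6a15",
  "8C290139-F4E3-49C1-9BA2-A883DEFC6A15",
  "8c29-0139-f4e3-49c1-9ba2-a883-defc-6a15",
  "8c290139f4e349c19ba2a883defc6a15",
  "{8c290139-f4e3-49c1-9ba2-a883defc6a15}",
  "{8C290139-F4E3-49C1-9BA2-A883DEFC6A15}",
];

// Strip braces and hyphens, lowercase, then re-insert hyphens as 8-4-4-4-12.
function normalizeUuid(text) {
  const hex = text.replace(/[{}-]/g, "").toLowerCase();
  if (!/^[0-9a-f]{32}$/.test(hex)) throw new Error(`Not a UUID: ${text}`);
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// The 16 raw bytes that a binary UUID field would hold.
function uuidToBytes(text) {
  return Buffer.from(normalizeUuid(text).replace(/-/g, ""), "hex");
}

const distinctAsStrings = new Set(variants).size;
const distinctAsUuids = new Set(variants.map(normalizeUuid)).size;

// Numbers stored as strings concatenate instead of adding.
const stringSum = "1" + "1";
const numericSum = 1 + 1;
```

Compared as strings, the six variants are all different; after normalization they collapse to one value, which is the behavior a native UUID type gives without extra code.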

Page 18: Michael Poremba, Director, Data Architecture at Practice Fusion

Legacy Auditing System – Relational Schema

[Schema diagram. Tables and approximate row counts: ActivityFeed (~4 billion); ActivityFeedParameter (~30 billion); Audit Event Type (~300); Action Type (10); Patient Data Type (18); User (~100,000); Patient (~90 million); Practice (~50,000)]

Issues around data normalization

+ New requirements introduced

+ Filter criteria and sort criteria stored in five different tables

+ Audit events must be read into memory for filtering and sorting
    Join and expand data set by practice
    Sort and filter expanded data set

+ Response time suffers for large practices with many audit events

Page 19: Michael Poremba, Director, Data Architecture at Practice Fusion

Schema Design – Lessons Learned

[Schema diagram repeated. Tables shown: ActivityFeed, Audit Event Type, ActivityFeedParameter, Action Type, Patient Data Type, User, Patient, Practice]

Denormalize with care:

{
  "_id" : <BinaryData(4)>,
  "docHash" : <String; Required>,
  "audOrgGuid" : <BinaryData(4); Required>,
  "crtdDttmUtc" : <Date; Required>,
  "evnt" : {
    "dttmUtc" : <Date; Required>,
    "typ" : <String; Required>,
    "ptDataTyp" : <String; Required>,
    "actn" : <String; Required>,
    "sys" : <String; Required>
  },
  "usr" : {
    "usrId" : <String; Required>,
    "usrGuid" : <BinaryData(4); Required>,
    "dispNm" : <String; Required>,
    "orgId" : <String; Required>,
    "orgNm" : <String; Required>
  },
  "pt" : {
    "ptId" : <String; Required>,
    "ptPracGuid" : <BinaryData(4); Required>,
    "dispNm" : <String; Required>,
    "orgId" : <String; Required>,
    "orgNm" : <String; Required>
  },
  "body" : { ... }
}
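One way to picture the denormalization is a function that folds one legacy ActivityFeed row, already joined to its lookup tables, and its ActivityFeedParameter rows into a single document. This is only a sketch: every column name on the input rows is hypothetical, since the deck does not show the relational columns, and `docHash` is left for a later signing step.

```javascript
// feedRow: one joined ActivityFeed row (hypothetical column names).
// parameterRows: its ActivityFeedParameter rows as { name, value } pairs.
// insertedAt: the Date to record as the document creation time.
function toAuditEventDocument(feedRow, parameterRows, insertedAt) {
  const doc = {
    _id: feedRow.activityFeedGuid,
    audOrgGuid: feedRow.practiceGuid,
    crtdDttmUtc: insertedAt,
    evnt: {
      dttmUtc: feedRow.eventDateTimeUtc,
      typ: feedRow.eventTypeName,
      ptDataTyp: feedRow.patientDataTypeName,
      actn: feedRow.actionTypeName,
      sys: feedRow.sourceSystem,
    },
    usr: {
      usrId: feedRow.userLogin,
      usrGuid: feedRow.userGuid,
      dispNm: feedRow.userDisplayName,
      orgId: feedRow.practiceId,
      orgNm: feedRow.practiceName,
    },
  };
  // "pt" is optional: only events that touch a patient record carry it.
  if (feedRow.patientGuid) {
    doc.pt = {
      ptId: feedRow.patientId,
      ptPracGuid: feedRow.patientGuid,
      dispNm: feedRow.patientDisplayName,
      orgId: feedRow.practiceId,
      orgNm: feedRow.practiceName,
    };
  }
  // "body" is optional: a flattened list of event-specific attributes.
  if (parameterRows.length > 0) {
    doc.body = {};
    for (const row of parameterRows) doc.body[row.name] = row.value;
  }
  return doc;
}
```

Copying display names and organization names into each event is what lets a report filter and sort on one collection; the cost is that the copies are never updated, which suits an append-only audit log.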

Page 20: Michael Poremba, Director, Data Architecture at Practice Fusion

Index Design

+ Millions of events per owning organization

+ Quick UI response time for interactive audit reports

+ Audit report UI allows events to be sorted/filtered five different ways

+ UI allows paging through audit events

+ Create a secondary index for each sort method

Page 21: Michael Poremba, Director, Data Architecture at Practice Fusion

Index Definitions

+ Organization, event date DESC
    db.auditEvent.ensureIndex( {"audOrgGuid": 1, "evnt.dttmUtc": -1} );

+ Organization, patient, event date DESC
    db.auditEvent.ensureIndex( {"audOrgGuid": 1, "pt.ptId": 1, "evnt.dttmUtc": -1} );

+ Organization, user, event date DESC
    db.auditEvent.ensureIndex( {"audOrgGuid": 1, "usr.usrId": 1, "evnt.dttmUtc": -1} );

+ Organization, patient data type, event date DESC
    db.auditEvent.ensureIndex( {"audOrgGuid": 1, "evnt.ptDataTyp": 1, "evnt.dttmUtc": -1} );

+ Organization, user action type, event date DESC
    db.auditEvent.ensureIndex( {"audOrgGuid": 1, "evnt.actn": 1, "evnt.dttmUtc": -1} );

+ Document created date DESC
    db.auditEvent.ensureIndex( {"crtdDttmUtc": -1} );

Page 22: Michael Poremba, Director, Data Architecture at Practice Fusion

Query Plan

+ Filter by practice GUID

+ Sort by event created date time, descending order

+ Limit to 20 documents

db.auditEvent.find( {"audOrgGuid": BinData(4,"ABrlAG57Rx6gY3zyHzFK3Q==")} )
  .sort( {"evnt.dttmUtc" : -1} ).limit(20).explain();

{
  "clusteredType" : "ParallelSort",
  "shards" : {
    "RepSet02/MNGODDB03-SHRD02:27018, MNGODDB04-SHRD02:27018" : [
      {
        "cursor" : "BtreeCursor auditEvent_audOrgGuid_dttmUtc",
        ...
      }
    ]
  }
  ...
  "numshards" : 1,
  ...
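The five report views map naturally onto the five compound indexes. The sketch below builds the filter, sort, skip, and limit for each view as plain objects. The view names, the `buildAuditReportQuery` helper, and the skip-based paging are assumptions made for illustration; the deck does not say how the report UI pages through results.

```javascript
// Which extra field each report view filters on (null means organization only).
const VIEW_FILTER_FIELD = {
  byDate: null,
  byPatient: "pt.ptId",
  byUser: "usr.usrId",
  byPatientDataType: "evnt.ptDataTyp",
  byAction: "evnt.actn",
};

// Build the pieces of a find() call for one page of one report view.
function buildAuditReportQuery({ orgGuid, view = "byDate", value, page = 0, pageSize = 20 }) {
  if (!(view in VIEW_FILTER_FIELD)) throw new Error(`Unknown view: ${view}`);
  const filter = { audOrgGuid: orgGuid }; // always include the shard key
  const field = VIEW_FILTER_FIELD[view];
  if (field !== null) filter[field] = value;
  return {
    filter,
    sort: { "evnt.dttmUtc": -1 }, // newest events first, as in every index
    skip: page * pageSize,
    limit: pageSize,
  };
}
```

In the shell, the result `q` would be used as `db.auditEvent.find(q.filter).sort(q.sort).skip(q.skip).limit(q.limit)`. Because every filter begins with `audOrgGuid`, each view lines up with one of the compound indexes and the query can be routed by shard key.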

Page 23: Michael Poremba, Director, Data Architecture at Practice Fusion

Indexing Strategy – Lessons Learned

+ As with relational databases, indexes are essential for efficient queries

+ Learn how to use .explain() to read query plans

+ Avoid collection scans: "cursor" : "BasicCursor"

+ For compound indexes, query sort order must match index sort order

Found at: http://www.ebay.com/itm/13-pc-Hex-Shank-Titanium-Drill-Bit-Set-Quick-Change-Bits-/350526103504?pt=LH_DefaultDomain_0&hash=item519cfbdfd0
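The last bullet can be made concrete with a simplified model of when a compound index can return results already sorted. The function below is a sketch, not MongoDB's planner: it assumes the query pins some leading index fields with equality filters, and that the sort must then follow the next index fields in order, either all in the index direction or all reversed. The name `indexSupportsSort` is hypothetical.

```javascript
// indexSpec and sortSpec are objects such as {"audOrgGuid": 1, "evnt.dttmUtc": -1}.
// equalityFields lists the fields the query filters on with exact matches.
function indexSupportsSort(indexSpec, equalityFields, sortSpec) {
  const pinned = new Set(equalityFields);
  const indexKeys = Object.entries(indexSpec);

  // Skip the leading index fields that the equality filters pin to one value.
  let start = 0;
  while (start < indexKeys.length && pinned.has(indexKeys[start][0])) start++;
  if (start < pinned.size) return false; // some filter field is not in the prefix

  const remaining = indexKeys.slice(start);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length > remaining.length) return false;

  // Sort fields must follow the next index fields in the same order.
  const sameFields = sortKeys.every(([field], i) => field === remaining[i][0]);
  if (!sameFields) return false;

  // The index can be walked forward or backward, but not a mix of both.
  const forward = sortKeys.every(([, dir], i) => dir === remaining[i][1]);
  const backward = sortKeys.every(([, dir], i) => dir === -remaining[i][1]);
  return forward || backward;
}
```

Under this model, the organization and event date index serves a query filtered by organization and sorted by event date in either direction, while the organization, patient, and event date index does not serve that same sort unless the patient is also pinned by the filter.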

Page 24: Michael Poremba, Director, Data Architecture at Practice Fusion

Tamper Prevention and Detection

Principle of least privilege

+ MongoDB cluster not accessible from public Internet

+ Security enabled on cluster

+ Application users granted minimum permissions required

Signed audit events

+ Audit events signed with hash of audit event contents

+ Recompute hash on reads and test the data against the stored hash value

+ Send security alert when hash does not match

Oplog monitoring

+ Use mongo-connector Python scripts to monitor oplog

+ Watch for .update() and .delete() operations on collection

+ Send security alert when data changes are detected

Found at: http://legacymedia.localworld.co.uk/275663/Article/images/17639732/4416792.jpg
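The signing and oplog checks described above can be sketched as follows. The deck does not name the hash algorithm or the serialization, so SHA-256 over JSON with sorted keys is an assumption, and all function names are hypothetical. The oplog filter relies on the standard `op` codes (`"u"` for update, `"d"` for delete) and the `ns` namespace field; the deck's own monitoring used mongo-connector, which is not reproduced here.

```javascript
const crypto = require("crypto");

// Produce a copy with object keys sorted and dates as ISO strings,
// so the same content always serializes to the same text.
function canonicalize(value) {
  if (value instanceof Date) return value.toISOString();
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const key of Object.keys(value).sort()) out[key] = canonicalize(value[key]);
    return out;
  }
  return value;
}

// Hash everything except the docHash field itself (assumption: SHA-256).
function computeDocHash(auditEvent) {
  const { docHash, ...rest } = auditEvent;
  return crypto.createHash("sha256").update(JSON.stringify(canonicalize(rest))).digest("hex");
}

function signAuditEvent(auditEvent) {
  return { ...auditEvent, docHash: computeDocHash(auditEvent) };
}

// Recompute on read; false means the stored content no longer matches.
function verifyAuditEvent(auditEvent) {
  return auditEvent.docHash === computeDocHash(auditEvent);
}

// Return oplog entries that update or delete documents in the audit collection.
function findTamperingOps(oplogEntries, namespace) {
  return oplogEntries.filter(
    (entry) => entry.ns === namespace && (entry.op === "u" || entry.op === "d")
  );
}
```

A plain hash detects accidental or naive changes, but anyone who can rewrite a document can also recompute the hash. A keyed construction such as HMAC with a secret held outside the database would be harder to forge; whether the original system did this is not stated.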

Page 25: Michael Poremba, Director, Data Architecture at Practice Fusion

Security – Lessons Learned

+ Minimize network access to MongoDB cluster

+ Enable authentication

+ Leverage role-based authorization

+ Use SSL (MongoDB Enterprise)

+ Disable REST interface and HTTP status interface

Found at: http://www.harborfreight.com/3-1-2-half-inch-circular-padlock-98972.html

Page 26: Michael Poremba, Director, Data Architecture at Practice Fusion

Transaction Volume: 1,000 New Documents per Second

+ Shard the database to scale out

+ Begin with small number of shards (2 or 3)

+ Group all audit events from the same medical practice
    Every audit event is “owned” by some practice
    Audit report UI always queries events by medical practice

+ Composite shard key on { PracticeGuid, _id }, where the practice GUID is stored in audOrgGuid
    db.runCommand({
      shardcollection : "AuditLog.auditEvent",
      key: {audOrgGuid: 1, _id: 1}
    });

Found at: http://s3.amazonaws.com/Reconsales/800/0bfe72e0-9b06-42ac-9644-5727a3ca9c79.jpg

Page 27: Michael Poremba, Director, Data Architecture at Practice Fusion

Sharding the Database – Lessons Learned

+ At the onset of development, determine whether to shard

+ Specify shard key in queries
    Allows mongos to route query
    Minimize distributed “scatter/gather” queries
    Queries spanning chunks likely span shards

+ Choose a key that allows even balancing
    Balancing is performed in 32 MB chunks
    Design shard key to ensure chunks will not exceed 32 MB

Found at: http://www.airbrushaction.com/content/sites/default/files/tipstricks-images/4_27.png
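The routing behavior behind these lessons can be illustrated with a toy model of range-based chunks on the compound key { audOrgGuid, _id }. This is a sketch only: the chunk table, shard names, and string stand-ins for MinKey and MaxKey are hypothetical, and real mongos routing compares BSON values using cluster metadata.

```javascript
const MIN = "";       // stand-in for MinKey
const MAX = "\uffff"; // stand-in for MaxKey

// Hypothetical chunk table. Each chunk covers [min, max) of (audOrgGuid, _id).
const chunks = [
  { min: [MIN, MIN], max: ["org-b", MIN], shard: "shard1" },
  { min: ["org-b", MIN], max: ["org-b", "id-500"], shard: "shard2" },
  { min: ["org-b", "id-500"], max: ["org-c", MIN], shard: "shard3" },
  { min: ["org-c", MIN], max: [MAX, MAX], shard: "shard1" },
];

// Compare two compound keys field by field.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Return the shards a query must visit, given only its audOrgGuid filter.
function targetShards(query, chunkTable) {
  if (query.audOrgGuid === undefined) {
    // No shard key in the query: scatter/gather across every shard.
    return [...new Set(chunkTable.map((c) => c.shard))].sort();
  }
  const lo = [query.audOrgGuid, MIN];
  const hi = [query.audOrgGuid, MAX];
  const hits = chunkTable.filter(
    (c) => compareKeys(c.min, hi) <= 0 && compareKeys(lo, c.max) < 0
  );
  return [...new Set(hits.map((c) => c.shard))].sort();
}
```

In this model a query for a small practice touches one shard, a query without `audOrgGuid` touches all of them, and a large practice whose events were split into two chunks (here `org-b`) touches two shards, which is the sense in which queries spanning chunks likely span shards.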

Page 28: Michael Poremba, Director, Data Architecture at Practice Fusion

High Availability and Disaster Recovery – Replica Sets

+ If audit log is down, then 100,000 health care providers are idle

+ Audit logging subsystem must be more reliable than customer EHR

+ Node failover must be automatic

+ Protect against network and data center failure scenarios

Found at: http://www.huntsmart.com/App_Themes/hs.com/ProductImages/250/DNSBC.jpg

Page 29: Michael Poremba, Director, Data Architecture at Practice Fusion

Sharded Cluster Replicated Across Multiple Data Centers

[Deployment diagram. Labels shown: Primary DC, Disaster Recovery DC, DC2 AZ1, DC2 AZ2, DC3 AZ1. Components shown: config servers, mongos routers, replica set members for shard 1, shard 2, and shard 3, arbiters, and amq nodes]

Page 30: Michael Poremba, Director, Data Architecture at Practice Fusion

Performance and Stress Testing – Lessons Learned

+ Acquire or build load testing tools

+ Test using a realistic, unbiased data set

+ Test database cluster to ensure write throughput

+ Ensure read & write performance meets load requirements

+ Find the performance ceiling

+ Find and resolve bottlenecks

+ Tune IO and memory

Found at: http://www.webdesign.org/img_articles/21892/broken_chain.jpg

Page 31: Michael Poremba, Director, Data Architecture at Practice Fusion

Data Migration – Lessons Learned

+ Parallelize data migration process

+ Identify and remove bottlenecks

+ Scale out MongoDB cluster to handle heavy write load

+ Determine whether best to add indexes before or after migration

+ It takes a while to extract, transform, and load billions of documents

Found at: http://www.dennissy.com/wp-content/uploads/2010/07/house_moving_malaysia.jpg

Page 32: Michael Poremba, Director, Data Architecture at Practice Fusion

Choosing the Appropriate Data Store

MongoDB over relational?

+ Scale out for transaction volume and data volume

+ Highly varying document structure

+ Developer productivity
    Easy map between application and data store

+ Offload read activity in optimized format different from data writes (a.k.a. CQRS pattern)

Found at: http://www.meonuk.com/hammers-mauls

Page 33: Michael Poremba, Director, Data Architecture at Practice Fusion

Choosing the Appropriate Data Store

Relational over MongoDB?

+ Complex normalized data model

+ Diverse read patterns requiring joins

+ Ad hoc reporting and analysis

+ Data integrity difficult to manage in application layer

Found at: http://3.bp.blogspot.com/_QUmmdgc7l6A/TTPUyRWFNPI/AAAAAAAAAO8/KV_i2c2lrRk/s1600/saws+various.jpg

Page 34: Michael Poremba, Director, Data Architecture at Practice Fusion

MongoDB @ Practice Fusion

Upcoming MongoDB projects

+ Read cache for patient medical records

+ Online patient intake process

+ Ad campaign segmentation

+ Scale-out data store for patient clinical observations

Found at: http://jbirdmedia.org/vessels/images/uploads/framing-new-const-lg.jpg

Page 35: Michael Poremba, Director, Data Architecture at Practice Fusion

Q&A

Michael Poremba
