Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
EvolvingEvolving Data Quality Management Capabilities Data Quality Management Capabilities
in Complex Organizations:in Complex Organizations: The Health Care CaseThe Health Care Case
Bruce N. Davidson, Ph.D., M.P.H.Bruce N. Davidson, Ph.D., M.P.H.Director, Resource & Outcomes ManagementDirector, Resource & Outcomes Management
CedarsCedars--Sinai Health SystemSinai Health System
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 5
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
The Context The Context CedarsCedars--Sinai Medical CenterSinai Medical Center
•• Academic Medical Center/Health SystemAcademic Medical Center/Health System
•• Largest NonLargest Non--Profit Hospital in the Western USProfit Hospital in the Western US
•• 950 Beds, 10,000 employees, 2000 MDs950 Beds, 10,000 employees, 2000 MDs
•• Basic Annual StatisticsBasic Annual Statistics
–– 55,000 inpatients55,000 inpatients
–– 280,000 outpatients280,000 outpatients
–– 55,000 ER visits55,000 ER visits
–– 7,000 deliveries7,000 deliveries
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 6
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
Data Management ImplicationsData Management Implications
•• Complex, informationComplex, information--intensive organizationintensive organization
•• Distributed oversight responsibilitiesDistributed oversight responsibilities
•• Transactional data systems populated as Transactional data systems populated as byproduct of patient care byproduct of patient care
•• Information managed as departmental resource Information managed as departmental resource rather than as enterprise resourcerather than as enterprise resource
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 7
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
Evolving from this viewEvolving from this view……
MISSIONMISSION
Patient carePatient care
TeachingTeaching
ResearchResearch
Community ServiceCommunity Service
RESOURCESRESOURCES
PeoplePeople
MoneyMoney
EquipmentEquipment
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 8
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
……to this view, over the last 10 yearsto this view, over the last 10 years
MISSIONMISSION
Patient carePatient care
TeachingTeaching
ResearchResearch
Community ServiceCommunity Service
RESOURCESRESOURCES
PeoplePeople
MoneyMoney
EquipmentEquipment
InformationInformation
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 9
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
•• 19971997
–– DPG convened to address DPG convened to address data crisesdata crises
•• 1998 1998
–– TDQM Summer CourseTDQM Summer Course
–– DQMWG Spun Off of DPGDQMWG Spun Off of DPG
–– IQ Survey, round 1IQ Survey, round 1
•• 1999 1999
–– DQ Concept KickDQ Concept Kick--OffOff
–– IQ Survey, round 2IQ Survey, round 2
–– DQM Objectives first appear in DQM Objectives first appear in Annual PlanAnnual Plan
The The ““long and winding roadlong and winding road”” (1)(1)•• 2000 2000
–– Big DQ Improvement ProjectBig DQ Improvement Project
–– DQM Objectives appear again DQM Objectives appear again in Annual Planin Annual Plan
–– ROM Dept reorganization to ROM Dept reorganization to capitalize on DQ frameworkcapitalize on DQ framework
•• 20012001
–– DPG & DQMWG Charters DPG & DQMWG Charters renewedrenewed
–– DQM Objectives again in DQM Objectives again in Annual PlanAnnual Plan
–– IQ Survey, round 3IQ Survey, round 3
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 10
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
•• 20022002–– DPG thinks about proDPG thinks about pro--active active
DQM infrastructureDQM infrastructure–– ROM designated as data ROM designated as data
““clearinghouseclearinghouse”” for approval for approval of all clinical statistics of all clinical statistics reported outreported out
–– JCAHO accreditation JCAHO accreditation standards for MOI linked to standards for MOI linked to DQM initiativeDQM initiative
•• 20032003–– Data crisis: No P&Ls for 8 Data crisis: No P&Ls for 8
monthsmonths–– Initiate Data Warehouse Initiate Data Warehouse
Improvement ProjectImprovement Project
•• 20042004
–– Propose DQMU (x2)Propose DQMU (x2)
–– Implement Data Warehouse Implement Data Warehouse Improvement ProjectImprovement Project
–– DQMU funded as part of ROMDQMU funded as part of ROM–– IQ Survey, round 4IQ Survey, round 4–– Data crisis: No Management Data crisis: No Management
Reports for 5 months following Reports for 5 months following new PM system implementationnew PM system implementation
•• 20052005–– DWIP Objectives appear in Annual DWIP Objectives appear in Annual
PlanPlan
–– DQMU staffed and DQM Program DQMU staffed and DQM Program initiatedinitiated
–– DWIP Project Plan institutionalized DWIP Project Plan institutionalized as primary focusas primary focus for DPGfor DPG
The The ““long and winding roadlong and winding road”” (2)(2)Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 11
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
•• 20062006–– DQ Objectives appear in DQ Objectives appear in
Annual PlanAnnual Plan–– DQ Objectives linked to DQ Objectives linked to
executive management executive management incentive compensationincentive compensation
–– Development of Data Development of Data Certification Program for Certification Program for ““High PriorityHigh Priority”” data elementsdata elements
–– Pilot estimation of DQ ROIPilot estimation of DQ ROI–– Development of explicit Development of explicit
criteria for resolving criteria for resolving ““High High PriorityPriority”” Data Quality Data Quality IncidentsIncidents
The The ““long and winding roadlong and winding road”” (3)(3)•• 20072007
–– IQ Survey, round 5IQ Survey, round 5–– 117 Data Quality Incidents 117 Data Quality Incidents
logged since initiation in logged since initiation in February 2005, of which 82 February 2005, of which 82 have been resolved and have been resolved and closedclosed
–– Collaboration with Internal Collaboration with Internal Audit department relative to Audit department relative to minimizing risk due to minimizing risk due to defective datadefective data
–– Continuing challenge to frame Continuing challenge to frame strategic issues in context of strategic issues in context of ““managing information as an managing information as an enterprise resourceenterprise resource””
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 12
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
What We Hope ForWhat We Hope For
•• Quick resultsQuick results
•• Unflagging supportUnflagging support
•• Universal cooperationUniversal cooperation
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 13
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
What We GetWhat We Get•• Often painfully slow progress Often painfully slow progress
with regular periods of with regular periods of stagnation, if not reversalstagnation, if not reversal
•• Occasional bursts of support with Occasional bursts of support with frequent periods of inattention, or frequent periods of inattention, or forgetfulnessforgetfulness
•• Gradually enhanced, but Gradually enhanced, but intermittent, cooperationintermittent, cooperation
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 14
Bruce Davidson, PhD - copyright Cedars-Sinai Health System, 2007
What Does It Mean?What Does It Mean?
•• Evolutionary processEvolutionary process
•• Limitations of hierarchical controlLimitations of hierarchical control
•• Development of Development of ““shared mental shared mental modelsmodels”” = work= work
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 15
Galaxy Data Quality Program MIT IQ Industry Symposium
July 18-19, 2007IngenixUnited Health AnalyticsGalaxy – Shared Data WarehouseLaura Sebastian-ColemanIS Manager – Data Quality & End User Support
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 16
©2006 Ingenix, Inc
Overview
Ingenix and GalaxyGalaxy’s DQ programEvolving business needs and the pace of changeData quality in relation to evolving business needs
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 17
©2006 Ingenix, Inc
Ingenix Background
A global healthcare information companyFounded in 1996 to develop, acquire, and integrate some of the nation’s best-in-class healthcare information capabilities Significant and rapidly evolving portfolio of tools and servicesnow transform data into actionable, fact-based, technology-enabled decision supportRanked among the top 10 providers of informatics by Healthcare Informatics magazine in June 2006Today there is an Ingenix product at work in nearly every U.S. healthcare organization.Ingenix is a wholly owned subsidiary of UnitedHealth Group (UHG).
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 18
©2006 Ingenix, Inc
Galaxy Overview
Atomic Data Warehouse with transformationsIntegrates data from more than a dozen subject areas (claim, membership, customer, provider, etc.) across multiple sourcesSize
350 source input files from more than 25 distinct internal and external sources (and counting)18 TB of data; 62 TB footprint3,159 attributes across 12,632 columns in 600 tables (and counting)Largest table: more than 1.5 billion rows
1,704,717,031 on Claim Statistical Service as of 5/3/07
UsageOver 1,000 registered users7,888 queries per day / 256,656 per month, on average Ad hoc, scheduled queries, production extracts to applications and martsDirect access to Galaxy via user-selected tools – Sagent is administratively supported
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 19
©2006 Ingenix, Inc
Galaxy Physical Architecture
Man
agem
ent S
ervi
ces
Dom
ain
Tool
s R
esou
rce
(176
/ 49
)
Sto
rage
/Bac
kup
-IP
(40
/ 93)
Stor
age/
Bac
kup
-IP
(40)
Isol
ated
VLA
N (1
92.1
60.0
.0/1
6)Is
olat
ed V
LAN
(172
.16.
0.0/
16)
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 20
©2006 Ingenix, Inc
System Components
Hardware7 IBM P-series Servers P5752 IBM P-series Servers P5101 IBM P-series Server P5704 EMC DMX 3000 Storage CabinetsAdditional supporting servers for Sagent, Autosys, etc.
SoftwareUDB with DPF v8.2AIX 5.3.0DataStage/PX 7.0.1Optiload 3.1CoSort 7.5.3Autosys 4.5Sagent 4.5i
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 21
©2006 Ingenix, Inc
Galaxy Source Systems & Subject AreasProceedings of the MIT 2007 Information Quality Industry Symposium
PG 22
©2006 Ingenix, Inc
Functions of Galaxy Data
Galaxy is the single source of truth for key business functionsMedical Trend AnalyticsPricingProvider Utilization & ProfilingAppropriateness of CareNetwork AdequacyCare Management / Pattern of Care / Preventive CareFraud & AbuseCustomer ReportingHEDIS ReportingMember DemographicsProduct Penetration
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 23
©2006 Ingenix, Inc
Galaxy’s Data Quality Program
Management recognized need for DQ when Galaxy was launched Theoretical / methodological foundations
Correct data problems at the sourceData as a productStatistical process control
Primary functions of DQ programMonitor, measure, and report on Galaxy’s Data QualityRecommend and implement actions based on findings
Biggest initial challenge = establishing useful metricsWhat to measure / how to measureHow to respond to the results of measurements
2003 Initiated metrics & reporting program2004 Implemented first automated measures 2004-2007: Deliver weekly/cyclic, monthly, quarterly, semi- annual reporting through largely automated processes
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 24
©2006 Ingenix, Inc
Example of Weekly Measure
TGT_TBL_ETL_DT
Indi
vidu
al V
alue
5/24/
2007
4/2/20
07
2/10/
2007
12/27
/2006
11/2
/2006
9/17/
2006
7/26/20
06
6/2/20
06
4/14/20
06
2/23/
2006
2.5
2.0
1.5
1.0
0.5
0.0
_X=1.532
UCL=1.869
LCL=1.195
1111
111111
11
1
111
1
1
DQ 128 Procedure Revenue Code on Unet Claim Stat Header
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 25
©2006 Ingenix, Inc
Quarterly Management Report
Baseline Data Quality –annual review and baseline assessmentQuality Indicators – key attribute monitoringGalaxy Data Issues Tracking
III . Galaxy Data Issues Tracking
Report Date: 9/16/05Galaxy Operations Contacts: Nancy Grimaldi, Eric Infeld, Laura Sebastian-Coleman
c.2005 UnitedHealth Group - Proprietary & Confidential
II . Quality Indicators -- Key Attribute Monitoring
Galaxy Data QualityAs of 9/16/2005
I . Baseline Data Quality -- Annual Review and Baseline Assessment Update
GALAXYThe DQ team conducted its Annual Baseline Assessment in July and August. This year’s Baseline reviewed 1483 primary and foreign keys on Galaxy’s fact tables for business expected values. The Baseline focuses on identiying and researching business illogical values and provides a benchmark for the validity and accuracy of key data. The results showed 99.73% of attributes exceeded 4 sigma, as compared with 99.44% in 2004 and 96.82% in 2003. This improvement was due to issues identified and resolved through previous baseline assessments and ongoing maintenance and quality improvement efforts.
For the past two years, we have measured values based on a four sigma standard. This year we also collecteded data for five and six sigma. • 99.46% of attributes exceeded 5 sigma • 99.12% of attributes exceeded 6 sigma
4 attributes in two subject areas did not meet 4 sigma • One involves invalid values being populated due to issues with a source feed (see P&I request 1613). • One involves a field that was hard-coded with a default value (see P&I request 1617) • Two involve blanks in fields where blanks are not expected; these need to be researched to determine whether blanks are being supplied by the source or caused by a Galaxy process (see P&I Requests 1618 and 1619)
• Additional automated reports on this model are now being implemented on newly integrated MAMSI data.
• By indicating whether Galaxy matching processes are operating as expected, these reports allow for early identification of potential data and processing issues.
• Galaxy data issues are identified by SDW business analysts or by end users, tracked through an internally maintained database, and addressed as part of ongoing operations."Issues" reported do not include funded projects or source system changes that require a work effort on the part of Galaxy Operations.
• The total number of issues quarter-to-quarter has remained relatively steady. Currently 83 issues are identified as medium to high priority, compared to 82 last quarter.
• Of the 83 identified issues, 3 are actively being resolved. This is 3.6% of identified issues, compared to 8.5% last quarter. See graph on the upper left.
• The graph on the lower left shows the number of requests closed per quarter. Requests include issues, external initiatives, internal initiatives, and source system changes. 18 issues were closed this quarter compared to 12 in Q2 2005. 17 external requests were closed this quarter, as opposed to 3 in Q2 2005.
• The graph on the right shows the distribution of issues across subject areas for the second and third quarters of 2005. In all subject areas, SDW will continue to leverage project teams to address issues as opportunities are identified to do so.
• The graph to the right above tracks results from the Pharmacy Claim control and compares the level of default Member System IDs on UNet and Prime claims.
• This quarter (Q3 2005) there have not been any spikes in the Pharmacy Claim matching process.
• The two dramatic spikes were caused by an increase of several hundred records above expected levels. The overall level of defaults remains at less than a tenth of a percentage. Before process changes in July 2004, the level for both source systems was around 0.10%.
• Control reporting tracks key quality indicators as part of the cyclic data prep and load process.
• The Unet Claim graph (above) illustrates how well Galaxy identifies CES membership and matches members to UNet claims. This process has been operating within expected limits at this level for over a year and is now showing a strong downward trend.
eaProv ider
ProductPharmacy
OrganizationMemberLab
GeographyFAOCustomer
C laim Statistical
Claim F inancial
35
30
25
20
15
10
5
0
VariableIn constructionOutstanding
Number of Issues Outstanding By Subject Area 6-15-2005
Graph represents a point-in-time snapshot of data on 6/16/05
1
5 1 1
82 Issues identified 7 Solutions under construction75 Issues outstanding
5
18
2
31
1
3
21
2
Load Date
Perc
ent
of 0
Mbr
Sys
ID
cla
ims
8/27/
2005
7/18/
2005
6/12/
2005
5/3/20
05
3/25/
2005
2/14/
2005
1/2/20
05
11/29
/2004
10/9/20
04
9/2/20
04
7/28/
2004
0.080
0.075
0.070
0.065
0.060
0.055
0.050
Accuracy MeasuresMAPE 6.26364MAD 0.00411MSD 0.00003
VariableActualF its
Trend Analysis Plot for Percent of 0 Mbr Sys ID claimsLinear Trend Model
Yt = 0.0714214 - 0.000162139*t
Load Date
Perc
enta
ge
7/28
/200
5
5/3/
2005
2/14
/200
5
12/3
/200
4
9/12
/200
4
6/26
/200
4
4/3/20
04
1/11
/200
4
10/24/
2003
8/2/20
03
5/21
/200
3
0.11
0.10
0.09
0.08
0.07
0.06
0.05
0.04
0.03
0.02
VariablePrime PercentageUnet Percentage
Percentage of 0 Member System IDs on Pharmacy Claim by Source SystemData as of 09-03-05
Num
ber
of R
equ
ests
Clo
sed
2005 - Q32005 - Q22005 - Q12004 - Q42004 - Q32004 - Q22004 - Q1
60
50
40
30
20
10
0
QuarterIssuesExternal initiativesInternal InitiativesSource system changes
Requests Closed Per Quarter 2004 - Q1 through 2005 Q3
aProv ider
ProductPharmacy
OrganizationMemberLab
GeographyFAOCustomer
C laim Statistical
Claim Financial
35
30
25
20
15
10
5
0
VariableIn constructionOutstanding
Number of Issues Outstanding by Subject Area 9-15-05
83 Issues Identified 3 Solutions under construction79 Issues outstanding
Graph represents point-in-time data on 9/16/05
1
28
2
41 1
2
1
30
6 7
Date
Num
ber
of I
ssu
es9/16/20056/16/20053/11/200512/09/20049/10/20046/16/20043/01/2004
90
80
70
60
50
40
30
20
10
0
Variable
Issues Outstanding
Issues IdentifiedSolutions in Construction
Addressing Identified Issues in Galaxy
Graph represents point-in-time data for dates taken once per quarter.
Q1 2004 - Q3 2005
Num
ber
of a
ttri
bute
s an
alyz
ed
Year 200520042003
1600
1400
1200
1000
800
600
400
200
0
VariableAttributes not meeting 4 sigmaAttributes meeting 4 sigma
Baseline Assessment Results 2003-2005
36
1101
8
1239
4
1480
96.82% met 4 sigma
99.44%met 4 sigma
99.73%met 4 sigma
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 26
©2006 Ingenix, Inc
Current Situation
Galaxy = a mature, enterprise data warehouseHigh demand for data and for organizational servicesGalaxy’s DQ program also relatively mature
Defined metricsAutomated data collectionRegular reportingDQ Community
UHG growing, largely through acquisitions and partnershipsHealthcare industry changing – relation of government to health care, new products, esp. consumer driven
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 27
©2006 Ingenix, Inc
Pace of Change for Galaxy
2004 Galaxy integrated data from MAMSI, a United Health Group acquisition
Used the existing structure1+ year to integrate
2006 Integrated data from three new source systems Developed a new subject area, RevenueSignificantly expanded Customer subject areaResponded to healthcare industry changes
Part D data HRA (Health Reimbursement Account) data
2007Integrate data from additional acquisitionsExpand the Revenue subject areaContinue to support the use and enhancement of existing data.
2008Two major integrations already scheduled Potential for several others
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 28
©2006 Ingenix, Inc
Pace of Change for Galaxy DQ
Biggest challenge 2003 what to measure and how to measure2007 how to rapidly analyze and act on DQ data
Baseline Assessment of Galaxy Data Quality2003
800 person hours to pull and analyze data for first Baseline AssessmentDuration = more than 3 months Measured 1137 attributes
2006 Pulled 75% of data in less than 10 hours through an automated process Measured 1506 attributesPull data quarterly
Automated reports 2004: 4 reports2007: 80 reports Reports now implemented as part of standard development process.
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 29
©2006 Ingenix, Inc
2007 – 2008 Key UHG Business Needs
UHG acquisitions and partnerships –More data for Galaxy More users need access
Users need data sooner –Time to integrate data into Galaxy must be shortened
Legacy data critical for ensuring reporting continuity and analytics –
Continued support is necessaryData consistency across sources critical for reporting continuity and analytics –
Integration methodologies need to promote and enforce consistency
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 30
©2006 Ingenix, Inc
How to Respond?
Data Quality included in set of changes to improve efficiency and agility
Common Interface – puts more responsibility on source systems for data qualityGateway – changes how Galaxy prepares data.
DQ measuresMore comprehensiveTaken earlier in the process More fully automated
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 31
©2006 Ingenix, Inc
Common Interface Approach
Galaxy defines standard requirements and layouts for dataSources map to these requirements and feed to GalaxyStreamlined transformation/load into GalaxyCommon model across the enterprise
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 32
©2006 Ingenix, Inc
Common Interface Architecture – Views
Common InterfaceTable - Source 1
Common InterfaceTable - Source
2Common Interface
Table - SourceXExisting
UNet/COSMOS/MAMSI Table(s)
Physical Tables (Objects)
COMMON DATA ITEMS Source 1 +
Source 2 + Source N +Existing
UNet/COSMOS/MAMSI(common data items only;
updated field formats)
Enterprise View
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 33
©2006 Ingenix, Inc
Gateway Integration Tool
Facilitates mapping disparate data sources into a Master Data DefinitionApplies generic transformation logic to the outputUtilizes reusable transformsPerforms automatic code generationEnsures consistency across source-to-target mappingsProvides true-to-code documentationIncorporates data quality modulesIncreases speed and reduces complexity of data integrations
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 34
©2006 Ingenix, Inc
Gateway Integration Tool
Gateway Mapping
and Validation DatabaseTransformation
New DataSources
GatewayMaster DataFile Format
Standard reusabletransformations
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 35
©2006 Ingenix, Inc
Gateway – Data Quality Features
DQ functionsMonitor and react to events in processing Collect trend data
Field validationData type checkingValue range checkingValid value list checkingAssignment of default valuesInformational, error and warning messages
File validationFormat checkingField counts / record length validationSummary of field error and warning messagesThresholds of summary counts of errors and warnings that allow job to be aborted if counts or percentages exceeded – generate alerts
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 36
©2006 Ingenix, Inc
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 37
©2006 Ingenix, Inc
Back to Basic DQ
Data in the warehouse is only as good as data in the source
Ensuring sources to supply better data through the Common Interface
Manufacturing model: Data as a product produced through a process
Executing processes more consistently across the database through the Gateway
Measure to improve Gateway integrates and executes DQ measures consistently across the database. Both tools measure ETL processes (timing of jobs, etc.) that affect other aspects of data quality from end-to-end
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 38
©2006 Ingenix, Inc
DQ: Chicken or Egg?
After 4 years – back to the beginning Applying theory/methodology more fully Applying at the beginning of integrations Applying more comprehensively across the warehouse
Major re-thinking of all Galaxy processes Interacting with customersWriting specificationsObtain source filesMapping source-to-targetImplementing ETLBuilding physical tablesTaking DQ measures
DQ still requires championing New problem: How to analyze and respond to findings from the data gathered through new process.
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 39
April 4, 2007
DSOWebInnovation in ETLTransforming Raw Data into the Building Blocks of Intelligence
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 40
© Ingenix, Inc. 2
About Ingenix - Data Services Organization (DSO)
The Ingenix Data Services Organization (DSO) annually processes and integrates millions of service lines and other claim-related data from more than 500 diverse health and productivity related data sources for more than 125 large employers and hundreds of small employers. This translates to:
4250 feeds processed in 20058964 feeds processed in 200618,000+/- feeds processed in 2007
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 41
© Ingenix, Inc. 3
About Ingenix - DSO Processes
Data is received across several media types related to eligibility, medical claim (includes vision and mental health), pharmacy claim, workers compensation, short-term disability, long-term disability, FMLA, lab results, disease management, health risk appraisal, payroll information, etc.Implementation is the design phase focusing on data layout, client account structure, initial data quality plan, conversion rules and successful completion of the first cycle. Update cycle is the periodic receipt of data from carriers/clients, focusing on data review, investigation/resolution of inconsistencies to ensure clean data, and loading onto analytical environment
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 42
© Ingenix, Inc. 4
Business and Operational Challenges
Long turnaround time for data deliveryNeed for a cost effective data management solutionAbsence of streamlined data quality assessment and investigation processAbsence of company-wide standardized data intake processQuality review process that is manual and subjective leading to errors, rework and decreased confidence in resultsHigh maintenance costs
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 43
© Ingenix, Inc. 5
Business Needs
Increase customer satisfaction through faster turnaround time and higher data qualityHigher productivity through faster learning curveElimination of manual intervention and continuous inspection of dataContinually increasing automationMeet growing demand for faster data delivery and higher level of data qualityEfficient data profiling and data managementFlawless data delivery to analytical environmentsAbility to leverage transformed data across multiple Ingenix products
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 44
© Ingenix, Inc. 6
DSOWeb – Innovative ETL & DQ SolutionStandardized data processing by data types – eligibility, medical, drug, disability, WC, FMLA, HRA, lab and disease managementStandardized incoming data into common format by data typeAutomated data quality checksAutomated data trendingAutomated file processing and job monitoringUser ‘friendly’ interfaceData quality and trending failure analysesTransformation and mapping functionalitiesIntegrated data quality investigation functionalitiesValidated client and carrier details from profiles such as:
– Employee status, employee type and types of coverage – Number of covered lives and products
Operational and data quality metrics
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 45
© Ingenix, Inc. 7
DSOWeb - Express eVal – UI Driven Mapping Functionality
Enables users to create source to common data mappingPoint and click functionality; functions, operators and attribute selection easily added to transformation boxDynamic Help; displays syntax and example for each functionOn-screen listing of source attributes, mapped attributes and look-ups for transformationsText box to capture business logic for each transformationAll transformations are stored in meta data tables and available for hard copy documentation and searches
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 46
© Ingenix, Inc. 8
DSOWeb – Data Quality
Data Quality Rule engine systematically compares data results to rule thresholds and only flags exceptions for DSOWeb user interventionMetadata Driven Rules Engine - Quick turnaround for adding, deleting or modifying rulesMore than 7,000 data quality rules tailored to each data typeDifferent rule types for column property enforcement, structure enforcement and business rules enforcementFormulated from verified client claims, eligibility, lab results, HRA and workforce productivity data experienceBased on record distribution and benefits paid threshold
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 47
© Ingenix, Inc. 9
DSOWeb – Trending
Flags unexpected trends in data. Two levels are:Month-to-date trending with the previous month’s dataYear-to-date trending with the previous year’s data.
– For year-to-date, the trending is done up to the month the data is being processed
– Example: If the plan begins in January and the current processing month is June, then trending is performed between received year’s January-June and data for January-June of the previous year.
Metadata driven trending rules engine; quick turn around for adding, deleting or modifying rulesConfigurable tolerance ranges for trending rulesExamples of tolerable ranges
Record Count Passing Range Less Than 10000 5% to +5% 10001 -25000 -3% to +3% Greater Than 25000 -1% to +1%
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 48
© Ingenix, Inc. 10
DSOWeb – Job and File Processing Dashboard
ABC Client
ABC Client
ABC Client
ABC Client
ABC Client
ABC Client
ABC Client
ABC Client
Carrier A
Carrier B
Carrier C
abc_a_a_F20051001 _T20051231_VEL60
22.txt
abc_b_b_F2006040 1_T20060630_vEL2
267.txt
abc_c_c_F20051201 _T20051231_V6D00
58.txt
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 49
© Ingenix, Inc. 11
DSOWeb – UI Driven DQ Failure Analysis and Decision Making
ABC Client
Abc_car1_car2_F20051001_T20051231_VEL6447.txt
Car1
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 50
© Ingenix, Inc. 12
DSOWeb – Integrated DQ Investigation Tool
Data investigations and validation environment integrated within applicationUses BI tool allowing users to query converted data without needing to understand the data structure or a query languageAd-hoc and standard reports by data typeAutomatically schedules and creates standard reports with each production update
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 51
© Ingenix, Inc. 13
DSOWeb – Raw Data Analysis Tool
Raw data investigation tool integrated within applicationUses home grown tool that generates SAS queries without the need to understand SASAbility to save queries for re-useAutomatically emails formatted results to end user for printing or sharing
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 52
© Ingenix, Inc. 14
DSOWeb – Benefits
Cost-effective data management solutionFacilitates timely data investigationFaster turnaround time for data deliveryAutomated process eliminates human errorsStreamlined process ensures scalabilityMetadata based DQ engine; ease of rules maintenanceHuman intervention only targeted at exception cases; increases productivity and maximizes quality outputTimely issue resolutionMetrics on data quality and operational processes can be reported as needed with current dataReduces dependency on technologists in transformation process
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 53
© Ingenix, Inc. 15
Ingenix - DSOWeb
Thank You
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 54
Cambridge, Massachusetts, USA
The MIT 2007 Information Quality Industry Symposium
Proceedings of the MIT 2007 Information Quality Industry
Symposium
9 - 10:30 AM Session 1B: Federal Data Architecture Moderator: Skip Slone, Lockheed Martin 1. Adel Harris, Citizant, Inc 2. Mark Amspoker, Citizant, Inc 3. Burton Cutting, Royal Bank of Canada
E51-372
Proceedings of the MIT 2007 Information Quality Industry Symposium
PG 55