Running Multiple ETL Workflow Loads into a Data Mart and Dimensional Data Loading

UTOUG Training Days 2019

Scott Heffron

Senior Business Systems Analyst

© 2018 Cotiviti, Inc. CONFIDENTIAL AND PROPRIETARY 2

What you will learn today

Today you will learn three things about running multiple ETL workflow loads into a Data Mart:

1. Payment Accuracy Division (PAD) BI/Reporting Environment

2. Differences between the processes that we run

3. How we monitor our process

Agenda

Use Case

Data Infrastructure

Facts & Fact Data Flow Processing

Dimension Data Loading

Monitoring

Q & A

Use Case

Understanding the business problem

What do we do:

• We work with insurance providers and receive millions of medical claims every day. We evaluate these claims against a series of clinical, contractual, and business rules to prevent fraud, waste, and abuse (FWA).

What is the problem:

• Client Data – Volume and velocity of claims being loaded are increasing

• Nurse Analyst – Compare claim reviews against team members in near real time

What is our solution:

• Create an ETL framework that allows multiple selected processes to run in parallel without interfering with each other

Data Infrastructure

Data sources and destinations

Facts & Fact Data Flow Processing

Fact Area(s)

Granularity

LINE FACT

FLAG FACT

CLAIM ACTION FACT

APPEAL FACT

FRAUD TRIGGER FACT

INVOICE FACT

QA ANALYST FACT

Claim Line (PAD_LINE_FACT) Counts – 1 Day's Worth of Records

Client A - Count: 4,487,259
Client B - Count: 2,391,428
Client C - Count: 1,173,785
Client D - Count: 365,230
Client E - Count: 262,997
Client F - Count: 226,885
Client G - Count: 225,619
Client H - Count: 121,157
Client I - Count: 118,372

Data Loading Use Cases

Large volumes of data being loaded.

Receiving data more frequently

KPI data needs to be seen throughout the day with current information

Need to be able to backfill data in history while daily processes are running

Need to be able to do quality assurance on normal data flow without interfering with any other processes running

Architectural Requirements

Each process runs independently of any other process

Make sure only one process can load data into the fact at a time

A process cannot overlap itself

Need the ability to say which facts can be processed during the process run

Each process indicator needs to be data driven

Some processing types have higher precedence for resources than others.

High: Daily and Near Real Time

Low: Quality and Back Fill
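The requirements above can be sketched as a small lock registry. This is only an illustration under stated assumptions, not the implementation shown in the talk: the class `ProcessRegistry`, its method names, the function `next_to_load`, and the numeric precedence values are hypothetical. In a real system the state would presumably live in a control table rather than in memory.

```python
from dataclasses import dataclass, field

# Hypothetical precedence values: lower number = higher precedence.
# The slides only state High (Daily, Near Real Time) and Low (Quality, Back Fill).
PRECEDENCE = {"PAD": 0, "NRT": 0, "QA": 1, "BF": 1}


@dataclass
class ProcessRegistry:
    """In-memory stand-in for a control table that tracks running loads."""
    running: set = field(default_factory=set)       # processing types currently running
    fact_locks: dict = field(default_factory=dict)  # fact name -> processing type holding it

    def start(self, processing_type: str) -> bool:
        """Start a process unless the same processing type is already running."""
        if processing_type in self.running:
            return False  # a process cannot overlap itself
        self.running.add(processing_type)
        return True

    def acquire_fact(self, processing_type: str, fact: str) -> bool:
        """Allow only one process to load a given fact at a time."""
        if processing_type not in self.running:
            raise ValueError(f"{processing_type} has not been started")
        if fact in self.fact_locks:
            return False
        self.fact_locks[fact] = processing_type
        return True

    def finish(self, processing_type: str) -> None:
        """Release every fact held by the process and mark it as stopped."""
        held = [f for f, p in self.fact_locks.items() if p == processing_type]
        for fact in held:
            del self.fact_locks[fact]
        self.running.discard(processing_type)


def next_to_load(waiting: list) -> str:
    """Pick the waiting processing type with the highest precedence."""
    return min(waiting, key=lambda p: PRECEDENCE[p])
```

Different processing types can run side by side, but a second `start("PAD")` is refused while PAD is running, and a fact such as LINE FACT is held by at most one process until that process finishes.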

Process Parameters

Run Type:

1. Records processed based on a system-generated date range. This uses the record created and last_updated dates

2. Records processed based on a user-defined date range. This uses the record created date

3. Target records processing

Begin Date: Start date of given processing range

End Date: End date of given processing range

Processing Type:

1. PAD: Daily Processing

2. NRT: Near Real Time Processing

3. BF: Back Fill Processing

4. QA: Quality Assurance Processing
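The parameters above could be modeled as a small validated record. This is a minimal sketch: the names `RunType`, `ProcessingType`, and `ProcessParameters` are hypothetical, and the rule that run type 2 requires both dates is an assumption drawn from the parameter descriptions, not something the slides state.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum, IntEnum
from typing import Optional


class RunType(IntEnum):
    SYSTEM_RANGE = 1  # system-generated range (record created and last_updated)
    USER_RANGE = 2    # user-defined range (record created date)
    TARGETED = 3      # target records processing


class ProcessingType(Enum):
    PAD = "Daily Processing"
    NRT = "Near Real Time Processing"
    BF = "Back Fill Processing"
    QA = "Quality Assurance Processing"


@dataclass(frozen=True)
class ProcessParameters:
    run_type: RunType
    processing_type: ProcessingType
    begin_date: Optional[date] = None
    end_date: Optional[date] = None

    def __post_init__(self):
        # Assumption: a user-defined range must supply both dates.
        if self.run_type is RunType.USER_RANGE:
            if self.begin_date is None or self.end_date is None:
                raise ValueError("run type 2 needs a begin date and an end date")
        if self.begin_date and self.end_date and self.begin_date > self.end_date:
            raise ValueError("begin date must not be after end date")
```

For example, a back fill of January 2019 would be `ProcessParameters(RunType.USER_RANGE, ProcessingType.BF, date(2019, 1, 1), date(2019, 1, 31))`.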

PAD: Staging Table PAD, Insert Update Table PAD

NRT: Staging Table NRT, Insert Update Table NRT

BF: Staging Table BF, Insert Update Table BF

QA: Staging Table QA, Insert Update Table QA

Each processing type has its own set of staging tables, so its records can be processed before they are loaded into the target fact table.
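The idea of one staging area per processing type feeding a single fact table can be sketched with an in-memory SQLite database. Everything except the fact name PAD_LINE_FACT is an assumption for illustration: the staging table names (`stg_line_pad` and so on), the columns `line_id` and `amount`, and the use of SQLite's `INSERT OR REPLACE` as the insert/update step.

```python
import sqlite3

PROCESSING_TYPES = ("PAD", "NRT", "BF", "QA")


def create_tables(conn: sqlite3.Connection) -> None:
    """Create one target fact table plus one staging table per processing type."""
    conn.execute(
        "CREATE TABLE pad_line_fact (line_id INTEGER PRIMARY KEY, amount REAL)"
    )
    for ptype in PROCESSING_TYPES:
        # Table names come from the fixed tuple above, never from user input.
        conn.execute(
            f"CREATE TABLE stg_line_{ptype.lower()} "
            "(line_id INTEGER PRIMARY KEY, amount REAL)"
        )


def load_fact(conn: sqlite3.Connection, ptype: str) -> int:
    """Insert or update fact rows from one processing type's staging table.

    Returns the number of staged rows that were applied.
    """
    if ptype not in PROCESSING_TYPES:
        raise ValueError(f"unknown processing type: {ptype}")
    staging = f"stg_line_{ptype.lower()}"
    with conn:  # commit on success, roll back on error
        staged = conn.execute(f"SELECT COUNT(*) FROM {staging}").fetchone()[0]
        conn.execute(
            "INSERT OR REPLACE INTO pad_line_fact (line_id, amount) "
            f"SELECT line_id, amount FROM {staging}"
        )
        conn.execute(f"DELETE FROM {staging}")
    return staged
```

Because each processing type stages its rows separately, a back fill can prepare its records while the daily load is staging its own, and only the final step touches the shared fact table.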

High Level Work Flow Process

[Workflow diagram with callouts 1 to 4 marking the four most critical or important steps from our standpoint]

High Level Fact Work Flow Process

[Workflow diagram with callouts 1 to 3 marking the three most critical or important steps from our standpoint]

System Setting Data Elements

Manual Start of PAD Process

Manual Start of Near Real Time Process

Manual Start of Back Fill Process

Manual Start of QA Process

Dimension Data Loading

We need a way to load dimension data daily from multiple clients into a single dimension table, so that the data can be shared while maintaining integrity.

Issues with Dimension Data Loading

General

Which dimension gets loaded first?

Multiple dimensions needing to be loaded

Client Side

Is another client loading data into the dimension table?

Has the data already been loaded?

Dimension Side

Never knowing when the data will be loaded into the dimension table

Did the client ETL process finish?

Dimension Type(s)

System

This data is not generated by the client

Client

This data is generated by the client

Hybrid

This is a combination of System and Client data

Design areas of the processes

Allow only one client to load a dimension table at a time

Make sure only new records are added to the dimension table

What happens if a failure occurs?

Notification of failure

Data Flow Diagram

Brain Trust – Data Model Dimension Data Process

Example of Dimension ETL Configuration Data

Example of Dimension ETL Table Priority Data

Example of Dimension ETL Table Usable Activity

Monitoring

Example of Processes Running

Detail Look At Client Processing

High Level Look At Client Processing

Look At Client Back Fill Processing

Look At QA Check

6am Data Mart Email Status

THANK YOU

Q & A
