Business Process Modelling and
Automation, Lecture 3
Process analysis
Enn Õunapuu
Contents
• Process analysis
• Disco
• Questions
Course content in brief
A student who successfully completes the course can
describe, document, analyze, measure, improve, and
automate an organization's processes.
Definition of a business process
• The word “process” is defined in the dictionary
as “a series of actions, changes, or
functions bringing about a result”
• Martyn Ould: A process is a coherent set of
activities carried out by a collaborating group
to achieve a goal
Business process measurement
Process mining – ProM
Problem
Chapter 17
Process Mining and
Simulation
Moe Wynn
Anne Rozinat
Wil van der Aalst
Arthur ter Hofstede
Colin Fidge
Overview
• Introduction
• Preliminaries
• Process mining (with ProM)
• Process simulation for operational decision
support
• Tools: YAWL, ProM & CPN Tools
• Conclusions
Introduction
• Correctness, effectiveness and efficiency of
business processes are vital to an
organization
• Significant gap between what is prescribed
and what actually happens
• Process owners have limited info about what
is actually happening
• Model-based (static) analysis
Validation
Verification (correctness of a model)
Performance analysis
• Process Mining – post-execution analysis
• Process Simulation – ‘what-if’ analysis
Preliminaries
Preliminaries: Data Logging
• Keeping track of execution data
Activities that have been carried out
Timestamps (Start and end times of activities)
Resources involved
Data
• Purposes
Audit trails
Disaster recovery
Monitoring
Data Mining
Process Mining
Process Simulation
Preliminaries: Process Mining
• Event logs (recorded actual behaviors)
• Covers a wide-range of techniques
• Provide insights into
control flow dependencies
data usage
resource involvement
performance related statistics etc.
• Identify problems that cannot be identified by
inspecting a static model alone
Preliminaries: Process Simulation
• Develop a simulation model at design time
• Carry out experiments under different
assumptions
• Used for process reengineering decisions
• Data input is time-consuming and error-prone
• Requires careful interpretation
Abstraction of the actual behavior
Different assumptions made
Inaccurate or Incomplete data input
Starts from an empty initial state
Process Mining
• Process discovery: "What is
really happening?"
• Conformance checking: "Do
we do what was agreed
upon?"
• Performance analysis:
"Where are the bottlenecks?"
• Process prediction: "Will this
case be late?"
• Process improvement: "How
to redesign this process?"
• Etc.
Example: mining student data
• Process discovery: "What is the real curriculum?"
• Conformance checking: "Do students meet the prerequisites?"
• Performance analysis: "Where are the bottlenecks?"
• Process prediction: "Will a student complete his studies (in time)?"
• Process improvement: "How to redesign the curriculum?"
[Figure: the "world" (people, machines, organizations, components, business processes) is supported/controlled by a software system, which records events (e.g., messages, transactions) in event logs. A process/system model models and analyzes the world and specifies, configures, and implements the software system. Discovery and conformance connect the event logs with the models.]
Process mining: Linking events to
models
Where to start?
[Figure: life cycle with the phases process design, implementation/configuration, process enactment, and diagnosis; process control and process mining are marked on the cycle.]
Process Mining with ProM
ProM framework
• One of the leading approaches to Process
Mining http://www.processmining.org/
• Covers a wide range of analysis approaches
• 250+ plug-ins
Process Discovery
Social Network
Conformance Checking
• Conversion capabilities between different
formalisms
Petri nets, EPCs, BPMN, BPEL, YAWL
• Mining XML (MXML) log format
Basic Performance Analysis
Resource Analysis
LTL Checker
[Figure: Performance analysis showing bottlenecks; annotations: throughput time, bottlenecks, flow time from A to B]
Dotted chart analysis
[Figure: dotted chart with relative time on one axis and cases on the other; short cases and long cases are marked; 46138 events]
ProM and YAWL
• YAWL logs workflow events and data attributes
• An extractor function is available as a ProMImport plug-in
• ProM can analyze YAWL logs in MXML format
• ProM can transform YAWL models into Petri nets
<Process id="Payment_subprocess.ywl">
  <ProcessInstance id="3f9dfc70-5420-40e7-b9f7-329b5c6f0ded">
    <AuditTrailEntry>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>start</EventType>
      <Timestamp>2008-07-08T10:11:18.104+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
    <AuditTrailEntry>
      <Data><Attribute name="PrePaidShipment">true</Attribute></Data>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>complete</EventType>
      <Timestamp>2008-07-08T10:11:28.167+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
  </ProcessInstance>
</Process>
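To make the log structure concrete, the fragment above can be read with a few lines of Python. This is only a sketch using the standard library, not how ProM or ProMImport work internally; the function names `read_events` and `task_durations` are hypothetical, and pairing each `start` with the next `complete` of the same case and task is an assumption that holds for this fragment.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The MXML fragment from the slide, unchanged.
MXML = """<Process id="Payment_subprocess.ywl">
  <ProcessInstance id="3f9dfc70-5420-40e7-b9f7-329b5c6f0ded">
    <AuditTrailEntry>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>start</EventType>
      <Timestamp>2008-07-08T10:11:18.104+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
    <AuditTrailEntry>
      <Data><Attribute name="PrePaidShipment">true</Attribute></Data>
      <WorkflowModelElement>Check_PrePaid_Shipments_10</WorkflowModelElement>
      <EventType>complete</EventType>
      <Timestamp>2008-07-08T10:11:28.167+01:00</Timestamp>
      <Originator>JohnsI</Originator>
    </AuditTrailEntry>
  </ProcessInstance>
</Process>"""


def read_events(mxml_text):
    """Return one dict per AuditTrailEntry, tagged with its case (ProcessInstance) id."""
    root = ET.fromstring(mxml_text)
    events = []
    for instance in root.iter("ProcessInstance"):
        for entry in instance.iter("AuditTrailEntry"):
            events.append({
                "case": instance.get("id"),
                "task": entry.findtext("WorkflowModelElement"),
                "type": entry.findtext("EventType"),
                "time": datetime.fromisoformat(entry.findtext("Timestamp")),
                "originator": entry.findtext("Originator"),
                "data": {a.get("name"): a.text for a in entry.iter("Attribute")},
            })
    return events


def task_durations(events):
    """Pair each 'start' with the next 'complete' of the same case and task (assumption)."""
    started, durations = {}, []
    for event in events:
        key = (event["case"], event["task"])
        if event["type"] == "start":
            started[key] = event["time"]
        elif event["type"] == "complete" and key in started:
            seconds = (event["time"] - started.pop(key)).total_seconds()
            durations.append((event["task"], seconds))
    return durations


events = read_events(MXML)
durations = task_durations(events)
print(durations)
```

The same four pieces of information (case, task, timestamp, originator) are what the later mining steps rely on.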
Starting point: event logs
YAWL logs or other event
logs, audit trails, databases,
message logs, etc.
unified event log
(MXML)
Process Simulation
Integrated Simulation Approach
Linking process mining to simulation
• Gather process statistics using process mining techniques
• Calibrate simulation experiments with this data
• Analyze simulation logs in the same way as execution logs
Data sources for process characteristics
• Design (Workflow and Organizational Models)
Control and data flow
Organizational model
Initial data values
Role assignments
• Historical (Event logs)
Data value range distributions
Execution time distributions
Case arrival rate
Resource availability patterns
• State (Workflow system)
Progress state
Data values for running cases
Busy resources
Run time for cases
Tools: YAWL, ProM and CPN Tools
Architecture II
• YAWL
Create and execute process models
Maintain organizational models
Extractor functionalities for event logs, organizational models and
current state of the workflow system
• ProM
Translate and integrate all the components into a Petri nets model
Analyze event logs and simulation logs
• CPN Tools
Run simulation experiments
Incorporate current state of workflows
Generate simulation logs
Tool: Architecture
• Use existing models
• Derive parameters
• Consider current state
• Simulation logs in MXML
Simulation: Example Payment
[Figure: workflow model of the payment subprocess, from Start to End. Tasks: Issue Shipment Invoice, Check Pre-paid Shipments, Issue Shipment Remittance Advice, Issue Shipment Payment Order, Approve Shipment Payment Order, Update Shipment Payment Order, Process Shipment Payment, Issue Debit Adjustment, Issue Credit Adjustment, Produce Freight Invoice, Process Freight Payment, Check Invoice Requirement, Complete Invoice Requirement, Finalise. Branch conditions: [Invoice required], [else], [pre-paid shipments], [s. order approved], [s. order not approved], [payment correct], [payment incorrect due to underpayment], [payment incorrect due to overcharge]. Roles attached to tasks (marked s:, o:, c:): Supply Admin Officer, Account Manager, Finance Officer, Senior Finance Officer. Flow labels: payment for the shipment, payment for the freight, customer notified of the payment, customer makes the payment, account settled.]
Simulation: Example
• 13 staff members
5 ‘supply admin officers’
3 ‘finance officers’
2 ‘senior finance officers’
3 ‘account managers’
• Case arrival rate: 50 payments per week
• Throughput time: 5 working days on average
• 30% of shipments are pre-paid
• 50% of orders are approved first-time
• 20% of payments are underpaid
• 10% of payments are overpaid
• 70% of payments are correct
• 80% of orders require invoices
• 20% of orders do not require invoices
Assumption: Payment process running in YAWL for some time.
Simulation: Scenario
• 4 weeks till the end of financial year
• A backlog of 30 payments (some for more
than a week)
• Goal: All payments to be processed in 4
weeks time
• Run simulation experiments to
see if the backlog can be cleared using current
resources
evaluate the effect of avoiding underpayments
• Possible remedial action: Allocate more
resources
ProM screenshots
CPN Tools
Four Scenarios
1. An empty initial state (‘empty’)
2. After loading the current state file with the 30 applications currently in the system (‘as is’)
3. After loading the current state file but adding 13 extra resources (‘to be A’)
4. After loading the current state file but changing the model so that underpayments are no longer possible (‘to be B’)
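Why the initial state matters can be illustrated with a deliberately crude week-by-week calculation. This is not the CPN Tools simulation used in the chapter; it is a sketch in which the weekly capacity of 55 payments and the idea that doubling the staff doubles the capacity are both assumptions made up for illustration. Only the arrival rate (50 per week), the backlog (30 payments), and the 4-week horizon come from the slides.

```python
def simulate_backlog(initial_cases, arrivals_per_week, capacity_per_week, weeks):
    """Return the number of open cases at the end of each week (fluid approximation)."""
    open_cases, history = initial_cases, []
    for _ in range(weeks):
        open_cases = max(0, open_cases + arrivals_per_week - capacity_per_week)
        history.append(open_cases)
    return history


ARRIVALS = 50   # payments per week (from the example)
BACKLOG = 30    # payments currently in the system (from the scenario)
WEEKS = 4       # time left until the end of the financial year (from the scenario)
CAPACITY = 55   # ASSUMPTION: payments the current 13 staff members finish per week

scenarios = {
    "empty": simulate_backlog(0, ARRIVALS, CAPACITY, WEEKS),
    "as is": simulate_backlog(BACKLOG, ARRIVALS, CAPACITY, WEEKS),
    # ASSUMPTION: 13 extra resources double the weekly capacity.
    "to be A": simulate_backlog(BACKLOG, ARRIVALS, 2 * CAPACITY, WEEKS),
}

for name, history in scenarios.items():
    print(f"{name:8s} open cases per week: {history}")
```

Under these assumed numbers the ‘empty’ run reports no open cases at all, while the ‘as is’ run still has open cases after four weeks, which is the kind of difference the current-state file is meant to expose.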
Evaluation
Simulation for operational decision support
• Combine the real process execution log (‘up to now’) and the simulation log (which simulates the future ‘from now on’)
• Look at the process execution in a unified manner
• Track both the history and the future of current cases
Alpha algorithm
α
Process log
• Minimal information in log: case IDs and task IDs.
• Additional information: event type, time, resources, and data.
• In this log there are three possible sequences: ABCD, ACBD, and EF
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D
>, →, ||, # relations
• Direct succession: x>y iff for some case x is directly followed by y
• Causality: x→y iff x>y and not y>x
• Parallel: x||y iff x>y and y>x
• Choice: x#y iff not x>y and not y>x
Causality in the example log: A→B, A→C, B→D, C→D, E→F
Direct succession in the example log: A>B, A>C, B>C, B>D, C>B, C>D, E>F
Parallel in the example log: B||C, C||B
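The four relations can be derived mechanically from the log. The sketch below uses the 18 log entries from the "Process log" slide; the function names are hypothetical, and `->` stands in for the causality arrow.

```python
from collections import defaultdict

# Event log from the slide, in recorded order: (case id, task).
EVENTS = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]


def traces_per_case(events):
    """Group tasks by case, preserving the order in which they were logged."""
    traces = defaultdict(list)
    for case, task in events:
        traces[case].append(task)
    return dict(traces)


def relations(traces):
    """Derive direct succession (>), causality (->), parallel (||) and choice (#)."""
    succ = {(t[i], t[i + 1]) for t in traces.values() for i in range(len(t) - 1)}
    tasks = sorted({task for t in traces.values() for task in t})
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    parallel = {(x, y) for (x, y) in succ if (y, x) in succ}
    choice = {(x, y) for x in tasks for y in tasks
              if (x, y) not in succ and (y, x) not in succ}
    return succ, causal, parallel, choice


traces = traces_per_case(EVENTS)
succ, causal, parallel, choice = relations(traces)
print("causal:  ", sorted(causal))
print("parallel:", sorted(parallel))
```

Note that the relations depend only on which task directly follows which within a case; how often a pair occurs plays no role.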
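Basic idea (1)
x→y
[Figure: Petri net fragment in which a place connects transition x to transition y (sequence)]
Basic idea (2)
x→y, x→z, and y||z
[Figure: Petri net fragment in which x is followed by y and z in parallel (AND-split)]
Basic idea (3)
x→y, x→z, and y#z
[Figure: Petri net fragment in which x is followed by a choice between y and z (XOR-split)]
Basic idea (4)
x→z, y→z, and x||y
[Figure: Petri net fragment in which the parallel x and y are both followed by z (AND-join)]
Basic idea (5)
x→z, y→z, and x#y
[Figure: Petri net fragment in which either x or y is followed by z (XOR-join)]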
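It is not that simple: Basic alpha algorithm
Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.
1. $T_W = \{\, t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \,\}$,
2. $T_I = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \,\}$,
3. $T_O = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \,\}$,
4. $X_W = \{\, (A,B) \mid A \subseteq T_W \wedge B \subseteq T_W \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \,\}$,
5. $Y_W = \{\, (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,
6. $P_W = \{\, p_{(A,B)} \mid (A,B) \in Y_W \,\} \cup \{ i_W, o_W \}$,
7. $F_W = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \,\} \cup \{\, (i_W, t) \mid t \in T_I \,\} \cup \{\, (t, o_W) \mid t \in T_O \,\}$, and
8. $\alpha(W) = (P_W, T_W, F_W)$.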
The alpha algorithm has been proven to be correct for a large class of free-choice nets.
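The eight steps translate almost literally into code. The following is a brute-force sketch for small logs, not the ProM implementation; it enumerates all subsets of tasks, so it only scales to a handful of tasks. One assumption goes beyond the formula as printed on the slide: A and B are restricted to non-empty sets. Place names are represented as hypothetical tuples `("p", A, B)`.

```python
from itertools import chain, combinations


def alpha(traces):
    """Return (places, transitions, arcs) of alpha(W) for a list of traces."""
    succ = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}

    def causal(a, b):
        return (a, b) in succ and (b, a) not in succ

    def choice(a, b):
        return (a, b) not in succ and (b, a) not in succ

    t_w = sorted({task for t in traces for task in t})   # step 1
    t_i = {t[0] for t in traces}                         # step 2
    t_o = {t[-1] for t in traces}                        # step 3

    def nonempty_subsets(items):
        # ASSUMPTION: only non-empty sets are considered for A and B.
        return chain.from_iterable(
            combinations(items, n) for n in range(1, len(items) + 1))

    # Step 4: all pairs (A, B) with a -> b for every a in A and b in B,
    # and with # holding between all members of A and between all members of B.
    x_w = [
        (frozenset(a), frozenset(b))
        for a in nonempty_subsets(t_w)
        if all(choice(p, q) for p in a for q in a)
        for b in nonempty_subsets(t_w)
        if all(choice(p, q) for p in b for q in b)
        and all(causal(p, q) for p in a for q in b)
    ]
    # Step 5: keep only the maximal pairs.
    y_w = [
        (a, b) for a, b in x_w
        if not any((a, b) != (a2, b2) and a <= a2 and b <= b2 for a2, b2 in x_w)
    ]
    # Step 6: one place per maximal pair, plus the source and sink places.
    places = {("p", a, b) for a, b in y_w} | {"i_W", "o_W"}
    # Step 7: arcs into and out of each place, plus arcs from source and to sink.
    arcs = (
        {(x, ("p", a, b)) for a, b in y_w for x in a}
        | {(("p", a, b), y) for a, b in y_w for y in b}
        | {("i_W", t) for t in t_i}
        | {(t, "o_W") for t in t_o}
    )
    return places, set(t_w), arcs                        # step 8


# The five cases of the example log, written as traces.
LOG = [list("ABCD"), list("ACBD"), list("ABCD"), list("ACBD"), list("EF")]
places, transitions, arcs = alpha(LOG)
print(len(places), "places,", len(transitions), "transitions,", len(arcs), "arcs")
```

For the example log the maximal pairs are exactly the five causal pairs, because B and C are parallel and therefore cannot share a set.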
Example
[Figure: the log W (cases 1 to 5 from the "Process log" slide) and the Petri net α(W) mined from it, with transitions A, B, C, D, E, F]
DEMO Alpha algorithm
[Figure: Petri net mined in the demo from a log with 48 cases and 16 performers. Transitions are lettered A to M; the activity labels are: invite reviewers, get review 1, time-out 1, get review 2, time-out 2, get review 3, time-out 3, collect reviews, decide, invite additional reviewer, get review X, time-out X, accept, reject.]
Logging system
• NLog
• NLog can process diagnostic messages emitted from any .NET language (such as C# or Visual Basic), augment them with contextual information (such as date/time, severity, thread, process, environment), format them according to your preference, and send them to one or more targets such as a file or database.
Supported targets
• Files - single file or multiple, with automatic file
naming and archival
• Event Log - local or remote
• Database - store your logs in databases
supported by .NET
• Network - using TCP, UDP, SOAP, MSMQ
protocols
• Command-line console - including color coding
of messages
• E-mail - you can receive emails whenever
application errors occur
• ASP.NET trace
• ... and many more
Conclusions
• Introduction
Concise assessment of reality needed for processes
• Preliminaries
Data logging, Process Mining, Process Simulation
• Process mining with ProM
Understanding process characteristics
• Process simulation
Operational decision support
Utilizing log info for simulation experiments
• Tools: YAWL, ProM & CPN Tools
Payment example
• Conclusion
ProM usage
• ProM usage example
Log
Process model mined from log
Example
• The running example is about a process to repair telephones in a company. The company has 3 types of phones ("T1", "T2" and "T3"). The process starts by registering a telephone device sent by a customer. After registration, the telephone is sent to the Problem Detection (PD) department. There it is analyzed and its defect is categorized. In total, there are 10 different categories of defects that the phones fixed by this company can have. Once the problem is identified, the telephone is sent to the Repair department and a letter is sent to the customer to inform him/her about the problem. The Repair (R) department has two teams. One of the teams can fix simple defects and the other team can repair complex defects. However, some of the defect categories can be repaired by both teams.
Once a repair employee finishes working on a phone, this device is sent to the Quality Assurance (QA) department. There it is analyzed by an employee to check if the defect was indeed fixed or not. If the defect is not repaired, the telephone is again sent to the Repair department. If the telephone is indeed repaired, the case is archived and the telephone is sent to the customer. To save on throughput time, the company only tries to fix a defect a limited number of times. If the defect is not fixed, the case is archived anyway and a brand new device is sent to the customer.
What can we do?
• Inspecting and Cleaning an Event Log
• Mining the Control-Flow Perspective of a Process: “Alpha algorithm”
• Social networks
Example Scenario
• Purchasing process
[Figure: purchasing process supported by an ERP system; participants: Requester, Requester Manager, Purchasing Agent, Supplier, Financial Manager]
Problems
1. Inefficient operations
2. Need to demonstrate compliance
3. Complaints about process duration
Analysis Goals
1. Understand the process in detail
2. Check whether there are deviations
from the payment guidelines
3. Control performance targets (21
days)
Outline
1. Example Scenario
2. Roadmap
3. Hands-on Session
4. Take-away Points
Roadmap
• Determine questions
• Process scope
• Which IT systems
• Via DB administrator
• CSV file or database extract
• Extract ‘As-is’ process
• Answer questions
• Present results (e.g., report, presentation, workshop)
1. What does the process actually look like?
2. Are there deviations from the prescribed process?
3. Do we meet the performance targets?
Roadmap
Data Extraction
• IT staff extract history logs from the ERP system
• The CSV file is the starting point for our session
[Figure: ERP system → CSV data]
Roadmap
Data Analysis (focus of today’s session)
• We use the process mining tool Disco to perform the data analysis
• Download from fluxicon.com/disco
[Figure: CSV data as input to the data analysis]
Hands-on Session
• Let’s get started!
Step 1 - Inspect Data
• Open the PurchasingExample.csv file in Excel and inspect its contents
- Every row corresponds to one event
- You can see information about Case IDs, Activities, Start and end times, Resources, Roles
Step 2 - Import Data
• Load PurchasingExample.csv in Disco and assign the columns as follows:
- Case ID ➞ Case ID
- Start and Complete Timestamp ➞ Timestamp
- Activity ➞ Activity
- Resource ➞ Resource
- Role ➞ Other
• Click ‘Start import’
Step 3 - Inspect Process
• Look at the resulting process model
- Numbers in rectangles are activity frequencies
- Numbers on arcs are connection frequencies
• → You see the main process flows
- All 608 cases start with activity ‘Create Purchase Requisition’
- Lots of changes were made (amendments)!
[Figure callout: Start point]
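The two kinds of numbers in the process map can be reproduced by hand from an event table. The sketch below is not how Disco works internally; it counts activity occurrences and directly-follows connections with the standard library. The nine sample rows are invented for illustration (the real PurchasingExample.csv is not reproduced here), and the column names are assumed to match the fields listed in Step 1 and Step 2.

```python
import csv
import io
from collections import Counter, defaultdict

# Invented sample rows; column names are assumed, activity names come from the slides.
SAMPLE = """Case ID,Start Timestamp,Activity
2,2011-01-04 09:00,Create Purchase Requisition
1,2011-01-03 09:00,Create Purchase Requisition
1,2011-01-03 11:00,Create Request for Quotation
1,2011-01-03 10:00,Analyze Purchase Requisition
2,2011-01-04 10:00,Analyze Purchase Requisition
2,2011-01-04 11:00,Amend Purchase Requisition
2,2011-01-04 12:00,Analyze Purchase Requisition
3,2011-01-05 09:00,Create Purchase Requisition
3,2011-01-05 10:00,Analyze Purchase Requisition
"""


def traces_by_case(csv_text):
    """Group events by case and order them by timestamp (ISO-like strings sort correctly)."""
    cases = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        cases[row["Case ID"]].append((row["Start Timestamp"], row["Activity"]))
    return {case: [activity for _, activity in sorted(rows)]
            for case, rows in cases.items()}


def process_map(traces):
    """Count how often each activity occurs and how often one directly follows another."""
    activity_freq, arc_freq = Counter(), Counter()
    for trace in traces.values():
        activity_freq.update(trace)
        arc_freq.update(zip(trace, trace[1:]))
    return activity_freq, arc_freq


traces = traces_by_case(SAMPLE)
activity_freq, arc_freq = process_map(traces)
for (source, target), count in arc_freq.most_common():
    print(f"{count} x {source} -> {target}")
```

Lowering the ‘Activities’ and ‘Paths’ sliders corresponds, roughly, to hiding the entries with the smallest counts in these two tables.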
Step 3 - Inspect Process
• It’s important to be able to adjust the level of detail for the process map
• Move the ‘Activities’ slider down to the lowest position (0%)
- Only the activities from the most frequent process variant are shown
Step 3 - Inspect Process
• Gradually move the ‘Activities’ slider up to 100% again until all activities are shown
- Even infrequent activities such as ‘Amend Purchase Requisition’ are shown
• You’ll notice that 11 cases flow into ‘Amend Purchase Requisition’ but only 8 move out. Where are the other 3?
Step 3 - Inspect Process
• Move the ‘Paths’ slider up to the top
• You now see a 100% detailed picture of the executed process
- The 3 missing cases move from ‘Amend Purchase Requisition’ to ‘Create Request for Quotation’
Step 4 - Inspect Statistics
• Look at the ‘Statistics’ tab to see overview information about the event log
- 9,119 events were recorded for 608 cases
- Timeframe is January - October 2011
• The case duration is typically up to 15 or 16 days, but some cases take very long (more than 70 or 80 days!)
Step 5 - Inspect Cases
• Select the ‘Cases’ tab to inspect variants and individual service instances
- The third most frequent process variant ends after ‘Analyze Purchase Requisition’ (ca. 10.36% of all cases follow this pattern)
• → Why are so many requests stopped? Do people not know what they can buy? (We can also find this in the process map)
[Figure callout: Leading to End point]
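A variant is simply a distinct sequence of activities, so the variant list amounts to counting identical traces. The sketch below uses six invented cases (not the 608 cases of the real log) and a hypothetical `variants` helper; it is meant to show the idea, not Disco's implementation.

```python
from collections import Counter

# Invented traces for illustration; activity names come from the slides.
TRACES = {
    "1": ["Create Purchase Requisition", "Analyze Purchase Requisition", "Create Request for Quotation"],
    "2": ["Create Purchase Requisition", "Analyze Purchase Requisition", "Create Request for Quotation"],
    "3": ["Create Purchase Requisition", "Analyze Purchase Requisition", "Create Request for Quotation"],
    "4": ["Create Purchase Requisition", "Analyze Purchase Requisition"],
    "5": ["Create Purchase Requisition", "Analyze Purchase Requisition"],
    "6": ["Create Purchase Requisition", "Analyze Purchase Requisition",
          "Amend Purchase Requisition", "Analyze Purchase Requisition"],
}


def variants(traces):
    """Return (variant, number of cases, share of all cases), most frequent first."""
    counts = Counter(tuple(trace) for trace in traces.values())
    total = sum(counts.values())
    return [(variant, n, n / total) for variant, n in counts.most_common()]


variant_table = variants(TRACES)
for variant, n, share in variant_table:
    print(f"{n} cases ({share:.1%}): ends after '{variant[-1]}'")
```

A variant that ends early, such as the second one here, is the pattern behind the "stopped requests" question on the slide.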
Results so far...
• Original Questions:
1. What does the process actually look like?
• Objective process map discovered
• Lots of amendments and stopped requests: Update of purchasing guidelines needed
2. Are there deviations from the prescribed process?
3. Do we meet the performance targets?
• Not for all cases (some take longer than 21 days): Where in the process do we lose the time? -> Next
Step 6 - Filter on Performance
• Click on the Filter symbol in the lower left corner and add a Performance filter
- Select 21 days as lower boundary
- You’ll see that ca. 15% of the purchase orders take longer than 21 days
• Press ‘Apply filter’ to focus only on those cases that take longer than 21 days
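The performance filter boils down to computing each case's duration and keeping the cases above the threshold. The sketch below assumes that case duration means the time from the earliest start to the latest completion of any event in the case; the event rows and the helper name `case_durations` are invented for illustration.

```python
from datetime import datetime, timedelta

# Invented events for illustration: (case id, start timestamp, complete timestamp).
EVENTS = [
    ("c1", "2011-01-03T09:00:00", "2011-01-03T10:00:00"),
    ("c1", "2011-01-10T09:00:00", "2011-01-12T17:00:00"),
    ("c2", "2011-02-01T08:00:00", "2011-02-01T09:00:00"),
    ("c2", "2011-03-01T08:00:00", "2011-03-02T08:00:00"),
]

TARGET = timedelta(days=21)  # performance target from the analysis goals


def case_durations(events):
    """Duration per case: earliest start to latest completion (assumed definition)."""
    first, last = {}, {}
    for case, start, complete in events:
        s, c = datetime.fromisoformat(start), datetime.fromisoformat(complete)
        first[case] = min(first.get(case, s), s)
        last[case] = max(last.get(case, c), c)
    return {case: last[case] - first[case] for case in first}


durations = case_durations(EVENTS)
slow_cases = {case for case, duration in durations.items() if duration > TARGET}
share_slow = len(slow_cases) / len(durations)
print(f"{len(slow_cases)} of {len(durations)} cases exceed 21 days ({share_slow:.0%})")
```

Restricting the later analysis to `slow_cases` mirrors what ‘Apply filter’ does before the bottleneck view.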
Step 7 - Visualize Bottlenecks
• The filtered process map shows the process flow for the 92 (15%) slow cases
- On average 3 amendments per case!
• Switch to ‘Performance’ view
- ‘Total duration’ shows the high-impact areas
- Switch to ‘Mean duration’: On average it takes more than 14 days to return from the rework loop to the normal process
[Figure callout: Switch to ‘Mean duration’]
Step 8 - Animate Process
• Visualize the bottleneck: Press the ▷ button to start the animation
• Observe how purchase orders move through the process
• Drag the needle to the end of the timeline and observe how the most used paths get thicker and thicker
Results so far...
1. What does the process actually look like?
• Objective process map discovered
• Lots of amendments and stopped requests: Update of purchasing guidelines needed
2. Are there deviations from the prescribed process? -> Next
3. Do we meet the performance targets?
• Not for all cases (some take longer than 21 days)
• The ‘Analyze Request for Quotation’ activity is a huge bottleneck: Process change is needed
Step 9 - Compliance Check
• Exit the animation, return to the Filter settings, and remove the performance filter
• Switch back to the Frequency Map view and scroll to the end of the process
- 10 cases skip the mandatory ‘Release Supplier’s Invoice’ activity!
Step 9 - Compliance Check
• Drill down: Click on the path from ‘Send invoice’ to ‘Authorize Supplier’s Invoice payment’ and press ‘Filter this path...’
• Switch to the Cases view to see the 10 cases
- Actionable result: We can either change the operational system to prevent the violation or provide targeted training
[Figure callout: Shortcut to filter this path]
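The same compliance rule can be written as a small check over traces: a case violates the rule if the payment is authorized without the invoice having been released first. The three traces below are invented, and `violating_cases` is a hypothetical helper; the activity names are the ones shown on the slides.

```python
MANDATORY = "Release Supplier's Invoice"
GUARDED = "Authorize Supplier's Invoice payment"

# Invented traces for illustration.
TRACES = {
    "c1": ["Send invoice", MANDATORY, GUARDED],
    "c2": ["Send invoice", GUARDED],   # skips the mandatory release step
    "c3": ["Send invoice"],            # not yet authorized, so no violation
}


def violating_cases(traces, mandatory, guarded):
    """Return the cases that execute `guarded` without an earlier `mandatory` activity."""
    violations = []
    for case, trace in traces.items():
        if guarded in trace and mandatory not in trace[:trace.index(guarded)]:
            violations.append(case)
    return violations


violations = violating_cases(TRACES, MANDATORY, GUARDED)
print("Cases skipping the mandatory activity:", violations)
```

The resulting case list is what one would hand over for targeted training or for a system change that enforces the release step.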
Results so far...
1. What does the process actually look like?
• Objective process map discovered
• Lots of amendments and stopped requests: Update of purchasing guidelines needed
2. Are there deviations from the prescribed process? -> Yes, training or system change needed
3. Do we meet the performance targets?
• Not for all cases (some take longer than 21 days)
• The ‘Analyze Request for Quotation’ activity is a huge bottleneck: Process change is needed
Step 10 - Organizational View
• Last step: We want to take an alternative view on the data to see the organizational flow
• Go to ‘Project view’ and press ‘Reload’
• Set the ‘Activity’ column to ‘Other’ and configure the ‘Role’ column as ‘Activity’
Step 10 - Organizational View
• Instead of the activity flow, we are now looking at how the process moves through different roles in the organization
- Inefficiencies can often be found at the borders of organizational units
- Clearly, the Purchasing agents are causing the biggest delays in the process!
Close the loop
• We learned how to discover a process model and found opportunities to improve the process
• Now: Take action and verify results
Bonus
• Import the data again and configure both the ‘Activity’ and ‘Role’ columns as ‘Activity’
- Can you see what happens now?
• Export an XES file and analyze further in ProM
- Download ProM from http://promtools.org/
Take-away Points
• Real processes are often more complex than you would expect
• There is no one “right” model: You can take multiple views on the same data
• Process mining is an explorative, interactive activity
Questions?