Mining Declarative Models using Intervals Jan Martijn van der Werf Ronny Mans Wil van der Aalst


Mining Declarative Models using Intervals

Jan Martijn van der Werf

Ronny Mans

Wil van der Aalst

A service landscape

How to combine logs?

Merge using time stamps!

Are timestamps synchronized across the landscape?

Semantics of timestamps?
• Time when the event occurred?
• Time when it started / completed?
• Time when the event is recorded?
• Time when the event is stored?
• ...

Time stamps

• Time scale of data?
− Dense (time stamps)
− Coarse (hour, minute, day)

• Reliability of the data?
− User entered?
− System generated?

Events & intervals: “old theory”

• Structure of concurrency:
− Observe whether an event preceded another event
− Observe whether events occurred simultaneously

• Implies an order
− Interval order!

• Position of intervals on the axis!

Interval orders

• Define relation > by a > b iff “a occurs wholly after b”
• Interval order if:
− [ a > b and c > d ] imply [ a > d or c > b ]

• Generalization of transitivity
• Simultaneousness: ¬(a > b) ∧ ¬(b > a)
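The interval-order condition can be checked mechanically on a small example. The sketch below is illustrative only: representing an interval as a `(start, end)` pair and the names `wholly_after`, `simultaneous` and `is_interval_order` are assumptions, not part of the slides.

```python
from itertools import product

# Assumption: an interval is a closed (start, end) pair of numbers.
Interval = tuple[float, float]


def wholly_after(a: Interval, b: Interval) -> bool:
    """a > b: a starts strictly after b has ended."""
    return a[0] > b[1]


def simultaneous(a: Interval, b: Interval) -> bool:
    """Neither interval occurs wholly after the other."""
    return not wholly_after(a, b) and not wholly_after(b, a)


def is_interval_order(items, gt) -> bool:
    """Check that [a > b and c > d] implies [a > d or c > b] for all a, b, c, d."""
    return all(
        gt(a, d) or gt(c, b)
        for a, b, c, d in product(items, repeat=4)
        if gt(a, b) and gt(c, d)
    )


# Intervals on a time axis satisfy the condition.
intervals = [(0, 2), (1, 4), (3, 5), (6, 7)]
holds_for_intervals = is_interval_order(intervals, wholly_after)

# A relation with two unrelated chains (a > b and c > d, nothing else) violates it.
pairs = {("a", "b"), ("c", "d")}
holds_for_chains = is_interval_order(
    ["a", "b", "c", "d"], lambda x, y: (x, y) in pairs
)
```

The second example fails because neither a > d nor c > b holds, which is the pattern the condition rules out.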

[Figure: example intervals a, b, c and d]

But this only works at the level of events!

Process mining & intervals

1. Derive interval for each event
• Singleton set (single time stamp)
• Accuracy interval ( t ± )
• Time scale (week, day, hour, minute, ...)

2. Relate events and intervals to activity

3. Discover process model
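Step 1 can be sketched for the three options named above. This is a minimal illustration under stated assumptions: the function names are hypothetical, the `tolerance` argument stands in for the accuracy value (the slide does not give one), and the time-scale interval is taken as half-open, from the start of the unit to the start of the next one.

```python
from datetime import datetime, timedelta

# Assumption: each time scale resets the smaller datetime fields to zero.
_SCALES = {
    "minute": (timedelta(minutes=1), dict(second=0, microsecond=0)),
    "hour": (timedelta(hours=1), dict(minute=0, second=0, microsecond=0)),
    "day": (timedelta(days=1), dict(hour=0, minute=0, second=0, microsecond=0)),
}


def singleton(t: datetime):
    """Singleton set: the interval contains only the time stamp itself."""
    return (t, t)


def accuracy_interval(t: datetime, tolerance: timedelta):
    """Accuracy interval: the time stamp plus or minus an assumed tolerance."""
    return (t - tolerance, t + tolerance)


def time_scale_interval(t: datetime, scale: str):
    """Time scale: the minute, hour or day in which the time stamp falls."""
    width, reset = _SCALES[scale]
    start = t.replace(**reset)
    return (start, start + width)


t = datetime(2020, 1, 15, 9, 26, 53)
point = singleton(t)
fuzzy = accuracy_interval(t, timedelta(seconds=30))
hour = time_scale_interval(t, "hour")
day = time_scale_interval(t, "day")
```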

Activities & intervals

• First event until last event

• Following the life cycle of the activities

Activities & intervals

• Activities relate to a set of intervals
• Many different mappings possible!
• Granularity (density of intervals)
− Fine: many small intervals
− Coarse: few large intervals

• Finest interval function:
− Only intervals of single points

• Coarsest interval function:
− Each activity maps to a single interval
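The two extreme interval functions can be written down directly. The sketch assumes events are `(activity, timestamp)` pairs with numeric time stamps; the names `finest` and `coarsest` are hypothetical. The coarsest mapping spans from the first to the last event of each activity.

```python
from collections import defaultdict


def finest(events):
    """Finest interval function: every event becomes a single-point interval."""
    result = defaultdict(list)
    for activity, t in events:
        result[activity].append((t, t))
    return {activity: sorted(ivs) for activity, ivs in result.items()}


def coarsest(events):
    """Coarsest interval function: one interval per activity, first to last event."""
    times = defaultdict(list)
    for activity, t in events:
        times[activity].append(t)
    return {activity: [(min(ts), max(ts))] for activity, ts in times.items()}


events = [("A", 1), ("B", 2), ("A", 4), ("B", 7), ("A", 5)]
fine = finest(events)      # A: three point intervals, B: two point intervals
coarse = coarsest(events)  # A: [(1, 5)], B: [(2, 7)]
```

In this example the coarse intervals of A and B overlap while none of the fine ones do, so the choice of granularity changes which activities appear concurrent.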

Process mining & intervals

1. Derive interval for each event
• Singleton set (single time stamp)
• Accuracy interval ( t ± )
• Time scale (week, day, hour, minute, ...)

2. Relate events and intervals to activity
• Many different approaches!

3. Discover process model

Relations on interval sets (1)

• Simultaneousness
− Weak: somewhere there is some overlap
− Dependent: whenever A occurs, B occurs as well
− Strong: if A occurs, then B occurs, and vice versa
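One possible reading of the three simultaneousness relations is sketched below. It assumes closed `(start, end)` intervals and interprets "A occurs, then B occurs as well" as "each interval of A overlaps some interval of B"; both the interpretation and the function names are assumptions.

```python
def overlaps(x, y):
    """Two closed intervals share at least one point."""
    return x[0] <= y[1] and y[0] <= x[1]


def weak_simultaneous(A, B):
    """Weak: some interval of A overlaps some interval of B."""
    return any(overlaps(a, b) for a in A for b in B)


def dependent_simultaneous(A, B):
    """Dependent: every interval of A overlaps some interval of B."""
    return all(any(overlaps(a, b) for b in B) for a in A)


def strong_simultaneous(A, B):
    """Strong: the dependency holds in both directions."""
    return dependent_simultaneous(A, B) and dependent_simultaneous(B, A)


A = [(0, 2), (5, 6)]
B = [(1, 3), (5, 7), (9, 10)]
weak = weak_simultaneous(A, B)
a_depends_on_b = dependent_simultaneous(A, B)
b_depends_on_a = dependent_simultaneous(B, A)
strong = strong_simultaneous(A, B)
```

Here A depends on B, but the interval (9, 10) of B has no counterpart in A, so the strong relation does not hold.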

Relations on interval sets (2)

• Causality
− Wholly: all intervals of A occur before those of B
− Succeeded: each interval of B is followed by one of C
− Preceded: each interval of B occurs after one of A
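The causality relations can be sketched in the same style. The code assumes closed `(start, end)` intervals and reads "before" as "ends strictly before the other starts"; the function names are hypothetical.

```python
def before(x, y):
    """x ends strictly before y starts."""
    return x[1] < y[0]


def wholly_before(A, B):
    """Wholly: every interval of A lies before every interval of B."""
    return all(before(a, b) for a in A for b in B)


def succeeded_by(B, C):
    """Succeeded: each interval of B is followed by some interval of C."""
    return all(any(before(b, c) for c in C) for b in B)


def preceded_by(B, A):
    """Preceded: each interval of B occurs after some interval of A."""
    return all(any(before(a, b) for a in A) for b in B)


A = [(0, 1), (4, 5)]
B = [(2, 3), (6, 7)]
a_wholly_before_b = wholly_before(A, B)  # fails: (4, 5) is not before (2, 3)
a_succeeded_by_b = succeeded_by(A, B)
b_preceded_by_a = preceded_by(B, A)
b_succeeded_by_a = succeeded_by(B, A)    # fails: nothing of A follows (6, 7)
```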

Declarative language

• Interval relations are highly declarative:
− Granularity influences degree of concurrency

• Activities occur simultaneously, unless prohibited

Succeeds!

Precedes!

Declarative language

An example

Discover declarative model

1. Derive interval sets

2. Calculate relations on interval sets

3. Generate declarative model
− Problems:
− Simultaneousness relations overlapping
− Causality: always finds the transitive closure!

• Transitive reduction of R: a smallest relation S with S* = R*

• Minimal edge problem:
− Only use “existing” edges for transitive reduction
− What are existing arcs in process mining?
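For an acyclic causality relation the transitive reduction is unique and uses only existing edges, so it can be computed naively. The sketch below covers only that acyclic case and is not the approach implemented by the authors; the function names and the set-of-pairs representation are assumptions.

```python
def transitive_closure(edges):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(edges)
    changed = True
    while changed:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        changed = bool(new)
        closure |= new
    return closure


def transitive_reduction(edges):
    """Keep only the pairs of the closure that have no intermediate node.

    Assumption: the relation is acyclic; otherwise a ValueError is raised.
    """
    closure = transitive_closure(edges)
    if any(a == b for a, b in closure):
        raise ValueError("relation must be acyclic")
    nodes = {x for edge in closure for x in edge}
    return {
        (a, c)
        for a, c in closure
        if not any((a, b) in closure and (b, c) in closure for b in nodes)
    }


# A mined causality relation that is already transitively closed.
mined = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "D"), ("A", "D")}
reduced = transitive_reduction(mined)
```

The reduced relation is the chain A, B, C, D, and taking its closure again gives back the mined relation.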

Causality & transitive closure

Polynomial

NP-hard

Next to and betweenness relation

• Next to
− Weak: there is an interval of A directly followed by B
− Strong: all intervals of A are directly followed by B

• Betweenness:
− Interval of B is between two intervals of A
− Weak or strong?
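A possible formalization of these relations is sketched below. It assumes that "directly followed" means no interval of any activity lies wholly in between, and it shows both a weak and a strong reading of betweenness, since the slide leaves that choice open. All names are hypothetical.

```python
def before(x, y):
    """x ends strictly before y starts."""
    return x[1] < y[0]


def directly_followed(a, b, all_intervals):
    """a is before b and no interval lies wholly between them (assumed reading)."""
    return before(a, b) and not any(
        before(a, x) and before(x, b) for x in all_intervals
    )


def weak_next_to(A, B, all_intervals):
    """Weak: some interval of A is directly followed by an interval of B."""
    return any(directly_followed(a, b, all_intervals) for a in A for b in B)


def strong_next_to(A, B, all_intervals):
    """Strong: every interval of A is directly followed by an interval of B."""
    return all(any(directly_followed(a, b, all_intervals) for b in B) for a in A)


def is_between(b, A):
    """b lies after one interval of A and before another."""
    return any(before(a, b) for a in A) and any(before(b, a) for a in A)


def weak_between(A, B):
    return any(is_between(b, A) for b in B)


def strong_between(A, B):
    return all(is_between(b, A) for b in B)


log = {"A": [(0, 1), (6, 7)], "B": [(2, 3), (8, 9)], "C": [(4, 5)]}
everything = [iv for ivs in log.values() for iv in ivs]
a_next_b_strong = strong_next_to(log["A"], log["B"], everything)
b_next_a_weak = weak_next_to(log["B"], log["A"], everything)  # C sits in between
b_between_a_weak = weak_between(log["A"], log["B"])
b_between_a_strong = strong_between(log["A"], log["B"])
```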

[Figure: example intervals a, b, c and d illustrating the next-to and betweenness relations]

Conclusions & future work

• Approach:
1. Derive interval for each event
2. Relate events and intervals to activity
− Many possibilities!
3. Discover process model

• Proof of concept implemented in ProM
• Apply approach to case studies