Research
Marigianna Skouradaki, FrankLeymannInstituteofArchitectureofApplication Systems
University ofStuttgart,Germany
VincenzoFerme,Cesare PautassoFacultyofInformatics,
University ofLugano,Switzerland
AndrévanHoornInstituteofSoftwareTechnology,University ofStuttgart,Germany
Micro– BenchmarkingBPMN2.0WorkflowManagementSystems
withWorkflowPatterns
22
Research
©Marigianna Skouradaki
Motivation
*the appearance and company selection are random
77
Research
©Marigianna Skouradaki
Motivation
WorkloadMix Results
2. What is the impact of the core BPMN 2.0 language constructs on the performance of the Workflow Management Systems?
3. Can simple workload mix reveal performance bottlenecks?
1. How to define meaningful workload models candidates?
Benchmark
88
Research
©Marigianna Skouradaki
Agenda
n Backgroundn WorkloadModelsn BenchFlowEnvironmentn ExperimentsSetupn ExperimentsMethodologyn Resultsn Discussionn ConclusionsandFutureWork
99
Research
©Marigianna Skouradaki
WhatisaWorkflowManagementSystem(WfMS)?
4.Navigator
3.Web Services
6.ApplicationServer
WorkflowEngine
5.InstanceDatabase
2.Users& IT tools
…
1.ProcessModels
1010
Research
©Marigianna Skouradaki
WhatisaWorkflowManagementSystem(WfMS)?
4.Navigator
3.Web Services
6.ApplicationServer
WorkflowEngine
5.InstanceDatabase
2.Users& IT tools
…
1.ProcessModels
1313
Research
©Marigianna Skouradaki
Micro-Benchmark
n Simpleworkloadandtargetstheevaluationofatomicoperations1
1Waller, J., Hasselbring, W.: A benchmark engineering methodology to measure the overhead of application-level monitoring. In: KPDAYS. CEUR Workshop Proceedings, vol. 1083, pp. 59–68. CEUR-WS.org (2013) 2Oracle, “Java Microbenchmark Harness”, (2013). Online resource: http://openjdk.java.net/projects/code-tools/jmh/3Van Der Aalst, W.M.P., Hofstede, T., et al.: Workflow patterns. Distrib. Parallel Databases 14(1), 5–51 (Jul 2003)
HelloWorld() micro-benchmark measures “nothing”.It is a good showcase for the overheads in infrastructure2
à For Workflow Management SystemsThe basic control flow
workflow patterns can be considered as atomic operations.
ContributionIntroduce & Execute
the first Micro-Benchmark onBPMN 2.0 WfMS using
Workflow Patterns
1818
Research
©Marigianna Skouradaki
WorkflowPatternsasWorkloadModels(Constraints)
1. Maximizethesimplicityoftheprocessmodels2. Omitinteractionswithexternalsystems(i.e.,stressthe
ProcessNavigator)3. Mostscripttasksareempty,unlessthe
implementationoftheworkflowpatternrequiresotherwise
4. Defineequalprobabilityforpassingthecontrolflowtotheoutgoingbranches
5. Combine:- Simplemerge+exclusivechoice- Parallelsplit+synchronization
2323
Research
©Marigianna Skouradaki
WorkloadModelssequencePattern
EmptyScript 1
EmptyScript 2
Sequence Pattern [SEQ]parallelSplitPattern
EmptyScript 1
EmptyScript 2
Parallel Split & Synchronization [PAR]
implicitTerminationPattern
EmptyScript 1
Wait 5Sec
Explicit Termination Pattern [EXT]arbitaryCyclesPattern
generate1 or 2
EmptyScript 1
EmptyScript 2
i++
i < 10
i >= 10case_1
case_2
Arbitrary Cycle Pattern [CYC]
exclusiveChoicePattern
generatenumber
EmptyScript 1
EmptyScript 2case_2
case_1
Simple Merge & Exclusive Choice [EXC]
2424
Research
©Marigianna Skouradaki
ThreeOpen-SourceWfMS UnderTest
1. Widely used in the industry2. Tested against BPMN 2.0 conformance1
3. Vendor provided configurations are used
1.Geiger, M., Harrer, S., Lenhard, J., Casar, M., Vorndran, A., Wirtz, G.: BPMN conformance in open source engines. In: Proc. of the 9th IEEE International Symposium on Service-Oriented System Engineering (SOSE 2015). SOSE 2015, IEEE, San Francisco Bay, CA, USA (March 30 – April 3 2015)
2525
Research
VincenzoFerme
TheBenchFlowEnvironment5
5Ferme, Vincenzo; Ivanchikj, Ana and Pautasso Cesare: "A Framework for Benchmarking BPMN 2.0 Workflow Management Systems", In: Proceedings of BPM’15, Innsbruck, Austria, Springer, August, 2015.
2626
Research
VincenzoFerme
Metrics/KPIs
2727
Research
VincenzoFerme
ExperimentsMethodology
n Experiment1:Max(1500)concurrentInstanceProducersstartinginstancesforeachworkflowpattern[SEQ,EXC,EXT,PAR,CYC]
n Experiment2:1500concurrentInstanceProducersstartinganequalnumberinstancesoftheworkflowpatterns(20%[MIX]of[SEQ,EXC,EXT,PAR,CYC])
à Threeroundsperexperimentà Eachexperiment lastsfor10minutesà Discardedthedataofthewarm-upstate(first1min)
2828
Research
©Marigianna Skouradaki
Results:MeanDuration(ms)
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
Mean duration (ms)
sequencePattern
EmptyScript 1
EmptyScript 2
[SEQ]
2929
Research
©Marigianna Skouradaki
Results:MeanDuration(ms)
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
Mean duration (ms)
[EXC]
exclusiveChoicePattern
generatenumber
EmptyScript 1
EmptyScript 2case_2
case_1
3030
Research
©Marigianna Skouradaki
Results:MeanDuration(ms)
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
Mean duration (ms)
implicitTerminationPattern
EmptyScript 1
Wait 5Sec
[EXT]
3131
Research
©Marigianna Skouradaki
Results:MeanDuration(ms)
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
Mean duration (ms)
parallelSplitPattern
EmptyScript 1
EmptyScript 2
[PAR]
3232
Research
©Marigianna Skouradaki
Results:MeanDuration(ms)
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
Mean duration (ms)
CYC Instance Producers:
Camunda: 600
WfMS A: 600WfMS B: 600
arbitaryCyclesPattern
generate1 or 2
EmptyScript 1
EmptyScript 2
i++
i < 10
i >= 10case_1
case_2
[CYC]
3333
Research
©Marigianna Skouradaki
Results:MeanDuration(ms)
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
Mean duration (ms)
3434
Research
©Marigianna Skouradaki
Results:MeanCPU(%)andRAM(MB)usage
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
SEQ EXC EXT PAR CYC MIX0
2,000
4,000
12,074
.2
12,215
.9
12,025
.54
12,201
.16
12,340
.65
12,310
.88
2,93
6.6
2,97
6.37
2,74
7.34
2,93
5.81
2,85
1.9
2,78
6.54
807.81
824.96
794.92
828.37
933.31
991.86
MeanRAM
(MB)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC0
10
20
30
0.6
0.72
18.99
17.2
3.45
9.21 12
.39
2,62
7.09
12.66
51.95
0.74
0.81
0.54
0.85 3.49
MeanDuration
(ms)
Fig. 3. (a) Mean RAM (MB) Usage per workflow pattern and (b) Mean Duration (ms)per workflow pattern in [MIX]
Explicit Termination Pattern [EXT] As discussed in Sec. 2 the [EXT]executes concurrently an empty script and a script that implements a five secondswait. According to the BPMN 2.0 execution semantics, the branch of the [EXT]that finishes first terminates the rest of the workflow’s running branches. Wehave therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for WfMS A and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for WfMS A and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe slowest workflow pattern for WfMS A and the fastest for Camunda.
As seen in Fig. 2(a), WfMS B has very high duration results for this workflowpattern. We have investigated this matter in more detail and we have observedthat over the executions WfMS B chooses the sequential execution of each pathwith an average percentage of 52.23% for following the waiting script first and47.77% for following first the empty script. Since the waiting script takes fiveseconds to complete, every time it is chosen for execution it adds a five secondsoverhead, and thus the average duration time is so high. This alternate executionof the two branches also explains the rest of the statistics. For example, weobserve a very high standard deviation of 2500.44 that indicates that there isa very large spread of the values around the mean duration. Concerning theresource utilization we can observe a very low average usage of CPU at 0.24%and a mean RAM usage similar to the rest of the workflow patterns at 2, 747.34MB. In Sec. 4.3 we attempt to give an explanation of this behavior for WfMS B.
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
6.23 8.
16
6.39
9.3
2,62
2
10.06
39.36
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
WfMS A WfMS B Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
mentary material. The behaviour of all the WfMSs is discussed thoroughly inSec. 4.3.
4.2 Results Analysis
Sequence Flow Pattern [SEQ] The [SEQ] workflow pattern lasted onaverage 0.39 ms for WfMS A, 6.39 ms for WfMS B and 0.74 ms for Camunda.The short duration of this workflow pattern justifies the low mean CPU usagewhich is 43.21% for WfMS A, 5.83% for WfMS B and 36.75% for Camunda.WfMS B also has a very low average throughput of 63.31 wi/s while for the othertwo WfMS the average throughput is similar. Concerning the memory utilizationunder the maximum load WfMS A needed in average 12, 074, WfMS B 2, 936 andCamunda 807.81 MB of RAM respectively. As observed from the Table 1 [SEQ]is the workflow pattern with the highest throughput for all the WfMS under test.
Exclusive Choice & Simple Merge Patterns [EXC] Before proceedingto the results analysis of the [EXC], we should consider that the first script taskof the workflow pattern generates a random integer, which is given as an inputto the very simple evaluation condition of the exclusive choice gateway. This wasexpected to have some impact on the performance. However, Fig. 2(a) showsthat the duration times are not notably a↵ected as the values are close to thoseof [SEQ]. More particularly, we have a mean of 0.48 ms for WfMS A, 9.30 ms forWfMS B and 0.85 ms for Camunda. Concerning the CPU and RAM utilization,we see a slight increase with respect to the [SEQ]. WfMS A uses an average of57.42% CPU and 12, 215 MB RAM for executing 775, 455 workflow instancesin 540 sec, WfMS B takes approximately the same amount of time (562 sec) toexecute 27, 805 workflow instances. For this, it utilizes a mean of 5.73% CPU and2, 976.37 MB of RAM, and Camunda 43.21% of CPU and 824.96 MB of RAMfor executing 765, 274 workflow instances in 540 sec.
CPU(%) usage
Memory (MB) Usage
3737
Research
©Marigianna Skouradaki
ResultsThroughput(wi/s)
SEQ EXC EXT PAR CYC MIX
WfMS A 1,447.66 1,436.03 1,426.35 1,429.65 644.02 1,402.33
WfMS B 63.31 49.04 0.38 48.89 13.46 1.78
Camunda 1,456.79 1,417.17 1,455.68 1,455.68 327.99 1,061.27
CYC Instance Producers: WfMS A: 800 WfMS B: 1500 Camunda: 600
WfMS B uses a synchronous instantiation API.Therefore, load drivers must wait for the workflow instances to end
Before instantiating the next one
4646
Research
©Marigianna Skouradaki
Findings
Even though the defined workload is simple
SEQ EXC EXT PAR CYC MIX0
5
10
15
20
0.39
0.48
14.1
13.29
2.91
8.16
6.39
9.3
2,62
2
10.06
38.65
540.02
0.74
0.85
0.4
0.7 3.
06
1.22
(a)MeanDuration
(ms)
Activiti jBPM Camunda
SEQ EXC EXT PAR CYC MIX0
20
40
60
80
100
43.21 57.42
60.2 66.1
70.09
77.69
5.83
5.73
0.24 5.64
4.67
0.66
36.75
44.62
33.34
41.76
41.67
48.53
(b)MeanCPU
(%)
Fig. 2. (a) Mean Duration (ms) per workflow pattern and (b) Mean CPU (%) Usageper workflow pattern
shows that the duration times are not notably a↵ected as the values are close tothose of [SEQ].
More particularly, we have a mean of 0.48 ms for Activiti, 9.30 ms for jBPMand 0.85 ms for Camunda. Concerning the CPU and RAM utilization, we seea slight increase with respect to the [SEQ]. Activiti uses an average of 57.42%CPU and 12, 215 MB RAM for executing 775, 455 workflow instances in 540 sec,jBPM takes approximately the same amount of time (562 sec) to execute 27, 805workflow instances. For this, it utilizes a mean of 5.73% CPU and 2, 976.37 MBof RAM, and Camunda 43.21% of CPU and 824.96 MB of RAM for executing765, 274 workflow instances in 540 sec.
4.2.3 Explicit Termination Pattern [EXT] As discussed in Sec. 2 the[EXT] executes concurrently an empty script and a script that implements a fiveseconds wait. According to the BPMN 2.0 execution semantics, the branch of the[EXT] that finishes first terminates the rest of the workflow’s running branches.We have therefore designed the model considering that the fastest branch (emptyscript) will complete first, and stop the slow script on the other branch whenthe terminate end event following the empty script is activated. This was thecase for Activiti and Camunda, which executed the workflow patterns in anaverage of 14.11 ms and 0.4 ms respectively. The resource utilization of these twoWfMSs also increases in this workflow pattern, i.e., we have 60.20% mean CPUusage and 12, 025 MB mean RAM usage for Activiti and 33.34% mean CPUusage and 794.92 MB mean RAM usage for Camunda. We can already see aninteresting di↵erence on the performance of the two WfMS as [EXT] constitutesthe the slowest workflow pattern for Activiti and the fastest workflow patternfor Camunda. However, we should notice that in this experiment Camunda hada higher response time which was slowing down the instance producers. Thisresulted in an experiment execution time of 539 sec and a lower throughput.
3. Differences in the duration
of the workflow instances
2. Discovered Bottleneck on WfMS B
1. Differences in resource utilization
show space for improvement
4. EXT pattern is not implemented
consistently on WfMS B
5. Sequential patterns are better candidatesfor discovering the
maximum throughputsequencePattern
EmptyScript 1
EmptyScript 2
6. Complex and parallel workflows
are better candidatesfor stressing the WfMS on
Resource UtilizationparallelSplitPattern
EmptyScript 1
EmptyScript 2
implicitTerminationPattern
EmptyScript 1
Wait 5Sec
7. The CYC pattern caused
time out errors in Camunda
arbitaryCyclesPattern
generate1 or 2
EmptyScript 1
EmptyScript 2
i++
i < 10
i >= 10case_1
case_2
8. The execution of the MIX
had the expectedbehavior
4747
Research
©Marigianna Skouradaki
FutureWork
n FinalizeandreleasetheBenchFlowframeworkinanopen-sourcemodelonGithubandDockerHub:
n Generatesyntheticprocessmodelsw.r.t.thepresentresultsn Executeamacrobenchmarkwiththesynthetic,complex,
representativeworkflowmixn Executedifferenttestswithrealisticloadfunctionsandtestdatan Production-likeconfigurations
benchflowhttps://github.com/benchflow
goo.gl/Olw32m [email protected]