Scaling to Millions of Concurrent SPARQL Queries on the Cloud

OWLIM Replication Cluster @ Amazon EC2

Sep 2010


Goals

• Test the scalability of OWLIM RC on a really large cluster

• Can we break the million queries per hour barrier?


INTRODUCTION


Benchmarking AWS

• Extensive performance tests of EC2 instances

– I/O, CPU, Network

– BSBM (SPARQL), RDF materialisation

• High Memory EC2 instances offer (surprisingly) good performance for RDF-related processing

– Comparable to local non-virtualised hardware


Benchmarking AWS – testbeds

Testbed         CPU cores      RAM (GB)   Virtualisation
Local-L         2×2.4 GHz      8          ESX
Local-XL        4×2.9 GHz      12         No
Local-3XL       8×3.3 GHz      48         No
L               2×2 ECU*       7.5        Xen
XL              4×2 ECU*       15         Xen
High-Mem XL     2×3.25 ECU*    17         Xen
High-Mem 2XL    4×3.25 ECU*    34         Xen
High-Mem 4XL    8×3.25 ECU*    68         Xen
High-CPU XL     8×2.5 ECU*     7          Xen

* 1 ECU provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
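To compare the EC2 rows with the local machines, the ECU footnote can be turned into a rough aggregate clock figure. The sketch below is only an illustration of that arithmetic: the function name `ghz_equivalent` is hypothetical, and summing GHz across cores is a crude proxy that ignores CPU architecture, caches and virtualisation overhead.

```python
# Range stated in the footnote: 1 ECU ~ 1.0-1.2 GHz (2007 Opteron/Xeon).
ECU_GHZ_RANGE = (1.0, 1.2)


def ghz_equivalent(cores, ecu_per_core):
    """Return a (low, high) aggregate GHz-equivalent for an EC2 instance.

    Hypothetical helper: it only multiplies the table values by the
    footnote range; it is not an official AWS conversion.
    """
    total_ecu = cores * ecu_per_core
    return total_ecu * ECU_GHZ_RANGE[0], total_ecu * ECU_GHZ_RANGE[1]


# High-Mem XL row from the table: 2 cores x 3.25 ECU
low, high = ghz_equivalent(2, 3.25)
print(f"High-Mem XL: roughly {low:.1f}-{high:.1f} GHz-equivalent in total")
```

By the same arithmetic, Local-L (2×2.4 GHz) totals 4.8 GHz, which is one way to see why the High-Mem instances are not obviously underpowered on paper.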


Benchmarking AWS – BSBM 100M results

[Figure: BSBM 100M results. Y-axis: query mixes / hour (0 to 5000). X-axis: concurrent clients (1, 4, 16, 32, 64). Series: Local-L, L-ub, Local-XL, XL-ub, HM-XL-ub, HM-2XL-ub, Local-3XL, Local-3XL-SSD, HM-4XL-ub, HC-XL-ub.]

Benchmarking AWS – RDF materialisation

[Figure: RDF materialisation. Y-axis: materialisation time (sec), 0 to 6000. Series: UMBEL, DBP-SKOS.]

OWLIM Replication Cluster

• Improves scalability with respect to concurrent user requests

• How does it work?

– Each write request is multiplexed to all repository instances

– Each read request is dispatched to one instance only

– To ensure load-balancing, read requests are sent to the instance with the shortest execution queue
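The dispatch rules above can be sketched in a few lines. This is a toy model under stated assumptions, not OWLIM's implementation: the `Replica` and `Master` classes and their fields are hypothetical, queues are plain lists, and nothing is actually executed or removed from a queue.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one repository instance."""
    name: str
    queue: list = field(default_factory=list)  # pending read requests
    log: list = field(default_factory=list)    # write requests received


class Master:
    """Toy dispatcher: writes go to every replica, reads go to one."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, update):
        # A write is multiplexed to all instances so the replicas stay identical.
        for replica in self.replicas:
            replica.log.append(update)

    def read(self, query):
        # A read goes to the single instance with the shortest queue
        # (ties resolve to the first such replica in the list).
        target = min(self.replicas, key=lambda r: len(r.queue))
        target.queue.append(query)
        return target.name


master = Master([Replica("slave-1"), Replica("slave-2"), Replica("slave-3")])
master.write("INSERT DATA { <urn:a> <urn:b> <urn:c> }")
targets = [master.read(f"query-{i}") for i in range(6)]
print(targets)
```

Because no query ever completes in this sketch, shortest-queue dispatch degenerates to round-robin; with real, uneven query durations the same rule would steer new reads away from busy instances.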


OWLIM CLUSTER ON EC2 – BENCHMARKS


AWS testbed setup

• OWLIM Replication Cluster

– One Master node, 10-100 Slave nodes

– 100 million triples / 16 GB database size

• BSBM 100M dataset

– Each cluster node has a replica of the database

– 1000 concurrent BSBM clients

• Amazon EC2

– Master node – HM-2XL (34 GB RAM, 4×3.25 ECU)

– Slave nodes – HM-XL (17 GB RAM, 2×3.25 ECU)

– Ubuntu (x64)


Total QMpH (Query Mix per Hour)

[Figure: Total QMpH, BSBM-100M, 1000 concurrent clients. Y-axis: total QMpH (0 to 250,000). X-axis: cluster size in HM-XL nodes (10 to 100). Series: 1000 clients.]

Total QMpH – summary

• (Almost) linear scalability of the cluster

• 20 nodes handle more than 1 million SPARQL queries per hour (40,000 QMpH)

– 1 Query Mix = 25 SPARQL queries

• 100 nodes handle 5 million SPARQL queries per hour (200,000 QMpH)
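The conversion between QMpH and SPARQL queries per hour follows directly from the 25-queries-per-mix figure. A minimal sketch of that arithmetic (the function names are hypothetical):

```python
QUERIES_PER_MIX = 25  # from the slide: 1 Query Mix = 25 SPARQL queries


def queries_per_hour(qmph):
    """Convert Query Mixes per Hour into SPARQL queries per hour."""
    return qmph * QUERIES_PER_MIX


def required_qmph(target_queries_per_hour):
    """QMpH needed to reach a given number of SPARQL queries per hour."""
    return target_queries_per_hour / QUERIES_PER_MIX


print(queries_per_hour(40_000))    # 20-node figure from the slide
print(queries_per_hour(200_000))   # 100-node figure from the slide
print(required_qmph(1_000_000))    # QMpH needed for the 1M queries/hour goal
```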


QMpH per cluster node

[Figure: QMpH per cluster node, BSBM-100M, 1000 concurrent clients. Y-axis: QMpH per node (1800 to 2400). X-axis: cluster size in HM-XL nodes (10 to 100). Series: 1000 clients, trendline (Power).]

QMpH per cluster node – summary

• Low parallelisation overhead

– Only 10% deterioration in QMpH per cluster node when the cluster grows 10 times (from 10 to 100 nodes)

– Cluster nodes handle 2,000-2,300 QMpH (a standalone HM-XL node on EC2 handles ~2,500 QMpH)
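One way to read these numbers is as per-node efficiency relative to a standalone instance. The sketch below uses only the figures quoted above; the `efficiency` helper is hypothetical, and the ~2,500 QMpH baseline is approximate, so the results are indicative rather than exact.

```python
STANDALONE_QMPH = 2500  # approximate standalone HM-XL throughput from the slide


def efficiency(per_node_qmph, standalone=STANDALONE_QMPH):
    """Fraction of standalone throughput that a clustered node retains."""
    return per_node_qmph / standalone


# Lower and upper ends of the 2,000-2,300 QMpH range quoted on the slide.
low, high = efficiency(2000), efficiency(2300)
print(f"per-node efficiency: {low:.0%} to {high:.0%} of a standalone node")
```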


What about the cost?

• 100,000 SPARQL queries per $1 on AWS

– ~4,000 Query Mixes / $

  • 1 Query Mix = 25 SPARQL queries

– EC2 pricing

  • Master node (on-demand HM-2XL) – $1.00/hour

  • Slave node (on-demand HM-XL) – $0.50/hour
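The cost figure can be reproduced approximately from the hourly prices above and the reported throughput. This is a back-of-the-envelope sketch: it assumes the only cost is instance hours for one Master plus N Slaves (no storage or data transfer charges), and the function names are hypothetical.

```python
MASTER_USD_PER_HOUR = 1.00  # on-demand HM-2XL, from the slide
SLAVE_USD_PER_HOUR = 0.50   # on-demand HM-XL, from the slide
QUERIES_PER_MIX = 25


def cluster_cost_per_hour(slaves):
    """Hourly instance cost of one Master plus `slaves` Slave nodes."""
    return MASTER_USD_PER_HOUR + slaves * SLAVE_USD_PER_HOUR


def query_mixes_per_usd(total_qmph, slaves):
    """Query Mixes obtained per dollar of instance time."""
    return total_qmph / cluster_cost_per_hour(slaves)


# 100 Slaves at the reported 200,000 QMpH: 200,000 / $51 per hour
mixes = query_mixes_per_usd(200_000, 100)
queries = mixes * QUERIES_PER_MIX
print(f"{mixes:.0f} Query Mixes / $, about {queries:.0f} SPARQL queries / $")
```

Under these assumptions the 100-node cluster comes out slightly below 4,000 Query Mixes per dollar, in line with the rounded "~4,000" and "100,000 queries per $1" figures.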


What about the cost (2)

[Figure: Query Mixes per 1 USD. Y-axis: Query Mixes / $ (3400 to 4600). X-axis: cluster size (10 to 100). Series: QMpH/$.]

DETAILED CLUSTER METRICS


Cluster monitoring

• Amazon CloudWatch provides instance-level monitoring for EC2

– CPU load, Bandwidth utilisation, I/O, …

– Minimum granularity of monitoring periods – 1 minute

• OWLIM Cluster metrics

– Monitor Master and a random Slave for ~180 min

– Many test runs

  • a single run takes a few minutes

– Idle CPU/IO/Network on diagram is the time between test runs
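Since the charts mix busy test runs with idle gaps, one simple way to read them is to split the per-minute samples into runs by a load threshold. The sketch below is illustrative only: `split_runs`, the 10% threshold, and the sample values are all made up for the example and are not data from the benchmark.

```python
def split_runs(samples, busy_threshold=10.0):
    """Group consecutive busy minutes into (start_minute, end_minute) runs.

    `samples` is a list of per-minute CPU load percentages, matching the
    1-minute granularity mentioned above. The threshold is an assumption.
    """
    runs, start = [], None
    for minute, load in enumerate(samples):
        if load >= busy_threshold and start is None:
            start = minute                      # a test run begins
        elif load < busy_threshold and start is not None:
            runs.append((start, minute - 1))    # the run ended last minute
            start = None
    if start is not None:                       # run still active at the end
        runs.append((start, len(samples) - 1))
    return runs


# Made-up per-minute CPU load values (%), NOT measurements from the slides.
cpu = [2, 3, 65, 70, 68, 4, 2, 72, 75, 3]
runs = split_runs(cpu)
print(runs)
```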


CPU load (Master)

[Figure: CPU load (Master). Y-axis: CPU load (%), 0 to 80. X-axis: time (min), 0 to 185.]

CPU load (Slave)

[Figure: CPU load (random Slave). Y-axis: CPU load (%), 0 to 120. X-axis: time (min), 0 to 155.]

Network traffic (Master)

[Figure: Network traffic (Master). Y-axis: MB/s, 0 to 35. X-axis: time (min), 0 to 185. Series: inbound (MB/s), outbound (MB/s).]

Network traffic (Slave)

[Figure: Network traffic (random Slave). Y-axis: MB/s, 0.00 to 0.12. X-axis: time (min), 0 to 155. Series: inbound (MB/s), outbound (MB/s).]

I/O (Slave)

[Figure: I/O (random Slave). Y-axis: MB/s, 0.00 to 3.50. X-axis: time (min), 0 to 170. Series: Disk Read (MB/s), Disk Write (MB/s).]

Q & A

Questions? @ontotext