Achieving Horizontal Scalability

Alain Houf – Sales Engineer

Achieving Horizontal Scalability - InterSystems… · Horizontal Scalability Expand capacity of a cluster by adding servers to address workload requirements Advantages Challenges


2 | © InterSystems Corporation. All rights reserved. |

Scale Matters

InterSystems IRIS Database Platform lets you:

• Scale up and scale out

• Scale users and scale data

• Mix and match a variety of approaches to scalability, to suit your application and business needs


Scaling Up: Vertical Scalability


Vertical Scalability

Expand capacity of an individual server by adding CPU, memory, I/O & networking components to address workload requirements

Advantages

• Architectural simplicity

• Fine-grained balancing possible

Challenges

• Software complexity

• Hardware limitations

• Non-linear price / performance

• Requires careful upfront sizing


InterSystems SQL Parallel Query Execution

Leverage multiple CPU cores to serve up SQL query results

• Spawns 1 process per core = vertical scalability

• Most beneficial for aggregation queries on large datasets

Currently considered by the optimizer based on the %PARALLEL hint; fully transparent automation is under development
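The idea behind parallel execution of an aggregation query can be sketched in a few lines of Python. This is only an illustration of the split-and-combine principle, not how InterSystems SQL implements it: the function names are hypothetical, and threads stand in for the per-core processes mentioned above.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def partial_sum_count(chunk):
    """Work done by one worker: aggregate only its own slice of the rows."""
    return sum(chunk), len(chunk)


def parallel_average(values, workers=None):
    """Split the rows into one chunk per worker, then combine the partial results."""
    workers = workers or os.cpu_count() or 1
    chunk_size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


prices = list(range(1, 1001))
average = parallel_average(prices, workers=4)  # 500.5
```

Aggregations benefit most because each worker returns a tiny partial result (a sum and a count) regardless of how many rows it scanned.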


Scaling Out: Horizontal Scalability


Horizontal Scalability

Expand capacity of a cluster by adding servers to address workload requirements

Advantages

• Near-linear price / performance

• Leverage commodity, virtual & cloud-based systems

• Allows elastic scaling

Challenges

• Software complexity

• Emphasis on networking


Horizontally Scaling Users


InterSystems ECP Application Servers

The InterSystems Enterprise Cache Protocol (ECP) is a powerful mechanism to distribute data and application logic across database instances. It decouples the execution of application code from persisting the data it handles:

• ECP Application Server services user requests off a local database cache

• ECP Data Server persists updates to disk

Horizontally Scaling Cache: Allows the caches of multiple instances to each have an independent working set in memory, kept in sync with persisted data

• Fully transparent to application code
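The split between application servers and a data server can be pictured with a toy model. The sketch below is an assumption-laden illustration, not the ECP protocol itself: the class names are hypothetical, and cache synchronization is reduced to simple invalidation on write.

```python
class DataServer:
    """Toy stand-in for an ECP data server: the single place data is persisted."""

    def __init__(self):
        self.disk = {}          # persisted key -> value
        self.app_servers = []   # attached application servers

    def attach(self, app_server):
        self.app_servers.append(app_server)

    def read(self, key):
        return self.disk.get(key)

    def write(self, key, value, writer):
        self.disk[key] = value
        # Keep the other caches in sync by dropping their stale copies.
        for app_server in self.app_servers:
            if app_server is not writer:
                app_server.cache.pop(key, None)


class AppServer:
    """Toy stand-in for an ECP application server with a local cache."""

    def __init__(self, data_server):
        self.data_server = data_server
        self.cache = {}
        self.remote_reads = 0
        data_server.attach(self)

    def get(self, key):
        if key not in self.cache:
            self.remote_reads += 1
            self.cache[key] = self.data_server.read(key)
        return self.cache[key]

    def set(self, key, value):
        self.data_server.write(key, value, writer=self)
        self.cache[key] = value


data = DataServer()
app1, app2 = AppServer(data), AppServer(data)
app1.set("order:1", "NEW")
first = app2.get("order:1")    # fetched from the data server, then cached
again = app2.get("order:1")    # served from app2's local cache
app1.set("order:1", "FILLED")  # app2's cached copy is invalidated
latest = app2.get("order:1")   # fetched again, now "FILLED"
```

Each application server keeps only the keys its own users touch, which is what lets the working sets differ from instance to instance.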


Horizontally Scaling Data


InterSystems SQL Sharding

SQL Sharding allows table data to be partitioned over multiple instances

• Takes parallel SQL processing one step further by distributing the work over multiple servers rather than multiple processes on the same server

• Distributed data layout can further be exploited through parallel loading and 3rd party frameworks like Apache Spark

Horizontally Scaling Cache: Allows the caches of multiple instances to be added up to keep a larger overall working set in memory

• Fully transparent to application code
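Partitioning table data over instances comes down to a stable mapping from each row's shard key to a shard. The following minimal sketch assumes a CRC32-based hash and hypothetical helper names; the slides do not describe the distribution function InterSystems IRIS actually uses.

```python
import zlib
from collections import defaultdict


def shard_for(shard_key, shard_count):
    """Map a shard key to a shard number with a stable hash (CRC32 here)."""
    return zlib.crc32(str(shard_key).encode("utf-8")) % shard_count


def partition(rows, key_column, shard_count):
    """Distribute rows over shards according to their shard key."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[key_column], shard_count)].append(row)
    return shards


rows = [{"trade_id": i, "qty": 10 * i} for i in range(1, 101)]
shards = partition(rows, "trade_id", shard_count=4)
```

Because the mapping is deterministic, any node can work out where a given key lives without consulting a central directory.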


Independently Scaling Users and Data


InterSystems SQL Sharding


Sharded Architecture – Basics

Two main instance roles participate in a sharded cluster:

One Shard Master (DM)

• Entry point to the sharded namespace

• Stores table definitions, code, data for nonsharded tables

Any number of Shard Servers (DS)

• Provide scalable storage, cache capacity for sharded tables

• Sharded tables are partitioned across shard servers

• Nonsharded tables are mapped to shard servers via ECP

• Routine database is shared between all shard servers

• Transparent to user code, not accessed directly by users

[Diagram: a shard master and two shard servers]


Sharded Architecture – Query Processing

1. Application issues query to shard master

2. Shard master analyzes query for partitioning opportunities and sends shard-local queries to shard servers

3. Shard-local queries are resolved by shard servers and results sent back to master via ECP

4. Shard master aggregates shard-local query results and sends main query results back to application

[Diagram: application, shard master, and two shard servers]
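The four steps above follow a scatter-gather pattern, sketched here for an average over a sharded table. The function names and the in-memory "shards" are hypothetical stand-ins; in the real system the shard-local queries are SQL and the results travel over ECP.

```python
def shard_local_query(rows):
    """Step 3: each shard server aggregates only its own partition."""
    return sum(row["qty"] for row in rows), len(rows)


def master_query(shards):
    """Steps 2 and 4: send the shard-local query to every shard, then merge."""
    partials = [shard_local_query(rows) for rows in shards]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


shards = [
    [{"qty": 10}, {"qty": 20}],
    [{"qty": 30}],
    [{"qty": 40}, {"qty": 50}, {"qty": 60}],
]
average_qty = master_query(shards)  # 35.0
```

Note that the shards return sums and counts rather than local averages: averaging the three per-shard averages would give a different, incorrect result whenever the shards hold unequal numbers of rows.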


Sharded Architecture – Shard Master App Servers

Shard Master Application Servers (AM) scale user application workload while Shard Servers scale query processing

• Use ECP to read nonsharded tables from the Shard Master Data Server (DM)

• Connect directly to the Shard Servers for sharded table data

[Diagram: three AM nodes, one DM node, and four DS nodes]


Sharded Architecture – Query Shards

For demanding use cases, application servers can also be added to the shard level to spread the shard-local query workload:

• Data Shards (DS) persist a partition’s data

• Query Shards (QS) query the data of the corresponding Data Shard via ECP

For example, large ingestion workloads can be sent straight to the data shards while query shards reserve their cache for a concurrent analytical query workload

[Diagram: one DM, three DS nodes, and three QS nodes, with "ingest" and "analytic" workload labels]
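The workload separation described here can be illustrated with a small routing sketch. Everything in it is hypothetical: the class, the node names, and the round-robin choice among query shards are assumptions made for illustration, not documented InterSystems IRIS behavior.

```python
from itertools import cycle


class Shard:
    """One partition: a data shard plus the query shards that read from it."""

    def __init__(self, name, query_shard_names):
        self.data_shard = name
        self._query_shards = cycle(query_shard_names)

    def route(self, workload):
        """Ingestion goes to the data shard; analytics rotate over query shards."""
        if workload == "ingest":
            return self.data_shard
        if workload == "analytic":
            return next(self._query_shards)
        raise ValueError(f"unknown workload: {workload}")


shard = Shard("DS1", ["QS1a", "QS1b"])
targets = [shard.route(w) for w in ("ingest", "analytic", "analytic", "analytic")]
# targets == ["DS1", "QS1a", "QS1b", "QS1a"]
```

Keeping ingestion off the query shards means heavy writes do not evict the data that analytical queries rely on from cache.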


Joining Sharded Tables

Cosharded joins

• Equijoins on the user-defined shard keys of two or more tables can be executed locally on each shard

• Extremely efficient, scales well with number of tables and number of shards

Any set of sharded tables can be joined

• Each shard server can access data from other shards via ECP

• Efficient “shard tuple” algorithm assigns shard sets to each shard server

Sharded tables can be joined with nonsharded tables

• Shard servers access data from nonsharded tables via ECP
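Why cosharded joins are cheap can be checked with a small experiment: if two tables are placed by the same function of the join key, matching rows always land on the same shard, so joining shard by shard gives the same result as joining everything in one place. The table names, columns, and modulo placement rule below are assumptions for illustration only.

```python
def shard_of(key, shard_count):
    """Hypothetical placement rule: both tables use the same function of the key."""
    return key % shard_count


def shard_table(rows, key_column, shard_count):
    """Split a table into per-shard row lists by its shard key."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_of(row[key_column], shard_count)].append(row)
    return shards


def local_join(orders, fills):
    """Equijoin on order_id using only the rows passed in."""
    fills_by_order = {}
    for fill in fills:
        fills_by_order.setdefault(fill["order_id"], []).append(fill)
    return [
        (order["order_id"], order["symbol"], fill["qty"])
        for order in orders
        for fill in fills_by_order.get(order["order_id"], [])
    ]


orders = [{"order_id": i, "symbol": f"SYM{i % 3}"} for i in range(8)]
fills = [{"order_id": i % 8, "qty": 100 + i} for i in range(12)]

SHARDS = 3
order_shards = shard_table(orders, "order_id", SHARDS)
fill_shards = shard_table(fills, "order_id", SHARDS)

# Each shard joins its own slices; the union is the full join result.
cosharded = sorted(
    row for s in range(SHARDS) for row in local_join(order_shards[s], fill_shards[s])
)
single_node = sorted(local_join(orders, fills))
```

When the tables are not cosharded, rows that match may sit on different shards, which is where cross-shard access via ECP comes in.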


Leveraging Other Features of InterSystems IRIS Data Platform

Mirroring

• Sharding leverages mirroring to provide High Availability

• Fully supported for all data-storing components of sharded clusters (DM & DS)

• Automatic completion of sharded queries upon node failover

InterSystems Connector for Apache Spark

• Leverages sharded topologies: Spark workers connect directly to shards to execute local queries and do the aggregating work in Spark itself

JDBC

• Transparently makes direct parallel connections to shards for high speed data ingestion


Use Cases


Multi-Asset Global Trading System

• One of the top global investment banks, which processes 13% of global equities trading volume, runs its global trading system on top of the InterSystems data platform.

• More than 2 billion transactions and more than 6TB of data are generated every day.

• Has evaluated InterSystems IRIS for real-time data access and for short-term and long-term storage, replacing ECP app servers, Sybase ASE, Sybase IQ and Rainstor. InterSystems IRIS improves query performance by 300% and reduces cost by 70%.

Benchmark Service

• Another top global investment bank is evaluating InterSystems IRIS to replace its existing Sybase IQ for its benchmark service.

• Has found that InterSystems IRIS is up to 2x faster than another in-memory database, and 3x to 10x faster than Sybase IQ.


Use Case 1: Multi-Asset Global Trading System

Real Time Access and Data Storage on Private Cloud


Use Case 1: Current InterSystems SQL Environment

The trading system persists intraday transactions to hundreds of InterSystems SQL instances:

• They are divided into Data Servers (DS) and App Servers (AS)

• Interconnected by InterSystems ECP

• All of them are running on physical servers

• An AS needs about 3x the RAM of a DS (128GB vs 40GB)

To avoid additional load on the AS’s from non-trading-related queries, the customer has also set up Sybase ASE instances and replicates data from the trading system/InterSystems SQL environment to these ASE instances to serve those queries.

[Diagram: TSS/Hermes, TIS/Persistor, three AS nodes, and three Data Servers]


Use Case 1: Current Storage Infrastructure

In near real time, the customer replicates data from the trading system/Caché to Sybase ASE instances. Typically, one ASE instance holds data from more than one trading system/Caché instance for 7 days. At EOD, the customer dumps data from its trading system/Caché instances to Sybase IQ, where it is kept for up to 6 months, and to Rainstor, where it is kept forever.

Storage tiers by retention:

• Caché: intraday

• Sybase ASE: 7 days

• Sybase IQ: 6 months

• Rainstor: forever
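The tiering above amounts to a lookup from the age of a trade to the shortest-retention store that still holds it. The sketch below encodes that lookup; the function name is hypothetical, and treating "6 months" as 183 days and "intraday" as the same calendar day are simplifying assumptions.

```python
from datetime import date, timedelta

# Retention windows as described on the slide; 6 months is approximated as 183 days.
TIERS = [
    ("Caché", timedelta(days=0)),        # intraday
    ("Sybase ASE", timedelta(days=7)),
    ("Sybase IQ", timedelta(days=183)),
]
ARCHIVE = "Rainstor"                     # kept forever


def tier_for(trade_date, today):
    """Return the first store whose retention window still covers the trade date."""
    age = today - trade_date
    for name, max_age in TIERS:
        if age <= max_age:
            return name
    return ARCHIVE


today = date(2024, 6, 28)
same_day = tier_for(today, today)                          # "Caché"
last_week = tier_for(today - timedelta(days=3), today)     # "Sybase ASE"
last_quarter = tier_for(today - timedelta(days=100), today)  # "Sybase IQ"
old = tier_for(today - timedelta(days=400), today)         # "Rainstor"
```

A query spanning several age ranges has to touch several products, which is the operational burden the proposed architecture aims to remove.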


Use Case 1: InterSystems IRIS for Real Time Access

Proposed InterSystems IRIS Architecture

• The trading system components TIS and persistors will continue to store data into existing DS’s

• There will be no more expensive AS’s

• Cloud based InterSystems IRIS query cluster

• One or more InterSystems IRIS shard master(s)

• For each DS, there will be one or more IRIS query shard(s)

• Each node only requires 40GB RAM and no expensive storage

This cloud based InterSystems IRIS configuration will provide a real time, horizontally scalable query facility that can replace the current AS’s and Sybase ASE for intraday queries. It will improve query performance by 300% and cut hardware cost by 70%.

[Diagram: TSS/Hermes & client apps, TIS/Persistor, three DS nodes, three QS nodes, and one DM]


Use Case 1: InterSystems IRIS for Data Storage

InterSystems IRIS native data replication will move data from InterSystems Caché data servers to InterSystems IRIS data shards in near real time. The cloud based InterSystems IRIS data storage facility can hold 7 days, 30 days or 6 months of trading data.

[Diagram: TSS/Hermes and TIS/Persistor with a cluster of three DS, three QS and one DM; ASE/IQ/Rainstor clients with a second cluster of three DS, three QS and one DM]


Use Case 2: Benchmark Service

Succeed where Hadoop and Traditional Data Warehouse Fail to Deliver


Use Case 2: Background

The investment bank has 18,000 benchmarks (14,000 from external sources and 4,000 created internally), with a total data volume of 8TB.

Its asset managers use the benchmark service to compare the portfolios they manage for their clients against one or more benchmarks, typically at the end of the day.

Its real time strategy trading platform also uses the benchmark service to make trading decisions during trading hours.


Use Case 2: Challenges

The bank has a petabyte-scale data lake on Hadoop.

Complex SQL Joins

• The bank has created many curated SQL stores to serve enterprise applications/customers.

• Currently, Sybase ASE cannot keep up with application and customer demand.

Low Latency Requirements

• In-memory SQL solutions are expensive and/or unstable


Use Case 2: InterSystems IRIS Succeeds Where Others Fail

The bank deployed InterSystems IRIS on VMs provisioned from its private cloud

• Each VM has 4 cores, 32GB RAM and a 200GB internal disk.

• Different sharding strategies using different sharding keys (indexID, businessDate)

InterSystems IRIS is up to 2x faster than another distributed in-memory database and 3x to 10x faster than Sybase IQ in many test cases, and it is consistently fast across the board.


Q&A


Thank you.