TopStack Architecture Q3 2013 Update

TopStack Product Architecture 2013-Q3

An overview of Transcend Computing's TopStack product architecture.

TopStack Architecture

Q3 2013 Update

The basics

Overview

TopStack is a suite of services to extend Infrastructure as a Service (IaaS) solutions and deliver key Platform Services (PaaS)

TopStack delivers a clean-room implementation of many of Amazon’s most popular services

TopStack runs on private clouds as well as third party public clouds

The 2013-Q3 focus for TopStack is to act as a complement for OpenStack.

TopStack is available in both a Community Edition (open source) and an Enterprise Edition (commercial license & support).

Source Code Organization & Control

Source code for TopStack is stored in Git; the open source code is hosted on GitHub

Each service is stored in a separate repo

Common repo: ToughCore, for shared utility code

Common repo: ToughResources, for shared static assets

All repos have a master branch for current production code

Tag is applied for each production release

All repos have at least one development branch for current development

Additional feature branches are created for feature development, as needed

Build Management and Quality Assurance

Services are built with Ant build files, with Maven tasks for dependency resolution

Dependencies are resolved through local file copies (in dev mode)

Dependencies are resolved through Jenkins artifacts (in build mode)

Builds are managed through Jenkins continuous integration

All services include unit tests that run with each build

Services are deployed as part of continuous integration to Dev clouds

After each deploy, Java integration tests are run against the fresh deployment

Any failed integration tests will cause the build to be marked broken

Once a day, a “long running” set of integration tests is run

Long running tests spin up instances and test advanced connectivity

Continuous Deployment

Continuous deployment is performed by Jenkins, with jobs deploying to Dev, etc.

Deployments pushed to multiple cloud platforms, versions, …

[Diagram; labels: Cloud 1, Cloud 2]

Installation & Deployment

Installation package is a single package file (.tar.gz), output from continuous build

Unpacked, install package consists of:

Master installation shell script

Install guide (PDF)

Packaged services, to be deployed by installation script

Base Image configuration script

Installation may be re-run as needed to install/configure additional instances

Options to installation allow the installer to include/exclude particular services

Required supporting services are always installed
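
The include/exclude behavior described above can be sketched as a small selection function. This is only an illustration: the deck does not name the installer's options or say which services count as required, so the service names and the `select_services` helper below are hypothetical.

```python
# Hypothetical service names; the deck does not list which services are
# "required supporting services" or how the installer spells its options.
REQUIRED = {"dns53", "pubsub-queue", "mysql"}
OPTIONAL = {"elb", "rds", "sqs", "cloudwatch", "cloudformation"}


def select_services(include=None, exclude=None):
    """Return the set of services one installation run would deploy."""
    # With no include list, every optional service is a candidate.
    chosen = set(OPTIONAL) if include is None else set(include) & OPTIONAL
    chosen -= set(exclude or ())
    # Required supporting services are always installed, whatever was excluded.
    return chosen | REQUIRED


print(sorted(select_services(include=["rds", "sqs"], exclude=["sqs"])))
```

Because required services are added after exclusions are applied, an exclude option can never remove them, which matches the last bullet above.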

Deployment

Current tested deployment configuration:

OpenStack Grizzly or greater (older versions work, but are not commercially supported)

nova-compute with libvirt+KVM, libvirt+QEMU, libvirt+XenServer

nova-volume/Cinder, any iSCSI backend

nova-network/Quantum, single VLAN

Linux VM, Ubuntu 12.04 or greater

Services Offered to Customers

Elastic Load Balancer

Route 53

Relational Database Service

ElastiCache

Simple Queue Service

CloudWatch

CloudFormation

Elastic Beanstalk

Auto Scale

This deck won't cover these in any detail

Internal Services & Components

Internal Services (daemons):

Service registry & configuration

Orchestration & events

Job Scheduling

Common Components:

Common logging

Persistence

Instance configuration (Chef)

Authorization & access control

Quotas & metering

Cloud platform bindings

Instrumentation

Administration

Inter-Service Communication

DEPLOYMENT MODEL

Deployment Model - Evaluation/TopStack Lite

[Diagram; labels: TopStack Master VM; Tomcat 7; TopStack SLB; DNS53; SQS; CloudWatch; RDS; Other TopStack Services; Apache 2; StackStudio; Chef Server; PubSub Queue; TopStack DNS53; MySQL; Cloud Image Repo; TS Base Image]

Deployment Model - TopStack Enterprise

[Diagram; labels: TopStack SLB VM (TopStack SLB); TopStack Service VM1; TopStack Service VM2; Tomcat 7; DNS53; SQS; CloudWatch; RDS; Other TopStack; StackStudio VM (Apache 2, StackStudio); Chef VM (Chef Server); DB VM (MySQL); Queue VM (PubSub Queue); DNS53 VM (DNS53); Cloud Image Repo; TS Base Image; [Optional]]

Deployment Model - TopStack HA

[Diagram; labels: TopStack ELB VM1; TopStack ELB VM2; TopStack Service VM1; TopStack Service VM2; TopStack Service VMn; StackStudio VM (Apache 2, StackStudio); DB Active (MySQL); DB Standby (MySQL); Chef Cluster (Chef VM, Chef Server); Queue Cluster (PubSub Queue); DNS53 Cluster (DNS53); Cloud Image Repo; TS Base Image; [Optional]]

Internal Services

Service Registration & Configuration

All services must register with DNS53 on startup

DNS53 maintains private zone for Transcend internal use

Installation creates addresses for TopStack hosts

Registration creates CNAMEs for individual services in DNS

DNS information is used by Transcend load balancer to direct traffic
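
As a rough illustration of the registration flow above, the sketch below models a private zone in memory: installation adds host addresses, services register names that point at hosts, and a load balancer resolves a service name to candidate addresses. The `PrivateZone` class, the zone name, and the host names are assumptions made for the example, not the DNS53 API. Unlike real DNS, it lets one service name point at several hosts to keep the example short.

```python
class PrivateZone:
    """In-memory stand-in for a private DNS zone (illustrative only)."""

    def __init__(self, zone):
        self.zone = zone
        self.addresses = {}  # fully qualified host name -> IP address
        self.cnames = {}     # fully qualified service name -> list of host names

    def add_host(self, host, ip):
        # Done once per host, at installation time.
        self.addresses[f"{host}.{self.zone}"] = ip

    def register_service(self, service, host):
        # Done by each service on startup.
        target = f"{host}.{self.zone}"
        if target not in self.addresses:
            raise KeyError(f"unknown host: {target}")
        self.cnames.setdefault(f"{service}.{self.zone}", []).append(target)

    def resolve(self, service):
        # What a load balancer would ask: which addresses serve this name?
        targets = self.cnames.get(f"{service}.{self.zone}", [])
        return [self.addresses[t] for t in targets]


zone = PrivateZone("topstack.internal")  # hypothetical zone name
zone.add_host("svc1", "10.0.0.11")
zone.add_host("svc2", "10.0.0.12")
zone.register_service("rds", "svc1")
zone.register_service("rds", "svc2")
print(zone.resolve("rds"))
```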

Orchestration & Events

[Diagram; labels: Client; Request; Response; SLB; TopStack Service; Request Handler Thread; Open Transaction; Create WF; Commit Transaction; Continuation; Request ID Cache; Quartz; TopStack Workflow; RDSWork; CFWork; SLBWork; Workflow Step 1; Workflow Step 2; Cloud Op Task; Notify Task Complete; Workflow Error State; Rollback Resources; IaaS Provider]

Orchestration & Events

Services only own workflow steps and a light servlet for request/response

Pub-sub mechanism between TopStack API front end and service workers

ZeroMQ (http://www.zeromq.org)

Protocol Buffers as serialization format for ZeroMQ

Workflow solution to handle multiple asynchronous service steps:

Mule ESB (http://www.mulesoft.org/)

Asynchronous requests from HTTP handlers

Tomcat 7 with Servlet 3.0 asynchronous servlets (continuations)

Request IDs to marry asynchronous responses to requests
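
The last point, matching asynchronous responses to the requests that caused them, can be sketched with a small request ID cache. This is an assumption-laden illustration using Python futures in place of servlet continuations; the `RequestIdCache` class and its method names are hypothetical.

```python
import uuid
from concurrent.futures import Future


class RequestIdCache:
    """Maps request IDs to pending responses (illustrative only)."""

    def __init__(self):
        self._pending = {}

    def open(self):
        # Called by the request handler before handing work to the workflow.
        request_id = str(uuid.uuid4())
        future = Future()
        self._pending[request_id] = future
        return request_id, future

    def complete(self, request_id, response):
        # Called when a worker reports back; unknown or repeated IDs are ignored.
        future = self._pending.pop(request_id, None)
        if future is None:
            return False
        future.set_result(response)
        return True


cache = RequestIdCache()
request_id, pending = cache.open()
cache.complete(request_id, {"status": "creating"})
print(pending.result(timeout=1))
```

The handler holds only the future; the worker needs only the request ID, so neither side has to know about the other.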

Workflow

Many services consist of multiple operations, both synchronous and asynchronous

For example, when a relational database is created:

An instance must be spun up

Volume is created (in parallel)

Public IP must be associated

Instance startup is complete

Volume is attached

Database installation is performed

etc.

Any workflow step may fail, in which case:

Allocated resources must be torn down, freed

Failure must be reported, handled appropriately
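
The failure handling above can be sketched as a runner that pairs each step with an undo action and unwinds completed steps in reverse order when a later step fails. This is a minimal sketch, not the deck's workflow engine: `run_workflow` and `WorkflowError` are hypothetical names, and the steps run sequentially even though the deck notes that some run in parallel.

```python
class WorkflowError(Exception):
    """Raised after a failed step has been rolled back."""


def run_workflow(steps):
    """Run (name, do, undo) steps in order; roll back on the first failure."""
    completed = []
    for name, do, undo in steps:
        try:
            do()
        except Exception as exc:
            # Tear down already-allocated resources, newest first.
            for _, previous_undo in reversed(completed):
                previous_undo()
            raise WorkflowError(f"step {name!r} failed: {exc}") from exc
        completed.append((name, undo))
    return [name for name, _ in completed]


log = []


def no_addresses():
    raise RuntimeError("no free addresses")


steps = [
    ("create instance", lambda: log.append("instance up"),
     lambda: log.append("instance terminated")),
    ("create volume", lambda: log.append("volume created"),
     lambda: log.append("volume deleted")),
    ("associate public IP", no_addresses, lambda: None),
]

try:
    run_workflow(steps)
except WorkflowError as err:
    log.append(str(err))  # the failure is reported after cleanup

print(log)
```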

Job Scheduling

Scheduled jobs are executed using Quartz Enterprise Job Scheduler

Quartz runs in clustered configuration

Jobs are executable by any TopStack instance

Scheduled jobs are stored in relational DB

Services may add new jobs to be executed during e.g. maintenance windows

Quartz is a source of workflow jobs

For example, on setting RDS maintenance window, a Quartz job is created

When Quartz job fires, RDS code is invoked to submit workflow
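
The maintenance-window example can be sketched with a tiny in-memory job store. It stands in for the scheduler only to show the shape of the interaction (a job is stored, fires when due, and submits a workflow); it is not Quartz, and `JobStore`, the timestamps, and the workflow name are assumptions.

```python
import heapq


class JobStore:
    """Minimal time-ordered job store (illustrative stand-in for a scheduler)."""

    def __init__(self):
        self._jobs = []
        self._sequence = 0  # tie-breaker so callables are never compared

    def schedule(self, fire_at, action):
        heapq.heappush(self._jobs, (fire_at, self._sequence, action))
        self._sequence += 1

    def fire_due(self, now):
        """Run every job whose fire time has passed; return how many ran."""
        fired = 0
        while self._jobs and self._jobs[0][0] <= now:
            _, _, action = heapq.heappop(self._jobs)
            action()
            fired += 1
        return fired


submitted_workflows = []
store = JobStore()

# Setting a maintenance window creates a job; firing it submits a workflow.
store.schedule(100, lambda: submitted_workflows.append("rds-maintenance"))

store.fire_due(now=99)   # too early, nothing happens
store.fire_due(now=100)  # window opens, workflow is submitted
print(submitted_workflows)
```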

Common Components

Common Logging

Logging from all TopStack services is performed through SLF4J library

Logging implementation is typically Log4J

Logging may be directed to syslog (including TCP) or simple files

Configuration provides opportunity for aggregation, mining

Authorization & Access Control

Each TopStack account will require an active IaaS cloud credential set

IaaS credentials are encrypted at rest

Actions are performed using credentials associated with TopStack account

IaaS authorization and access limits define TopStack limits

Instance Configuration

Chef Server

Deployment includes an embedded Chef server (http://www.opscode.com/chef)

Embedded Chef includes a set of Transcend recipes to build up resources

Chef Client

Transcend Base Image burns a Chef client into the image

As new instances are started by TopStack, a Chef configuration and role are injected

Instances dial-back to TopStack as the final step of configuration to become ready

Persistence

Configuration and event data are stored in a relational database (default MySQL)

Data access is through a DAO layer and Hibernate, an O/R mapping layer

Cloud Platform Bindings

TopStack configuration requires a cloud “flavor” as input: OpenStack, Eucalyptus, etc.

IaaS cloud must provide the core operations used by TopStack (or equivalents):

Create/Terminate VM Instance

Allocate/Release IP Address

Associate/Disassociate IP Address

Describe Instances

Create/Delete Security Group

Describe Security Groups

Authorize/Revoke Security Ingress

Create/Delete Volume

Describe Volume
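
One way to picture a cloud binding is as an interface that every supported flavor implements, selected by the configured flavor name. The sketch below covers only three of the operations listed above and uses an in-memory fake in place of a real cloud; `CloudBinding`, `InMemoryCloud`, `binding_for`, and the flavor key are hypothetical names, not TopStack classes.

```python
from abc import ABC, abstractmethod


class CloudBinding(ABC):
    """Subset of the core IaaS operations a binding would have to provide."""

    @abstractmethod
    def create_instance(self, image_id): ...

    @abstractmethod
    def terminate_instance(self, instance_id): ...

    @abstractmethod
    def describe_instances(self): ...


class InMemoryCloud(CloudBinding):
    """Fake cloud used here so the example runs without a real IaaS."""

    def __init__(self):
        self._instances = {}
        self._counter = 0

    def create_instance(self, image_id):
        self._counter += 1
        instance_id = f"i-{self._counter:04d}"
        self._instances[instance_id] = image_id
        return instance_id

    def terminate_instance(self, instance_id):
        del self._instances[instance_id]

    def describe_instances(self):
        return dict(self._instances)


# The configured "flavor" selects the binding; real flavors would map to
# OpenStack, Eucalyptus, and so on.
BINDINGS = {"in-memory": InMemoryCloud}


def binding_for(flavor):
    try:
        factory = BINDINGS[flavor]
    except KeyError:
        raise ValueError(f"unsupported cloud flavor: {flavor}") from None
    return factory()


cloud = binding_for("in-memory")
instance_id = cloud.create_instance("ts-base-image")
print(cloud.describe_instances())
```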

Quotas & Metering

All quotas enforced by IaaS provider apply to TopStack instances as well

Some quota is consumed by TopStack constructs that map to quota items

E.g., RDS security group consumes an IaaS security group
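
The security group example can be sketched as quota accounting in which a TopStack construct draws on the underlying IaaS allowance. The limit values, `IaasQuota`, and `create_rds_security_group` are assumptions for illustration; in practice the IaaS provider enforces the limit itself.

```python
class QuotaExceeded(Exception):
    pass


class IaasQuota:
    """Tracks usage against provider limits (illustrative only)."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.used = {item: 0 for item in limits}

    def consume(self, item, amount=1):
        if self.used[item] + amount > self.limits[item]:
            raise QuotaExceeded(item)
        self.used[item] += amount


def create_rds_security_group(quota, name):
    # An RDS security group is backed by one IaaS security group,
    # so it counts against the same quota item.
    quota.consume("security_groups")
    return {"name": name, "backed_by": "iaas-security-group"}


quota = IaasQuota({"security_groups": 2})
create_rds_security_group(quota, "db-a")
create_rds_security_group(quota, "db-b")
print(quota.used)
```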

Instrumentation

All TopStack hosts are monitored as CloudWatch instances

Installation process configures hosts

Metrics are available through normal CloudWatch APIs

All TopStack service hosts expose basic management information

All hosted services are available, with service status

Service workers (workflow steps) maintain “health” information

Count of tasks processed

Count of tasks with abnormal outcome

Transactions processed per second

Collected via metrics (http://metrics.codahale.com/)
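
The three health figures above can be sketched as a small counter object. This is a plain-Python stand-in written for illustration, not the Metrics library the deck refers to; `WorkerHealth` and its injectable clock are hypothetical.

```python
import time


class WorkerHealth:
    """Per-worker health counters (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so rates can be checked deterministically
        self._started = clock()
        self.processed = 0
        self.abnormal = 0

    def record(self, ok=True):
        self.processed += 1
        if not ok:
            self.abnormal += 1

    def per_second(self):
        elapsed = self._clock() - self._started
        return self.processed / elapsed if elapsed > 0 else 0.0


health = WorkerHealth()
health.record()
health.record(ok=False)
print(health.processed, health.abnormal)
```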

Administration

TopStack Enterprise Edition provides an Administration Console

Console runs on each TopStack host

Allows central administration of services

Allows provisioning of user accounts

Provides information on active services, failure rates, scheduled jobs

Inter-Service Communication

TopStack services communicate with each other only as workflow steps

Subsequent workflow steps are routed through Pub/Sub queue

Loosely coupled, via workflow
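
The loose coupling described above can be sketched with an in-memory publish/subscribe queue: a service never calls another service directly, it only publishes the next workflow step to a topic. The `PubSubQueue` class and the topic names are hypothetical and stand in for the real queue.

```python
from collections import defaultdict, deque


class PubSubQueue:
    """In-memory publish/subscribe queue (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._pending = deque()

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        self._pending.append((topic, message))

    def drain(self):
        """Deliver queued messages, including ones published by handlers."""
        delivered = 0
        while self._pending:
            topic, message = self._pending.popleft()
            for handler in self._subscribers[topic]:
                handler(message)
                delivered += 1
        return delivered


queue = PubSubQueue()
trace = []


def rds_step(message):
    trace.append(f"rds handled {message}")
    # The next step is routed through the queue, not called directly.
    queue.publish("slb.work", message)


def slb_step(message):
    trace.append(f"slb handled {message}")


queue.subscribe("rds.work", rds_step)
queue.subscribe("slb.work", slb_step)
queue.publish("rds.work", "request-1")
queue.drain()
print(trace)
```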

Non-functional requirements

High Availability

TopStack service hosts run in parallel on different VMs; scale-out architecture

VMs may be removed from service & load will redistribute across remaining instances

Workflows in progress will be continued by other instances

TopStack persistence tier may be run in master/slave or cluster configuration
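
The redistribution of load when a VM is removed can be sketched with a round-robin pool. Round-robin is an assumption made for the example, since the deck does not say which balancing policy is used, and `ServicePool` is a hypothetical name.

```python
class ServicePool:
    """Round-robin routing across service hosts (illustrative only)."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self._next = 0

    def remove(self, host):
        # Taking a VM out of service; later requests skip it.
        self.hosts.remove(host)

    def route(self):
        if not self.hosts:
            raise RuntimeError("no service hosts available")
        host = self.hosts[self._next % len(self.hosts)]
        self._next += 1
        return host


pool = ServicePool(["vm1", "vm2", "vm3"])
before = [pool.route() for _ in range(3)]
pool.remove("vm2")
after = [pool.route() for _ in range(4)]
print(before, after)
```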

Scalability

TopStack host machines can run any or all TopStack services

TopStack endpoints are load balanced across available service hosts

Many service hosts can run in an environment; new hosts register services on start

TopStack persistence tier scales vertically to support large transaction volumes

Portability

The Dasein cross-cloud library allows TopStack to operate against the most popular clouds

TopStack assumes only core IaaS services are available

Most clouds provide core IaaS services, or services which may be mapped to IaaS

Security

TopStack services are secured with an access key and a secret key/password

Optionally, customer can add HSM for increased security

Secret key/password is not transmitted without encryption

Enterprise Edition provides additional OS level lock-downs (PCI DSS)