40
Scalable web apps execution time vs development time Piotr Pelczar me@athlan.pl

Scalable Web Apps

Embed Size (px)

DESCRIPTION

Scalable web apps - execution time vs development time.

Citation preview

Page 1: Scalable Web Apps

Scalable web appsexecution time

vsdevelopment time

Piotr [email protected]

Page 2: Scalable Web Apps

Types of scaling

Horizontal scalingscale out

Vertical scalingscale up

Page 3: Scalable Web Apps

Server #3Server #2Server #1

Think about your app as a workernot single instance

Load balancer

App #1 App #2 App #3 App #4 App #5

OS

App

Page 4: Scalable Web Apps

Server #3

Server #2

Server #1

Load balancer

App #1 App #2

App #3 App #4

App #5Load balancer

Server #n

Think about your app as a workernot single instance

Page 5: Scalable Web Apps

Sessions

We need:

• Common• Fast• Persistent

Storage for sessions.

Page 6: Scalable Web Apps

Sessions

Server #3Server #2Server #1

Load balancer

App #1 App #2 App #3 App #4 App #5

Session storage

App

OS

Page 7: Scalable Web Apps

Sessions - Redis

• Key-value in memory database (hash-tabled)• Scalable up to 1k nodes• Partitioning with Query routing• Non blocking M-S replication on nodes• Clustered (currently not production ready)http://athlan.pl/symfony2-redis-session-handler/

Page 8: Scalable Web Apps

Redis - Partitioning with Query routing

Node #1 Node #2 Node #3

Query random

node

Miss Hit, abort

Also supported:• Client-side partitioning (app calls appropriate

node)• Proxy assisted partitioning (proxy selects

appropriate node)

Page 9: Scalable Web Apps

Centralized Logging

• Logs should be centrailzed to avoid taking notice to each node separately

• Approaches:– File replication (rsync + cron)– syslog (easy to integrate with log4j)• syslogd over UDP p:514• rsyslog over TCP, stores data in db

Page 10: Scalable Web Apps

Common storage, no local changes!

• Keep storage avaliable to all nodes– Symfony2 Gaufrette Bundle• FTP• Amazon S3• OpenCloud• AzureBlobStorage• Rackspace

Page 11: Scalable Web Apps

Architecture

Server #3Server #2Server #1

Load balancer

App #1 App #2 App #3 App #4 App #5

Session storage Files storage abstraction

Centralized logging

App

OS

OS

Page 12: Scalable Web Apps

Continuous Integration

• To keep all nodes up-to-date, you need CI• Automatize disabling nodes, building,

deploying– Jenkins CI

Page 13: Scalable Web Apps

Contineous Integration

1. Disable service on node2. Deploy/build app

1. Copy files2. Update db schema (liquibase, ORM schema

update)3. Execute scripts

3. Re-run service

Page 14: Scalable Web Apps

Balance the payload - HAProxy

• Very, very fast proxy!• Software TCP/HTTP load balancer• Different node selecting algorithms:– roudrobin (limit 4128)– static-rr– leastconn (lowest number of connections)

Yeah guys, this is logo :)

But no schema is needed just imagine how it works.

Page 15: Scalable Web Apps

Balance the payload - HAProxy

• You can check node’s status by pinging• Dead node is excluded from balancing strategy

vi /etc/haproxy/haproxy.cfgoption httpchk HEAD /check.txt HTTP/1.0server webA 192.168.0.102:80 checkserver webB 192.168.0.103:80 check

Page 16: Scalable Web Apps

Balance the payload - HAProxy

• Monitor node’s status by read stats from socket via socat.

echo "show stat" | socat /tmp/haproxy.sock stdio

Page 17: Scalable Web Apps

Balance the payload - HAProxy

• Monitor node’s status by native stats webapp console

Page 18: Scalable Web Apps

Nodes Monitoring - Zabbix

• Zabbix, centralized server monitoring

Page 19: Scalable Web Apps

Zabbix + HAProxy

• UserParameter=haproxy.qcur[*], echo "show stat" | socat /tmp/haproxy.sock stdio | grep -i '$1' | sed 's/,/\ /g' | awk '{print $$3}'

Page 20: Scalable Web Apps

Reverse Proxy and Varnish cache

http://tomayko.com/writings/things-caches-do

• Global virtual user = global cache

Page 21: Scalable Web Apps

Reverse Proxy – Expiration model

http://tomayko.com/writings/things-caches-do

Page 22: Scalable Web Apps

Reverse Proxy – Expiration model

http://tomayko.com/writings/things-caches-do

Page 23: Scalable Web Apps

Reverse Proxy – Validation model

http://tomayko.com/writings/things-caches-do

Page 24: Scalable Web Apps

Reverse Proxy – Validation model

http://tomayko.com/writings/things-caches-do

Page 25: Scalable Web Apps

Reverse Proxy and Varnish cache

Varnish:80

Apache:81

App

Page 26: Scalable Web Apps

Reverse Proxy and Varnish cache

Varnish:8080

Apache:8081

App

Varnish:8082

Apache:8083

App

HAProxy:80

Page 27: Scalable Web Apps

Reverse Proxy and Varnish cache

Apache:8081

App

Apache:8082

App

HAProxy:81

Varnish:80

Page 28: Scalable Web Apps

Varnish and ESI

<!DOCTYPE html><html><body><!-- ... some content --><!-- Embed the content of another page here --> <esi:include src="http://..." /><!-- ... more content --></body></html>

Page 29: Scalable Web Apps

Scaling databases - Master slave

Master

Slave Slave Slave

Write

Read

• All data redundancy

Page 30: Scalable Web Apps

MongoDB scaling

• Common models to spread data over nodes:– range keys– hash keys

• Many nodes on cheap machines• No all data redundancy in each node

Page 31: Scalable Web Apps

MongoDB – range-based keys

• Awesome for range queries (grab data from min nodes – Query isolation)

• Not good enough to distribute data over nodes in case of monotinic incemental

http://docs.mongodb.org

Page 32: Scalable Web Apps

MongoDB – hash-based keys

• Take notice: not good for range queries whilemerge-sorting, no Query isolation in this case

• Write scaling – Write to many nodes simultaneously (take notice to readers-writer lock, where write is exclusive)

http://docs.mongodb.org

Page 33: Scalable Web Apps

Mongodb sharding and clustering

http://docs.mongodb.org

Page 34: Scalable Web Apps

CQRS

• Command Query Responsibility Segregation– separate application service layers for writing and

readng from DB (possibility to use different data sources like RAM or DB)

Page 35: Scalable Web Apps

CQRS

• Examples– post-insert population cache• all SELECTs are from cache (even invalid)• consider LFU instead of LRU to invaidate cache

– pre-insert into memory• dump results periodicaly

In both approaches there is convenient to use Queues or data bus !

Page 36: Scalable Web Apps

Queues, RabbitMQ

• RabbitMQ is based on AMQP (Advanced Message Queuing Protocol)– point-to-point– publish-and-subscribe– queueing, routing

• AMQP is not JMS (Java Message Service is an API, not protocol)

• Happy Rabit is empty Rabbit– do not try to store any data (messages) in queue

system in persistent mode to keep HA

Page 37: Scalable Web Apps

• Simple queue

• Work queues(one consumer)

• Publish/Subscribe(many consumers)

Queues, RabbitMQ

Page 38: Scalable Web Apps

Box vs spread architecture.

• Box architecture– no scaling– easy to maintenance

Server

RabbitMQ DBRedis

Webapp Varnish

Page 39: Scalable Web Apps

Server #2

Box vs spread architecture.

• Spread architecture– High availability– more integrations, more administrative

Server #1

RabbitMQ

DB shardRedis

HAProxy

Varnish

Webapp

Server #3

DB shard Varnish

Webapp

Page 40: Scalable Web Apps

Scalable web appsexecution time

vsdevelopment time

Piotr [email protected]