Scaling Data in Magento

DESCRIPTION

In this technical session, we will look at how you can scale the database horizontally behind Magento. We will discuss the reasons for scaling through replication and how this may affect your infrastructure, deployment and Magento implementation. Replication brings many benefits but also some pitfalls and potential problems, for example with high rates of data change. We will present simple solutions using standard MySQL replication and also Tungsten by Continuent to bring high availability and high performance to MySQL. This will be backed by real data and metrics from one of the highest-volume Magento stores in the UK, showing how Magento can be deployed at scale with high availability to serve the UK, USA and Australia from a single implementation generating over $100 million in revenue.

Scaling Data in Magento

Alistair Stead, CTO

@alistairstead

Focused seasonal trends such as Cyber Monday have the capacity to melt tin

Server capacity can of course be expanded

hardware is inexpensive...

Back pressure

from lower systems will block PHP

When the DB is the bottleneck

adding more servers will only make it worse

I'm not a DBA

When your store is popular you have to cope and learn

Don't just copy what is on stackoverflow

identify your exact problem.

You have to learn to question the current state

You have to find people that can help!

After all engineering is not a solitary affair

So you have your store up and running...

All web servers are scaled and running nicely...

You have Magento configured for optimal running...

Now you have more traffic and things are slowing down...

What do you do?

Well...?

You can't do anything without instrumentation

Development instrumentation

will identify problems early

Production instrumentation

will show the real problems

We have metrics that say the DB is slowing us down

We need to take some actions...

But first...a brief interlude

Scaling, high availability and redundancy

All related but separate things

Scaling:

The ability to function within acceptable limits as the number of users increases

High availability:

The ability to facilitate continuous function following and during failure

Redundancy:

Duplication of critical systems so as to have no single point of failure

In a mission critical application

all these need to be balanced

In commerce

conversion rates are directly affected by the decisions made in these areas

Technology should facilitate conversion!

So where should we start?

The Apache / Nginx process is blocking

waiting on PHP...

PHP is waiting on the Database...

Step 1

Make your queries FASTer

MySQL Indexes

Identify missing indexes for a query and speed up the result
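A minimal sketch of spotting a missing index. It uses Python's built-in sqlite3 module purely as a stand-in because it runs anywhere; against MySQL you would run EXPLAIN on the slow query instead and the plan output looks different. The `orders` table, its columns and the index name are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the plan detail strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column is the readable detail

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

SQL = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index the engine has to scan every row.
plan_before = query_plan(conn, SQL, (42,))

# Add the missing index on the filtered column, then look at the plan again.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, SQL, (42,))
```

Comparing `plan_before` with `plan_after` shows the query moving from a table scan to a lookup through `idx_orders_customer`.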

Re-design queries

Not recommended for core queries but sometimes you have to...

However, send a patch back to Magento for inclusion in the next release

Step 2

Cache as much as you can

Expand query cache as much as you can

Can you fit your entire DB into memory?

Use Full Page Cache

Stating the obvious, but it protects the database at peak loads

Use proxy or edge caches

If you don't need to execute PHP, don't

At some point your cache MUST expire

On highly merchandised sites, cache is simply not as effective
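To make the expiry point concrete, here is a small sketch of a cache with a time-to-live. It is not how Magento's cache backends are implemented; the `TTLCache` class and the `loader` callback are hypothetical names used only for illustration.

```python
import time

class TTLCache:
    """Tiny in-process cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock   # injectable so expiry can be exercised in tests
        self._store = {}     # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]    # fresh entry: the database is not touched
        value = loader(key)  # missing or expired: go back to the source
        self._store[key] = (now + self.ttl, value)
        return value
```

The shorter the TTL (or the more often merchandising changes force a flush), the more often `loader` runs and the less the cache shields the database.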

But this is all for read operations...

What about writing data?

Lock wait timeout...

Have you seen this in your exception log?

Step 4

Ensure all tables are InnoDB

Some legacy code will have created MyISAM tables

Increase innodb_lock_wait_timeout

Don't, this is an anti-pattern

Step 5

Transaction isolation level

Use READ COMMITTED

instead of the MySQL default of REPEATABLE READ

Step 6

Reduce transaction size

Your transaction is not committed?

You're waiting for external service calls or non-critical writes...
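One way to shrink a transaction is to finish the slow external call before the transaction is opened, so that locks are held only for the critical writes. The sketch below shows the shape of that idea with Python's sqlite3 module as a stand-in database; `authorise_payment`, `place_order` and the `orders` table are hypothetical.

```python
import sqlite3

def authorise_payment(amount):
    """Hypothetical external service call; imagine it takes seconds."""
    return {"status": "approved", "amount": amount}

def place_order(conn, customer_id, amount):
    # 1. The slow external call happens BEFORE any transaction is opened,
    #    so no row locks are held while we wait on the network.
    payment = authorise_payment(amount)
    if payment["status"] != "approved":
        return None

    # 2. The transaction contains only the critical write and commits at once.
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
            (customer_id, amount),
        )
        return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
order_id = place_order(conn, 7, 99.0)
```

Non-critical writes such as logging would be left out of the transaction in the same way and handled separately.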

Step 7

Reducing non-critical write operations

Logging can be done somewhere else

<?xml version="1.0" encoding="UTF-8"?>
<frontend>
  <events>
    <controller_action_predispatch>
      <observers><log><type>disabled</type></log></observers>
    </controller_action_predispatch>
    <controller_action_postdispatch>
      <observers><log><type>disabled</type></log></observers>
    </controller_action_postdispatch>
    <customer_login>
      <observers><log><type>disabled</type></log></observers>
    </customer_login>
    <customer_logout>
      <observers><log><type>disabled</type></log></observers>
    </customer_logout>
    <sales_quote_save_after>
      <observers><log><type>disabled</type></log></observers>
    </sales_quote_save_after>
    <checkout_quote_destroy>
      <observers><log><type>disabled</type></log></observers>
    </checkout_quote_destroy>
  </events>
</frontend>

HTTP 101

Only modify state on HTTP POST

#TIP 1. This simple rule can help so many aspects of scaling

Off-load functionality to third parties

logging and tracking can be handled elsewhere

Move data and logic to the client

If state has not changed then the client should know all it needs to know

Step 8

Asynchronous write operations

Use job queues

non-critical write operations can be pushed to the queue

You then have to work with eventual consistency
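A rough sketch of pushing a non-critical write onto a queue, using only Python's standard library. A production setup would more likely use a dedicated queue service; `handle_request`, `worker` and `visitor_log` are hypothetical names, and the list stands in for a log table.

```python
import queue
import threading

write_queue = queue.Queue()
visitor_log = []                 # stands in for a non-critical log table

def worker():
    """Drain queued writes in the background, off the request path."""
    while True:
        job = write_queue.get()
        if job is None:          # sentinel: shut the worker down
            write_queue.task_done()
            break
        visitor_log.append(job)  # the deferred write
        write_queue.task_done()

def handle_request(url):
    """Request handler: enqueue the log write and return immediately."""
    write_queue.put({"url": url})
    return "200 OK"

thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_request(u) for u in ("/", "/checkout")]
write_queue.join()               # demo only: wait until the log has caught up
write_queue.put(None)
thread.join()
```

Between `handle_request` returning and the worker processing the job, the log is behind the request; that gap is the eventual consistency the application has to tolerate.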

Step 9

Clustering & replication

Introduce a slave database

Replicate data to the slave database

Use standard MySQL replication

Enable binary logging

Ensure you have compression enabled!

Or you will flood your internal network

Use MIXED Binary logging format

for quicker replication

STATEMENT Binary logging

Can cause PK clashes... in our experience...

Single threaded replication

Prior to MySQL 5.6 you only have one thread

Split read from write operations

Across the cluster

Write, read consistency

Can be resolved with module level connections

Module config.xml

<?xml version="1.0" encoding="UTF-8"?>
<config>
  <global>
    <resources>
      <module_read>
        <connection>
          <use>core_write</use>
        </connection>
      </module_read>
    </resources>
  </global>
</config>
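The config above points a module's read connection at core_write so that it never reads stale data from a slave. The same idea can be sketched at request level: reads go to a slave until the request has written something, after which they are pinned to the master. This is an illustrative sketch, not Magento or Tungsten code; the `Router` class and the connection names are hypothetical.

```python
class Router:
    """Route reads to a slave and writes to the master, with read-your-writes."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave
        self.pinned = False      # True once this request has written

    def pick(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            self.pinned = True   # later reads must be able to see this write
            return self.master
        if self.pinned:
            return self.master   # the slave may still be lagging behind
        return self.slave

router = Router(master="master-db", slave="slave-db")
first_read = router.pick("SELECT * FROM catalog_product WHERE id = 1")
write = router.pick("UPDATE cart SET qty = 2 WHERE id = 9")
second_read = router.pick("SELECT qty FROM cart WHERE id = 9")
```

Here `first_read` is served by the slave, while `write` and `second_read` both go to the master.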

Cluster & replication options

Tungsten

Replicator

A multi-threaded replication process over MySQL

Connection manager

A smart connection manager that can filter based on query content

High availability

Connection manager provides active service discovery

Connection Manager

runs on every server and allows the master to float around the cluster

Hot production upgrades

MySQL can be configured or upgraded with zero downtime

The Master Database

Can be moved to any node without config changes or downtime

Service discovery

All servers connect to their own Connection Manager

Next steps...

One setting does not rule them all

Use many tuned connections for specific operation types

Alternate replication architecture

Fan-in, for example, allowing multiple masters

Sharding

The smart connector can re-write the query on the fly

Gotchas...

Turn off automatic security updates or your cluster will FAIL

Ensure enough RAM for the transaction size

Do you have enough file descriptors?

This will be limited; ensure you have enough
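A quick way to check the limit from inside a process, sketched with Python's standard `resource` module (Unix only). The `fd_headroom` helper, the expected connection count and the reserve of 64 are assumptions for illustration, not figures from the talk.

```python
import resource  # available on Unix-like systems only

def fd_headroom(expected_connections, reserve=64):
    """Return (soft_limit, ok) for the current process's open-file limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, True        # no limit configured
    # Leave a reserve for log files, sockets and other descriptors.
    return soft, soft >= expected_connections + reserve

soft_limit, ok = fd_headroom(expected_connections=500)
```

If `ok` is false, the soft limit for the database and connection manager processes needs raising before the cluster sees peak traffic.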

Thank you!

Questions?

http://bit.ly/sdinmage