Scaling APIs: Predict, Prepare for, Overcome the Challenges

Scaling APIs: Feeding your Speeds

April 5, 2012

Greg Brail @gbrail

Brian Pagano @brianpagano


groups.google.com/group/api-craft

youtube.com/apigee

IRC channel #api-craft on freenode

New!

What is scale to an API?

Developers: Lots of developers building apps

Apps: Lots of apps for end users to use

End users: Millions and millions of app users

Versions: Lots of API versions to manage

API calls: All of these things result in API calls…

Today’s Topic: API Calls

Today we are going to focus on handling huge numbers of API calls in an API infrastructure

Ever been in a meeting where someone said, “let’s not talk about ‘speeds and feeds’ today?”

This is not that meeting.

Why do APIs need to scale?

Thousands of app developers… Each one must be managed, signed up, etc.

Thousands of apps… Each one has credentials that need to be validated

Millions of end users… Each one has one or more OAuth access tokens

All of this results in lots of API calls. For example:

One million end users,

Making 1000 API calls a day,

Results in one billion API calls a day,

Or about 11,600 API calls per second on average
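The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the averaging step; the function name `average_tps` is made up for illustration, and real traffic will peak well above the daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def average_tps(end_users: int, calls_per_user_per_day: int) -> float:
    """Average API calls per second for a given daily load."""
    return end_users * calls_per_user_per_day / SECONDS_PER_DAY


calls_per_day = 1_000_000 * 1000
print(f"{calls_per_day:,} API calls per day")
print(f"{average_tps(1_000_000, 1000):,.0f} API calls per second on average")
```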

Tracking API calls

Today we’ll mainly talk about throughput

Measured in API calls or transactions per second (tps)

For an API, usually as the number of users increases, throughput increases

As throughput increases, latency often increases too

It’s not enough just to handle lots of throughput – it’s important to handle it with a reasonable amount of latency
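One way to keep both numbers in view is to report throughput and latency together for the same measurement window. The sketch below assumes you already have a list of per-call latencies in milliseconds from a load test; `summarize` and its field names are hypothetical.

```python
import statistics


def summarize(latencies_ms, window_seconds):
    """Report throughput and latency for calls observed in one time window."""
    return {
        "tps": len(latencies_ms) / window_seconds,
        "median_ms": statistics.median(latencies_ms),
        # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile
        "p95_ms": statistics.quantiles(latencies_ms, n=20)[-1],
    }


# Synthetic sample: 90 fast calls and 10 slow ones observed over 10 seconds.
sample = [50] * 90 + [500] * 10
print(summarize(sample, window_seconds=10))
```

The median looks healthy in this sample while the 95th percentile does not, which is why an average or median alone can hide a latency problem.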

What Limits Scale?

Disk

Network

CPU

Memory

Database

App server

API Proxy

Load balancer

Cache servers

What are some limits?

Seek time

Rotational speed

Transfer speed

Clock speed

Number of cores

Amount of RAM

Database design & tuning

App server coding & config

Proxy configuration

Load balancer policies

Cache configuration

And many more…

Some examples

We’re going to talk about things to look at as throughput grows from one level of traffic to another…

1 tps

10 tps

100 tps

1000 tps

10,000 tps

100,000 tps and beyond…

Image from valdosta.edu

At 1 transaction per second

86,400 per day / 2.5 million per month

Almost everything can handle this.

What about the database?

If each API call makes several big SQL queries, it may not!

Strategies for 1 tps

Test (always)

Tune the database installation

Tune the database design

Monitor query performance

Test again!
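As a rough illustration of "monitor query performance" and "tune the database design", the sketch below uses Python's built-in `sqlite3` module to compare a query's plan and timing before and after adding an index. The `orders` table, its columns, and the index name are invented for the example; a production database would have its own tooling for this.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(100_000)],
)

QUERY = "SELECT COUNT(*), SUM(total) FROM orders WHERE customer_id = ?"


def query_plan(sql, params):
    """Return SQLite's description of how it will execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail


def timed(sql, params):
    """Run the query once and return (result row, elapsed seconds)."""
    start = time.perf_counter()
    result = conn.execute(sql, params).fetchone()
    return result, time.perf_counter() - start


plan_before = query_plan(QUERY, (42,))
result_before, secs_before = timed(QUERY, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after = query_plan(QUERY, (42,))
result_after, secs_after = timed(QUERY, (42,))

print("before:", plan_before, f"({secs_before * 1000:.2f} ms)")
print("after: ", plan_after, f"({secs_after * 1000:.2f} ms)")
```

The result is the same in both runs; only the plan changes, from a full table scan to an index search.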

image from istockphoto.com

At 10 transactions per second

864,000 per day / 25 million per month

Most infrastructure can still handle this.

What about the application server?

Is the app well-designed enough?

Does it make an excessive number of database calls?


Strategies for 10 tps

Ensure that the app server is properly optimized

Do API calls make the minimum number of database calls?

Do API calls depend on large numbers of external services?
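A common way an API call ends up making too many database calls is the "N+1" pattern: one query for a list, then one more query per item. The sketch below counts queries for both approaches using `sqlite3`; the `users` and `posts` tables and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def posts_n_plus_one():
    """One query for the users, then one more query per user."""
    result = {}
    for user_id, name in run("SELECT id, name FROM users ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE user_id = ? ORDER BY id", (user_id,)
        )
        result[name] = [title for (title,) in rows]
    return result


def posts_single_query():
    """The same data fetched with a single JOIN."""
    result = {}
    rows = run(
        "SELECT u.name, p.title FROM users u "
        "LEFT JOIN posts p ON p.user_id = u.id ORDER BY u.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result


query_count = 0
slow = posts_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = posts_single_query()
single_queries = query_count

print(n_plus_one_queries, "queries vs", single_queries)
```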

image from istockphoto.com


At 100 transactions per second

8.6 million per day / 259 million per month

(Now we are starting to get somewhere)

RDBMSs may struggle

Less-efficient app servers may struggle

“Free” tiers on hosting platforms aren’t an option


Strategies for 100 tps

Database optimization and tuning are critical here

Allocate fast storage, and lots of it

Allocate lots of memory

Tune the database to use it!

Find bad queries and fix or optimize them

App server tuning is critical here

Are there enough threads in the thread pool?

Are there enough processes?
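A rough starting point for "are there enough threads?" is Little's law: the number of requests in flight equals throughput multiplied by latency. The sketch below applies that rule and then demonstrates the effect with a simulated blocking call. The function names, the 1.5x headroom factor, and the sleep-based fake call are all assumptions for illustration, not a tuning recommendation.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def required_workers(target_tps, avg_latency_ms, headroom=1.5):
    """Little's law: in-flight requests = throughput x latency, plus headroom."""
    return math.ceil(target_tps * avg_latency_ms * headroom / 1000)


def fake_api_call(_):
    """Stand-in for a blocking call that spends 20 ms waiting on a backend."""
    time.sleep(0.02)
    return 200


def measure_tps(workers, calls=16):
    """Push a fixed number of calls through a pool and report calls per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(fake_api_call, range(calls)))
    return len(statuses) / (time.perf_counter() - start)


print("workers for 100 tps at 200 ms:", required_workers(100, 200))
print(f"1 worker:  {measure_tps(1):.0f} tps")
print(f"8 workers: {measure_tps(8):.0f} tps")
```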

Image from http://www.jigzone.com


At 1000 transactions per second

86 million per day / 2.5 billion per month

Now everything may start to break…

What is the mix between reads and writes?


Strategies for 1000 tps

Understand the mix between reads and writes

Cache the reads as much as you can

If you can cache them closer to the client, even better

Understand your app server performance

Faster app servers (like Java) should still be able to handle this load

RoR, Python, PHP, etc. will require much bigger clusters

Stateless app servers are your friend!
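To make "cache the reads" concrete, here is a minimal read-through cache with a time-to-live, as a sketch only. `TTLCache`, `load_product`, and the 30-second TTL are hypothetical; a real deployment would more likely use a shared cache server or an HTTP cache in front of the API.

```python
import time


class TTLCache:
    """Read-through cache that keeps each entry for a fixed time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader(key) only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = loader(key)
        self._entries[key] = (value, now + self._ttl)
        return value


backend_reads = 0


def load_product(product_id):
    """Stand-in for an expensive database read."""
    global backend_reads
    backend_reads += 1
    return {"id": product_id, "name": f"product-{product_id}"}


cache = TTLCache(ttl_seconds=30)
for _ in range(1000):
    cache.get_or_load(7, load_product)

print("backend reads:", backend_reads, "hits:", cache.hits, "misses:", cache.misses)
```

With a read-heavy workload like this one, 1000 API calls turn into a single backend read.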

More strategies for 1000 tps

Can the database handle the load?

It can if most transactions are reads

And you cache as much as you can

Otherwise it’s time to scale the database layer

Sharded RDBMSes

Or a scalable NoSQL database works here
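The core of a sharded setup is a routing function that maps each key to one database. The sketch below shows simple hash-modulo routing; `shard_for`, the key format, and the choice of four shards are assumptions for illustration. Note that plain modulo routing moves most keys when the shard count changes, which is why some systems use consistent hashing instead.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Route 10,000 hypothetical user keys across 4 shards and check the spread.
counts = Counter(shard_for(f"user-{i}", 4) for i in range(10_000))
print(sorted(counts.items()))
```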

At 10,000 transactions per second

864 million per day / 25 billion per month

If most transactions are reads, caching is your friend

Otherwise, this is serious business

No single database can handle this

Few single app servers can handle this

If API calls are large, what will the bandwidth be?

Strategies for 10,000 tps

Caching is even more essential

Even a simple cache can handle this load on one or two boxes

Database writes are problematic

No single database server can write 10,000 times per second

Scalable, eventually consistent databases (like Cassandra) can scale this big

More for 10,000

App servers

You’ll need a cluster of app servers no matter what!

What about session management?

What about load balancing?
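One possible answer to the session question, sketched here under stated assumptions, is to keep app servers stateless by putting the session data in a signed token, so that any server behind the load balancer can validate a request without shared session storage. The secret, the claim names, and the token format below are all made up for the example; it omits expiry and key rotation, and a real system would use a vetted standard token format.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-change-me"  # hypothetical key shared by all app servers


def issue_token(claims: dict) -> str:
    """Encode the claims and append an HMAC signature over the encoded payload."""
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode("utf-8")
    ).decode("ascii")
    signature = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return payload + "." + signature


def verify_token(token: str):
    """Return the claims if the signature is valid, otherwise None."""
    payload, sep, signature = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user": "ann", "app": "demo-app"})
print(verify_token(token))          # claims come back on any server with the key
print(verify_token(token + "00"))   # a tampered token is rejected
```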

100,000 API calls per second

8.6 billion per day!

Now your API is truly impressive (either that or it is very poorly designed!)

You will need racks of infrastructure no matter what!

Some other considerations

API design

Client design

Latency

Bandwidth

What about API design?

Every API call has overhead:

TCP connection / SSL handshake / load balancer CPU / API proxy CPU / App server CPU and thread pool / database connections / disk I/O…

Do you need to make so many?

Can you design your APIs to support fewer high-value API calls?

Can you have “batch” calls in your API?
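A "batch" call lets a client send several operations in one request, so the per-call overhead listed above is paid once. The sketch below shows one possible shape for such a handler; the operation format, `handle_batch`, and the `get_user` method are hypothetical rather than any particular API's design.

```python
USERS = {1: {"id": 1, "name": "ann"}, 2: {"id": 2, "name": "bob"}}


def get_user(user_id):
    """Single-item handler returning (status, body)."""
    user = USERS.get(user_id)
    if user is None:
        return 404, {"error": "no such user"}
    return 200, user


HANDLERS = {"get_user": get_user}


def handle_batch(operations):
    """Run each operation and return one result per operation, in order."""
    results = []
    for op in operations:
        handler = HANDLERS.get(op.get("method"))
        if handler is None:
            results.append({"status": 400, "body": {"error": "unknown method"}})
            continue
        status, body = handler(**op.get("params", {}))
        results.append({"status": status, "body": body})
    return results


batch = [
    {"method": "get_user", "params": {"user_id": 1}},
    {"method": "get_user", "params": {"user_id": 99}},
    {"method": "delete_everything"},
]
responses = handle_batch(batch)
print([r["status"] for r in responses])
```

Each operation gets its own status, so one failure does not hide the results of the others.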

What about the client?

Can client apps use the API more efficiently?

Don’t make the same API calls over and over

Utilize compression

Utilize conditional requests in HTTP

Which means that the API server should support them!

Request only the data that’s needed

Which means that the API server should trim responses

Or paginate them
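On the server side, supporting conditional requests and compression can look roughly like the sketch below. It is deliberately simplified: `respond` is a hypothetical handler, header matching is an exact string comparison that ignores weak validators and multi-value `If-None-Match` lists, and the same ETag is reused for compressed and uncompressed bodies.

```python
import gzip
import hashlib
import json


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(resource: dict, request_headers: dict):
    """Return (status, headers, body), honoring If-None-Match and gzip."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client already has this version: send no body at all.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag, "Content-Type": "application/json"}
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return 200, headers, body


resource = {"id": 7, "items": list(range(200))}

status, headers, body = respond(resource, {"Accept-Encoding": "gzip"})
print(status, len(body), "bytes compressed")

status2, headers2, body2 = respond(resource, {"If-None-Match": headers["ETag"]})
print(status2, len(body2), "bytes")
```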

What about latency?

Latency kills user experience!

How can the API server reduce it?

Remove steps in the processing flow through caching

Cache closer to the API clients

What about the network?

What kind of network connection do you have?

For instance, in EC2 you get 1 Gbps,

Or about 125 megabytes / second

At 10,000 tps, for instance, that’s about 12.5K per API call
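The bandwidth budget per call is simple division, sketched below. The exact figures depend on whether megabytes and kilobytes are counted in decimal or binary units; this sketch uses decimal units and ignores protocol overhead, so treat the result as an upper bound. The function name is hypothetical.

```python
def bytes_per_call(link_bits_per_second: float, tps: float) -> float:
    """Largest average payload per call that fits in the link at a given tps."""
    return link_bits_per_second / 8 / tps


LINK_BPS = 1_000_000_000  # 1 Gbps

print(f"{LINK_BPS / 8 / 1_000_000:.0f} megabytes per second")
for tps in (1_000, 10_000, 100_000):
    print(f"{tps:>7,} tps -> {bytes_per_call(LINK_BPS, tps) / 1000:.2f} KB per call")
```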

THANK YOU

Questions and ideas to:

@gbrail

@brianpagano
