Boosting the Performance of your Rails Apps


A short presentation showing some ways to improve the performance of Ruby on Rails apps. Presented at the Jakarta Ruby user group meetup.



Jakarta RoR meetup, January 2013

Matt Kuklinski, CTO, Gopher

Where to start?

• There are many ways to improve RoR app speed – some easy, some hard

• Concentrate on methods that give you the biggest bang for the buck

• This presentation shows a few different methods that should give you a good performance return for your time investment

1. DB Indexes

• Your app will be constrained by database performance

• Appropriate DB indexes can give you 100x performance gains on large tables

• Not all Rails developers realise how important this is

• It’s easy to add indexes:

class AddIndexToClientIndustry < ActiveRecord::Migration
  def change
    add_index :client_industries, :client_id
  end
end

Example with index

CREATE INDEX addresses_addressable_id_addressable_type_idx ON addresses USING btree (addressable_id, addressable_type);

t1 = Time.now
c = Company.find(178389)
a = c.addresses.first
t2 = Time.now
puts "---Operation took #{t2-t1} seconds---"

Result with index:

---Operation took 0.012412 seconds---

Now without the index

DROP INDEX addresses_addressable_id_addressable_type_idx;

t1 = Time.now
c = Company.find(178389)
a = c.addresses.first
t2 = Time.now
puts "---Operation took #{t2-t1} seconds---"

Result without index:

---Operation took 0.378073 seconds---

0.378073 / 0.012412 = 30.46 times slower without the index

Index Tips

• Add indexes to all foreign key columns, and to other attributes that you regularly search or sort on if they contain many distinct values

• Don't add too many indexes – each one increases the DB size and slows down insert and update queries

2. Minimise the number of DB queries

• RoR makes it easy to program quickly

• The downside: RoR makes it easy for the number of database queries per request to explode out of control

Example: N+1 queries

• Let’s say we have a Client model, and each Client can have one or more industries through ClientIndustry.

• We want to show a list of clients, and their primary industries:

<% @clients.each do |client| %>
  <tr>
    <td><%= client.id %></td>
    <td><%= client.business_name %></td>
    <td><%= client.industries.first.name %></td>
  </tr>
<% end %>

Be careful doing this:

# app/controllers/clients_controller.rb
def index
  @clients = Client.all
end

If you have 50 clients, then 51 DB queries will be run:

Processing by ClientsController#index as HTML
SELECT "clients".* FROM "clients"
SELECT "industries".* FROM "industries" INNER JOIN "client_industries" ON "industries"."id" = "client_industries"."industry_id" WHERE "client_industries"."client_id" = 1 LIMIT 1
SELECT "industries".* FROM "industries" INNER JOIN "client_industries" ON "industries"."id" = "client_industries"."industry_id" WHERE "client_industries"."client_id" = 2 LIMIT 1
SELECT "industries".* FROM "industries" INNER JOIN "client_industries" ON "industries"."id" = "client_industries"."industry_id" WHERE "client_industries"."client_id" = 3 LIMIT 1
…

Solution: Eager Loading

# app/controllers/clients_controller.rb
def index
  @clients = Client.includes(:industries).all
end

Now just 2 or 3 queries are performed instead of 51

Processing by ClientsController#index as HTML
SELECT "clients".* FROM "clients"
SELECT "client_industries".* FROM "client_industries" WHERE "client_industries"."client_id" IN (1, 2, 3)
SELECT "industries".* FROM "industries" WHERE "industries"."id" IN (1, 5, 7, 8, 4)
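The same collapse can be seen in plain Ruby, with hashes standing in for the tables (hypothetical data, not ActiveRecord): instead of one lookup per client, gather every needed id and resolve them in a single batched pass.

```ruby
# Hashes standing in for DB tables (hypothetical data).
industries        = { 1 => "Mining", 5 => "Retail" }
client_industries = { 1 => [1], 2 => [5], 3 => [1] }  # client_id => industry_ids
clients           = [1, 2, 3]

# N+1 style: one "query" per client.
n_plus_one_lookups = 0
clients.each do |cid|
  n_plus_one_lookups += 1
  client_industries[cid].map { |iid| industries[iid] }
end

# Eager style: collect all needed ids, then one batched IN (...) style lookup.
needed_ids    = clients.flat_map { |cid| client_industries[cid] }.uniq
eager_lookups = 1
batch = needed_ids.map { |iid| [iid, industries[iid]] }.to_h

puts n_plus_one_lookups  # 3
puts eager_lookups       # 1
```

The per-client cost is what makes N+1 dangerous: it scales with the size of the result set, while the eager version stays at a fixed number of queries.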

3. Minimise memory usage

• Only use gems that you actually need

• Don’t load objects into memory unless you need to use them

• When processing massive datasets, split them into batches

Example: find_each

An example using real data:

Using find:

t1 = Time.now
Company.where(:country_id => 1).find do |c|
  puts "do something!" if ['Mattski Test'].include?(c.common_name)
end
t2 = Time.now
puts "---Operation took #{t2-t1} seconds---"

Result: 1 query, taking 46.65 seconds

Now using find_each:

t1 = Time.now
Company.where(:country_id => 1).find_each do |c|
  puts "do something!" if ['Mattski Test'].include?(c.common_name)
end
t2 = Time.now
puts "---Operation took #{t2-t1} seconds---"

Result: >100 queries, taking 15.53 seconds in total (3x faster)

Sometimes more queries is better!
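The batching idea behind find_each can be sketched in plain Ruby with each_slice – a stand-in for illustration only, not ActiveRecord itself:

```ruby
# Plain-Ruby sketch of batch processing. find_each works on the same
# principle: fetch records in batches (1000 by default), so only one
# batch is held in memory at a time instead of the whole result set.
records = (1..5_000).to_a  # stand-in for a large result set

batches = 0
peak_batch_size = 0
records.each_slice(1_000) do |batch|
  # each batch is a small Array; process it, then let it be garbage-collected
  peak_batch_size = [peak_batch_size, batch.size].max
  batches += 1
end

puts batches          # 5
puts peak_batch_size  # 1000
```

Memory use is bounded by the batch size rather than the dataset size, which is why the find_each version above is faster despite issuing more queries.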

4. Caching

• Can make a huge difference to performance

• Lots of options:
– page caching
– action caching
– fragment caching
– Memcached, Redis

• Tip: get your data model correct first. Caching can hide structural problems

What is Memcached?

• Free & open source, high-performance, distributed memory object caching system.

• Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

• www.memcached.org

What is Redis?

• Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.

• redis.io
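The cache-aside pattern these stores enable can be sketched in plain Ruby, with a Hash standing in for Memcached or Redis (a hypothetical fetch helper, not the Rails.cache API):

```ruby
# Minimal cache-aside sketch; a Hash stands in for Memcached/Redis.
CACHE = {}

def fetch(key)
  return CACHE[key] if CACHE.key?(key)  # cache hit: skip the expensive work
  CACHE[key] = yield                    # cache miss: compute, store, return
end

calls   = 0
compute = lambda { calls += 1; "expensive result" }  # stand-in for a slow query

first  = fetch("page_fragment") { compute.call }  # miss: runs the block
second = fetch("page_fragment") { compute.call }  # hit: served from cache

puts calls  # 1
```

Rails.cache.fetch follows the same shape, with expiry and a shared store across processes; the point is that the expensive computation runs once per key, not once per request.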

5. Pre-calculate summary data

Example:
• We have tables holding sales, sales_reps and teams
• We need to provide live monthly and daily charts showing cumulative sales per sales_rep, per team, and for the company as a whole

• We could produce a complicated ruby method that results in a query like this:

select date(sales.created_at) as sale_date
      ,sales_reps.name
      ,sum(sales.amount) as daily_sales
from sales
join sales_reps on sales_reps.id = sales.sales_rep_id
where sales.created_at > '2013-01-01'
group by 1,2;

But that’s not very efficient if we have 300 sales reps and managers checking all their charts every few minutes. How can we speed it up?

Solution:
• Summarise the data in a sales_metrics table with good indexes, and use observers and delayed_job to recalculate the sales data in near-real time.

• Then we can do:

sales_rep.sales_metrics.where("date >= ?", "2013-01-01")

To get an optimised query like this:

select date
      ,sales_rep_id
      ,daily_sales
from sales_metrics
where sales_metrics.date >= '2013-01-01'

Now instead of 300 sales reps, imagine having 20,000 daytraders checking their daily stock portfolio charts… It has to be pre-calculated.
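The aggregation step that would run in the background job can be sketched in plain Ruby (hypothetical field names, hashes standing in for the sales and sales_metrics tables):

```ruby
require "date"

# Hypothetical raw rows, as they might come from the sales table.
sales = [
  { created_at: Date.new(2013, 1, 2), sales_rep_id: 1, amount: 100 },
  { created_at: Date.new(2013, 1, 2), sales_rep_id: 1, amount: 50  },
  { created_at: Date.new(2013, 1, 3), sales_rep_id: 2, amount: 75  },
]

# Pre-aggregate once (e.g. in a delayed_job) into summary rows keyed by
# (date, sales_rep_id) -- the columns the chart query filters on.
metrics = Hash.new(0)
sales.each do |s|
  metrics[[s[:created_at], s[:sales_rep_id]]] += s[:amount]
end

# Serving a chart is now a cheap indexed lookup, not a GROUP BY over raw sales.
puts metrics[[Date.new(2013, 1, 2), 1]]  # 150
```

The grouping work is paid once per write (or per batch of writes) instead of once per chart view, which is what makes the 20,000-daytrader case feasible.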

6. Make web requests fast

• You have a limited number of processes available to serve web requests, so they need to be fast

• Ideally, web processes should finish within milliseconds. 1-2 seconds is slow. 10+ seconds is very slow.

• If you have slow web requests then your Rails app won't be able to support many simultaneous users.

Solution: use background processes

• Use background processes such as delayed jobs for long-running jobs. This will free your web processes up to handle more requests.

• What types of things?– sending email– running reports– processing images– obtaining information from third party APIs

• Suggestion: use priorities so that important background jobs get actioned before less important ones when a backlog of jobs builds up

• Note: Rails 4 will support background processing out of the box
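The priority suggestion can be sketched in plain Ruby. Like delayed_job's convention, a lower priority number runs first (this is a toy in-process sketch, not the delayed_job API):

```ruby
# Tiny job-queue sketch honouring priorities (lower number = runs first).
Job = Struct.new(:priority, :name)

queue = [
  Job.new(10, "generate_report"),            # can wait
  Job.new(0,  "send_password_reset_email"),  # user is waiting on this
  Job.new(5,  "resize_uploaded_image"),
]

# A worker pops the lowest-numbered (most important) job first.
run_order = queue.sort_by(&:priority).map(&:name)

puts run_order.inspect
# ["send_password_reset_email", "resize_uploaded_image", "generate_report"]
```

With priorities in place, a pile-up of slow report jobs can no longer delay time-sensitive work like password-reset emails.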

7. Monitor performance

• Make sure to monitor the performance of your apps so that you can pinpoint which areas are running slowly.

• New Relic is an excellent tool for monitoring Rails apps

What does this tell me?

- Response time is good
- There's no request queuing
- I can scale back the web processes

What does this tell me?

- Performance is not that great
- The database is being overworked
- There may be some inefficient DB queries

The slowness is almost entirely caused by the SearchController. This is a target for optimisation.

8. Use an in-memory DB

• Databases are fast when the searching and sorting is done in memory

• They slow down a lot when they have to go to disk

Solution: keep your DB trim

• Try to limit the size of the DB so that it fits entirely within memory

• Move non-essential information out of the main DB into a secondary DB or elsewhere (e.g. audit logs, inactive accounts, old email logs)

• Consider using non-relational databases if you have massive storage requirements

9. Manage your load

• load balancing
– essential for public web apps
– cloud hosting providers help to manage this for you

• database replication
– run heavy reports on a replicated database so that the performance of the main database isn't affected

• database sharding

More performance tips

• Use a content delivery network (CDN) for static files
– AWS CloudFront, etc.

• Make the UI asynchronous
– use AJAX lazy loading for anything that takes more than 1-2 seconds to load

• Use a Service Oriented Architecture, so that some actions can be done in parallel on a different hosting stack

Thanks for your interest!

• Contact
– www.linkedin.com/in/matthewkuklinski
– @mattkuklinski
– www.gopher.co.id
– www.gopher.co.nz
