Alexander Dymo - RailsConf 2014 - Improve performance: Optimize Memory and Upgrade to Ruby 2.1

Improve Performance Quick and Cheap: Optimize Memory and Upgrade to Ruby 2.1

http://www.slideshare.net/adymo/alexander-dymo-railsconf-2014-improve-performance

OK, let's talk about performance.

Can I have a show of hands: who here thinks Ruby is fast? C'mon, only a few people? I disagree. Ruby is fast, especially the latest version, except for one thing: memory consumption and garbage collection make it slow.

Oh, most people here think it's fast? I do agree. Ruby is fast, until your program takes so much memory that it becomes slow.

Part 1: Why?

Why am I talking so much about memory? Here's why.

Memory optimization is the #1 thing that makes your Ruby application fast

Memory overhead + Slow GC algorithm =

Why? Two reasons:

Large memory overhead, where every object takes at least 40 bytes in memory, plus a slow GC algorithm that got improved in 2.1, but not by that much, as we will see later.

That all equals not universal love and peace

Memory overhead + Slow GC algorithm = High memory consumption + Enormous time spent in GC

But high memory consumption and, because of that, enormous time that the app spends doing GC.

That is why memory optimization is so important. It saves you that GC time

That's also why Ruby 2.1 is so important. It makes GC so much faster.
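The per-object overhead is easy to inspect yourself. This is a minimal sketch using `ObjectSpace.memsize_of` from the standard `objspace` extension; the sample objects are my own choice, and the reported sizes are MRI implementation details that vary by Ruby version and platform.

```ruby
require 'objspace'

# Sample objects chosen only for illustration.
samples = {
  "empty object" => Object.new,
  "empty string" => "",
  "empty array"  => [],
  "1 MB string"  => "x" * (1024 * 1024)
}

# memsize_of reports the bytes MRI attributes to a single object.
sizes = samples.map { |label, obj| [label, ObjectSpace.memsize_of(obj)] }.to_h

sizes.each { |label, bytes| puts format("%-14s %d bytes", label, bytes) }
```

Even the empty objects occupy a heap slot, which is where the fixed cost per object comes from.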

Memory Optimized Rails App (Ruby 1.8)

Same $1k/mo hardware all these years

Some examples from my own experience.

Rails App Upgraded from Ruby 1.9 to 2.1

Compare before/after

Here's another example. No memory optimization done, but Ruby upgraded from 1.9 to 2.1

Optimize Memory and, Optionally, Upgrade to Ruby 2.1

But here's another thing. If you can upgrade, fine. If not, you can still get the same or better performance by optimizing memory.

It's OK to Optimize Memory Only

                  Before   After   Optimization
Heavy rendering   1.84     0.73    2.5x
Big data          17.06    3.63    4.7x
Charting          5.75     0.23    74.0x

Average request time, seconds

require "csv"
data = CSV.open("data.csv")

# readlines loads the whole file; map, downcase and gsub all create copies
output = data.readlines.map do |line|
  line.map do |col|
    col.downcase.gsub(/\b('?[a-z])/) { $1.capitalize }
  end.join(",")
end

File.open("output.csv", "w+") do |f|
  f.write output.join("\n")
end

Unoptimized Program

Ruby 2.1 Is 40% Faster, Right?

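require "csv"
output = File.open("output.csv", "w+")

CSV.open("examples/data.csv", "r").each do |line|
  row = line.map do |col|
    col.downcase!
    # capitalize, not capitalize!: the bang version returns nil when nothing changes
    col.gsub!(/\b('?[a-z])/) { $1.capitalize }
    col # gsub! also returns nil on no match, so return col explicitly
  end
  output.puts row.join(",")
end

output.close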

Memory Optimized Program

Ruby 2.1 Is NOT Faster... once your program is memory optimized

Takeaways

Ruby 2.1 is not a silver performance bullet

Memory optimized Ruby app performs the same in 1.9, 2.0 and 2.1

Ruby 2.1 merely makes performance adequate by default

Optimize memory to make a difference

Part 2: How?

5 Memory Optimization Strategies

1. Tune garbage collector
2. Do not allow Ruby instance to grow
3. Control GC manually
4. Write less Ruby
5. Avoid memory-intensive Ruby and Rails features

Strategy 1: Tune Ruby GC

Ruby GC Tuning Goal

Goal: balance the number of GC runs and peak memory usage

How to check:
> GC.stat[:minor_gc_count]
> GC.stat[:major_gc_count]
> `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024 # MB

How does tuning help? You can balance the number of GC runs against peak memory usage. By default the balance leans toward running GC more often to reduce memory peaks. You can shift this balance.

Change GC settings and see how often GC is called and what your memory usage is
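One way to watch that balance is to wrap a piece of code and compare the counters before and after. This is a minimal sketch: the helper names `gc_report` and `rss_mb` are mine, and it assumes Ruby 2.1+ (for the minor/major counters) and a `ps` binary on the PATH.

```ruby
# Resident set size of this process in MB, using the ps call from the slide.
def rss_mb
  `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
rescue Errno::ENOENT
  0 # ps is not installed; report 0 rather than fail
end

# Run the block and report how many GC runs it triggered and how much RSS grew.
def gc_report
  before = GC.stat
  rss_before = rss_mb
  yield
  after = GC.stat
  {
    minor_gc_runs: after[:minor_gc_count] - before[:minor_gc_count],
    major_gc_runs: after[:major_gc_count] - before[:major_gc_count],
    rss_growth_mb: rss_mb - rss_before
  }
end

report = gc_report { Array.new(200_000) { |i| "row #{i}" } }
puts report.inspect
```

Run the same block under different GC settings and compare the reports: fewer GC runs usually come at the price of a higher RSS.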

When Is Ruby GC Triggered?

Minor GC (faster, only new objects collected):
- not enough space on the Ruby heap to allocate new objects
- every 16MB-32MB of memory allocated in new objects

Major GC (slower, all objects collected):
- number of old or shady objects increases more than 2x
- every 16MB-128MB of memory allocated in old objects

Let's step back for a minute and look at when GC is triggered

Environment Variables

Variable                               Default  Description
RUBY_GC_HEAP_INIT_SLOTS                10000    Initial number of slots on the heap
RUBY_GC_HEAP_FREE_SLOTS                4096     Min number of slots that GC must free
RUBY_GC_HEAP_GROWTH_FACTOR             1.8      Heap growth factor
RUBY_GC_HEAP_GROWTH_MAX_SLOTS          -        Maximum heap slots to add

RUBY_GC_MALLOC_LIMIT                   16M      New generation malloc limit
RUBY_GC_MALLOC_LIMIT_MAX               32M      Maximum new generation malloc limit
RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR     1.4      New generation malloc growth factor

RUBY_GC_OLDMALLOC_LIMIT                16M      Old generation malloc limit
RUBY_GC_OLDMALLOC_LIMIT_MAX            128M     Maximum old generation malloc limit
RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR  1.2      Old generation malloc growth factor
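These variables are read once, when the interpreter boots, so they have to be in the environment before Ruby starts. Here is a minimal sketch that launches a child Ruby with two of them set and prints what the child sees. The values are hypothetical, picked only for illustration, and the exact `GC.stat` key names for the limits differ between Ruby versions, which is why the child script matches them loosely.

```ruby
require 'rbconfig'

# Hypothetical settings, for illustration only (plain byte counts).
gc_env = {
  "RUBY_GC_MALLOC_LIMIT"     => "64000000",
  "RUBY_GC_MALLOC_LIMIT_MAX" => "128000000"
}

# Script run by the child interpreter: echo the GC variables it received
# and any malloc limit counters that GC.stat exposes.
child_script = <<'RUBY'
ENV.each { |k, v| puts "#{k}=#{v}" if k.start_with?("RUBY_GC_") }
GC.stat.each { |k, v| puts "#{k}: #{v}" if k.to_s =~ /malloc.*limit/ }
RUBY

# The env hash applies to the child process only, not to this one.
output = IO.popen([gc_env, RbConfig.ruby, "-e", child_script], &:read)
puts output
```

In production you would export the same variables in whatever starts your app server instead of spawning a child like this.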

When Is Ruby GC Triggered?

ruby-performance-book.com

http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc
http://thorstenball.com/blog/2014/03/12/watching-understanding-ruby-2.1-garbage-collector/

Strategy 2: Limit Growth

3 Layers of Memory Consumption Control

Internal
read `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
or VmRSS from /proc/#{Process.pid}/status
and exit the worker
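The internal layer can be sketched in a few lines. This is only an illustration: the method names and the 512 MB limit are hypothetical, reading `/proc` works on Linux only, and it assumes some supervisor restarts the worker after it exits.

```ruby
# Resident set size in MB, parsed from the VmRSS line of /proc/<pid>/status.
def rss_mb_from_proc(pid = Process.pid)
  status = File.read("/proc/#{pid}/status")
  status[/^VmRSS:\s+(\d+)\s+kB/, 1].to_i / 1024
end

# Hypothetical limit; pick one that fits your workers.
MAX_WORKER_RSS_MB = 512

# Call between requests or jobs. Exits the process when it has grown too big,
# otherwise returns false.
def exit_if_bloated(limit_mb = MAX_WORKER_RSS_MB)
  rss = rss_mb_from_proc
  return false if rss <= limit_mb
  warn "worker #{Process.pid} uses #{rss} MB (limit #{limit_mb} MB), exiting"
  exit
end
```

Checking between units of work, not in the middle of one, means the worker never dies with a request half done.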

External (software)
Heroku, Monit, God, etc.

External (OS kernel)
Process.setrlimit(Process::RLIMIT_AS, )
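The limit argument is left open on the slide. As a rough sketch of how the call could be wrapped, assuming a hypothetical 2 GB cap and a helper name of my own:

```ruby
# Hypothetical cap, for illustration only.
ADDRESS_SPACE_LIMIT = 2 * 1024**3 # 2 GB

# Lower the soft address-space limit of the current process.
# The soft limit may not exceed the hard limit, so clamp to it.
def cap_address_space(bytes = ADDRESS_SPACE_LIMIT)
  _soft, hard = Process.getrlimit(Process::RLIMIT_AS)
  soft = [bytes, hard].min
  Process.setrlimit(Process::RLIMIT_AS, soft, hard)
  soft
end
```

Once the cap is in place, allocations beyond it fail inside the process, and Ruby typically surfaces that as NoMemoryError. Note that RLIMIT_AS counts virtual address space, which is larger than RSS, so leave some headroom.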

What about Background Jobs?

Fork et Impera:

# setup background job
fork do
  # do something heavy
end
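A slightly fuller sketch of the same idea, for when the parent needs the result back. The helper name `run_in_fork` and the pipe-plus-Marshal transport are my own choices, not something from the slides. Whatever the block allocates lives in the child, and the OS reclaims it when the child exits.

```ruby
# Run the block in a forked child and return its (marshalable) result.
def run_in_fork
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0) # skip at_exit hooks inherited from the parent
  end
  writer.close
  payload = reader.read # read before waiting so a large result cannot block the child
  reader.close
  Process.wait(pid)
  Marshal.load(payload)
end

# The heavy allocation happens in the child; the parent only sees the count.
count = run_in_fork { Array.new(100_000) { |i| i.to_s }.size }
puts count
```

Only small results should travel back through the pipe, otherwise the parent grows again and the point of forking is lost.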

Strategy 3: Control GC Manually

GC Between Requests in Unicorn

OobGC for Ruby < 2.1
require 'unicorn/oob_gc'
use(Unicorn::OobGC, 1)

gctools for Ruby >= 2.1
https://github.com/tmm1/gctools
require 'gctools/oobgc'
use(GC::OOB::UnicornMiddleware)

Things to have in mind:
- make sure you have enough workers
- make sure CPU utilization < 50%
- this improves only perceived performance
- overall performance might be worse
- only effective for memory-intensive applications

Strategy 4: Write Less Ruby

Example: Group Rank

SELECT * FROM empsalary;

  depname  | empno | salary
-----------+-------+--------
 develop   |     6 |   6000
 develop   |     7 |   4500
 develop   |     5 |   4200
 personnel |     2 |   3900
 personnel |     4 |   3500
 sales     |     1 |   5000
 sales     |     3 |   4800

PostgreSQL Window Functions

SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;

  depname  | empno | salary | rank
-----------+-------+--------+------
 develop   |     6 |   6000 |    1
 develop   |     7 |   4500 |    2
 develop   |     5 |   4200 |    3
 personnel |     2 |   3900 |    1
 personnel |     4 |   3500 |    2
 sales     |     1 |   5000 |    1
 sales     |     3 |   4800 |    2
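For contrast, here is a sketch of what the same ranking looks like when it is done in application code. It assumes the rows have already been loaded into Ruby hashes (the literal array stands in for that step), so every row, every intermediate array and every merged hash is a Ruby object the GC has to deal with. This is the work the window function keeps inside the database.

```ruby
empsalary = [
  { depname: "develop",   empno: 6, salary: 6000 },
  { depname: "develop",   empno: 7, salary: 4500 },
  { depname: "develop",   empno: 5, salary: 4200 },
  { depname: "personnel", empno: 2, salary: 3900 },
  { depname: "personnel", empno: 4, salary: 3500 },
  { depname: "sales",     empno: 1, salary: 5000 },
  { depname: "sales",     empno: 3, salary: 4800 }
]

# PARTITION BY depname ORDER BY salary DESC, done by hand.
ranked = empsalary.group_by { |row| row[:depname] }.flat_map do |_dep, rows|
  sorted = rows.sort_by { |row| -row[:salary] }
  sorted.map do |row|
    # rank() semantics: 1 + number of rows in the partition with a higher salary
    rank = sorted.count { |other| other[:salary] > row[:salary] } + 1
    row.merge(rank: rank)
  end
end

ranked.each { |r| puts r.values_at(:depname, :empno, :salary, :rank).join(" | ") }
```

With seven rows nobody will notice the difference. With a few hundred thousand rows, the SQL version returns only the finished result, while the Ruby version first has to instantiate all of it.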

There has been a sentiment inside the Rails community that SQL is somehow bad, that you should avoid it at all costs. People invent more and more things to stay out of SQL. AREL, to name just one.

Guys, I wholeheartedly disagree with this. Web frameworks come and go. SQL stays. We have had SQL for 40 years. It's not going away.

Finally Learn SQL

Strategy 5: Avoid Memory Hogs

Operations That Copy Data

String::gsub! instead of String::gsub and similar
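To see the effect of copying, you can count allocated objects around both variants. A minimal sketch follows: the helper name `allocated_objects` and the sample data are mine, and it assumes Ruby 2.2+ where the `GC.stat` key is `:total_allocated_objects` (Ruby 2.1 spells it `:total_allocated_object`). Exact counts differ between Ruby versions, so treat the numbers as relative.

```ruby
# Count objects allocated while the block runs.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

words = Array.new(10_000) { |i| "Word Number #{i}" }

# Copying: downcase and gsub each return a new String, map a new Array.
copying = allocated_objects do
  words.map { |w| w.downcase.gsub("number", "no.") }
end

# In place: the bang methods modify the receiver.
in_place = allocated_objects do
  words.each do |w|
    w.downcase!
    w.gsub!("number", "no.")
  end
end

puts "copying:  #{copying} objects"
puts "in place: #{in_place} objects"
```

The in-place version is only safe when nothing else still needs the original strings.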

String::

{:count=>11, :minor_gc_count=>8, :major_gc_count=>3,
 :heap_used=>126, :heap_length=>130,
 :malloc_increase=>7848, :malloc_limit=>16777216,
 :oldmalloc_increase=>8296, :oldmalloc_limit=>16777216}

objspace.so

> ObjectSpace.count_objects
=> {:TOTAL=>51359, :FREE=>16314, :T_OBJECT=>1356 ...

> require 'objspace'
> ObjectSpace.memsize_of(Class)
=> 1096
> ObjectSpace.reachable_objects_from(Class)
=> [#, Class...

> ObjectSpace.trace_object_allocations_start
> str = "x" * 1024 * 1024 * 10
> ObjectSpace.allocation_generation(str)
=> 11

objspace.so
http://tmm1.net/ruby21-objspace/
http://stackoverflow.com/q/20956401

GC.stat
http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc
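Allocation tracing can also tell you where an object came from, which is usually the first question when hunting a memory hog. A minimal sketch, assuming Ruby 2.1+ and using only the tracing calls from `objspace`; the traced string is just an example.

```ruby
require 'objspace'

ObjectSpace.trace_object_allocations_start

str = "x" * 1024 # any object allocated while tracing is on gets recorded

# Query while tracing is still active.
info = {
  file:       ObjectSpace.allocation_sourcefile(str),
  line:       ObjectSpace.allocation_sourceline(str),
  generation: ObjectSpace.allocation_generation(str)
}

ObjectSpace.trace_object_allocations_stop

puts "allocated at #{info[:file]}:#{info[:line]} in GC generation #{info[:generation]}"
```

Tracing slows allocation down noticeably, so switch it on around the suspicious code only.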

RubyProf Memory Profiling

require 'ruby-prof'
RubyProf.measure_mode = RubyProf::MEMORY
RubyProf.start

str = 'x'*1024*1024*10

result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

This requires a patched Ruby and will work only for 1.8 and 1.9
https://github.com/ruby-prof/ruby-prof/issues/86

Valgrind Memory Profiling

> valgrind --tool=massif `rbenv which irb`
==9395== Massif, a heap profiler
irb(main):001:0> x = "x"*1024*1024*10; nil
=> nil
==9395==

> ms_print massif.out.9395

> massif-visualizer massif.out.9395

http://valgrind.org
https://projects.kde.org/projects/extragear/sdk/massif-visualizer

Sign up for my upcoming book updates: ruby-performance-book.com

Ask me: [email protected] @alexander_dymo

AirPair with me: airpair.me/adymo

So, we're out of time. If you'd like to learn more about Ruby performance optimization, please sign up for my book mailing list updates. If you need help, just email me or AirPair with me. And thank you for listening.

Chart data: Requests (millions)
2010  4.9
2011  7
2012  8
2013  11.3
2014  20.1

Chart data:
Ruby 1.9 & 2.0  23
Ruby 2.1        16

Chart data:
Ruby 1.9 & 2.0  13.5
Ruby 2.1        13.1