Boston Ruby Meetup presentation by Joe Ferris, CTO of thoughtbot, and Simeon Simeonov, CTO of Swoop, on ways to reduce the memory footprint of data-intensive Ruby on Rails applications.
Memory Issues in Rails Applications
I am @simeons
recruit amazing people
solve hard problems
ship
make users happy
repeat
Problems of Success (good problems)
Too many users
Too much traffic
Too much data
Memory Issues in Rails Applications
Common Problem of Success
Display Advertising Makes the Web Suck
User-focused optimization
Tens of millions of users
1000+% better than average
200+% better than Google
Swoop Fixes That
Mobile SDKs: iOS & Android
Web SDK: RequireJS & jQuery
Components: AngularJS
NLP, etc.: Python
Targeting: High-Perf Java
Analytics: Ruby 2.0
Internal Apps: Ruby 2.0 / Rails 3
Pub Portal: Ruby 2.0 / Rails 3
Ad Portal: Ruby 2.0 / Rails 4
Before 1hr @ 4GB
When problems grow faster than the rate at which you can throw HW at them, you actually have to solve them
Before 1hr @ 4GB
After 5min @ 230MB
Resolving Memory Issues in Rails Applications Using Streams
CSV
[Chart: memory (MB) versus rows, from 0 to 100,000 rows and 0 to 500 MB, annotated "You are here", "This sucks", and "Start thinking here".]
Memory Leaks
class AddDomainsStep
  def call(hashes)
    hashes.map do |hash|
      transform_and_return(hash)
    end
  end
end
class AddDomainsStep
  def initialize
    @domain_config = DomainConfig.instance
  end

  def call(hashes)
    hashes.each do |hash|
      hash['domain'] =
        @domain_config.domain_for(hash['domain_id'])
    end
  end
end
require 'singleton'

class DomainConfig
  include Singleton

  def initialize
    @domains = {}
  end

  # Memoizes every looked-up name for the life of the process.
  def domain_for(id)
    @domains[id] ||= Domain.name_for(id) || ''
  end
end
@domains[id] ||= Domain.name_for(id) || ''
Memory Leak
• Memory that will never be released by the garbage collector.
• Memory usage grows the longer the process runs.
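To make the definition concrete, here is a minimal sketch of a process-wide memoizing cache that grows without bound. `LeakyCache` and its `fetch` method are hypothetical stand-ins for the `DomainConfig` singleton above, and the lookup is faked with string interpolation instead of a real `Domain.name_for` call.

```ruby
require 'singleton'

# Hypothetical stand-in for DomainConfig: a process-wide memoizing cache.
class LeakyCache
  include Singleton

  def initialize
    @entries = {}
  end

  # Every distinct id adds an entry that nothing ever removes.
  def fetch(id)
    @entries[id] ||= "domain-#{id}.example"
  end

  def size
    @entries.size
  end
end

# Simulate a long-running process that sees many distinct ids.
10_000.times { |id| LeakyCache.instance.fetch(id) }

# The singleton keeps every entry reachable, so a full GC run
# cannot reclaim any of them.
GC.start
puts LeakyCache.instance.size
```

The entries are all still referenced, so from the garbage collector's point of view nothing is wrong. That is what makes this kind of leak hard to notice until the process has been running for a while.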
Avoid Global State
• Global variables
• Class variables
• Singletons
• Per-process instance state
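One way to follow this advice, sketched under assumptions: inject a lookup object whose lifetime is a single job, instead of reaching for a singleton. `DomainLookup`, `run_job`, and the `names_by_id` table are hypothetical names introduced for illustration and are not part of the original code.

```ruby
# Hypothetical per-run lookup: lives only as long as one job.
class DomainLookup
  def initialize(names_by_id)
    @names_by_id = names_by_id # assumed to be loaded once per job
    @cache = {}
  end

  def domain_for(id)
    @cache[id] ||= @names_by_id.fetch(id, '')
  end
end

class AddDomainsStep
  # The lookup is passed in rather than fetched from global state.
  def initialize(lookup)
    @lookup = lookup
  end

  def call(hashes)
    hashes.each do |hash|
      hash['domain'] = @lookup.domain_for(hash['domain_id'])
    end
  end
end

# Once this method returns, the step and its lookup are unreachable,
# so the cache can be garbage collected along with them.
def run_job(rows, names_by_id)
  step = AddDomainsStep.new(DomainLookup.new(names_by_id))
  step.call(rows)
end

rows = [{ 'domain_id' => 1 }, { 'domain_id' => 2 }]
result = run_job(rows, { 1 => 'example.com' })
```

The cache still saves repeated lookups within a job, but its size is bounded by what that one job touches.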
Memory Churn
hashes.map do |hash|
  hash['domain'].downcase.strip
end

vs

hashes.each do |hash|
  hash['domain'].downcase!
  hash['domain'].strip!
end
Memory Churn
• Allocating and deallocating tons of objects slows down processing
• Mutation limits allocations, but makes it easier to introduce bugs
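The difference between the two styles can be measured roughly by counting allocations. The sketch below assumes Ruby 2.2 or later, where `GC.stat` accepts the `:total_allocated_objects` key; the `allocations` helper, the `KEY` constant, and the sample data are hypothetical.

```ruby
KEY = 'domain'.freeze

# Returns the number of objects allocated while the block runs.
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

def sample_rows
  Array.new(1_000) { |i| { KEY => " Example#{i}.COM " } }
end

copying_rows = sample_rows
mutating_rows = sample_rows

# Non-mutating: downcase and strip each build a new String per row.
copying = allocations do
  copying_rows.map { |hash| hash[KEY].downcase.strip }
end

# Mutating: the existing Strings are changed in place.
mutating = allocations do
  mutating_rows.each do |hash|
    hash[KEY].downcase!
    hash[KEY].strip!
  end
end

puts "copying:  #{copying} objects"
puts "mutating: #{mutating} objects"
```

Exact counts vary by Ruby version, so the useful signal is the gap between the two numbers rather than either value on its own.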
hashes.each do |hash|
  hash['domain'].downcase!
  hash['domain'].strip!
end
Spot the Bug!
# In shared state:
@domains[id] ||= Domain.name_for(id) || ''

# Much later:
hash['domain'].downcase!
hash['domain'].strip!
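One reading of the bug is that the row and the shared cache hold the same String object, so mutating the row silently rewrites the cached value. The sketch below reproduces that with a plain Hash and a lambda standing in for `DomainConfig` (both hypothetical), then shows one possible fix: freeze what goes into the cache and hand out copies.

```ruby
# Hypothetical stand-in for the shared cache on the slide.
cache = {}
lookup = ->(id) { cache[id] ||= String.new('Example.COM ') }

# Bug: the row shares the very same String object with the cache.
row = { 'domain' => lookup.call(1) }
row['domain'].downcase!
row['domain'].strip!
puts cache[1].inspect # the cached value was changed too

# Possible fix: cache frozen values and return an unfrozen copy.
safe_cache = {}
safe_lookup = ->(id) { (safe_cache[id] ||= 'Example.COM '.freeze).dup }

safe_row = { 'domain' => safe_lookup.call(1) }
safe_row['domain'].downcase!
safe_row['domain'].strip!
puts safe_cache[1].inspect # the cached value is untouched
```

The copy costs one extra allocation per lookup, which is the trade-off between churn and safety that the preceding slides describe.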
Good News!
• Allocating and freeing objects is fairly fast in Ruby
• Keeping your stack frame light will limit the effects of memory churn
Memory Bloat
def to_csv
  csv = [CSV.generate_line(headers)]

  rows.each do |row|
    values = headers.map do |header|
      row[header] || defaults[header]
    end

    csv << CSV.generate_line(values)
  end

  csv.join('')
end
Memory Bloat
• Memory usage grows with data set
• Loading too much data at once
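A minimal sketch of the alternative, assuming the output can be written incrementally: emit each CSV line to an IO-like object as soon as it is generated, so only one line is held at a time. The `Report` class and its `write_csv` method are hypothetical; only `CSV.generate_line` comes from the original code.

```ruby
require 'csv'
require 'stringio'

# Hypothetical stand-in for the object that owns to_csv on the slide.
class Report
  def initialize(headers, rows, defaults = {})
    @headers = headers
    @rows = rows
    @defaults = defaults
  end

  # Writes one line at a time to anything that responds to <<
  # (a File, a socket, a StringIO) instead of building one big String.
  def write_csv(io)
    io << CSV.generate_line(@headers)

    @rows.each do |row|
      values = @headers.map { |header| row[header] || @defaults[header] }
      io << CSV.generate_line(values)
    end

    io
  end
end

report = Report.new(
  %w[id domain],
  [{ 'id' => 1, 'domain' => 'example.com' }, { 'id' => 2 }],
  { 'domain' => 'unknown' }
)

out = StringIO.new
report.write_csv(out)
puts out.string
```

A StringIO is used here only so the result can be printed. Writing to a file or a streaming response is what keeps memory flat as the row count grows, provided `rows` itself is also produced incrementally.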
Laziness
rename_report_fields(
  squash(
    add_domains(
      add_properties(
        unwind_variations(
          rows
        )
      )
    )
  )
)
def duplicate(number, count)
  if count > 0
    [number] + duplicate(number, count - 1)
  else
    []
  end
end

def sum(list)
  list.inject(0) do |result, number|
    result + number
  end
end

sum(duplicate(5, 10)) # => 50
import Prelude hiding (sum)

duplicate :: Int -> Int -> [Int]
duplicate number count
  | count <= 0 = []
  | otherwise  = number : duplicate number (count - 1)

sum :: [Int] -> Int
sum []            = 0
sum (x:remaining) = x + sum remaining

> sum $ duplicate 5 10
50
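Haskell evaluates lazily by default, while Ruby evaluates eagerly unless asked otherwise. The sketch below uses `Enumerable#lazy` (available since Ruby 2.0) to show the difference by counting how many elements each pipeline touches; the `touched` counter is only there for illustration.

```ruby
# Eager: map processes all 100,000 numbers before first(3) runs.
touched = 0
eager = (1..100_000).map { |n| touched += 1; n * 2 }.first(3)
eager_touched = touched

# Lazy: elements are pulled through the pipeline only on demand.
touched = 0
lazy = (1..100_000).lazy.map { |n| touched += 1; n * 2 }.first(3)
lazy_touched = touched

puts "eager: #{eager.inspect}, touched #{eager_touched}"
puts "lazy:  #{lazy.inspect}, touched #{lazy_touched}"
```

Both pipelines return the same three values, but the eager one also builds a 100,000-element intermediate array that is thrown away immediately.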
Be Proactive About Being Lazy
Enumerable
class AddDomainsStep
  def initialize(source)
    @source = source
  end

  def each
    @source.each do |hash|
      hash['domain'] = DomainConfig.
        instance.
        domain_for(hash['domain_id'])
      yield hash
    end
  end
end
RenameReportFieldsStep.new(
  SquashStep.new(
    AddDomainsStep.new(
      AddPropertiesStep.new(
        UnwindVariationsStep.new(
          rows
        )
      )
    )
  )
)
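The nesting above can be sketched as a small runnable pipeline. The versions below are simplified, hypothetical stand-ins: a shared `Step` base class that includes `Enumerable`, a `DOMAINS` lookup table in place of `DomainConfig`, and an assumed row shape with a `'variations'` array. The real steps are not shown in the slides.

```ruby
# Hypothetical base class so each step only has to define #each.
class Step
  include Enumerable

  def initialize(source)
    @source = source
  end
end

# Hypothetical step: one output row per variation of the input row.
class UnwindVariationsStep < Step
  def each
    @source.each do |row|
      row['variations'].each do |variation|
        yield({ 'domain_id' => row['domain_id'], 'variation' => variation })
      end
    end
  end
end

# Simplified step: adds a domain name from a plain lookup table.
class AddDomainsStep < Step
  DOMAINS = { 1 => 'example.com' }.freeze

  def each
    @source.each do |row|
      row['domain'] = DOMAINS.fetch(row['domain_id'], '')
      yield row
    end
  end
end

rows = [{ 'domain_id' => 1, 'variations' => %w[a b] }]

# Building the pipeline does no work; nothing runs until it is iterated.
pipeline = AddDomainsStep.new(UnwindVariationsStep.new(rows))

# Each row travels through every step before the next one is read.
result = pipeline.to_a
puts result.inspect
```

Calling `to_a` here is for demonstration only. In a streaming setup the consumer would iterate with `each` and write each row out, so no step ever holds the full data set.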
Buffering