You Can't Optimise What You Can't Measure
Juozas Kaziukėnas // juokaz.com // @juokaz
Juozas Kaziukėnas, Lithuanian
You can call me Joe
More info http://juokaz.com
Why?
Data
Looking for lies
Data
• Metrics
• Removes subjective decisions
• Can be aggregated and related
• If it’s 0, it is 0
• Tesla was probably right
Debugging production
• Behavioral patterns
• When something changes - something is not right
• You'd better notice it
• Facebook deployment process
• What caused it?
What happened?
• Previous and current state
• Events
• Correlated events
• State information
What happened here?
Nothing happened here
Something happened here
Logs suck
• Someone needs to be checking them
• Need to be aggregated
• Need to be visualized
• High I/O to write
• Distributed logs?
I want to sleep
• Call me when things go wrong
• Otherwise everything is working
• Things don’t break silently anymore
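"Call me when things go wrong" boils down to a rule evaluated against recent metric values. A minimal sketch in Python, assuming a hypothetical error-rate rule; the function name, the 5% threshold and the minimum sample size are illustrative and not taken from the talk:

```python
def should_alert(errors, requests, max_error_rate=0.05, min_requests=100):
    """Return True when the error rate in the current window is too high.

    The threshold and the minimum sample size are illustrative assumptions.
    """
    if requests < min_requests:
        # Too little traffic to judge the rate; avoid paging on noise.
        return False
    return errors / requests > max_error_rate


print(should_alert(errors=10, requests=100))
print(should_alert(errors=1, requests=100))
```

A rule like this only covers error rates. A drop in traffic itself would need its own check.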
Business problems
• Detecting when business tools stop working
• No PHP errors, no database errors
• Failures of APIs, empty responses, invalid data
• Things stop working silently
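Silent failures of this kind raise no error, so they have to be counted explicitly. One way to do that, sketched in Python under the assumption that the API returns JSON; the metric names and the `classify_response` and `record` helpers are hypothetical:

```python
import json
from collections import Counter

outcomes = Counter()  # metric name -> number of responses seen


def classify_response(body):
    """Label a raw API response body as 'ok', 'empty' or 'invalid'."""
    if not body or not body.strip():
        return "empty"
    try:
        data = json.loads(body)
    except ValueError:
        return "invalid"
    # Valid JSON with nothing in it is still a silent failure.
    return "ok" if data else "empty"


def record(body):
    """Classify a response and count the outcome under its own metric."""
    outcome = classify_response(body)
    outcomes["api.response." + outcome] += 1
    return outcome
```

Once `api.response.empty` and `api.response.invalid` are charted next to `api.response.ok`, an API that starts returning nothing shows up as a visible change instead of going unnoticed.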
Counting and timing
• Record when something happens
• Record how long it takes for something to happen
• Use this to know how many things are happening
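The two primitives can be sketched in a few lines of Python. This version keeps everything in memory and is only meant to show the idea; the `count` and `timed` helpers and the metric names are made up for the example:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

counts = defaultdict(int)    # metric name -> number of occurrences
timings = defaultdict(list)  # metric name -> durations in milliseconds


def count(name, delta=1):
    """Record that something happened."""
    counts[name] += delta


@contextmanager
def timed(name):
    """Record how long the wrapped block takes, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the block raises.
        timings[name].append((time.perf_counter() - start) * 1000)


with timed("page.render"):
    count("page.views")
    time.sleep(0.01)  # stand-in for real work
```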
The solution
Statsd
• Counters and timing
• No need to initialize or set up counters
• Non-blocking writes
• Originally written by Etsy.com
• Just works
• https://github.com/etsy/statsd/
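// Count one visitor (the counter is created on first use)
StatsD::increment("phpuk.visitors");

// Time how long the conference takes, in milliseconds
$start = microtime(true);
attend_conference();
$spent = (microtime(true) - $start) * 1000;
StatsD::timing("phpuk.timespent", $spent);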
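Under the hood, a statsd client sends small plain-text UDP datagrams of the form `name:value|type`, with `c` for counters and `ms` for timers. A rough sketch of such a client in Python, assuming a statsd daemon on its default port 8125 on localhost; `format_metric` and `send_metric` are illustrative helpers, not part of any statsd library:

```python
import socket


def format_metric(name, value, metric_type):
    """Build a statsd datagram such as b'phpuk.visitors:1|c'."""
    return "{}:{}|{}".format(name, value, metric_type).encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP needs no connection and waits for no reply."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
    except OSError:
        pass  # Metrics must never break the application.


send_metric(format_metric("phpuk.visitors", 1, "c"))      # counter
send_metric(format_metric("phpuk.timespent", 320, "ms"))  # timer, in ms
```

Because nothing is acknowledged, the application does not slow down or fail when the statsd daemon is unreachable. The trade-off is that lost datagrams are lost for good.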
How it works
Graphite
• Real-time charts
• Data collection
• Data aggregation
• Specialized database
• http://graphite.wikidot.com/
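Graphite's Carbon daemon accepts data points as plain-text lines of the form `path value timestamp`, by default over TCP port 2003. A small Python sketch of building such a line; the `graphite_line` helper and the sample values are made up for illustration:

```python
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point as 'path value timestamp' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    return "{} {} {}\n".format(path, value, int(timestamp))


print(graphite_line("phpuk.visitors", 42, 1330000000), end="")
```

Writing these lines to a TCP socket connected to Carbon is enough to make a metric appear. When statsd is in place it does this on the application's behalf at every flush interval.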
How it works
Lobster
Logster
• Parse log files
• Send data to graphite
• Integration with existing applications easier
• Also from Etsy.com
• https://github.com/etsy/logster
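The idea of turning log lines into metrics can be sketched without Logster itself. The Python below does not use Logster's parser API. It assumes a common-format web access log, and the sample lines are invented for the example:

```python
import re
from collections import Counter

# Matches the status code that follows the quoted request, e.g.
# ... "GET /path HTTP/1.1" 200 1234
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_status_classes(lines):
    """Tally responses per status class (2xx, 3xx, 4xx, 5xx)."""
    totals = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            totals["http_{}xx".format(match.group(1)[0])] += 1
    return totals


sample = [
    '127.0.0.1 - - [24/Feb/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '127.0.0.1 - - [24/Feb/2012:10:00:01 +0000] "GET /x HTTP/1.1" 404 0',
    '127.0.0.1 - - [24/Feb/2012:10:00:02 +0000] "POST /y HTTP/1.1" 500 0',
]
print(dict(count_status_classes(sample)))
```

Running a tally like this over the lines added since the last run and sending the totals to Graphite gives charts for an application that was never instrumented.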
DataDogHQ.com
• Hosted solution
• Collect data from statsd
• Store and aggregate from multiple servers
• Chart combining any data
• Real time charts
• Alerts
Amazon outage
How I use this
Web spiders
• ~250 nodes
• Couple thousand requests per second
• Increasing throughput - main goal
• Increasing reliability - secondary goal
• Metrics for: request time, error rate, error types, proxy failures, unknown responses, etc.
Web spiders
• Performance increased 1000% in 3 months
• Reliability improved to stable 24/7 operation
• I can sleep
Wrapping up
• Measure things
• Use statsd to collect data
• Graph it
• Sleep
THANKS!
Juozas Kaziukėnas
@juokaz