DESCRIPTION
It’s inevitable that code running in browsers will fail in strange and unexpected ways. While we’ve made extraordinary progress in building highly polished JavaScript web apps on multiple platforms, we’re still at the very beginning of creating monitoring tools that tell us when things start to go wrong. Inspired by how operations teams deal with failure in complex systems, this talk covers some useful open-source tools and techniques that help developers create more resilient apps.
7/25/14
@smithclay
CLAY SMITH | FORWARDJS 2014 | JULY 25, 2014
Embracing failure on the front-end
What this talk covers
NOT COVERED: MY RECIPE FOR TEXAS-STYLE BEEF CHILI. FIND ME AFTER TO TALK ABOUT IT.
The inevitability that JavaScript apps will break.
Borrowing good ideas about failure from operations teams.
A bit about the theory of complex systems failure.
Open-source tools and services that help make apps more resilient.
Why talking about failure in the front-end is important.
GOOGLE TRENDS ALL THE THINGS
One trend is twice as popular as the other trend on average.
DR. COOK IS MY HERO
RICHARD I. COOK, MD. HOW COMPLEX SYSTEMS FAIL.
“Complex systems are intrinsically hazardous systems.”
SOME THEORY, PART 1
“Exception” tracking with window.onerror
MAY YOU NEVER HAVE TO SEE THIS DIALOG AGAIN
DANGER: THIS GETS PRETTY UGLY.
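As a rough illustration of the idea on this slide, here is a minimal sketch of a `window.onerror` reporter. The names `installErrorReporter` and `send` are hypothetical and not part of any library; `send` stands in for whatever ships the report to your own collector. The handler arguments follow the standard `(message, source, lineno, colno, error)` order, with the last two missing in older browsers.

```javascript
// Minimal sketch: install a global error handler on `target`
// (pass `window` in a browser). `send` is a hypothetical callback.
function installErrorReporter(target, send) {
  var previous = target.onerror;
  target.onerror = function (message, source, lineno, colno, error) {
    send({
      message: String(message),
      source: source || '',
      line: lineno || 0,
      column: colno || 0, // older browsers do not pass a column
      stack: error && error.stack ? String(error.stack) : null
    });
    // Chain to any handler that was installed before this one.
    if (typeof previous === 'function') {
      return previous.apply(target, arguments);
    }
    return false; // false lets the browser's default error handling run as well
  };
}
```

In a browser you would call `installErrorReporter(window, yourSendFunction)`. Errors thrown by cross-origin scripts typically arrive as a bare "Script error." with no line number, which is one source of the noise mentioned later.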
So you want to use a 3rd party service…
SERIOUSLY, PAUL IRISH APPEARS IN ALL MY TALKS.
THERE ARE LOTS: HTTPS://PLUS.GOOGLE.COM/+PAULIRISH/POSTS/12BVL5EXFJN
NS_TOO_MUCH_NOISE. NOT REALLY SURE WHY I REDACTED THE URLS.
FURTHER READING: HTTP://BLOG.MELDIUM.COM/HOME/2013/9/30/SO-YOURE-THINKING-OF-TRACKING-YOUR-JS-ERRORS
Example window.onerror output
DOES THIS SOUND LIKE COMMON SENSE YET?
"Change introduces new forms of failure."
RICHARD I. COOK, MD. HOW COMPLEX SYSTEMS FAIL.
SOME THEORY, PART 2
Monitor change with phantomas
CREEPY PICTURE, NO? I BET HE WRITES ERLANG. I ALSO DON'T KNOW HOW TO SAY PHANTOMAS.
HTTPS://GITHUB.COM/MACBRE/PHANTOMAS
JEAN MARAIS AS FANTÔMAS IN THE 1964 FILM.
Phantomas is a “PhantomJS-based web performance metrics collector and monitoring tool”.
phantomas --cookie '_session=<redacted>' \
  --reporter=statsd \
  --statsd-host 127.0.0.1 --statsd-prefix stg \
  --runs 5 \
  http://staging-web.com
How to get super-detailed site metrics…if you’re lazy and cheap.
5 HABITS OF HIGHLY LAZY FRONT-END PERFORMANCE ENGINEERS
Cloud server/your laptop with phantomas installed
Cron job that runs phantomas with statsd output
DataDog Lite Account + Install DataDog Agent on Server
Configure Alerting (I recommend PagerDuty)
Get woken up at 3am
Make the metrics understandable and actionable
THIS LOOKS IMPRESSIVE WHILE YOU READ HACKER NEWS ON YOUR OTHER MONITOR
TESTING DASHBOARD FOR STAGING ENVIRONMENT IN DATADOG.
EVEN FANCIER: INTEGRATE IT INTO YOUR WEB APP: HTTPS://GITHUB.COM/BLOG/1252-HOW-WE-KEEP-GITHUB-FAST
Get alerted as things happen
YOU'LL BE ANGRY AT ME WHEN THIS WAKES YOU UP AT 3AM
CREATING A NEW METRIC ALERT IN DATADOG
Choose a phantomas metric
Define conditions
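To make "define conditions" concrete, here is a small sketch of the kind of rule a metric alert encodes: fire when the average of the most recent samples crosses a threshold. This is not Datadog's API; `shouldAlert`, the window size, the threshold, and the sample load times are all made up for illustration.

```javascript
// Hypothetical alert rule: fire when the average of the last `windowSize`
// samples exceeds `threshold`. Only illustrates the idea of a condition.
function shouldAlert(samples, windowSize, threshold) {
  if (windowSize <= 0 || samples.length < windowSize) {
    return false; // not enough data yet, stay quiet
  }
  var recent = samples.slice(-windowSize);
  var sum = recent.reduce(function (total, value) {
    return total + value;
  }, 0);
  return sum / windowSize > threshold;
}

// Made-up page load times in milliseconds from five monitoring runs.
var loadTimes = [1800, 1900, 4200, 4500, 4800];
console.log(shouldAlert(loadTimes, 3, 3000)); // true: the last three average 4500 ms
```

Averaging over a window rather than alerting on a single sample is one way to avoid being paged for a one-off slow run.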
SAY THIS THE NEXT TIME YOU BLOW SOMETHING UP.
“Failure free operations require experience with failure.”
RICHARD I. COOK, MD. HOW COMPLEX SYSTEMS FAIL.
See also: https://blog.pagerduty.com/2013/11/failure-friday-at-pagerduty/
SOME THEORY, PART 3
Inject chaos into your front-end
ORIGINAL GRAPHIC SLIGHTLY REDACTED
HTTPS://GITHUB.COM/TRAVIS-HILTERBRAND/CHAOS-MONKEY-BROWSER
HTTPS://GITHUB.COM/MIKL/NODE-CHAOS-MONKEYWARE
var props = {
  probability: 0.5,         // chance that a matching request is tampered with
  allowedMethods: ['GET'],  // only GET requests are affected
  mischiefTypes: [
    ChaosMonkey.MischiefTypes.delay,
    ChaosMonkey.MischiefTypes.http403
  ]
};
ChaosMonkey(props);
CONFIGURING CHAOS-MONKEY-BROWSER (*JQUERY REQUIRED)
With a 50% probability, this configuration will cause jQuery ajax GET requests to slowly fail with a 403 response.
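The mechanism behind that behavior can be sketched in a few lines. This is not chaos-monkey-browser's implementation, which hooks into jQuery; it is a generic wrapper, under the assumption that `request` is any function returning a Promise. The names `withChaos`, `delayMs`, and `random` are hypothetical, and `random` is injectable only so the behavior can be tested deterministically.

```javascript
// Sketch of probabilistic fault injection around a Promise-returning function.
function withChaos(request, options) {
  var probability = options.probability;
  var delayMs = options.delayMs || 0;
  var random = options.random || Math.random;

  return function () {
    if (random() >= probability) {
      return request.apply(null, arguments); // leave this call alone
    }
    // Mischief: wait, then fail as if the server had answered 403.
    return new Promise(function (resolve, reject) {
      setTimeout(function () {
        var error = new Error('Injected failure');
        error.status = 403;
        reject(error);
      }, delayMs);
    });
  };
}
```

Wrapping a data-fetching function this way in a staging build makes it easy to see whether the UI shows a sensible error state or simply hangs.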
Prepares for:
CDN Failure
API Failure
Connection Failure
Bad SSL certificates
And more!
Other possible strategies
HOW TO ANNOY PEOPLE DURING CODE REVIEW
1. DISABLE/SLOW DOWN NETWORK CONNECTION (IN CHROME CANARY DEVTOOLS):
2. WHAT HAPPENS WHEN YOU DISABLE JS? (USING A PLUGIN IS RECOMMENDED):
AMAZON.COM ISN’T HAPPY WITHOUT JAVASCRIPT
Lessons learned in failure
SERIOUSLY, REMEMBER ONE OF THESE THINGS
Measure errors and key performance metrics over time
Bad performance = failure
Annoy yourself to fix the broken things with alerting
Find remediation steps to make sure it doesn’t happen again
Get experience with failure before 7pm on a Friday
Thanks!
Additional resources (more reading):
• https://info.aiaa.org/tac/SMG/SOSTC/Shared%20Documents/How%20Complex%20Systems%20Fail.pdf
• http://blog.meldium.com/home/2013/9/30/so-youre-thinking-of-tracking-your-js-errors
• https://blog.pagerduty.com/2013/11/failure-friday-at-pagerduty/