The anatomy of an A/B Test - JSConf Colombia Workshop

A/B testing workshop “In God we trust, all others must bring data”

JSConf Colombia Workshop 2015

@shiota github.com/eshiota

slideshare.net/eshiota eshiota.com

A/B

A/B tests measure how a new idea (version B/variant/test) performs against an existing implementation (version A/base/control).

[Diagram: two “Buy now” versions side by side; a coin flip assigns each visitor to version A or version B, 50% / 50%]

When the user sees or is affected by the idea, they are tracked and become part of the test.

[Diagram: once a visitor sees the tested “Buy now” button, track(my_experiment) records them as a participant]
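In code, the coin flip and the tracking call can be simulated with a stable hash of the user id, so the same user always sees the same version; a minimal, self-contained Ruby sketch (the names here are illustrative, not from the workshop code):

require "digest"

# Deterministic "coin flip": the same user id always maps to the same version.
def variant_for(user_id, experiment)
  digest = Digest::MD5.hexdigest("#{experiment}:#{user_id}").to_i(16)
  digest.even? ? "a" : "b"
end

# Simulate 10,000 users to check that traffic splits roughly 50/50.
counts = Hash.new(0)
10_000.times { |id| counts[variant_for(id, "buy_now_button")] += 1 }
puts counts # e.g. {"a"=>4980, "b"=>5020}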

Data about the website is generated as users browse through pages and complete their tasks.

• product added to cart
• number of products added
• purchase finished
• average price per purchase
• number of products seen
• user has logged in
• used guest checkout
• customer service calls
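One way to capture signals like the ones above is to count named events per variant as users act; a minimal in-memory Ruby sketch (a real setup would persist these to an analytics backend, and the class name is just illustrative):

# Minimal in-memory event log, keyed by experiment variant.
class ExperimentLog
  def initialize
    @events = Hash.new { |hash, variant| hash[variant] = Hash.new(0) }
  end

  def record(variant, event)
    @events[variant][event] += 1
  end

  def count(variant, event)
    @events[variant][event]
  end
end

log = ExperimentLog.new
log.record("b", "product_added_to_cart")
log.record("b", "purchase_finished")
log.record("a", "used_guest_checkout")
puts log.count("b", "purchase_finished") # => 1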

When there’s enough information to make a decision, you can either stop the test (keeping version A) or choose version B, directing all traffic to it.

Example results for the “Buy now” test:

Duration: 14 days
Visitors: 45.140 (22.570 per variant)
Number of purchases: A 339 (1.5%) vs. B 407 (1.8%), 20% up
Average price: A 144.500 COP vs. B 147.390 COP, 2% up

[Diagram: during the test, a coin flip splits traffic 50% / 50% between the two “Buy now” versions; after choosing B, it receives 100% of the traffic]
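Before routing 100% of traffic to B, the uplift above can be sanity-checked for statistical significance; a rough Ruby sketch of a two-proportion z-test using the numbers from this example (the test choice and the 1.96 threshold are my assumption, not something stated in the slides):

# Conversions and visitors per variant, from the example above.
a_conversions, a_visitors = 339, 22_570
b_conversions, b_visitors = 407, 22_570

rate_a = a_conversions.to_f / a_visitors
rate_b = b_conversions.to_f / b_visitors
uplift = (rate_b - rate_a) / rate_a

# Two-proportion z-test on the pooled conversion rate.
pooled = (a_conversions + b_conversions).to_f / (a_visitors + b_visitors)
std_error = Math.sqrt(pooled * (1 - pooled) * (1.0 / a_visitors + 1.0 / b_visitors))
z = (rate_b - rate_a) / std_error

puts format("uplift: %.1f%%, z: %.2f", uplift * 100, z)
# z above roughly 1.96 corresponds to p < 0.05 (two-sided).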

"But my design is obviously more beautiful and intuitive than what we have now! Why should I run an A/B test?” — the majority of designers

Quiz time! (prizes included)

For each pair of designs shown, guess which performed better:

A: Raise your left hand
B: Raise your right hand
Neutral: Don’t raise your hands

The answers, round by round:

• Reduced bounce rate by 1.7%
• Increased CTR by 203%
• 43.4% more purchases
• Both were statistically equivalent

Intuition vs. Historical Analysis vs. Experimentation

We have a 2/3 chance of being wrong when trusting our intuition.

People behave differently each season/month/day of the week.

Different cultures lead to different patterns of usage.

Data analysis alone provides correlation but not causation.

Running your A/B test (in 5 simple steps)

Step 1: Hypothesis

Analyse all possible inputs to come up with a hypothesis to work on.

• Usability research • Benchmarking • Surveys • Data mining • Previous experiments

Hypothesis:

“If users from South American countries relate more to the website, they will book more.”

Step 2: Idea

Idea:

“If we add the country’s flag next to the website’s logo, users will relate more to the brand.”

Step 3: Setup

• Who will participate? • What is the primary metric? • Any secondary impacts? • How will it be implemented?

• Users from Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Paraguay, Peru, Suriname, Uruguay and Venezuela, on all platforms

• Conversion (net bookings) uplift is expected • We expect more returning customers

<h1 class="main-header__logo logo">
  <%# Variant B: show the user's country flag next to the logo %>
  <% if user.is_from_south_america && track_experiment(:header_flag_for_south_america) == "b" %>
    <span class="main-header__logo__country-flag">
      <%= user.country %>
    </span>
  <% end %>
  <%= image_tag "logo.png" %>
</h1>
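The template assumes a track_experiment helper that assigns the visitor to a variant, records their participation, and returns the variant name. The slides don't show its implementation; a hypothetical sketch (current_user and Analytics.track are placeholders for whatever the application provides; a library such as splitrb/split, linked at the end of the deck, offers this kind of helper out of the box):

require "digest"

VARIANTS = %w[a b].freeze

# Hypothetical helper: buckets the current user deterministically and
# records the exposure so they become part of the test.
def track_experiment(experiment_name)
  bucket = Digest::MD5.hexdigest("#{current_user.id}:#{experiment_name}").to_i(16)
  variant = VARIANTS[bucket % VARIANTS.size]
  Analytics.track(experiment: experiment_name, variant: variant, user_id: current_user.id)
  variant
end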

Step 4: Monitoring

Keep checking the metrics to see if anything’s terribly wrong.

Avoid checking too often; let your test get enough users and enough runtime.
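How many users are “enough” can be estimated before the test starts from the baseline conversion rate and the smallest uplift you care about; a rough Ruby sketch of the standard two-proportion sample-size formula (95% confidence, 80% power; the 1.5% baseline and 20% uplift are just the figures from the earlier example):

# Rough minimum number of users per variant for a two-proportion test.
def sample_size_per_variant(baseline_rate, relative_uplift, z_alpha: 1.96, z_beta: 0.84)
  p1 = baseline_rate
  p2 = baseline_rate * (1 + relative_uplift)
  variance = p1 * (1 - p1) + p2 * (1 - p2)
  (((z_alpha + z_beta)**2 * variance) / (p1 - p2)**2).ceil
end

puts sample_size_per_variant(0.015, 0.20) # => 28269, i.e. roughly 28,000 users per variant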

Step 5: Data, decisions, and next steps

When you reach the expected runtime, number of visitors, or effect size, look at the data and make a decision.


[Screenshot: Optimizely dashboard]

• How were the primary and secondary metrics impacted?

• What were the results isolated by each country?

• What were the results isolated by each language?

• Did any particular platform (desktop, mobile devices, tablets) perform better?

• Was the impact on returning customers any higher than first time visitors?
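Questions like these can be answered by slicing the same tracked events by attributes recorded at exposure time; a tiny Ruby sketch of per-country conversion rates (the data below is made up purely for illustration):

# Each exposure records the variant plus attributes to slice by later.
exposures = [
  { variant: "b", country: "CO", platform: "mobile",  converted: true  },
  { variant: "b", country: "CO", platform: "desktop", converted: false },
  { variant: "a", country: "BR", platform: "desktop", converted: true  },
  { variant: "b", country: "BR", platform: "mobile",  converted: false },
]

# Conversion rate per (country, variant) pair.
exposures.group_by { |e| [e[:country], e[:variant]] }.each do |(country, variant), group|
  rate = group.count { |e| e[:converted] }.to_f / group.size
  puts format("%s / variant %s: %.0f%% of %d users converted", country, variant, rate * 100, group.size)
end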

Based on the gathered data, plan for next steps.

• Should we add copy to the flag? • Should we add a tooltip to the flag? • Should we increase/decrease the flag size? • Should we restrict it just to desktop users? • Should we try this for a single country, or other countries?

What can you test?

(almost) Everything.

You can test a small design change.


You can test large design changes.


You can test different copy, for example “Submit” versus “Book now”.

You can test technical improvements and measure page load time, repaints/reflows, and conversion impact.

jQuery 1.11.3 versus jQuery 2.1.3
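A dependency swap like this can reuse the same helper from the earlier steps to decide which bundle to serve; a hypothetical ERB sketch (the asset names and the Rails javascript_include_tag helper are assumptions about the setup):

<% if track_experiment(:jquery_2_upgrade) == "b" %>
  <%= javascript_include_tag "jquery-2.1.3.min" %>
<% else %>
  <%= javascript_include_tag "jquery-1.11.3.min" %>
<% end %>

Page load time and repaints would then be measured client-side and reported alongside the variant.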

You can even test back-end optimisations and measure page load time, rendering time, CPU and memory usage etc.

# Variant B runs the optimised query; the control keeps the current behaviour.
if track_experiment(:my_optimized_query) == "b"
  @users = my_optimized_query
else
  @users = do_the_normal_thing
end
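Timing each branch on the server can feed those measurements into the experiment's data; a minimal sketch using Ruby's Benchmark module (record_metric is a hypothetical sink, not a real API):

require "benchmark"

variant = track_experiment(:my_optimized_query)

elapsed = Benchmark.realtime do
  @users = variant == "b" ? my_optimized_query : do_the_normal_thing
end

# Hypothetical metric sink: store the timing next to the variant.
record_metric(experiment: :my_optimized_query, variant: variant, seconds: elapsed)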

Live coding (I hope that works.)

Find the code at:

https://github.com/eshiota/ab_workshop

Additional links:

https://www.optimizely.com/ https://github.com/splitrb/split/

http://whichtestwon.com http://unbounce.com/

http://blog.booking.com/hamburger-menu.html http://blog.booking.com/concept-dne-execution.html

Gracias!