The Myths, Lies and Illusions of Cross Device Testing
Craig Sullivan, Optimiser of Everything, @OptimiseOrDie
@OptimiseOrDie
• Split Testing, Analytics, UX, Agile, Lean, Growth
• 50M+ visitors tested, 19 languages, over 200 sites
• 70+ mistakes I’ve made personally during testing
• Like riding a bike, really really badly…
• Optimise or Kickstart your programme?
• Get in touch!
Cross Device Testing Myths
1. Responsive solves everything
2. All our customers are on iPhones, right?
3. The customer journey is in your head
4. You don’t integrate with analytics
5. You think you’re tracking people
6. You only imagine the context
7. We think we have a hypothesis thingy
8. You think best practice is other tests
9. You just start testing, right?
10. What you see is what you get
11. 95% confidence is enough for me!
12. You always get the promised lift
13. Segmentation is too hard
14. Who cares if it’s a phone?
15. Testing makes you a data scientist
1. Motorola Hardware Menu Button
2. MS Word Bullet Button
3. Android Holo Composition Icon
4. Android Context Action Bar Overflow (top right on Android devices)
Increase in revenue of > $200,000 per annum!
bit.ly/hamburgertest
2: All our Customers use iPhones, right?
• Most common answer is “iPhones and iPads”
• Do you really know your mix?
• Most people undercount Android
• Use Google Analytics to find out
• Replace guesswork with truth!
• 2-3 hours work only
• Get your top testing mix right on: Desktop, Tablet, Mobile
• I’m writing an article – why?
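Finding your real device mix is only a few lines once you have the sessions-per-device-category report out of your analytics tool. A minimal sketch, with made-up session counts standing in for a real export:

```python
# Hypothetical sessions-per-device-category numbers (illustrative only,
# not real data) as you might export them from Google Analytics.
sessions = {"desktop": 41200, "mobile": 36800, "tablet": 9400}

total = sum(sessions.values())
# Percentage share of each device class, one decimal place.
mix = {device: round(100 * count / total, 1) for device, count in sessions.items()}

for device, share in sorted(mix.items(), key=lambda kv: -kv[1]):
    print(f"{device}: {share}%")
```

With these illustrative numbers, Android-heavy mobile traffic is over 40% of sessions, which is exactly the kind of split people undercount when they guess.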
2: Browser reports
IE11 is our top desktop browser?
Chrome is DOUBLE this user volume
2: Browser reports
• Be very careful when doing numbers on desktop browsers, tablets or mobile devices
• Chrome and Safari are on auto upgrades
• All browsers upgrade at different speeds
• Chrome and Firefox are fragmented as a result
• Same for mobile – there are thousands of Android handset models, but many behave similarly
• When looking at Apple, you need to split by model (you can use the resolution to figure this out)
• iPads you can’t distinguish in GA*
• Is the analysis you did really true?
• Cluster your devices or browsers where needed!
• Article is coming!
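Clustering version-level rows into families before ranking is the fix for the "IE11 is our top browser" illusion above. A sketch, with illustrative (not real) session counts:

```python
# Version-level browser rows, as in a GA browser report.
# The session counts here are made up for illustration.
rows = [
    ("Chrome 38", 9000), ("Chrome 37", 8500), ("Chrome 36", 4000),
    ("Firefox 32", 5000), ("Firefox 31", 3000),
    ("Internet Explorer 11", 12000), ("Safari 8", 6000),
]

clusters = {}
for browser, sessions in rows:
    family = browser.rsplit(" ", 1)[0]   # strip the trailing version number
    clusters[family] = clusters.get(family, 0) + sessions

# Row-by-row, IE 11 is the single biggest line; clustered by family,
# Chrome's fragmented versions add up to far more.
print(sorted(clusters.items(), key=lambda kv: -kv[1]))
```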
3: The Customer Journey is in your Head
• The routes people take are not what we expect.
• Analytics data and Usability research are big pointers!
• Most common problem is the team not owning, experiencing and being immersed in the problems with your key journeys
• One charity wasted nearly 0.5M on a poor pathway
• Can you imagine someone from McDonald’s never visiting their stores?
“So that’s how it looks on mobile!”
3: Customer Journey - Solutions
“Great user experiences happen beyond the screen and in the gaps.” – Paul Boag
• Test ALL key campaigns
• Use Real Devices
• Get your own emails
• Order your own products
• Call the phone numbers
• Send an email
• Send stuff back
• Be difficult
• Break things
• Experience the end-to-end
• Team are ALL mystery shoppers
• Wear the magical slippers of the actual customer experience!
• Be careful about dogfooding though!
4: Our AB testing tool tells us all we need…
• Investigating problems with tests
• Tests that fail, flip or move around
• Tests that don’t make sense
• Broken test setups
• Segmenting by Mobile, Tablet, Desktop
• Other customer segments
• What drives the averages you see?
5: You think you’re tracking people
[Funnel diagram: Product → Basket → Shipping → Details → Pay, drawn separately for Mobile and Desktop]
5: You think you’re tracking people
• Keep people logged in
• Use Social Logins
• Identify unique customers
• Feed this data to Universal Analytics
• Follow users, not just device experiences
• It’s an attribution problem!
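Feeding a cross-device user ID to Universal Analytics happens via its Measurement Protocol, which accepts both an anonymous client ID and your own user ID on each hit. A rough sketch of building such a hit; the property ID (`UA-XXXXX-Y`) and the `user-123` ID are placeholders, not real values:

```python
from urllib.parse import urlencode

def build_hit(client_id, user_id, page):
    """Build a Universal Analytics Measurement Protocol payload
    that carries both the device-level and user-level IDs."""
    payload = {
        "v": "1",               # protocol version
        "tid": "UA-XXXXX-Y",    # placeholder property ID
        "cid": client_id,       # anonymous browser/device client ID
        "uid": user_id,         # your own cross-device user ID
        "t": "pageview",        # hit type
        "dp": page,             # document path
    }
    # This string would be POSTed to the /collect endpoint.
    return urlencode(payload)

hit = build_hit("35009a79-1a05-49d7-b876-2b884d0f825b", "user-123", "/basket")
print(hit)
```

The same `uid` sent from the phone and the desktop browser is what lets the tool stitch the two halves of the journey together.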
6: You only imagine the context
• Tasks
• Goals
• Device
• Location
• Data rate
• Viewport
• Urgency
• Motivation
• Data costs
• Call costs
• Weather!
6: You only imagine the context - solutions
bit.ly/multichannels
7: Other people’s tests are Best Practice
“STOP copying your competitors
They may not know what the f*** they are doing either” Peep Laja, ConversionXL
Best Practice Testing?
• Your customers are not the same
• Your site is not the same
• Your advertising and traffic is not the same
• Your UX is not the same
• Your X-Device Mix is not the same
• Use them to inform or suggest approaches
• They’re like the picture on meal packets – Serving Suggestion Only
• There are obvious BEST PRACTICES, but these are usually in the category of ‘bugs’ or ‘UX problems’ – just fix those now!
We have no clue
Best Practice Testing?
“The Endless Suck of Best Practice and Optimisation Experts”bit.ly/socalledexperts
Insight - Inputs: #FAIL
Competitor copying, Guessing, Dice rolling, An article the CEO read, Competitor change, Panic, Ego, Opinion, Cherished notions, Marketing whims, Cosmic rays, Not ‘on brand’ enough, IT inflexibility, Internal company needs, Some dumbass consultant, Shiny feature blindness, Knee-jerk reactions
8: You think you have a Hypothesis!
Insight - Inputs
Segmentation, Surveys, Sales and Call Centre, Session Replay, Social analytics, Customer contact, Eye tracking, Usability testing, Forms analytics, Search analytics, Voice of Customer, Market research, A/B and MVT testing, Big & unstructured data, Web analytics, Competitor evals, Customer services
8: These are inputs you need…
Because we observed data [A] and feedback [B],
We believe that doing [C] for People [D] will make outcome [E] happen.
We’ll know this when we observe data [F] and obtain feedback [G].
(Reverse this)
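The template can even be enforced as a structure, so no slot gets skipped when someone proposes a test. A sketch; the class and field names are my own, not part of the original template:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """One field per slot of the hypothesis template above."""
    observed_data: str      # [A] data we observed
    feedback: str           # [B] feedback we gathered
    change: str             # [C] what we will do
    audience: str           # [D] who it is for
    outcome: str            # [E] expected outcome
    success_data: str       # [F] data that confirms it
    success_feedback: str   # [G] feedback that confirms it

    def render(self) -> str:
        return (
            f"Because we observed {self.observed_data} and feedback "
            f"{self.feedback}, we believe that doing {self.change} for "
            f"{self.audience} will make {self.outcome} happen. We'll know "
            f"this when we observe {self.success_data} and obtain feedback "
            f"{self.success_feedback}."
        )
```

If a proposer cannot fill in [A] and [B], it is an opinion, not a hypothesis.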
Because our CEO had an idea that nobody else agreed with:
We believe that putting Orange Buttons on our Homepage will make people feel ‘Funkier’
We’ll know this when…
9: You just start testing, right?
• Baseline Checks
• Analytics health check
• Developer onboarding
• Goals & Metrics
• Tool Setup
• Analytics & Modelling
• Discover ideas
• Prioritise
• Test cycles
TEST CYCLE: Hypothesis Design → Sketch → Wireframe → Mockup / Prototype → Signoff Process → Build → QA → Soft and Hard Launch → Analyse & Publish → Learn
10: What you see is what you get…
Desktop:
www.crossbrowsertesting.com
www.browserstack.com
www.spoon.net
www.saucelabs.com
www.multibrowserviewer.com
Mobile & Tablet:
www.appthwack.com
www.deviceanywhere.com
www.opendevicelab.com
Read this article: bit.ly/devicetesting
The 95% Stopping Problem
• Many people use 95% or 99% ‘confidence’ to stop
• This value is unreliable and moves around
• Nearly all my tests reach significance before they are actually ready
• You can hit 95% early in a test (18 minutes!)
• If you stop, it could be a false result
• Read this Nature article: bit.ly/1dwk0if
• Optimizely have changed their stats engine
• This 95% thingy is the cherry on the cake!
• Let me explain
The 95% Stopping Problem
                        Scenario 1     Scenario 2     Scenario 3     Scenario 4
After 200 observations  Insignificant  Insignificant  Significant!   Significant!
After 500 observations  Insignificant  Significant!   Insignificant  Significant!
End of experiment       Insignificant  Significant!   Insignificant  Significant!
“You should know that stopping a test once it’s significant is deadly sin number 1 in A/B testing land. 77% of A/A tests (testing the same thing as A and B) will reach significance at a certain point.”Ton Wesseling, Online Dialogue
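The peeking problem is easy to verify by simulation: run A/A tests (identical variants, zero true lift) and "stop at significance" at repeated interim looks; every significant result is then a false positive. A sketch; the exact rate depends on the peeking schedule, but it lands well above the nominal 5%:

```python
import random

def aa_test_peeks(n_per_arm=2000, peek_every=100, p=0.10, z_crit=1.96):
    """One A/A test: both arms have true rate p. Return True if a naive
    two-proportion z-test ever crosses 95% significance at a peek."""
    ca = cb = 0
    for i in range(1, n_per_arm + 1):
        ca += random.random() < p
        cb += random.random() < p
        if i % peek_every == 0:
            pa, pb = ca / i, cb / i
            pooled = (ca + cb) / (2 * i)
            se = (2 * pooled * (1 - pooled) / i) ** 0.5
            if se > 0 and abs(pa - pb) / se > z_crit:
                return True          # "significant" at some peek
    return False

random.seed(42)
runs = 500
false_hits = sum(aa_test_peeks() for _ in range(runs))
print(f"A/A tests hitting 95% significance at some peek: {false_hits / runs:.0%}")
```

With twenty peeks per test, far more than 5% of these no-difference tests look "significant" at some point, which is why stopping at the first green light is so dangerous.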
The 95% Stopping Problem
“Statistical Significance does not equal Validity”http://bit.ly/1wMfmY2
“Why every Internet Marketer should be a Statistician”http://bit.ly/1wMfs1G
“Understanding the Cycles in your site”http://mklnd.com/1pGSOUP
Business & Purchase Cycles
• Customers change
• Your traffic mix changes
• Markets, competitors
• Be aware of all the waves
• Always test whole cycles
• Don’t exclude slower buyers
• When you stop, let test subjects still complete!
[Timeline: Start Test → Finish, compared against the Avg Cycle]
How Long to Run My Test and When to Stop
• TWO BUSINESS CYCLES minimum (week/month)
• 1 PURCHASE CYCLE minimum
• 250 CONVERSIONS minimum per creative
• 350 & MORE! – it depends on response
• FULL WEEKS/CYCLES, never part of one
• KNOW what marketing, competitors and cycles are doing
• RUN a test length calculator – bit.ly/XqCxuu
• SET your test run time, RUN IT, STOP IT, ANALYSE IT
• ONLY RUN LONGER if you need more data
• DON’T RUN LONGER just because the test isn’t giving the result you want!
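A test-length calculator is essentially the standard two-proportion sample-size formula. A rough sketch (normal approximation, 95% confidence, 80% power) with an illustrative scenario of lifting a 3% baseline to 4%:

```python
import math

def sample_size_per_arm(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Visitors needed per variation to detect a move from rate p1 to p2
    at 95% confidence and 80% power (normal approximation)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

n = sample_size_per_arm(0.03, 0.04)
print(f"{n} visitors per variation before the test has a fair chance")
```

Divide that per-arm number by your daily traffic to each variation and you get a run time in days; then round up to full weeks and whole business cycles as above.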
The result is a range
• Version A is 3% conversion
• Version B is 4% conversion
• Yay! That’s a 25% lift
• Let’s tell everyone
• When it goes live, you do NOT get 25%
• That’s because it was A RANGE
• 3% +/- 0.5 (could be 2.5-3.5)
• 4% +/- 0.4 (could be 3.6-4.4)
• Actual result was 3.5% for A
• Actual result was 3.7% for B
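Quoting the range rather than the point estimate is one small function. A sketch using the Wald 95% interval, with illustrative counts (150 conversions from 5,000 visitors, matching the 3% example above):

```python
import math

def conversion_ci(conversions, visitors, z=1.96):
    """Return the Wald 95% confidence interval for a conversion rate."""
    p = conversions / visitors
    se = math.sqrt(p * (1 - p) / visitors)
    return p - z * se, p + z * se

lo, hi = conversion_ci(150, 5000)   # observed rate: 3.0%
print(f"3.0% observed, plausibly {lo:.1%} to {hi:.1%}")
# → "3.0% observed, plausibly 2.5% to 3.5%"
```

Report that whole interval to stakeholders, and the "promised" 25% lift becomes an honest overlap of two ranges instead of a prediction.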
Always Segment Experiences
• If you segment by devices, the sample gets smaller
• A = 350 conversions, B = 300 conversions
• Desktop A 200, Tablet A 100, Mobile A 50
• Desktop B 180, Tablet B 80, Mobile B 40
• It’s vital to segment by device class
• You may also segment by breakpoint, viewport or model
• Make sure you know the proportion of devices!
• If you want to analyse, plan ahead!
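A quick guard against under-powered segments, using the 250-conversions-per-creative rule of thumb from earlier; the segment counts mirror the illustrative numbers on this slide:

```python
# Minimum conversions per variation before a segment is worth reading
# (rule of thumb from the test-length slide).
MIN_CONVERSIONS = 250

# Per-segment conversion counts after splitting by device class
# (illustrative numbers from this slide, not real data).
segments = {
    "Desktop A": 200, "Tablet A": 100, "Mobile A": 50,
    "Desktop B": 180, "Tablet B": 80, "Mobile B": 40,
}

too_small = [name for name, conv in segments.items() if conv < MIN_CONVERSIONS]
print("Segments below the minimum:", too_small)
```

Here every segment falls short: a test that looked healthy overall (350 vs 300 conversions) has nothing readable at device level, which is why you plan segment sample sizes ahead.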
13: Wow – the browser is a phone too!
• Add call tracking
• Buy a solution or make your own!
• Measure calls
• ROI on phone mix
• Vital for PPC
• Explain the costs or make it free
14: SUMMARY
• Responsive solves everything – No, it’s just an attribute
• All our customers are on iPhones, right? – Make sure you know what they use!
• The customer journey is in your head – Customer Insight, Research, Data
• You don’t integrate with analytics – Use analytics, not the AB test data
• You think you’re tracking people – Have an authentication strategy
• You only imagine the context – Customer Insight, Diary Studies
• We think we have a hypothesis thingy – Challenge all work with my outline
• You think best practice is other tests – Leverage your customers, not theirs
14: SUMMARY (continued)
• You just start testing, right? – Preparation, Methodology, Prioritisation
• What you see is what you get – QA testing with your Customer Shizzle
• 95% confidence is enough for me! – Don’t stop tests when they hit 95%
• You always get the promised lift – Quote ranges, not predictions
• Segmentation is too hard – Segment, but watch sample sizes
• Who cares if it’s a phone? – Add call tracking or add ‘Tap 2 Call’
• Testing makes you a data scientist – No, it doesn’t; it makes you humble
Rumsfeldian Space
• What if we changed our prices?
• What if we gave away less for free?
• What if we took this away?
• What about 3 packages, not 5?
• What are these potential futures I can take?
• How can I know before I spend money?
• UPS left-hand turning – 10 Million Gallons saved: http://compass.ups.com/UPS-driver-avoid-left-turns/
• McDonald’s Hipster Test Store: bit.ly/1TiURi7
The 5-Legged Optimisation Barstool
#1 Culture & Team
#2 Toolkit & Analytics investment
#3 UX, CX, Service Design, Insight
#4 Persuasive Copywriting
#5 Experimentation (testing) tools