Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
TheEvolutionofContinuousExperimentationinSoftware
ProductDevelopmentFromDatatoaData-drivenOrganizationatScale
AleksanderFabijan
PavelDmitriev
HelenaHolmström Olsson
JanBosch
From Datatoa Data-drivenOrganization atScale
Howtoevolvecontrolledexperimentationtobecomedata-drivenatscale?
3
?
KeyLearnings
1. Thejourneyfromacompanywithdatatoadata-drivencompanyatscaleisanevolution.
2. Theexperimentationdoesnotneedtobecomplex(atfirst)!
3. Trustworthinessenablesscalability,andnot theotherwayaround!
4
Agenda
• Background&Motivation• ResearchMethod• TheExperimentationEvolutionModel
• Crawlstage• Walkstage• Runstage• Flystage
• ConclusionsandQ&A
5
Background&Motivation
• Softwarecompaniesareincreasinglyaimingtobecomedata-driven.
• Gettingdataiseasy. Gettingdatathatyoucantrust?Not thatmuch...
• Whatcustomerssaytheywant/dodiffersfromwhattheyactuallywant/do.
• Onlineconnectivityofproductsopensnewopportunities tocollectandusedatathatrevealswhatthecustomersactuallywant/do,
• OnlineControlledExperiments(e.g.A/Btests)can enableSoftwareCompaniestomoreaccuratelyidentifywhatdeliversvaluetotheircustomers.
6
Twist…
• Problem:Experimentationinlargesoftwarecompaniesischallenging.• RunningafewA/Btestsissimple.ScalingExperimentationnotsomuch!• Challengesincludeinstrumentation,dataloss,datapipelines,assumptionviolationsofclassicalstatisticalmethods,findingtherightmetrics,etc.
7
Problem/Solution
• Problem:Experimentationinlargesoftwarecompaniesischallenging.• RunningafewA/Btestsissimple.ScalingExperimentationnotsomuch!• Challengesincludeinstrumentation,dataloss,datapipelines,assumptionviolationsofclassicalstatisticalmethods,findingtherightmetrics,etc.
• Solution:Weprovidestep-by-stepguidance onhowtodevelop andevolve theexperimentationpractices(technical,organizationalandbusiness).
8
ResearchMethod
• Inductive case study conducted in collaboration with the Analysis andExperimentation team.
• Data Collection:• The study is based on historical data points (past experiments), and• Complemented with a series of semi-structured interviews, observations,
andmeeting participations.
• Data Analysis:• We grouped the collected data in four buckets, and performed iterative
category development to emerge with the three levels of evolution.
9
The Experimentation Evolution Model
• Ourmodelprovidesguidanceonhowtobecomedata-drivenatscalethroughtheevolution ofonlinecontrolledexperimentation.
• Weidentifyfourstagesofexperimentationevolution:
• WeidentifythemostimportantR&Dactivitiestofocusonineachofthestagesinordertoadvance.
10
The Experimentation Evolution Model
11
The Experimentation Evolution Model
12
Crawl stage(1xexperimentsyearly)
• Technicalfocus:• Logging ofsignals(clicks,dwelltimes,swipes,etc.)shouldbeimplemented,• Trustworthiness ofcollecteddatashouldbeconsidered(dataquality),• Analysisoftheexperimentresultscanbedonemanually.
• TeamFocus:• Productteamsgainmanagementsupportwiththefirstexperiments.
• Businessfocus:• TheOverallEvaluationCriteriaconsistsofafewkeysignals.
13
Contextual Bar Experiment
• ExperimentGoal:• Identifywhetherthecontextualcommandbarimproveseditingefficiency.
• ValueHypotheses:• (1)increasedcommonalityandfrequencyofedits,
• (2)increased2-weekretention.• Outcome:
• Theinitialexperimentwasunsuccessfulduetologgingmisconfiguration.
14
Walkstage(10xexperimentsyearly)
• Technicalfocus:• Startingtodevelop/integrateanexperimentationplatform.• Definingsuccessmetrics,debugmetrics,guardrailmetrics,anddata qualitymetrics,
• Teamfocus:• Productteamdesignsandexecutesexperimentsrelatedtotheirfeatures.
• Businessfocus:• TheOverallEvaluationCriteriainthisstageevolvesfromsignals toastructuredsetofmetrics (guardrail,success,anddata-qualitymetrics)
15
The “Xbox deals” experiment
16
• ExperimentGoal:• Identifytheimpactofshowingthediscountintheweeklydealsstripe.
• ValueHypotheses:• (1)increasedengagementwiththestripe
• (2)nodecreaseinpurchases.• Outcome:
• TreatmentCincreased bothengagement withthestripeandpurchasesmade.
Runstage(100xexperimentsyearly)
• Technicalfocus:• Features:alerting,controlofcarryovereffects,experimentiteration,etc.• Learningexperiments:Createcomprehensivemetrics.
• TeamFocus:• Experimentationexpandstootherfeatureteams(andotherproducts),• Teamscreate,executeandmonitorexperiments.
• Businessfocus:• TheOverallEvaluationCriteriaistailoredusingthelearningexperiments.
17
The “MSN.com personalization” experiment
18
• ExperimentGoal:• IdentifytheimpactofMLsortingofarticlesincomparisontoeditorcuratedarticles.
• ValueHypotheses:• (1)MLcuratedarticlesincreaseengagement
• Outcome:• Atfirst,MLarticlesperformedworsethaneditorcurating.Afterafewiterations thingschanged!
Flystage(1000+experimentsyearly)
• Technicalfocus:• Features:interactionbetweenexperiments,autonomousshutdown,andanaccumulationofinstitutionalmemory.
• Experimentresultsshouldbecomeintuitive(e.g.greenforgo/redforno-go)• TeamFocus:
• ProductTeamsexperimentwitheveryminorchangeintheportfolio,eventhesmallestbugfixes.
• Businessfocus:• TheOverallEvaluationCriteriaisstable.• OECcanbecomeusedtosettheperformancegoalsforteamsandameasureoftheirsuccess.
19
Bing Bot Detection Experiment
20
• ExperimentGoal:• Evaluatetheimproveddetectionofbots.
• ValueHypotheses:• (1)Nochangetorealuserexperience,• (2)Fewerresourcesusedtocomputesearchresults.
• Outcome:• ~10%savingoninfrastructureresources.
21
Conclusions
1. Thejourney fromacompanywithdatatoadata-drivencompanyatscaleisanevolution:• (1)culture,(2)business,andthe(3)technical capabilities.
2. Theexperimentationstarts‘easy’andbecomeschallenging!• Theneedforamoredetailedtrainingariseswhenmanyexperimentsarebeingexecutedbymanyproductteams• Theneedforamoresophisticatedplatformariseswhenmanyteamsexperimentandinterferewitheachother
1. Trustworthinessenablesscalability,andnot theotherwayaround!
22
23
Thankyou!