Using machine learning to determine drivers of bounce and conversion Velocity 2016 Santa Clara

Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion

Page 1: Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion

Using machine learning to determine drivers of bounce and conversion

Velocity 2016 Santa Clara

Page 2:

Pat Meenan (@patmeenan)

Tammy Everts (@tameverts)

Page 3:

What we did (and why we did it)

Page 4:

Get the code

https://github.com/WPO-Foundation/beacon-ml

Page 5:

Deep learning

weights

Page 6:

Random forest

Lots of random decision trees

Page 7:

Vectorizing the data

• Everything needs to be numeric
• Strings are converted to several yes/no (1/0) inputs
• e.g. device manufacturer: “Apple” would be a discrete input
• Watch out for input explosion (UA string)
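A minimal sketch of the yes/no encoding described on this slide, in plain Python. The function name `one_hot` and the `max_categories` cap are illustrative assumptions, not part of the beacon-ml code; the cap is one possible way to limit input explosion on high-cardinality fields such as the UA string.

```python
from collections import Counter


def one_hot(values, max_categories=None):
    """Turn a column of strings into 1/0 columns, one per category.

    max_categories (hypothetical parameter) keeps only the most common
    categories; rarer values encode as all zeros.
    """
    counts = Counter(values)
    # most_common(None) returns every category
    categories = sorted(c for c, _ in counts.most_common(max_categories))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Made-up sample column for illustration
manufacturers = ["Apple", "Samsung", "Apple", "LG"]
categories, rows = one_hot(manufacturers)
capped_categories, capped_rows = one_hot(manufacturers, max_categories=1)
```

Each string column becomes as many numeric inputs as it has distinct values, which is why a field with thousands of distinct values needs a cap or some coarser grouping.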

Page 8:

Balancing the data

• 3% conversion rate
• 97% accurate by always guessing no
• Subsample the data for 50/50 mix
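The arithmetic behind this slide, and one way to subsample, can be sketched as follows. The helper `balance_50_50` and the toy data are assumptions for illustration; the slides do not show how the original subsampling was done.

```python
import random


def balance_50_50(rows, labels, seed=0):
    """Randomly subsample both classes down to the size of the smaller one."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    keep = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(keep)  # avoid all positives appearing first
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data: 3 converting sessions out of 100 (3% conversion rate)
labels = [1] * 3 + [0] * 97
rows = [[i] for i in range(100)]

# A model that always guesses "no" is right on every non-converting session
always_no_accuracy = labels.count(0) / len(labels)

x_bal, y_bal = balance_50_50(rows, labels)
```

On the balanced set, always guessing "no" scores only 50%, so accuracy above that reflects something the model actually learned.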

Page 9:

Validation data

• Train on 80% of the data
• Validate on the other 20% to catch overfitting
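A minimal sketch of an 80/20 split in plain Python. The function name `train_val_split` and the fixed seed are assumptions for illustration, not taken from the original code.

```python
import random


def train_val_split(rows, labels, val_fraction=0.2, seed=0):
    """Shuffle indices once, then hold out val_fraction of them for validation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_val = int(round(len(idx) * val_fraction))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    x_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_val = [rows[i] for i in val_idx]
    y_val = [labels[i] for i in val_idx]
    return x_train, y_train, x_val, y_val


# Toy data: ten sessions with alternating labels
rows = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
x_train, y_train, x_val, y_val = train_val_split(rows, labels)
```

The model never sees the held-out 20% during training, so a large gap between training and validation accuracy is a sign of overfitting.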

Page 10:

Smoothing the data

ML works best on normally distributed data

    from sklearn.preprocessing import StandardScaler

    # Rescale each feature to zero mean and unit variance
    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)  # fit on the training data only
    x_val = scaler.transform(x_val)          # reuse the training statistics
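To make the scaling step concrete, here is a plain-Python sketch of the same idea for a single feature column: compute the mean and standard deviation on the training data, then apply them to both sets. The helper names `fit_scaler` and `transform` are hypothetical, and the numbers are made up.

```python
from statistics import mean, pstdev


def fit_scaler(column):
    """Return (mean, population std) of a training column."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # constant column: avoid dividing by zero
    return mu, sigma


def transform(column, mu, sigma):
    """Standardize a column using statistics learned from the training data."""
    return [(v - mu) / sigma for v in column]


# Made-up load times in seconds
train_col = [1.0, 2.0, 3.0]
val_col = [2.0]

mu, sigma = fit_scaler(train_col)
train_scaled = transform(train_col, mu, sigma)
val_scaled = transform(val_col, mu, sigma)  # uses training mu and sigma
```

Fitting on the training column only keeps information about the validation data from leaking into training.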

Page 11:

Input/output relationships

• SSL highly correlated with conversions
• Long sessions highly correlated with not bouncing
• Remove correlated features from training
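One possible reading of this slide is that inputs which nearly restate the output should be dropped before training. Under that assumption, a sketch in plain Python might flag them with a Pearson correlation against the label. The `leaky_features` helper, the 0.9 threshold, and the sample values are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation
    return cov / sqrt(vx * vy)


def leaky_features(columns, labels, threshold=0.9):
    """Names of features whose |correlation| with the label meets the threshold."""
    return sorted(
        name for name, col in columns.items()
        if abs(pearson(col, labels)) >= threshold
    )


# Made-up data: label 1 means the session did not bounce
labels = [0, 1, 1, 0, 1, 0]
columns = {
    "session_length": [1, 5, 6, 1, 7, 1],  # long sessions by definition did not bounce
    "dom_ready": [3, 1, 4, 1, 5, 9],
}
to_drop = leaky_features(columns, labels)
```

Features flagged this way tend to dominate the model without telling you anything actionable, which is why the slide recommends leaving them out.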

Page 12:

Training deep learning

    from keras.models import Sequential

    model = Sequential()
    model.add(...)
    model.compile(optimizer='adagrad',
                  loss='binary_crossentropy',
                  metrics=["accuracy"])
    model.fit(x_train, y_train,
              epochs=EPOCH_COUNT,  # nb_epoch on the original slide (Keras 1.x name)
              batch_size=32,
              validation_data=(x_val, y_val),
              verbose=2,
              shuffle=True)

Page 13:

Training random forest

    from sklearn.ensemble import RandomForestClassifier

    clf = RandomForestClassifier(n_estimators=FOREST_SIZE,
                                 criterion='gini',
                                 max_depth=None,
                                 min_samples_split=2,
                                 min_samples_leaf=1,
                                 min_weight_fraction_leaf=0.0,
                                 max_features='sqrt',  # 'auto' on the original slide; same behavior for classifiers
                                 max_leaf_nodes=None,
                                 bootstrap=True,
                                 oob_score=False,
                                 n_jobs=12,  # train trees in parallel on 12 cores
                                 random_state=None,
                                 verbose=2,
                                 warm_start=False,
                                 class_weight=None)
    clf.fit(x_train, y_train)

Page 14:

Feature importances

    clf.feature_importances_
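`feature_importances_` is an array with one score per input column, in column order. A sketch of turning such an array into a ranking ("N out of M") might look like this; the helper `rank_features`, the feature names, and the scores are made up for illustration.

```python
def rank_features(names, importances):
    """Map each feature name to its rank, where 1 is the most important."""
    order = sorted(range(len(names)), key=lambda i: importances[i], reverse=True)
    return {names[i]: rank for rank, i in enumerate(order, start=1)}


# Hypothetical names and scores standing in for clf.feature_importances_
names = ["dom_ready", "dns_lookup", "start_render"]
importances = [0.5, 0.1, 0.4]
ranks = rank_features(names, importances)
```

The names must be listed in the same order as the columns used for training, otherwise the ranks are attached to the wrong features.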

Page 15:

What we learned

Page 16:

What’s in our beacon?

• Top-level – domain, timestamp, SSL
• Session – start time, length (in pages), total load time
• User agent – browser, OS, mobile ISP
• Geo – country, city, organization, ISP, network speed
• Bandwidth
• Timers – base, custom, user-defined
• Custom metrics
• HTTP headers
• Etc.

Page 17:

Conversion rate

Page 18:

Conversion rate

Page 19:

Bounce rate

Page 20:

Bounce rate

Page 21:

Finding 1

Number of scripts was a predictor… but not in the way we expected

Page 22:

Number of scripts per page (median)

Page 23:

Finding 2

When entire sessions were more complex, they converted less

Page 24:

Finding 3

Sessions that converted had 38% fewer images than sessions that didn’t

Page 25:

Number of images per page (median)

Page 26:

Finding 4

DOM ready was the greatest indicator of bounce rate

Page 27:

DOM ready (median)

Page 28:

Finding 5

Full load time was the second greatest indicator of bounce rate

Page 29:

timers_loaded (median)

Page 30:

Finding 6

Mobile-related measurements weren’t meaningful predictors of conversions

Page 31:

Conversions

Page 32:

Finding 7

Some conventional metrics were (almost) meaningless, too

Page 33:

Feature          Importance (out of 93)
DNS lookup       79
Start render     69

Page 34:

Takeaways

Page 35:

1. YMMV
2. Do this with your own data
3. Gather your RUM data
4. Run the machine learning against it

Page 36:

Thanks!