Tutorial 11 (computational advertising)

DESCRIPTION

Part of the Search Engine course given at the Technion (2011)


Page 1: Tutorial 11 (computational advertising)

Computational advertising

Kira Radinsky

Slides based on material from the paper "Bandits for Taxonomies: A Model-based Approach" by Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabarti, Vanja Josifovski, in SDM 2007

Page 2: Tutorial 11 (computational advertising)

The Content Match Problem

[Diagram: advertisers submit ads to an ads DB; the ad server shows them on content pages.]

Ad impression: showing an ad to a user

Page 3: Tutorial 11 (computational advertising)

The Content Match Problem

[Same diagram, now with a user click.]

Ad click: a user click leads to revenue for the ad server and the content provider

Page 4: Tutorial 11 (computational advertising)

The Content Match Problem

[Same diagram.]

The Content Match Problem: match ads to pages to maximize clicks

Page 5: Tutorial 11 (computational advertising)

The Content Match Problem

Maximizing the number of clicks means: for each webpage, find the ad with the best click-through rate (CTR), but without wasting too many impressions on learning this.

Page 6: Tutorial 11 (computational advertising)

Outline

• Problem
• Background: Multi-armed bandits
• Proposed Multi-level Policy
• Experiments
• Conclusions

Page 7: Tutorial 11 (computational advertising)

Background: Bandits

[Diagram: bandit "arms" with unknown payoff probabilities p1, p2, p3.]

Pull arms sequentially so as to maximize the total expected reward:

• Estimate the payoff probabilities p_i
• Bias the estimation process towards better arms

Page 8: Tutorial 11 (computational advertising)

Background: Bandit Solutions

• Try 1: Greedy solution: compute the sample mean of an arm by dividing the total reward received from the arm by the number of times it has been pulled. At each time step, choose the arm with the highest sample mean.

• Try 2: Naïve solution: pull each arm an equal number of times.

• Epsilon-greedy strategy (see the sketch below): the best arm is selected for a proportion 1 − ε of the trials, and another arm is selected uniformly at random for a proportion ε.

• Many more strategies exist.
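A minimal sketch of the epsilon-greedy strategy in Python, assuming Bernoulli (click/no-click) payoffs; the arm probabilities and the value of ε below are illustrative, not from the tutorial:

```python
import random

def epsilon_greedy(true_probs, epsilon=0.1, n_rounds=10000):
    """Epsilon-greedy over Bernoulli arms: exploit the best sample mean
    with probability 1 - epsilon, otherwise explore a random arm."""
    n_arms = len(true_probs)
    pulls = [0] * n_arms      # times each arm was pulled
    rewards = [0.0] * n_arms  # total reward per arm
    total = 0.0
    for _ in range(n_rounds):
        if random.random() < epsilon or 0 in pulls:
            arm = random.randrange(n_arms)  # explore (also covers unpulled arms)
        else:
            # exploit: arm with the highest sample mean so far
            arm = max(range(n_arms), key=lambda a: rewards[a] / pulls[a])
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward
        total += reward
    return total, pulls

# Example: three arms whose payoff probabilities are unknown to the policy
total, pulls = epsilon_greedy([0.02, 0.05, 0.01])
```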

Page 11: Tutorial 11 (computational advertising)

Background: Bandits

A bandit policy operates in a loop:

1. Assign a priority to each arm (allocation)
2. "Pull" the arm with the maximum priority, and observe the reward
3. Update the priorities (estimation)
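A sketch of this priority-based loop in Python. The slides leave the priority rule abstract; UCB1 priorities are used here as one common choice, which is an assumption of this sketch:

```python
import math, random

def priority_loop(true_probs, n_rounds=10000):
    """Generic bandit loop: assign priorities, pull the max-priority arm,
    observe the reward, update. UCB1 priority = mean + sqrt(2 ln t / pulls)."""
    n = len(true_probs)
    pulls = [0] * n
    means = [0.0] * n
    for t in range(1, n_rounds + 1):
        # 1. Assign a priority to each arm (unpulled arms get infinite priority)
        def priority(a):
            if pulls[a] == 0:
                return float("inf")
            return means[a] + math.sqrt(2 * math.log(t) / pulls[a])
        # 2. Pull the arm with maximum priority and observe the reward
        arm = max(range(n), key=priority)
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        # 3. Update the priorities via the running sample mean
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]
    return pulls
```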

Page 12: Tutorial 11 (computational advertising)

Background: Bandits

Why not simply apply a bandit policy directly to the problem?

• Convergence is too slow: ~10^9 instances of the MAB problem (bandits), with ~10^6 arms per instance (bandit)
• Additional structure is available that can help: taxonomies

Page 13: Tutorial 11 (computational advertising)

Outline

• Problem
• Background: Multi-armed bandits
• Proposed Multi-level Policy
• Experiments
• Conclusions

Page 14: Tutorial 11 (computational advertising)

Multi-level Policy

[Diagram: taxonomies over ads and webpages, each grouped into classes.]

Consider only two levels.

Page 15: Tutorial 11 (computational advertising)

Multi-level Policy

[Diagram: grid of page classes × ad parent classes (Apparel, Computers, Travel), with ad child classes subdividing each ad parent class. A block of cells under one ad parent class is one MAB problem instance (bandit).]

Consider only two levels.

Page 16: Tutorial 11 (computational advertising)

Multi-level Policy

[Same grid diagram: ad parent classes, ad child classes; each block is one MAB problem instance (bandit).]

Key idea: CTRs in a block are homogeneous.

Page 17: Tutorial 11 (computational advertising)

Multi-level Policy

• CTRs in a block are homogeneous
– Used in allocation (picking an ad for each new page)
– Used in estimation (updating priorities after each observation)

Page 18: Tutorial 11 (computational advertising)

Multi-level Policy

• CTRs in a block are homogeneous
– Used in allocation (picking an ad for each new page)
– Used in estimation (updating priorities after each observation)

Page 19: Tutorial 11 (computational advertising)

Multi-level Policy (Allocation)

[Diagram: a page classifier maps the incoming webpage into the grid of classes (A, C, T); "?" marks the cell whose ad must be chosen.]

• Classify webpage → page class, parent page class
• Run bandit on ad parent classes → pick one ad parent class

Page 20: Tutorial 11 (computational advertising)

Multi-level Policy (Allocation)

[Same diagram, now with the chosen ad.]

• Classify webpage → page class, parent page class
• Run bandit on ad parent classes → pick one ad parent class
• Run bandit among cells → pick one ad class
• In general, continue from root to leaf → final ad
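A minimal sketch of this top-down allocation, reusing the priority-based bandit from the earlier sketch. The class structure and names (`UCBBandit`, `parent_bandits`, `child_bandits`) are illustrative assumptions, not the paper's implementation:

```python
import math

class UCBBandit:
    """One bandit instance over a set of arms (UCB1 priorities, an assumption)."""
    def __init__(self, arms):
        self.arms = list(arms)
        self.pulls = {a: 0 for a in self.arms}
        self.means = {a: 0.0 for a in self.arms}
        self.t = 0

    def pick(self):
        self.t += 1
        def priority(a):
            if self.pulls[a] == 0:
                return float("inf")
            return self.means[a] + math.sqrt(2 * math.log(self.t) / self.pulls[a])
        return max(self.arms, key=priority)

    def update(self, arm, reward):
        self.pulls[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.pulls[arm]

def allocate(page_class, parent_bandits, child_bandits):
    """Top-down allocation: the parent-level bandit picks an ad parent class,
    then the block's child-level bandit picks an ad class within it."""
    parent = parent_bandits[page_class].pick()
    child = child_bandits[(page_class, parent)].pick()
    return parent, child

# Example: one parent-level bandit per page class, one child-level bandit per block
classes = ["Apparel", "Computers", "Travel"]
parent_bandits = {"Travel": UCBBandit(classes)}
child_bandits = {("Travel", p): UCBBandit([p + "/1", p + "/2"]) for p in classes}
print(allocate("Travel", parent_bandits, child_bandits))
```

After observing a click or non-click, both the parent-level and child-level bandits would be updated with the reward.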

Page 21: Tutorial 11 (computational advertising)

Multi-level Policy (Allocation)

[Same diagram.]

Bandits at higher levels
• use aggregated information
• have fewer bandit arms
→ quickly figure out the best ad parent class

Page 22: Tutorial 11 (computational advertising)

Multi-level Policy

• CTRs in a block are homogeneous
– Used in allocation (picking an ad for each new page)
– Used in estimation (updating priorities after each observation)

Page 23: Tutorial 11 (computational advertising)

Multi-level Policy (Estimation)

• CTRs in a block are homogeneous

– Observations from one cell also give information about others in the block

– How can we model this dependence?

Page 24: Tutorial 11 (computational advertising)

Multi-level Policy (Estimation)

• Shrinkage model:

S_cell | CTR_cell ~ Bin(N_cell, CTR_cell)

CTR_cell ~ Beta(params_block)

where S_cell is the number of clicks in the cell and N_cell is the number of impressions in the cell.

All cells in a block come from the same distribution.
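A small sketch of the resulting estimate in Python, assuming the block prior is Beta(a, b); the parameter values below are illustrative (the block prior parameters would be fitted from the block's data):

```python
def shrunk_ctr(clicks, impressions, a, b):
    """Posterior-mean CTR of a cell under a Beta(a, b) block prior and a
    Binomial likelihood: E[CTR | S, N] = (a + S) / (a + b + N)."""
    return (a + clicks) / (a + b + impressions)

# A sparsely observed cell is pulled towards the block prior mean a/(a+b) = 0.02;
# a heavily observed cell stays near its observed CTR clicks/impressions.
print(shrunk_ctr(clicks=1, impressions=10, a=2.0, b=98.0))      # ~0.027, near the prior
print(shrunk_ctr(clicks=100, impressions=1000, a=2.0, b=98.0))  # ~0.093, near observed 0.1
```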

Page 25: Tutorial 11 (computational advertising)

Multi-level Policy (Estimation)

• Intuitively, this leads to shrinkage of cell CTRs towards block CTRs:

E[CTR_cell] = α · prior_block + (1 − α) · S_cell / N_cell

where E[CTR_cell] is the estimated CTR, prior_block is the Beta prior mean (the "block CTR"), and S_cell / N_cell is the observed CTR.
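To see where the weight α comes from (a standard Beta-Binomial identity, not spelled out on the slide): with a Beta(a, b) block prior, the posterior mean of the cell CTR is

E[CTR_cell | S_cell, N_cell] = (a + S_cell) / (a + b + N_cell)
                             = α · a/(a + b) + (1 − α) · S_cell/N_cell,  with α = (a + b) / (a + b + N_cell)

so α shrinks towards 0 as the cell accumulates impressions, and the estimate moves from the block prior towards the observed CTR.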

Page 26: Tutorial 11 (computational advertising)

Outline

• Problem
• Background: Multi-armed bandits
• Proposed Multi-level Policy
• Experiments
• Conclusions

Page 27: Tutorial 11 (computational advertising)

Experiments [S. Pandey et al. 2007]

[Taxonomy structure: root at depth 0, 20 nodes at depth 1, 221 nodes at depth 2, and ~7000 leaves down to depth 7; the multi-level policy uses the two levels at depths 1 and 2.]

Page 28: Tutorial 11 (computational advertising)

Experiments

• Data collected over a 1-day period

• Collected from only one server, under some other ad-matching rules (not our bandit)

• ~229M impressions

• CTR values have been linearly transformed for confidentiality

Page 29: Tutorial 11 (computational advertising)

Experiments (Multi-level Policy)

Multi-level gives a much higher number of clicks.

[Plot: clicks vs. number of pulls.]

Page 30: Tutorial 11 (computational advertising)

Experiments (Multi-level Policy)

Multi-level gives much better mean-squared error: it has learnt more from its explorations.

[Plot: mean-squared error vs. number of pulls.]

Page 31: Tutorial 11 (computational advertising)

Conclusions

• In a CTR-guided system, exploration is a key component

• The short-term penalty for exploration needs to be limited (exploration budget)

• Most exploration mechanisms use a weighted combination of the predicted CTR (mean) and the CTR uncertainty (variance)

• Exploration in a reduced-dimensional space: the class hierarchy

• Top-down traversal of the hierarchy determines the class of the ad to show