A Technical Introduction to RTBkit: Open-Source RTB for Everyone


DESCRIPTION

Datacratic is the leader in real-time machine learning and decisioning and the creator of the RTBkit Open-Source Project. Mark Weiss, head of client solutions at Datacratic, shares some of the challenges companies and developers face today as they move into Real Time Bidding. In this presentation he does a developer deep dive into design and implementation choices, technologies and plugins, and provides some real-world RTB customer use cases. You will also learn how you can join the RTBkit community and get support for your upcoming RTBkit initiatives.


Page 1: A Technical Introduction to RTBkit

A Technical Introduction to RTBkit

Open-Source RTB for Everyone

Page 2: A Technical Introduction to RTBkit

Overview
● The Project
● RTB Competitive Landscape
● The Problems With RTB
  ○ System
  ○ Selection
  ○ Value
● How RTBkit Addresses the Problems with RTB
● Demo

Page 3: A Technical Introduction to RTBkit

A Little Bit About RTBkit

Page 4: A Technical Introduction to RTBkit

A Little History

● Created by machine-learning and digital marketing company Datacratic

● Code base evolved from running RTB in production from 2011-2013

● Open sourced in Feb. 2013, with ongoing support from Datacratic

● Apache-style governance started Jan. 2014

Page 5: A Technical Introduction to RTBkit

Participation and Governance

● Apache-style governance
  ○ BDNFL - Benevolent Dictator Not for Life
  ○ Councillors
  ○ Committers
● Outside contributions welcome
● GitHub pull request workflow -- committers review and merge
● Contributor guidelines
● Users can become Contributors
● Contributors can become Committers -- currently two outside Committers

Page 6: A Technical Introduction to RTBkit

Support, Community, Adoption

● Free support from the community and Datacratic

● Community support from 100s of users in 25+ countries

● Datacratic provides engineering support for development, code review, governance and evolution

● Participation and contributions from Rubicon Project

● 230 active developers
● 35 committers, 11 outside of Datacratic
● 10 installations in prod: N. America, Germany, France, Russia, Argentina, China

Page 7: A Technical Introduction to RTBkit

Development Support

● Getting Started Guide
● Working test system:
  ○ mock Exchange configurable to run any bid requests
  ○ mock Ad Server
  ○ fixed-price Bidding Agent
● Example Code
● Documentation
● Packaging script and weekly tagged packages for download
● Ubuntu AMI (ami-31acd858)
● Google Group support
● Pull request review and support

Page 8: A Technical Introduction to RTBkit

User Profiles - Reason for Adopting

Data from ongoing survey, 50 responses

Page 9: A Technical Introduction to RTBkit

User Profiles - Expected Spend

Data from ongoing survey, 50 responses

Page 10: A Technical Introduction to RTBkit

User Profiles - Type of Inventory

Data from ongoing survey, 50 responses

Page 11: A Technical Introduction to RTBkit

User Profiles - Geographic Targets

Data from ongoing survey, 50 responses

Page 12: A Technical Introduction to RTBkit

The Problems With RTB

Page 13: A Technical Introduction to RTBkit

The Problems With RTB

SYSTEM SELECTION VALUE

Provided by RTBkit Customized by User

General/Technical Specific/Business

Page 14: A Technical Introduction to RTBkit

RTBkit Solves the RTB System Problem

SYSTEM: Scale, Speed, Distribution, Reliability
SELECTION
VALUE

General/Technical Specific/Business

Provided by RTBkit Customized by User

Page 15: A Technical Introduction to RTBkit

RTBkit Addresses the RTB Selection Problem

SYSTEM: Scale, Speed, Distribution, Reliability
SELECTION: Show user an ad? What ad?
VALUE

Provided by RTBkit Customized by User

General/Technical Specific/Business

Page 16: A Technical Introduction to RTBkit

RTBkit Addresses the RTB Value Problem

SYSTEM: Scale, Speed, Distribution, Reliability
SELECTION: Show user an ad? What ad?
VALUE: What is it worth? What should I pay?

General/Technical Specific/Business

Provided by RTBkit Customized by User

Page 17: A Technical Introduction to RTBkit

RTB Competitive Landscape

System: Exchange / DSP UI
● Pros: Easy to get started
● Cons: Manual, hard to scale; lack of control over bidding strategy and data
● Degree of Difficulty: Low

System: Intermediate Hosted Bidding
● Pros: More control over bidding strategy and use of data; don't have to do Ops
● Cons: Strategy and use of data mediated by vendor and product features
● Degree of Difficulty: Medium

System: Roll Your Own Bidder
● Pros: Full control of all aspects of the system
● Cons: Solely responsible for everything
● Degree of Difficulty: Hardest

System: RTBkit
● Pros: Benefit from core problems being solved; benefit from community; flexible customization
● Cons: Full control (optionally) but requires digging in; responsible for ops
● Degree of Difficulty: Hard

Page 18: A Technical Introduction to RTBkit

How RTBkit Addresses the System Problem

Page 19: A Technical Introduction to RTBkit

Architectural Overview

RTBkit Core
● Router
● Banker
● Post Auction Service
● Service Monitor
● Agent Configuration Service

Plugins
● Exchange Connectors
● AdServer Connector
● Bidding Agent
● Augmenter
● Logger

Page 20: A Technical Introduction to RTBkit

Bidder Core Responsibilities

● Core working bidder system
● High-performance real-time components
● Multiple data center support
● Reliable global banker updated once per second with guarantees against overspend
● Strongly typed currency support
● Guaranteed response time to exchanges
● Automatic load shedding
● Flexible high-performance filtering of bid requests
● High-performance parsing, routing, filtering, logging and monitoring

Page 21: A Technical Introduction to RTBkit

Inventory Integration

● Ships with 9 Exchange Connectors:
  ○ Rubicon Project
  ○ AdX
  ○ FBX
  ○ AppNexus
  ○ Nexage
  ○ MoPub
  ○ GumGum
  ○ BidSwitch
● OpenRTB 2.1 support

Page 22: A Technical Introduction to RTBkit

Router Responsibilities

● Gets bid requests from Exchange Connector
● Uses Filters to filter eligible campaigns
● Passes bid requests through Augmenter
● Passes bid requests to Bidding Agents to generate bid responses
● Communicates with Banker to guarantee no overspend
● Guarantees timely response
● Only runs if system components are available

Page 23: A Technical Introduction to RTBkit

Router Components
● Gets bid requests from Exchange Connector
● Uses Filters to filter eligible campaigns
● Passes bid requests through Augmenter
● Passes bid requests to Bidding Agents to generate bid responses
● Communicates with Banker to guarantee no overspend
● Guarantees timely response
● Only runs if system components are available

[Diagram: RTBkit Router and connected components: Exchange, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]

Page 24: A Technical Introduction to RTBkit

Router Data Flows
● Controls the amount of data flowing through
● Dynamically directs the Exchange Connector to shed load to guarantee timely response

[Diagram: Router data flows between Exchange, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]
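
The load-shedding idea on this slide can be illustrated with a small feedback controller. This is only a sketch under stated assumptions, not RTBkit's implementation: the class name LoadShedder, the fixed step size and the latency budget parameter are all hypothetical. It raises a drop probability while observed latency exceeds the budget and lowers it again when latency recovers.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical controller, not RTBkit code: adjusts the fraction of
// incoming bid requests to drop, based on observed processing latency.
class LoadShedder {
public:
    explicit LoadShedder(double latencyBudgetMs, double step = 0.05)
        : budgetMs_(latencyBudgetMs), step_(step) {
        assert(latencyBudgetMs > 0.0 && step > 0.0);
    }

    // Feed the latest observed latency. Over budget: shed more.
    // At or under budget: shed less. Result is kept within [0, 1].
    void observe(double latencyMs) {
        dropProbability_ += (latencyMs > budgetMs_) ? step_ : -step_;
        dropProbability_ = std::max(0.0, std::min(1.0, dropProbability_));
    }

    // Decide for one request, given a uniform sample in [0, 1).
    bool shouldDrop(double uniformSample) const {
        return uniformSample < dropProbability_;
    }

    double dropProbability() const { return dropProbability_; }

private:
    double budgetMs_;
    double step_;
    double dropProbability_ = 0.0;
};
```

In this sketch the Exchange Connector would call shouldDrop() before doing any expensive work, so that shed requests cost almost nothing.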

Page 25: A Technical Introduction to RTBkit

Bid Request Lifecycle

[Diagram: bid request lifecycle through Exchange, Exchange Connector, RTBkit Router (Static Filters, Dynamic Filters), Augmenter, Bidding Agents, Post Auction Service, Ad Server Connector, Ad Server and Conversion Source]

Page 26: A Technical Introduction to RTBkit

RTBkit Data Flows
● Five asynchronous data flows pass through the Router:
  ○ Bid request processing
  ○ Banking updates
  ○ Event Matching
  ○ Notifying Bidding Agents of Events
  ○ Filtering and Bidding Agent configuration

[Diagram: RTBkit data flows between Exchange, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]

Page 27: A Technical Introduction to RTBkit

Ad Server and Conversion Integration

● Standard HTTP JSON connector for receiving Wins, Clicks and Conversions
● Event matching of Wins to bid response
● Event matching of Clicks and Conversions to Wins
● Logging of all campaign events and matched campaign events

Page 28: A Technical Introduction to RTBkit

Post Auction Service

● Clearinghouse for matching all bids to Wins, Clicks and Conversions
● Router sends Bid Request messages
● Ad Server Connector sends Wins, Clicks and Conversions
● Matched Clicks and Conversions similarly generate Matched messages
● 15-minute window or bid is Inferred Loss
● Match events sent to Logger and to Bidding Agents
● Shadow account spend bookkeeping
● Current bottleneck, can process events in the hundreds / sec
● Recently improved: sharded hash tables, one thread per core
● More improvements on the near roadmap

[Diagram: Post Auction Service with inputs Bids (from Router) and Events (from Ad Server Connector); outputs Matched Events (to Logger) and Matched Events, Wins and Inferred Losses (to Bidding Agents); Shadow Account]
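
The matching window described above can be sketched as a table of pending bids keyed by auction id. This is a minimal illustration, not the Post Auction Service's actual data structure: BidMatcher, its method names and the use of plain second timestamps are assumptions. A win matches only while its bid is pending; bids still pending after 15 minutes are reported as inferred losses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// 15-minute matching window from the slide, in seconds.
constexpr std::int64_t kMatchWindowSeconds = 15 * 60;

// Hypothetical matcher, not RTBkit code.
class BidMatcher {
public:
    // Record a submitted bid and the time it was sent.
    void onBid(const std::string& auctionId, std::int64_t nowSeconds) {
        pending_[auctionId] = nowSeconds;
    }

    // Returns true for a matched win; false if no pending bid exists
    // (unknown auction, already matched, or already expired).
    bool onWin(const std::string& auctionId) {
        auto it = pending_.find(auctionId);
        if (it == pending_.end()) return false;
        pending_.erase(it);
        return true;
    }

    // Remove bids older than the window and return them as inferred losses.
    std::vector<std::string> expire(std::int64_t nowSeconds) {
        std::vector<std::string> losses;
        for (auto it = pending_.begin(); it != pending_.end();) {
            if (nowSeconds - it->second >= kMatchWindowSeconds) {
                losses.push_back(it->first);
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
        return losses;
    }

private:
    std::unordered_map<std::string, std::int64_t> pending_;
};
```

A single map like this is the kind of structure that becomes a bottleneck under load, which is presumably why the slide mentions sharded hash tables with one thread per core.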

Page 29: A Technical Introduction to RTBkit

Banker Responsibilities

● Single source of truth for budget available for each Campaign and for each Account

● Authorizes spending of Campaign budget by Bidding Agents for a Campaign

● Enforces that each Budget has one Account owner

● Caps per-Campaign and per-Account spending

● Guarantees won't overspend if wins are cheaper than bids

● Insulates banker state from "shadow account" bookkeeping in Router and Post Auction Service

Page 30: A Technical Introduction to RTBkit

Banker Design

● Totals always go up -- you can always reason about the relative timing of entries
● Double-entry bookkeeping
● Multiple increasing Currency Pools
● Atomic, idempotent persistence
● Designed for high-latency, low-bandwidth unreliable connections
● Updates global state once per minute

Page 31: A Technical Introduction to RTBkit

Banker Account Types

Budget Account
● All budget for Account tree set in Master Banker Budget Account
● Cannot bid from this account
● Cannot track spend directly
● Can transfer budget into child Spend Accounts
● Can have child Spend Accounts
● Only exists in the Master Banker

Spend Account
● Must have a parent Budget Account
● Can bid from this account
● Can track spend directly
● Cannot have children
● Can be shadowed into a separate process

[Diagram: Master Banker holding a Budget Account with two child Spend Accounts]

Page 32: A Technical Introduction to RTBkit

Account Hierarchies and Spend Tracking

Banker - Account Hierarchies
● Spend accumulates from Children to Parent Budget Account in Master Banker
● Temporary bookkeeping for bids happens in shadow accounts in separate processes
● Shadows sync once per second
  ○ Router shadow - tracks budget committed to pending bids
  ○ PAS shadow - tracks budget to debit on Wins and to credit on Losses
● Natural partitioning will allow for sharding

Spend Concepts
● Budget: amount allowed to spend
● Spent: amount actually spent
● inFlight: amount in live bids
● Allocated: amount allocated to sub-accounts
● Adjustments: sum of adjustments

[Diagram: Master Banker holds Budget A with Spend accounts A:B and A:C; the Router Slave Banker and the PAS Slave Banker each hold Shadow Spend accounts A:B and A:C]

Page 33: A Technical Introduction to RTBkit

Banker Currency Pools

● Currency Pools store entries as 64-bit integers
● Multiple Currency Pools per Account
● Each Account hierarchy mutated by a single process
● Strongly typed Currency, won't allow cross-currency conversions
● Automatic scaling conversions (e.g. CPM to micro-dollars)
● Debit and Credit Pools
● Credit operations -> increase to a credit Currency Pool in a hierarchy
● Debit operations -> increase to a debit Currency Pool in a hierarchy

Account Currency Pools

Credit             | Debit
budgetIncreases    | budgetDecreases
allocatedIn        | allocatedOut
recycledIn         | recycledOut
commitmentsRetired | commitmentsMade
adjustmentsIn      | adjustmentsOut
                   | spent
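
Strong typing and scaling conversions can be illustrated with one wrapper type per currency. The type names MicroUSD and MicroEUR and the helper perImpressionFromCpmUSD are hypothetical, not RTBkit's currency classes. The arithmetic is: a CPM price covers 1000 impressions, so a CPM of 1 dollar is 1,000,000 / 1000 = 1000 micro-dollars per impression.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical strongly typed amounts stored as 64-bit integers.
struct MicroUSD { std::int64_t value; };
struct MicroEUR { std::int64_t value; };

// Addition is defined only within one currency. There is deliberately no
// operator+(MicroUSD, MicroEUR), so mixing currencies fails to compile.
inline MicroUSD operator+(MicroUSD a, MicroUSD b) {
    return MicroUSD{a.value + b.value};
}
inline MicroEUR operator+(MicroEUR a, MicroEUR b) {
    return MicroEUR{a.value + b.value};
}

// Scale a CPM price in dollars to a per-impression price in micro-dollars:
// cpm dollars / 1000 impressions * 1,000,000 micro-dollars per dollar.
inline MicroUSD perImpressionFromCpmUSD(double cpmDollars) {
    return MicroUSD{static_cast<std::int64_t>(std::llround(cpmDollars * 1000.0))};
}
```

For example, a 2.50 dollar CPM bid becomes 2500 micro-dollars for a single impression, which fits the integer pools without rounding drift.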

Page 34: A Technical Introduction to RTBkit

Banker Account Tree Currency Pools

Name | Formula | Description
tree.budget | budgetIncreases - budgetDecreases | Tree max spendable amount
tree.inFlight | sum(commitmentsMade) - sum(commitmentsRetired) | Total outstanding bids
tree.spent | sum(spent) | Total spent
tree.adjustments | sum(adjustmentsIn) - sum(adjustmentsOut) | Total adjustments to spent
tree.effectiveBudget | sum(budgetIncreases) - sum(budgetDecreases) + sum(recycledIn) - sum(recycledOut) + sum(allocatedIn) - sum(allocatedOut) | Max spendable amount according to current internal state
tree.adjustedSpent | tree.spent - tree.adjustments | Total spent after adjustments
tree.available | tree.effectiveBudget - tree.adjustedSpent - tree.inFlight | Tree remaining spendable amount
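
The formulas in this table can be checked with a short calculation. The sketch below sums the pools over the accounts of one tree and derives tree.available exactly as the table states. The struct and function names (Pools, TreeTotals, summarize) are assumptions made for illustration; only the pool names come from the slides.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Per-account currency pools, named as on the slides (micro-dollars).
struct Pools {
    std::int64_t budgetIncreases = 0, budgetDecreases = 0;
    std::int64_t recycledIn = 0, recycledOut = 0;
    std::int64_t allocatedIn = 0, allocatedOut = 0;
    std::int64_t commitmentsMade = 0, commitmentsRetired = 0;
    std::int64_t adjustmentsIn = 0, adjustmentsOut = 0;
    std::int64_t spent = 0;
};

struct TreeTotals {
    std::int64_t effectiveBudget = 0;
    std::int64_t inFlight = 0;
    std::int64_t adjustedSpent = 0;
    std::int64_t available = 0;
};

// Hypothetical helper: applies the table's formulas over all accounts
// in one account tree.
inline TreeTotals summarize(const std::vector<Pools>& tree) {
    TreeTotals t;
    std::int64_t spent = 0, adjustments = 0;
    for (const Pools& p : tree) {
        t.effectiveBudget += p.budgetIncreases - p.budgetDecreases
                           + p.recycledIn - p.recycledOut
                           + p.allocatedIn - p.allocatedOut;
        t.inFlight += p.commitmentsMade - p.commitmentsRetired;
        spent += p.spent;
        adjustments += p.adjustmentsIn - p.adjustmentsOut;
    }
    t.adjustedSpent = spent - adjustments;
    t.available = t.effectiveBudget - t.adjustedSpent - t.inFlight;
    return t;
}
```

Note that a transfer from parent to child appears once as allocatedOut and once as allocatedIn, so it cancels in the tree-wide effective budget.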

Page 35: A Technical Introduction to RTBkit

Banker Parent-Child Currency Operations

Name | Action | Description
child.setBalance (set balance lower) | increase child.recycledOut and decrease parent.recycledIn | Set child account balance lower
child.setBalance (set balance higher) | increase child.recycledIn and decrease parent.recycledOut | Set child account balance higher
child.recuperateTo | increase child.recycledOut and parent.recycledIn until child.balance == 0 |

Page 36: A Technical Introduction to RTBkit

Banker Debit and Credit Data Flow

Page 37: A Technical Introduction to RTBkit

Banker APIs

● REST API suitable for human reader and outside tool integration
● Also used by Router, Post Auction Service and Bidding Agents
● API presents a simple wrapper over the Account Type, Account Hierarchy and Currency Pool concepts
  ○ All Accounts in tree or subtree
  ○ Accounts in (sub)tree by name
  ○ Shadow Accounts in (sub)tree by name
  ○ Account children by name of parent
  ○ Account balance of (sub)tree by name
  ○ Account budget of tree by name

Page 38: A Technical Introduction to RTBkit

Banker Persistence

● Banker state stored in Redis
● Banker dumps its state each second
● Read-Modify-Write so only delta transmitted
● Can detect out of date or corrupt data
  ○ If a value goes down
  ○ If sum(credit) - sum(debit) != available
● On a banker crash and restart, it reads and reconciles state from shadow accounts and persistent store
● Maximum of one second of data lost
● If routers/post auction loops (with shadow accounts) stay up, no data lost

{
    "md": {"objectType": "Account", "version": 1},
    "type": account type ("budget" or "spent"),
    "budgetIncreases": amount (in USD/1M),
    "budgetDecreases": amount (in USD/1M),
    "spent": amount (in USD/1M),
    "recycledIn": amount (in USD/1M),
    "recycledOut": amount (in USD/1M),
    "allocatedIn": amount (in USD/1M),
    "allocatedOut": amount (in USD/1M),
    "commitmentsMade": amount (in USD/1M),
    "commitmentsRetired": amount (in USD/1M),
    "adjustmentsIn": amount (in USD/1M),
    "adjustmentsOut": amount (in USD/1M),
    "lineItems": additional keyed amounts,
    "adjustmentLineItems": additional keyed amounts
}
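
Because every pool only ever increases, the "value goes down" check from this slide is cheap to express. The following is a sketch under assumptions: PoolSnapshot and isMonotonicUpdate are hypothetical names, and a snapshot is modelled as a plain map from pool name to amount rather than the banker's real persisted record.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical snapshot of one account: pool name -> amount in USD/1M.
using PoolSnapshot = std::map<std::string, std::int64_t>;

// Returns true if `next` could legitimately follow `prev`: every pool
// in `prev` must still exist in `next` with a value that has not gone
// down. A false result signals stale or corrupt data.
inline bool isMonotonicUpdate(const PoolSnapshot& prev,
                              const PoolSnapshot& next) {
    for (const auto& entry : prev) {
        auto it = next.find(entry.first);
        if (it == next.end() || it->second < entry.second) return false;
    }
    return true;
}
```

The same property makes writes idempotent in spirit: replaying an older snapshot is detectable and can simply be rejected.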

Page 39: A Technical Introduction to RTBkit

Logger

● Logging occurs in a separate process that each component uses

● Automatically handles compression and log rotation

● Pub/sub model using the RTBkit service discovery mechanism (Zookeeper)

● Supports targeting multiple outputs (file system, S3) and routing messages to one or more outputs
● Supports combining multiple messages
● Supports callbacks
● Can be extended as needed

Page 40: A Technical Introduction to RTBkit

Monitoring and Operations Tools

● Extensive code instrumentation that logs to Carbon
● Lock-free, high-performance carbon logging library, with tunable sampling rate, one-second granularity and various useful functions
  ○ labelled occurrence
  ○ counters
  ○ levels (min, max, mean)
  ○ values (min, max, mean)
● Can use library to add any custom metrics you desire
● Operational dashboard
● All standard and custom metrics charted in graphite
● Launcher and real-time tmux shell
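
To make the "levels (min, max, mean)" idea concrete, the sketch below aggregates the samples collected in one second and formats them in Carbon's plaintext line format, "<metric path> <value> <timestamp>" followed by a newline. The function name formatLevelStats and the metric path used in the usage note are hypothetical; this is not the RTBkit logging library, which the slide describes as lock-free and sampled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: summarize one second of samples as three Carbon
// plaintext lines (min, max, mean). Returns an empty string if there
// were no samples in the interval.
inline std::string formatLevelStats(const std::string& path,
                                    const std::vector<double>& samples,
                                    std::int64_t epochSeconds) {
    if (samples.empty()) return std::string();
    const double lo = *std::min_element(samples.begin(), samples.end());
    const double hi = *std::max_element(samples.begin(), samples.end());
    const double mean =
        std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
    std::ostringstream out;
    out << path << ".min " << lo << ' ' << epochSeconds << '\n'
        << path << ".max " << hi << ' ' << epochSeconds << '\n'
        << path << ".mean " << mean << ' ' << epochSeconds << '\n';
    return out.str();
}
```

Calling formatLevelStats("rtb.latency", samples, now) once per second and writing the result to the Carbon socket would be enough for the values to show up in Graphite.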

Page 41: A Technical Introduction to RTBkit

Monitoring and Operations Tools

● All standard and custom metrics charted in graphite

Page 42: A Technical Introduction to RTBkit

Monitoring and Operations Tools

● Launcher and real-time tmux shell

Page 43: A Technical Introduction to RTBkit

How RTBkit Addresses the Selection Problem

Page 44: A Technical Introduction to RTBkit

Filter Design and Features

● Router passes bid requests through Filtering pipeline
● Bids must pass all filters to reach Agents
● Thread safe
● Useful primitives driven by configuration and available as building blocks for custom filters
● Predefined Agent and Creative filters
● Designed to guarantee performance first, be flexible and powerful second
● Regex support
  ○ Example: Location filter supports regexes at Agent and Creative level to support dynamic filter by geo

Page 45: A Technical Introduction to RTBkit

Generic Filter Primitives

● Building blocks for included Predefined Filters and for user Custom Filters
● Encapsulate generic comparison logic
● IncludeExcludeFilter
  ○ True If any included And none excluded
● ListFilter
  ○ True If any match in List
● RegexFilter
  ○ True If any match regex
● IntervalFilter
  ○ True If any within interval
● DomainFilter
  ○ True If bid.domain in DomainList
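
The include/exclude rule above ("true if any included and none excluded") can be sketched in a few lines. This is an illustration rather than RTBkit's IncludeExcludeFilter template: the struct name IncludeExclude is hypothetical, and treating an empty include list as "match everything" is an assumption made here so that exclude-only configurations work.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical include/exclude primitive over string values.
struct IncludeExclude {
    std::set<std::string> include;
    std::set<std::string> exclude;

    // Passes if no value is excluded and at least one value is included.
    // Assumption: an empty include list accepts any value.
    bool passes(const std::vector<std::string>& values) const {
        bool anyIncluded = include.empty();
        for (const std::string& v : values) {
            if (exclude.count(v) != 0) return false;
            if (include.count(v) != 0) anyIncluded = true;
        }
        return anyIncluded;
    }
};
```

The other primitives on the slide follow the same shape: a small predicate over request values that higher-level filters compose.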

Page 46: A Technical Introduction to RTBkit

Filter Levels

● Agent Filters
  ○ Control whether Agent bids on bid request
● Creative Filters
  ○ Control whether Creative eligible to be the one returned in bid response

[Diagram: Creative filters: Format, Location, Exchange, Language. Agent filters: Exchange, Location, Language, Host, URL, Segments, Hour of Week, Fold Position, User Partition]

Page 47: A Technical Introduction to RTBkit

Filter Types

● Static Predefined Filters
  ○ Creative filters match bid and Agent creative sizes
  ○ Config filters match bid request attributes to filter attributes
● Static Segment Filters
  ○ Filter based on attributes set by Exchange Connector bid request parse
● Static Custom Filters
  ○ Creative or config filters
  ○ Simple wrapper class API
● Dynamic Predefined Filters
  ○ Based on system state
  ○ notEnoughTime, tooManyInFlight
● Augmenter Filters
  ○ Custom logic and data

[Diagram: Bid Request passing through Static (Predefined, Segment, Custom), Dynamic (Predefined) and Augmenter filter stages]

Page 48: A Technical Introduction to RTBkit

Filter Priorities and Performance

● Prioritized execution order optimized for performance, not business logic
● Selective, inexpensive filters run earlier
● Expensive filters run later (or not at all!)
● Only fast filters run on the Exchange Connector thread, which must guarantee a response within response SLA time
● Static filters build bitfield lookup table from configs, batch process filters per bid request in 64-bit blocks
● Filter matching tests for a match and retrieves eligible creatives in one pass

[Diagram: Bid Requests passing through Creative filters (Format, Location, Exchange, Language) and Agent filters (Exchange, Location, Language, Host, URL, Segments, Hour of Week, Fold Position, User Partition)]
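
The bitfield batching described above can be sketched with one bit per agent config, so that a single 64-bit AND evaluates a filter for up to 64 configs at once. The names below (ConfigBits, StaticFilterTables) and the choice of two example dimensions are assumptions for illustration; RTBkit's ConfigSet is more general than a single 64-bit word.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One bit per agent config; one 64-bit block covers up to 64 configs.
using ConfigBits = std::uint64_t;

constexpr std::size_t kHoursPerWeek = 24 * 7;
constexpr std::size_t kExchanges = 4;  // hypothetical number of exchanges

// Hypothetical lookup tables built once from agent configs.
struct StaticFilterTables {
    std::array<ConfigBits, kHoursPerWeek> byHourOfWeek{};
    std::array<ConfigBits, kExchanges> byExchange{};

    // Mark that config `configIndex` accepts the given hour of week.
    void allowHour(unsigned configIndex, std::size_t hour) {
        assert(configIndex < 64 && hour < kHoursPerWeek);
        byHourOfWeek[hour] |= ConfigBits{1} << configIndex;
    }

    // Mark that config `configIndex` accepts the given exchange.
    void allowExchange(unsigned configIndex, std::size_t exchange) {
        assert(configIndex < 64 && exchange < kExchanges);
        byExchange[exchange] |= ConfigBits{1} << configIndex;
    }

    // Per bid request: each filter is one table lookup and one AND over
    // the whole block, regardless of how many configs are registered.
    ConfigBits eligible(std::size_t hour, std::size_t exchange) const {
        ConfigBits all = ~ConfigBits{0};
        return all & byHourOfWeek[hour] & byExchange[exchange];
    }
};
```

The cost per request depends on the number of filters and blocks, not on the number of agents, which is what lets cheap static filters run on the Exchange Connector thread.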

Page 49: A Technical Introduction to RTBkit

Custom Filter Development

● (Creative)IterativeFilter<MyFilter>
  ○ Simple wrapper interface
  ○ Set priority, return bool per request
  ○ Less scale, no batch processing
● (Creative)FilterBaseT<MyFilter>
  ○ ConfigSet of filter configs
  ○ CreativeMatrix maps each creative to its filters
  ○ FilterState stores state of processing filters for current bid request
  ○ Filter batch process by intersecting ConfigSet and CreativeMatrix
  ○ Filter code uses ConfigSet bit operator-style interface and also sometimes raw bit operators

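#include <array>

// Requires the RTBkit filter headers that declare FilterBaseT, FilterState,
// ConfigSet, AgentConfig and Priority.
struct HourOfWeekFilter : public FilterBaseT<HourOfWeekFilter> {
    HourOfWeekFilter() { data.fill(ConfigSet()); }

    static constexpr const char* name = "HourOfWeek";

    unsigned priority() const { return Priority::HourOfWeek; }

    // For every hour of the week this agent config accepts, set or clear
    // the config's bit in that hour's ConfigSet.
    void setConfig(unsigned configIndex, const AgentConfig& config, bool value) {
        const auto& bitmap = config.hourOfWeekFilter.hourBitmap;
        for (size_t i = 0; i < bitmap.size(); ++i) {
            if (!bitmap[i]) continue;
            data[i].set(configIndex, value);
        }
    }

    // Keep only the configs that accept the request's hour of the week.
    void filter(FilterState& state) const {
        state.narrowConfigs(data[state.request.timestamp.hourOfWeek()]);
    }

private:
    // One ConfigSet per hour of the week (24 * 7 = 168 slots).
    std::array<ConfigSet, 24 * 7> data;
};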

Page 50: A Technical Introduction to RTBkit

Augmenters: Your Logic and Data

● Bid Requests pass through Augmenter after Filtering, before Bidding

● Allows for custom filtering based on combinations of bid request fields, your data and business logic you code

● Filter based on user agent, device, geo, user data, etc.

Page 51: A Technical Introduction to RTBkit

Augmenter Implementation

● Provides thread pool of background threads to run augmenter calls
● Enforces 5ms timeout on router thread
● Sync and async versions. Use async with callback for calls to outside DBs.
● RTBkit ships with Redis Augmenter. Other stores such as Aerospike are in the wild.
● Separate config for each Bidding Agent
● Augmenter data is arbitrary JSON
● Can subscribe to other RTBkit data streams to write data
  ○ e.g. - frequency cap Augmenter subscribes to PAS MATCHEDWINs

[Diagram: Router sends Bid Request to Augmenter (Thread Pool, Augmenter Impl.) backed by a Fast DB; Post Auction Service feeds a Data Sink Callback; Augmenter returns Augmented Bid Request]
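
The frequency cap example can be sketched as follows. This is not the RTBkit Augmenter API: the class FrequencyCapAugmenter, its methods and the in-memory counter are assumptions standing in for a real augmenter backed by a fast store. The "too-many" tag and the maxPerDay setting mirror the sample agent configuration shown in this deck.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory frequency cap augmenter.
class FrequencyCapAugmenter {
public:
    explicit FrequencyCapAugmenter(int maxPerDay) : maxPerDay_(maxPerDay) {}

    // Data sink side: called for each matched win received from the
    // Post Auction Service stream.
    void onMatchedWin(const std::string& userId) { ++winsToday_[userId]; }

    // Request side: returns the tags to attach to the bid request for
    // this user. An agent filter can then exclude "too-many".
    std::vector<std::string> augment(const std::string& userId) const {
        auto it = winsToday_.find(userId);
        const int wins = (it == winsToday_.end()) ? 0 : it->second;
        std::vector<std::string> tags;
        if (wins >= maxPerDay_) tags.push_back("too-many");
        return tags;
    }

    // Clear counters at the start of a new day.
    void resetDay() { winsToday_.clear(); }

private:
    int maxPerDay_;
    std::unordered_map<std::string, int> winsToday_;
};
```

In a real deployment the lookup would have to complete well inside the 5ms budget enforced on the router thread, which is why the slide recommends the async version for calls to outside databases.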

Page 52: A Technical Introduction to RTBkit

How RTBkit Addresses the Value Problem

Page 53: A Technical Introduction to RTBkit

Bidding Agent Configuration

● Bidding Agents configure the Core
● Agents register Agent Config with the Agent Configuration Service
● Router, PAS and Augmenter periodically pull updated Agent configs from the ACS
● Router registers
  ○ creatives per campaign
  ○ dynamic filters
  ○ augmenters and augmenter filters
● Router passes ad markup from config to Exchange Connector for bid response
● Router forwards bid requests passing filtering to eligible Agents
● PAS forwards Matched Events to Agents
● Augmenter adds augmented fields to Bid Request based on Agent Configuration

Page 54: A Technical Introduction to RTBkit

Bidding Agent Configuration (cont'd)

● account -- which Accounts in an Account tree the Agent bids for
  ○ Implement different bidding strategies within an account or "account group" by mapping Agents to named accounts in an account tree
● maxInFlight -- outstanding bids
● bidProbability per Agent can be used for pacing and bidding strategy
● creatives, languageFilter and segmentFilter are supported Static Filters
● augmentations here configures an Augmenter filter
● providerConfig -- ad markup, not shown

{
    "account": ["parent", "child"],
    "bidProbability": 0.1,
    "creatives": [
        {
            "id": 1, "width": 300, "height": 250,
            "providerConfig": {
                "supplySourceX": {
                    "markup": "markup goes here",
                    "attributes": ["alcohol"]
                }
            }
        },
        ...
    ],
    "languageFilter": {"include": ["en"], "exclude": []},
    "segmentFilter": {
        "sample1": {"include": [], "exclude": ["bad"]},
        "colors": {"include": ["blue", "red"]}
    },
    "augmentations": {
        "freq-rec": {
            "required": true,
            "config": {"maxPerDay": 10},
            "filter": {"include": [], "exclude": ["too-many"]}
        }
    },
    "maxInFlight": 10
}
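
As a rough illustration of how a bidProbability such as 0.1 can act as a pacing knob, the sketch below lets an agent take part in a random fraction of eligible requests. The class BidSampler is hypothetical, and the slides do not say where in the system this setting is actually applied, so treat this as one possible reading.

```cpp
#include <cassert>
#include <random>

// Hypothetical sampler: an agent with bidProbability p takes part in
// roughly a fraction p of the bid requests it is eligible for.
class BidSampler {
public:
    BidSampler(double bidProbability, unsigned seed)
        : probability_(bidProbability), rng_(seed), uniform_(0.0, 1.0) {
        assert(bidProbability >= 0.0 && bidProbability <= 1.0);
    }

    // Draws one uniform sample in [0, 1) and compares it to p.
    bool shouldBid() { return uniform_(rng_) < probability_; }

private:
    double probability_;
    std::mt19937 rng_;
    std::uniform_real_distribution<double> uniform_;
};
```

Lowering the probability reduces spend rate without touching bid prices, which is why it is useful for both pacing and strategy experiments.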

Page 55: A Technical Introduction to RTBkit

Custom Bidding Agent Implementation

● C++ or JavaScript
● Programmatic configuration
● Custom bidding logic based on
  ○ bid attributes
  ○ a custom Win Cost Model to adjust for desired margin and data costs
● Currency support allows bidding at different price granularities
● Pacing support
  ○ Custom pacing logic
  ○ Guaranteed communication between Bidding Agent and Banker
● Bid callback called on every bid request
● Router sends back bid status messages
● Post Auction Service sends back event status messages

[Diagram: Custom Bidding Agent (Configuration, Bidding Logic, Win Cost Model, Pacing, Currency Helpers)
1. Router sends Bid Request
2. Agent returns Bid Response
3. Router callbacks: onWin, onLoss, onNoBudget, onTooLate, onDroppedBid, onInvalidBid
4. Post Auction Service callbacks: onImpression, onClick, onVisit]
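
One way to read "adjust for desired margin and data costs" is as a cap on the bid price derived from the estimated value of an impression. The sketch below takes that reading; the struct WinCostModel here, its fields and the formula are assumptions for illustration and are not RTBkit's win cost model interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bid cap: reserve a margin on the estimated impression
// value, then subtract the per-impression data cost. All amounts are
// in micro-dollars.
struct WinCostModel {
    double marginFraction;        // e.g. 0.25 keeps 25% of value as margin
    std::int64_t dataCostMicros;  // cost of data used per impression

    // Highest bid that still leaves the desired margin after data costs.
    // Never negative: a result of 0 means "do not bid".
    std::int64_t maxBidMicros(std::int64_t valueMicros) const {
        const std::int64_t afterMargin =
            static_cast<std::int64_t>(valueMicros * (1.0 - marginFraction));
        const std::int64_t bid = afterMargin - dataCostMicros;
        return bid > 0 ? bid : 0;
    }
};
```

With a 25% margin and 100 micro-dollars of data cost, an impression valued at 2000 micro-dollars supports a bid of at most 1400 micro-dollars.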

Page 56: A Technical Introduction to RTBkit

Augmenters: Your Logic and Data

● Allows you to augment the bid request, adding any fields you want, based on combinations of bid request fields, your data and business logic you code
● Supports custom logic per agent
● Augmented fields are then available to the Bidding Agents
● So, you can influence your bidding logic by adding to the bid request

Page 57: A Technical Introduction to RTBkit

Future Directions

Near Term
● Scalability Improvements
  ○ PAS
  ○ Number of Agents
● Improved Packaging
● Decoupled Bidding Agent API

This Year
● Performance benchmarking tools
● Protocol versioning of messages
● Open plugin platform supporting 3p marketplace

Page 59: A Technical Introduction to RTBkit

About Us

Mark Weiss
● Head of Customer Solutions at Datacratic
● @marksweiss

Datacratic
● Machine-learning software provider
● Platform supports real-time decisioning
● Current products target digital marketing
  ○ Hosted RTB Optimization
  ○ Self-Serve and DMP Lookalike Modeling
● www.datacratic.com
● @datacratic
● We're hiring! http://datacratic.com/site/careers