1
Just-in-Time Contextual Advertising
Aris Anagnostopoulos, Andrei Z. Broder, Evgeniy Gabrilovich, Vanja Josifovski, Lance Riedel, CIKM’07.
Advisor: Chia-Hui Chang
Presenter: Teng-Kai Fan
Date: 2008-08-20
2
Outline
- Introduction
- Web Advertising Basic
- Methodology
- Empirical Evaluation
- Conclusion
3
Introduction
- Internet advertising spending was estimated at over 17 billion dollars in 2006.
- Two main types of textual Web advertising:
  - Sponsored search, which serves ads in response to search queries.
  - Content match, which places ads on third-party pages.
4
Introduction cont.
- Web advertising for two types of Web page:
  - Static page (offline): the matching of ads can be based on prior analysis of the entire page content.
  - Dynamic page (online): ads need to be matched to the page while it is being served to the end-user, which limits the time available for content analysis.
5
Introduction cont.
- In this paper, the challenge is to find relevant ads while maintaining low latency and communication costs:
  - Use text summarization techniques to extract short excerpts that are representative of the entire page content.
  - Use classification to classify the page summaries with respect to a large taxonomy of advertising categories.
- Page-ad matching is based on both bag-of-words and classification features.
6
Contextual Advertising Basic
Four interacting entities:
- The publisher is the owner of the Web pages on which advertising is displayed.
- The advertiser provides the supply of ads.
- The ad network is a mediator between the advertiser and the publisher; it selects the ads that are put on the pages.
- End-users visit the Web pages of the publisher and interact with the ads.
7
Overview of Ad display
[Figure: overview of ad display. Entities: Publisher, Advertiser, End-User, Ad Agency System. Artifacts: Web Page, Web Page + Ads. Actions: register, match, browse.]
8
Advertising Basic cont.
Four pricing models:
- CPM (Cost Per Mille, i.e. per thousand impressions): advertisers pay for exposure of their message to a specific audience.
- CPV (Cost Per Visitor): advertisers pay for the delivery of a targeted visitor to the advertiser's website.
- CPC (Cost Per Click), also known as PPC (Pay Per Click): advertisers pay every time a user clicks on their listing and is redirected to their website. They pay not for the listing itself but only when it is clicked.
- CPA (Cost Per Action): advertisers pay each time an order is transacted.
9
Overview of the Proposed Solution
- Use text summarization techniques paired with external knowledge to craft short page summaries in real time.
- Balance two conflicting goals: analyzing as much page content as possible for a better ad match vs. analyzing as little as possible to save transmission and analysis time.
- External knowledge:
  - URLs often contain meaningful words.
  - The referrer URL might contain relevant words that to some extent capture the user intent.
  - Page classification.
10
Text Summarization
- Text summarization techniques are divided into extractive and non-extractive approaches.
- The following components are considered in constructing summaries:
  - Title (T)
  - Meta keywords and description (M)
  - Headings (H): the contents of <h1> and <h2> HTML tags
  - Tokenized URL of the page (U)
  - Tokenized referrer URL (R)
  - First N bytes of the page text (P<N>)
  - Anchor text of all outgoing links on the page (A)
  - Full text of the page (F)
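The fragment-based summary described above can be sketched as follows. This is only an illustration: the helper names `tokenize_url` and `build_summary`, the small stop list, and the example page are assumptions and do not come from the paper.

```python
import re

def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens, dropping common URL noise."""
    tokens = re.split(r"[^A-Za-z0-9]+", url.lower())
    # Hypothetical stop list for illustration only.
    stop = {"http", "https", "www", "com", "html", "htm", ""}
    return [t for t in tokens if t not in stop]

def build_summary(fragments, parts="UTMH"):
    """Concatenate the selected fragments into one bag of words.

    Keys follow the letter codes of the slide: T, M, H, U, R, A, F.
    """
    words = []
    for key in parts:
        value = fragments.get(key, "")
        if key in ("U", "R"):
            words.extend(tokenize_url(value))   # URLs need special tokenization
        else:
            words.extend(value.lower().split())
    return words

# Hypothetical page fragments.
page = {
    "T": "Cheap Flights to Rome",
    "M": "compare airline fares",
    "H": "Rome travel deals",
    "U": "http://www.example.com/travel/rome-flights.html",
}
url_tokens = tokenize_url(page["U"])
summary = build_summary(page, parts="UTMH")
```

Choosing `parts` corresponds to choosing which page fragments are sent for analysis, which is the trade-off between match quality and latency that the deck discusses.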
11
Text Classification
- Using a summary of the page in place of its entire content can ostensibly eliminate some information.
- To alleviate the harmful effects of summarization, they study the effects of using text classification.
- They classify both page excerpts and ads with respect to a taxonomy and use classification-based features to augment the original bag of words.
12
Choice of Taxonomy
- Taxonomy: they employ a large taxonomy of approximately 6,000 nodes, arranged in a hierarchy with median depth 5 and maximum depth 9.
- Human editors populated the taxonomy with labeled ad bid phrases (approx. 150 phrases per node).
13
Classification Method
For each taxonomy node, they concatenated all the phrases associated with this node into a single meta-document.
Then, they computed a centroid for each node by summing up the TFIDF values of individual terms, and normalizing by the number of phrases in the class:
$$\vec{c}_j = \frac{1}{|C_j|} \sum_{\vec{p} \in C_j} \frac{\vec{p}}{\|\vec{p}\|}$$
where $\vec{c}_j$ is the centroid for class $C_j$ and $\vec{p}$ iterates over the phrases in the class.
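A minimal sketch of the centroid computation, assuming each phrase is already available as a sparse TFIDF vector (a dict from term to weight). Length-normalizing each phrase before summing is an assumption of this sketch, and the function name `centroid` is hypothetical.

```python
import math
from collections import defaultdict

def centroid(phrase_vectors):
    """Average the length-normalized TFIDF vectors of all phrases in one class."""
    total = defaultdict(float)
    for vec in phrase_vectors:
        norm = math.sqrt(sum(w * w for w in vec.values()))
        if norm == 0:
            continue  # skip empty phrases
        for term, w in vec.items():
            total[term] += w / norm
    n = len(phrase_vectors)  # normalize by the number of phrases in the class
    return {term: w / n for term, w in total.items()}

# Hypothetical TFIDF vectors for two bid phrases of one taxonomy node.
phrases = [{"flight": 3.0, "rome": 4.0}, {"flight": 1.0}]
c = centroid(phrases)
```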
14
Classification Method cont.
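The classification is based on the cosine of the angle between the document and the centroid meta-document:
$$C_{\max} = \arg\max_{C_j} \frac{\vec{c}_j \cdot \vec{d}}{\|\vec{c}_j\|\,\|\vec{d}\|} = \arg\max_{C_j} \frac{\sum_{i \in F} c_j^i d^i}{\sqrt{\sum_{i \in F} (c_j^i)^2}\,\sqrt{\sum_{i \in F} (d^i)^2}}$$
where $F$ is the set of features, and $c_j^i$ and $d^i$ are the weights of the $i$th feature in the class centroid and in the document, respectively.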
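The nearest-centroid decision can be sketched as below, assuming sparse dict vectors for both the document and the class centroids. The class names and weights are made up for illustration, and `classify` is a hypothetical helper that returns a single best class.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts of feature -> weight)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(doc, centroids):
    """Return the class whose centroid has the highest cosine with the document."""
    return max(centroids, key=lambda name: cosine(centroids[name], doc))

# Hypothetical centroids for two taxonomy nodes.
centroids = {
    "travel": {"flight": 0.8, "rome": 0.4},
    "autos": {"car": 1.0},
}
doc = {"rome": 1.0, "flight": 1.0}
best = classify(doc, centroids)
```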
15
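Using Classification Features & Ad Retrieval Function
- Each page and ad was represented as a bag of words (BOW) and as an additional vector of classification features.
- The ad retrieval function was formulated as a linear combination of similarity scores based on both BOW and classification features:
$$\mathrm{score}(p, a) = \alpha \cdot \mathrm{sim}_{\mathrm{BOW}}(p, a) + \beta \cdot \mathrm{sim}_{\mathrm{class}}(p, a)$$
where $p$ is the page, $a$ is the ad, and $\alpha$ and $\beta$ are the combination weights.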
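A sketch of such a linear combination follows. It assumes both similarities are cosines over sparse dict vectors, and the weights `alpha` and `beta`, their default values, and the `bow`/`cls` field names are illustrative assumptions rather than values from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts of feature -> weight)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def score(page, ad, alpha=1.0, beta=1.0):
    """Linear combination of BOW similarity and classification-feature similarity."""
    return (alpha * cosine(page["bow"], ad["bow"])
            + beta * cosine(page["cls"], ad["cls"]))

def rank_ads(page, ads, alpha=1.0, beta=1.0):
    """Sort candidate ads by decreasing combined score."""
    return sorted(ads, key=lambda ad: score(page, ad, alpha, beta), reverse=True)

# Hypothetical page summary and candidate ads.
page = {"bow": {"rome": 1.0, "flight": 1.0}, "cls": {"travel": 1.0}}
ads = [
    {"id": "a1", "bow": {"car": 1.0}, "cls": {"autos": 1.0}},
    {"id": "a2", "bow": {"flight": 1.0}, "cls": {"travel": 1.0}},
]
ranked = rank_ads(page, ads)
```

Setting `beta=0` reduces the ranking to pure bag-of-words matching, which makes it easy to compare the two feature sets.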
16
Dataset
From 12,000 human judgments (page-ad pairs):
- Dataset 1 consists of 105 Web pages that are accessible through a major search engine.
  - 2680 ads and 2946 page-ad scores (some ads have been scored for more than one page).
  - The classification precision was 70% for the pages and 86% for the ads.
- Dataset 2 consists of 827 pages from publishers that are not found in the search engine index.
  - 5056 unique ads.
17
Evaluation Metrics
- Precision
- MAP (Mean Average Precision)
- bpref-10 (Buckley et al., SIGIR’04)
  - Its idea is to measure the effectiveness of a system on the basis of judged documents only.
  - The scores for MAP and P@N are completely determined by the ranks of the relevant documents in the result set, so in pooled collections these measures make no distinction between documents that are explicitly judged nonrelevant and documents that are assumed nonrelevant because they are unjudged.
18
bpref-10
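- The preference measure is a function of the number of times judged nonrelevant documents are retrieved before relevant documents.
- Formulation:
  - Naïve: simple counts of the number of judged nonrelevant documents retrieved before some relevant document are poor, because the score depends on the absolute numbers of relevant and judged nonrelevant documents.
  - For a topic with R relevant documents, where r is a relevant document and n is a member of the first R judged nonrelevant documents:
  - bpref:
$$\mathrm{bpref} = \frac{1}{R} \sum_{r} \left(1 - \frac{|n \text{ ranked higher than } r|}{R}\right)$$
  - bpref-10 (here n is a member of the first 10 + R judged nonrelevant documents):
$$\mathrm{bpref\text{-}10} = \frac{1}{R} \sum_{r} \left(1 - \frac{|n \text{ ranked higher than } r|}{10 + R}\right)$$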
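The two measures can be sketched in one function, following the standard bpref definition. The assumptions here are that unjudged documents are simply ignored, that relevant documents missing from the ranking contribute zero, and that the nonrelevant count is capped at the denominator; the `extra` parameter (0 for bpref, 10 for bpref-10) is a hypothetical convenience.

```python
def bpref(ranking, relevant, nonrelevant, extra=0):
    """bpref over judged documents only; extra=10 gives bpref-10."""
    R = len(relevant)
    if R == 0:
        return 0.0
    limit = R + extra          # only the first R (+ extra) judged nonrelevant docs count
    seen_nonrelevant = 0
    total = 0.0
    for doc in ranking:
        if doc in relevant:
            total += 1.0 - min(seen_nonrelevant, limit) / limit
        elif doc in nonrelevant:
            seen_nonrelevant += 1
        # Unjudged documents are skipped entirely.
    return total / R

# Hypothetical ranking: n* judged nonrelevant, r* relevant, u* unjudged.
ranking = ["n1", "r1", "u1", "r2", "n2"]
relevant = {"r1", "r2"}
nonrelevant = {"n1", "n2"}
plain = bpref(ranking, relevant, nonrelevant)
smoothed = bpref(ranking, relevant, nonrelevant, extra=10)
```

With few relevant documents, the larger denominator of bpref-10 makes a single nonrelevant document ranked early much less costly than in plain bpref.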
19
The effect of Focused Page Analysis
FullText(F), AnchorText(A), First 500 bytes(P500), MetaData(M), Headings(H), Title(T), PageURL(U), ReferrerURL(R)
20
The contribution of individual fragments
FullText(F), AnchorText(A), First 500 bytes(P500), MetaData(M), Headings(H), Title(T), PageURL(U), ReferrerURL(R)
21
Precision-Recall tradeoff
22
Incremental Addition of Information
23
The Effect of Classification
24
Conclusion
- They presented a new methodology for contextual Web advertising in real time.
- They focused on the contributions of the different fragments of the pages.