


December 2010 Public Document

MyMedia ICT-2008-215006 Page | 1

ICT MyMedia Project

2008-215006

Deliverable 5.3

Future IPTV Services Field Trial Report

Public Document


Contents

1 Executive Summary
2 Introduction
3 Glossary
4 Target Environment
5 Motivation for the Field Trial
6 Research Strategy and Design
   6.1 Research Questions
   6.2 User Sampling
   6.3 Data Collection Methods
   6.4 Data Analysis Techniques
7 Field Trial Development
   7.1 Content Catalogue
   7.2 User Feedback
   7.3 User Interface
   7.4 System Architecture
   7.5 Recommender Algorithms
   7.6 Filters
   7.7 Metadata Enrichment
8 Field Trial Execution
   8.1 Evaluation System
   8.2 Field Trial Execution
   8.3 Field Trial Results
9 Conclusions
10 References
11 Acknowledgements


Project acronym: MyMedia

Project full title: Dynamic Personalization of Multimedia

Work Package: 5

Document title: Deliverable D5.3 Future IPTV Services Field Trial Report

Version: 1.1

Official delivery date: 31 December 2010

Actual publication date: 10 December 2010

Type of document: Report

Nature: Public

Authors: Paul Marrow, BT

Tim Stevens, BT

Ian Kegel, BT

Joshan Meenowa, BT

Craig McCahill, BT

Approved by:

Hakan Soncu, EMIC

Lydia Meesters, TU/e


Version | Date             | Sections Affected
0.1     | 6 December 2010  | All: initial draft for review.
1.0     | 10 December 2010 | All: revision following review.
1.1     | 18 March 2011    | Minor corrections following review report.


1 Executive Summary

BT Vision is BT's TV On Demand service in the UK. It offers consumers video content for rental (Video-on-Demand, VoD), which is delivered to their TV via BT's broadband network and a Set-Top Box (STB). Prior to the start of the MyMedia project, BT had researched the use of recommender systems with BT Vision to stimulate customer uptake, and thus increase Video-on-Demand revenue. Positive results from previous trials led to BT's involvement in the MyMedia collaborative project.

The BT field trial within the MyMedia project tested the MyMedia recommender system on BT Vision. It is thus distinct from the other field trials carried out in the MyMedia project in delivering recommendations to the TV rather than to web browsers on a PC. Like one of the other sets of trials, executed by Microgenesis in Spain, the BT field trial was carried out on a commercial service, but on a much larger scale than Microgenesis' study, involving up to 100,000 customers.

This deliverable reports on the design of the field trial, which compares two groups of customers: 50,000 receiving MyMedia recommendations and 50,000 receiving recommendations provided by the editorial (marketing) team of BT Vision. BT's development and implementation of the MyMedia system required extension of the MyMedia recommendation framework in order to integrate it into the live BT Vision service, and a number of changes were needed before the field trial could begin in August 2010.

Data about user purchasing behaviour was recorded automatically by the BT Vision management

system, for billing and royalties payment purposes, and it involved no recording of personal details since

triallists were distinguished by STB ID only. Based on this information it was possible to compare the

VoD purchasing activity in the two groups over two months of the trial period.

Overall summaries of the two groups' activity did not show much difference: the mean number of items purchased was similar, and the distribution of purchasing activity was strongly positively skewed, with many triallists purchasing no items in a month and a few purchasing very many. In fact it was discovered that only a small number of triallists had viewed the BT Vision recommendation page during the trial (<1% in each group). The visibility of the recommendation page was beyond our control.

However, of those triallists who did view the recommendation page, irrespective of trial group, and did click through to particular items, about 30% followed up with purchasing and viewing activity. The MyMedia recommender algorithm, drawing upon only very sparse implicit feedback, could therefore compete effectively with the professionals in BT Vision, who had a much wider view of the market and of customer response upon which to base their recommendations.

Further analysis looked at the potential for the MyMedia recommender algorithm to predict user

preferences even when triallists had not seen those recommendations, in comparison to the editorial

group. A hit-rate based evaluation metric was used. The results suggested either that there was little


difference between the two types of recommendation or that in some cases editorial recommendations

performed better. This suggested that editorial recommendations do have some advantage in a

situation where most users do not see the output of a recommender algorithm, because the

recommendations suggested by the editorial team are more likely to integrate well with other forms of

marketing.

This trial demonstrated that it was possible to extend a general-purpose recommender framework to

support a very large scale trial on a commercial IPTV service. The trial did not show a clear benefit to the

service through increasing VoD sales by recommendation, but the reasons for this were made clear in

the discussion of the user interface: only implicit feedback was possible, and that only sparse. Despite

this, for those triallists who did view recommendations and did click through to view items, the MyMedia recommender compared well with the editorial recommendations.

Overall, the involvement of BT in this trial and the project demonstrated the importance of

recommendation in a commercial IPTV context, provided it is delivered and managed correctly in

relation to other forms of marketing. These insights were taken on board by the BT Vision team for

future services.


2 Introduction

The BT MyMedia field trial was distinct from the others in the MyMedia project in that it was targeted at customers of the BT Vision IPTV service, delivered by broadband and STB to TVs in the UK. It was also distinctive in taking place in the context of a large-scale commercial service, which had been running (subject to changes) for some time before the MyMedia project started, and will continue running after the project finishes.

Because the BT MyMedia field trial took place in a commercial context, its motivation was centred on

understanding the commercial benefits of recommender systems in stimulating purchasing activity of

VoD (Video-on-Demand) by BT Vision customers. To enable an efficient comparison a representative

group of BT Vision customers receiving recommendations from the MyMedia system was compared

with a group receiving recommendations from the editorial (marketing) team within BT Vision.

The main research question was whether recommendations generated by algorithms (in this case one

from the MyMedia recommender algorithm library) could stimulate greater purchasing activity of BT

Vision VoD than recommendations generated by the editorial team. Initially it was also proposed to look at the user experience of customers receiving the MyMedia recommendation service, but this turned out not to be possible: changes to the business requirements for the integration and deployment of the trial delayed the start of the trial, and a group of volunteers could not be maintained for the period required. Two later research questions arose after the trial had been planned: how did MyMedia recommendation of individual VoD items compare with subscription (sVoD) packages as a marketing incentive, and how well did MyMedia recommendations compare with editorial recommendations as predictors of user preferences even when viewers had not seen the recommender pages?

The trials were closed, with users selected by pseudo-random sampling from among the BT Vision

customer base. No personal identification of users was required, because they could be identified via

STB ID. Because BT does not produce its own digital content, but purchases rights to distribute it from

content providers, data about BT Vision customers' purchasing of VoD is automatically logged for billing

and royalties payment purposes. These logs could be used in collecting data for analysis. It was originally

planned to use a variety of statistical techniques to compare the groups in the field trials, but not all the

originally planned techniques were used, due to the nature of the data gathered and of the comparisons

that were feasible to make. In particular, an evaluation metric based on hit-rate was introduced late in

the field trial because it had previously proved useful with similar datasets in evaluating recommender

performance.

The BT MyMedia field trial was developed in the context of a live and changing IPTV service. Customers

can search among a large catalogue of content items and choose items of VoD to watch. The nature of

the BT Vision user interface meant that users could give no explicit feedback during the MyMedia field trial, but implicit feedback could be detected: because triallists had to pay for their choices, a purchase could be taken as positive implicit feedback. The user interface presented images representing five


recommended items. This did not represent the only user interface on BT Vision, which offers a variety

of different means of interacting with the service.

The BT MyMedia field trial required a system architecture that integrated the MyMedia software framework with the existing BT Vision system. This was done via a series of data inputs that provided information about the BT Vision content catalogue and changes to it (to identify what could be recommended), and initially also inputs identifying customers (by STB ID) to define where the recommendations would go.

The application of a recommender algorithm selected from the MyMedia recommender algorithm

library, or the choice of a number of items to recommend from the editorial team, enabled a set of

recommendations associated with triallists to be exported and thus distributed to the recommender

user interface on the triallist’s BT Vision service. Pre-filters were applied to the content to be

recommended to ensure that only appropriate content was recommended.
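A content pre-filter of this kind can be sketched as follows. This is a minimal illustration only; the field names `rating` and `available` are assumptions for the sketch, not the actual BT Vision catalogue schema:

```python
def prefilter_catalogue(items, allowed_ratings=("U", "PG", "12", "15")):
    """Keep only catalogue items that are currently available and whose
    age rating falls within an allowed set, so that only appropriate
    content reaches the recommender. Field names are illustrative."""
    return [item for item in items
            if item.get("available") and item.get("rating") in allowed_ratings]
```

Applying such a filter before recommendation generation ensures unsuitable or unavailable items never appear in the candidate set, regardless of which algorithm is used downstream.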

Prior to the BT MyMedia field trial a sample of BT data was supplied to Novay in order to evaluate

whether Novay’s keyword extraction (KWE) metadata enrichment modules could improve the

performance of recommender algorithms when applied in BT’s field trial. The conclusion of this

evaluation was that the type of metadata supplied was not appropriate for the keyword extraction

techniques developed, in particular the textual synopses were too short and not sufficiently diverse, and

so metadata enrichment was not used.

For the purposes of data analysis, a further data import and processing task was required: data had to

be imported from an Oracle server that recorded customers’ viewing behaviour, and then reformatted

to allow data analysis with the software tools available.
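The reformatting step might look like the following sketch, which collapses exported viewing rows into per-STB purchase counts suitable for group comparison. The row field names (`stb_id`, `cost_pence`) are assumptions for illustration, not the actual export schema:

```python
from collections import Counter

def purchases_per_stb(rows):
    """Turn exported viewing-log rows (dicts) into a count of paid
    purchases per STB ID; free (zero-cost) views are excluded.
    Field names are illustrative."""
    counts = Counter()
    for row in rows:
        if int(row.get("cost_pence", 0)) > 0:
            counts[row["stb_id"]] += 1
    return counts
```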

In the context of the MyMedia recommender system evaluation framework (chapter 3 of MyMedia D1.5

[1] and Knijnenburg et al. (2010) [2]) the BT MyMedia field trial focused mainly on value choices made

about recommended items, due to its commercial context, but also investigated objective features of recommender algorithms by evaluating the accuracy of recommendation via a hit-rate metric.
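A hit-rate metric of the kind referred to here can be sketched as follows. This is a minimal illustration of the general idea, not the project's exact implementation:

```python
def hit_rate_at_n(recommendations, purchases, n=5):
    """Fraction of active users for whom at least one of the top-n
    recommended items appears among the items they actually
    purchased or viewed."""
    hits, scored = 0, 0
    for user, recs in recommendations.items():
        bought = purchases.get(user, set())
        if not bought:
            continue  # score only users with observed activity
        scored += 1
        if any(item in bought for item in recs[:n]):
            hits += 1
    return hits / scored if scored else 0.0
```

With five recommendations shown per user, as in the BT trial, a hit would be counted whenever any one of the five was later purchased.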

Although the BT MyMedia field trial was originally planned to start in January 2010, its start was delayed

until August 2010 due to the need to respond to business requirements and changes. Despite this, due

to its very large scale, a large amount of data was gathered upon which analysis could be carried out and

results obtained.

This deliverable goes on to report on the results derived from the field trial, and carry out analysis on

them where feasible. Although overall analysis showed little difference between the behaviour of the

different trial groups, when the behaviour of those triallists who had viewed the recommendation page

was considered, about 30% in each group followed this up by purchasing and viewing an item. This was

a very important finding for the field trial and the MyMedia project. Using only sparse implicit feedback

the MyMedia recommender algorithm could compete with professional recommendations, suggesting

that it could perform much better under improved conditions, with more frequent and ideally explicit

feedback.


The BT MyMedia field trial showed that it was possible to deploy the MyMedia recommendation

framework on a live commercial IPTV service, different from the other field trials, and on a very large

scale, without either disrupting the existing service or intruding on customers’ privacy. Although it did

not show that the MyMedia recommendation algorithm used stimulated greater customer activity, this

was clearly because of the way in which the recommendations were accessible in the BT Vision service.

It did show that the MyMedia recommendation algorithm was comparable with professional

recommendations in customer uptake, and in the context of sparse implicit feedback this suggested that

the MyMedia system could perform better under alternative circumstances. Insights from the project

have been used to inform future versions of the service. The demonstrated potential of recommendation as a means of anticipating user preferences, and thus of generating customer activity, has stimulated business decisions within the company that will see recommendation used further in the future.


3 Glossary

BPRMF: Bayesian Personalised Ranking Matrix Factorisation; the recommender algorithm used in the BT MyMedia field trial, selected from the MyMedia recommender algorithm library.

BT Broadband: BT's brand name for its consumer ISP service, which links BT's UK IP network to consumers.

BT Vision: BT's IPTV service in the UK, upon which BT's MyMedia field trial is based.

Catalogue: The set of items on which the recommender makes predictions for the user.

Editorial: Recommendations made by the Editorial (marketing) team at BT Vision.

Engine: The recommender system component responsible for making predictions for the user. An Engine implements the algorithms that consume user preference information and produce predictions.

Feedback: A specific user behaviour associated with a measurement of interest, such as a rating, applied to a specific catalogue item.

Implicit Data: Data retrieved by monitoring user behaviour only.

IP: Internet Protocol; standard for digital networks.

IPTV: Internet Protocol Television; TV services provided over a connection to an IP network.

ISP: Internet Service Provider; a company providing services via an IP network.

Metadata Enrichment: The process of augmenting existing metadata in order to make it more useful or meaningful.

Prediction: The assignment of a value to a content item, predicting the expected utility of the item for a user.

Preference Model: An abstract model of the user preference information.

Profile: A grouping of information on a user or set of users.

pVoD: Preview VoD (BT-specific acronym); Video-on-Demand content previewing an item for purchase. Short, and free of charge.

Rating: 1. A measurement of user interest, typically applied to a catalogue item. 2. In relation to content in the BT field trial: the legal definition of the age at which an asset can be viewed (British Board of Film Classification categorisation).

Recommender: A system for finding personalized content for a user.

STB: Set-Top Box; a device providing hardware to process digitally streamed content and deliver it to a TV. Provided with the BT Vision service.

sVoD: Subscription VoD (BT-specific acronym); Video-on-Demand content available as part of a subscription package.

tVoD: Transaction VoD (BT-specific acronym); Video-on-Demand content available as a result of an individual purchasing decision from the BT Vision catalogue.

User Behaviour: User interaction with the system which is relevant for consideration in finding personalized content.

VoD: Video-on-Demand; video content items that can be delivered (usually by streaming over an IP network) to a viewer when required.


4 Target Environment

The target environment for the trial is the recommendation of items delivered by IP over broadband to

consumers’ TV sets via a set-top box (STB). The BT Vision service requires the customer’s broadband

line to support a certain minimum bandwidth. The content-streaming part of the service reserves part

of the line’s available bandwidth when streaming is in progress. Content is streamed from the network

through the BT Home Hub to the BT Vision box. The BT Vision platform is based on Microsoft’s

Mediaroom product.

Customers interact with BT Vision via a conventional multi-function STB remote control, and menus are

displayed on the television. The BT MyMedia trial was displayed in a sub-section devoted to marketing.

The trial involved adding a further sub-menu and populating it with five recommendations, tailored specifically to each BT Vision customer.

Because the trial involved BT's BT Vision service, it only included existing customers of that service, which is currently available only in the UK. In order to have BT Vision, customers must also have a BT phone service and a BT Broadband subscription. The trial was closed in the sense that participants were selected by the BT MyMedia trial organisers; participants were not able to add themselves to the trial.

Research into recommendation through the BT MyMedia field trial offered significant potential benefit

to BT’s BT Vision customers both within and outside the field trial. The BT Vision service offers a wide

catalogue of VoD content, but searching it using an STB remote control is demanding. Recommendation

can ease the path of the customer to content that they prefer and reduce their search effort, improving

the user experience.


5 Motivation for the Field Trial

BT Retail Consumer operates the BT Vision IPTV service for BT. They are interested in learning how recommender services can improve the take-up of VoD, in particular in the movie or film category (the most active category; see the Field Trial Results, section 8.3, below), and thus increase revenue in a commercial context.

They are also interested in how the user experience of recommendation can affect customer loyalty and

reduce churn (erosion of customer base). Churn is important because the BT Vision service is an

important differentiator to help retain BT’s broadband ISP customers.

Since the original field trial plans were developed, a further motivation has been expressed by BT Retail

Consumer in understanding how recommender algorithms perform in recommending individual VoD

items, in comparison to recommendation of regular subscription items for VoD.

At this point it is worth clarifying the three types of VoD viewing activity that are recorded on the BT

Vision management system:

- pVoD refers to the viewing of previews of content. Previews are brief, free to view, and may give advance viewing of either of the other two types of VoD. Because the viewer pays no charge for viewing pVoD, it is not the subject of the BT MyMedia recommender trial, although some of the Field Trial Results in section 8.3 report on it.

- sVoD refers to the viewing of content items that are part of packages. Access to packages is

acquired by paying a monthly subscription, after which any item that is in that package at the

time the viewer wishes to view can be watched at no extra charge. Packages cover a range of

themes designed to cover broad ranges of the BT Vision audience. Because there is no

additional revenue generated when a customer views an individual sVoD item, sVoD was not

originally planned to be studied in the BT MyMedia field trial. After the field trial was planned, an additional query from BT Retail Consumer, about the relationship between a film-themed package (Film Club) and recommendations of individual film items, led to an additional research question, described below. As a result, some aspects of sVoD viewing activity are reported in section 8.3.

- tVoD refers to the viewing of content items that are purchased individually. These are identified

by searching the overall BT Vision catalogue, and making the decision to purchase. After

purchase the viewer has 24 hours in which to watch the item. In the original plan for the BT

MyMedia field trial, it was intended to focus only on tVoD, and on the recommendation and

viewing of tVoD films/movies, this being the most active area of tVoD purchase (see section 8.3

for more information).

So the motivation of the BT MyMedia field trial was originally to investigate how recommender algorithms developed by the project could stimulate BT Vision film tVoD purchases; a later motivation arose to compare the effect of recommendation on film tVoD purchasing activity with film sVoD viewing activity.


6 Research Strategy and Design

6.1 Research Questions

In a business context, the marketing of content is always important. Conventional marketing techniques

have not attempted to anticipate user preferences in a systematic way using algorithms. The BT field

trial within the MyMedia project focuses on investigating the use of recommender algorithms as a

marketing tool for transaction Video on Demand items (tVoD), versus more conventional Editorial

marketing, where the content to be recommended is chosen by the editorial team based on their

knowledge of the content, the customer base, and competing services in the market.

The first research question then is: do recommender services offered through the MyMedia system

stimulate more purchasing activity of tVoD by BT Vision customers than editorial recommendations?

Two further questions arose during the development and operation of the field trial, after the field trial

was originally planned. The BT MyMedia team was asked to include in the field trial an examination of

the effect of recommendation on tVoD purchasing versus the effect of the film package, Film Club (sVoD). The films included in the package formed a narrower catalogue than the general (tVoD) catalogue, but purchasing the package enabled the viewing of multiple films with only one purchasing action.

The final question related to the performance of the recommender algorithm in predicting user

behaviour. How closely did the recommendations delivered to viewers match viewers’ decisions to

purchase tVoD, irrespective of whether they had looked at the recommendations and taken the decision

to view the item from there?

6.1.1 Qualitative study

A more subtle research question planned for the BT MyMedia field trial concerned the user experience of the presentation of recommendations. Do users enjoy the experience of receiving

recommendations, and does this lead to greater customer loyalty and reduced churn?

This required a study involving detailed interaction with a small group of volunteer customers and their

families. An agreement was made with such a group and a timescale identified for this study to be

carried out, however the dependency of the BT MyMedia field trial on the BT Vision service meant that

the trial could not take place during the timescale agreed and the group of volunteers could not be

maintained beyond this time, so this question was not addressed.

6.2 User Sampling

BT participated along with the other project partners in the preparation of personas and use cases as

reported in deliverable D1.1. Since BT’s field trial took place in the UK, the UK-based personas and use


cases are most relevant. However, the nature of BT's field trials, which were closed and involved sampling of users by the field trial organisers from among a large customer base, means that it is difficult to match users in the trials to specific personas and use cases. This does not make the development of the personas and use cases irrelevant: it is likely that among the large number of users included in the BT field trial there are some who fit the proposed personas and use cases closely. The nature of the user sampling and data collection methods used will make this clearer.

Customers were selected for the trial by pseudo-random sampling from the total customer base. Initially two equally-sized groups of 50,000 customers were selected: one to receive recommendations from the MyMedia system, and one to receive recommendations chosen by the BT Vision Editorial (marketing) team. Because BT Vision is a live commercial service, customers were allocated to the trial without their knowledge, and the service that they were offered did not vary as a result, except that customers in the MyMedia recommender group received recommendations generated by an algorithm from the MyMedia recommender algorithm library if they accessed the page on the BT Vision service where recommendations were displayed. All other customers, whether in the Editorial group or outside it, received the Editorial recommendations if they accessed this page. The inclusion of the MyMedia recommendations was not distinguishable by customers in the design of the user interface, did not affect the terms of their contract with BT, and did not require them to pay any additional charges to receive the recommendations.
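The pseudo-random allocation of two disjoint trial groups can be illustrated as follows. This is a sketch only; the seed value and the function name are assumptions, not part of the trial's actual tooling:

```python
import random

def split_trial_groups(stb_ids, group_size=50_000, seed=2010):
    """Draw two disjoint, equally sized trial groups (MyMedia vs
    Editorial) by pseudo-random sampling from the customer base,
    identified by STB ID only."""
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    drawn = rng.sample(list(stb_ids), 2 * group_size)
    return drawn[:group_size], drawn[group_size:]
```

Seeding the generator makes the allocation reproducible, which is useful when the same group membership must be reconstructed later for analysis.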

At no point was personal information about the users taken, nor was it needed for analysis. Since

customers of the BT Vision service could only be matched to a STB, customers in the trial could not be

linked to personas, and their association with particular use cases could not be confirmed.

6.3 Data Collection Methods

Data was collected for the field trial in the form of logs of user viewing behaviour, which recorded the

date and time of viewing events, the STB ID, the BT Vision customer ID, the nature of the item viewed,

and the cost of the item, as well as various other business-specific items which were recorded for the

purposes of billing customers and supporting royalty payment to suppliers, but are not relevant to the

trial. This data was collected for the purposes of management of the BT Vision system irrespective of the

BT MyMedia trial. This data could be used to study both the original research question of the BT

MyMedia quantitative trial comparing the MyMedia recommender system with editorial

recommendations, and the later questions comparing MyMedia tVoD recommendations with an sVoD

package, and investigating the accuracy of MyMedia recommendations in predicting items viewed by

users.
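The fields listed above might be represented as follows when the logs are loaded for analysis. This sketch assumes a CSV export; the column names and record type are illustrative, not the actual BT Vision log format:

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ViewingEvent:
    timestamp: datetime   # date and time of the viewing event
    stb_id: str           # Set-Top Box identifier (no personal data)
    item_id: str          # the item viewed
    cost_pence: int       # price paid, in pence (0 for free views)

def load_viewing_log(text):
    """Parse a CSV viewing-log export into ViewingEvent records."""
    return [ViewingEvent(datetime.fromisoformat(row["timestamp"]),
                         row["stb_id"], row["item_id"],
                         int(row["cost_pence"]))
            for row in csv.DictReader(io.StringIO(text))]
```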

6.4 Data Analysis Techniques

The basis of the quantitative trial was to build up a large body of information about BT Vision customers' response to recommendation by drawing upon a large, closed sample of the BT Vision customer base. It

was possible to extract information about the customers involved in the trial by using the viewing log

files which were generated automatically on behalf of the BT Vision service, so that customers could be

billed for VoD that they had purchased, and royalties could be paid to the owners of the content viewed


where appropriate. These files contained no personal information about the customers receiving the BT

Vision service or their families, but it was possible to distinguish different households involved in the

trial through the STB ID. From the STB ID it was not possible to map back to the precise location or name

of the customers involved in the files used, nor was that required for the analysis.

The accumulation of a large amount of numerical data about the number of VoD items viewed and the

value of the purchases made in order to view them allowed statistical techniques to be used to

investigate for statistical significance between the groups. Descriptive statistics were generated to

describe overall behaviour. Non-parametric statistical tests [3] were used to perform comparisons

between groups, avoiding the assumptions of distributions of the data needed for many other statistical

tests.
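As an illustration of the kind of rank-based comparison involved, the following sketch computes the Mann-Whitney U statistic for two groups. This is one widely used non-parametric test; it is shown here in Python purely for illustration and is not the trial's actual analysis code, which is not specified beyond reference [3].

```python
# Illustrative sketch only: the Mann-Whitney U statistic, a non-parametric
# comparison of two independent groups that makes no assumption about the
# shape of the underlying distributions.

def mann_whitney_u(group_a, group_b):
    """Return the U statistic for group_a relative to group_b."""
    pooled = sorted(list(group_a) + list(group_b))
    # Assign 1-based ranks; tied values share the mean of the rank
    # positions they occupy.
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    rank_sum_a = sum(ranks[v] for v in group_a)
    n_a = len(group_a)
    return rank_sum_a - n_a * (n_a + 1) / 2.0

# U ranges from 0 (every value in group_a below every value in group_b)
# to len(group_a) * len(group_b) (the reverse).
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0.0
print(mann_whitney_u([4, 5, 6], [1, 2, 3]))  # 9.0
```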

As the field trial was being developed a request was made for a study of the effect of the MyMedia

recommender system in comparison with the Film Club package as a marketing tool. The Film Club

package is one of a range of packages available to BT Vision users. It focuses specifically on films.

Subscribers pay by monthly direct debit and can watch any of the films in the Film Club catalogue for

free as many times as they want, during the time that those films are available in the Film Club

catalogue. The Film Club content does not overlap with the wider VoD content, although a particular film may move from one catalogue to the other at some stage.

A small group of users (250) was moved from the main BT field trial, and divided into five groups, each

of which received five recommended films when they accessed the recommendation page (the same

page which any user could access). The difference in this case was that the users in the five groups

received different combinations of recommendations generated by the recommender algorithm used on

the MyMedia trial, and by pseudo-random sampling from the Film Club catalogue. The five groups thus received different proportions of MyMedia and Film Club recommendations. It was intended to use Analysis of Variance [4] to test for differences between the levels of MyMedia versus Film Club recommendation, but since analysis of viewing of the recommendation pages showed that none of the triallists in this part of the trial had viewed the pages, this was not followed up.

Near the conclusion of the field trial additional data was provided logging BT MyMedia triallists’ access

to the recommendation page on the BT Vision service. It was discovered that few of the triallists had

accessed the page (see section 8.3 below). However the viewing results for those that had accessed the

page were of interest.

Finally, it was thought worthwhile at the conclusion of the trial to investigate the performance of the

MyMedia recommender algorithm in predicting users’ purchases, whether or not they had made the

decision based on viewing the recommendation page. A hit-rate metric [5] was used to do this.
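A hit-rate calculation of this kind can be sketched as follows. The data and names are hypothetical, and the precise definition of the metric in [5] may differ (for example, counting hits per household rather than per viewing event); this is only one plausible formulation.

```python
# Hypothetical sketch of a hit-rate metric: the fraction of viewing
# events whose asset appeared among the five items recommended to the
# same household (STB).

def hit_rate(recommendations, views):
    """recommendations: {stb_id: set of recommended asset IDs}
    views: list of (stb_id, asset_id) viewing events."""
    if not views:
        return 0.0
    hits = sum(1 for stb, asset in views
               if asset in recommendations.get(stb, set()))
    return hits / len(views)

recs = {"stb1": {"a", "b", "c", "d", "e"},
        "stb2": {"f", "g", "h", "i", "j"}}
views = [("stb1", "a"), ("stb1", "z"), ("stb2", "f"), ("stb2", "g")]
print(hit_rate(recs, views))  # 3 of the 4 views were recommended -> 0.75
```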


7 Field Trial Development

The BT MyMedia field trial needed to handle tens of millions of viewing records, one hundred thousand

customers, and thousands of items of content. Content items and viewing records were updated

throughout the trial. Input data was supplied from parts of BT external to the BT MyMedia team, and

the output recommendations were supplied to third-party companies responsible for transferring the

items to the BT Vision platform. Iterative development was used, since the input format of the data was

changed for reasons outside the team’s control, and additionally performance and integrity issues

arising from the very large data sets had to be overcome. The final version of the system performed

satisfactorily for the duration of the trial, and is envisaged to remain operating after the project is

complete. Further detailed information on the system architecture may be found in Section 7.4.

7.1 Content Catalogue

BT Vision customers may access content items as part of different catalogues, depending upon whether

they are subscribers to an sVoD package, whether they are viewing a free preview (pVoD), or whether

they have purchased an individual tVoD item. All types of content have a similar metadata format; however, because the content is provided by a variety of suppliers (BT is not itself a content provider), not all metadata fields are complete or formatted consistently across items.

The following table shows the most important fields and examples of metadata associated with content

items (referred to by BT Vision as assets). Many fields should be self-explanatory. Window Start and

Window End are worth explaining, because they are critically important for the BT Vision service and for

successful recommendations. Because BT acquires the rights to distribute digital content from content

owners, these rights are licensed for fixed periods. The Window Start and Window End fields are derived

from these licence terms, indicating when a particular asset can be purchased by a user under the terms

of the licence. In order to implement the BT MyMedia field trial it was essential to check that every

recommendation was delivered to a triallist during the time window when that recommended item

could be purchased.


Field             | Example
Asset ID          | ah_AGR400_1150811224590_926368967
Title             | Planet Earth-s1, ep7:Planet Earth
Summary           | With a budget of over 13 million UK Pounds and over 200 locations, Planet Earth is a truly epic nature documentary. The treeless grasslands... .
Short Summary     | The BBC’s Bafta nominated documentary continues with the grasslands of the world’s ... (approx. 35 words)
Series            | Planet Earth, Series 1 (stored relationally by Series and Programme)
Episode Title     | Great Plains
Episode Number    | 7
Language          | en
Subtitle Language | en
Rating            | U
Screen Format     | 16x9
Runtime           | 50
Release Year      | 2006
Price             | 0.99
Currency          | GBP
Window Start      | 2007-12-04T18:10:36
Window End        | 2008-11-01T23:59:59
Categorization    | TV/Documentaries (stored relationally; genre under one of TV, Film, Kids, Music, Sport)

Table 1 Content data structure
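The licence-window check described above can be sketched as follows: a recommendation may only be delivered while "now" falls between the asset's Window Start and Window End (timestamps formatted as in Table 1). The function and data layout are hypothetical illustrations, not the trial's C# implementation.

```python
# Sketch of the licence-window check: an asset may only be recommended
# while the current time falls inside its agreed purchase window.
from datetime import datetime

def in_window(asset, when):
    start = datetime.strptime(asset["window_start"], "%Y-%m-%dT%H:%M:%S")
    end = datetime.strptime(asset["window_end"], "%Y-%m-%dT%H:%M:%S")
    return start <= when <= end

asset = {"window_start": "2007-12-04T18:10:36",
         "window_end": "2008-11-01T23:59:59"}
print(in_window(asset, datetime(2008, 6, 1)))  # True: inside the window
print(in_window(asset, datetime(2009, 1, 1)))  # False: licence expired
```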


7.2 User Feedback

Two forms of user feedback have been used in the MyMedia project. The first is explicit feedback,

where the user explicitly expresses or records a choice. A typical example is a user rating an item on a

recommendation page. The second kind of feedback is implicit feedback. In this case, the viewer does

not actively record an opinion or action, and a record of their interaction with the system is the only

means of establishing their behaviour. For example, if a viewer selected a film, then that could be

considered implicit positive feedback.

The nature of the BT Vision user interface as deployed in the BT MyMedia field trial means that

collecting explicit feedback is not possible. Data is recorded about BT Vision customers’ viewing activity, in order that they can be billed for VoD purchases, and in order that royalties can be

paid to content providers. These records are particularly important for analysing the effects of

recommendation. Table 2 below shows the format that this takes. This is a form of implicit feedback,

where the user does not take any direct action for the information to be recorded.

Field                      | Description
Event Datetime             | Date/time of viewing event
STB ID                     | Set Top Box ID
Customer ID                | BT Vision Customer ID
Asset ID                   | Asset ID
Various Subscription Flags | Whether the view was made as part of a subscription
Sub or Charge View         | Subscription or charged view
Customer Bitmask           | Subscription bitmask for customer at time of view

Table 2 Viewing data structure.

7.3 User Interface


Viewers interact with the BT Vision UI, and hence with recommendations provided through it, via a

dedicated remote control provided with the BT Vision set top box. The left/right/up/down buttons allow

movement over a screen, including to a “back” button which will move to a previous screen. The

“select” button selects a choice. The User Interface used in the field trial can be summarised as a user

journey shown in the following set of images. The BT MyMedia field trial is presented through the Try

This interface menu of BT Vision, an area devoted to marketing on-demand content. The Figures below

show the user journey towards selecting a particular item. It is worth mentioning that there are a

number of routes possible through the Try This user journey: all sub-pages of the Entertainment page

will lead to the green button allowing access to the recommendation page.

Figure 1 “Try This” user journey, part 1: Try This menu selected.

Figure 2 “Try This” user journey, part 2: UI process starts.

Figure 3 “Try This” user journey, part 3: Entertainment screen opens by default.

Figure 4 “Try This” user journey, part 4: Family Ties option selected from the Entertainment screen.


Figure 5 “Try This” user journey, part 5: user selects Try This! Recommends, using the green button highlighted at the bottom of the previous screen.

Figure 6 “Try This” user journey, part 6: user selects “Awaydays” from the recommendations. Information is provided about the movie and the cost of purchase.

Figure 7 “Try This” user journey, part 7: user decides to purchase “Awaydays”. A final choice is available before ordering.


7.4 System Architecture

A simplified schematic is shown in Figure 8. Components inside the dashed line are directly part of the

MyMedia system, and comprise the code framework with its database, additional tables within the

same database, plus the developed software. These components all run on a single 8-core computer

running 32 bit Windows 7, with SQL Server Developer Edition 2008 R2. Most of the core components

are written in C-sharp within .NET3.5 (the core MyMedia framework, the application logic and the

algorithm code). The code to handle the daily email feed is written in C++, and called directly from the

C-sharp application.

Figure 8 System Architecture.

Components outside the dashed line are not part of the core system, but are required in order for the

trial to operate. These components and their data formats are not under the direct control of the BT

MyMedia software team. They are described in the following subsections.

7.4.1 BT Vision Data Input (external to MyMedia)

Viewing records are supplied to MyMedia by the Vision operational team within BT. During the course

of the software development for the trial, the external database was changed from SQL Server to Oracle; these changes were made for operational reasons outside the scope of the MyMedia trial.


7.4.2 Input Processing (within MyMedia)

Software was written to support all the database formats described in section 7.4.1, to import the

records directly into the MyMedia core database for the trial. This involved using standard C-sharp and

other .NET3.5 software components, which allowed the remote databases to be opened and queried as

though they were local, and for their contents to be imported into the MyMedia system. However, this approach proved problematic owing to the speed of import: SQL queries were made over a network, and the connector was often slow; additionally, cross-checking had to be performed to eliminate minor inconsistencies in the data. With imports of historical data taking several weeks in

certain cases, the system proved sensitive to short-term network problems. The impact of these issues

was reduced by software redesign. Consequently, the technique adopted for the trial is to obtain a

delimited text dump from the live Oracle database of customer viewing records, manually transferred

once each month via FTP to the MyMedia computer by the Vision team. This file is approximately 4GB

in size each month, and contains millions of (sometimes >10 million) viewing events.

The MyMedia software operates on this large text file, with as much of the processing as possible being

performed on memory-resident data structures, rather than by making database queries: based upon

experience during development, this approach gave the best speed of import. The process was as

follows:

• Process the text file, selecting only views that fall within the current month’s time window.

• Read those views, selecting only the fields of interest

• Retrieve the trial customers and list of video assets from the MyMedia database

• For each view, verify that the user and asset can be located. If so, create a new viewing record

in the MyMedia database and associate it with the user and asset.

This monthly import and filtering takes several hours on a fast, multi-core computer, and has proved

reliable.
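The steps above can be sketched as follows. This is only an illustration of the logic in Python (the production code is C# within .NET 3.5), and the field positions are inferred from the example record shown later in this section.

```python
# Illustrative sketch of the monthly import: known trial customers and
# assets are held in in-memory sets, and each pipe-delimited line is
# kept only if it falls in the current month and refers to a known
# user and asset. Field positions are an assumption based on the
# example record in this section.

def import_views(lines, known_customers, known_assets, month_prefix):
    """Yield (customer_id, asset_id) for every valid, in-window view."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        view_date, customer_id, asset_id = fields[2], fields[5], fields[7]
        if not view_date.startswith(month_prefix):
            continue  # outside the current month's time window
        if customer_id not in known_customers or asset_id not in known_assets:
            continue  # unknown user or asset: discard the event
        yield customer_id, asset_id

# Hypothetical records: only the first is a May 2010 view by a known pair.
lines = ["20100505|2010-05|2010-05-07|2010-05-05|10:00|C1|Unknown|A1|Music",
         "20100405|2010-04|2010-04-07|2010-04-05|10:00|C1|Unknown|A1|Music",
         "20100505|2010-05|2010-05-07|2010-05-05|10:00|C2|Unknown|A1|Music"]
print(list(import_views(lines, {"C1"}, {"A1"}, "2010-05")))  # [('C1', 'A1')]
```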

An example line from the import file is shown below (this example refers to a music video, and the

customer ID and asset IDs have been changed). The fields are delimited by the | symbol. The example

has been artificially broken over multiple lines for clarity:

20100505|2010-05|2010-05-07|2010-05-05|10:00 - 10:59|00000012342012|Unknown|
ah_AGR603_1472482815221_240987196|Music Videos|
Spring 2010: Vol 5 – Sleeping With A Broken Heart - Alicia Keys|Music|
Sony BMG|1|0|0|0|> 500|Unknown|Music|Music Video|2008-05|None

These fields variously contain date information, textual description and flags relating to the type of

content. The field beginning 0000 contains the customer ID, and the ah_AGR… field uniquely identifies

the video on the system.

The field trial software processes the input data file to:


• Identify viewed assets (movies, programmes etc.) that are currently in the MyMedia system.

Those identified assets are subsequently filtered to exclude those which are not of interest (such

as trailers and free content). Viewing events that relate to items not known to the MyMedia

system are discarded.

• Import viewing events for users who are currently known to the MyMedia system. In order to

keep customers’ personal information private, various identifiers are used to represent users

within BT Vision, and data checking is performed as part of the trial in order to verify that an

(anonymised) billing ID corresponds to an appropriate (anonymised) Vision Set Top Box.

In addition to the monthly viewing records, the system also receives an automatically-generated email

daily, with changes to the asset content on the Vision platform. This email contains a compressed

attachment, with each asset’s metadata as an XML file, plus optionally images which are used for

publicity (analogous to the pictures that are printed on DVD cases).

These emails are automatically processed: the system verifies that the asset is indeed new and, if so,

adds it to the database, and saves the images if present. The images are not specifically required for the

recommendation generation, but are used later in the workflow, when the recommendations are sent

for distribution.
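The asset-update step can be sketched as follows. The XML element names and the in-memory "database" dictionary are hypothetical; the real system is written in C# and stores assets in SQL Server.

```python
# Sketch of processing one asset from the daily email feed: parse the
# XML metadata and add the asset only if its ID is not already known.
import xml.etree.ElementTree as ET

ASSET_XML = """<asset>
  <id>ah_AGR400_1150811224590_926368967</id>
  <title>Planet Earth-s1, ep7:Planet Earth</title>
</asset>"""

def add_if_new(xml_text, database):
    root = ET.fromstring(xml_text)
    asset_id = root.findtext("id")
    if asset_id in database:
        return False  # already known: ignore the update
    database[asset_id] = root.findtext("title")
    return True

db = {}
print(add_if_new(ASSET_XML, db))  # True: first time this asset is seen
print(add_if_new(ASSET_XML, db))  # False: duplicate, not re-added
```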

7.4.3 Recommender Output

The recommender produces a daily feed which is ultimately displayed on the BT Vision customer’s TV as part of the ‘Try This’ page (see section 7.3 above). This recommendation list is generated as a delimited

text file, with one line of 5 recommendations per customer. The customer ID is a numeric value, which

cannot be used in isolation to identify an individual person or address, and the five recommendations

are alpha-numeric identifiers.

An example of the output is shown below. There is one of these lines per customer. The example below

has been artificially split across multiple lines for clarity.

cffd4268-8a20-4ae3-84bc-dbbf5ccc34d7|CC50209E-E8D1-421C-95F2-D63D4EC8E873|
3F9C3014-3909-475A-A152-DABCB6A92148|8216EDA4-79F4-4857-A4C6-2464117BFFFC|
598991D1-47E9-47E5-B29C-79053F5230FB|00ABEB3B-6D53-4C54-A3E4-33C7020672E1

The first identifier (begins with cff above) is the STB ID. Note that this is a different format to the

customer ID in the (input) viewing records, thus further increasing anonymity. The remaining five values

are the identifiers of the items to be recommended for that customer.
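Emitting one line of the feed can be sketched as follows: the STB ID followed by exactly five asset identifiers, pipe-delimited, as in the example above. The helper name is hypothetical.

```python
# Sketch of formatting one line of the daily recommendations file.

def format_recommendation_line(stb_id, asset_ids):
    if len(asset_ids) != 5:
        raise ValueError("exactly five recommendations per customer")
    return "|".join([stb_id] + list(asset_ids))

line = format_recommendation_line("cffd4268-8a20-4ae3-84bc-dbbf5ccc34d7",
                                  ["A1", "A2", "A3", "A4", "A5"])
print(line)  # cffd4268-8a20-4ae3-84bc-dbbf5ccc34d7|A1|A2|A3|A4|A5
```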

The daily recommendations file is sent automatically from the MyMedia software via FTP to an external

company, who manage the Try This marketing pages for BT Vision.

The daily nature of recommendations required that the BT MyMedia recommender system generate new sets of recommendations within 24 hours. It also necessitated checking the new sets of recommendations against the daily email input from BT Vision, which provides information about changes in the asset list as well as additional images where required.


7.4.4 System Architecture Conclusions

The BT field trial required additional database tables to store data and relationships that could not be

expressed in the core MyMedia framework. Several iterations of the schema were required, since the

storage format of the BT data was changed by operational parts of BT (outside the control of the BT

MyMedia researchers). Additionally, data validation was required, since (with a large data set, evolving

over two years), certain data inconsistencies were discovered in the records. This cross-checking was

performed during data input, verifying that users, assets and viewing records are all referring to the

same set of data. This required several re-imports of large historical data, since some problems only

came to light part way through the workflow.

The schema and queries also required optimisation as experience was gained. For example, the BT

Vision customer ID is numeric, although too large to be stored in a 32-bit integer. Initial designs

therefore stored the ID as a string; however the effect of this was to slow the queries such that data

import could take an unacceptably long time (importing 12 months of viewing data with more than 10

million views per month would have taken two months). Additionally, a small proportion of the

‘numeric’ IDs were actually found to contain non-numeric characters. Discovering these issues and

resolving them contributed to delays in starting the trial.
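The ID issue above can be sketched as follows: valid IDs are treated as (64-bit) integers, which keeps queries fast, while IDs containing stray non-numeric characters are flagged for manual resolution. The function name is hypothetical.

```python
# Sketch of customer-ID validation: BT Vision customer IDs are numeric
# but too large for a 32-bit integer, and a few contain non-numeric
# characters. Python's int is arbitrary precision, so the overflow
# concern applies to the database schema rather than this sketch.

def normalise_customer_id(raw):
    """Return the ID as an integer, or None if it is not purely numeric."""
    return int(raw) if raw.isdigit() else None

print(normalise_customer_id("00000012342012"))  # 12342012
print(normalise_customer_id("12AB34"))          # None: needs manual checking
```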

The current trial system performs satisfactorily, although parts of the workflow can still take several

hours to complete, and this makes stopping and debugging the software more difficult. There are still

occasions when “out of memory” exceptions are thrown due to the extremely large datasets handled by

this very large-scale trial. However, a daily manual check and restart of the software if necessary is

sufficient for the trial.

7.5 Recommender Algorithms

The BT MyMedia field trial was based on use of the MyMedia software framework, and the

recommender algorithm library developed for use with this. In order to ensure that the most

appropriate algorithm was selected for use in the BT field trial, a sample set of BT Vision data, incorporating viewing records and content catalogue data, was supplied to UHI. UHI, leading the

development of recommender algorithms for the project, were able to evaluate which recommender

algorithm in the MyMedia recommender algorithm library performed best for data of the form that

would be used in BT’s field trial. The conclusion of this evaluation by UHI was that the BPRMF (Bayesian

Personalised Ranking Matrix Factorisation) algorithm, reported in MyMedia D4.1.2 [7] performed best.

This algorithm was therefore selected, although the design of the MyMedia software framework as

produced in WP2 could have allowed others to be substituted.

7.6 Filters

In order to limit the amount of data processing required, filters were applied to the data. The core

MyMedia framework supports filters as software components, which can be loaded into the system to

filter out items not required for the recommendation process. The framework supports two types of

filters: pre- and post-filters. Four types of pre-filters were used in the BT MyMedia field trial:

• A pre-filter restricting the content category to tVoD film


• A pre-filter removing any items outside the time window, agreed with the content provider, within which they could be viewed

• A pre-filter removing items that had been previously viewed by viewers using the same STB

during the trial

• A pre-filter removing any items that, although consistent with the previous pre-filtering actions, were not retained after changes introduced by the daily live report (which reports changes in the content catalogue).

As a result of these pre-filters, it turned out that the use of post-filters was not required, although the

capability to introduce them was retained.
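The four pre-filters can be sketched as a single combined predicate. The framework's actual filters are separate .NET components; this Python version, with hypothetical data shapes, only illustrates the logic.

```python
# Sketch of the four pre-filters applied to a candidate asset.
from datetime import datetime

def eligible(asset, now, viewed_on_stb, current_catalogue):
    return (asset["category"] == "tVoD film"              # category pre-filter
            and asset["window_start"] <= now <= asset["window_end"]
            and asset["id"] not in viewed_on_stb          # not seen on this STB
            and asset["id"] in current_catalogue)         # survived daily update

asset = {"id": "A1", "category": "tVoD film",
         "window_start": datetime(2010, 1, 1),
         "window_end": datetime(2010, 12, 31)}
print(eligible(asset, datetime(2010, 8, 1), set(), {"A1"}))   # True
print(eligible(asset, datetime(2010, 8, 1), {"A1"}, {"A1"}))  # False: viewed
```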

7.7 Metadata Enrichment

Metadata enrichment is an important area of research in recommender systems, because the

availability of substantial metadata associated with content is often important for the effective

performance of recommender algorithms. Another area of the MyMedia project, WP3, focuses on this topic. A deliverable produced by WP3 simultaneously with this one, deliverable D3.1.3 [7], provides

information about the achievements of WP3 in more detail.

The content catalogue used in the BT MyMedia field trial contains a considerable amount of metadata,

as described in section 7.1 above. However since the content is provided by a variety of content

providers, who may have different standards in populating the metadata fields, and since some of the

metadata may be amended by BT Vision schedulers following import to the BT Vision system, it may not

always be consistent. Accordingly it was thought worthwhile to investigate whether some of the

metadata enrichment modules developed in WP3 could improve the existing metadata in BT’s content

catalogue in order to use them in conjunction with the recommender algorithms developed in WP4.

Of the various techniques available, the Keyword Extraction (KWE) approach researched by Novay was studied because it was the most advanced. In order to test whether KWE could improve the usefulness of the

recommendations in this specific case, a sample set of 70,000 viewing records and programme data was

extracted from the full BT Vision dataset. All personal data was anonymised, with each user being

reduced to a unique number, so that whilst it was possible to determine the items that each user

watched, it was impossible to identify the user by name or personal details. This sample was supplied to

Novay. Novay’s Statistical KWE modules were investigated as potential means of metadata enrichment

for the BT Vision content dataset. Results suggested that the form of textual synopses provided with the

content did not lead to any advantage when Novay’s techniques were applied. Therefore on the advice

of Novay, the technique was not adopted for the BT field trial.

Other techniques for metadata enrichment were researched as part of WP3, but were either not in a

sufficiently advanced state when the BT field trial was being developed, or depended on explicit user feedback, which, as described above, was not possible in the context of the BT MyMedia field trial.

Accordingly it was not possible to use Metadata Enrichment in the context of the BT MyMedia field trial.


8 Field Trial Execution

8.1 Evaluation System

Work reported in deliverable D1.5 of the MyMedia project [1], and summarised in a paper by Knijnenburg et al. (2010) [2], led to the development of a conceptual framework for evaluating recommender systems, intended to be sufficiently general to cover all the field trials in the MyMedia project as well as other recommender systems. For ease of comparison, a simplified version of the framework, as shown in Figure 10 of deliverable D1.5, is reproduced in Figure 9 below.

Figure 9. MyMedia evaluation framework

From this diagram we can see that most of the focus of the BT MyMedia field trial is on a small

subsection of the issues described here, on the Purchase/view sub-category within the Interaction

category within the Behaviour box. The use of only implicit feedback from users of the system within the

BT MyMedia field trial, and the deliberate lack of direct contact with users, means that most of the

aspects of recommender system usage described here cannot be studied. However this does not reduce

the importance of the study, which focuses on the use of the MyMedia system in the context of a live

commercial service, where purchasing behaviour must necessarily be of most importance.

During the field trial, information was gathered not just about viewers’ viewing and purchasing behaviour; information also became available about which households (designated by STBs) had


viewed the BT MyMedia recommendation page described in section 7.3 above. This information did not

expose personal information about viewers, but it did allow the evaluation of objective aspects of

performance of the recommender algorithm used in relation to prediction of the preferences of viewers

proceeding to choose to purchase and view items, thus focusing on a more widely explored aspect of

recommender system evaluation, falling within the Objective system aspects box in Figure 9 above.

8.2 Field Trial Execution

BT’s quantitative field trial involved no direct interaction with customers. A large number of customers were involved: 50000 receiving MyMedia recommendations and 50000 receiving Editorial recommendations (recommendations selected by the Editorial [marketing] team within the BT Vision management). This quantitative trial was thus the largest in the MyMedia project in terms of participants, and also differed in delivering the recommendations, which ultimately originated from the same MyMedia software, via a Microsoft Mediaroom-based IPTV interface rather than a computer-based browser.

The BT field trials were originally planned to commence in January 2010. This was dependent on

decisions made within the BT Retail Consumer business, the operators of BT Vision and host of the field

trial. One delay was caused by a requirement for more detailed preparation for customer relationship

management in the context of the new recommendation pages. Once this was dealt with, another delay was caused by a change to the format in which customer viewing history was recorded. This

necessitated rewriting much of the software designed to import and parse customer viewing records so

that the data could be ready for analysis and for training the recommender algorithm.

Accordingly, while the BT MyMedia recommender service was ready to run over the BT Vision service

somewhat earlier, it was not possible to collect data for the field trial until August 2010. Data was

recorded on an Oracle database managed in another part of BT, and then imported to the BT MyMedia

research team’s machines through a number of security barriers posed by different firewalls. Software

was written to convert the Oracle format data into formats that could be used by a variety of data

analysis programs. For the purposes of this report data from August and September 2010 was used,

although the field trial continued running beyond the completion of this report.

8.3 Field Trial Results

8.3.1 General user behaviour

To investigate the first research question (do recommender services offered through the MyMedia system stimulate more tVoD purchasing activity by BT Vision customers?), we first investigated how customers in the two field trial groups (MyMedia recommender and Editorial) differ in their purchasing of tVoD film

content. Summary statistics are shown in Table 3 below.


Month and Group     | Customers viewing items | Mean items viewed | Median items viewed | Modal items viewed | Mean cumulative purchase value (GBP) | Median cumulative purchase value (GBP) | Modal cumulative purchase value (GBP)
August Editorial    | 27927 | 1.5500 | 1 | 0 | 3.4388 | 0.418 | 0.00
August MyMedia      | 27699 | 1.5284 | 1 | 0 | 3.4259 | 0.664 | 0.00
September Editorial | 26136 | 1.2070 | 0 | 0 | 2.7763 | 0.00  | 0.00
September MyMedia   | 25985 | 1.2131 | 0 | 0 | 2.7552 | 0.00  | 0.00

Table 3. Summary statistics for the two groups in the BT MyMedia quantitative trial for the months of

August and September 2010.

This table refers to triallists in the two groups of the main BT MyMedia field trial, one receiving Editorial

recommendations from the BT Vision Editorial team, and one receiving recommendations generated by

the MyMedia recommender system. After a group of triallists was removed to support the Film Club trial, there were 49750 customers in each group.

A large proportion of customers (between 43.9% and 47.8%), irrespective of group and month, did not

view any VoD. This may reflect the availability of many free-to-air digital TV channels which can be

viewed and recorded via the BT Vision STB.

However, the focus of the trial is on the recommendation of tVoD film items, and the mean, median and modal numbers of items viewed and purchase values refer to tVoD items. The modal number of items

viewed being 0, irrespective of group or month, is explained by the distribution of tVoD viewing being extremely skewed: many viewers viewed no tVoD items at all (though they were likely viewing pVoD or sVoD), while a few viewers were much more active in their purchasing of tVoD items. The skewed nature of this distribution is confirmed by the divergence of the mean, median and mode statistics.
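The effect of such skew on the three statistics can be illustrated with invented data (the figures below are not trial data): most customers buy nothing, while a few buy many items.

```python
# Invented data illustrating why mean, median and mode diverge for a
# heavily skewed distribution such as tVoD purchasing.
from statistics import mean, median, mode

items_viewed = [0] * 60 + [1] * 20 + [2] * 10 + [5] * 6 + [20] * 4
print(mode(items_viewed))    # 0: the most common value
print(median(items_viewed))  # 0.0: more than half bought nothing
print(mean(items_viewed))    # 1.5: pulled up by a few heavy buyers
```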

The numbers of triallists that did view items were very similar between the groups; although they differed between months, the difference was not statistically significant. The mean number of items viewed was likewise very similar between groups, although it differed between months. In neither case was there a significant difference.


The cumulative purchase value refers to the purchase value of all tVoD items purchased during that time

period. The mean, median and modal cumulative purchase values are thus dependent on the mean,

median and modal numbers of tVoD items viewed.

Although there are differences between the two groups in mean cumulative purchase value of tVoD film items during both months, they are not statistically significant. The differences between the two months are larger, but these too are not statistically significant.

Although this overall summary would seem to show no effect of the MyMedia recommender system on user behaviour, it necessitates further investigation of which users (identified only by STB identifier) actually viewed the MyMedia and Editorial recommendations through the Try This page, and of the consequences for viewer click-through to particular items and for viewing behaviour.

8.3.2 Recommendations delivered and viewing events recorded

During August 2010 there were 1301788 daily sets of five recommendations delivered to the Editorial

group and 1346822 to the MyMedia recommendation group. This is consistent with the

recommendations shown changing on a daily basis, and with the four complete weeks that made up most of the month being used as the sampling period. On a few occasions technical faults meant that recommendations could not be delivered to customers, which is why these figures do not exactly match the number of sets of recommendations that might be expected in each month.

During the same period in August 2010 there were 167304 viewing events recorded among the Editorial group and 168705 among the MyMedia recommender group. These figures are smaller than the number of daily recommendation events, which is not surprising when one considers that not every recommendation is likely to lead to a viewing. However, they suggest a much higher mean rate of viewing activity (approximately 3.72 for the Editorial group and approximately 3.75 for the MyMedia recommender group, adjusting for the length of the month) than given in Table 3 above. This is because the viewing log files record all types of viewing events, pVoD and sVoD as well as all categories of tVoD. When the total viewing events are reduced to tVoD only, 84965 are recorded for August and 59511 for September. Of these, film is the most important category, as shown in the tables and charts below, because it is the only category which cannot be completely viewed through subscription.


Category   Number of tVoD items viewed   Percentage of Total
Film       55332                         65.12
Kids       10054                         11.83
TV         8447                          9.94
Music      5653                          6.65
Replay     1899                          2.24
Other      3580                          4.21
Total      84965                         100.00

Table 4. tVoD viewing in August 2010, across both trial groups. Other refers to several categories which had small numbers of viewings.

Figure 10. tVoD viewing in August 2010, in chart form (viewings by category as listed in Table 4).


Category   Number of tVoD items viewed   Percentage
Film       34566                         58.08
TV         7576                          12.73
Kids       7078                          11.89
Music      4477                          7.52
Replay     3189                          5.36
Others     2625                          4.41
Total      59511                         100.00

Table 5. tVoD viewing in September 2010, across both trial groups. Others refers to several categories which had small numbers of viewings.

Figure 11. tVoD viewing in September 2010, in chart form.

The fact that many viewing events are of items other than films explains in part why many of the customers in both trial groups are not recorded as viewing any tVoD items in Table 3 above: they may have viewed non-film tVoD items, or, based on the total viewing figures, viewed pVoD or sVoD items. Alternatively they may not have used the BT Vision service other than to view free-to-air channels.



However this does not prevent further exploration of the response of the BT MyMedia triallists to

recommendation since we also have information about which triallists (as defined by STB rather than by

personal information) viewed the recommendation page, and whether this led to a click-through to a

particular content page. This can be linked with viewing records to identify whether the

recommendation page was effective in stimulating user purchasing and viewing of tVoD films.

8.3.3 Viewing the BT MyMedia recommendation page and its consequences for tVoD viewing

The BT MyMedia research team were able to obtain, late in the field trial, information about which customers had accessed the Try This page where the MyMedia and Editorial recommendations were displayed.

Group                    Number of   Number of        Viewing of recommendation    Click-through
                         viewers     viewing events   page without click-through   to asset
Editorial                50          106              70                           18
MyMedia recommendation   74          199              109                          45

Table 6. Viewing of the MyMedia Try This recommendation pages, during the BT MyMedia field trial,

August-September 2010.

This represents a very small proportion of the trial population (<1% in each case), which makes basing the analysis of the whole trial on the behaviour of the whole population inappropriate. Instead it is necessary to focus on the triallists that viewed the recommendation pages and their behaviour in viewing items. Although the triallists in the Editorial group viewed the recommender page less than expected, and clicked through to assets less than expected, there was no significant difference in viewing events or click-through activity (χ² test, 1 d.f., in both cases), so it is possible to make further observations based on this smaller number of viewers in the trial.
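A χ² comparison of this kind can be sketched as follows. The 2x2 table here is built from the click-through and no-click-through counts in Table 6; whether this is the exact contingency table used in the original analysis is an assumption:

```python
# Sketch of a chi-squared test of independence on Table 6 counts
# (2x2 contingency table, 1 d.f.).
# Rows: Editorial, MyMedia; columns: click-through, no click-through.
def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

stat = chi2_2x2(18, 70, 45, 109)  # counts taken from Table 6
print(round(stat, 3))
# The statistic falls below the 5% critical value of 3.841 for 1 d.f.,
# consistent with the text's finding of no significant difference in
# click-through activity between the groups.
```

The shortcut formula used here is algebraically equal to the usual observed-versus-expected sum for a 2x2 table.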

For each of the triallists who were recorded as having accessed the Try This page at least once during

the evaluation period (August and September 2010), a complete history of Video on Demand viewing

was extracted and compared chronologically with the Try This page access logs.


Customer: 062B023D-D1FF-4EE1-AC6B-E452E6F944BE

Date Time Event Type Asset ID Asset Title

03/08/2010 16:00 Preview ah_AGR603_1279211753364_682168968 Trailer: Ben 10: Kids Take Over

13/08/2010 12:00 tVoD view ah_AGR603_1195657549425_653658463 Scooby Doo 2: Monsters Unleashed

13/08/2010 12:53 Viewed Recs

13/08/2010 12:54 Click Through ah_AGR603_1195657549425_653658463

09/09/2010 23:00 Preview ah_AGR603_1273589912469_816189450 Trailer: Grouchy Young Men -s1

12/09/2010 16:00 Preview ah_AGR603_1274733959690_241005256 Trailer: Kick Buttowski

13/09/2010 18:00 tVoD view ah_AGR603_1213279450125_205990313 Chronicles of Narnia: The Lion, the Witch & the Wardrobe

25/09/2010 14:00 Preview ah_AGR603_1274733959690_241005256 Trailer: Kick Buttowski

26/09/2010 11:00 Preview ah_AGR603_1274733959690_241005256 Trailer: Kick Buttowski

Table 7: Example of a combined customer viewing history compared with access to the MyMedia

recommendations.

Table 7 shows an example of a chronological history for a typical BT Vision customer during August and

September. This particular customer was also a BT Vision package subscriber and made extensive use of

the Kids’ TV package during this period. For clarity, Subscription viewing events are omitted.

On 13th August the table shows (highlighted in green) how an interaction with the Try This recommendation page led to the customer purchasing a film on demand (tVoD). It should be noted that VoD viewing information is recorded in hourly 'buckets', so the time at which the film was purchased (event type 'tVoD view') is marked as 12:00 but actually fell between 12:00 and 13:00 on 13th August. It can be clearly seen that at the same time the customer viewed the recommendations page (12:53) and then clicked through to an asset (12:54) with the same ID as the film they subsequently purchased ('Scooby Doo 2: Monsters Unleashed').

In order to observe the effectiveness of recommendations on the Try This pages for the small number of triallists who viewed them, the chronological history for each customer was examined and events similar to the one described above were counted. Using objective data such as this, it is impossible to be certain whether a particular pattern of events can be attributed to a recommendation, and so some flexibility was necessary when reading the event sequences. In some cases, a click-through event would lead to the customer watching a preview, or paying for something else instead.
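A scan for sequences like the one in Table 7 can be sketched as follows. The event-record format and field names here are illustrative assumptions, not the actual BT log schema:

```python
from datetime import datetime, timedelta

# Illustrative event history in the shape of Table 7; asset IDs shortened.
events = [
    ("2010-08-13 12:00", "tVoD view", "asset_653658463"),
    ("2010-08-13 12:53", "Viewed Recs", None),
    ("2010-08-13 12:54", "Click Through", "asset_653658463"),
]

def attributable_views(events, window=timedelta(hours=1)):
    """Count tVoD views of an asset within `window` of a click-through on it.

    Because tVoD views are logged in hourly buckets, a view stamped at the
    start of the hour may actually follow the click-through, so matching
    events on either side of the click-through are counted -- the kind of
    flexibility the text describes when reading event sequences."""
    parsed = [(datetime.strptime(t, "%Y-%m-%d %H:%M"), kind, asset)
              for t, kind, asset in events]
    count = 0
    for t_click, kind, asset in parsed:
        if kind != "Click Through":
            continue
        for t_view, kind2, asset2 in parsed:
            if (kind2 == "tVoD view" and asset2 == asset
                    and abs(t_view - t_click) <= window):
                count += 1
    return count

print(attributable_views(events))  # the Table 7 pattern counts as 1
```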

Group                    Total sVoD   Mean sVoD events   Total tVoD   Mean tVoD events   Click-through events
                         events       per customer       events       per customer       leading to tVoD viewing
Editorial                3473         69.46              148          2.96               28%
MyMedia Recommendation   11672        157.73             255          3.45               29%

Table 8: Viewing behaviour related to click-through events for triallists who viewed the Try This

recommendations.

Table 8 shows comparative statistics for each trial group with respect to sVoD events, tVoD events and

click-through events resulting from viewing the Try This recommendations page. It is notable that for


both groups just under one third of click-through events led to a viewing action. In spite of the very

small numbers involved, this is an encouraging rate of response. In this case, the MyMedia

recommendation system equals the performance of professionally-compiled editorial

recommendations, and given the context that the recommender is being driven by infrequent, implicit

feedback it could be argued that a higher level of performance would be difficult to achieve.

Examination of the combined event data for customers who accessed the Try This recommendations page also yields some 'weak signals' which cannot reasonably be attributed directly to recommendation, but to which recommendation may be a contributing factor:

• It can be seen from Table 8 that the group receiving MyMedia recommendations showed a

greater rate of tVoD viewing and a much greater rate of sVoD viewing. A simple statistical

summary of customer viewing frequency in each group shows that the standard deviation for

the MyMedia group is very high (250) compared with the Editorial group (78), suggesting that a

few high-volume users may be skewing the data for the MyMedia group. However, most

subscription customers in the MyMedia group did view recommendations prior to their

subscription viewing, and could have been influenced by this.

• In addition to click-through events influencing customer behaviour, there is evidence of tVoD

viewing events taking place immediately after customers have viewed the recommendations

page, suggesting that the recommendations have prompted them to choose a film using a

different part of the BT Vision system.

8.3.4 Comparing MyMedia recommendation with the FilmClub package

The evidence on access to the MyMedia and Editorial recommendations through the Try This page (user interface described in section 7.3 above) showed that none of the triallists in the five groups receiving both MyMedia and FilmClub recommendations had actually accessed the page during the period of the trial described here, so it was not possible to test the intended research question.

8.3.5 Predictability of MyMedia recommendation in the BT Vision trial

Since logging of users viewing the BT MyMedia recommendation page had shown that only a very small proportion of triallists actually viewed the pages (see section 8.3.3 above), the analysis of the performance of the MyMedia recommender system reported here was carried out to investigate the ability of the MyMedia recommender system, or of the Editorial recommendation choices, to predict user preferences, irrespective of whether users of the BT Vision service in the BT MyMedia trial had viewed the Try This page or not. This differs from the usual use of recommender evaluation metrics (see e.g. [1, 6]), where the metric is used to compare the performance of the recommender with the observed behaviour of users in responding to it.

The metric used was based on hit-rate: the Top-5-Hit-Rate-Normalised metric, so called because the top five recommendations generated by the recommender algorithm for a particular customer are the only ones that customer has the possibility of receiving. It was calculated in the following way:


• For each day during the field trial, calculate the total number of unique assets watched by a user. (So omit multiple viewings of an asset if viewed during the same day. For a film, which is what we are considering here, this is likely to be rare.)

• On each day when a user in the field trial has watched one or more assets, check how many of the five recommendations corresponded to what the user watched; this is the daily hit rate for that user.

• Generate the total unique assets watched for each user over the trial period by summing the daily totals of unique assets for that user.

• Generate the total hit rate for each user over the trial period by summing the daily hit rates for that user.

• Calculate the Top-5-Hit-Rate-Normalised as

    total_hit_rate / total_unique_assets_watched

for each user, and display the results as means according to the number of items viewed in the trial period. The higher the hit-rate, the more effective the recommendation is, since the hit-rate measures the accuracy of the recommendation algorithm in matching what the user has watched.
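The steps above can be sketched as follows; the input structures keyed by user and day are illustrative assumptions, not the trial's actual data format:

```python
from collections import defaultdict

# Sketch of the Top-5-Hit-Rate-Normalised calculation described above.
# `recs[(user, day)]` holds the five recommendations shown that day and
# `views[(user, day)]` the assets watched; both are hypothetical inputs.
def top5_hit_rate_normalised(recs, views):
    total_hits = defaultdict(int)
    total_unique = defaultdict(int)
    for (user, day), watched in views.items():
        unique = set(watched)                  # omit repeat viewings that day
        hits = unique & set(recs.get((user, day), []))
        total_hits[user] += len(hits)          # sum of daily hit rates
        total_unique[user] += len(unique)      # sum of daily unique assets
    return {u: total_hits[u] / total_unique[u]
            for u in total_unique if total_unique[u] > 0}

recs = {("stb1", "2010-08-13"): ["a", "b", "c", "d", "e"]}
views = {("stb1", "2010-08-13"): ["b", "x", "b"]}   # "b" was recommended
print(top5_hit_rate_normalised(recs, views))        # {'stb1': 0.5}
```

Here the user watched two unique assets, one of which was among the five recommendations, giving a normalised hit-rate of 0.5 for that user.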

For the BT MyMedia field trial the following results were obtained in the months of August and

September:


Month   Group           Assets¹     >500   >200-500   >100-200   >50-100   >20-50   >10-20   >5-10   1-5     All²
Aug     Editorial       Hit-rate³   -      -          -          -         0.0424   0.0223   0.0244  0.0290  0.0283
                        Viewers⁴    0      0          0          0         10       157      1386    10538   12091
        MyMedia BPRMF   Hit-rate    -      -          -          -         0.0074   0.0098   0.0093  0.0080  0.0081
                        Viewers     0      0          0          0         14       152      1393    10470   12029
Sept    Editorial       Hit-rate    -      -          -          0         0.0161   0.0136   0.0062  0.0083  0.0082
                        Viewers     0      0          0          2         5        79       787     9245    10118
        MyMedia BPRMF   Hit-rate    -      -          -          -         0.0267   0.0163   0.0089  0.0085  0.0086
                        Viewers     0      0          0          0         5        74       773     9453    10305

Table 9. Predictive capability of the Editorial recommendations versus the MyMedia BPRMF

recommendations in the BT field trial, evaluated by hit-rate.

¹ Upper and lower boundaries on the total number of unique assets viewed by a user (identified by STB ID) during that month.

² Total number of viewers who viewed at least 1 tVoD film item during that month. This is thus smaller than the total number of viewers across all tVoD categories, and smaller than the number of users across all VoD. It does not include all triallists because not all triallists viewed at least 1 tVoD film during that month.

³ Hit-rate as calculated by the method described above.

⁴ Number of viewers in a group viewing a total number of unique assets within the boundaries defined by Assets during that month.


What conclusions can we draw from this? It confirms evidence from the earlier analysis that many BT Vision customers, irrespective of their trial group status, do not watch very many items, while a few watch somewhat more.

Overall, the hit-rate results are not high for any group, during any month, at any level of activity in terms

of number of unique items viewed. The highest hit-rate, 0.0424, for a few participants in the Editorial

group who viewed over 20 unique items during August 2010, does not show a very good level of

prediction of user preferences, nor is it representative of any group in the trial, since there were only 10

viewers (identified by STB IDs) in that category.

It is better to look at the categories in which most users are found, those with lower numbers of

viewings per month. The results for the lower categories converge to the results for the overall group.

Looking at the groups overall, it is clear that the Editorial group produces higher hit-rates in August,

while the MyMedia BPRMF group produces slightly higher hit-rates in September, although in fact the

rates are very similar across the categories of lower numbers of viewings and more users, and converge

towards values for the MyMedia BPRMF group for August.

The MyMedia BPRMF recommender algorithm does not in itself appear to be a better means of predicting user preferences than the expertise of the editorial (marketing) team when the recommendations are not displayed to the majority of viewers (most viewers do not take up the opportunity to view recommendations, although they have it).

In this context the presentation of editorial recommendations does have an advantage, which may explain the substantial difference between the groups in August, where the hit-rate recorded for the Editorial group is higher. August is a holiday month and, as Table 4 and Figure 10 above show, a time of higher levels of tVoD viewing on the BT Vision service. Accordingly it is a time of considerable marketing effort, and the Editorial recommendations provided by the editorial team for the BT MyMedia field trial are consistent with the items marketed by other means. So even if members of the Editorial trial group did not view the recommendations page, and the evidence from section 8.3.3 suggests that this is the case for most members of the group, they may have seen similar items marketed by other means. By contrast the MyMedia recommender algorithm does not drive any additional marketing activities.


9 Conclusions

The BT MyMedia field trial was different from other field trials in the project in that:

• it was offered via an IPTV service

• it was offered via a live commercial service (BT Vision), and thus

• in order not to disrupt the planned marketing of this service, which predated the trial, it had to be delivered in a manner consistent with business requirements

• triallists were not made aware of their participation in the trial

• there was no form of explicit feedback available, given the user interaction possible with the trial system

• the overall trial involved many more participants than any of the other trials.

However, despite these differences it was similar in respect of:

• testing users' response to items recommended by the MyMedia software framework with the recommender algorithms developed within the project added to it

• testing users' response to recommendation in an interactive commercial service, as with the Microgenesis trial, although in a very different commercial context, in a different language and in a different European country

• providing means of comparing trial results with the conclusions of the other work packages of the MyMedia project in the context of recommender systems.

The BT MyMedia field trial was originally planned to take place in two components: a large quantitative field trial involving many thousands of participants, and a small qualitative trial involving a much smaller number of volunteer participants describing their user experience of the MyMedia recommender system in much more detail. Because of the challenges involved in developing the field trial for execution on the BT Vision system, which led to a later deployment than originally planned, it was not possible to retain the volunteer group for the qualitative trial, so it did not take place.

However, the quantitative trial, which involved closed pseudo-random sampling of triallists from the whole BT Vision customer base, without any direct contact with the customers, did take place, and was carried out on a very large scale: 50000 customers received recommendations from the MyMedia recommender system, and a control group of 50000 customers received recommendations from the BT Vision editorial (marketing) team. The focus of the trial was on the purchasing and viewing of film content (tVoD) items.

This was the basis for the analysis reported on in this deliverable. Overall descriptive statistics showed

little difference between the two trial groups: many of the customers did not purchase any film VoD

items during each month of the trial, and the distribution of purchasing activity was very skewed, with a

few customers being very active. There was no significant difference between the groups when

considered at this overall level.

However this did not demonstrate a failure of the MyMedia recommender algorithm to successfully

recommend items. When the number of instances of viewing the page on the BT Vision service where

the MyMedia or editorial recommendations were displayed was investigated, it was found that only a

very small number of triallists, less than 1% of either group, had actually viewed the page during the

period of the trial.

This clearly demonstrated the importance of designing the user interface of a recommendation system so that it is prominent and accessible to users encountering the system for the first time, confirming the results of studies reported in deliverable D1.4 [8] of the MyMedia project. As can be seen in section 7.3 above, the user of the BT Vision system has to pass through a number of pages before reaching the page on which recommendations are displayed.

However, further analysis of the actions of the minority of triallists who viewed the trial recommendation pages and clicked through to particular items showed that a substantial proportion went on to purchase and view (if tVoD), or view (if sVoD), items (see Table 8). Although there was little difference in tVoD viewing between the group receiving the MyMedia BPRMF algorithm-generated recommendations and the group receiving the Editorial recommendations, it showed that a recommender algorithm using only very sparse implicit feedback could compete with the expertise of the professional Editorial marketing team. Interestingly, the triallists from the MyMedia recommendation group who viewed the pages had a higher rate of sVoD viewing activity, suggesting that the algorithm-generated recommendations could generate greater engagement with other aspects of the BT Vision service, even if they were not recommending those aspects directly.

Additional tests of the ability of the MyMedia algorithm and the editorial recommendations to predict user preferences for purchasing and viewing items were carried out, to see whether they could still do this well when viewers had not accessed the pages. An evaluation metric based on hit-rate was used. The results confirmed the skewed level of purchasing and viewing activity identified earlier, with most users viewing only a few items. All hit-rates were low (no higher than 0.05), although it was interesting that in one month (September) hit-rate showed little difference between the editorial and recommendation groups, while in August the editorial group produced higher hit-rates. This appeared to be a result of the context in which the editorial recommendations were chosen: the editorial team also marketed the same items by other means, allowing triallists to learn about them even if they had not viewed the recommendation pages.


The BT MyMedia field trial demonstrated the importance of a clear and accessible user interface for a recommender system, something now being focused upon in the BT Vision system. It also showed the importance of recommender systems, as marketing tools, being integrated with other forms of marketing in a business context (in this case the editorial recommendations), and it demonstrated the need to understand the user experience in more detail at an individual level in addition to through a high-level quantitative study. Because of the very large scale of the BT MyMedia field trial and its need to align with BT Vision business decisions, it was not possible to carry out data analysis in a form consistent with the other field trials in the MyMedia project, but the information obtained supported the findings of the other field trials on the importance of deploying recommendation systems in appropriate contexts, with appropriate user interfaces and with means of interaction and feedback.

The BT MyMedia field trial, BT’s involvement in the MyMedia project, and the various “pre-trials” that

have been carried out in BT prior to this trial, have indicated the importance of recommendation as a

tool towards personalisation of consumer services to BT’s business, and have established worthwhile

links between a necessarily application-focused organisation and other partners in the recommendation

area who have research expertise and a relevant longer-term view. The overall product of this

collaboration has not just been the outcome of several field trials but also the release of the MyMedia

software framework, which offers the potential for other researchers to carry out other experiments

and field trials in the future.


10 References

[1] Meesters, L., Marrow, P., Knijnenburg, B., et al. (2009) MyMedia project deliverable D1.5 End-user

recommendation evaluation metrics. Downloadable from

http://www.mymediaproject.org/Deliverables.aspx

[2] Knijnenburg, B., Meesters, L., Marrow, P. & Bouwhuis, D. (2010) User-centric evaluation framework

for multimedia recommender systems. User Centric Media (Akan, O., Bellavista, P., Cao, J., Dressler, F.,

Ferrari, D., Gerla, M., Kobayashi, H., Palazzo, S., Sahni, S., Shen, X., Stan, M., Xiaohua, J., Zomaya, A.,

Coulson, G. , Daras, P. & Ibarra, O.M., eds.) Lecture Notes of the Institute for Computer Sciences, Social

Informatics and Telecommunications Engineering 40, 366-369. Springer: Berlin Heidelberg.

[3] Siegel, S. & Castellan, N.J. (1988) Nonparametric statistics for the Behavioural Sciences. McGraw-Hill:

New York.

[4] Maindonald, J. & Braun, J. (2003) Data Analysis and Graphics Using R – an Example-based Approach.

Cambridge University Press: Cambridge.

[5] Deshpande, M. & Karypis, G. (2004) Item-based Top-N recommendation algorithms. ACM Trans. Inf. Sys. 22(1), 143-177.

[6] Rendle, S., Tso-Sutter, K., Huijsen, W., et al. (2009) MyMedia project deliverable D4.1.2 State-of-the-

Art Recommender Algorithms. Downloadable from http://www.mymediaproject.org/Deliverables.aspx

[7] Wartena, C., Gazendam, L., Brusee, R., et al. (2010) MyMedia project deliverable D3.1.3 Metadata

Enrichment Modules Documentation and Software. Downloadable from

http://www.mymediaproject.org/Deliverables.aspx

[8] Meesters, L., Marrow, P., Matthews, I., et al. (2009) MyMedia project deliverable D1.4 User Control

Design Specification. User Control Test Results Report. Downloadable from

http://www.mymediaproject.org/Deliverables.aspx

11 Acknowledgements

We thank Tamas Jambor, a BT-funded PhD student at University College London, for his assistance with

the hit-rate metric described above, developed in the context of his PhD research.
