Retention of visitors coming to Palco 3.0 via the news in SAPO: Decoy optimization
Jindřich Tandler, Luboš Popelínský, Jan Kocourek
Knowledge Discovery Lab, FI, Masaryk University, Brno, Czech Republic
Introduction
Readers of SAPO (www.sapo.pt) are presented with Palco news.
Occasionally a news item appears that redirects the user to PalcoPrincipal (www.palcoprincipal.pt).
Problem: people who come to Palco this way usually don't stay there long enough.
Question: How can we help Palco retain them?
Main task
Task: reduce the current bounce rate of users coming to www.palcoprincipal.pt through news
Hypothesis: users who clicked on the same link (and are therefore viewing the same content) are more likely to share interests than users who did not.
The link(s) clicked are probably the only information we can obtain about users coming from www.sapo.pt
What else could be done?
Improving the loyalty of already registered users by exploiting the social network
Hypothesis: socially related users share interests
e.g. we can identify users of Palco who showed interest in particular items and use their activity to make suggestions to socially related users
Current work
Data has been collected into the database.
The social network structure has been extracted from the database into Pajek format.
Using R and the igraph library, we can compute the assortative mixing coefficient.
This can be used to verify the second hypothesis (i.e. socially related users share interests).
R in combination with igraph will also be used to explore the properties of the social network and to better understand its specific features.
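To make the assortative mixing coefficient concrete, the sketch below spells out Newman's formula for a categorical vertex attribute, r = (Σ e_ii − Σ a_i²) / (1 − Σ a_i²), on an undirected friendship graph. It is written in plain Python rather than R/igraph purely for illustration; the function name, the toy edge list and the "genre" attribute are assumptions, not part of the Palco data. Values near 1 would support the hypothesis that friends share interests, values near 0 would indicate no relation.

```python
from collections import defaultdict


def nominal_assortativity(edges, category):
    """Newman's assortativity coefficient for a categorical vertex attribute.

    edges:    list of (u, v) pairs of an undirected graph
    category: dict mapping each vertex to its category (hypothetical: a genre)
    """
    edges = list(edges)
    # e[(a, b)] counts edge ends joining category a to category b.
    e = defaultdict(float)
    for u, v in edges:
        cu, cv = category[u], category[v]
        # Count every undirected edge in both directions so e is symmetric.
        e[(cu, cv)] += 1.0
        e[(cv, cu)] += 1.0

    total = 2.0 * len(edges)
    a = defaultdict(float)   # a[c] = fraction of edge ends attached to category c
    same = 0.0               # fraction of edge ends inside one category
    for (ca, cb), count in e.items():
        frac = count / total
        a[ca] += frac
        if ca == cb:
            same += frac

    expected = sum(x * x for x in a.values())  # value expected under random mixing
    if abs(1.0 - expected) < 1e-12:
        raise ValueError("coefficient is undefined when all edges join one category")
    return (same - expected) / (1.0 - expected)


# Toy example: friends always share a genre, so mixing is perfectly assortative.
genre = {1: "rock", 2: "rock", 3: "jazz", 4: "jazz"}
r_same = nominal_assortativity([(1, 2), (3, 4)], genre)
# Friends never share a genre: perfectly disassortative.
r_diff = nominal_assortativity([(1, 3), (2, 4)], genre)
```

In R, igraph offers this computation as a built-in (`assortativity_nominal`), so the sketch is only meant to show what the number measures.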
Current work – network properties
Exploring social network properties to understand its specific features
Small-world phenomenon, degree distribution, degree correlations, community structure, ...
A new student has been assigned to help with this task (Pavel Kocourek)
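As an illustration of one of the properties listed above, the degree distribution of the friendship network can be computed directly from an edge list. This is a minimal Python sketch under assumed inputs (a list of undirected `(user_a, user_b)` pairs); the function names are hypothetical, and users without any friendship tie do not appear in an edge list, so they would have to be added separately.

```python
from collections import Counter


def vertex_degrees(edges):
    """Number of friendship ties per user, from undirected (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def degree_distribution(edges):
    """Fraction of users having each degree k (users with no ties excluded)."""
    degree = vertex_degrees(edges)
    histogram = Counter(degree.values())
    n = len(degree)
    return {k: count / n for k, count in sorted(histogram.items())}


def mean_degree(edges):
    """Average number of ties per user that appears in the edge list."""
    degree = vertex_degrees(edges)
    return sum(degree.values()) / len(degree) if degree else 0.0


# Toy star network: user 1 is connected to users 2, 3 and 4.
toy_edges = [(1, 2), (1, 3), (1, 4)]
toy_distribution = degree_distribution(toy_edges)
toy_mean = mean_degree(toy_edges)
```

A heavy-tailed distribution (a few users with very many ties) would be typical for social networks, but whether Palco's network looks like that is exactly what the exploration is meant to find out.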
Possible future work
Clustering
division of users into groups with similar characteristics can be used for content recommendation
Association rule mining
for discovering direct or indirect relationships between web pages in users' browsing behavior
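The association rule idea can be sketched on toy data as follows. The example mines only rules between pairs of pages ("visitors of page A also visit page B") using the standard support and confidence measures; a real study would more likely use a full algorithm such as Apriori. The session contents, page names and thresholds are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(sessions, min_support=0.1, min_confidence=0.5):
    """Find rules lhs -> rhs between pairs of pages.

    sessions: list of lists of page identifiers, one list per visit
    Returns a sorted list of (lhs, rhs, support, confidence) tuples.
    """
    n = len(sessions)
    page_count = Counter()
    pair_count = Counter()
    for pages in sessions:
        unique = sorted(set(pages))          # ignore repeated views within a visit
        page_count.update(unique)
        pair_count.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                  # share of visits containing both pages
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / page_count[lhs]   # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical visits; the page names are placeholders.
toy_sessions = [
    ["news", "artist"],
    ["news", "artist", "playlist"],
    ["news"],
    ["playlist"],
]
toy_rules = pair_rules(toy_sessions, min_support=0.5, min_confidence=0.7)
```

On this toy input the only rule that passes both thresholds is "artist implies news": every visit that viewed the artist page also viewed the news page.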
Data needed
Actual rate of bouncing, time spent on the pages, etc. (Google Analytics?)
Clickstream data – browsing history of particular users (important for the main task)
Clickstream data assigned to the registered Palco users could be useful for mining as well
(privacy issues?)
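If clickstream data were available, the bounce rate and time on site could be derived from it directly. The sketch below assumes a hypothetical record layout of `(session_id, timestamp_in_seconds, url)` and uses the common definition of a bounce as a session with a single page view; the actual format of the Palco logs is not known here.

```python
from collections import defaultdict


def sessions_from_clicks(clicks):
    """Group (session_id, timestamp_seconds, url) records by session."""
    sessions = defaultdict(list)
    for session_id, timestamp, url in clicks:
        sessions[session_id].append((timestamp, url))
    return sessions


def bounce_rate(clicks):
    """Share of sessions that consist of exactly one page view."""
    sessions = sessions_from_clicks(clicks)
    if not sessions:
        return 0.0
    bounces = sum(1 for views in sessions.values() if len(views) == 1)
    return bounces / len(sessions)


def mean_time_on_site(clicks):
    """Mean seconds between first and last page view, over non-bounce sessions."""
    sessions = sessions_from_clicks(clicks)
    durations = []
    for views in sessions.values():
        if len(views) > 1:
            times = [t for t, _ in views]
            durations.append(max(times) - min(times))
    return sum(durations) / len(durations) if durations else 0.0


# Hypothetical clickstream: sessions s1 and s3 leave after the landing page.
toy_clicks = [
    ("s1", 0, "/news/1"),
    ("s2", 10, "/news/1"),
    ("s2", 70, "/artist/x"),
    ("s3", 20, "/news/2"),
]
toy_bounce = bounce_rate(toy_clicks)
toy_time = mean_time_on_site(toy_clicks)
```

Note that time on site cannot be measured for bounced sessions from page views alone, which is why they are excluded from the average.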
Some other questions
To what extent is it possible to change the layout and content of Palco pages?
What does it take to change something? (time, people involved, …)
What algorithm is currently being used for content recommendation?
Is it possible to perform A/B testing of the site with the old and a new layout/content?
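Should A/B testing turn out to be possible, one simple way to compare the bounce rates of the old and the new layout is a two-proportion z-test. The following is a sketch under that assumption, not a prescribed evaluation method; the visit and bounce counts are invented.

```python
import math


def two_proportion_z_test(bounces_a, visits_a, bounces_b, visits_b):
    """Two-sided z-test for a difference between two bounce rates.

    Returns (z, p_value). Uses the pooled standard error, which assumes
    both variants share one bounce rate under the null hypothesis.
    """
    rate_a = bounces_a / visits_a
    rate_b = bounces_b / visits_b
    pooled = (bounces_a + bounces_b) / (visits_a + visits_b)
    variance = pooled * (1.0 - pooled) * (1.0 / visits_a + 1.0 / visits_b)
    if variance == 0.0:
        raise ValueError("test is undefined when nobody or everybody bounces")
    z = (rate_a - rate_b) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: old layout (A) against new layout (B).
z_ab, p_ab = two_proportion_z_test(600, 1000, 500, 1000)
```

A small p-value would suggest that the difference in bounce rate is unlikely to be due to chance alone; the normal approximation is only reasonable when each variant has a fairly large number of visits.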
Data overview
Data – statistics (February 2010)
Query             No. of results
users             68314
listeners         50663 (74%)
artists           17660 (26%)
friendship ties   168342 (avg 2.5)
mainstream bands  114062
fans              42228
comments          817370
tags              55018
Data – users
User           CreatedAt, LastLogin, IsActive
User_profile   genre_id, Name, Slug, Category, Type, Culture, ModerationStatus, About, CreatedAt, UpdatedAt
Artist         MusicUploads, GooglePageRank, GooglePageRankUpdatedAt
Friend         InviterUserId, InvitedUserId, StatusId, CreatedAt, UpdatedAt
Listener_data  GenderId, BirthDate
Data – contact
Contact  country_id, city_id, Place, Address, PhoneNumber, Homepage
Country
City
State
Data – activities
Owner_activities        ActionId, OwnerId, TargetId, CreatorId, CreatedAt, Slug
Activity_stream_action  ActionTag (new event added, band music added, ...), PrivacyTagTitle, PrivacySubscriberTagTitle, ActionTitle
Data – music
Playlist        listener_id, photo_id, Name, Description, Hash, Slug, Position, CreatedAt, UpdatedAt
Playlist_music  playlist_id, track_id, Position, CreatedAt
Track           album_id, Name, CreatedAt, UpdatedAt, Complete, Downloadable, Position, IdS3, FilenameS3, LinkS3, UploadedS3, Slug, Processed
Genre
Data – tags, comments
Tag       Culture, Name, Hash, Slug, CleanSlug, Length, CreatedAt
Comments  AuthorId, OwnerId, CommentableModel, CommentableId, Aproved, Text, CreatedAt
Data – bands
mainstreamBand  Summary, About, Name, CreatedAt, UpdatedAt, LastFmLink, Hash, Slug
fans            mainstreamBand_id, user_id
Albums          artist_id, photo_id, Name, Description, ReleaseDate, CreatedAt, UpdatedAt, Hash, Slug