Center for E-Business Technology, Seoul National University, Seoul, Korea

Socially Filtered Web Search: An approach using social bookmarking tags to personalize web search

Kay-Uwe Schmidt*, Tobias Sarnow*, Ljiljana Stojanovic**

*SAP Research, Vincenz-Prießnitz-Straße 1, 76131 Karlsruhe, Germany

**Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany

Symposium on Applied Computing (2009)

2009. 08. 13.

Summarized & presented by Babar Tareen, IDS Lab., Seoul National University

Copyright 2008 by CEBT

Introduction

Search engines do not consider current work context

Static results for all users

Server side personalization has limited use

Client side search engines rely on additional terms extracted from documents, thus not scalable

Social Bookmarking based search result personalization addresses these issues

Related Work

Google History

goZone.com

Mahalo.com

UCAIR

Motivation

A developer is looking for guidelines for testing database code

Visits

– www.ibm.com/db2

– www.hsqldb.org

Searches Google for “Test”

Original Results

– Web based certification

– Personality test

– Bandwidth test

Personalized Results

– DB2 training

– DB2 programming test

Personalizing Search Results

Tracking browsing behavior

Create user model

– URLs

– Tags fetched from Delicious

Issue original query

Enhance search query by adding tags

Issue new query

Display both results

Tags given by a community of users provide a good summary of web page content

URL               Tags (Metadata)
www.youtube.com   video, youtube, entertainment, web2.0
www.amazon.com    shopping, books, amazon, music
www.snu.ac.kr     university, snu, korea, 서울대
www.hsqldb.org    database, java, sql, opensource
www.ibm.com/db2   ibm, db2, database, unix
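
The query-enhancement step can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the names `USER_MODEL`, `top_tags` and `enhance_query` are hypothetical, and picking the tag shared by the most visited pages is an assumed stand-in for the metric described later. The tag lists are copied from the table.

```python
from collections import Counter

# Hypothetical user model: visited URL -> community tags (copied from the table).
USER_MODEL = {
    "www.hsqldb.org": ["database", "java", "sql", "opensource"],
    "www.ibm.com/db2": ["ibm", "db2", "database", "unix"],
}

def top_tags(user_model, k=1):
    """Return the k tags that occur on the most visited pages."""
    counts = Counter(tag for tags in user_model.values() for tag in tags)
    return [tag for tag, _ in counts.most_common(k)]

def enhance_query(query, user_model, k=1):
    """Append the top-k tags that are not already part of the query."""
    terms = query.lower().split()
    extra = [tag for tag in top_tags(user_model, k) if tag not in terms]
    return " ".join([query] + extra)

enhanced = enhance_query("test", USER_MODEL)
print(enhanced)
```

With the two database sites from the motivation example in the user model, the only tag shared by both pages is "database", so the query "test" is expanded with that tag.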

Architecture [1]

Search Module

– Carries out original query

– Inserts space (<DIV>) for personalized results

Metric Module

– Includes a metric that delivers a tag for personalized search

Search Enhancer Module

– Combines search string with metric module tags

Metadata Module

– Extracts metadata for a visited website from Delicious
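
One way to picture how the four modules fit together is the sketch below. It is an assumption-laden illustration: the class and method names are invented for this example, the tag lookup and the search call are injected as plain functions (no real Delicious or search-engine API is used), and the metric is reduced to "most frequent tag".

```python
class MetadataModule:
    """Stores tags for visited URLs; the tag lookup is injected (hypothetical)."""
    def __init__(self, fetch_tags):
        self.fetch_tags = fetch_tags
        self.user_model = {}  # url -> list of tags

    def record_visit(self, url):
        if url not in self.user_model:
            self.user_model[url] = self.fetch_tags(url)

class MetricModule:
    """Delivers one tag; here simply the tag seen on the most visited pages."""
    def best_tag(self, user_model):
        counts = {}
        for tags in user_model.values():
            for tag in tags:
                counts[tag] = counts.get(tag, 0) + 1
        return max(counts, key=counts.get) if counts else None

class SearchEnhancerModule:
    """Combines the search string with the tag from the metric module."""
    def enhance(self, query, tag):
        return f"{query} {tag}" if tag else query

class SearchModule:
    """Runs the original and the enhanced query and returns both result lists."""
    def __init__(self, search):
        self.search = search

    def run(self, query, enhanced_query):
        return {"original": self.search(query),
                "personalized": self.search(enhanced_query)}

# Stand-ins for the real tag source and search engine (made up for the example).
FAKE_TAGS = {
    "www.hsqldb.org": ["database", "java", "sql", "opensource"],
    "www.ibm.com/db2": ["ibm", "db2", "database", "unix"],
}
metadata = MetadataModule(lambda url: FAKE_TAGS.get(url, []))
for url in FAKE_TAGS:
    metadata.record_visit(url)

tag = MetricModule().best_tag(metadata.user_model)
enhanced = SearchEnhancerModule().enhance("test", tag)
results = SearchModule(lambda q: [f"result for '{q}'"]).run("test", enhanced)
print(results)
```

Injecting the lookup and search functions keeps the sketch runnable offline; in the add-on these would presumably be network calls.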

Architecture [2]

Built as an add-on for

– Firefox

– Internet Explorer

Metric [1]

Two datasets

– Collection of visited websites

– Tags for each website

Query the last 20 distinct websites from the user model

– Format: (url, count)

– Sorted by weight ‘γ’
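
A minimal sketch of this user-model query is shown below. The slides do not define how the weight γ is computed, so the sketch assumes γ is simply the visit count; the function name `last_distinct_sites` and the sample history are made up for illustration.

```python
from collections import Counter

def last_distinct_sites(history, limit=20):
    """history: visited URLs, oldest first.

    Returns (url, count) pairs for the `limit` most recently visited
    distinct URLs, sorted by count (the assumed weight), highest first.
    """
    counts = Counter(history)
    recent = []
    for url in reversed(history):
        if url not in recent:
            recent.append(url)
        if len(recent) == limit:
            break
    return sorted(((url, counts[url]) for url in recent),
                  key=lambda pair: pair[1], reverse=True)

history = ["www.ibm.com/db2", "www.hsqldb.org", "www.ibm.com/db2",
           "www.youtube.com", "www.ibm.com/db2", "www.hsqldb.org"]
sites = last_distinct_sites(history, limit=2)
print(sites)
```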

Metric [2]

Tags assigned to a website

– Format: (tag, number of users)

t → tags assigned to a website

T → tags for all websites
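
The formula that combines t and T is not reproduced in this transcript. The sketch below therefore uses an assumed scoring rule, not the paper's metric: each site's tags are weighted by the share of users who assigned them, multiplied by the site weight, and summed across sites. The function name `score_tags` and all user counts are invented for the example.

```python
def score_tags(sites, site_tags):
    """sites: list of (url, weight); site_tags: url -> list of (tag, users).

    Returns T as a dict mapping every tag to its accumulated score.
    Assumed rule: score += site weight * (users of tag / users of all tags on site).
    """
    T = {}
    for url, weight in sites:
        t = site_tags.get(url, [])          # tags assigned to this website
        total_users = sum(users for _, users in t)
        if total_users == 0:
            continue                        # no usable tag data for this site
        for tag, users in t:
            T[tag] = T.get(tag, 0.0) + weight * users / total_users
    return T

# Made-up weights and user counts, for illustration only.
sites = [("www.ibm.com/db2", 3), ("www.hsqldb.org", 1)]
site_tags = {
    "www.ibm.com/db2": [("db2", 60), ("database", 40)],
    "www.hsqldb.org": [("database", 50), ("java", 50)],
}
T = score_tags(sites, site_tags)
best_tag = max(T, key=T.get)
print(best_tag, T)
```

Under this rule a tag that dominates a heavily visited site can outrank a tag shared by several lightly visited ones, which is one possible reading of "generalization vs. specialization" in tag choice.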

Algorithm

Result

Evaluation

How effective can this be?

Can Social Bookmarking Improve Web Search?

Paul Heymann, Georgia Koutrika, Hector Garcia-Molina

Dept. of Computer Science, Stanford University

USA

Web Search and Data Mining 2008

Positive Factors [1]

URLs

Pages posted on delicious are often recently modified

– Delicious users post interesting pages that are actively updated or have been recently created

Approximately 25% of URLs posted by users are new, unindexed pages

– Delicious can serve as a small data source for new web pages and can help with crawl ordering

Roughly 9% of results for search queries are URLs present in delicious

– Delicious URLs are disproportionately common in search results compared to their coverage

While some users are more prolific than others, the top 10% of users only account for 56% of the posts

– Delicious is not highly reliant on a relatively small group of users

Positive Factors [2]

URLs

30-40% of URLs and approximately one in eight domains posted were not previously in delicious.

– Delicious has relatively little redundancy in page information

Tags

Popular query terms and tags overlap significantly

– Delicious may be able to help with queries where tags overlap with query terms

In this study, most tags were deemed relevant and objective by users

– Tags are on the whole accurate

Negative Factors

URLs

Approximately 120,000 URLs are posted to delicious each day

– The number of posts per day is relatively small; for instance, it represents 1/10 of the number of blog posts per day

There are roughly 115 million public posts, coinciding with about 30-50 million unique URLs

– The total number of posts is relatively small; for instance, this is a small portion of the web as a whole (perhaps 1/1000)

Tags

Tags are present in the page text of 50% of the pages they annotate

– A substantial proportion of tags are obvious in context, and many tagged pages would be discovered by a search engine

Domains are often highly correlated with particular tags and vice versa

– It may be more efficient to train librarians to label domains than to ask users to tag pages
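
The URL figures on this slide imply a few numbers that are not stated outright. The short calculation below derives them; it only rearranges the quoted values, and it reads "perhaps 1/1000" as the ratio of unique Delicious URLs to web pages, which is an interpretation rather than something the slide spells out.

```python
urls_per_day = 120_000
# "1/10 of the number of blog posts per day" implies this many blog posts:
blog_posts_per_day = urls_per_day * 10

public_posts = 115_000_000
unique_urls_low, unique_urls_high = 30_000_000, 50_000_000

# Average number of posts per unique URL, for both ends of the range.
posts_per_url_min = public_posts / unique_urls_high
posts_per_url_max = public_posts / unique_urls_low

# Web size implied if unique URLs are about 1/1000 of the web (assumed reading).
implied_web_pages_low = unique_urls_low * 1000
implied_web_pages_high = unique_urls_high * 1000

print(blog_posts_per_day)
print(round(posts_per_url_min, 1), round(posts_per_url_max, 1))
print(implied_web_pages_low, implied_web_pages_high)
```

So each unique URL was posted roughly two to four times on average, and the 1/1000 estimate corresponds to a web of tens of billions of pages.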

Discussion

Query expansion model based on Social tagging

What is the probability of finding tags for a random URL on delicious.com?

Generalization vs. Specialization
