Page Ranking Techniques In Search Engines


Page 1: Page Ranking Techniques In Search Engines

Page Ranking Techniques In Search Engines

Page 2: Page Ranking Techniques In Search Engines

Introduction

Need: the increasing need for search engines.

Search results should be ordered by relevancy and importance.

What is page ranking?

Page 3: Page Ranking Techniques In Search Engines

Algorithms

HITS (Hyperlink-Induced Topic Search)

e.g. AltaVista

PageRank

e.g. Google.

Page 4: Page Ranking Techniques In Search Engines

Definition: PageRank. We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor, which can be set between 0 and 1. We usually set d to 0.85. ... C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

 

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Ref: Sergey Brin and Lawrence Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine"

http://www-db.stanford.edu/~backrub/google.html
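The formula can be applied repeatedly to a whole link graph. The following is a minimal Python sketch under stated assumptions, not the implementation used by any search engine: the names `pagerank`, `links`, `initial`, `tol` and `max_iter` are illustrative, and the sketch updates all pages at once from the previous round's values, which is only one possible update order.

```python
def pagerank(links, d=0.85, initial=1.0, tol=1e-9, max_iter=1000):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all pages T linking to A.

    links maps each page to the list of pages it links to (hypothetical format).
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    pr = {p: initial for p in pages}
    out_count = {p: len(links.get(p, [])) for p in pages}  # C(p)

    for _ in range(max_iter):
        # Every page starts each round with the (1-d) term.
        new_pr = {p: 1.0 - d for p in pages}
        # Each page passes d * PR/C to every page it links to.
        for src, targets in links.items():
            for dst in targets:
                new_pr[dst] += d * pr[src] / out_count[src]
        # Stop once no value moved by more than tol in this round.
        if max(abs(new_pr[p] - pr[p]) for p in pages) < tol:
            return new_pr
        pr = new_pr
    return pr


# Two pages that link to each other.
ranks = pagerank({"A": ["B"], "B": ["A"]})
print(ranks)
```

Pages without outgoing links simply pass nothing on in this sketch; other treatments of such pages exist and are not covered here.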

Page 5: Page Ranking Techniques In Search Engines

How to use the formula

e.g. two pages, A and B, pointing to each other.

A <-> B

Page 6: Page Ranking Techniques In Search Engines

Start with PR(A) = PR(B) = 1

PR(A) = (1-d) + d * (PR(B)/C(B))

= (1-0.85) + 0.85 * (1/1) = 1

PR(B) = (1-d) + d * (PR(A)/C(A)) = (1-0.85) + 0.85 * (1/1) = 1

Page 7: Page Ranking Techniques In Search Engines

Let's start with PR(A) = PR(B) = 10

After 1st iteration:

PR(A) = (1-d) + d*(PR(B)/C(B))

= 0.15 + 0.85 * (10/1)

= 8.65

PR(B) = (1-d) + d*(PR(A)/C(A))

= 0.15 + 0.85 * (8.65/1)

= 7.50

Page 8: Page Ranking Techniques In Search Engines

After 2nd iteration:

PR(A) = (1-d) + d*(PR(B)/C(B))

= 0.15 + 0.85 * (7.50/1)

= 6.527

PR(B) = (1-d) + d*(PR(A)/C(A))

= 0.15 + 0.85 * (6.527/1)

= 5.698

And so on... until when?

Page 9: Page Ranking Techniques In Search Engines

Ans: Iterations should be repeated until the PR values converge.

In this example, until PR(A) = PR(B) = 1.

Thus we can start with any PR values, and should repeat the iterations until the PR values converge, i.e. until they no longer change much between iterations.
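The two-page iteration can be reproduced with a short Python sketch. It starts from PR(A) = PR(B) = 10 and, as in the worked steps, computes PR(B) from the PR(A) just obtained in the same pass. The stopping threshold of 1e-6 and the cap of 200 passes are assumptions chosen for illustration.

```python
d = 0.85
pr_a = pr_b = 10.0          # arbitrary starting values
history = []                # (PR(A), PR(B)) after each pass

for iteration in range(1, 201):
    old_a, old_b = pr_a, pr_b
    pr_a = (1 - d) + d * (pr_b / 1)   # C(B) = 1
    pr_b = (1 - d) + d * (pr_a / 1)   # C(A) = 1, uses the updated PR(A)
    history.append((pr_a, pr_b))
    # Converged: neither value changed by more than the threshold.
    if max(abs(pr_a - old_a), abs(pr_b - old_b)) < 1e-6:
        break

for i, (a, b) in enumerate(history[:2], start=1):
    print(f"iteration {i}: PR(A) = {a:.4f}, PR(B) = {b:.4f}")
print(f"stopped after {iteration} iterations: PR(A) = {pr_a:.4f}, PR(B) = {pr_b:.4f}")
```

The first two passes match the values worked out by hand, and the loop stops with both values close to 1.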

Page 10: Page Ranking Techniques In Search Engines

Difference between:

The result of the PR calculation.

Google Toolbar values.

Page 11: Page Ranking Techniques In Search Engines

Examples

Assumption: we take the initial PR value of each page as 1.0.

Page 12: Page Ranking Techniques In Search Engines

Example 1

A   B (two pages with no links between them)

PR(A) = (1-d) + d (0) = 0.15

PR(B) = (1-d) + d (0) = 0.15

 

To practice examples on PageRank, use the calculator: www.webworkshop.net/pagerank_calculator.php?lnks=2,10,15&iblprs=0.15,0.15,0.15,0.15&pgnms=&pgs=2&initpr=1&its=100&type=simple

Page 13: Page Ranking Techniques In Search Engines

Example 2

B -> A (B links to A)

PR(A) = (1-d) + d (PR(B) / C(B)) = 0.15 + 0.85 (1/1) = 1 (first iteration, using the initial value PR(B) = 1)

PR(B) = (1-d) + d (0) = 0.15

Dangling links are links that go to pages that have no outbound links.

Orphan pages are pages that have no inbound links.
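The two definitions can be checked mechanically on a link graph. The sketch below is illustrative only: the function name `find_dangling_and_orphans` and the dictionary format for `links` are assumptions, and the sample graph is the single link from B to A implied by the formulas in this example.

```python
def find_dangling_and_orphans(links):
    """Return (dangling_links, orphan_pages) for a graph given as page -> list of targets."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # A dangling link points to a page that has no outbound links.
    dangling_links = sorted(
        (src, dst)
        for src, targets in links.items()
        for dst in targets
        if not links.get(dst)
    )

    # An orphan page is never the target of any link.
    linked_to = {dst for targets in links.values() for dst in targets}
    orphan_pages = sorted(pages - linked_to)
    return dangling_links, orphan_pages


dangling_links, orphan_pages = find_dangling_and_orphans({"B": ["A"]})
print(dangling_links)   # the link from B to A is dangling, because A links nowhere
print(orphan_pages)     # B is an orphan, because nothing links to it
```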

Page 14: Page Ranking Techniques In Search Engines

Example 3

From here onwards, the final PR value of each page, after a sufficient number of iterations, is shown inside the page.

Diagram 1:

A 1.0

B 1.0

C 1.0

Diagram 2:

A 1.0

B 1.0

C 1.0

Page 15: Page Ranking Techniques In Search Engines

Example 4

Observation: we can channel a large proportion of a site's PR to a particular page.

A 1.85

B 0.575

C 0.575

Page 16: Page Ranking Techniques In Search Engines

Example 5

Observation: we can reduce PR leakage by increasing internal linking.

Diagram 1:

C 1.255

A 2.6

B 1.255

External Site 1 1.0

External Site 2 1.215

Diagram 2:

External Site 1 1.0

A 1.0

B 0.575

C 0.575

External Site 2 0.638

Page 17: Page Ranking Techniques In Search Engines

Example 5 (continued)

External Site 1 1.0

A 2.146

B 1.549

C 1.720

External Site 2 1.215

Page 18: Page Ranking Techniques In Search Engines

How to increase PR?

Add spam pages.

Join forums.

Submit to search engine directories.

Exchange reciprocal links.

Add content.

Page 19: Page Ranking Techniques In Search Engines

Adding spam pages

A 331.0

B 281.6

Spam 1 0.39

Spam 2 0.39

...

Spam 1000 0.39
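The figures above can be approximated with a short Python sketch. The link structure is an assumption, because the diagram is not preserved in this text: A links to B, B links to A and to each of the 1000 spam pages, and each spam page links back to A. The function name `pagerank`, the initial value of 1.0 and the stopping threshold are likewise illustrative choices.

```python
def pagerank(links, d=0.85, tol=1e-6, max_iter=1000):
    """Repeat PR(A) = (1-d) + d * sum(PR(T)/C(T)) until the values settle.

    Assumes every page appears as a key in links and has at least one outbound link.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_pr = {page: 1.0 - d for page in links}
        for src, targets in links.items():
            share = d * pr[src] / len(targets)   # d * PR(src) / C(src)
            for dst in targets:
                new_pr[dst] += share
        delta = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if delta < tol:
            break
    return pr


# Assumed structure: A -> B, B -> A and every spam page, every spam page -> A.
spam_pages = [f"Spam {i}" for i in range(1, 1001)]
links = {"A": ["B"], "B": ["A"] + spam_pages}
links.update({page: ["A"] for page in spam_pages})

ranks = pagerank(links)
print(f"A = {ranks['A']:.1f}, B = {ranks['B']:.1f}, Spam 1 = {ranks['Spam 1']:.2f}")
```

Under this assumed structure the computed values land close to the figures listed above. The total PR across all 1002 pages stays at 1002, so the spam pages add PR to the site as a whole while the links concentrate most of it on A and B.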

Page 20: Page Ranking Techniques In Search Engines

Conclusion.

Even though the formula for calculating PageRank looks difficult, it is easy to understand. But when a simple calculation is applied hundreds of times, the results can seem complicated, and we cannot easily predict the outcome of these iterations. More practice yields more observations.

PageRank is an important factor in Google's ranking, but it is only one of several. For example, Google nowadays pays a lot of attention to a link's anchor text when deciding the relevancy of the target page.

But since PageRank is one of the important factors, one should be well aware of it when designing a website.

Page 21: Page Ranking Techniques In Search Engines

References.

http://www.webworkshop.net/pagerank.html

http://www.iprcom.com/papers/pagerank/

http://www-db.stanford.edu/~backrub/google.html

http://www.google.com/intl/en/technology/

http://www.google-watch.org/pagerank.html

Page 22: Page Ranking Techniques In Search Engines

?

Page 23: Page Ranking Techniques In Search Engines

Thanks