Title: Improved Link-Based Algorithms for Ranking Web Pages

(NYU-CS-TR846)

Authors: Ziyang Wang 


Abstract:

Several link-based algorithms, such as PageRank [19], HITS
[15] and SALSA [16], have been developed to evaluate the
popularity of web pages. These algorithms can be interpreted as computing
the steady-state distribution of various
Markov processes over web pages. The PageRank and HITS
algorithms tend to over-rank tightly interlinked collections
of pages, such as well-organized message boards. We show
that this effect can be alleviated using a number of modifications to the
underlying Markov process. Specifically, rather
than weighting all outlinks from a given page equally, greater
weight is given to links between pages that are, in other respects,
farther apart in the Web, and less weight is given to
links between pages that are nearby. We have experimented
with a number of variants of this idea, using a number of
different measures of "distance" in the Web, and a number of
different weighting schemes. We show that these revised
algorithms often do avoid the over-ranking problem and give
better overall rankings.
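
As a rough illustration of the idea described above, the following minimal
sketch runs a PageRank-style power iteration in which each outlink's
transition probability is proportional to a distance score instead of being
uniform. It is not the report's algorithm: the function names
(weighted_pagerank, site_distance, uniform_distance), the "site/page" naming
of pages, the damping value 0.85, and the cross-site weight of 4.0 are all
assumptions chosen for this example, and the distance measure is only a
stand-in for the measures the report studies.

```python
def weighted_pagerank(links, distance, damping=0.85, iterations=100):
    """PageRank-style power iteration with distance-weighted outlinks.

    links:    dict mapping each page to a list of pages it links to.
    distance: hypothetical function (source, target) -> non-negative weight;
              larger values mean the two pages are "farther apart".
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)

    # Row-normalise the distance weights into transition probabilities.
    trans = {}
    for p in pages:
        weights = {q: distance(p, q) for q in links.get(p, [])}
        total = sum(weights.values())
        trans[p] = {q: w / total for q, w in weights.items()} if total > 0 else {}

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Pages with no usable outlinks spread their rank uniformly.
        dangling = sum(rank[p] for p in pages if not trans[p])
        new = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q, prob in trans[p].items():
                new[q] += damping * rank[p] * prob
        rank = new
    return rank


def uniform_distance(source, target):
    # Baseline: every outlink gets the same weight, as in plain PageRank.
    return 1.0


def site_distance(source, target):
    # Assumed stand-in for "distance": pages on the same site are near
    # (weight 1.0); pages on different sites are far (weight 4.0).
    same_site = source.split("/")[0] == target.split("/")[0]
    return 1.0 if same_site else 4.0


# A tightly interlinked site "a" (e.g. a message board) and one outside page.
links = {
    "a/1": ["a/2", "a/3", "b/home"],
    "a/2": ["a/1", "a/3"],
    "a/3": ["a/1", "a/2"],
    "b/home": ["a/1"],
}

uniform = weighted_pagerank(links, uniform_distance)
weighted = weighted_pagerank(links, site_distance)
print("uniform: ", {p: round(r, 3) for p, r in uniform.items()})
print("weighted:", {p: round(r, 3) for p, r in weighted.items()})
```

On this toy graph the distance-weighted variant shifts rank from the
interlinked pages a/2 and a/3 toward the outside page b/home, which is the
qualitative effect the abstract describes. Whether that carries over to real
Web graphs depends on the distance measure and weighting scheme chosen.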