PageRank
From TrustLet, a free, collaborative project for collecting and analyzing information about trust metrics.
PageRank is one of the algorithms powering the Google search engine. It was proposed by Larry Page and Sergey Brin in [1].
It considers a link from web page A to web page B as a vote for B, and hence as a trust statement.
It works on an unweighted graph (since HTML links don't have weights) and computes a global reputation value for every page, called the PageRank of the page, which represents its authority.
The basic assumption is that a web page that receives many incoming links is more authoritative than a web page that receives few. However, the authority of the linking page also matters: a few links from "popular" pages count for more than many links from "unpopular" pages.
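The voting idea above can be sketched as a short iterative computation. This is only a minimal illustration, not the implementation used by Google: the function name `pagerank`, the damping factor of 0.85, the even redistribution of rank from pages without outgoing links, and the four-page example graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank for a small graph.

    links maps each page to the list of pages it links to
    (assumed to contain no duplicate targets).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Assumption: rank held by pages without outgoing links
        # is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page splits its current rank equally among its targets.
        for p in pages:
            targets = links.get(p, [])
            for t in targets:
                new_rank[t] += damping * rank[p] / len(targets)
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page web: C receives the most links.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this example A and B each receive exactly one incoming link, but A's link comes from the top-ranked page C while B receives only half of A's vote, so A ends up ranked above B.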
There are also proposals for personalized PageRank, in which ranks are computed relative to a particular user's preferences rather than globally.
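One common way to personalize the computation is to replace the uniform "random jump" with a jump to pages the user prefers. The sketch below illustrates that idea under stated assumptions: the function name `personalized_pagerank`, the `preference` dictionary, the damping factor of 0.85, and the example graph are hypothetical and are not taken from any specific proposal.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=100):
    """Rank pages relative to a preference distribution.

    links maps each page to the list of pages it links to.
    preference maps preferred pages to non-negative weights
    (hypothetical input; pages not listed get weight zero).
    """
    pages = set(links) | set(preference)
    for targets in links.values():
        pages.update(targets)

    # Normalise the preferences into a probability distribution.
    total = sum(preference.values())
    teleport = {p: preference.get(p, 0.0) / total for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        # Assumption: rank of pages without outgoing links
        # returns to the preferred pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping + damping * dangling) * teleport[p]
                    for p in pages}
        # Each page splits its current rank equally among its targets.
        for p, targets in links.items():
            for t in targets:
                new_rank[t] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


# Two disconnected pairs of pages; the user only trusts page A.
links = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}
personal = personalized_pagerank(links, {"A": 1.0})
print(personal)
```

Because no path leads from A to C or D, those pages receive no rank at all from this user's point of view, whereas a global computation would treat both pairs alike.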
A good description of PageRank with simple examples is in [2].
References
- [1] Brin, S.; Page, L. (1998). "The anatomy of a large-scale hypertextual Web search engine". Computer Networks and ISDN Systems 30 (1-7): 107-117. Retrieved on 2007-08-02.
- [2] Austin, D. (2006). "How Google Finds Your Needle in the Web’s Haystack". American Mathematical Society Feature Column. Retrieved on 2007-08-05.
Code
- Suggested addition to NetworkX, based on this ticket. There is an iterative version (slow) and three versions that use the power method to find the dominant eigenvector: pure Python, NumPy with a full matrix, and the SciPy sparse matrix package.
- The Google PageRank Algorithm in 126 Lines of Python

