What is PageRank? 

Take an in-depth look at Google PageRank. We explain what PageRank is, how pages get it and share it (or don’t), how Google really uses it, and explore some tantalising hints for what else Google might be doing with it.

Named after Google co-founder Larry Page, PageRank is Google’s way of scoring the importance of a web page. The long definition from Google is:

“PageRank reflects our view of the importance of web pages by considering more than 500 million variables and 2 billion terms. Pages that we believe are important pages receive a higher PageRank and are more likely to appear at the top of the search results.

PageRank also considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value. We have always taken a pragmatic approach to help improve search quality and create useful products, and our technology uses the collective intelligence of the web to determine a page’s importance.”

The simple definition, as given by Google employee Matt Cutts in a presentation to a WordPress users conference, is "the number and importance of links pointing to you". In other words, Google treats the links to your web page as votes for its quality.
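To make the "votes" idea concrete, here is a minimal sketch of the classic, published form of PageRank. It is not how Google computes PageRank today. The function name pagerank, the toy_web link graph and the iteration count are all made up for illustration, and the damping value of 0.85 is the one suggested in the original paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Classic PageRank by repeated voting (illustrative only).

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: nothing links to "orphan".
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph "home" scores highest because every other page links to it, while "orphan" scores lowest because nothing links to it. A vote from a high-scoring page is worth more than a vote from a low-scoring one, which is the "importance" half of the simple definition.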

Getting PageRank is not automatic. There are many sites on the web with no PageRank at all, in many cases because of the poor quality of the site. Google can also manually adjust PageRank if a site "breaks the rules". A well-publicised example involved Google's own Japanese website, which had used a paid blogging campaign in an effort to boost its market share against Yahoo: Google reduced the site's PageRank from PR9 to PR5.

The pages that you link to, and therefore pass PageRank to, are also important. Linking out to too many low-quality pages can mark your own page as low quality. To avoid this, Google (and other search engines) state that webmasters should block such links using some form of robots exclusion, such as the rel=nofollow attribute. This is a signal to Google that you do not vouch for the quality of the page that you are linking to and that you don’t want to pass any PageRank on to that page. This was reiterated in a recent post on Matt Cutts’ blog about PageRank sculpting. Note that the "noindex" directive (either in the meta robots tag or in the X-Robots-Tag HTTP header) does not prevent a page from passing PageRank, although a page carrying that directive will not appear in Google’s index.
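As a rough illustration of what these signals look like in a page's markup, the sketch below uses Python's standard html.parser module to sort a page's links into followed and nofollow, and to note whether the page carries a noindex meta robots tag. The class name LinkAudit and the sample HTML are invented for this example, and the sketch does not look at the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collects followed and nofollow links, and notes a robots noindex."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # rel can hold several space-separated tokens, e.g. "nofollow external".
            rel_tokens = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_tokens:
                self.nofollowed.append(attrs["href"])
            else:
                self.followed.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            content = (attrs.get("content") or "").lower()
            directives = [d.strip() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True


# Made-up page: one ordinary link, one nofollow link, and a noindex tag.
sample_html = """
<html><head><meta name="robots" content="noindex"></head>
<body>
<a href="https://example.com/trusted">Trusted</a>
<a href="https://example.com/untrusted" rel="nofollow">Untrusted</a>
</body></html>
"""

audit = LinkAudit()
audit.feed(sample_html)
print("Followed:", audit.followed)
print("Nofollow:", audit.nofollowed)
print("Noindex:", audit.noindex)
```

The two checks are independent: the sample page is marked noindex, yet its ordinary link is still a followed link, which matches the point above that noindex keeps a page out of the index without stopping it from passing PageRank.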

The problem with talking about PageRank is that there are different types of PageRank. Firstly, there is Google’s Toolbar PageRank. This appears as a little green bar graphic on the Google Toolbar. Secondly, there is the Google Directory PageRank. Google Directory is essentially results from the DMOZ (ODP) Directory with a representation of Google’s Toolbar PageRank displayed alongside the page listing. Google has also discussed using different types of PageRank in the past.

However, these are all derived from Google’s internal PageRank. Toolbar PageRank, for example, is a non-linear 0 to 10 scale that represents internal PageRank. The relationship is probably logarithmic, although that is by no means certain. In answering a question about how PageRank is stored internally, Matt Cutts said:

“It’s more accurate to think of it as a floating-point number. Certainly our internal PageRank computations have many more degrees of resolution than the 0-10 values shown in the toolbar.”
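To show what a logarithmic relationship would mean in practice, here is a purely hypothetical sketch. Google has not published the real mapping, so the function name toolbar_value, the base of 8 and the idea of a score measured relative to 1.0 are all assumptions made up for illustration.

```python
import math


def toolbar_value(internal_score, base=8.0):
    """Map a hypothetical internal score onto a 0-10 log scale.

    The base and the scoring unit are guesses for illustration only;
    the real relationship is not public.
    """
    if internal_score < 1.0:
        return 0
    # Each whole step up the scale needs `base` times more score.
    steps = math.log(internal_score, base)
    return min(10, int(steps))


for score in (0.5, 20, 200, 2000, 10 ** 12):
    print(score, "->", toolbar_value(score))
```

If the scale works anything like this, moving from 5 to 6 would take far more additional link value than moving from 1 to 2, and two pages showing the same toolbar value could still be a long way apart internally.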

Internal PageRank is not published and remains a closely guarded secret. Although the original maths behind it has been well publicised, PageRank is calculated differently nowadays, as Matt Cutts mentioned in this blog post:

“Even when I joined the company in 2000, Google was doing more sophisticated link computation than you would observe from the classic PageRank papers. If you believe that Google stopped innovating in link analysis, that’s a flawed assumption. Although we still refer to it as PageRank, Google’s ability to compute reputation based on links has advanced considerably over the years. I’ll do the rest of my blog post in the framework of ‘classic PageRank’ but bear in mind that it’s not a perfect analogy”

Having a high PageRank is a good thing. A high PageRank means that Google will crawl a site more often, more deeply and earlier than sites with a lower PageRank. It also used to be an indicator of whether or not a page was in Google’s supplemental index, although Google no longer labels any results as supplemental.

The problem is that we can never be certain what the actual PageRank of a page is. We can look at Toolbar PageRank but it is, at most, a useful barometer of the way in which Google views any given page. Actual PageRank is only one of more than 200 (and counting) factors that Google uses to score a page, and Toolbar PageRank does not directly reflect actual PageRank. Google’s internal PageRank is calculated continually, as Matt Cutts explained in this video:

“Some data refreshes happen all the time. For example, we compute PageRank continually and continuously, so there’s always a bank of machines refining PageRank based on incoming data, and PageRank goes out all the time, any time there’s an update in our index, which happens pretty much every day.”

Toolbar PageRank, however, was at one time only updated every three to four months and so lagged behind the internal version. The updates are now less regular and Google has not announced one in some time. There is, for example, an update to Toolbar PageRank happening as this post is being written, with as yet no official word from Google. Additionally, PageRank doesn’t always directly correlate with rankings, although there is a tendency for higher PageRank pages to rank more highly in the results.

So obsessing about PageRank is not really time well spent. Interestingly, Google stopped displaying PageRank in Google Webmaster Tools in the middle of October. Perhaps this was to send the message that PageRank is really not that important?
