Strong Correlation between Facebook Likes and PageRank
First, let me say that everyone should take this study with a huge grain of salt. While I believe the data is intriguing, it does not prove anything specific. So, here goes.
I have long guffawed at the social graph and, in particular, its relationship to search engine optimization. I am quick to argue against anything that would imply that Google search results are meaningfully influenced by social activities. One of my most common points is that in the majority of open social websites, the social graph is closely mirrored by the link graph.
Take Digg for example. If you submit a story on Digg, it gets a link from your profile. If someone votes on that story, it receives a link from their vote history page. As more and more votes are tallied, more and more links arrive. As the story moves up the upcoming section, it gets more powerful links from pages that are closer to the homepage. If it hits the homepage, it gets that coveted high PR link. On a site like Digg, there is a clear correlation between social events and link creation.
The same tends to be true on closed social networks like Facebook. Liking a Facebook Fan page may trigger a wall post or a link from a list of those organizations you like. However, because these events are behind a login, the link graph is severed. Ostensibly, all that Googlebot would be able to access would be the Facebook login page and the common endpoints of Fan Pages, People Pages, and the site directory. While the site directory would allow link juice to flow through to Fan pages, because it lists every page, you would expect it not to flow juice in a pattern that is influenced by the social activity on the site.
I was interested, however, in whether or not there was a correlation between Facebook Likes and mozRank. Potentially, well-liked pages would attract external links or links from the “recently liked” section of People pages. I spidered 1000 Facebook Fan pages and compared the number of “Likes” to the mozRank, external mozRank, internal mozRank and the number of unique inbound linking domains. Unsurprisingly, there was little to no correlation between Likes and any of the others. But then my tin-foil hat took over. (One point worth noting: when comparing to mozRank, external mozRank and internal mozRank, I took the log of the Likes number. I also removed PR0 pages from the analysis. It is impossible to distinguish a PR0 that is so because it has too few links from a PR0 that just hasn’t been updated by Google yet. Since all pages start as PR0 in the toolbar, there is too much noise at the PR0 level to comfortably analyze.)
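The data preparation described in that note (log of Likes, PR0 pages dropped) can be sketched roughly as follows. This is only an illustration: the rows are invented, the names `pages` and `prepare` are hypothetical, and base 10 is an assumption, since the base of the log is not stated.

```python
import math

# Hypothetical sample rows: (fan page name, Likes, toolbar PageRank).
# These numbers are invented for illustration; they are not the study's data.
pages = [
    ("page_a", 120, 0),
    ("page_b", 4500, 3),
    ("page_c", 98000, 5),
    ("page_d", 1500000, 7),
    ("page_e", 300, 0),
]

def prepare(rows):
    """Drop PR0 rows and replace the raw Likes count with its log."""
    prepared = []
    for name, likes, pagerank in rows:
        if pagerank == 0:
            continue  # PR0 is too noisy: few links, or simply not updated yet
        if likes <= 0:
            continue  # the log is undefined for zero Likes
        # Base 10 is an assumption; the post only says "the Log".
        prepared.append((name, math.log10(likes), pagerank))
    return prepared

cleaned = prepare(pages)
for name, log_likes, pagerank in cleaned:
    print(f"{name}: log10(Likes)={log_likes:.2f}, PR={pagerank}")
```

Taking the log compresses the huge spread in Like counts so that a page with millions of Likes does not dominate a scatter plot against a 0 to 10 scale.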
Google has long been suspected of using alternative tactics to get access to data it wants without using Googlebot. Speculation constantly surrounds questions of whether Google Chrome, the Google Toolbar, Google Analytics or other Google tools are used to build out the link graph without necessarily indexing or displaying content. So I decided to determine if PageRank correlated with the log of Facebook Fan Likes.
A simple linear regression in Excel reveals what appears to be a direct, positive correlation between the log of Facebook Fan Likes and the PageRank of those fan pages, while SEOMoz mozRank, in comparison, shows a scattershot (what we would expect). Of course, linear regression is not the appropriate model, and the R-squared measurement wouldn’t give us a reasonable statistic, as the stair-step PageRank values are discrete rather than continuous data.
After consulting some individuals in the statistics community who know far more about this stuff than I do, I was pointed towards Spearman’s rank correlation coefficient. The correlation coefficient for mozRank vs the log of Facebook Fan Likes is -0.103, which is both in the wrong direction and decidedly weak. With PageRank, the correlation coefficient was 0.53, with both the one- and two-tailed p-values at less than 0.00001!
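For anyone who wants to try the same kind of test, here is a minimal sketch of Spearman's coefficient in plain Python: rank both columns (ties get the average rank), then take the ordinary Pearson correlation of the ranks. The data is invented and the function names are my own; it does not compute a p-value, and it is not necessarily how the numbers above were produced.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Ordinary Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical data, PR0 pages already removed. Not the study's sample.
likes = [4500, 98000, 1500000, 800, 22000, 310000]
pagerank = [3, 5, 7, 2, 4, 5]

log_likes = [math.log10(v) for v in likes]
rho = spearman(log_likes, pagerank)
print(f"Spearman rho = {rho:.3f}")
```

Because only the ordering matters, the result is the same whether you feed in raw Likes or their log. The log matters for the regression plot, not for this coefficient. The tie handling matters a great deal here, since toolbar PageRank only takes a handful of whole-number values.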
So What Does this Mean?
There are a couple of potential reasons for this strong correlation.
- Random Happenstance: The limited spiderable link graph on facebook.com may sufficiently explain the correlation with PageRank. But why then does mozRank stray so drastically? How does a fan page like Peak Fan Club get a PR6 with only a handful of internal links and a mozRank of .75? And, more importantly, how does this happen over and over again?
- Google Gets Special Crawling Privileges: Rand Fishkin from SEOMoz pointed out this likely scenario. Perhaps Facebook allows Google to spider-but-not-index behind the login.
- Google Gets Link Data from Google Toolbar, Chrome, or ISP data: One could expect the link graph generated by Facebook Likes to line up quite nicely with PageRank behind-the-scenes.
- Google Uses Like Data as a PR Corollary: I think this is unlikely, but it could be a cheap short-cut.
Takeaways
It is a tough pill for me to swallow, but ladies and gentlemen, I am going to say it here and in writing for the first time. It may, OH GOD IT PAINS ME TO SAY IT, be beneficial, OH THE MINDLESS SUFFERING, for SEO purposes, MUST.KEEP.CONTROL, to create a Facebook fan page, put a link on it, and attract Facebook likes. There, I said it.
9 Comments
One more thought – it’s quite possible that Facebook allows Google to crawl the Facebook site much more deeply than it does Linkscape’s crawlers. Hence, link juice is passing around in Facebook that Linkscape/mozRank can’t detect…
Awesome analysis, though – pretty clear that Google’s being aggressive about social signals and social data.
After the recent announcement that they created a ranking factor that could not be gamed, I thought this might be the way Google was heading, but if Facebook Likes are a ranking factor then sadly not…
While you make a good case that Facebook likes matter, your support for there being causation rather than just correlation by providing links to four Facebook pages seems a bit thin. I am not disagreeing with a conclusion that pained you so much to arrive at, just suggesting that the case would be even stronger with more examples. Still, an action step for me is that I should make another attempt at getting the Facebook Like button to appear on my blog for IE users.
Very interesting article. However, I wasn’t clear if you are referring to the PageRank of a website’s Facebook fan page or the PageRank of the website’s home page?
I nofollow external links from my website to the website’s Facebook fan page because I don’t want it to outrank my website’s home page. Also, Facebook has too much PageRank as it is and is ranking for lots and lots of long tail queries. They certainly aren’t getting any of my link juice 🙂
Author Response: Sorry about that, I am referring to the PR of the Facebook Fan Page. ie: the more likes your fan page has, the higher PR it has.
Maybe it is the content giving the stronger rank, and the content that people are liking… Not necessarily the like causing rank.
Very interesting experiment. Have you any plans to use a larger sample, and how did you determine the FB pages to include in your current sample?
At a business level, would it seem odd for FB to grant Google any special access rights, esp. given their strong rivalry? It’s always a possibility, but it would seem like a risky metric for Google to use if their access could be removed at some time in the future at FB’s behest (notwithstanding contractual obligations obviously).
What about profile pages with PageRank? My Facebook profile page has PR2. At first I thought it was a mistake, but it is not, because I checked it with several websites that evaluate PageRank. I checked almost 400 profile pages of my friends; none of them has any PageRank except one with PR1. How is that possible?
An excellent article. Thank you..
It would be in the interests of both parties if Google has access to data behind the login.
I would take that as a given.
I have been running a linking test for several months which has generated some (surprising?) results.
Basically, I did a page on a topic and did not do a full SEO job on it as I wanted to see if I could improve its SERPS with inbound anchor text links.
The page indexed at #74 and that was too high.
Due to the natural shuffle too many sites were dropping and appearing, making the test too hard to monitor.
I did some onpage work and got it to #27
Over the next few months stats were closely monitored and I started to place anchor text links on RELEVANT sites.
I placed 48 and no movement was seen.
About then there was an SEO article on Google using FB and Twitter signals in their indexing.
My next test was to Tweet the article and do a FB entry.
Still sitting at #27 …
My next test was to tweet my followers (26) asking them to retweet and ask their followers to do so. (My 26 followers have a total of 58,903 followers).
The next day my page stood at #1.
It stayed there for 9 days then dropped down to its usual position.
Conclusion:
It is not the links that affect position but the recommendations from those passing it on. Retweets count for more.
@Jordy post.
“Author Response: PageRank (not the rank of the page in Google, but the PageRank measurement) is explicitly a measurement of Link Popularity (the quantity and quality of links pointing to a page on the web). While good content can help attract good links, content alone does not affect the PageRank measurement.”
I am pretty certain this has changed.
I think Mayday not only changed long tail, but the same relevance algo was brought into line for PageRank.
I started a website about SEO just before Mayday.
About 4 months into its development the PR went from PR0 to PR4 in one step.
At the time of the jump I had exactly 115 links.
1 on a PR5 page. 1 on a PR3 page.
113 on PR0 pages like this one.
Extremely relevant pages. No visible PR.
The mathematical formula of .85 x page PR / # of links on page could not have applied. I think the amount of PR was assigned to the linked page by the relevance between linking and linked.
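To make the arithmetic behind that formula concrete, here is a rough sketch of ".85 x page PR / # of links on page" applied to a link profile like the one above. The link counts per page are invented, the function name is hypothetical, and plugging toolbar values straight into the formula is itself an assumption, since the toolbar number is only a rounded stand-in for whatever Google computes internally.

```python
DAMPING = 0.85  # the ".85" in the formula quoted above

def link_contribution(source_pr, outbound_links, damping=DAMPING):
    """PR passed through one link: damping x source PR / links on the source page."""
    if outbound_links <= 0:
        raise ValueError("a linking page must have at least one outbound link")
    return damping * source_pr / outbound_links

# (source page toolbar PR, number of links on that page).
# The PR values follow the comment above; the link counts are hypothetical.
inbound = [(5, 40), (3, 25)] + [(0, 30)] * 113

total = sum(link_contribution(pr, n) for pr, n in inbound)
print(f"{len(inbound)} links pass a combined {total:.5f}")
```

With toolbar values as inputs, the 113 PR0 pages contribute nothing at all, so the whole total comes from the two pages with visible PR. That is the gap the comment is pointing at.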
@ miki.
Your post prompted me to check my FB PR and it is 10!!
It was a screen shot moment :)>
best to all
Reg
nbs-seo.com
Hi All
Thanks for the great study, I will be running a similar analysis of hundreds of factors over at my new blog (link to the left).
@miki
If you get links from a number of pages with a lower PageRank than the page they are linking to it is a cumulative result. So multiple PR1 links might add up to PR 2 or maybe even 3.
Thanks
Mark