Digg Poaching: The Sordid Underbelly of Web 2.0 Promotion

One of the most difficult problems for all webmasters is spam. Whether you are a blogger fighting comment spam, a search engine fighting doorway pages, or Digg fighting vote manipulation, spam is a serious concern. Unfortunately, the treatments for spam are often very painful and produce numerous false positives – a CAPTCHA too difficult to solve, an IP filter that tags large office buildings, or a spam trap that filters good comments. When these treatments become too strong, especially on “democratic” Web 2.0 sites like Digg and Reddit, they provide spammers with easy targets.

Neither Virante nor I condone this kind of tactic, nor do we condone this kind of experimentation on an unwilling subject. That being said, the compelling and important nature of this information demands that it be brought to light.

We have known for a while about the oft-discussed down-voting bots on Reddit. Because the homepage of Reddit sees far fewer stories than Digg, it is less susceptible to some of these types of attacks (competing for far fewer front page positions over the long haul makes negative-signal attacks less viable). Yesterday, however, it was brought to my attention that Digg’s front page algorithm makes poaching easy, affordable, and effective.

Quick Backstory

As a search marketer, my company has been involved in a large number of social media marketing campaigns. We have acquired over 100 Digg front pages on behalf of our clients using good content and good social promotion – not vote buying. Because of this, we are regularly pitched by third-party submitters and outlets with their latest and greatest offerings (“100 Diggs for $50”, “Guaranteed Front Page”, etc.). Normally these solicitations get shuffled directly out of my inbox and into the trash. This one caught my attention.

Digg’s Algorithm

Getting to the front page of Digg is a competitive endeavor. Success means tens of thousands of visitors in a matter of an hour or so and tons of exposure around the web. Perhaps most importantly, a front page on Digg means that nearly every other social media network and voting site will pick up your story: links and visitors from a diverse set of sources.

There are a handful of meaningful factors in Digg’s algorithm for determining whether a story goes popular, and they include both positive and negative signals: the diversity of the set of diggers, the way in which diggs were received (did the user view the story, was it dugg from a button or from the Digg page, etc.), the speed at which diggs were received, the number of buries, and so on.

As the best stories begin to accumulate positive signals, they aggregate on this page, the most dugg upcoming. One interesting thing to pay attention to is the difference between the most diggs and the Hot in All Topics list on the right side. The right side is a better indicator because it measures all of the signals, not just total diggs over time. If you see your story in either of these lists, you are a strong contender for a front page.

There is strong evidence that as stories get close to promotion, they undergo stricter and stricter filters. This is understandable given the huge computing undertaking of dealing with so many stories, votes, comments, etc. Thus, as your story nears promotion, the ability to manipulate your story positively drops greatly given the additional oversight. However, this gives the unscrupulous a new opportunity.

Digg Poaching

Digg poaching is the use of negative promotion factors at the correct time to remove competing stories from Digg, increasing the likelihood of promotion of your own stories. Imagine if the Tour de France tested every rider one more time before the last leg. You could guarantee victory by slipping into the bedrooms of the other top competitors the night before and giving them steroid injections. The same holds true here.

A blackhat social promoter offered this strategy to me and I was, admittedly, curious about the effectiveness of such an approach. Could this type of technique really be used to topple stories before they go popular? Especially stories with great promise? So, I followed up asking for a case study or evidence. He said, “Watch.”

Below is an accumulation of the evidence he provided, complete with screenshots.

Step 1: Identify the target

In order to prove the point, the attacker chose to target “a domain that has popped over 1500 times and a submitter with over 1000 pops”. (Pops = made popular.) This was an important part of my question. Toppling a first-time front-pager would be easy, as I would imagine Digg’s promotion algorithm has some level of trust or karma built in. The chosen target had already accumulated over 197 diggs in under 16 hours, making it the top in the group and very likely to go popular. Given the high pedigree of both the domain and the submitter, I would have been shocked if it had not succeeded.

Step 2: Send in Rapid Automated Votes

While I cannot be sure of the attacker’s source for these votes, the pace at which they were acquired leads me to believe they were automated at some level. Over 30 votes were acquired in under 15 minutes (the attacker said approximately 40 votes were sent in). Compared with the pace of the other stories at the top, and with the target’s own earlier pace, of 12 to 13 votes an hour, this was quite suspicious. I also cannot determine the IPs or the user accounts behind these votes.
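To put that pace in perspective, here is a quick back-of-the-envelope calculation using only the figures quoted above (30 votes in 15 minutes against a baseline of 12 to 13 votes an hour). The function name is illustrative, not anything Digg exposes.

```python
def votes_per_hour(votes: int, minutes: float) -> float:
    """Convert a vote count over a window (in minutes) to an hourly rate."""
    return votes * 60.0 / minutes


# Figures from the case study: 30 votes observed in 15 minutes.
burst_rate = votes_per_hour(30, 15)

# Baseline pace reported for the top upcoming stories (votes per hour).
baseline_low, baseline_high = 12.0, 13.0

# How many times faster the burst was than the normal pace.
ratio_low = burst_rate / baseline_high
ratio_high = burst_rate / baseline_low

print(f"burst: {burst_rate:.0f} votes/hour")
print(f"roughly {ratio_low:.1f}x to {ratio_high:.1f}x the normal pace")
```

Even taking the lower observed count of 30 votes rather than the attacker's claimed 40, the burst ran at roughly nine to ten times the normal rate.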

Step 3: Watch it Burn

Sure enough, the story was buried, despite earning nearly 100 more votes over the next 8 hours. You can confirm a story is buried by searching for the story using the +b command in Digg’s search engine, as you will see in the screenshot below.

Implications

The implications of this evidence are startling. It offers a real solution for unethical marketers to secure front pages on Digg without manipulating their own stories. Considering that Diggs can be purchased for a dollar or so apiece, for under $200 one could remove 4 or 5 stories from the queue, paving the way for success. It also raises the possibility of far worse consequences. Could a marketer use a similar tactic to get a domain banned altogether from Digg?
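The cost estimate can be sanity-checked with a short sketch. It assumes exactly $1 per vote (the text says "a dollar or so") and about 40 votes per targeted story (the attacker's reported figure from Step 2); both constants are assumptions for illustration, and real prices would vary.

```python
COST_PER_VOTE = 1.00    # assumption: "a dollar or so" per purchased Digg
VOTES_PER_ATTACK = 40   # assumption: the attacker's reported count in Step 2


def poaching_cost(stories: int,
                  votes_per_story: int = VOTES_PER_ATTACK,
                  cost_per_vote: float = COST_PER_VOTE) -> float:
    """Estimated cost of knocking `stories` competitors out of the queue."""
    return stories * votes_per_story * cost_per_vote


for n in (4, 5):
    print(f"{n} stories: ${poaching_cost(n):.2f}")
```

Under these assumptions, four stories cost about $160 and five about $200, which is consistent with the rough figure above. If only the 30 or so votes actually observed were needed per story, the cost would be lower still.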

The only real solution for Digg is to filter votes rather than penalize stories, much like Google’s link algorithms. Because Digg cannot know the source of purchased votes, penalizing stories based on the acquisition of those votes, rather than merely devaluing those votes, is too dangerous a remedy.
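The difference between the two remedies can be illustrated with a toy model. This is a minimal sketch and not Digg's actual algorithm: the burst detector, the 15-minute window, and the 10-vote threshold are all hypothetical choices made for illustration. It compares a policy that buries the whole story when a suspicious burst is seen with one that merely discards the suspicious votes.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    minute: float  # minutes since the story was submitted


def burst_votes(votes, window=15.0, max_in_window=10):
    """Return indices of votes inside any `window`-minute span that holds
    more than `max_in_window` votes (hypothetical thresholds)."""
    ordered = sorted(range(len(votes)), key=lambda i: votes[i].minute)
    flagged = set()
    start = 0
    for end in range(len(ordered)):
        # Shrink the window until it spans at most `window` minutes.
        while votes[ordered[end]].minute - votes[ordered[start]].minute > window:
            start += 1
        if end - start + 1 > max_in_window:
            flagged.update(ordered[start:end + 1])
    return flagged


def penalize_story(votes):
    """Policy A: bury the story (score 0) if any burst is detected."""
    return 0 if burst_votes(votes) else len(votes)


def filter_votes(votes):
    """Policy B: keep the story, but discard the flagged votes."""
    return len(votes) - len(burst_votes(votes))


# Toy data: one organic vote every 5 minutes for 16 hours (192 votes),
# then 40 injected votes packed into the following 15 minutes.
organic = [Vote(5.0 * i) for i in range(192)]
injected = [Vote(960 + 0.375 * i) for i in range(40)]
attacked = organic + injected

print("penalize:", penalize_story(attacked))  # story is buried outright
print("filter:  ", filter_votes(attacked))    # organic votes mostly survive
```

In this toy data the penalize policy drops the story to zero, while the filter policy leaves it with 190 of its 192 organic votes; the filter's collateral damage is two legitimate votes that happened to land next to the burst. That asymmetry is the argument for devaluing votes: a false positive costs a story a few votes rather than its front page.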

1 Comment

  1. Dave Dugdale
    Dec 17, 2009

    Wow, interesting concept. Digg frustrates me to no end, I have had a few articles that had some momentum but then failed.
