Blekko Counts Spam Pages to Highlight the Problem of Link Relevancy for Search Engines

Every internet user today has to deal with spam pages, that is, web pages filled with ads and designed only to drive traffic. Even though experienced users recognise the telltale signs of a junk page right away, they still get tricked every once in a while, at least for a few seconds. Things are far worse for the many people who are not internet-savvy. So if people can’t avoid getting fooled, how is a computer going to know any better? Blekko, the start-up search engine we’ve reported on earlier, thinks it can’t, unless users are willing to help.

The innovative search engine has launched the Spam Clock website to emphasise how quickly spam pages are littering the Web. According to Blekko, “every hour one million new spam pages are created”! The “clock” is basically a counter that displays the number of spam pages created since January 1, and you can actually watch the number rise in real time. Believe it or not, as of today it estimates that over 290 million spam pages have been pushed online! However, don’t confuse this figure with the total number of spam pages out there, which is likely in the billions.
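The arithmetic behind such a counter is simple enough to sketch. The snippet below is only an illustration, not Blekko's actual implementation: it assumes the clock grows linearly at the quoted rate of one million pages per hour starting at midnight on January 1, and the function names are made up for this example.

```python
from datetime import datetime

# Rate quoted by Blekko: one million new spam pages per hour.
SPAM_PAGES_PER_HOUR = 1_000_000


def spam_pages_since_new_year(now: datetime) -> int:
    """Estimate the counter value at `now`, assuming a constant rate
    since midnight on January 1 of the same year (an assumption)."""
    start = datetime(now.year, 1, 1)
    elapsed_hours = (now - start).total_seconds() / 3600
    return int(elapsed_hours * SPAM_PAGES_PER_HOUR)


def hours_to_reach(count: int) -> float:
    """Hours the counter needs to reach `count` at the quoted rate."""
    return count / SPAM_PAGES_PER_HOUR


if __name__ == "__main__":
    print(spam_pages_since_new_year(datetime.now()))
    print(hours_to_reach(290_000_000))
```

At the quoted rate, 290 million pages correspond to 290 hours, or a little over twelve days into January, which fits a counter that started on January 1.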

OK, but is spam a big problem? Of course it is, and it certainly makes it a lot harder for any search engine to present the best results. Spammers keep getting more sophisticated in their attempts to mislead Google’s ranking algorithm. Google is improving too: its algorithms are getting steadily better at filtering out irrelevant results. And still, even Blekko’s web-slashing concept with human-curated tags can’t entirely screen out spammy pages, which may sometimes rank very high.

We think this points to the growing importance of link sharing through Facebook, Twitter and other social networking sites. Imagine searching an index of your friends’ favourite sites, rather than a spam-littered search engine index. Our social graphs might not be big enough to cover the entire Web yet, but in the future…

