When you have a popular website and a lot of good content, life is good. Readers love what you do, the search engines love you and new services spring up to use your content and spread the love around.
There is just one fly in the ointment – programmers.
You can’t trust programmers – and I should know: I’m a programmer, I’ve managed programmers, and I’ve seen the things we all do, good and bad. Website software, applications and utilities can make our lives easier – or at least interesting. Then there were Y2K, spambots, scraper site software and Vista!
On Wednesday this week, the number of visitors to NewsBlaze from Google was slashed by 80%. I was out all day, videoing a business expo, so I didn’t even know until late that day.
Thrashing around in the dark
When a programmer sees a problem, it’s a challenge to their authority, a challenge that has to be taken on and beaten into submission.
So I did some research, asked a few friends for insight and tried a few things. Using Google’s Webmaster Tools was one of the first steps, and that told me everything looked good, except that the number of indexed pages was much smaller than before.
That realisation led me to the idea of creating an XML Sitemap, to help Google find all the pages. So I did that, deciding not to put the most recent stories in the Sitemap, thinking that Googlebot will find those new stories itself. I hope that is true. If not, I’ll extend the Sitemap.
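For anyone who wants to try the same thing, here is a minimal sketch of building a Sitemap that leaves out the newest stories. It is not the script NewsBlaze uses: the function name `build_sitemap`, the `newest_allowed` cutoff and the example.com URLs are all made up for illustration. Only the `urlset`, `url`, `loc` and `lastmod` elements and the namespace come from the Sitemap protocol itself.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(stories, newest_allowed):
    """Return Sitemap XML for (url, published_date) pairs.

    Stories published after `newest_allowed` are skipped, on the
    assumption that the crawler finds fresh stories by itself.
    Both names are hypothetical, chosen for this sketch.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, published in stories:
        if published > newest_allowed:
            continue  # leave recent stories out of the Sitemap
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = published.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical stories; example.com stands in for the real site.
stories = [
    ("https://example.com/story/older-story.html", date(2008, 1, 5)),
    ("https://example.com/story/todays-story.html", date(2008, 2, 9)),
]
print(build_sitemap(stories, newest_allowed=date(2008, 2, 1)))
```

Extending the Sitemap later would just mean moving the cutoff date forward, or dropping the check altogether.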
I also realised that I’d contributed to some confusion because NewsBlaze has a lot of categories and many stories fit into more than one. There are several “top level categories” that generally don’t have overlapping stories, so the next step was to stop the search engines traveling down those “duplicate” trees looking for new content. This has additional benefits – reducing our bandwidth and helping search engines by reducing the time they take to fetch and process our stories.
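One common way to keep crawlers out of duplicate trees is a robots.txt file, and the sketch below assumes that approach. The category paths are invented for illustration and are not the real NewsBlaze layout. Python’s standard `urllib.robotparser` module is used here only to check that the rules block what they are meant to block before they go live.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the two "duplicate" category trees are made up.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/duplicate-a/
Disallow: /category/duplicate-b/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A story in a top-level category stays reachable...
print(is_crawlable(ROBOTS_TXT, "https://example.com/top/story.html"))
# ...while the same story under a duplicate tree is skipped.
print(is_crawlable(ROBOTS_TXT, "https://example.com/category/duplicate-a/story.html"))
```

Checking a handful of real story URLs this way is cheap insurance against accidentally blocking the top-level categories too.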
By Saturday morning, things were looking grim, and an investigation on another track, which I should have tried earlier, led me to the problem.
Inform, Google and NewsBlaze, The Destroyers
Inform, an interesting company that creates “Connected Content”, picks up NewsBlaze stories. Inform programmers probably realised that you can truncate NewsBlaze URLs and still reach the story you want. I made NewsBlaze handle that a long time ago, when I saw that URLs can break when sent in an email. All that really counts is that the reader gets the story they were expecting. That’s what programming is all about.
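To show the idea, here is a rough sketch of how a site might accept a truncated URL. It assumes a truncated path is matched as a unique prefix of a known story path, which may not be how NewsBlaze actually does it. The function `resolve_story` and the paths are hypothetical.

```python
def resolve_story(requested_path, known_paths):
    """Return the full story path a possibly truncated path refers to.

    Assumption for this sketch: a truncated path matches when it is a
    prefix of exactly one known story path. Returns None when nothing
    matches or when the prefix is ambiguous.
    """
    if requested_path in known_paths:
        return requested_path  # the full URL survived intact
    matches = [p for p in known_paths if p.startswith(requested_path)]
    return matches[0] if len(matches) == 1 else None


# Hypothetical story paths.
known = {
    "/story/20080101-budget-vote-passes.html",
    "/story/20080102-expo-opens.html",
}
# A URL that broke across a line in an email still finds its story.
print(resolve_story("/story/20080101-budget", known))
```

The catch with this kind of leniency is that the short and long forms of the URL both serve the same story, so anything that follows links sees two addresses for one page.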
None of this seems like a problem so far. Enter Google. For some reason, Googlebot must have found the Connected Content links and decided they were yummy – more interesting than the same links at NewsBlaze, so the existing NewsBlaze entries in the Google index were replaced by the new, sexy, truncated links used by Inform.
The problem, of course, is that Google’s ranking algorithm takes into account the votes of confidence coming from links to stories. The truncated links obviously don’t have any other websites pointing to them permanently, as the longer URLs did, so the “vote count” on most of our pages went to zero, and that brought down the value of the front page and all of the secondary pages.
I just wrote to Inform and phoned, but of course it is Saturday, so there will probably be no response until Monday. So now I’m going to see if I can talk to someone at Google.
If you have a story about the loss – and hopefully redemption – of your listings, email email@example.com