Last month I had a chance to talk to Aaron D’Souza from Google at the Search Engine Room conference. Aaron is a Software Engineer in the Search Quality group run by Matt Cutts. I asked Aaron about the new Google webmaster guideline:
- Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don’t add much value for users coming from search engines.
In particular I asked how Google determines whether pages “add much value for users”. Aaron’s response was that if the pages are ranking in Google then they are deemed to add value for users. I was pleased to hear this.
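For anyone who has not written such a rule, here is a rough sketch of what the guideline is asking for, checked with Python's standard `urllib.robotparser`. The `/search` path, the `example.com` URLs and the `is_crawlable` helper are my own assumptions for illustration, not anything Aaron or the guideline specified.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks internal search result pages.
# The "/search" prefix is an assumption; use whatever path your site serves them from.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/search?q=shoes",
        "https://example.com/products/42",
    ):
        print(url, is_crawlable(url))
```

Note that `urllib.robotparser` does simple prefix matching, so this only approximates how a real crawler reads the file.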
I described our concern about a customer of ours that had over a hundred thousand search and navigation pages indexed by Google even though they had fewer than 5,000 products. We were worried that Google might penalise them. Aaron reassured me that this was fine and suggested we use the Sitemaps feature to tell Google which of those pages we considered the most important.
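One way to act on that suggestion is the optional `<priority>` element in the Sitemaps protocol, which lets you mark some URLs as more important than others on your own site. The sketch below builds a small sitemap with Python's standard library. The URLs, the priority values and the `build_sitemap` helper are made up for illustration, and `<priority>` is only a hint, so I am not claiming anything about how Google weighs it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs.

    priority is a float from 0.0 to 1.0, relative to other pages on the same site.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages: a product page ranked above a navigation page.
    print(build_sitemap([
        ("https://example.com/products/42", 0.8),
        ("https://example.com/category/shoes", 0.3),
    ]))
```

A simpler alternative is to list only the pages you care about most and leave the rest out of the sitemap altogether.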
Martin Kelly just sent me the photo which prompted me to write this. He also sent this one of me giving a presentation.