Yes, I’ve missed something really simple. For a while I’ve been wondering why this site doesn’t show up on Google. I’d assumed it was because the previous incarnation of the site was just a load of links and that this might breach their terms and conditions, so I submitted the site for reconsideration (via Webmaster Tools) to see if this was the problem. In the meantime, I tried fetching the site as the Googlebot (another feature of Webmaster Tools), only to find that the fetch was ‘Denied by robots.txt’. That’s strange, I thought; I don’t recall creating a robots.txt for this site. I checked the root directory on the webserver and couldn’t see any sign of a robots.txt file.
I thought I must be going mad, so I viewed the source of the index page, only to see there was a robots meta tag saying ‘noindex’ and ‘nofollow’. Clearly this was something generated by WordPress, as I hadn’t explicitly put it there. On a whim I decided to see if I could view robots.txt via the browser, and what do you know, it displayed a robots.txt file disallowing everything! (WordPress generates robots.txt on the fly when there’s no physical file, which is why I couldn’t find one on the server.)
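With hindsight, both of those checks (what robots.txt says and whether the page carries a robots meta tag) could be scripted rather than done by eye. Below is a minimal Python sketch using only the standard library. The function names, the sample robots.txt text, the sample HTML and the example.com URL are all made up for illustration; they are not copied from this site, and this is not how Webmaster Tools does its check.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            for item in (attr.get("content") or "").split(","):
                item = item.strip().lower()
                if item:
                    self.directives.add(item)


def meta_robots(html: str) -> set:
    """Return the set of robots meta directives found in an HTML page."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    # Illustrative inputs only: a "block everything" robots.txt and a page
    # carrying a noindex/nofollow robots meta tag.
    blocked = "User-agent: *\nDisallow: /\n"
    page = '<html><head><meta name="robots" content="noindex,nofollow" /></head></html>'
    print(robots_allows(blocked, "http://example.com/"))
    print(sorted(meta_robots(page)))
```

To run this against a live site you would first download `/robots.txt` and the front page (for example with `urllib.request.urlopen`) and pass the decoded text to these functions. Fetching over HTTP matters here: looking for a file on disk misses a robots.txt that is generated dynamically.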
I dug around in the admin settings, until finally I found the Privacy page – and lo and behold, it was set to block search engines.
Aaaargh! So I changed to the other setting, and now robots.txt no longer disallows everything, and the meta tag has gone from the source. So simple when you know how! I feel like such a fool. The Googlebot still won’t fetch the site, but I guess they’re working from a cached copy of robots.txt or something, so I’ll check again in a few days…