Google is indexing unintended content that is either unpublished or a secret part of the site
Google is indexing parts of my site that I haven't linked to: a mobile version of the website that I'm working on, at domain.com/m/ (note the /m/).
How is that even possible? The only thing I can think of is Google getting the URLs from Google Analytics.
I had the same problem with my development site, dev.domain.com. I have fixed that one by re-enabling htpasswd. I had it enabled once, but disabled it for some testing purposes.
I know I could use robots.txt to skip the indexing, but I have always been told: "Don't put super secret stuff in this, as it is public domain".
Will Google follow the rules of a <meta name="robots" content="noindex,nofollow"> tag?
I'm thinking of putting this on the mobile version.
People accessing domain.com from a smartphone are automatically redirected to the mobile version.
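For reference, here is a minimal sketch of what the robots.txt option would look like, checked with Python's standard `urllib.robotparser`. The robots.txt body and the helper name `is_crawl_allowed` are assumptions made up for illustration; only the /m/ path comes from the question. It also shows the concern quoted above: the rule itself names the path, and robots.txt is readable by anyone.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; only the /m/ path is taken from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /m/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://domain.com/m/", "http://domain.com/"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

Note that a robots.txt rule asks crawlers not to fetch a path; it does not password-protect anything, so htpasswd is still the right tool for the development site.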
3 Comments
If you have installed Google Analytics on your site, then this is almost definitely how it knows about them. I don't know the code off the top of my head, but there is code you can put on your pages to prevent crawling of pages you want to keep private. If I get a chance later, I'll search it out and edit this post.
It is in Google's interests to index all of the public web, including content that contains no hyperlinks from existing indexed content.
The use of m.example.com or example.com/m/ for the URL of a mobile version of a site is common.
It is therefore reasonable to assume that Google's crawler will see if m.example.com or example.com/m/ exists and attempt to index such content if found.
It is also reasonable to assume that Google Mobile will attempt to determine if a suitable-looking m.example.com or example.com/m/ URL exists for known content and present this as a link to a user in search results. There is no reason to think such divination would not feed back to the crawler.
There are various theories as to how Google knows what to crawl. It could be that someone linked to your mobile version. It could be that Google tried random URLs and came across the /m/ version of your site. I'm not aware that they say they won't use URLs from their analytics data.
Yes, they do follow those rules: googlewebmastercentral.blogspot.com/2007/03/using-robots-meta-tag.html
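If you want to confirm that the mobile pages really carry the tag once you add it, something like the following could work. It is only a sketch using Python's standard `html.parser`; the class name `RobotsMetaParser`, the function `robots_directives`, and the sample page are made up for illustration and are not part of any Google tooling.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    # Hypothetical page body using the tag from the question.
    page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    found = robots_directives(page)
    print("noindex" in found, "nofollow" in found)
```

One thing to keep in mind: a crawler can only obey the meta tag if it is allowed to fetch the page, so blocking /m/ in robots.txt at the same time would stop Google from ever seeing the noindex.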