What is the empirical way to tell that Google has removed a URL from the index?

After merging two sites, one with ~40 URLs and one with ~700 URLs (due to a forum), overall traffic decreased by ~50%. I want to get the traffic back, as the content was organic and a helpful resource to the community.
After finding this answer I decided to try removing the ~700 forum pages from the index with a robots.txt disallow, which didn't work. As this FAQ and this answer point out, robots.txt must allow the pages to be crawled and a meta noindex tag must be used instead.
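For reference, the combination the FAQ describes looks roughly like this (example.com and the /forum path are placeholders for my actual site):

```
# robots.txt -- no Disallow for the forum, so Googlebot can re-crawl it
User-agent: *
Disallow:
```

and in the `<head>` of each forum page:

```html
<meta name="robots" content="noindex">
```

Googlebot has to be able to fetch a page to see the noindex tag, which is why the disallow has to be removed first.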
After applying noindex tags to all forum pages, removing the disallows from robots.txt, and waiting a week, Google Webmaster Tools still reports 700+ URLs in the index. However, if I view the advanced index status and check all the boxes, it shows 795 URLs indexed and 200+ blocked by robots, and the "blocked by robots" graph line is increasing steadily (~30 URLs per week). Note that the average crawl rate is ~125 pages per day.
My question is this:
How can I tell that the URLs have successfully been removed? Google Webmaster Tools' index status graph is what I thought was a good indicator, but I am wondering what should actually happen there. Should the blue line of total indexed URLs drop back down to a low count (as I am expecting), or will total indexed remain high while "blocked by robots" and/or "removed URLs" increase?
This answer from Google seems to indicate that the total indexed (blue line) should drop. Why then has it not dropped at all after applying the noindex tags and waiting a week?
Google Webmaster Tools' index count will fluctuate constantly, especially with dynamic sites or those using common platforms such as forum software.
The best way to tell how many of these URLs Google is dropping from its index is to use the site operator in Google's web search, for example:

site:example.com/forum
This will show you all indexed URLs under /forum/*. If you perform this search every few days and note how many URLs are indexed, you will get an idea of whether they are being dropped from the index as you intend.

As these URLs start getting dropped from the index, you should expect the total indexed count in Google Webmaster Tools to decrease accordingly, although as I mentioned above, there will probably also be plenty of new URLs Google is indexing (and dropping) from your site each day.
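Alongside the site: checks, it is worth verifying that each forum page is actually serving the noindex directive, since a templating mistake would stall the whole process. A minimal sketch of such a check (the parser class and function names are my own, not part of any Google tool; fetching the pages is left out so this only shows the HTML test):

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if (d.get("name") or "").lower() == "robots":
                self.directives.append((d.get("content") or "").lower())


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tags include a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in content for content in parser.directives)
```

Run this over the fetched HTML of a sample of forum URLs; any page where it returns False will never leave the index via this method.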
Unfortunately, you abandoned the fastest way to remove pages from the Google index. There is nothing wrong with using the robots.txt file to remove pages from the Google index.
Having switched to noindex, it will take some time for the spider to fetch all of the pages and update the index. The speed will depend on how frequently Google has crawled your site in the past.
Part of what may slow this process down is changing tactics mid-stream. I strongly suggest staying the course. The more you change things around, the more you confuse the machine and the slower it will be to get what you want.
Be patient. It will take some weeks before the pages are removed. Search engines are notoriously slow.
Unfortunately, there is no really good way to check your progress except for the graph you already mentioned. Sometimes the numbers may not make sense. Google seems to use Microsoft for calculations sometimes. When the graph has leveled off for a period, it is likely that all the pages are de-listed. One thing you can do is take a sample of the page titles and do a site: search using the unique titles in quotes. A sample like that can give you an idea of what is going on.
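Building those spot-check queries for a random sample of titles is mechanical enough to script. A small sketch, assuming you have a list of your forum's page titles (the domain, titles, and function name here are placeholders):

```python
import random


def sample_site_queries(section, titles, k, seed=None):
    """Build quoted site: queries for a random sample of page titles.

    Each query is meant to be pasted into Google web search by hand:
    an empty result set suggests that page has been dropped from the index.
    """
    rng = random.Random(seed)  # seedable so the same sample can be re-checked later
    return [f'site:{section} "{title}"' for title in rng.sample(titles, k)]


# Example: pick 2 of the forum's page titles to spot-check by hand.
queries = sample_site_queries(
    "example.com/forum",
    ["Welcome thread", "Forum rules", "How do I reset my password?"],
    k=2,
    seed=42,
)
for q in queries:
    print(q)
```

Re-running the same seeded sample every few days shows when those particular pages stop appearing in results.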