How to correctly remove parameters from the Google index?
Within the GWT console under URL Parameters there are two settings that I'm not clear on. What is the difference between setting a URL parameter to:
No: Doesn't affect page content
and
Yes: Changes, reorders... and then picking "No URLs" for which URLs Googlebot should crawl?
I am trying to remove all URLs with certain parameters from the index and am not sure which setting to choose. I've already submitted a sitemap with the new URLs, and I will also set up 301 redirects to the new URLs. I think some of our old parameters are causing issues with crawling efficiency and duplicate content, so I want to clean that up.
2 Comments
to remove all URLs with certain parameters from the index and not sure which setting to choose.
Search Console doesn't let you remove parametrized URLs from the index. It only gives you some ways to reduce how much attention they get during crawling.
You can set URLs with certain parameters to noindex with a rewrite rule that triggers an X-Robots-Tag header, like:
<IfModule mod_rewrite.c>
RewriteEngine On
# Flag requests whose entire query string is id=<number>
RewriteCond %{QUERY_STRING} ^id=([0-9]*)$
RewriteRule .* - [E=NOINDEX_HEADER:1]
</IfModule>
<IfModule mod_headers.c>
# Send the noindex header only on flagged requests
Header set X-Robots-Tag "noindex, follow" env=NOINDEX_HEADER
</IfModule>
This example sets noindex on all URLs whose query string is exactly ?id=n, where n is an integer.
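Note that the pattern above only matches when id is the sole parameter in the query string. If the parameter can appear alongside others, a variant (a sketch only; adjust the parameter name to your site) might look like:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Match id=<number> anywhere in the query string, not only when it stands alone
RewriteCond %{QUERY_STRING} (^|&)id=[0-9]+(&|$)
RewriteRule .* - [E=NOINDEX_HEADER:1]
</IfModule>
```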
Update: following John Mueller's advice, here is an example rule that sets a canonical Link header on certain parametrized URLs:
<IfModule mod_rewrite.c>
RewriteEngine On
# Store the clean URL (path without query string) for requests where id=<number>
RewriteCond %{QUERY_STRING} ^id=([0-9]*)$
RewriteRule .* - [E=CANONICAL_URL:https://%{HTTP_HOST}%{REQUEST_URI}]
</IfModule>
<IfModule mod_headers.c>
# Point the canonical at the clean URL on flagged requests
Header set Link '<%{CANONICAL_URL}e>; rel="canonical"' env=CANONICAL_URL
</IfModule>
No: Doesn't affect page content
Googlebot will crawl only one representative URL containing this parameter. From the help message that is displayed (emphasis mine):
Select this option if this parameter can be set to any value without changing the page content. For example, select this option if the parameter is a session ID. If many URLs differ only in this parameter, Googlebot will crawl one representative URL.
Yes: Changes, reorders... and then picking "No URLs" for which URLs Googlebot should crawl?
Google won't crawl any URLs that contain this parameter. From the tooltip:
No URLs: Googlebot won't crawl any URLs containing this parameter. This is useful if your site uses many parameters to filter content.
I am trying to remove all URLs with certain parameters from the index and not sure which setting to choose. I've already submitted a sitemap with the new URLs and I will set up a 301 redirect to the new URLs also
The above URL parameter options block crawling. However, if you have an alternative URL to redirect to (as you suggest), then you are probably better off implementing the 301 redirect instead: this will result in the URLs being "updated" in the SERPs. If you only block crawling, the URLs will probably just drop out of the index over time.
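If you go the 301 route on Apache, a minimal sketch (assuming the same hypothetical ?id=N parameter as in the other answer, and Apache 2.4+ for the QSD flag) could be:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Permanently redirect /path?id=<number> to /path, dropping the query string (QSD)
RewriteCond %{QUERY_STRING} ^id=[0-9]+$
RewriteRule ^(.*)$ /$1 [R=301,L,QSD]
</IfModule>
```

Googlebot will then fold the parametrized URLs into the clean ones as it recrawls them; keep the redirects in place long-term so the signals consolidate.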
However, it depends on the URL parameter and how they have been indexed. If you simply have a lot of duplicate content as a result of this URL parameter being indexed and these pages aren't being linked to then blocking crawling might be the better option.