Preventing Google from crawling URLs with URL parameters when a friendly URL exists for the same content

Posted in: #Googlebot #GoogleSearchConsole #UrlParameters

E-commerce sites commonly use multiple URL parameters to filter, narrow, and sort data, which is why Google provides the URL Parameters section in Webmaster Tools.

On our sample site, the following two URLs are generated, and both lead to the same content:

/dresses/women/prada-size32-kneelength.html

and the link with URL parameters:

/dresses/women.html?ajaxcatalog=true&size=32&manufacturer=prada&length=kneelength

We have left the parameter options set to "Let Google Decide". However, our logs show that Google is crawling both of the above links.
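A rough way to quantify this from the logs is sketched below. It assumes the server writes the common "combined" access log format and that matching the string "Googlebot" in the user agent is good enough for a first pass (user agents can be spoofed, so this is not verification). The sample log lines, IP address, and the helper name `count_googlebot_hits` are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Request target and user agent of a combined-format log line (assumed format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<target>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_googlebot_hits(lines):
    """Count Googlebot requests, split by whether the URL has a query string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # unparsable line or some other client
        has_query = bool(urlsplit(match.group("target")).query)
        counts["parameterized" if has_query else "friendly"] += 1
    return counts

# Hypothetical sample lines, not taken from a real log.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    '192.0.2.1 - - [01/Jan/2000:00:00:00 +0000] "GET /dresses/women/prada-size32-kneelength.html HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT,
    '192.0.2.1 - - [01/Jan/2000:00:00:05 +0000] "GET /dresses/women.html?ajaxcatalog=true&size=32&manufacturer=prada&length=kneelength HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT,
    '192.0.2.7 - - [01/Jan/2000:00:00:09 +0000] "GET /dresses/women.html?size=32 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
hits = count_googlebot_hits(sample)
print(dict(hits))
```

Running this over a day or a week of real logs gives a sense of how much of Googlebot's activity goes to the parameterized variants.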

Why is Google crawling two similar links? Is it simply because it finds both and therefore crawls them (which seems logical)? But then what is the use of "Let Google Decide"? Crawling two similar links wastes crawl budget and system resources.

To avoid this, we have two options:

Add Disallow rules for the size, manufacturer, and length parameters to robots.txt, OR
set each of the URL parameters to no crawl in Google Webmaster Tools.
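For the robots.txt option, it can help to check candidate rules against real URLs before deploying them. The sketch below is a small hand-written matcher for the `*` and `$` wildcards that Google supports in robots.txt paths; it is not Google's parser, and the three Disallow patterns are only an assumed way of writing the rules for this site.

```python
import re

def rule_matches(pattern, path_and_query):
    """Check a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # Rules match from the start of the path, so re.match is the right anchor.
    return re.match(regex, path_and_query) is not None

# Assumed rules: block any URL whose query string contains these parameters.
DISALLOW = ["/*?*size=", "/*?*manufacturer=", "/*?*length="]

def is_blocked(path_and_query):
    return any(rule_matches(rule, path_and_query) for rule in DISALLOW)

print(is_blocked("/dresses/women.html?ajaxcatalog=true&size=32&manufacturer=prada&length=kneelength"))
print(is_blocked("/dresses/women/prada-size32-kneelength.html"))
```

Patterns this loose also catch any other parameter whose name merely ends in `size`, `manufacturer`, or `length`, so they are worth testing against a full URL list. Also keep in mind that a URL blocked in robots.txt is not fetched, so Google cannot see a canonical or noindex tag placed on it.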

Would there be any downside to either of the options above? Is it general practice for e-commerce stores to block all parameter-related URLs (carefully, of course), since most of them generally serve duplicate content?


@Rambettina238


3 Comments


@Gail5422790

You can solve this with a canonical tag in the head of your pages. For example, set this canonical tag:

<link rel="canonical" href="https://www.example.com/dresses/women/prada-size32-kneelength.html" />

on both of the URLs above:
example.com/dresses/women/prada-size32-kneelength.html
www.example.com/dresses/women.html?ajaxcatalog=true&size=32&manufacturer=prada&length=kneelength
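If the pages are rendered by a template, the tag can be generated rather than hand-written. The following is a minimal sketch that assumes the friendly URL is always built as `<category>/<manufacturer>-size<size>-<length>.html`, which is inferred only from the one pair of URLs in the question. The `BASE` scheme and host and the helper names `friendly_path` and `canonical_tag` are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

BASE = "https://www.example.com"  # assumed scheme and host

def friendly_path(url):
    """Map a parameterized URL to its friendly path (assumed naming pattern)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if not all(key in query for key in ("manufacturer", "size", "length")):
        return parts.path  # already friendly, or not a full filter combination
    category = parts.path
    if category.endswith(".html"):
        category = category[: -len(".html")]
    return "%s/%s-size%s-%s.html" % (
        category,
        query["manufacturer"][0],
        query["size"][0],
        query["length"][0],
    )

def canonical_tag(url):
    """Build the <link rel="canonical"> element with an absolute href."""
    href = escape(BASE + friendly_path(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href

print(canonical_tag("/dresses/women.html?ajaxcatalog=true&size=32&manufacturer=prada&length=kneelength"))
print(canonical_tag("/dresses/women/prada-size32-kneelength.html"))
```

Both calls should produce the same tag, which is the point: every variant of the page declares the one friendly URL as canonical.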


@Speyer207

I had that happen to me. Google will try to crawl everything on your site, and I've even had Google bug out and ignore my robots.txt once. It took a month for Google to correct itself!

Also, I've had Google complain under HTML Improvements about duplicate content where it had crawled random pages with URL parameters. Once I had gone over each of my URL parameters and manually configured each entry, the duplicate content warnings stopped appearing within a few weeks. The only downside is the risk of picking the wrong URL parameter to be ignored.


@Cooney921

Googlebot tries to crawl everything mentioned or linked on your site. I set up a test case and the bot even crawled URLs like this:

<script>
// Even a url in a JS comment is crawled by google: stackoverflow.com
console.log("test..");
</script>

I also think the setting in WMT is more about "let Google decide which URL to serve to the user" than "let Google decide which URL to crawl".

With faceted navigation you have to be careful about what you want indexed. In general, it's best practice to set all filter options to "noindex, follow". "Follow" because you want Googlebot to crawl your detail pages.
samplesite.com/dresses/women.html = Index, Follow
samplesite.com/dresses/women.html?size=10 = NoIndex, Follow
samplesite.com/dresses/women.html?color=red = NoIndex, Follow
samplesite.com/dresses/women.html?page=2 = NoIndex, Follow
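As a rough sketch of that rule, the function below picks the robots meta tag from the URL alone. It assumes the simplest possible policy, namely that any URL with a query string is a filtered view; the function name `robots_meta` is made up for illustration, and a real shop would probably want an explicit list of filter parameters instead.

```python
from urllib.parse import urlsplit

def robots_meta(url):
    """Return the robots meta tag: index only URLs without a query string."""
    if urlsplit(url).query:
        directive = "noindex, follow"  # filtered or paginated view
    else:
        directive = "index, follow"    # plain category page
    return '<meta name="robots" content="%s">' % directive

for url in (
    "/dresses/women.html",
    "/dresses/women.html?size=10",
    "/dresses/women.html?color=red",
    "/dresses/women.html?page=2",
):
    print(url, "=>", robots_meta(url))
```

Googlebot has to be able to fetch a page to see this tag, so it only works on URLs that are not blocked in robots.txt.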

If you have 5 categories and 50 products but 5k pages in the Google index, your site will most likely not perform well.
On the other hand, if you think your site is strong enough, you can try opening up one option so that long-tail keywords like "red women dresses" can rank.
