Mobile app version of vmapp.org
Jennifer507

Why is a query string appearing on my URLs in the Google search results?

@Jennifer507

Posted in: #GoogleSearch #GoogleSearchConsole

When I enter a URL from my site into Google search, I get back that URL but with an added query string in the results. For example when I search for example.com/blog/blog/2013/02, the search results show it with parameters as example.com/blog/blog/2013/02?limit=200.
I have disallowed parameterised URLs in my robots.txt file with Disallow: /*?. Now the Google search result shows this message:


A description for this result is not available because of this site's robots.txt – learn more.


How can I avoid having this added query string on the URL?


3 Comments


 

@Sherry384

Okay. First, get rid of the Disallow: /*? in the robots.txt file. That rule is what causes the message from Google: it blocks Google from crawling any URL that contains a query string, so Google cannot fetch a description for those pages.

In your .htaccess file, try this:

# Redirect any request whose query string is exactly "limit=<digits>"
# to the same path with no query string (the trailing "?" strips it)
RewriteCond %{QUERY_STRING} ^limit=\d+$ [NC]
RewriteRule ^(.*)$ /$1? [R=301,L]


I have not tested this exact rule, but I am reasonably sure the regex (regular expression) is correct; I have at least tested it under a different scenario. Try it, then make several requests to your site with ?limit=200 (and similar values) appended to the end, and check that each one redirects to the same URL without the parameter.
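If you want to sanity-check the pattern itself before touching the server, here is a minimal sketch in Python (an assumption on my part: Python's re engine treats ^, \d+, $ and case-insensitive matching the same way Apache's PCRE does for this simple pattern):

```python
import re

# Same pattern as the RewriteCond above: a query string that is
# exactly "limit=" followed by one or more digits, case-insensitive
# (the [NC] flag in Apache corresponds to re.IGNORECASE here).
pattern = re.compile(r"^limit=\d+$", re.IGNORECASE)

print(bool(pattern.match("limit=200")))         # True  - matches
print(bool(pattern.match("LIMIT=5")))           # True  - case-insensitive
print(bool(pattern.match("limit=")))            # False - no digits
print(bool(pattern.match("page=2&limit=200")))  # False - not the whole string
```

Note that the last case shows the rule only fires when limit is the *entire* query string; if other parameters can appear alongside it, the condition would need to be loosened.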

That said, I still think the parameter has no real effect and causes no harm, so it would also be fine to simply leave it alone.



 

@Ravi8258870

Robots.txt will only prevent bots from crawling the Disallowed URLs, not from indexing them. If the Disallowed URLs are linked to externally, or internally from a page that isn't Disallowed, they'll appear in the index with the snippet text you've quoted.

If you want to exclude them from the index entirely, the best option is probably the canonical link element:

<head>
<link rel="canonical" href="http://www.example.com">
</head>

In the example you give, the page example.com/blog/blog/2013/02?limit=200 would contain the following:

<head>
<link rel="canonical" href="https://example.com/blog/blog/2013/02">
</head>

That's assuming HTTPS is your preferred protocol; whichever protocol you prefer, requests arriving over the other one should be normalised to it via a 301 redirect.
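For completeness, a minimal .htaccess sketch of that normalisation (assuming Apache with mod_rewrite enabled and HTTPS as the preferred protocol — adjust to taste, this is not the only way to do it):

```apache
RewriteEngine On
# Send any plain-HTTP request to the same host and path over HTTPS
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
```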

The advantage of this approach is that you don't have to configure search engine Webmaster Tools.

Using Webmaster Tools

An alternative is to use URL Parameter Filters in Google and Bing Webmaster Tools. In Google, you'll find it under Crawl > URL Parameter Filters.

Typically, that page will already be populated with parameters the crawler has discovered, though you can specify them manually too.

Assuming ?limit=200 is controlling how many items are shown on a page, you'd configure it as follows in Google WMT:

Select "Yes: Changes, reorders or narrows page content"

Select "Narrows"

Select "No URLs"



 

@Jamie184

Not sure where the query parameter is coming from, but there is a way to strip it out of your Google Analytics reports. See support.google.com/analytics/answer/1010249?hl=en, topic 'Exclude URL Query Parameter'.


