How do I disallow a specific query string in robots.txt?

I have the URL
www.example.com/shopping/books/?b=9

and the following robots.txt file:

User-agent: *
Disallow: /?b=9

But when I test this in the robots.txt tester in Google Webmaster Tools, the URL is reported as allowed when it should be disallowed.

While /?b=9 is fixed, the /shopping/books part changes with each category, and I need to block them all.

Please tell me what's wrong with my robots.txt.

@Bryan171

@Sarah324

Doesn't a self-altering text configuration file suggest a problem with your directories and your actual ability to reach and edit that file? Not to cause panic, but the input you entered changed. I don't think it's a text file issue.

@Harper822

I don't think there's a way to do this in robots.txt. Also, whatever you list in robots.txt is visible to hackers as well, because the file is accessible to everyone.

What I would suggest is to use your scripting language to detect the query string you don't want people to access. If it matches, either redirect to a relevant page that people are allowed to access, or serve a page with a 410 HTTP status code.

For example, in PHP, you can use either of these to make the b=9 parameter inaccessible:

<?php
// Return 410 Gone when the unwanted b=9 parameter is requested.
if (isset($_GET['b']) && $_GET['b'] === "9") {
    header("HTTP/1.1 410 Gone", true);
    echo "This page is gone.";
    exit();
}
?>

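<?php
// Permanently redirect requests carrying b=9 to a page users may access.
if (isset($_GET['b']) && $_GET['b'] === "9") {
    header("HTTP/1.1 301 Moved Permanently", true);
    header("Location: http://example.com/newpage", true);
    echo 'This page moved <a href="http://example.com/newpage">here</a>';
    exit();
}
?>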

If you specifically want to block only robots and not real users, you could make the parameter accessible via POST only. Here's the HTML and PHP you can use:

HTML:

<form action="phpscript.php" method="POST">
<input type="hidden" name="b" value="9">
<input type="submit" value="special page">
</form>

PHP file named phpscript.php:

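<?php
// Serve 410 Gone when b=9 arrives in the query string of a non-POST request.
// The value submitted by the form above is available in $_POST['b'].
if (isset($_GET['b']) && $_GET['b'] === "9"
    && strtoupper($_SERVER['REQUEST_METHOD']) !== "POST") {
    header("HTTP/1.1 410 Gone", true);
    echo "This page is gone";
    exit();
}
?>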

The only problem with the POST approach is that responses to POST requests are generally not cacheable, since POST is primarily meant for submitting user data.

@Miguel251

The answer is in the link I posted:

Disallow: /shopping/*/*?b=9

* is a wildcard that matches any sequence of characters.

@Alves908

robots.txt uses prefix matching, so a rule like Disallow: /?b=9 blocks only URLs whose path starts with /?b=9. Your URLs start with /shopp..., so they are not blocked.

However, you can use * (a wildcard matching 0 or more instances of any character) to represent the first part of the URL. This is an extension to the original "standard", but the main search engine bots ("Google, Bing, Yahoo, and Ask") support it:

Disallow: /*/?b=9

The above should block /shopping/books/?b=9 and /<anything>/?b=9.
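
To see why the original rule fails and the wildcard rule works, here is a minimal Python sketch of the matching behaviour described above. The function name rule_matches is hypothetical. The sketch assumes Google-style semantics (prefix matching, * for any run of characters, and an optional trailing $ that anchors the end of the URL). It is an illustration only, not any crawler's actual implementation.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query.

    Hypothetical helper, assuming Google-style wildcard semantics.
    """
    # A trailing "$" means the pattern must match up to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then turn each escaped "*" into ".*".
    regex = re.escape(pattern).replace(r"\*", ".*")
    if anchored:
        regex += "$"
    # re.match anchors only at the start, which gives prefix matching.
    return re.match(regex, path) is not None


url = "/shopping/books/?b=9"
print(rule_matches("/?b=9", url))              # the rule from the question
print(rule_matches("/*/?b=9", url))            # wildcard rule
print(rule_matches("/shopping/*/*?b=9", url))  # rule limited to /shopping/
```

Because matching is by prefix, /*/?b=9 would also match /shopping/books/?b=90 under these assumptions. If only the exact value should be blocked, a trailing $ (as in /*/?b=9$) restricts the rule to URLs that end there.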

Reference: developers.google.com/webmasters/control-crawl-index/docs/robots_txt?hl=en#url-matching-based-on-path-values
