@Jamie184

I see you answered your own question, but as you point out, "Disallow: /*?" is the source of your problems. The "*" is a wildcard, used like a regular expression pattern, which basically means any STRING of text of ANY length; "*?" means the same thing but limits the match to the SHORTEST possible string, and in the case of robots.txt I can't think of a way the "*?" expression would have any meaning.
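
As a minimal sketch of how a crawler that supports the wildcard extension (Googlebot does) reads "*" (the directory name "private" below is only a placeholder):

    User-agent: *
    Disallow: /private*

    # Blocks /private, /private/, /private-data.html and so on,
    # because "*" matches any string of characters.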

Reading ROBOTS.TXT

The Disallow lines list the paths (pages or directories) you want to block.

The User-Agent line names the crawlers that the Disallow rules following it apply to.
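
A minimal example of that structure; the directory name "example-dir" is only a placeholder:

    User-agent: *
    Disallow: /example-dir/

    # "User-agent: *" makes the group apply to every crawler,
    # and the Disallow line blocks everything under /example-dir/.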

Errors in your ROBOTS.TXT

(1) The use of "*" in "/*/" may or may not be correct, but all the "Disallow: /INSERT_XYZ/*" lines are wrong; all you need is "Disallow: /INSERT_XYZ/".

(2) "Disallow: /*?" should be "Disallow: /" since the reference is to directories, not agents; with "User-Agent: *" that's correct, and "User-Agent: /" would be wrong. But since you want your site crawled in part, remove it.

(3) All the "Disallow: /INSERT_XYZ" lines should probably be "Disallow: /INSERT_XYZ/" if they refer to a directory. A corrected sketch follows this list.
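
Putting those three points together, a hedged sketch of the corrected file, keeping the "INSERT_XYZ" placeholder since the real directory names aren't shown here:

    User-agent: *
    Disallow: /INSERT_XYZ/

    # No "Disallow: /*?" line, no trailing "*", and a trailing "/"
    # on each directory you want kept out of the crawl.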

Google's webmaster documentation for robots.txt is here.

NOTE: You should also Google these meta-tags: noindex, nofollow, noarchive, nocache
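
For reference, a robots meta tag goes in the page's <head> and looks like this; combine only the directives you actually need:

    <meta name="robots" content="noindex, nofollow">

The other directives mentioned above (noarchive, nocache) would be added to the same comma-separated content list.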
