Robots.txt Disallow command
How do I use robots.txt to disallow folders which are being crawled due to a bad URL structure? They're currently causing a duplicate page error.
The URL that has been crawled incorrectly is:
abc.com/forum/index.php?option=com_forum
However, the correct page is:
abc.com/index.php?option=com_forum
Is robots.txt a correct way of excluding this? I'm thinking about using:
Disallow: /forum/
Won't that also block legitimate content in the /forum/ folder of my site?
1 Answer
If abc.com/forum/index.php?option=com_forum is the only URL that's been crawled incorrectly, then you can just add Disallow: /forum/index.php to your robots.txt file. Adding Disallow: /forum/ will tell robots to ignore everything in that directory, which doesn't sound like what you want.
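For example, a minimal robots.txt in the site root might look like this (the wildcard user-agent is an assumption here, since the question doesn't say which crawlers to target):

    User-agent: *
    Disallow: /forum/index.php

This rule blocks any URL whose path begins with /forum/index.php, including query-string variants like /forum/index.php?option=com_forum, while leaving the rest of /forum/ crawlable.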
You could also add a 301 redirect from abc.com/forum/index.php?option=com_forum to abc.com/index.php?option=com_forum to tell robots what the URL should be. This will also help any users who accidentally land on the wrong URL.
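As a sketch, assuming the site runs on Apache with .htaccess enabled (the question doesn't say which server is in use), a single mod_alias line in the site root's .htaccess would do it, and the original query string is carried over to the new URL automatically:

    Redirect 301 /forum/index.php /index.php

One caveat: if you both disallow the path in robots.txt and add the redirect, crawlers that obey robots.txt will never fetch the old URL and so will never see the redirect, so it's best to pick one approach or the other.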