How to delete URL parameters from Webmaster Tools via the .htaccess file?
My WordPress site does not have much content, yet I recently faced a problem of high CPU and bandwidth usage: within seconds it hits 100% and the server goes down. After a lot of analysis I found that Webmaster Tools reports an indexed status of about 264,023 URLs, and under URL Parameters more than 24,981,662 URLs are monitored for individual parameters. That is insane. I used the filtering option, and after finding the problem I set those options to noindex with the Yoast plugin and edited the parameters. But there is no change in the index status; it is increasing day by day. So I want to permanently noindex those parameters in Webmaster Tools and also delete them. How can I do that through the .htaccess file? That should decrease the total indexed number in WMT.
Here are the indexed URL parameters from WMT:
It looks like you should probably be blocking these URLs (with URL parameters) in your robots.txt file, to prevent search engine bots (i.e. Googlebot) from crawling these URLs in the first place. For example, to block all URLs with query strings:
User-agent: *
Disallow: /*?
Within Google Search Console (formerly Webmaster Tools) you can also explicitly tell Google how to handle each URL parameter, under Crawl > URL Parameters. For example, your filter_display parameter might be defined as:
Does this parameter change page content seen by the user?
"Yes: Changes, reorders or narrows page content"
How does this parameter affect page content?
"Narrows"
Which URLs with this parameter should Googlebot crawl?
"No URLs" (or perhaps "Let Googlebot decide" if you trust Google, given the previous options)
How can i do that through .htaccess file?
You mentioned in comments that these URL parameters are "not important". However, they do look like they provide some user features (e.g. filtering, sorting, ...)? In which case, you probably don't want to use .htaccess. With .htaccess you could canonicalise the URL and redirect URLs that carry these URL parameters. This would completely remove these URL parameters from your site, which could even break your site's functionality.
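If you did decide to go that route anyway, a minimal sketch (assuming an Apache server with mod_rewrite enabled and .htaccess overrides allowed) that 301-redirects any URL carrying a query string back to its bare, canonical URL might look like this. Again, this strips all parameters site-wide and would break the filtering/sorting features those parameters provide:

```apache
# Assumption: Apache with mod_rewrite enabled and AllowOverride permitting .htaccess
RewriteEngine On
# Match only requests that carry a (non-empty) query string
RewriteCond %{QUERY_STRING} .
# Redirect to the same path; the trailing "?" discards the query string
RewriteRule ^(.*)$ /$1? [R=301,L]
```

The 301 (permanent) redirect signals to search engines that the parameterless URL is the canonical one, so the parameterised URLs should eventually drop out of the index.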
UPDATE: Your robots.txt file (copied from comments):
User-agent: *
Disallow: /*?
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow:
User-agent: *
Allow: /wp-content/uploads/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Disallow: /images/
Disallow: /wp-content/
Disallow: /index.php
Disallow: /wp-login.php
This would not work as intended. You have conflicting groups, i.e. three groups that all match User-agent: *. Bots only process one block of rules: the block whose User-agent line is the "most specific" match, and the User-agent: * block matches any bot that didn't match any other block. From these rules Googlebot will simply crawl everything (unrestricted), including all your URL parameters. If that crawling is causing problems for your server (as you suggest) then this is not what you want. And from these rules I would "guess" that all other bots will match only the first User-agent: * block (Disallow: /*?), although behaviour with duplicate groups is not well defined.
(But even if you adopted different reasoning and assumed multiple blocks could be processed, these rules still wouldn't make sense.)
Depending on your requirements, this should be written something like:
User-agent: *
Disallow: /
User-agent: Googlebot
Allow: /wp-content/uploads/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Disallow: /images/
Disallow: /wp-content/
Disallow: /index.php
Disallow: /wp-login.php
Disallow: /*?
I assume that, if this is a WordPress site, then you don't want even Googlebot to crawl everywhere?
From these rules, all other (good) bots are prevented from crawling your site entirely.
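One caveat: blocking URLs in robots.txt stops future crawling, but it does not remove URLs that are already indexed (Google can keep them indexed without content). If the goal is specifically to get the parameterised URLs deindexed, one option, under the assumption of Apache 2.4+ with mod_headers enabled, is to serve an X-Robots-Tag: noindex header on every URL that has a query string. Note that Googlebot must still be allowed to crawl those URLs to see the header, so this approach conflicts with a Disallow: /*? rule until deindexing is complete:

```apache
# Assumption: Apache 2.4+ (for <If> expressions) with mod_headers enabled
<If "%{QUERY_STRING} != ''">
    # Tell search engines not to index any URL that carries a query string
    Header set X-Robots-Tag "noindex, follow"
</If>
```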