
Need help with mybb forum robots.txt file

@Berryessa370

Posted in: #Forum #RobotsTxt #Subdomain #WebCrawlers

I have installed the MyBB forum on a subdomain of my site.
The link to my site is here.

But when I search Google, even the pages that are disallowed show up in the index. What's wrong with the robots.txt file?

User-Agent: *
Disallow: /captcha.php
Disallow: /editpost.php
Disallow: /member.php
Disallow: /misc.php
Disallow: /modcp.php
Disallow: /moderation.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /ratethread.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendthread.php
Disallow: /task.php
Disallow: /usercp.php
Disallow: /usercp2.php
Disallow: /calendar.php
Disallow: /*action=emailuser*
Disallow: /*action=nextnewest*
Disallow: /*action=nextoldest*
Disallow: /*year=*
Disallow: /*action=weekview*
Disallow: /*sort=*
Disallow: /*order=*
Disallow: /*mode=*
Disallow: /*datecut=*


@Megan663

The best way to diagnose problems with your robots.txt file is to use Google Webmaster Tools, which has a tool specifically meant for this kind of issue. Once you are in GWT, navigate to:

Crawl → Blocked URLs

Once there, it will load your robots.txt file into the upper box. You can test changes by editing it there (though the edits won't be saved back to your site, of course). The lower box takes a list of URLs you want to test. Enter some URLs that you want blocked but which are currently indexed, and test them.
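If you want a quick local sanity check as well, the short Python sketch below (an illustration only; the sample rules and test paths are made up, and Google's own tester remains authoritative) translates a few Disallow patterns into regular expressions using Google's wildcard conventions, where * matches any run of characters and a trailing $ anchors the end of the URL, and checks some example paths against them.

import re

# A handful of Disallow rules like the ones in the robots.txt above (hypothetical sample).
disallow_rules = [
    "/member.php",
    "/search.php",
    "/*action=emailuser*",
    "/*sort=*",
]

def rule_to_regex(rule):
    # Escape the rule, then restore the robots.txt wildcards:
    # "*" becomes ".*" and a trailing "$" anchors the end of the URL.
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.compile("^" + pattern)

def is_blocked(path):
    # A path is blocked if any Disallow rule matches it from the start.
    return any(rule_to_regex(rule).search(path) for rule in disallow_rules)

# Example paths, relative to the subdomain root (where the robots.txt has to live).
for path in ["/member.php?action=profile",
             "/showthread.php?tid=1&sort=desc",
             "/showthread.php?tid=1"]:
    print(path, "->", "blocked" if is_blocked(path) else "allowed")

The first two paths print "blocked" and the last prints "allowed", which is what the GWT tester should report for those rules.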



It is possible that nothing is wrong with your robots.txt file.

If you recently added rules to your robots.txt file, it may take a month or more for the pages to be removed from Google's index. Google doesn't de-index pages the instant you block them; it won't de-index a page until the next time it tries to crawl it. If you would like them removed sooner, you can request that the pages be removed from the index. In GWT, navigate to:

Google Index → Remove URLs

According to Google's removal requirements, the pages must be blocked (for example, by robots.txt) before they can be removed.



robots.txt does not always prevent Google from including a page in the index. In some cases, when a page has enough external links, Google may include it anyway and use the anchor text of the inbound links to determine what it is about. In such cases, you can keep the page out of the index by allowing Googlebot to crawl it and putting a meta robots noindex tag in the head.
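For reference, the tag in question is the standard robots meta tag, placed inside the page's <head> (the page must not be blocked by robots.txt, or Googlebot will never see the tag):

<meta name="robots" content="noindex">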


