
How do you configure robots.txt to allow crawling of the site except for a few directories?

@BetL925

Posted in: #RobotsTxt #Seo

What is the best initial or general setup for the robots.txt to allow search engines to go through the site, but maybe restrict a few folders?

Is there a general setup that should always be used?




4 Comments


 

@Yeniel560

The best configuration, if you don't have any special requirements, is nothing at all. (Although you may at least want to add a blank file to avoid 404s filling up your error logs.)
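
If you would rather serve an explicit file than an empty one, a minimal sketch of the allow-everything form looks like this. An empty Disallow value means nothing is blocked, so it should behave the same as a blank file:

```text
User-agent: *
Disallow:
```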

To block a directory on the site, use the 'Disallow' clause:

User-agent: *
Disallow: /example/


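There is also an 'Allow' directive, which re-opens part of a disallowed path. So if you've disallowed the 'example' folder you may wish to allow a folder like 'example/foobar'. 'Allow' was not part of the original robots.txt convention, but the major crawlers support it, and under RFC 9309 the most specific (longest) matching rule wins regardless of the order the lines appear in.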
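
A minimal sketch of that carve-out, assuming the hypothetical folders 'example' and 'example/foobar' mentioned above. The Allow line is placed first because some parsers apply the first matching rule while others pick the longest match, and this order should give the same result under both:

```text
User-agent: *
Allow: /example/foobar/
Disallow: /example/
```

With these rules, a compliant crawler should skip /example/page.html but may still fetch /example/foobar/page.html.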

Remember that robots.txt doesn't prevent anyone from visiting those pages if they want to, so if some pages should remain secret you should hide them behind some kind of authentication (e.g. a username and password).

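The other directive that is likely to be in many robots.txt files is 'Sitemap', which specifies the location of your XML sitemap if you have one. Put it on a line of its own and give the full URL, not a relative path (www.example.com below is a placeholder for your own domain):

Sitemap: https://www.example.com/sitemap.xml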


The official robots.txt site has lots more information on the various options. But in general, the vast majority of sites will need very little config.



 

@Alves908

You can use Google Webmaster Tools to do this. It is very helpful for creating a robots.txt file.



 

@Si4351233

Here's everything you need to know about the robots.txt file



 

@Kristi941

Google Webmaster Tools has a section called "Crawler access".

This section lets you create your robots.txt very easily.

For example, to allow everything except a folder called test, your robots.txt would look something like this (paths are case-sensitive, so the rule must match the folder name exactly):

User-agent: *
Disallow: /test/
Allow: /


