Should robots.txt be in the root directory, or can it be in a subdirectory?

I have a sub-directory that I would like not to be visible to the search engine Web crawlers.

One way to do that is to use a robots.txt in the root directory of the server, but that is something I want to avoid. The reason is that anyone who knows the website URL could read the contents of robots.txt and explore the disallowed directories.

I thought of a way to avoid this.
Let X be the name of the subdirectory that I do not want indexed. One way to stop web crawlers from indexing the X directory, while also making it harder for someone to identify X from the root's robots.txt, is to put the robots.txt in the X directory instead of the root directory.

If I follow this solution, I have the following questions:

Will web crawlers "read" the robots.txt if it is in a subdirectory? (Given that a robots.txt already exists in the root directory.)
If robots.txt is in the X subdirectory, which of these should I use:

User-agent: *
Disallow: /X/

or this

User-agent: *
Disallow: /


@Carla537

No, web crawlers will not read or obey a robots.txt file in a subdirectory. As described on the quasi-official robotstxt.org site:

Where to put it

The short answer: in the top-level directory of your web server.

or on Google's help pages (emphasis mine):

A robots.txt file is a file at the root of your site that indicates those parts of your site you don’t want accessed by search engine crawlers.
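As a rough illustration of how a root-level file is interpreted, here is a minimal Python sketch using the standard library's urllib.robotparser. The host example.com and the directory name X are placeholders taken from the question, not a real site, and the helper name allowed is made up for this example. It suggests that rule paths are matched against the URL path starting from the site root, so a root robots.txt would need Disallow: /X/ to cover only that directory, while Disallow: / blocks everything.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return parser.can_fetch(agent, url)


# The two rule sets from the question, as they would appear in /robots.txt.
only_x = ["User-agent: *", "Disallow: /X/"]
everything = ["User-agent: *", "Disallow: /"]

# Placeholder URLs (assumptions for illustration).
inside_x = "https://example.com/X/page.html"
outside_x = "https://example.com/other/page.html"

print(allowed(only_x, inside_x))      # is a page inside X allowed?
print(allowed(only_x, outside_x))     # is a page outside X allowed?
print(allowed(everything, outside_x)) # is anything allowed under "Disallow: /"?
```

Note that this only models well-behaved crawlers; the file is still publicly readable, which is the concern raised in the question.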

In any case, using robots.txt to hide sensitive pages from search results is a bad idea, since search engines can still index pages disallowed in robots.txt if other pages link to them. Or, as described on the Google help page linked above:

You should not use robots.txt as a means to hide your web pages from Google Search results. This is because other pages might point to your page, and your page could get indexed that way, avoiding the robots.txt file.

So what should you do instead?

You can let search engines crawl the pages (if they find them), but include a robots meta tag with the content noindex,nofollow. This will tell search engines not to index those pages even if they do find links to them, and not to follow any further links from those pages. (Of course, this will only work for HTML web pages.)
For non-HTML resources, you can configure your web server (e.g. using a .htaccess file) to send the X-Robots-Tag HTTP header with the same value (noindex, nofollow).
You can set up password authentication to protect the sensitive pages. Besides protecting the pages from unauthorized human visitors, it will also effectively keep web crawlers away.
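
The header-based option can be sketched in Python as a small WSGI middleware. This is only an illustration under assumptions: the prefix /X/ stands in for the directory from the question, and the names PROTECTED_PREFIX, add_noindex_header, demo_app and headers_for are made up for this example. A real site would more likely set the header in the web server configuration.

```python
PROTECTED_PREFIX = "/X/"  # hypothetical directory to keep out of search indexes


def add_noindex_header(app, prefix=PROTECTED_PREFIX):
    """Wrap a WSGI app so responses for paths under `prefix` carry X-Robots-Tag."""

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            if path.startswith(prefix):
                # Same value as the robots meta tag, but usable for any file type.
                headers = list(headers) + [("X-Robots-Tag", "noindex, nofollow")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """Trivial app standing in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def headers_for(path):
    """Call the wrapped demo app once and return the response headers it sent."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    body = add_noindex_header(demo_app)({"PATH_INFO": path}, start_response)
    list(body)  # consume the response body as a WSGI server would
    return captured["headers"]


print(headers_for("/X/report.pdf"))
print(headers_for("/public/index.html"))
```

In this sketch only the first request, the one under /X/, gets the extra header. For this to have any effect, crawlers must be able to fetch the URL, so the same path should not also be disallowed in robots.txt.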


@Lee4591628

Your robots.txt should be in the root directory and should not have any other name. According to the standard specification:

This file must be accessible via HTTP on the local URL "/robots.txt".
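
To see why a file at /X/robots.txt would never be consulted, here is a small Python sketch of how a crawler might derive the robots.txt location from any page URL. The helper name robots_url and the example.com URLs are assumptions for illustration; actual crawlers implement this in their own way.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the single robots.txt URL that governs `page_url`.

    Only the scheme and host (with port, if any) are kept; the path is
    always replaced by /robots.txt, so a subdirectory never gets its own file.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/X/secret/page.html"))
print(robots_url("https://example.com/index.html"))
```

Both calls produce the same root-level URL, https://example.com/robots.txt, regardless of how deep the page path is.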

More posts by @Jennifer507

: Subdomain not working when nameserver set to different host I have my domain and website both hosted on Godaddy. There was example.com and blog.example.com operational. But now I shifted my hosting

: Find website reviews not on website? Basically I want to see the reviews a website has but I don't want to look at the reviews they put on their own website. I Googled "Websitename review

: Mobile google SERPs return a desktop url instead subdomain mobile url We have desktop and mobile site respectively. www.example.com for desktop and m.example.com for mobile. In Google.com mobile

: Fetch as Google not rendering single-page application (AngularJS) - how to show console logs output I have an SPA application, and it is not being rendered correctly by Google. I tried going

Login to post a comment!

2 Comments

Back to top | Use Dark Theme