How did Google manage to crawl my 403 pages?
I had a couple of private files in a directory in my school web folder. You could see that the files existed by going to myschool.edu/myusername/myfolder, but trying to access the files themselves via myschool.edu/myusername/myfolder/myfile.html returned a 403 error.
And yet Google somehow managed to grab the contents of those private files and store them in its cache! How is this possible? [I've since removed those files, so I'm just curious how Google managed to do this.]
1 Comment
The most likely reason is that those pages were not actually returning a 403 status code in the response.
You can check that using the Web Developer Toolbar in Firefox or Chrome. The tool is located under "Information" -> "View Response Headers".
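If you prefer to check from a script rather than a browser toolbar, the same check can be sketched in Python with only the standard library. This is a rough illustration, not a definitive tool: the function names fetch_status and classify and the User-Agent string are made up for this example. It sends a single GET without following redirects, so a 301 that leads to an error page is reported as 301 rather than hidden.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, user_agent="status-check-example"):
    """Return (status_code, headers) for one GET request, without following redirects.

    The function name and the default User-Agent are hypothetical, chosen for this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return response.status, dict(response.getheaders())
    finally:
        conn.close()


def classify(status):
    """Roughly describe what a crawler would see for a given status code."""
    if status in (401, 403):
        return "blocked"
    if 300 <= status < 400:
        return "redirect"
    if 200 <= status < 300:
        return "served"
    return "other"
```

For example, calling fetch_status("http://myschool.edu/myusername/myfolder/myfile.html") (the http:// scheme is assumed here) and passing the returned status to classify would show whether the file is really "blocked" or is in fact being "served" or redirected.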
Also, the way I create my error pages is:
I create some dummy error page. Let's say 403.php.
I create an actual error page. For example error403.php.
On the dummy error page, I put the following code: <?php header("Location: /error403.php", true, 301); exit; ?>
In my .htaccess, I put the following:
Options -Indexes
ErrorDocument 403 /403.php
This sets up the redirects properly and makes sure I'm getting some link juice from my error pages.
This can actually be extended in an extremely cool way if your website has a search engine which uses GET requests.