How is crawler seeing unlinked directories / files?
I'm running a crawler on my website to test for broken links and such.
It starts by using a URL like domain.com
One curious thing is that it is showing directories with no internal links. For example, directory /example_dir/ is showing up in the crawl tree, but I can't find any internal link to that directory within the pages.
How could this be happening and is there a way to prevent it?
More posts by @Sarah324
2 Comments
My guess is that Jon is right: you must have a link somewhere. It might not be visible on the page, but the spider is finding it.
Don't forget that code like this can happen: <a href="/my_dir/"></a>. Although it's invisible to the user, the spider will still follow it.
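If you want to hunt for anchors like that yourself, here is a rough sketch in Python using only the standard library. The names `EmptyAnchorFinder` and `find_empty_anchors` are made up for this example, and it assumes you already have the page's HTML as a string. It treats a link as "empty" when it contains neither text nor an image, which is a simplification (it won't catch links hidden with CSS).

```python
from html.parser import HTMLParser


class EmptyAnchorFinder(HTMLParser):
    """Collect href values of <a> tags that have no text and no image inside."""

    def __init__(self):
        super().__init__()
        self.empty_hrefs = []
        self._href = None        # href of the <a> currently open, if any
        self._has_content = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._has_content = False
        elif tag == "img" and self._href is not None:
            # An image makes the link visible even without text.
            self._has_content = True

    def handle_data(self, data):
        # Only non-whitespace text inside an open <a href> counts.
        if self._href is not None and data.strip():
            self._has_content = True

    def handle_endtag(self, tag):
        if tag == "a":
            if self._href is not None and not self._has_content:
                self.empty_hrefs.append(self._href)
            self._href = None


def find_empty_anchors(html):
    """Return the hrefs of all empty anchors found in the given HTML string."""
    finder = EmptyAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_hrefs
```

Calling `find_empty_anchors(page_html)` on each page's source gives you the targets a visitor can't see but a spider would still follow.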
What tool are you using to crawl your site?
Crawlers typically find new pages by following links, so the odds are you have a link pointing to those directories. It may not be intentional, such as a dynamic link that is pulling up bad data but not throwing an error. If you aren't using Xenu's Link Sleuth, I recommend it, as it will tell you which pages contained the links that led it to crawl those directories.
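To illustrate the "which page linked to this?" idea, here is a minimal Python sketch. It is not how Xenu works internally, just the same concept: for every link target, remember the pages that pointed to it. The function name `build_referrer_map` and the `pages` dictionary (page URL mapped to its HTML source) are assumptions for the example; fetching the pages is left out.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page, visible or not."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def build_referrer_map(pages):
    """Map each absolute link target to the sorted list of pages linking to it.

    pages: dict of page URL -> HTML source (hypothetical input for this sketch).
    """
    referrers = defaultdict(set)
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        collector.close()
        for href in collector.hrefs:
            # Resolve relative links against the page they appear on.
            target = urljoin(page_url, href)
            referrers[target].add(page_url)
    return {target: sorted(sources) for target, sources in referrers.items()}
```

Looking up the unexpected directory in the result, for example `build_referrer_map(pages).get("http://domain.com/example_dir/")`, shows which pages contain the link that led the crawler there.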