How is crawler seeing unlinked directories / files?

I'm running a crawler on my website to test for broken links and such.

It starts from a URL like domain.com.
One curious thing is that it is showing directories that have no internal links pointing to them. For example, the directory /example_dir/ shows up in the crawl tree, but I can't find any internal link to that directory within the pages.

How could this be happening and is there a way to prevent it?

2 Comments

@Bryan171

My guess is that Jon is right: you must have a link somewhere. It might not show on the page, but the spider is finding it.

Don't forget that code like this can happen: <a href="/my_dir/"></a>. Although it's blank to the user, it will still be followed by the spider.
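
If you want to hunt for links like that, here is a minimal sketch using Python's standard html.parser module. It reports the href of every anchor that has no visible text and no image inside it. The names EmptyAnchorFinder and find_empty_anchors are made up for illustration, and it only looks at the HTML you hand it, so treat it as a starting point rather than a complete checker.

```python
from html.parser import HTMLParser


class EmptyAnchorFinder(HTMLParser):
    """Collect href values of <a> elements with no visible content."""

    def __init__(self):
        super().__init__()
        self.empty_hrefs = []      # hrefs of anchors that render as nothing
        self._href = None          # href of the anchor currently open, if any
        self._has_content = False  # True once the open anchor shows text or an image

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._has_content = False
        elif tag == "img" and self._href is not None:
            # An image inside a link is visible, so count it as content.
            self._has_content = True

    def handle_data(self, data):
        # Whitespace-only text does not count as visible content.
        if self._href is not None and data.strip():
            self._has_content = True

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if not self._has_content:
                self.empty_hrefs.append(self._href)
            self._href = None


def find_empty_anchors(html):
    """Return the hrefs of all empty anchors found in an HTML string."""
    finder = EmptyAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_hrefs


sample = '<p><a href="/my_dir/"></a> <a href="/about/">About</a></p>'
print(find_empty_anchors(sample))  # ['/my_dir/']
```

Run it over the source of each page (templates included) and any href it prints is a link the spider can follow even though a visitor never sees it.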

@Sarah324

What tool are you using to crawl your site?

Crawlers typically find new pages by following links, so the odds are you have a link pointing to those directories. It may not be intentional, such as a dynamic link that is pulling up bad data but not throwing an error. If you aren't using Xenu's Link Sleuth, I recommend it, as it will tell you which pages had the links that led it to crawl those directories.
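
To show the idea of tracing a crawled URL back to the pages that link to it, here is a rough Python sketch. It is not how Xenu works internally, just an illustration. It assumes you already have the HTML of your pages in a dict keyed by page URL (the pages dict and the build_referrer_map name are hypothetical), and it only looks at href attributes on <a> tags.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, visible or not."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def build_referrer_map(pages):
    """Map each link target (absolute URL) to the set of pages linking to it."""
    referrers = defaultdict(set)
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        collector.close()
        for href in collector.hrefs:
            # Resolve relative links against the page they appear on.
            referrers[urljoin(page_url, href)].add(page_url)
    return referrers


# Hypothetical site: the link to /example_dir/ is an empty, invisible anchor.
pages = {
    "http://domain.com/": '<a href="/about/">About</a>',
    "http://domain.com/about/": '<a href="/example_dir/"></a>',
}
referrers = build_referrer_map(pages)
print(sorted(referrers["http://domain.com/example_dir/"]))
# ['http://domain.com/about/']
```

Looking up the mystery directory in the map tells you which page to open and search for the stray link.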
