
Good tool to crawl my site and help me find dead links and unlinked files

@Becky754

Posted in: #DeadLinks #SiteMaintenance #WebCrawlers

I have a pretty big legacy site with literally thousands of PDFs that are sometimes accounted for in a database, but often are just links on a page, and are stored in almost every directory on the site.

I have written a PHP crawler to follow all the links on my site, and I am comparing the results against a dump of the directory structure, but is there something easier?
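
For context, what I've cobbled together boils down to something like the sketch below; the domain, document root, and PDF-only focus are placeholders rather than my real setup:

<?php
// Rough sketch: crawl same-site pages, collect linked PDF paths,
// then diff against the PDFs actually sitting on disk.
$base    = 'https://example.com';   // placeholder site root
$docRoot = '/var/www/example';      // placeholder document root
$queue   = ['/'];
$seen    = [];
$linked  = [];

while ($queue) {
    $path = array_shift($queue);
    if (isset($seen[$path])) continue;
    $seen[$path] = true;

    $html = @file_get_contents($base . $path);
    if ($html === false) { echo "DEAD LINK: $path\n"; continue; }

    $dom = new DOMDocument();
    @$dom->loadHTML($html);                 // suppress warnings on messy legacy markup
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = parse_url($a->getAttribute('href'), PHP_URL_PATH);
        if (!is_string($href) || $href === '' || $href[0] !== '/') continue; // same-site, absolute paths only
        $href = rawurldecode($href);        // so URL paths compare against disk paths
        if (preg_match('/\.pdf$/i', $href)) {
            $linked[$href] = true;          // PDF link found on a page
        } elseif (!preg_match('/\.\w+$/', $href) || preg_match('/\.(html?|php)$/i', $href)) {
            $queue[] = $href;               // looks like a page, crawl it too
        }
    }
}

// PDFs on disk, expressed as site-relative paths
$onDisk = [];
$it = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($docRoot));
foreach ($it as $file) {
    if (strtolower($file->getExtension()) === 'pdf') {
        $onDisk[substr($file->getPathname(), strlen($docRoot))] = true;
    }
}

echo "Unlinked PDFs on disk:\n";
foreach (array_diff_key($onDisk, $linked) as $p => $_) echo "  $p\n";
echo "Linked PDFs missing from disk:\n";
foreach (array_diff_key($linked, $onDisk) as $p => $_) echo "  $p\n";

It works, but it's slow and I have to keep special-casing things, hence the question.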





6 Comments


 

@Vandalay111

Link Examiner is a really good freeware tool for your needs.



 

@Nimeshi995

There are several products from Microsys, especially their A1 Sitemap Generator and A1 Website Analyzer, that will crawl your website and report everything you can possibly imagine about it.

That includes broken links, but also a table view of all your pages so you can compare things like identical <title> and meta description tags, nofollow links, meta noindex on pages, and a whole lot of other ailments that just need a sharp eye and a quick hand to fix.



 

@Cugini213

Try W3C's open source tool Link Checker. You can use it online or install it locally.



 

@Mendez628

If you are using Windows 7, the best tool is IIS7's SEO Toolkit 1.0, and it is a free download.

The tool will scan any site and tell you where all of the dead links are, which pages take too long to load, which pages have missing or duplicate titles (and the same for keywords and descriptions), and which pages have broken HTML.



 

@Dunderdale272

I'm a big fan of linklint for link-checking large static sites, if you have a Unix command line around (I've used it on Linux, macOS, and FreeBSD). See their site for installation instructions. Once installed, I create a file called check.ll and run:

linklint @check.ll


Here's what my check.ll file looks like:

# linklint
-doc .
-delay 0
-http
-htmlonly
-limit 4000
-net
-host example.com -timeout 10


That does a crawl of example.com and generates HTML files with cross-referenced reports for what is broken, missing, etc.



 

@Murray432

I've used Xenu's Link Sleuth. It works pretty well; just be sure not to DoS yourself!


