Meaning of Crawl errors
My question is about the definition of Crawl errors in Google Webmaster Tools. Crawl errors are divided into several sections.
Let's first consider the HTTP section.
I assume that all broken links in this section were found by the crawler itself and are not links from the sitemap. If all these links were found by scanning the pages listed in the sitemap for links, why doesn't this section show the source page, the way the Sitemap section does with its Linked From column? Please correct me if I am wrong.
Sitemap section.
It looks like all those links came from my sitemap. There is a Linked From column, but I already know that all those broken links are from the sitemap, so in order to fix the errors I should revise my sitemap. Am I wrong?
Not followed section.
I don't know what this means. It looks like it collects all links that caused a redirect, but for some reason Google considers all those redirects wrong. Do you know whether there is a set of rules for determining what counts as a wrong redirect? Actually, I found where my mistake was: I tried to normalize the URL and redirect to the right URL, but I did the normalization the wrong way.
Not found section.
This section is like the HTTP section, but for 404 errors. It has a Linked From column, but very often Linked From says "unavailable". What does that mean? That Google cannot tell me how it found this non-existent page? How is this section related to the Sitemap section? Does it contain all 404 links from the sitemap too? There are too many 404 links here, many more than in the sitemap. I looked at what Linked From shows and saw that one link came from a sitemap two months ago. But why does Google keep it indexed? The link is already dead, and the new sitemap doesn't contain it. Is there any expiry date for old links?
Unreachable section.
It looks like this section is for 500 errors. It doesn't have a Linked From column. There are many completely meaningless links here; I really don't know where they came from, and without Linked From I am not able to figure out how to deal with them.
Sorry for such a big topic, but I just want to make clear what every section stands for, because that is crucial for dealing with all those problems. Hopefully it will be useful not just for me.
Thanks!
More posts by @Angie530
1 Comments
I think you are mostly correct with your assumptions.
The first HTTP section shows all 4xx errors apart from 404 errors (which are far more common so get their own page). I get 400 (bad request) errors from CodeIgniter disallowing certain characters in URLs. 403 (forbidden) are here too.
The sitemap section just lists any URL in a sitemap that can't be found. The "linked from" column is useful so that, if a page no longer exists, you can remove the links that point to it.
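If you want to check a sitemap yourself before Google reports on it, something like the following might help. It is only a rough sketch, not how Webmaster Tools works internally: the function names (`sitemap_urls`, `broken_sitemap_urls`, `http_status`) are made up for illustration, and it assumes a plain `urlset` sitemap in the standard sitemaps.org namespace rather than a sitemap index.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, List, Tuple

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> URL listed in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def http_status(url: str) -> int:
    """Hypothetical helper: HEAD request, returns the final HTTP status code.

    urlopen follows redirects and raises HTTPError for 4xx/5xx responses,
    so the error's code is returned in that case. Connection failures
    (URLError) are not handled here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def broken_sitemap_urls(
    xml_text: str, fetch_status: Callable[[str], int]
) -> List[Tuple[str, int]]:
    """Return (url, status) pairs for sitemap URLs answering with 4xx or 5xx."""
    broken = []
    for url in sitemap_urls(xml_text):
        status = fetch_status(url)
        if status >= 400:
            broken.append((url, status))
    return broken
```

Calling `broken_sitemap_urls(xml_text, http_status)` would give the list of sitemap entries to remove or fix. The status lookup is passed in as a parameter so the parsing part can be tried without touching the network.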
I've never seen the not followed section, but it sounds like it lists any URL that's linked to with rel=nofollow.
Not found is every 404 error. It will probably include pages from your sitemap, because those should obviously be linked to from your site as well (in other words, a page that is only in the sitemap and not linked from anywhere would not appear here).
Unreachable is 5xx errors like you said, which are server errors. "Linked from" is not shown here because it doesn't matter: no page should ever return a 5xx error.
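To sum up the status-code part of the above, here is a small sketch of how I read the grouping. It reflects my description in this answer, not any documented Google rule, and the function name `crawl_error_section` is just something I made up for the example.

```python
def crawl_error_section(status: int) -> str:
    """Map an HTTP status code to the crawl-error section described above.

    Assumption: 404 goes to "Not found", other 4xx codes go to "HTTP",
    and 5xx codes go to "Unreachable". Anything else is not treated
    as a crawl error by this sketch.
    """
    if status == 404:
        return "Not found"
    if 400 <= status < 500:
        return "HTTP"
    if 500 <= status < 600:
        return "Unreachable"
    return "no error"


for code in (400, 403, 404, 500, 503, 200):
    print(code, crawl_error_section(code))
```

The Sitemap and Not followed sections are left out because they depend on where a URL was found or how it redirects, not only on the status code.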