If a top-level directory in a URL path returns a 404 error, will its subpages still be crawled?
I found a site on which one of the directories in the URL path returns a 404 error, but the subpages are still live.
Example URL: shop.edurite.com/blogs/news/summer-activities
In this URL:
shop.edurite.com/blogs --> this leads to a 404 error
But:
shop.edurite.com/blogs/news/summer-activities --> this is a live page
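You can verify the behavior quickly with an HTTP status check. Here is a minimal sketch using Python's standard library; the expected codes in the comments just reflect what I described above:

    # Compare the HTTP status of the parent directory vs. the child page.
    import urllib.request
    import urllib.error

    urls = [
        "https://shop.edurite.com/blogs",                         # described as 404
        "https://shop.edurite.com/blogs/news/summer-activities",  # described as live
    ]

    for url in urls:
        try:
            with urllib.request.urlopen(url) as resp:
                print(url, "->", resp.status)
        except urllib.error.HTTPError as e:
            # urlopen raises HTTPError for 4xx/5xx; e.code is the status.
            print(url, "->", e.code)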
My Questions:
1) Will this URL be crawled by Google without any glitch?
2) Would this be considered bad SEO? If yes, why?
Yes, the link will be crawled by Google. And as I have checked, shop.edurite.com/blogs/news/ is the main category URL, and it is working fine too.
This was always a concern for us when building websites. While I don't think we faced many issues with pages not being indexed properly, we did feel that an imperfect site structure like this might affect:
Crawler navigation
Domain credibility
Building a clean breadcrumb trail
Because the above were assumptions/feelings rather than proven facts, I can't say for sure.
One thing it definitely does affect, though, is user navigation. Users expect URLs to work a certain way, and without the mid-level pages it becomes difficult to understand a site's navigation.
Google doesn't crawl directories; it's not as if Google runs an ls command and gets a listing of every directory on your site. Google just follows links, everywhere:
Internal/external links - For example, if I link to stackexchange.com/blog/some-page/, Google will crawl only that page (/some-page/); it will not crawl the /blog/ page, because it is not linked anywhere. If it is linked somewhere, then Google will crawl that URL as well. So Google crawls every webpage on the internet just by following links.
Sitemap - A sitemap also contains links. If Google relied only on internal/external links, perhaps more than 10% of the webpages on the internet would never get indexed. So Google suggests that webmasters create a sitemap listing all of their pages; Google checks it at regular intervals and indexes any new URLs listed there.
Google submit URL - There are also tools in Google Search Console where you can submit your links, and Google will crawl them.
So the whole internet is crawled through links. Crawlers don't probe for sub/child pages, and we can't even know how many directories exist on a given website, since they may have different view permissions. (A minimal sketch of link-following crawling is below.)
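To make the "crawling is just link following" point concrete, here is a minimal sketch of a link-following crawler in standard-library Python. It only illustrates the idea, not how Googlebot actually works; the seed URL and page limit are arbitrary:

    # Discovery works by extracting <a href> links from fetched pages,
    # not by listing directories on the server.
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(seed, max_pages=10):
        queue, seen = [seed], set()
        while queue and len(seen) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                html = urlopen(url).read().decode("utf-8", errors="replace")
            except Exception:
                continue  # a 404 parent like /blogs is skipped, not fatal
            parser = LinkExtractor()
            parser.feed(html)
            # Only linked URLs ever enter the queue; an unlinked
            # directory page is simply never discovered.
            queue.extend(urljoin(url, link) for link in parser.links)
        return seen

    # Example (illustrative seed): crawl("https://shop.edurite.com/blogs/news/")

Note that a page like /blogs returning 404 just gets skipped; it does not stop the crawler from reaching /blogs/news/summer-activities through other links.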
Will this URL be crawled by Google without any glitch?
If it is linked, it will be crawled by Googlebot. Crawling does not depend on child/sub-child pages. However, if the /blogs/ directory is blocked in robots.txt, then none of its child pages will be crawled; otherwise, everything will be crawled.
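For example, a rule like Disallow: /blogs/ would block all of its child paths too. Here is a small check using Python's standard-library robot parser; the rule shown is hypothetical, not the site's actual robots.txt:

    from urllib.robotparser import RobotFileParser

    # parse() accepts the robots.txt body as a list of lines, so we can
    # test a hypothetical rule without fetching anything.
    rp = RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /blogs/",
    ])

    url = "https://shop.edurite.com/blogs/news/summer-activities"
    print(rp.can_fetch("Googlebot", url))  # False: blocked by Disallow: /blogs/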
Would this be considered bad SEO? If yes, why?
No, it's not bad for the website as a whole. If that page is not crawled by Googlebot, that URL won't appear in search results, so you won't get any search traffic to that page only; other indexed pages can still get traffic.