How does Googlebot crawl a webpage?
Today I had a silly argument with my colleague. My colleague told me, "Googlebot crawls a web page only through the links present in the webpage or through the number of page views (from text ads)." But I argued, "Googlebot can crawl the webpage even if we don't submit a sitemap.xml in Google Webmaster Tools or don't have any links in the page. That is, Googlebot can crawl the page without any external prompting; it would automatically visit the page and crawl it."
What's the exact answer? Please inform me of any other factors involved.
3 Comments
A sitemap.xml helps a website get indexed by telling Googlebot which pages to crawl, but if a website doesn't have a sitemap.xml, Googlebot can of course still crawl it.
Googlebot follows links on the internet, and if it finds a dofollow link to one of your webpages, it follows that link and crawls the webpage (and maybe the rest of your website).
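To make the link-following idea concrete, here is a minimal sketch in Python of how a crawler might pull followable links out of a page. It is only an illustration, not how Googlebot is actually implemented: the `LinkCollector` class, the `discover_links` helper, and the example.com URLs are all made up for this example, and it assumes a link counts as "dofollow" simply when its `rel` attribute does not contain `nofollow`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect <a href> targets that are not marked rel="nofollow"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def discover_links(html, base_url):
    """Return the absolute URLs a link-following crawler could visit next."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content, for illustration only.
sample_html = """
<a href="/about.html">About</a>
<a href="https://example.org/page" rel="nofollow">Sponsored</a>
<a href="https://example.com/blog/">Blog</a>
"""
found_links = discover_links(sample_html, "https://example.com/")
print(found_links)
```

A real crawler would then fetch each discovered URL and repeat the process, which is why a single link from an already-crawled page can be enough for a new page to be found.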
Sitemap Confusion
Many people believe that a sitemap is required for a site to be discovered. This is not true: a sitemap's purpose is to help Google crawl the site's data correctly. Google has many discovery methods that extend well beyond a sitemap.
DNS Collective
When a new domain is registered, its records become obtainable through DNS servers. Google even runs its own public DNS service at 8.8.8.8 and 8.8.4.4, so new domains will surely show up there automatically. I can't find an official statement on this, but I'm confident they have something in place to make use of it.
Unstoppable Text References and Backlinks
Newly registered domains often appear on lists all over the world, and some sites even publish whois data whenever anyone runs a whois lookup on a domain. The site saves the result to a page, for example, and Googlebot then crawls that page for more data.
Deep Pages
These can't be discovered unless there is some reference to them, such as a text mention or a link.
Google doesn't guarantee that it will crawl your website, so waiting for Googlebot to crawl it, let alone index it, is a roll of the dice. You should set up a sitemap.xml file and submit it in Google Webmaster Tools. Your website will likely be indexed much faster, because you are letting Google know that it exists.
Without links from other webpages to your website, Googlebot may never find your page and crawl it.
Stick with a sitemap.xml file and tell Google about your site.
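For anyone who has never written one, a sitemap.xml is just a short XML file listing your page URLs. Below is a rough sketch in Python that builds a minimal one using the standard sitemaps.org namespace. The `build_sitemap` function and the example.com URLs are placeholders invented for this example, and optional fields such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return a minimal sitemap.xml document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, for illustration only.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/about.html",
])
print(sitemap_xml)
```

You would save the output as sitemap.xml at the root of your site and then submit its URL in Google Webmaster Tools.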