How does Googlebot crawl a webpage?

@Bryan171

Posted in: #Googlebot #WebCrawlers

Today I had a silly fight with my colleague. My friend told me, "Googlebot crawls a web page only through the links present in the webpage, or through the number of page views (via text ads)." But I argued, "Googlebot can crawl a webpage even if we don't submit a sitemap.xml in Google Webmaster Tools and don't have any links pointing to the page. That is, Googlebot can crawl the page without any external push; it would automatically visit the page and crawl it."

What's the exact answer? Please inform me of any other factors involved.




3 Comments


@BetL925

A sitemap.xml helps with indexing by telling Googlebot which URLs on your website to crawl, but if a website doesn't have a sitemap.xml, Googlebot can of course still crawl it.
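For reference, a sitemap.xml (per the sitemaps.org protocol) is just an XML list of URLs with optional metadata; the example.com URL and date below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2016-01-01</lastmod>
  </url>
</urlset>
```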

Googlebot follows links across the internet, and if it finds a dofollow link to one of your webpages, it follows it and crawls that page (and possibly the rest of your website).
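The link-following discovery described above can be sketched in a few lines of Python. This is only a toy illustration of the idea (extract every href from a fetched page and add it to the crawl queue), not how Googlebot is actually implemented:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags, the way a crawler
    discovers new URLs to visit from a page it has fetched."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A hypothetical fetched page with two outgoing links.
page = '<html><body><a href="/about">About</a> <a href="https://example.com/blog">Blog</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # the URLs a crawler would enqueue next
```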


@Nimeshi995

Sitemap Confusion

Many people believe that a sitemap is required for a site to be discovered. This is not true: a sitemap's purpose is to help Google make sure your pages are crawled correctly. Google has many discovery methods that extend well beyond a sitemap.

DNS Collective

When a new domain is registered, its records become obtainable through DNS servers. Google even runs its own public DNS service at 8.8.8.8 and 8.8.4.4, so new domains will surely pass through infrastructure Google controls. While I can't find an official statement on this, I'm dead certain they have something in place.

Unstoppable Text References and Backlinks

Newly registered domains often appear on lists all over the web. Some sites even publish WHOIS data whenever anyone looks up a domain: the lookup gets saved to a page, that page gets crawled, and so on.

Deep Pages

These can't be discovered unless something references them, whether a text mention or a link.


@Looi9037786

Google doesn't guarantee that it will crawl your website, so waiting on Googlebot to crawl, let alone index, your site is a dice roll. You should set up a sitemap.xml file in Google Webmaster Tools; your website will be indexed much faster because you are making it known to Google that your website exists.

Without links from other webpages to your website, Googlebot may never find your page and crawl it.

Stick with a sitemap.xml file and tell Google about your site.
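Besides submitting it in Webmaster Tools, the sitemaps.org protocol also lets you point crawlers at your sitemap from your robots.txt file; the URL below is a placeholder:

```
Sitemap: https://www.example.com/sitemap.xml
```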
