
How to crawl a website that requires cookies for an audit?

@Hamaas447

Posted in: #Googlebot #Seo #WebCrawlers

Situation: My client's website requires cookies to access it.
Users must choose a language and country before they can reach any content.

The problem is: whenever I try to crawl the website with any software (DeepCrawl or Screaming Frog), the crawler receives the same language-and-country selection page for every URL.

Question: How can I let the crawler bypass this step, or select a language and country itself, so it can access the website?





2 Comments


 

@Carla537

You need to use a crawler with a cookie jar. Here's one I wrote some time ago that can log in to a site and keep its cookies. You didn't mention a language; this one is PHP with MySQL or Oracle.
github.com/Pamblam/Crawler
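The same cookie-jar idea can be sketched in Python with only the standard library. This is a minimal illustration, not the linked crawler: the cookie names `language` and `country`, their values, and the `example.com` domain are all assumptions — substitute whatever the client's site actually sets.

```python
import http.cookiejar
import urllib.request

# One cookie jar shared by every request the crawler makes.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

def make_cookie(name, value, domain):
    """Build a cookie by hand, as if the site's chooser page had set it."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )

# Pre-seed the jar so the site skips its language/country chooser.
# 'language' and 'country' are hypothetical cookie names.
jar.set_cookie(make_cookie("language", "en", "example.com"))
jar.set_cookie(make_cookie("country", "US", "example.com"))

# Every opener.open(url) call now sends the matching cookies automatically,
# and any Set-Cookie headers in responses are stored back into the jar:
# for url in urls_to_crawl:
#     html = opener.open(url).read()
```

The point of the jar is persistence: whether you seed the cookies by hand (as above) or submit the chooser form once through the opener, every later request in the crawl carries them along.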



 

@Turnbaugh106

Search engine bots such as Googlebot do not keep cookies between requests, so if your content is only visible after a cookie is set, it is not crawlable. You need to ensure the website is reachable without cookies.

The simple solution is to do a cookie check before serving the 'choose language' page: if no cookie is present, serve the most popular version of your website, and for the other languages add rel="alternate" hreflang="en-XXX" links in the head so Google indexes all versions.
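A minimal sketch of that cookie check as a server-side function. Everything here is assumed for illustration: the cookie name `locale`, the `example.com` URLs, and the set of locales — only the shape (cookieless visitors get the default version plus hreflang links, never the chooser) comes from the advice above.

```python
# Default (most popular) version served to cookieless visitors,
# including crawlers like Googlebot.
DEFAULT_LOCALE = "en-us"

# Hypothetical alternate versions of the site; hreflang codes are
# language-country pairs as Google expects.
ALTERNATES = {
    "en-us": "https://example.com/en-us/",
    "fr-fr": "https://example.com/fr-fr/",
    "de-de": "https://example.com/de-de/",
}

def hreflang_links():
    """Head links announcing every language/country version to search engines."""
    return "\n".join(
        f'<link rel="alternate" hreflang="{code}" href="{url}">'
        for code, url in sorted(ALTERNATES.items())
    )

def choose_version(cookies):
    """Serve the visitor's chosen locale if the cookie is valid, else the default."""
    locale = cookies.get("locale", "")
    return locale if locale in ALTERNATES else DEFAULT_LOCALE

def render_page(cookies):
    """Always return real content; the chooser is never the only response."""
    locale = choose_version(cookies)
    return (f"<html><head>{hreflang_links()}</head>"
            f"<body>Content for {locale}</body></html>")
```

With this shape, a crawler that sends no cookies gets the en-us content and discovers the other versions through the hreflang links, while human visitors who picked a locale keep getting their choice.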


