Disallow JS scripts in robots.txt for Googlebot

@Lengel546

Posted in: #Cloaking #Googlebot #RobotsTxt #SinglePageApplication

To my knowledge, Googlebot is currently fully capable of rendering complex SPA applications, and relying on that capability is generally recommended as a rule of thumb.

The current website is routed server-side and served as static pages. It has no problems with SEO and persists in the Google index.

The upcoming new website has been converted into a quasi-SPA powered by Angular, where all SEO-significant parts are still served as static pages. Client-side routing does PJAX, so every page reachable through the SPA routes corresponds to one served by the server-side router.
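For illustration, a minimal sketch of what "client routes mirroring the server" can look like on the Angular side (the paths and component names below are assumptions for the example, not the actual site):

```typescript
// app.routes.ts -- hypothetical sketch: the client-side router declares exactly
// the same paths that the server-side router already serves as static HTML,
// so PJAX navigation and the static pages stay in one-to-one correspondence.
import { Routes } from '@angular/router';
import { ProductListComponent } from './product-list.component';
import { ProductDetailComponent } from './product-detail.component';

export const routes: Routes = [
  { path: 'products', component: ProductListComponent },        // also served statically at /products
  { path: 'products/:slug', component: ProductDetailComponent } // also served statically at /products/<slug>
];
```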

The idea is that the website will degrade gracefully when the Angular application is not working (it isn't loaded, or browser scripts are switched off). All informative content is still there, but interactive widgets (cart, real-time graphs, etc.) that don't need to be searchable are cut off. A rough sketch of that model is shown below.
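The sketch assumes a hypothetical data-widget attribute and mountCartWidget loader (not the real code): the server ships complete static HTML, and a small script only upgrades marked placeholders into interactive widgets when it actually runs.

```typescript
// enhance-widgets.ts -- hypothetical sketch of progressive enhancement.
// If this script never loads (or scripts are switched off), nothing here runs
// and the server-rendered informative content remains exactly as delivered.
declare function mountCartWidget(host: HTMLElement): void; // hypothetical widget factory from the SPA bundle

function enhanceWidgets(): void {
  // Upgrade only elements explicitly marked as interactive widgets.
  document.querySelectorAll<HTMLElement>('[data-widget="cart"]').forEach(el => mountCartWidget(el));
}

document.addEventListener('DOMContentLoaded', enhanceWidgets);
```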

Our intention is to not disturb Google's status quo in any way. For instance, I've encountered problems with <title> in SPAs indexed by Google (the page itself appeared fine in the user's browser).
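For what it's worth, the usual way to keep <title> in sync during client-side navigation is Angular's Title service; a minimal sketch follows (pageTitleFor is a hypothetical stand-in for however titles would actually be derived):

```typescript
// app.component.ts -- minimal sketch: set the document title after every
// successful client-side navigation, so the SPA title matches what the
// corresponding server-rendered page would carry.
import { Component } from '@angular/core';
import { Title } from '@angular/platform-browser';
import { Router, NavigationEnd } from '@angular/router';
import { filter } from 'rxjs/operators';

@Component({ selector: 'app-root', template: '<router-outlet></router-outlet>' })
export class AppComponent {
  constructor(router: Router, titleService: Title) {
    router.events
      .pipe(filter((e): e is NavigationEnd => e instanceof NavigationEnd))
      .subscribe(e => titleService.setTitle(pageTitleFor(e.urlAfterRedirects)));
  }
}

// Hypothetical mapping from URL to the same title the static page uses;
// in practice this would typically come from route data.
function pageTitleFor(url: string): string {
  return url === '/' ? 'Home' : url.replace(/^\//, '');
}
```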

Another problem is that data-heavy widgets may take several seconds to initialize, and I would like to give Googlebot no chance to give up with a timeout, or to mark the website as 'slow' and lower its rank.
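One way to keep such widgets from affecting the initial render is to defer their initialization until the browser is idle; a hedged sketch (loadRealtimeGraph is a hypothetical loader, and the setTimeout branch is a fallback for browsers without requestIdleCallback):

```typescript
// defer-widgets.ts -- sketch: initialize data-heavy widgets only once the
// browser is idle, so the static, indexable content renders first and a slow
// widget cannot hold up the rest of the page.
declare function loadRealtimeGraph(host: HTMLElement): Promise<void>; // hypothetical async loader

function whenIdle(cb: () => void): void {
  if ('requestIdleCallback' in window) {
    (window as any).requestIdleCallback(cb, { timeout: 5000 });
  } else {
    setTimeout(cb, 200); // simple fallback where requestIdleCallback is unavailable
  }
}

whenIdle(() => {
  document.querySelectorAll<HTMLElement>('[data-widget="graph"]').forEach(el => {
    loadRealtimeGraph(el).catch(err => console.error('Widget failed to initialize:', err));
  });
});
```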

So I consider the website gracefully degraded, and I would like Google to see it the way some of our visitors do: a simplified yet informative website with the same text content. In my opinion, it is certainly not being cloaked.

TL;DR: we want to augment an existing static website with dynamic features like widgets and PJAX, but also want to disallow most JS scripts in robots.txt so that the site keeps the same appearance in the eyes of Googlebot and SEO is not disturbed.
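Concretely, the kind of robots.txt we have in mind would look roughly like this (the /assets/js/ path is an assumption for illustration, not our real layout):

```
# robots.txt -- hypothetical sketch: hide the SPA bundles from Googlebot
# while leaving the static, server-rendered pages fully crawlable.
User-agent: Googlebot
Disallow: /assets/js/

User-agent: *
Disallow:
```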

Is disallowing JS scripts in robots.txt an acceptable way to achieve this goal? What are the possible implications on Google's side? Could it consider this cloaking and/or apply a penalty? Can Googlebot ignore the robots directives and peek at the fully functional website somehow to get an idea of what's going on?


1 Comment


 

@Lee4591628

I think Google has made it relatively clear that it wants to be able to access all of a page's content. And because JavaScript can be used for very harmful things such as malware delivery, viruses, and browser hijacking, if you block Googlebot's access to your .js files, I imagine it would be very leery of sending traffic to your page.

Imagine this scenario: a website has a .js script that redirects the user to a malware site and/or causes a virus.exe to download on the user's machine. The website then blocks Googlebot from accessing this .js file. Google then sends traffic to the page, and its visitors are infected with malware. That would be a very large security hole, and Google has almost certainly been made aware of it.

As a result, my best guess is that blocking Googlebot's access to your .js files is a very risky position to put yourself in, and could potentially get your pages deranked altogether.


