Mobile app version of vmapp.org
Jessie594

: How to crawl a web page with dynamic content added by JavaScript

@Jessie594

Posted in: #Javascript #WebCrawlers

I have read news that Google's bots can now understand our JavaScript code, which means it should be possible to fully crawl a web page that has lazy loading enabled. I am using Apache Nutch to crawl websites, but I don't think it can fetch the URLs that JavaScript injects into the HTML page when the page is scrolled down. I see a lot of websites using lazy loading for performance reasons. So can somebody please explain how I can crawl the data that appears in the HTML page on lazy load (i.e., on scrolling the page down)?
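One common workaround, when a site lazy-loads by fetching pages of results as you scroll, is to call that paging endpoint directly instead of simulating scrolling. The sketch below illustrates the idea with a stubbed fetch function; the endpoint shape (`/api/items?page=N`) and the empty-page stop condition are assumptions, not anything from the question. On a real site you would find the actual endpoint in your browser's network tab while scrolling.

```python
# Sketch: crawl a lazy-loading page by paging its AJAX endpoint directly.
# fetch_json is a stub standing in for a real HTTP fetch (e.g. urllib.request);
# the URLs and responses below are invented for illustration.
def fetch_json(url):
    fake_api = {
        "/api/items?page=1": ["a.html", "b.html"],
        "/api/items?page=2": ["c.html"],
        "/api/items?page=3": [],  # empty page => no more content
    }
    return fake_api.get(url, [])

def crawl_lazy_pages():
    urls, page = [], 1
    while True:
        batch = fetch_json(f"/api/items?page={page}")
        if not batch:  # stop when the endpoint runs out of results
            break
        urls.extend(batch)
        page += 1
    return urls

print(crawl_lazy_pages())  # ['a.html', 'b.html', 'c.html']
```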





2 Comments


 

@Ravi8258870

You can use a JavaScript parser in your server-side crawler code to parse the scripts, extract all the AJAX request URLs, and then crawl those URLs as well.
One such parser is google-caja.

Try it; it may serve your purpose.
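A lightweight version of this idea, without a full JavaScript parser, is to scan inline scripts for URLs passed to common AJAX calls. This is a minimal sketch, not the google-caja approach the answer names; the sample page and endpoint paths are made up, and a simple regex will miss URLs that are built up dynamically.

```python
import re

# Match a quoted URL passed to common AJAX entry points:
# fetch(...), XMLHttpRequest.open(...), jQuery $.get/$.post/$.ajax(...).
AJAX_URL_RE = re.compile(
    r"""(?:fetch|open|get|post|ajax)\s*\(\s*['"](?P<url>[^'"]+)['"]""",
    re.IGNORECASE,
)

def extract_ajax_urls(html: str) -> list:
    """Return candidate AJAX endpoint URLs found in inline <script> blocks."""
    urls = []
    for script in re.findall(r"<script[^>]*>(.*?)</script>", html, re.DOTALL):
        for match in AJAX_URL_RE.finditer(script):
            urls.append(match.group("url"))
    return urls

page = """
<html><body>
<script>
  window.addEventListener('scroll', function () {
      fetch('/api/items?page=2').then(render);
  });
  $.get('/api/comments', showComments);
</script>
</body></html>
"""
print(extract_ajax_urls(page))  # ['/api/items?page=2', '/api/comments']
```

The extracted URLs can then be fed back into the crawler's fetch queue like any other outlinks.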



 

@Phylliss660

Googlebot understands links "hidden" behind JavaScript. Lazy loading just makes the browser render the content after the initial page load; the HTML is still there in the source, so your bot should have no trouble scanning it, since the JavaScript runs client-side.

If you are having trouble with heavily JavaScript-driven links, check them with Nutch's parsechecker tool to see how the parse filters handle them, and adjust your configuration accordingly.
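One concrete reason the HTML is often "still there": many lazy-load scripts keep the real URL in a `data-*` attribute in the initial markup and only swap it into `src` at render time. A static crawler can read those attributes directly. This is a sketch using Python's standard-library parser; the attribute names (`data-src`, `data-href`) are common conventions, not a standard, and the sample markup is invented.

```python
from html.parser import HTMLParser

class LazyLinkExtractor(HTMLParser):
    """Collect URLs from href/src and common lazy-load data-* attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "data-src", "data-href") and value:
                self.urls.append(value)

html = """
<a href="/page/2">next</a>
<img class="lazy" src="placeholder.gif" data-src="/images/photo1.jpg">
"""
parser = LazyLinkExtractor()
parser.feed(html)
print(parser.urls)  # ['/page/2', 'placeholder.gif', '/images/photo1.jpg']
```

If the content genuinely is not in the initial HTML, you would need the AJAX-endpoint approach from the other answer instead.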


