
Is feeding Googlebot and Bingbot crawlers "special" data a common practice in rich internet applications?

@Sarah324

Posted in: #Googlebot #WebCrawlers

I have run into the following problem: my server and my rich internet application (RIA) client generate so much JSON data, rendered on the fly as soon as the page finishes loading, that a standard crawler without JavaScript will see very little content on my site. I know about progressive enhancement and unobtrusive JavaScript, but crawlers sometimes do not run JS at all. (Even my mobile phone can do that!)

I can always open Chrome, or an Adobe AIR app like Scout, and get the parsed HTML with all the links the page really contains.

Demo (the raw HTML as served):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">

<head>
<title>Cloud Server Services Selection Menu</title>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Services List" />
<meta name="keywords" content="art, navigation, C++, services, web, web-services, exibition" />
<meta name="author" content="CF2011" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<!-- End Of Info-->

<!-- CF CSS -->
<!--[if gte IE 9]>
<style type="text/css">
.gradient {
filter: none;
}
</style>
<![endif]-->
<link rel="stylesheet" href="css/cf.css">
<link rel="stylesheet" href="css/cf.index.application.css">
<!-- End of CF CSS-->

<!-- Public JS-->
<script type="text/javascript" src="js/jquery-1.7.1.min.js">
</script>

<script type="text/javascript" src="js/tempo.min.js">
</script>
<!-- End Of Public JS-->

<!-- CF JS -->
<script type='text/javascript' src="js/cf.js"></script>

<!-- End Of CF JS-->
</head>
<body>
<div class="container">
<div>
<h1 class="head"><a href="#"><p>Cloud Server<span>Exhibition of Public Services &amp; Their Interaction</span></p></a></h1>
</div>
<div>
<ol id="marx-brothers3">
<div data-template>
<ul class="ca-menu">
<li>
<a href="{{url}}">
<!--<span class="ca-icon"><p>{{icon}}</p></span> -->
<div class="ca-content">
<h2 class="ca-main">{{name}}</h2>
<h3 class="ca-sub">{{description}}</h3>
</div>
</a>
</li>
</ul>
</div>
<li data-template-fallback>
Sorry, JavaScript required!
</li>
</ol>
</div>
<div>

</div>
</div>

<script type="text/javascript">
$(document).ready(function() {
var services = Tempo.prepare('marx-brothers3');
services.starting();

$.getJSON("server.json", function(data) {
services.render(data);
});
});
</script>
</body>
</html>


Parsed DOM with script tags removed:

<html lang="en"><head>
<title>Cloud Server Services Selection Menu</title>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Services List">
<meta name="keywords" content="art, navigation, C++, services, web, web-services, exibition">
<meta name="author" content="CF2011">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<!-- End Of Info-->

<!-- CF CSS -->
<!--[if gte IE 9]>
<style type="text/css">
.gradient {
filter: none;
}
</style>
<![endif]-->
<link rel="stylesheet" href="css/cf.css">
<link rel="stylesheet" href="css/cf.index.application.css">
<!-- End of CF CSS-->
</head>
<body cz-shortcut-listen="true">
<div class="container">
<div>
<h1 class="head"><a href="#"><p>Cloud Server<span>Exhibition of Public Services &amp; Their Interaction</span></p></a></h1>
</div>
<div>
<ol id="marx-brothers3">

<li data-template-fallback="" style="display: none; ">
Sorry, JavaScript required!
</li>
<div data-template="">
<ul class="ca-menu">
<li>
<a href="images.html">
<!--<span class="ca-icon"><p>default_service_icon.png</p></span> -->
<div class="ca-content">
<h2 class="ca-main">Image Renderer Service</h2>
<h3 class="ca-sub">Service for images rendering, transcoding, mapping.</h3>
</div>
</a>
</li>
</ul>
</div><div data-template="">
<ul class="ca-menu">
<li>
<a href="observer.html">
<!--<span class="ca-icon"><p>default_service_icon.png</p></span> -->
<div class="ca-content">
<h2 class="ca-main">Observer Service</h2>
<h3 class="ca-sub">Demo service for public video conferencing, it does not require user account and its free!</h3>
</div>
</a>
</li>
</ul>
</div><div data-template="">
<ul class="ca-menu">
<li>
<a href="ufs.html">
<!--<span class="ca-icon"><p>default_service_icon.png</p></span> -->
<div class="ca-content">
<h2 class="ca-main">Users Files Service</h2>
<h3 class="ca-sub">Service for user personal and publicly avaliable files storing and managing.</h3>
</div>
</a>
</li>
</ul>
</div></ol>
</div>
<div>

</div>
</div>
<div id="cf-footer" style="position:fixed;min-height:20px;height:auto !important;height:20px;background-color:#3f3b8d;background-color:rgba(0,0,0,0.6);bottom:0; width:100%"><p id="cf-footer-paragraph" style="font-size: 8pt"> Copyright © 2012 <a id="rol" href="#cloudobserver" onclick="{ newwindow=window.open('http://code.google.com/p/cloudobserver/','CloudObserver','height=750,width=900'); if (window.focus) {newwindow.focus()}}">Cloud Forever</a>. Some rights reserved. </p><div></div></div></body></html>


If I save that parsed data and open it in a browser, I see everything a real user would see, which is exactly what I want crawlers to see.
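To make the idea concrete, here is a minimal sketch of producing the same list markup without a browser, by filling the `{{...}}` placeholders straight from the JSON. It is not Tempo's implementation; `escapeHtml`, `fillTemplate`, and `renderSnapshotList` are hypothetical names, and the shape of `server.json` (an array of objects with `url`, `name`, and `description`) is only assumed from the rendered output above.

```javascript
// Assumption: server.json is an array of objects such as
// { url: "images.html", name: "Image Renderer Service", description: "..." }.

// Escape values so that data from the JSON cannot break the markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace every {{key}} placeholder with the escaped value of item[key].
// Unknown keys render as an empty string.
function fillTemplate(template, item) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(item, key)
      ? escapeHtml(item[key])
      : "";
  });
}

// Render one copy of the template per item, joined by newlines.
function renderSnapshotList(template, items) {
  return items
    .map(function (item) { return fillTemplate(template, item); })
    .join("\n");
}
```

Called with the markup inside the `data-template` element and the parsed JSON, this would yield static HTML that a crawler can read without running any script.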

I can already picture myself building a service on top of the Chrome browser API that behaves like an ordinary file server for regular users but sends rendered HTML pages to crawlers. Do such services already exist? Does anyone use them? What would crawler authors think of such services?
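Roughly, the routing part of such a service might look like the sketch below. Everything here is an assumption made for illustration: the function names, the `snapshots/` directory, and the choice to recognise crawlers by the `Googlebot` and `bingbot` tokens in the User-Agent header.

```javascript
// Returns true when the User-Agent string contains a known crawler token.
// Only Googlebot and bingbot are checked in this sketch.
function isKnownCrawler(userAgent) {
  if (typeof userAgent !== "string") {
    return false;
  }
  return /googlebot|bingbot/i.test(userAgent);
}

// Decide which file to send: the pre-rendered snapshot for crawlers,
// the ordinary static page for everyone else.
// "snapshots/" is a hypothetical directory holding the saved, rendered HTML.
function chooseResponse(userAgent, pagePath) {
  if (isKnownCrawler(userAgent)) {
    return { kind: "snapshot", path: "snapshots/" + pagePath };
  }
  return { kind: "static", path: pagePath };
}
```

The snapshot would have to contain the same content a regular user ends up seeing, otherwise this turns into serving crawlers something different from what users get.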


1 Comment


@Heady270

Google has an entire guide to making AJAX websites crawlable.

The technique you are referring to is "creating HTML snapshots", which is covered in the guide.
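As I understand that scheme, the crawler rewrites a `#!state` URL into a request carrying an `_escaped_fragment_` query parameter, and the server answers that request with the snapshot. A page with no hash fragment at all, like the demo in the question, can opt in with `<meta name="fragment" content="!">` and is then requested with an empty `_escaped_fragment_=`. A minimal sketch of the server-side check follows; the function names and the `http://localhost/` base used for parsing relative URLs are assumptions, so check the guide for the exact rules.

```javascript
// Returns the application state the crawler asked for, or null when the
// request is an ordinary one (no _escaped_fragment_ parameter present).
// An empty string means "snapshot of the page in its default state".
function snapshotStateFromUrl(requestUrl) {
  // The base URL is only there so that relative request paths can be parsed.
  var url = new URL(requestUrl, "http://localhost/");
  if (!url.searchParams.has("_escaped_fragment_")) {
    return null;
  }
  return url.searchParams.get("_escaped_fragment_");
}

// Rebuilds the "pretty" #! URL that the snapshot corresponds to,
// or returns null for ordinary requests.
function prettyUrlFor(requestUrl) {
  var url = new URL(requestUrl, "http://localhost/");
  var state = url.searchParams.get("_escaped_fragment_");
  if (state === null) {
    return null;
  }
  url.searchParams.delete("_escaped_fragment_");
  return url.pathname + url.search + (state ? "#!" + state : "");
}
```

A server could call `snapshotStateFromUrl` on each request and send the saved, rendered HTML whenever the result is not `null`.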
