Google indexed my escaped_fragment pages
My site is a single page web-app. I am following the suggestions based on making AJAX applications crawl-able.
My URL looks like this:
domain.com/#!pages/contactUs
My understanding is:
domain.com/#!chair/12 maps to domain.com/?_escaped_fragment_=chair/12
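The mapping above can be sketched in Python. This is only an illustration of how I read Google's AJAX crawling scheme, not official code: the function name to_escaped_fragment_url and the _SAFE constant are hypothetical, and the set of characters that get percent-escaped (space and control characters, #, %, &, +, and non-ASCII bytes) is my understanding of the scheme.

```python
from urllib.parse import quote

# Punctuation left unescaped in the fragment. quote() also always keeps
# letters, digits and "_.-~". Everything else (space, "#", "%", "&", "+",
# control and non-ASCII characters) gets percent-escaped.
_SAFE = "!\"$'()*,/:;<=>?@[\\]^`{|}"


def to_escaped_fragment_url(url: str) -> str:
    """Map a #! ("pretty") URL to its _escaped_fragment_ ("ugly") form."""
    base, marker, fragment = url.partition("#!")
    if not marker:
        # No hashbang: there is nothing to map.
        return url
    # Append to an existing query string if the URL already has one.
    joiner = "&" if "?" in base else "?"
    return f"{base}{joiner}_escaped_fragment_={quote(fragment, safe=_SAFE)}"


print(to_escaped_fragment_url("http://domain.com/#!chair/12"))
```

For the URL in the question, this prints http://domain.com/?_escaped_fragment_=chair/12.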
As I am not using any server-side scripting on this project, I have created HTML pages with the application states and put them in a folder like so:
domain.com/htmlFiles/1.html
In Apache I have forwarded requests that include _escaped_fragment_= to the right html page:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=chair/([\w]*)
RewriteRule ^(.*)$ htmlFiles/%1.html? [R=302,L]
The forwarding works correctly, and the appropriate page shows up when the _escaped_fragment_ URL is used.
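To see which query strings the condition catches and where they end up, the same pattern can be tried outside Apache. The sketch below assumes the intended character class is \w (word characters). The names CHAIR_QUERY and snapshot_path are hypothetical and only mirror the rule; they are not part of the site.

```python
import re

# Same pattern as the RewriteCond, assuming the class is meant to be \w.
CHAIR_QUERY = re.compile(r"^_escaped_fragment_=chair/([\w]*)")


def snapshot_path(query_string):
    """Return the HTML file the rule would serve, or None if it does not match."""
    match = CHAIR_QUERY.match(query_string)
    if match is None:
        return None
    # %1 in the RewriteRule refers to the first group of the RewriteCond.
    return f"htmlFiles/{match.group(1)}.html"


print(snapshot_path("_escaped_fragment_=chair/12"))
print(snapshot_path("_escaped_fragment_=pages/contactUs"))
```

Note that `*` also matches an empty id, so a query of _escaped_fragment_=chair/ would map to htmlFiles/.html. Using `+` instead would reject that case.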
The sitemap I submitted to Google looks like this:
<url>
<loc>http://domain.com/#!pages/contactUs</loc>
<lastmod>2012-12-30</lastmod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
</url>
The problem now is this:
My whole htmlFiles folder (for example http://domain.com/htmlFiles/1.html) is now indexed by Google. These HTML files exist only to show Google what content my actual pages contain.
My entire website works from
domain.com/
These HTML pages should not come up in the search results, since Google said it would index only the pretty URLs. Still, I am reluctant to have Google remove these pages, as I don't know whether that would hamper something else.
Could it be that 302 is not the right redirect and 301 should be used instead?
Also, is there something wrong with this redirect approach in the first place?
More posts by @Angela700
1 Comments
According to Google's specification, you can use a 302 redirect (but not a 301) to serve the content when Google requests a URL with ?_escaped_fragment_=.
The problem I see in your implementation is that your HTML files may contain links that are relative to the redirected page or that point directly to other HTML files. For example, if an href in one of your HTML files points to something like yourdomain.com/htmlFiles/1.html (or to 1.html as a relative link), Google will index that page.
In short: starting from your sitemap, Google should index your pages correctly, but when Google follows the links inside your HTML files, it probably indexes the targets directly without relating them to the original #! URLs.
Here are several ways to fix the problem:
Don't use redirection. In your case it seems unnecessary: you can serve the HTML files directly when Google requests them. Just remove the R flag from your Apache rewrite rule (keeping the L flag) so that the rewrite happens internally. This also avoids the extra round trip between Google and your server that the redirect causes, which saves you (and Google) bandwidth and CPU cycles.
Verify that the links (href attributes) in your HTML files are correct. If they all point to the #! versions, it should work.
Add to each of your HTML files a canonical link pointing to the #! URL of that content (http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394). This can help Google determine which URL is the correct one to index.
Note: Each of these solutions should work on its own, but you could also combine some or all of them.
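The link and canonical checks can be automated with a small script. The following is a minimal sketch using only Python's standard library. The names SnapshotAudit and audit_snapshot are hypothetical, and the rule "an href whose path ends in .html points at a snapshot file" is an assumption based on the htmlFiles/1.html layout in the question.

```python
from html.parser import HTMLParser


class SnapshotAudit(HTMLParser):
    """Collects links to .html files and the canonical URL of one snapshot page."""

    def __init__(self):
        super().__init__()
        self.bad_links = []    # hrefs that point at snapshot files
        self.canonical = None  # href of <link rel="canonical">, if present

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        if tag == "a" and href:
            # Ignore the fragment and query string, then look at the path.
            path = href.split("#", 1)[0].split("?", 1)[0]
            if path.lower().endswith(".html"):
                self.bad_links.append(href)
        elif tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = href


def audit_snapshot(html: str):
    """Return (links to fix, canonical URL or None) for one snapshot page."""
    parser = SnapshotAudit()
    parser.feed(html)
    parser.close()
    return parser.bad_links, parser.canonical


page = """<html><head>
<link rel="canonical" href="http://domain.com/#!chair/1">
</head><body>
<a href="2.html">next chair</a>
<a href="http://domain.com/#!chair/3">another chair</a>
</body></html>"""

bad_links, canonical = audit_snapshot(page)
print("links to fix:", bad_links)
print("canonical:", canonical)
```

Run over every file in htmlFiles, a non-empty list of links to fix or a missing canonical URL marks a page that can leak the snapshot URLs to Google.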