
SEO failure: AmazonS3 + AngularJS + Many pages not being crawled or indexed

@Ravi8258870

Posted in: #AmazonS3 #AngularJs #CrawlableAjax #GoogleSearchConsole #Seo

I've looked over various threads here but found nothing that seems to be the same issue. We're currently facing two problems. If I can solve the first, it would definitely help diagnose the second.

Also, I've posted this to Google's webmaster forum but have had no replies yet.

Our stack:

AngularJS,
HTML + SCSS,
AmazonS3 as our "web server" (although, as you may know, it is not really a web server). We have a redirect rule on the bucket that prefixes any URL with a hash bang so the site functions properly.
CloudFront in front of our S3 bucket.

First problem:

The "Fetch as Google" tool truncates any URL at the #! (hash bang), which makes it difficult to know whether these pages can be crawled by Google. If this works for other sites, then the problem might be that we're using AmazonS3 as our "web server." I've checked other threads here and it seems to work for other people.
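
One thing worth ruling out before blaming S3: everything after the # in a URL is a fragment, and HTTP clients keep the fragment on the client side rather than sending it to the server. A tool that reports the URL it actually requested would therefore show nothing after the #!. Whether that is what "Fetch as Google" does internally is an assumption here; the sketch below only shows how such a URL splits, using the standard URL class found in browsers and Node.js. The helper name splitForRequest is made up for illustration.

```javascript
// Hypothetical helper: separates the part of a URL that is sent in the
// HTTP request from the fragment, which stays on the client.
function splitForRequest(href) {
  const url = new URL(href);
  return {
    sentToServer: url.origin + url.pathname + url.search,
    clientOnly: url.hash,
  };
}

const parts = splitForRequest('http://offtherecord.com/#!/how-it-works');
console.log(parts.sentToServer); // http://offtherecord.com/
console.log(parts.clientOnly);   // #!/how-it-works
```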

Second problem:

Google is only indexing two pages of our site, offtherecord.com.
Feel free to search "site:offtherecord.com" in Google.


offtherecord.com
offtherecord.com/how-it-works


Google is able to crawl the "how-it-works" page, which requires a hash bang to render its content in the browser and therefore requires JavaScript execution, and it works! However, it doesn't seem to be able to crawl and/or index any of the other pages.

Putting offtherecord.com/how-it-works in the "Fetch as Google" tool causes a 301 redirect to #!/how-it-works, as expected. However, if I try to follow the redirect in the tool, it truncates everything after the #! in the URL.
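
For background, Google's AJAX crawling scheme (which Google announced it was deprecating in October 2015) had supporting crawlers request a #! URL by moving the fragment into an _escaped_fragment_ query parameter, since the fragment itself cannot be sent to a server. The sketch below shows that mapping. The function name toEscapedFragmentUrl is hypothetical, and URLSearchParams percent-encodes more characters than the scheme required, so the result may not be byte-identical to what a crawler would send.

```javascript
// Hypothetical helper: rewrites a #! URL into the _escaped_fragment_ form
// used by the (now deprecated) AJAX crawling scheme.
// Returns null when the URL has no hash-bang fragment.
function toEscapedFragmentUrl(href) {
  const url = new URL(href);
  if (!url.hash.startsWith('#!')) {
    return null;
  }
  const fragment = url.hash.slice(2); // drop the leading "#!"
  url.hash = '';                      // the fragment is never sent to the server
  url.searchParams.append('_escaped_fragment_', fragment);
  return url.toString();
}

console.log(toEscapedFragmentUrl('http://offtherecord.com/#!/how-it-works'));
// http://offtherecord.com/?_escaped_fragment_=%2Fhow-it-works
```

Requesting the rewritten URL by hand is one way to see what a purely static host returns for it.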

I've checked the Google crawler stats page on the webmaster tool and there are no crawler errors.

Similar threads:


Google not crawling AJAX content: productforums.google.com/forum/#!topic/webmasters/_pdC55wUvfI;context-place=topicsearchin/webmasters/hashbang
AmazonS3 + AJAX content: [Stack Exchange only allows 2 links for me]


We have HTML5 mode enabled in our AngularJS app via:
$locationProvider.html5Mode(true).hashPrefix('!');
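
For context, a line like this normally lives inside a module config block. The sketch below assumes a module named app, which is hypothetical since the real module name is not shown in the post. With html5Mode(true), AngularJS uses path-style URLs in browsers that support the History API and falls back to hash URLs elsewhere, which is where hashPrefix('!') comes into play.

```javascript
// Config function for AngularJS's $locationProvider.
// Both setters return the provider, so the calls can be chained.
function configureLocation($locationProvider) {
  $locationProvider.html5Mode(true).hashPrefix('!');
}
configureLocation.$inject = ['$locationProvider'];

// Register the config block only when AngularJS is loaded.
// 'app' is an assumed module name and must already be defined elsewhere.
if (typeof angular !== 'undefined') {
  angular.module('app').config(configureLocation);
}
```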

Please advise on how we can address #1 and #2. We're looking into setting up a real web server if the current setup is hurting our accessibility for search engine crawlers.

Thank you for your time


2 Comments


@Michele947

Both issues I was facing were addressed by moha297's answer, linked below. I didn't complete step #5 (setting up a web server), but pages are now being properly indexed.
stackoverflow.com/a/35354677/6929000
Thank you!


@LarsenBagley505

It seems you have a lot going on.

Your robots.txt needs to be looked at. I will leave it at that. Then...


Your sitemap should be HTML for site visitors and XML for search engines, not plain text.
Submit that sitemap to search engines (you can create your sitemap without the #! and it will work).
Create a Webmaster Tools account for Google > add site > verify site > submit Sitemap > Fetch as Google > Submit to Index.
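
An XML sitemap of the kind described in the steps above could be generated with something like the following sketch. The helper name buildSitemap is made up, the http:// scheme is an assumption, and the two paths are simply the pages mentioned in the question; a real list would include every page, written without the #!.

```javascript
// Hypothetical helper: builds a minimal XML sitemap from site-relative paths.
function buildSitemap(baseUrl, paths) {
  // Escape the characters that are special in XML text.
  const escapeXml = (s) =>
    s.replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;')
      .replace(/'/g, '&apos;');

  const entries = paths.map((p) => {
    const loc = new URL(p, baseUrl).toString(); // resolve against the site root
    return `  <url><loc>${escapeXml(loc)}</loc></url>`;
  });

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</urlset>',
  ].join('\n');
}

console.log(buildSitemap('http://offtherecord.com', ['/', '/how-it-works']));
```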


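You can also point requests for the sitemap at its real location via .htaccess:

RewriteEngine On
# Escape the dot so it matches a literal "." rather than any character
RewriteRule ^sitemap\.xml$ /path_to_sitemap [L]


Make sure mod_rewrite is enabled. Note that this only applies on an Apache server; S3 does not read .htaccess files.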
