How should my robots.txt look for a single-page app?

@Lee4591628

Posted in: #Ajax #RobotsTxt

I understand how to disallow bots from crawling some pages/folders in a normal application. For example, for Googlebot it is nicely described here.

But what should I do if I have a single-page application (one that uses only AJAX to load new content and does routing and page generation on the client)? How to make it crawlable is described here and here, but what if I do not want a bot to follow some of the links on my starting page? By this I mean the following:

When the SPA is loaded for the first time, it loads some basic HTML. This HTML can contain specific links like:


home (#!home/)
about (#!about/)
news (#!news/)


but I do not want a bot to crawl the #!about link.
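For context, under Google's (now-deprecated) AJAX crawling scheme, a hash-bang URL is rewritten by the crawler into an `_escaped_fragment_` query parameter before fetching, which is what makes it addressable in robots.txt at all. The sketch below illustrates that rewriting; the `to_escaped_fragment` helper is hypothetical, and the real scheme also URL-encodes the fragment value, which is omitted here for clarity.

```python
def to_escaped_fragment(url: str) -> str:
    """Rewrite a #! URL into the form the crawler actually fetches
    (a sketch of the AJAX crawling scheme's mapping)."""
    if "#!" in url:
        base, fragment = url.split("#!", 1)
        # Append with & if the base URL already has a query string.
        sep = "&" if "?" in base else "?"
        return base + sep + "_escaped_fragment_=" + fragment
    return url

print(to_escaped_fragment("http://example.com/#!about/"))
# http://example.com/?_escaped_fragment_=about/
```

Because the crawler sees the rewritten URL, any robots.txt rule has to target the `_escaped_fragment_=` form rather than the `#!` form.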


1 Answer

@Chiappetta492

I have found a way to do exactly what I want. It is nicely documented by Google:


When your site adopts the AJAX crawling scheme, the Google crawler
will crawl every hash fragment URL it encounters. If you have hash
fragment URLs that should not be crawled, we suggest that you add a
regular expression directive to your robots.txt file. For example, you
can use a convention in your hash fragments that should not be crawled
and then exclude all URLs that match it in your robots.txt file.
Suppose all your non-indexable states are of the form
#DONOTCRAWLmyfragment. Then you could prevent Googlebot from crawling these pages by adding the following to your robots.txt:

Disallow: /*_escaped_fragment_=DONOTCRAWL
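To see how that rule applies, note that a link like #!DONOTCRAWLabout/ is fetched as /?_escaped_fragment_=DONOTCRAWLabout/, which the wildcard pattern then matches. The following is a rough sketch of that matching using Python's `fnmatchcase`; the trailing `*` in the pattern models robots.txt prefix matching, and this is only an approximation of Googlebot's actual wildcard handling.

```python
from fnmatch import fnmatchcase

# Approximation of the rule `Disallow: /*_escaped_fragment_=DONOTCRAWL`:
# `*` matches any run of characters, and a Disallow rule matches as a
# prefix, which the trailing `*` stands in for here.
RULE = "/*_escaped_fragment_=DONOTCRAWL*"

def is_disallowed(path: str) -> bool:
    """Return True if the crawler-visible path matches the rule."""
    return fnmatchcase(path, RULE)

print(is_disallowed("/?_escaped_fragment_=DONOTCRAWLabout/"))  # True
print(is_disallowed("/?_escaped_fragment_=news/"))             # False
```

So with this convention, a fragment that should stay out of the index just needs the DONOTCRAWL prefix, and a single robots.txt line covers all of them.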


