How to force search engines not to crawl my site? Is there any special mark on them?

I want to block search engines from crawling my websites, at least MSN, Yahoo, Google, and Yandex.

I don't really trust robots.txt, because crawlers can simply ignore it and continue crawling.

I would use Rails as my web framework.

How can I do it, or at least reduce the chance of the destructive actions they cause simply by crawling a specific page?

Is there any special mark or identifier on them while they are crawling a site?

This question was marked as a duplicate of:
Robots denied by domain is still listed in search results

Which is, I guess, a mistake. The question above asks specifically why. I don't really care why Google is crawling my site. I just want to block it and show a 404 status.

@Bethany197


@Si4351233

An elegant way to give search engines crawling instructions, including for non-HTML files, without having to use robots.txt is the X-Robots-Tag HTTP response header; it comes first on my list. Some other crawl-control methods are listed below:

Use a noindex meta tag on the page:

<meta name="robots" content="noindex" />

Use a meta tag that names a specific search engine's crawler to allow indexing by that crawler:

<meta name="googlebot" content="index" />

Nofollow: tell search engines not to follow the links on a page:

<meta name="robots" content="nofollow" />

To specify nofollow at the link level, add the rel attribute with the value nofollow to the link:

<a href="mypage.html" rel="nofollow">My page</a>

Use X-Robots-Tag in your HTTP response headers. It is supported by the major search engines, including Google and Bing, not only Yahoo.
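Since the question mentions Rails, here is a minimal sketch of how the X-Robots-Tag header could be added to every response. It is written as a plain Rack-style middleware so it runs without any gems. The class name NoIndexHeader, the default directive string, and the stand-in demo_app are assumptions made for illustration; they do not come from the original posts.

```ruby
# Hypothetical middleware name; follows the Rack convention of an object
# whose #call(env) returns [status, headers, body].
class NoIndexHeader
  # Assumed default: ask crawlers not to index the response or follow its links.
  DEFAULT_DIRECTIVES = "noindex, nofollow".freeze

  def initialize(app, directives: DEFAULT_DIRECTIVES)
    @app = app
    @directives = directives
  end

  def call(env)
    status, headers, body = @app.call(env)
    # merge returns a new hash, so the inner app's headers are not mutated.
    # Header names are case-insensitive in HTTP; lowercase is used here.
    headers = headers.merge("x-robots-tag" => @directives)
    [status, headers, body]
  end
end

# Stand-in for the real application, used only for this demonstration.
demo_app = ->(_env) { [200, { "content-type" => "application/pdf" }, ["demo body"]] }

status, headers, _body = NoIndexHeader.new(demo_app).call({ "PATH_INFO" => "/report.pdf" })
puts status
puts headers["x-robots-tag"]
```

In a Rails application a middleware like this would typically be registered with config.middleware.use in config/application.rb. Note that the header, like the meta tags above, only affects crawlers that choose to honor it.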
