How to tell search engines not to index an entire image domain without letting them waste server bandwidth or making Google complain
From what I learned, there's one way to keep every URL on a domain that strictly serves images from being indexed, and that is the X-Robots-Tag HTTP header. Now I check my logs and find that Google and even Baidu are downloading the entire contents of the image URLs. I was hoping they'd stop downloading when they came across this line:
X-Robots-Tag: noindex, noimageindex
Either I formatted that line wrong (used the wrong casing or the wrong order of values or something), or search engines are just plain dumb and decide to download everything anyway, wasting their customers' money.
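For reference, the header is applied to every response on the image domain at the web server level. As a rough illustration (assuming Apache with mod_headers; other servers have an equivalent directive), the relevant config amounts to:
<IfModule mod_headers.c>
    # send the noindex header with every response from this image domain
    Header set X-Robots-Tag "noindex, noimageindex"
</IfModule>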
I looked into robots.txt and thought of using the noindex line, but when I did, Google complained about having no access to what it calls an "important URL" when it isn't important.
I don't want to block their IPs because I have text-based content on another domain running on the same server that I want them to index.
I'm tempted to offer search engines the equivalent of what users get if they requested the URL via the HEAD method (full headers but no actual content), but I might get penalized for content cloaking.
Is there something I can do to rectify this?
2 Comments
Google supports Noindex: in robots.txt. See How does “Noindex:” in robots.txt work? It is a beta feature, though, and they may remove support for it. Because of that I would use this robots.txt file:
# All other bots: not allowed to crawl anything
User-agent: *
Disallow: /

# Googlebot: the (beta) Noindex: directive, so it neither crawls nor indexes
User-agent: Googlebot
Noindex: /

# bingbot, Yahoo! Slurp, and Yandex: allowed to crawl everything
User-agent: bingbot
Disallow:

User-agent: Yahoo! Slurp
Disallow:

User-agent: Yandex
Disallow:
Along with the header you mention in your question:
X-Robots-Tag: noindex, noimageindex
In that case, only three spiders will crawl your content to find out they can't index it. Googlebot won't crawl or index. Non-search-engine bots won't even be allowed to crawl at all.
If Google does stop supporting Noindex:, Googlebot will start crawling and find out that it can't index.
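Once both pieces are in place, you can confirm the header is actually being served with a HEAD-style request against any image URL (hypothetical hostname shown):
curl -I https://images.example.com/photo.jpg
The response headers should include the X-Robots-Tag: noindex, noimageindex line.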
The most effective way to do this is to use a robots.txt file with Disallow: / as the only directive and place it in the web root of the images domain. Once that is in place, search engines won't crawl the images. The error you got from Google was only a computer-based evaluation that deemed the images might need to be crawled, but that is at your discretion. Since you don't want the images indexed, you can safely ignore the error: it indicates that the images are not going to be crawled, which is exactly what you want.
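A minimal robots.txt along those lines, served from the root of the image domain, would be:
User-agent: *
Disallow: /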