
How to disallow Google crawling out of my website through images?

@Margaret670

Posted in: #Google #RobotsTxt #WebCrawlers

I'm taking care of a website that displays multiple images loaded from external links.

The thing is, those links are outgoing ones, and I want to avoid that as much as possible.

I thought of one solution, which was to add rel="nofollow" to my links, but the W3C is explicit about it: this is not allowed.

So my first question is: how would Google react to this?

My second question is: could I tell Google via robots.txt not to crawl further? If yes, how am I supposed to do it?

I've read that regex is not allowed in robots.txt and I can't use something like Disallow: /*.jpg$ since I need some of my images to be crawled.

Edit:

I just thought of this:

User-agent: *
Allow: /images/*.jpg$
Disallow: /*.jpg$


All of our own images are located in a specific folder. Would it work?
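One way to reason about the three lines above is to simulate the matching rules Google documents for robots.txt: `*` matches any run of characters, `$` anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The following is only a minimal Python sketch under those assumptions, not Google's actual matcher; the helper names `pattern_to_regex` and `is_allowed` and the sample paths are hypothetical.

```python
import re


def pattern_to_regex(path_pattern):
    """Translate a robots.txt path pattern using * and $ into a regex."""
    anchored_end = path_pattern.endswith("$")
    if anchored_end:
        path_pattern = path_pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in path_pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))


def is_allowed(rules, url_path):
    """Return True if url_path may be crawled under the given rules.

    rules is a list of (directive, pattern) pairs for one user-agent group.
    Assumed precedence: the longest matching pattern wins, Allow wins ties,
    and a path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(url_path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# The rules from the question, as they would apply to "User-agent: *".
RULES = [
    ("Allow", "/images/*.jpg$"),
    ("Disallow", "/*.jpg$"),
]

print(is_allowed(RULES, "/images/logo.jpg"))     # True
print(is_allowed(RULES, "/uploads/banner.jpg"))  # False
print(is_allowed(RULES, "/about.html"))          # True
```

Under these assumptions the rules behave as intended for .jpg files on the site itself. Keep in mind that a robots.txt file is only consulted for URLs on the host that serves it, so it has no effect on images hosted on other domains.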


@Rivera981

You are right, rel=nofollow is only for links. An <img> tag can't have a rel=nofollow attribute.

What do you mean by


disallow google to crawl out of my website through images?


Googlebot doesn't crawl out when it encounters an external image. It just gives a positive (or sometimes negative) recognition to it and may index it; Google can't crawl onward from an image. Your website will be crawled the same way with or without external image links.

What rel=nofollow tells Google (in the case of an external link) is that you are not positively recommending the website and are merely providing a link. Even having external links without nofollow doesn't affect the crawling of your website.

I am not sure whether there is an attribute similar to nofollow for images, and rightly so, because images can only be indexed, not crawled or followed. Whether to allow indexing of an image is in the hands of the image's owner (the external website), which can allow or disallow indexing of its images through its own robots.txt; it isn't your decision. For example, the external site could block Google's image crawler with:

User-agent: Googlebot-Image
Disallow: /
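
To see what such a group means for different crawlers, the two lines above can be fed to Python's standard-library `urllib.robotparser`. This is only a sketch of how that parser interprets the rules, not a statement about Google's own implementation; the wrapper `can_fetch` and the image URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content shown above, as the external image host might serve it.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /
"""


def can_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL of an image hosted on the external site.
IMAGE_URL = "https://images.example.com/photos/cat.jpg"

print(can_fetch("Googlebot-Image", IMAGE_URL))  # False
print(can_fetch("Googlebot", IMAGE_URL))        # True
```

In this parser's reading, only the image crawler is blocked; other user agents are unaffected because no group names them.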
