How do search engines handle protocol-relative links?

@Hamaas447

Posted in: #Http #Https #Links #RelativeUrls #SearchEngines

With many sites supporting but not requiring HTTPS now, there's an increase in protocol-relative links.

A protocol-relative link is one where the scheme is omitted, so the browser resolves it to HTTPS if the page containing the link is viewed over HTTPS, and to HTTP if the page containing the link is viewed over HTTP. For example, a link written as href="//example.com/page" is protocol-relative; it takes on whatever protocol the page it appears on was loaded with.
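
You can see the resolution behaviour in a quick sketch using Python's standard urllib.parse module; urljoin resolves references the same way browsers do, and the domains below are placeholders:

```python
from urllib.parse import urljoin

# A protocol-relative (scheme-relative) reference inherits the
# scheme of the page it appears on.
link = "//cdn.example.com/script.js"

# Viewed over HTTP, the reference resolves to HTTP...
print(urljoin("http://example.com/page.html", link))
# -> http://cdn.example.com/script.js

# ...and viewed over HTTPS, the same markup resolves to HTTPS.
print(urljoin("https://example.com/page.html", link))
# -> https://cdn.example.com/script.js
```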

How do search engines parse protocol-relative links? If the Googlebot is crawling a page via HTTP, will it remain in HTTP when following protocol-relative links, or will it know to look at both the HTTP and HTTPS versions of the target link?


@Samaraweera270

How do search engines parse protocol-relative links? 


Web crawlers follow the same conventions browsers do when parsing URIs, as described in RFC 3986, including performing URL normalization in order to avoid crawling the same resource more than once.
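
As a rough illustration, here's a simplified normalization sketch in Python; real crawlers apply more rules (percent-encoding case, trailing slashes, and so on), so treat this as an approximation of the RFC 3986 steps, not a complete implementation:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url: str) -> str:
    """Lowercase the scheme and host, drop default ports, collapse
    dot segments, and discard the fragment -- a subset of the
    normalization rules in RFC 3986."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    default_port = {"http": 80, "https": 443}.get(scheme)
    netloc = host if parts.port in (None, default_port) else f"{host}:{parts.port}"
    # Remove "." and ".." path segments (RFC 3986 section 5.2.4).
    segments = []
    for seg in parts.path.split("/"):
        if seg == ".":
            continue
        if seg == ".." and segments and segments[-1]:
            segments.pop()
        elif seg != "..":
            segments.append(seg)
    path = "/".join(segments) or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))

print(normalize("HTTP://Example.COM:80/a/./b/../c?x=1"))
# -> http://example.com/a/c?x=1
```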

Some crawlers, like the Googlebot, even render webpages, so protocol-relative and relative URIs appear the same way to them as they do to users with modern browsers: resolved against the base URL of the page they were found on.

Google's HTTPS guidance also states that using relative URLs for resources that live on the same secure domain helps ensure your links and resources always use HTTPS.



You can test this with the Googlebot by using the Fetch and Render mode in Fetch as Google, in which:


Googlebot gets all the resources referenced by your URL, such as picture, CSS, and JavaScript files, running any code to render or capture the visual layout of your page as an image. You can use the rendered image to detect differences between how Googlebot sees your page and how your browser renders it.


By adding an image whose source is a protocol-relative URI (for example, src="//static.example.com/test.png", a placeholder), you'll be able to see whether the Googlebot renders the page with that image or not.



Protocol-relative and relative URIs can, however, cause errors with some crawlers that are less sophisticated than the Googlebot, since these often use parallel architectures in which URLs are parsed from the source code, stored in a database, and then crawled in parallel. Unless the base path of the URL under which a relative URI was found is appended to it, the crawler won't be able to resolve it later (see the sketch below).
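
A robust crawler avoids this by resolving every reference against the URL of the page it was found on before storing it. Here is a hypothetical sketch using Python's standard html.parser module; the class name and URLs are made up for illustration:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects href/src values, resolving each against the URL of
    the page it was found on *before* it is stored, so the scheme
    and base path are never lost."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # urljoin handles absolute, relative, and
                # protocol-relative references uniformly.
                self.urls.append(urljoin(self.page_url, value))

parser = LinkExtractor("https://example.com/blog/post.html")
parser.feed('<a href="//cdn.example.com/a.js">x</a><img src="../logo.png">')
print(parser.urls)
# ['https://cdn.example.com/a.js', 'https://example.com/logo.png']
```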

Another problematic area is sitemaps created automatically by sitemap tools, since these often just parse the relative URIs out of the source code and list them in the sitemap as-is, resulting in the same issue as above; the sitemap protocol requires fully qualified URLs, so such entries are simply invalid. A resolving helper is sketched below.
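
A sitemap generator can avoid the same trap by resolving each discovered reference to an absolute URL before writing it out. This hypothetical helper (the function name and URLs are made up) sketches the idea:

```python
from urllib.parse import urljoin, urlsplit

def sitemap_entries(page_url, hrefs):
    """Yield sitemap <url> entries, resolving every reference to a
    fully qualified URL as the sitemap protocol requires."""
    for href in hrefs:
        absolute = urljoin(page_url, href)
        # Guard against anything that still lacks a scheme.
        assert urlsplit(absolute).scheme in ("http", "https")
        yield f"<url><loc>{absolute}</loc></url>"

for entry in sitemap_entries("https://example.com/docs/",
                             ["intro.html", "//example.com/faq", "/contact"]):
    print(entry)
# <url><loc>https://example.com/docs/intro.html</loc></url>
# <url><loc>https://example.com/faq</loc></url>
# <url><loc>https://example.com/contact</loc></url>
```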

You can possibly circumvent these issues by setting a base element, which instructs browsers and bots how to resolve relative URIs found on that page, including which protocol to use. Even so, it's highly recommended to use absolute URLs whenever possible to avoid these issues altogether.
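
In resolution terms, a base element simply substitutes the declared base URL for the page's own URL; a minimal sketch with placeholder URLs:

```python
from urllib.parse import urljoin

page_url = "http://example.com/blog/post.html"  # where the page was fetched from
base_href = "https://example.com/blog/"         # declared in <base href="...">

# Without a base element, relative references resolve against the
# page's own URL and inherit its (here insecure) scheme.
print(urljoin(page_url, "images/logo.png"))
# -> http://example.com/blog/images/logo.png

# With the base element, the same reference resolves against the
# declared base instead.
print(urljoin(base_href, "images/logo.png"))
# -> https://example.com/blog/images/logo.png
```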
