
Is "site:" returning pages that would not otherwise be returned?

@Deb1703797

Posted in: #CanonicalUrl #GoogleSearch #RelCanonical

Today I searched one of my websites for a page I thought I had on a certain topic, using the following query:

site:m2osw.com upload


I started looking through the results and could not find the page I was looking for, but I noticed that one of my websites, call it "blah.m2osw.com", appeared in the results!

This was a big surprise to me because that site has had the correct canonical URL for a long time and it points to a different domain. Something like this:

<link rel="canonical"
type="text/html"
title="Home Page"
href="https://exdox.com"/>


I was thinking that a canonical tells Google to go look over there, but now I have doubts: maybe canonicals do not work across domains? I thought I read somewhere that this was the best way to host a test site anywhere I wanted: point the test site to the real site so that Google indexes the real site and not the test site...

Do you know whether that canonical looks correct from blah.m2osw.com, or whether I would have to add something else (e.g. a robots NOINDEX)?


2 Comments


@Alves908

<meta rel="canonical"
type="text/html"
title="Home Page"
href="https://exdox.com"/>



This is incorrect. This should be a link element, not a meta element. For example:

<link rel="canonical" href="https://exdox.com">


The type and title attributes in this context are irrelevant.


Is “site:” returning pages that would not otherwise be returned?


However, this is correct. The Google site: operator does indeed return URLs that would not ordinarily be returned in a normal Google search. The rel="canonical" link element doesn't necessarily prevent both pages from being indexed, but it advises Google to return the canonical page in the SERPs when possible. (The site: operator returns the pages that are indexed.) Whether Google follows this advice is, however, up to Google. If Google does not consider (by its own analysis) the declared "canonical" URL to be actually canonical (i.e. sufficiently similar), then the hint could be ignored. From the Google Webmaster Central Blog post "Handling legitimate cross-domain content duplication":


While the rel="canonical" link element is seen as a hint and not an absolute directive, we do try to follow it where possible.


UPDATE: If this is a "test site" then yes, it probably shouldn't be indexed at all, and the rel="canonical" tag is mostly irrelevant. You should either noindex it (robots meta tag or X-Robots-Tag HTTP response header), block it in robots.txt, or restrict access in some other way (password, IP restriction, etc.).
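
As a minimal sketch of the noindex option mentioned above, assuming the test site's pages can be edited directly, the robots meta tag goes in the head of each page that should stay out of the index:

```html
<!-- Inside <head> on every page of the test site (blah.m2osw.com in the question) -->
<meta name="robots" content="noindex">
```

The equivalent HTTP response header is X-Robots-Tag: noindex. Google has to be able to crawl a page to see either signal, so a page that is also blocked in robots.txt will not have its noindex read.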


@Looi9037786

Do you know whether that canonical looks correct from blah.m2osw.com

This is an example of a really funny question ;)

To the point:


canonical is just a recommendation. This means Google decides on its own which URL to display in the search results: the canonicalized one or the canonical one.
The SERP Google builds for a site: query displays URLs from the given domain that are relevant to the rest of the query. Both canonicalized and canonical URLs can appear, since both are relevant to the query: the canonical one more, the canonicalized one less.
In a "normal", non-site: query, Google would refrain from displaying the canonicalized URL because it is less relevant.
