Google indexed the same page under two URLs (despite rel-canonical)

@Shanna517

Posted in: #CanonicalUrl #DuplicateContent #GoogleSearch

The Super User question "Playing mp3 in quodlibet displays “GStreamer output pipeline could not be initialized” error" is indexed under two URLs in Google:

superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia
https://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia/652058


The first one is the canonical one; the corresponding rel-canonical is included in both pages:

<link rel="canonical" href="https://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia" />
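To double-check which canonical URL a page declares, the link element can be pulled out of the HTML programmatically. The following is only a minimal sketch using Python's standard `html.parser`; the class name `CanonicalFinder`, the helper `find_canonical`, and the `sample` string are made up for illustration, and the sample only mimics the element shown above rather than the full Super User page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <link ... /> also end up here by default.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return all canonical hrefs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Sample markup modelled on the element quoted above (not the real page source).
sample = (
    '<head><link rel="canonical" href="https://superuser.com/questions/651591/'
    'playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia" />'
    "</head>"
)
print(find_canonical(sample))
```

Running the same helper over the HTML of both URL variants should return the same single href if the rel-canonical really is identical on both pages.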


Google also indexed superuser.com/a/652058, which redirects to the answer:
superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia/652058#652058

Now, the second URL from above is the same as this one minus the fragment #652058.

So Google seems to strip the fragment, which yields exactly the same page under another URL (one with the answer ID /652058 as a suffix), and indexes that one, too, despite the rel-canonical and the duplicate content.
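The relationship between the two URLs can be reproduced locally. This is a minimal sketch using `urldefrag` from Python's standard `urllib.parse`; the variable names are illustrative, and the `https://` scheme on the redirect target is an assumption (it is shown without a scheme above). It only demonstrates the string relationship, not how Google actually normalizes URLs.

```python
from urllib.parse import urldefrag

# Redirect target of superuser.com/a/652058 (https:// scheme assumed here).
redirect_target = (
    "https://superuser.com/questions/651591/"
    "playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia"
    "/652058#652058"
)

# The second (non-canonical) URL that Google indexed.
second_indexed = (
    "https://superuser.com/questions/651591/"
    "playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia"
    "/652058"
)

# urldefrag splits a URL into the part before '#' and the fragment itself.
stripped, fragment = urldefrag(redirect_target)

print(stripped == second_indexed)  # True
print(fragment)                    # 652058
```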

Shouldn’t Google recognize this and only index the canonical variant?

What is going on here?



EDIT: Google even displays a slightly different title for the same page, as seen in results #1 and #2 for the query "Super User QuodLibet".


1 Comment


@Michele947

Looking at Google's cached versions of the two variants, the non-canonical one has one answer (cached 21 Oct), while the canonical one has two answers, the first marked as accepted (cached 24 Oct). The question and both answers have also been edited at various times, the second answer as recently as today.

My guess is that these differences were enough to keep Google from obeying the canonical link element, and that this will be corrected once the content becomes static (i.e., no more answers or edits) and is re-crawled.

From Google's page on rel="canonical":


The rel="canonical" attribute should be used only to specify the preferred version of many pages with identical content (although minor differences, such as sort order, are okay).


Anything after the # is always ignored, so superuser.com/a/652058 effectively amounts to another link to the non-canonical URL.
