Google indexed the same page under two URLs (despite rel-canonical)
The Super User question "Playing mp3 in quodlibet displays “GStreamer output pipeline could not be initialized” error" is indexed under two URLs in Google:
superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia
https://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia/652058
The first one is the canonical one; the corresponding rel-canonical is included in both pages:
<link rel="canonical" href="https://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia" />
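To check which canonical URL a page actually declares, the link element can be pulled out of the HTML. The following is only a minimal sketch using Python's standard library; the names `CanonicalFinder` and `find_canonical` are made up for illustration, and the sample markup is the element quoted above rather than a live fetch of either page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the first declared canonical URL, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# The element quoted in the question, used as sample input.
sample = (
    '<link rel="canonical" href="https://superuser.com/questions/651591/'
    'playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia" />'
)
print(find_canonical(sample))
```

Running the same function over the HTML of both indexed URLs would show whether they really declare the same canonical target.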
Google also indexed superuser.com/a/652058, which redirects to the answer:
superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia/652058#652058
Now, the second URL from above is the same as this one minus the fragment #652058.
So Google seems to strip the fragment, which yields exactly the same page under another URL (one with the answer ID /652058 as a suffix), and indexes that URL too, despite rel-canonical and the duplicate content.
Shouldn’t Google recognize this and only index the canonical variant?
What is going on here?
EDIT: Google even displays a slightly different title for the same page (screenshot of result #1 and #2 for the query "Super User QuodLibet"):
1 Comment
Looking at Google's cached versions of the two variants, the non-canonical one has one answer (cached 21 Oct), while the canonical one has two answers, the first marked as accepted (cached 24 Oct). The question and both answers have also been edited at various times, the second answer as recently as today.
My guess is that these differences have been enough to prevent the canonical link element from being obeyed, and that this will be corrected once the content becomes static (i.e., no more answers or edits) and is re-crawled.
From Google's page on rel="canonical":
The rel="canonical" attribute should be used only to specify the preferred version of many pages with identical content (although minor differences, such as sort order, are okay).
Anything after # is always ignored, so the redirect target of superuser.com/a/652058 correctly equates to another link to the non-canonical URL.
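The fragment point can be checked locally. This is a small sketch with Python's `urllib.parse.urldefrag`; it says nothing about how Google's crawler is implemented, and the `https://` scheme on the redirect target is an assumption, since the question lists that URL without one.

```python
from urllib.parse import urldefrag

# Redirect target of superuser.com/a/652058 (scheme assumed to be https).
redirect_target = (
    "https://superuser.com/questions/651591/"
    "playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia"
    "/652058#652058"
)

# The second, non-canonical URL that Google indexed.
indexed_duplicate = (
    "https://superuser.com/questions/651591/"
    "playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia"
    "/652058"
)

# urldefrag splits a URL into the part before '#' and the fragment itself.
stripped, fragment = urldefrag(redirect_target)

print(fragment)                       # 652058
print(stripped == indexed_duplicate)  # True
```

Once the fragment is dropped, the redirect target is character for character the non-canonical URL, which is consistent with that URL being discovered through the short answer link.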