Pagination, Duplicate Content, and SEO
Please consider a list of items (forum comments, articles, shoes, it doesn't matter) that is spread over multiple pages. Different sort orders are supported (by date, by popularity, by price, etc.).
So, a URL might look like this (I use the query-string style here to simplify things):
/items?id=1234&page=42&sort=popularity
/items?id=1234&page=5&sort=date
Now, in terms of SEO, I think I should be worried about duplicate content. After all, each item appears at least as many times as there are sort orders.
I've seen Matt Cutts talking about the rel=canonical link tag, but he also said that the canonical page should have very similar content. But this is not the case here because page #1 in a non-canonical sort order might have completely different items than page #1 in the canonical sort order. For a given non-canonical page, there is no clear canonical page listing all the same items, so I think rel=canonical won't help here.
Then I thought about using the noindex meta tag on all pages with non-canonical sort order, and not using it on all pages with canonical sort order.
However, if I use that method, what will happen with backlinks that point to non-canonical pages -- will they still pass their PageRank, even though the first page Googlebot (or any other crawler) encounters is marked "noindex"?
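For reference, the tag I have in mind on the non-canonical pages would look something like this (the "follow" part is my attempt to keep the links crawlable even though the page itself stays out of the index):

```html
<!-- On /items?id=1234&page=5&sort=date (a non-canonical sort order):
     keep this page out of the index, but still let crawlers follow
     the links it contains -->
<meta name="robots" content="noindex, follow">
```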
Can you please comment on my problem and what you think is the best solution?
If you think you have a better solution, please consider that 1) I do not want to use Javascript for this, 2) I do not want all the items to be on one page.
Thank you.
Use a rel="canonical" link tag. It exists for exactly this purpose. It was first proposed by Google, and the WHATWG and W3C are on their way to approving it as part of the spec.
Per your comments on @danlefree's answer, your concern may be related to URL rewriting to user-friendly URLs, i.e. you are used to www.domain.tld/apples/weight/ instead of www.domain.tld/items.php?id=23&sort=weight.
Either way, the request reaches the same script, sets the same application variables, and outputs the same content.
Just make sure you declare the same (your preferred) URL as the canonical on all related pages (for example, every listing of the same item, regardless of sort order, could in my example point to www.domain.tld/apples/).
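As a sketch, using my hypothetical URLs from above, every sorted variant of the listing would carry the same canonical link in its head:

```html
<head>
  <!-- Served on www.domain.tld/apples/weight/ and every other sorted
       variant: all of them declare the one preferred URL -->
  <link rel="canonical" href="http://www.domain.tld/apples/">
</head>
```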
EDIT
Going further on discussion...
First, Matt Cutts is a great guy and I have nothing against him, but never, NEVER take anything as the holy grail. It may be a good proposal, but your judgement must be the final word, not any one person's, whatever his or her position and qualifications.
Let's get back to your example. Assumptions:
You have a product listing page
By default it sorts by ID
Users can choose to sort by stock quantity, price or size
Each page shows 3 items
You currently have 12 different items to be listed
Prices and stock quantities can vary over time
OK, now when a user visits your page, these are the lists they will see:
Default: 1 2 3 - 4 5 6 - 7 8 9 - 10 11 12
By Price: 6 11 1 - 3 4 7 - 2 9 12 - 10 8 5
By Stock: 3 2 1 - 6 4 5 - 8 9 7 - 11 12 10
By Size: ...
Are you sure you want the different sort orders to be crawled? I mean, unless you also change other page content, such as the title tag:
<!-- canonical -->
<title>Joe's Store - Products list</title>
<!-- sorted by price -->
<title>Joe's Store - Products list - cheaper first!</title>
I agree that sorting can generate different content, but it should not be crawled and indexed if the only thing that changes is the user's sorting preference. If you are going to provide a different title, meta tags and so on, then I think it is worthwhile to keep the pages separate; otherwise, using a canonical is recommended.
Of course, nofollow on the sorting links also works, and very well, but you should not rely on it alone.
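For completeness, nofollow on the sorting links would look something like this (the URLs and labels are just my example):

```html
<!-- Sorting links the crawler is asked not to follow; the default
     listing itself remains a normal, followable link -->
<a href="/items?id=1234&sort=price" rel="nofollow">Cheaper first!</a>
<a href="/items?id=1234&sort=stock" rel="nofollow">In stock first</a>
```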
Let's get to my point:
First, think of your users
Second, make sure your markup is as good as possible (this includes rel="nofollow" on the right body-copy anchors, link rel="canonical", and meta noindex in the head)
Third, use the tools you have, like robots.txt, sitemap.xml and Google Webmaster Tools
Why all this? Because you should worry first about your users, then about well-written markup, and last about crawlers. If your page is really well written, not only users and Google will appreciate it, but other search engines and tools as well.
Use Google Webmaster Tools to ignore the GET parameter (i.e. "sort") for different sort orders.
I recently encountered this problem with an e-commerce shop in which the "similar items" feature was presented as a hard-coded link from every product page (which created many, many "similar items" pages for each product). Telling Google to ignore the "similar items" GET parameter fixed the situation and took the number of "pages" down from 90+ million to a few thousand within several days, with appreciable improvements in ranking (and, obviously, crawl rate).