Pagination, Duplicate Content, and SEO

@Speyer207

Posted in: #CanonicalUrl #Seo

Please consider a list of items (forum comments, articles, shoes, doesn't matter) which are spread over multiple pages. Different sort orders are supported (by date, by popularity, by price, etc).

So a URL might look like this (I use the query style here to simplify things):

/items?id=1234&page=42&sort=popularity

/items?id=1234&page=5&sort=date

Now, in terms of SEO, I think I should be worried about duplicate content. After all, each item appears at least as many times as there are sort orders.

I've seen Matt Cutts talking about the rel=canonical link tag, but he also said that the canonical page should have very similar content. But this is not the case here because page #1 in a non-canonical sort order might have completely different items than page #1 in the canonical sort order. For a given non-canonical page, there is no clear canonical page listing all the same items, so I think rel=canonical won't help here.

Then I thought about using the noindex meta tag on all pages with non-canonical sort order, and not using it on all pages with canonical sort order.
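
Concretely, a minimal sketch of what I mean (assuming sorting by date is the canonical order):

<!-- head of /items?id=1234&page=42&sort=popularity (a non-canonical order) -->
<meta name="robots" content="noindex">

<!-- pages with sort=date (the canonical order) carry no robots meta tag -->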

However, if I use that method, what will happen to backlinks pointing to non-canonical pages -- will they still spread their PageRank juice, even though the first page Googlebot (or any other crawler) encounters is marked as "noindex"?

Can you please comment on my problem and what you think is the best solution?

If you think you have a better solution, please consider that 1) I do not want to use JavaScript for this, and 2) I do not want all the items on one page.

Thank you.


2 Comments


@Lee4591628

Use a rel="canonical" link tag. It exists for this purpose. It was suggested first by google, but whatwg and w3c are in ways to approve it as spec.

Per your comments on @danlefree's answer, your concern may be related to URL rewriting to user-friendly URLs, i.e., you are using www.domain.tld/apples/weight/ instead of www.domain.tld/items.php?id=23&sort=weight.
Either way, the request reaches the same script, sets the same application variables, and outputs the same content.

Just make sure you point to the same (your preferred) URL on all related pages (for example, every sorted variant of the same listing; in my example it could be www.domain.tld/apples/).
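
A minimal sketch, assuming www.domain.tld/apples/ is your preferred URL:

<!-- in the head of every sorted variant of the apples listing -->
<link rel="canonical" href="http://www.domain.tld/apples/">
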
EDIT

Going further into the discussion...

First, Matt Cutts is a great guy, and I have nothing against him, but never, NEVER take anything as the holy grail. It may be a good proposal, but your judgement must be the final word -- not any person's, regardless of his or her position and qualifications.

Let's get back to your example. Assumptions:


You have a product listing page
By default, it sorts by ID
Users can choose to sort by stock quantity, price, or size
Each page shows 3 items
You currently have 12 different items to be listed
Prices and stock quantities can vary over time


OK, now when a user enters your page, they will see these lists:


Default: 1 2 3 - 4 5 6 - 7 8 9 - 10 11 12
By Price: 6 11 1 - 3 4 7 - 2 9 12 - 10 8 5
By Stock: 3 2 1 - 6 4 5 - 8 9 7 - 11 12 10
By Size: ...


Are you sure you want the different sorting pages to be crawled? I mean, unless you change other page content as well, like the title tag:

<!-- canonical -->
<title>Joe's Store - Products list</title>
<!-- sorted by price -->
<title>Joe's Store - Products list - cheaper first!</title>


I agree that sorting can generate different content, but it should not be crawled and indexed if all that changes is the user's sorting preference. If you are going to provide a different title, meta tags, and so on, then I think it is worthwhile to keep everything separate; otherwise, using rel="canonical" is recommended.

Of course, rel="nofollow" on sorting links also works, and very well, but you should not rely on it alone.
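
For example, a sketch using the query-style URLs from the question:

<!-- sorting links marked so crawlers do not follow them -->
<a href="/items?id=1234&sort=price" rel="nofollow">Sort by price</a>
<a href="/items?id=1234&sort=date" rel="nofollow">Sort by date</a>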

Now, let me get to my point:


First, think of your users
Second, make sure your markup is as good as possible (this includes rel="nofollow" on the right body-copy anchors, link rel="canonical", and meta noindex in the head)
Third, use the tools you have, like robots.txt, sitemap.xml, and Google Webmaster Tools (a robots.txt sketch follows below)


Why all this? Because you should worry first about your users, second about well-written markup, and last about crawlers. If your page is really well written, not only users and Google will appreciate it; other search engines and tools will too.
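
As a sketch of that third point, a robots.txt rule that keeps crawlers away from the sorted variants (assuming the query-style URLs from the question; note the * wildcard is honored by Google but is not part of the original robots.txt standard):

User-agent: *
Disallow: /*sort=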


@Cody1181609

Use Google Webmaster Tools to have Google ignore the GET parameter (i.e., "sort") used for the different sort orders.

I recently encountered this problem with an e-commerce shop in which the "similar items" feature was presented as a hard-coded link on every product page (which created many, many "similar items" pages for each product): telling Google to ignore the "similar items" GET parameter fixed the situation and took the number of "pages" down from 90+ million to a few thousand within several days, with appreciable improvements in ranking (and, obviously, crawl rate).
