
Index page content identical to page 1 of a gallery-type website

@Correia994

Posted in: #DuplicateContent #Seo

I have a gallery type website, e.g. a site that lists blog posts or pictures in a paginated manner.

However, I have 2 pages that have identical content:


example.com/index.html
example.com/page/1


Page 2, 3 and so on have different content naturally.

However, for SEO purposes, what is the best way of telling Google that page 1 is identical to index.html?

Should I 302 redirect index.html to /page/1, so that index.html effectively no longer exists, or should I put a canonical tag on /page/1 (but not on /page/2) that points to index.html?


3 Comments


@Hamaas447

If you have a straightforward pagination structure, such that page/1 always has the same content as the index, then I agree that you should designate one of these duplicate URLs as the canonical one, and either use rel=canonical links to inform Google about this, or simply redirect the non-canonical URL(s) to the canonical one. In fact, the best option would be to adjust your pagination code so that there are no links to the non-canonical URLs in the first place.
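
As a rough illustration of adjusting the pagination code, here is a minimal Python sketch. It assumes index.html is the URL you picked as canonical for page 1; the function names `page_url` and `pagination_links` are made up for this example and do not come from any framework.

```python
def page_url(page_number: int) -> str:
    """Return the one URL that navigation links should use for a page."""
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    # Assumption: index.html is the canonical URL for the first page,
    # so a link to /page/1 is never emitted.
    if page_number == 1:
        return "/index.html"
    return f"/page/{page_number}"


def pagination_links(current: int, total: int) -> dict:
    """Build prev/next links for a page, leaving out links that do not exist."""
    links = {}
    if current > 1:
        links["prev"] = page_url(current - 1)
    if current < total:
        links["next"] = page_url(current + 1)
    return links
```

With this in place, the "previous" link on page 2 points at /index.html rather than /page/1, so crawlers following your own navigation never reach the duplicate URL.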

However, note that such a URL structure is quite suboptimal if new pages may be added to the list (anywhere but at the end), since inserting a new page will cause the numbers of all higher-numbered pages to shift. This will:


break any links to those pages,
require Google to re-crawl the whole list of pages, and
until that happens, potentially cause some pages to be missing or duplicated in Google's index.
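
To make the shifting problem concrete, here is a small Python sketch. The post names and the page size of 2 are invented for the example; the only assumption is that the list is shown newest first, so a new post lands at the front.

```python
def paginate(items, per_page):
    """Split items into consecutive pages of at most per_page entries."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


# Hypothetical list of posts, newest first.
posts = ["post-5", "post-4", "post-3", "post-2", "post-1"]

before = paginate(posts, 2)
# Publishing one new post pushes every older post one slot down the list.
after = paginate(["post-6"] + posts, 2)

# /page/2 is index 1 in each list of pages.
page_2_before = before[1]
page_2_after = after[1]
```

Here /page/2 holds post-3 and post-2 before the new post is published, and post-4 and post-3 afterwards, so an existing link or index entry for /page/2 no longer matches what the page shows.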


Instead, a more user-friendly and search-engine-friendly URL structure gives every page a permanent URL based on e.g. a timestamp or post ID. (For an example, see the URL of this page.) Thus, for your example list of blog posts, you might have a set of URLs like this:


/index.html (always shows content of newest post)
/post/20140818 (currently identical to index.html)
/post/20140817
/post/20140816
/post/20140815
etc.


with navigation links pointing to the previous and next permanent URL, and a "latest" link pointing to the index URL.
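
A possible way to generate those navigation links, sketched in Python. The `nav_links` helper and the oldest-to-newest ordering of `post_ids` are assumptions made for this example; the /post/ prefix follows the URL list above.

```python
def nav_links(post_ids, current_id):
    """Return navigation links for one permalink page.

    post_ids is assumed to be sorted from oldest to newest.
    """
    idx = post_ids.index(current_id)
    # The "latest" link always points at the index URL, never at a permalink.
    links = {"latest": "/index.html"}
    if idx > 0:
        links["previous"] = f"/post/{post_ids[idx - 1]}"
    if idx < len(post_ids) - 1:
        links["next"] = f"/post/{post_ids[idx + 1]}"
    return links


ids = ["20140815", "20140816", "20140817", "20140818"]
middle = nav_links(ids, "20140817")
newest = nav_links(ids, "20140818")
```

Because the links are derived from post IDs rather than from positions in the list, adding a new post only adds a "next" link to what used to be the newest permalink; no existing URL changes.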

Such a URL structure has a number of advantages:


Users can easily link to any of the posts, or to the index page showing the latest post, without the links breaking when new posts are added.
Both the index page and the individual post pages can also be easily bookmarked.
Search engines will only need to crawl each page once (except for the index page, which will be detected as frequently changing, and thus recrawled often).


(For optimal crawling, you should also maintain an up-to-date XML sitemap of all the pages, including the index, and ping search engines whenever a new page is added. As a fallback, you should also ensure that the index page always has a link to the latest permanent URL somewhere on it, so that search engines recrawling the index page will discover any new permanent URLs.)
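
For the sitemap part, a minimal Python sketch using only the standard library might look like this. The `build_sitemap` helper, the base URL and the list of paths are placeholders for this example; a real sitemap would normally be regenerated from your post database whenever a page is added.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, paths):
    """Return a sitemap XML string with one <url> entry per path."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url + path
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site: the index plus two permalinks.
sitemap_xml = build_sitemap(
    "https://example.com",
    ["/index.html", "/post/20140818", "/post/20140817"],
)
```

Note that both /index.html and the newest permalink are listed, which matches the advice to include the index alongside the permanent URLs.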

As for using redirects or rel=canonical links with such a permalink-based pagination structure, the established practice appears to be not to do it. While the index page and the latest permalink URL will indeed (temporarily) have duplicated content, they aren't really equivalent from an SEO viewpoint, and you do want both to be indexed separately. Fortunately, as this is a commonly used pagination scheme (and, I believe, actually recommended by Google), major search engines will generally handle it well, treating it as "acceptable" duplication and generally showing appropriate results (the index page for general searches, permalinks for page-specific keywords) without any explicit hinting.


@Jennifer507

To begin with, you should link to your pages using a consistent URL. If example.com/index.html and example.com/page/1 contain the same content, then you should remove one of these links from your website by editing your HTML files or PHP code.

Next, if Google or other search engines have picked up both URLs, you can either:


Send a 301 Moved Permanently header and redirect from the non-canonical URL to the canonical one.
Send a 404 Not Found header when the non-canonical URL is requested. Do this only if you do not care about existing links.
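
The 301 option could be sketched like this in Python. It is framework-agnostic on purpose: `respond` is a hypothetical helper that only decides the status and headers, and the choice of /index.html as the canonical URL is an assumption, since either URL could be the one you keep.

```python
from http import HTTPStatus

# Assumption: /index.html is canonical and /page/1 is the duplicate.
CANONICAL = {"/page/1": "/index.html"}


def respond(path):
    """Return (status, headers) for a request path."""
    target = CANONICAL.get(path)
    if target is not None:
        # 301 tells crawlers the move is permanent, unlike a 302.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": target}
    return HTTPStatus.OK, {}


status, headers = respond("/page/1")
```

In a real site the same decision would usually live in the web server configuration or the application's routing layer rather than in a standalone function.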


@Goswami781

No, don't use a 302, because it signals a temporary redirect. Set a canonical link on page 1 that points to index.html. That is the right way.

Note: both pages still exist, but Google (and other search engines) will only pick one to display in search results.
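
For illustration, the canonical link could be emitted from the page template roughly like this. This is a Python sketch; `canonical_tag` and the https://example.com base URL are assumptions made for the example, and the returned markup belongs inside the page's head element.

```python
from html import escape


def canonical_tag(page_number, base_url="https://example.com"):
    """Return a rel=canonical link for page 1, and nothing for other pages."""
    if page_number == 1:
        # Only /page/1 duplicates the index, so only it gets the hint.
        href = escape(base_url + "/index.html", quote=True)
        return f'<link rel="canonical" href="{href}">'
    return ""


tag_for_page_1 = canonical_tag(1)
tag_for_page_2 = canonical_tag(2)
```

Page 2 and later pages get no tag here because their content is not a duplicate of the index.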
