
Duplicate title tags and meta descriptions after removing .html extension from files

@Cody1181609

Posted in: #DuplicateContent #GoogleSearchConsole #Htaccess #Seo

Google Webmaster Tools/Search Console is giving me errors regarding duplicate title tags and meta descriptions.

The website in question is a static HTML website. All documents do have a .html extension. In order to remove the .html from all documents I am using the code below in my .htaccess file:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)$ $1.html [NC,L]


So, for example, example.com/about.html becomes example.com/about. Now Google thinks that there are two separate about pages, even though there is only one. Can someone explain to me how to resolve this?


2 Comments


 

@Ogunnowo487

If your .html URLs were already indexed at the time you changed your URLs (and removed the .html extension) then the only way to preserve your SEO and avoid duplicate content from the get go is to implement 301 redirects from the .html URL to your desired URL.

(This assumes you have changed all the URLs in your application to your desired "extensionless" URLs.)

Something like the following at the top of your .htaccess file:

# Redirect /page.html to /page, but only on the initial request
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.+)\.html$ /$1 [R=301,L]


The check against REDIRECT_STATUS is to avoid a redirect loop by ensuring the rewritten request (to .html) is not redirected (when the internal rewrite is triggered, REDIRECT_STATUS is set to 200).
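
Putting the redirect together with the rewrite from the question, the whole file might look roughly like the sketch below. It assumes the .htaccess file sits in the document root and that mod_rewrite is enabled. The extra -f check on the target file is an addition of this sketch (it is not in either original snippet), so that only URLs with a matching .html file get rewritten.

```apache
RewriteEngine On

# 1) External redirect: /about.html -> /about (301).
#    REDIRECT_STATUS is empty on the initial request and is set to 200
#    after rule 2 has rewritten the URL, so the rewritten request is
#    not redirected again.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.+)\.html$ /$1 [R=301,L]

# 2) Internal rewrite: /about -> /about.html (the visible URL stays /about).
#    Assumption: only rewrite when the corresponding .html file exists.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^([^.]+)$ $1.html [NC,L]
```

It is probably worth testing with R=302 first and switching to R=301 once the behaviour is confirmed, since browsers tend to cache 301 responses.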




"In order to remove the .html from all documents I am using the code below in my .htaccess file"


Aside: I guess this is probably just how you are describing it, but that isn't actually what that snippet of code does. You "remove the .html" from the URL by physically changing the URLs in your application (not with .htaccess). You then use .htaccess to internally rewrite the URL back to the actual filesystem path (with the .html extension) - and it's this that your snippet of code does. It re-appends the .html extension, it doesn't remove it.



 

@Sarah324

Let's say example.com/about is your main URL, the one you want Google to index.

And example.com/about.html is your duplicate URL, the one you don't want Google to index.

There are two solutions. You can use either one, or both.

1) Use a 301 redirect from example.com/about.html to example.com/about, so that Google indexes only the final (redirected) version of the URL.

2) Use a canonical link tag in the head section.

These URLs all serve the same page, so the canonical link tag will be the same on all of them:
example.com/about/
www.example.com/about
example.com/about.html
www.example.com/about/index.html


So when you add the canonical link tag below, all of the above URLs inherit the same canonical link tag, just as the page title and description are the same for all of them:

<link rel="canonical" href="https://www.example.com/about" />


Google should then index only the canonical URL. It treats the other URLs as duplicates and avoids indexing them.
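
If editing the head section of every static file is inconvenient, the same canonical hint can, as far as I know, also be sent as an HTTP Link header from .htaccess. Note that this is a different technique from the link tag above, not a replacement suggested in the question. The sketch assumes mod_headers is enabled, and the file name and URL are just the example ones used in this answer.

```apache
<IfModule mod_headers.c>
    # Applies to any file named about.html at or below this directory,
    # whether it was requested as /about or rewritten internally.
    <Files "about.html">
        Header set Link "<https://www.example.com/about>; rel=\"canonical\""
    </Files>
</IfModule>
```

Each page would need its own Files block pointing at its own canonical URL.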
