
"Pretty URLs" with only JavaScript content changing

@Sent6035632

Posted in: #ModRewrite #Seo

I have a small web app that consists of a single HTML file but accepts any URL in its directory. I'm using mod_rewrite to serve index.html for all of those URLs.

The index page can be reached at either of these URLs, for example, and at the same path with any other number:

http://stacklint.aboutscript.com/
http://stacklint.aboutscript.com/18327507


Will I be penalized in search rankings if Google sees all of these URLs having the same HTML source?

Should I declare all of these pages as canonical to the root?


@Heady270

Google generally won't penalize a site outright for this. Most sites accept arbitrary URL parameters that change the URL without changing the page, so both of these return the same content:

http://example.com/page.html
http://example.com/page.html?foo=bar


Usually when Google finds duplicate pages within your site, it just chooses one to index and ignores the rest.
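
If you want to see how many of your own URLs collapse into the same page, you can normalize them yourself. This is only a rough sketch for auditing your own URL list, not a description of how Google picks a URL; the function names are made up for illustration, and it assumes the query string and fragment never change the page.

```javascript
// Hypothetical helper: drop the query string and fragment, assuming
// neither one changes what the page shows.
function stripQueryAndFragment(url) {
  const u = new URL(url);
  u.search = '';
  u.hash = '';
  return u.href;
}

// Group a list of URLs by their normalized form so duplicates show up together.
function groupDuplicates(urls) {
  const groups = new Map();
  for (const url of urls) {
    const key = stripQueryAndFragment(url);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(url);
  }
  return groups;
}

const groups = groupDuplicates([
  'http://example.com/page.html',
  'http://example.com/page.html?foo=bar',
]);
console.log(groups.size); // 1
```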

The disadvantages of your setup are:


Users may find various URLs for the same page and link to different ones.
Googlebot may have to crawl many variations of each URL, which can use up your crawl budget, so Googlebot may not get to other important pages.
Google may choose to index a URL other than the one you would prefer.
Links to various URLs with the same content spread your link juice (PageRank) across those URLs, so you will not rank as highly as you would if all the links pointed to the same page.
Some bots check that your site serves 404 error pages at URLs that should not exist, such as /this-is-a-404-tecwnstst.html, and your site will fail that check. For a long time you couldn't register a site in Google Webmaster Tools unless it served 404 pages at nonsense URLs.


As a fix you could:


Redirect all of the pages back to the root.
Change your setup so that you serve the page at only one URL and serve 404 pages everywhere else.
Use the canonical link element (rel="canonical") to tell Googlebot which URL is the canonical one.
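
To make the second option concrete, here is a minimal sketch of the decision the server would have to make. It assumes the only real pages are the root and purely numeric paths like the one in the question; that rule, and the names `VALID_PATH` and `statusForPath`, are my assumptions and not something the question states. In practice you would express the same rule in your mod_rewrite pattern rather than in JavaScript.

```javascript
// Assumption: the app only has real content at "/" and at purely numeric
// paths such as "/18327507". Adjust the pattern to your own URL scheme.
const VALID_PATH = /^\/(\d+)?$/;

// Returns the HTTP status the server should answer with for a given path.
function statusForPath(path) {
  return VALID_PATH.test(path) ? 200 : 404;
}

console.log(statusForPath('/'));                              // 200
console.log(statusForPath('/18327507'));                      // 200
console.log(statusForPath('/this-is-a-404-tecwnstst.html'));  // 404
```

With a rule like this, nonsense URLs fail fast with a real 404 instead of returning a copy of the index page.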


@Sue5673885

Yes, declare index.html (or the root) as the canonical URL with a link element in the HEAD section, as shown below:

<link rel="canonical" href="http://stacklint.aboutscript.com/index.html"/>

or

<link rel="canonical" href="http://stacklint.aboutscript.com/"/>
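
If you ever need to generate that tag rather than hard-code it, something like the following would do. It is a small sketch, assuming every page on the host should point at the root as canonical; `canonicalLinkTag` is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: build a canonical tag that points at the site root,
// whatever path or query string the current page URL has.
function canonicalLinkTag(pageUrl) {
  const href = new URL('/', pageUrl).href;
  return `<link rel="canonical" href="${href}">`;
}

console.log(canonicalLinkTag('http://stacklint.aboutscript.com/18327507'));
// <link rel="canonical" href="http://stacklint.aboutscript.com/">
```

Since index.html is a static file here, pasting the fixed tag into the HEAD is simpler; the helper only matters if the markup is produced by a script or build step.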
