Why treat these as URLs with different path capitalization and trailing slash as different?

Posted in: #CanonicalUrl #CaseSensitive #TrailingSlash #Url #UrlRewriting

These are all strictly different URLs:

http://www.example.com/page
http://www.example.com/pAge
http://www.example.com/page/
http://www.example.com/paGE/

I get that it conforms to the strict ISO rules, but why? How many websites are there out there that actually treat page and page/ as different URLs you can visit? Or actually use capitalisation to differentiate content? If they did I would tell them they are probably doing it wrong.

Why do we have to waste our time conforming to these rules? Isn't it quite trivial for Google to work out that page and page/ are the same page and probably shouldn't be treated as duplicate content?


@Moriarity557


3 Comments


@Reiling115

No offense intended, but case sensitivity is VITAL to URLs today. Case-sensitive URLs are used millions of times a day, for example on bit.ly:

http://bit.ly/ri2LhQ
http://bit.ly/ri2LHq

Two vastly different sites - only possible because of case sensitivity


@Cody1181609

I get that it conforms to the strict ISO rules, but why?

There are different operating systems behind the various servers on the net, and for some of them a directory or file named page is not the same as one named Page. The result is that those really are two different locations, and not even necessarily the same type of location (directory vs. page). The web server might be configured to be case-insensitive, but you can't assume that. Therefore, the rules have to assume that case matters; where it doesn't, nothing is lost. Realistically, it's probably not a great idea to rely on case differences, but the situation does exist and so it has to be accounted for, sometimes with things like mod_speling.
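To make the "rules have to assume case matters" point concrete, here is a minimal Python sketch of a syntactic URL comparison. It treats the scheme and host as case-insensitive but compares the path exactly. The function name urls_equivalent is made up for illustration, and the comparison deliberately ignores userinfo and fragments.

```python
from urllib.parse import urlsplit


def urls_equivalent(a: str, b: str) -> bool:
    """Compare two URLs, ignoring case only in the scheme and host.

    Hypothetical helper: a simplification that skips userinfo and
    fragments and does no percent-encoding normalization.
    """
    pa, pb = urlsplit(a), urlsplit(b)
    return (
        pa.scheme == pb.scheme            # urlsplit lowercases the scheme
        and pa.hostname == pb.hostname    # .hostname is lowercased too
        and pa.port == pb.port
        and pa.path == pb.path            # exact match, case included
        and pa.query == pb.query
    )


print(urls_equivalent("http://www.example.com/page", "HTTP://WWW.EXAMPLE.COM/page"))
print(urls_equivalent("http://www.example.com/page", "http://www.example.com/pAge"))
print(urls_equivalent("http://www.example.com/page", "http://www.example.com/page/"))
```

Only the first pair is treated as the same URL. Whether the other two happen to return the same content is up to the server, which a client cannot know in advance.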

How many websites are there out there that actually treat page and page/ as different URLs you can visit?

They are different. It's just almost always hidden from you:

1. When you go to example.com/foo/ the web server knows you're asking for a directory, and so looks in it for a file matching whatever it's configured to recognize as a directory index. So you eventually end up at example.com/foo/index.html, for example.
2. If you go to example.com/foo the server actually looks for a file named just foo in the root directory. If it doesn't find one, it then checks whether there's a directory named foo, and if so you go back up to #1.

What you seem to be reading as "normal" behavior in #2 is actually a fallback to handle a likely case.
How many sites actually use extension-less filenames is irrelevant. Again: it's a real problem that needs to be accounted for.
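The two lookups described above can be sketched in Python against a pretend file system. Everything here is an assumption for illustration: the resolve function, the files and dirs sets, and the index.html default are invented, and the fallback from foo to foo/ is modelled as a redirect, which is a common server default but not the only possible configuration.

```python
def resolve(path: str, files: set[str], dirs: set[str], index: str = "index.html"):
    """Return (action, target) for a request path.

    files and dirs hold site-relative paths without a trailing slash,
    e.g. "/foo" for a directory and "/foo/index.html" for a file.
    Lookups are exact, so "/Foo" and "/foo" are different names.
    """
    if path.endswith("/"):
        # Trailing slash: treat the path as a directory and look for its index.
        directory = path.rstrip("/")
        candidate = f"{directory}/{index}"
        if (directory == "" or directory in dirs) and candidate in files:
            return ("serve", candidate)
        return ("not_found", path)
    if path in files:
        # No trailing slash: a file with exactly this name wins.
        return ("serve", path)
    if path in dirs:
        # Fallback: no such file, but a directory exists, so retry with a slash.
        return ("redirect", path + "/")
    return ("not_found", path)


files = {"/foo/index.html", "/bar"}
dirs = {"/foo"}
for request in ("/foo/", "/foo", "/bar", "/Foo"):
    print(request, resolve(request, files, dirs))
```

In this sketch /foo only works because of the fallback branch, and /Foo fails outright because the lookup is case-sensitive.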

If they did I would tell them they are probably doing it wrong.

That is an opinion.
You can back it up with various practical arguments about case-insensitivity and how to handle extension-less URLs that I don't necessarily disagree with, but factually you would be wrong to say this.


@Samaraweera270

This is not a Google policy; these are basic rules.

From a Windows user's point of view, case-sensitive filenames are hard to understand. However, on Unix/Linux systems, pAge and page are not the same file or directory, and the same goes for web servers running on them.

The trailing slash is a configuration issue (or choice).
Keep in mind that most web servers will issue a 30x redirect from /page to /page/, which requires a second request to your server.

You can make your web server case-insensitive and configure it in any way you want to comply with your own rules.

But again, it is not related to Google at all.
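As a rough illustration of "your own rules", here is a Python sketch of one possible site policy: lowercase the path and drop the trailing slash, then redirect any request that isn't already in that form. The names canonical_url and redirect_target are hypothetical, and the policy itself is only an example. A site that really does serve different content at /page and /pAge could not use it.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Apply one hypothetical site policy: lowercase path, no trailing slash."""
    parts = urlsplit(url)
    path = parts.path.lower()
    if len(path) > 1:
        path = path.rstrip("/")   # keep the bare "/" for the site root
    # Assumption: the netloc has no case-sensitive userinfo, so lowercasing is safe.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path or "/", parts.query, parts.fragment)
    )


def redirect_target(url: str):
    """Return the URL to redirect to, or None if the URL is already canonical."""
    canonical = canonical_url(url)
    return None if canonical == url else canonical


for url in (
    "http://www.example.com/page",
    "http://www.example.com/pAge",
    "http://www.example.com/page/",
    "http://www.example.com/paGE/",
):
    print(url, "->", redirect_target(url))
```

Under this policy all four URLs from the question end up at the same address, at the cost of one extra request for every variant that is not already canonical.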

