How should websites handle hostname with trailing dot?
I read this question How can URLs have a dot . at the end, e.g. bla.de.? and realise that FQDN should contain a trailing . for the root label of the DNS tree:
example.com. instead of example.com
However, there are issues as pointed out in this blog article:
If you do not consider the fact that the user can accidentally enter
the domain name with a dot at the end, or follow a link received from
some "well-wisher" and get on your domain name with the dot at the
end, as the result it may lead to unexpected consequences:
1) If the website uses HTTPS, when navigating to the domain name with
the dot at the end, the browser will display the warning on untrusted
connection.
2) Authentication may be broken, as cookies are usually set for the
domain name without a dot at the end. User in this case will be quite
surprised why he can’t log in. It is noteworthy, that if you set a
cookie for a domain name with a dot at the end, this cookie will not
be passed to the domain name without the dot at the end and vice
versa.
3) JavaScript on the page may be broken.
4) There may be problems with the caching of website pages (for
example, www.cloudflare.com/ does not clear the pages cache if
domain name has a dot at the end considering it an invalid domain
name).
5) If in conditions in the web server configuration you rely on the
particular domain name ($http_host in Nginx, %{HTTP_HOST} in Apache)
without the dot at the end, you may face a variety of unexpected
situations: unexpected redirects, basic-authorization problems, etc.
6) If the web server is not configured to accept requests on the
domain name with the trailing dot, any user who accidentally typed a
domain name with the trailing dot will see something like Bad Request
- Invalid Hostname.
7) It is possible that search engines may find that your resource has
a duplicate content, if someone accidentally or intentionally post
links to your web pages with a dot at the end of the domain name.
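Point 2 of the quote can be illustrated with a simplified version of the domain-match rule from RFC 6265. This is only a sketch: the function name `domain_match` is hypothetical, the IP-address exclusion is left out, and real browsers apply further checks (public suffix lists, for instance).

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match: a plain string comparison."""
    host = request_host.lower()
    domain = cookie_domain.lower()
    if domain.startswith("."):
        domain = domain[1:]  # a single leading dot in Domain= is ignored
    if host == domain:
        return True
    # A suffix match only counts on a label boundary.
    return host.endswith("." + domain)


print(domain_match("www.example.com", "example.com"))    # True
print(domain_match("www.example.com.", "example.com"))   # False
print(domain_match("www.example.com.", "example.com."))  # True
```

Because the comparison is purely textual, "example.com" and "example.com." behave as two unrelated cookie domains.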
I also realise that webmasters.stackexchange.com./ returns a 400 Bad Request. But since the domain name proper should contain a . at the end, shouldn't we be issuing a 400 error or a 301 redirect for hostnames without a trailing dot? What is the proper way to deal with this issue in a coherent and consistent manner?
More posts by @Dunderdale272
3 Comments
my comment at core.trac.wordpress.org/ticket/35248#comment:9 :
my reply to the text by the first link ( web.archive.org/web/20160604095348/http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/web-fully-qualified-domain-name.html ):
Originally, as defined in RFC 1738 (§ 3.1), the "host" portion of a (Common Internet Scheme) URL was always and unequivocally a fully qualified domain name and the conventional mechanism for distinguishing fully-qualified domain names from non-fully-qualified domain names did not apply. Whether it was example.com. or example.com, the host was intended to be the same.
-- i think he is not right, i think "example.com" was not allowed at all in urls according to rfc 1738, it is cited in the second text, and i cite it:
3.1. Common Internet Scheme Syntax
//<user>:<password>@<host>:<port>/<url-path>
host
The fully qualified domain name of a network host
and "example.com" could not be used in http headers at that time, because rfc 1738 is of 1994 and host field appeared only with http 1.1 in 1997 (you can check in wikipedia).
so, indeed, only fqdns were allowed in urls. i think this was an error in rfc 1738, because it made (or tried to make) the "relative domains" feature useless. had it not disallowed them, relative domains could in theory have been used in "a" tag hrefs on local scripted sites or in static html documentation inside big companies that used relative domains, provided browsers and servers supported it. but even though rfc 1738 disallowed them, people did not obey it: they continued to use top-level domains in relative form, i.e. without a trailing dot, so the prohibition was not a big practical problem anyway. people also had and used an alternative to relative domains: they simply created local top-level domains like "localhost" (and used, and still use, them without a trailing dot as well).
then he says:
Unfortunately, in practice web browsers have always violated that specification and passed the "host" portion through the name qualification procedures of their DNS Client libraries when mapping the host name to a set of IP addresses. (For example, those that used the BIND DNS Client library would leave the RES_DNSRCH option set and would not append the final trailing dot if it was missing.)
-- i think he meant that hosts without a trailing dot should simply be rejected as errors, and only absolute domains (fqdns) should be passed to dns. i think browsers probably passed all domains to dns because people used their custom local top-level domains like "localhost". and anyway, rfc 2396, published in 1998, later allowed top-level domains without trailing dots in urls.
then the author (Jonathan de Boyne Pollard) cites rfc 2396 and regrets that it was changed to follow established human behaviour, i.e. de facto standards. he says it would have been better if browsers had obeyed rfc 1738, and recommends that everyone use only fqdns, everywhere, as rfc 1738 required.
-- but what would have happened if people had obeyed rfc 1738? urls like "http://example.com/test.html" and "http://localhost/test.html" would all have had to be rewritten as "http://example.com./test.html" and "http://localhost./test.html". browsers would have had to either flag hosts without trailing dots as errors, or redirect them to their full/absolute form on click. everyone who had configured local top-level domains like "localhost" would have had to configure their servers to accept only requests for domains like "localhost.", or to accept and redirect [all urls inside] "localhost" to [the corresponding urls in] "localhost.".
text like "localhost" would have stayed useful only when typed into the browser address bar, but that is a nearly useless usage, and the relative domain feature is not needed for it, because browsers search for domains as you type. using such names in html source would have become pointless: either the links would not work, or every click on a "localhost" link would move the user to "localhost." with an extra redirect each time. so rfc 1738 would have made the planned "relative domain" feature entirely useless.
suppose some company used that feature, with relative domains in its local sites, and browsers did not redirect those urls to absolute form, so its sites worked normally. if the company also obeyed rfc 1738, it would configure its servers to accept only fqdns, and would then have to either rewrite all such urls with fqdns or live with an extra redirect on every click. if the company liked having a short domain like "team101" instead of "team101.microsoft.com." in its address bars and html sources, it would have to start using custom internal top-level domains like "team101." (i.e. like "localhost.") instead of subdomains like "team101.microsoft.com." (which could be written as just "team101" before it decided to obey rfc 1738).
--
and i have found out that the trailing dot, which rfc 1738 supported so strongly, really appeared only after the standard without trailing dots! it appeared with rfc 1034 in 1987, which is cited in the second link, and i cite it:
Since a complete domain name ends with the root label, this leads to a
printed form which ends in a dot. We use this property to distinguish between:
- a character string which represents a complete domain name
(often called "absolute"). For example, "poneria.ISI.EDU."
- a character string that represents the starting labels of a
domain name which is incomplete, and should be completed by
local software using knowledge of the local domain (often
called "relative"). For example, "poneria" used in the
ISI.EDU domain.
rfc 1034 (of 1987) simply declared all the domain names then in use, which it seems were all written without trailing dots, to be relative domains! but they still worked as before, so probably few people learned about the change, and they continued to think they were unambiguously requesting the one real "example.com" site when they wrote "example.com" without a trailing dot. that became an additional security hole in some cases: the famous real example.com could be spoofed by a subdomain administrator even if he had no rights to create a local domain like "localhost.". so rfc 1034 was not designed very well either: it seems its authors did not expect that it might fail to become widely known, and so create a security hole!
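the completion of relative names "by local software" can be sketched roughly as below. this is only a model of the search/ndots behaviour documented for unix resolv.conf, not the code of any real resolver; the function name `candidate_names` and the suffix "corp.example" are made up for illustration, and other systems (windows, for instance) have their own rules.

```python
def candidate_names(name: str, search_list: list[str], ndots: int = 1) -> list[str]:
    """Return the absolute names a stub resolver would try, in order."""
    if name.endswith("."):
        # Already absolute: the search list is never consulted.
        return [name]
    searched = [f"{name}.{suffix.rstrip('.')}." for suffix in search_list]
    as_is = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name as absolute first, then the search list.
        return [as_is] + searched
    # Too few dots: try the search list first, the bare name last.
    return searched + [as_is]


print(candidate_names("poneria", ["ISI.EDU"]))
print(candidate_names("example.com", ["corp.example"]))
print(candidate_names("example.com.", ["corp.example"]))
```

in this model "example.com" can still fall through to "example.com.corp.example." when the absolute lookup fails, while "example.com." never touches the search list.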
probably rfc 1738 (1994) tried at last to bring the distinction between absolute and relative domains to a wide audience, and also to fix that security hole after 7 years. but by fixing the hole through disallowing relative domains in urls, it made relative domains useless (though i think they were probably not widely used, perhaps only in some big companies). so what would have been the result if rfc 1738 had been obeyed? 1) the relative domains declared in 1987 would finally have become useless, and so the trailing dot, designed to mark an absolute domain, would also have become useless and redundant "legally", i.e. as defined by the rfcs! (but maybe they planned to re-allow relative domains in urls many years later, once the general public knew about the possibility of relative domains.) 2) rfc 1738, if obeyed, would also have fixed the security hole. but even rfc 1034 would not have created the security hole if it had reached the masses and it had been widely understood that using a relative domain is not safe! so the main recipe for fixing it was reaching a wide audience, and publishing one more rfc was just one of many ways to do that.
i now think the relative domain feature probably did not become widely known after rfc 1034 (of 1987) because it was of too limited use: only in the local networks of some big companies or providers. it was a feature with no practical value, because local networks could already create any local domain, so the feature existed only for its own sake; it was in fact just a piece of rfc text that everybody was supposed to know and use without gaining any benefit! but people created the little security hole by widely ignoring the rfc while browsers started to obey it.
i checked the relative domains feature yesterday, and it works. (that is ok, because rfc 2396 (of 1998) re-allowed it after rfc 1738 (of 1994) had disallowed it, and the later rfc 3986 (of 2005) still allows it.) i added a dns suffix in windows 10 - control panel - ... - network device properties - ipv4 properties - additional - dns tab. when i added "google.com" and then opened "http://mail/" in firefox, it reached google's server, but that server was not configured to work with just "mail" in the http "host" header, so i got something like a "404" page.
--
my reply to the text by the second link ( www.dns-sd.org/trailingdotsindomainnames.html ):
he also cites the rule in rfc 1738 and says:
Unfortunately, the people implementing web browser clients appeared not to understand what this meant. When you access a web site, the value most web browsers put in the "Host:" field is what the user typed, not what the computer actually ended up using, after applying the DNS user's searchlist to construct a fully-qualified name from the partial name. For example, here are three different ways the user may refer to the host "www.example.com." ... When sending the "Host:" parameter to the web server, the web browser client puts in what the user typed ("www.example.com.", "www.example.com", or "www") instead of what the client ended up actually looking up in DNS ("www.example.com." in all three cases). ...
-- this is not quite correct, because rfc 1738 was very strict in this regard: it disallowed relative domains in all urls, including those typed into a browser's address bar. a url is the [recommended] way of making any reference to a site, even when people write it on paper, so under rfc 1738 users were not allowed to refer to that site in those 3 ways if they believed they were writing a URL!
and it seems the author of this text (Stuart Cheshire) did not know about rfc 2396, so this text is outdated.
--
and what is the situation nowadays? rfc 3986 ( tools.ietf.org/html/rfc3986#page-21 ) allows referring to an absolute domain without a trailing dot: it says " The rightmost domain label of a fully qualified domain name in DNS may be followed by a single "." " and that it should be used if it is "necessary to distinguish between the complete domain name and some local domain". i think that due to de facto standards it is almost never necessary, so wordpress can accept the de facto standard and redirect from the address with a trailing dot to the address without it.
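the redirect suggested above (from the address with a trailing dot to the address without it) could look roughly like this. it is only a sketch of the idea, not wordpress code; the function name `canonical_url` is made up, and any userinfo part of the url is dropped for simplicity.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop a single trailing dot from the host; leave other urls unchanged."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith("."):
        return url
    netloc = host[:-1]
    if parts.port is not None:
        netloc += f":{parts.port}"  # keep an explicit port if one was given
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(canonical_url("http://example.com./test.html"))
print(canonical_url("https://www.example.com.:8443/a?b=1"))
print(canonical_url("http://example.com/test.html"))
```

a server would send a 301 to the returned url whenever it differs from the requested one.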
I like to think of the trailing dot as the "real" root of the Internet, and that it lives in Virginia, USA. If you leave out the dot, then some root is always implied. For normal users, it's the same root, and that's the situation I will discuss today.
In my perverse way, I actually find the trailing dot quite handy. If I'm checking out someone else's website and I want to start fresh, with no caching, no cookies etc, and I'm too lazy to flush those out, I'll either use a different browser or I will add the dot. If the site doesn't redirect me, I've got all-fresh uncached URL's for all the site's pages and other resources.
As a webmaster, I want all people and robots viewing a page to be viewing it with the same URL, and therefore with the same hostname. If the hostname isn't the one I want them to use, I'll do an immediate 301 redirect so they will see the correct URL in their browser. For my PHP-based sites, I handle the problem in PHP and not in the .htaccess or web.config file, as it is more portable and is easier to test on development and staging servers. I handle my database connections at the same time, as they also vary among development/staging/production servers.
Here is a simplified version of my typical code. Note the canonical redirects towards the end.
$Host = $_SERVER['HTTP_HOST'];
switch ( $Host ) {
    case 'exampleweb.local': // my local dev machine
        $MysqliParams = array(
            'host' => 'localhost',
            'username' => 'root',
            'passwd' => 'snoopy',
            'dbname' => 'exampledb');
        break;
    case 'www.exampleweb.com': // the "live" site
        $MysqliParams = array(
            'host' => 'superhost1.net',
            'username' => 'examp302',
            'passwd' => 'anything-but-snoopy',
            'dbname' => 'examp302_db');
        $GoogleAccount = 'UA-13243546-01'; // only enable for live site
        break;
    case 'exampleweb.mystagingsite.net': // the client preview site
        $MysqliParams = array(
            'host' => 'superhost1.net',
            'username' => 'examp302',
            'passwd' => 'anything-but-snoopy',
            'dbname' => 'examp302_staging');
        break;
    case 'exampleweb.com': // canonical redirects
    case 'exampleweb.com.':
    case 'www.exampleweb.com.':
        header('HTTP/1.1 301 Moved Permanently');
        // Location needs an absolute URL; without a scheme it is treated as a relative path
        header('Location: http://www.exampleweb.com/');
        exit;
    default:
        die("invalid hostname $Host");
}
To partially answer your question, you can add it to your .htaccess canonical forwarder rules. In a basic HTTP sense, the rule looks for a period at the end of the hostname (just before the URI) and works it into whatever anti-duplicate forwarding mechanism you use. Here is an example including a common "addon domain" sub util route:
RewriteCond %{HTTP_HOST} ^domain\.hostdomain\.com\.?$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.hostdomain\.com\.?$ [OR]
RewriteCond %{HTTP_HOST} ^domain\.com\.?$ [OR]
RewriteCond %{HTTP_HOST} ^www\.domain\.com\.$
RewriteRule ^(.*)$ "http://www.domain.com/" [R=301,L]
What this would do is forward all of the following to a canonical HTTP www domain:
domain.hostdomain.com
domain.hostdomain.com.
www.domain.hostdomain.com
www.domain.hostdomain.com.
domain.com
domain.com.
www.domain.com.
All forward to:
www.domain.com
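A note on the host patterns: an unescaped . in a regular expression matches any character, so a pattern written as ^domain.com(|.)$ accepts more hostnames than intended, while ^domain\.com\.?$ accepts only the name with or without a trailing dot. The difference can be checked with a small Python script. This is only a sanity check under the assumption that Python's re module behaves like Apache's regex engine for patterns this simple; the hostnames are placeholders.

```python
import re

LOOSE = re.compile(r"^domain.com(|.)$")   # unescaped: "." matches any character
STRICT = re.compile(r"^domain\.com\.?$")  # literal dots, optional trailing dot

for host in ["domain.com", "domain.com.", "domain.comx", "domainxcom"]:
    # Columns: hostname, loose pattern matches, strict pattern matches
    print(host, bool(LOOSE.match(host)), bool(STRICT.match(host)))
```

Only the first two hostnames should match the strict pattern; the loose one accepts all four.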
There is a caveat to this though: as stated in the original blog quote, HTTPS requests will not forward cleanly and will raise a browser warning or a 400 Bad Request error on most servers (especially with HSTS). This is because the hostname, including the period after the TLD, is checked against the certificate during the TLS handshake, before any .htaccess rules run. I am not sure of a workaround for the certificate warning, since it happens before .htaccess is processed.