
Should plus be encoded in mailto: hyperlinks?

@Voss4911412

Posted in: #Hyperlink #Mailto #UrlEncoding

When placing an email address with an address tag (aka sub-addressing) in a mailto hyperlink …

<a href="mailto:username+foo@example.com">mail us now!</a>

… should the plus in the email be URL encoded?

<a href="mailto:username%2Bfoo@example.com">mail us now!</a>

I can't figure this out, and the documentation is conflicting. Our real world tests have produced mixed results as well, making it even more confusing.


7 Comments


@Goswami781

Per RFC 6068 as mentioned in answers, you MAY encode the plus sign as %2B.

The reason there's confusion is that converting a space into a plus isn't actually part of standard URL encoding; it's part of form parameter encoding (i.e. application/x-www-form-urlencoded).

It's like the difference between PHP's rawurlencode() and urlencode().

So what RFC 6068 is saying is that a mailto: URL should use "raw" standard URL encoding (per RFC 3986): a plus sign that appears in the URL should always be treated as a literal plus sign, never as a form-encoded space.

If the local client does convert the plus into a space, it's broken.
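The same split exists in Python's standard library, so here is a small sketch of the difference (Python is my choice for illustration, not something from the question): quote()/unquote() behave like "raw" encoding, while quote_plus()/unquote_plus() apply the form-encoding rule for spaces.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

address = "username+foo@example.com"

# "Raw" percent-encoding: space becomes %20, plus becomes %2B.
raw_encoded = quote("a b+c", safe="")        # 'a%20b%2Bc'

# Form encoding: space becomes '+', a real plus becomes %2B.
form_encoded = quote_plus("a b+c")           # 'a+b%2Bc'

# Raw decoding leaves a literal plus alone, as a mailto client should.
raw_decoded = unquote(address)               # 'username+foo@example.com'

# Form decoding turns the plus into a space and corrupts the address.
form_decoded = unquote_plus(address)         # 'username foo@example.com'

# A percent-encoded plus decodes to a literal plus either way.
escaped_decoded = unquote("username%2Bfoo@example.com")

print(raw_encoded, form_encoded, raw_decoded, form_decoded, escaped_decoded, sep="\n")
```

A client that decodes the address with the form-style function is the "broken" case described above.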


@Pope3001725

You MAY encode +, but you don't have to.

First, we need to agree that mailto is an example of a generic URI, specified by RFC 2396. (This is what XHTML and HTML 4 use).

Now let us find out the list of reserved characters in RFC 2396.

reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
           "$" | ","


URI splits into absolute and relative:

URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]


And because scheme mailto: is specified this is an absolute URI:

absoluteURI = scheme ":" ( hier_part | opaque_part )


And since both patterns for hier_part start with /, mailto is an opaque part.

opaque_part = uric_no_slash *uric

uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                "&" | "=" | "+" | "$" | ","

uric = reserved | unreserved | escaped


So the only restriction is that you have to escape / if it is the first character; after that you can use reserved characters, including + and @, unescaped.
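To check that reading of the grammar, here is a rough sketch that transcribes the opaque_part rules into a regular expression. The unreserved set (alphanumerics plus the RFC 2396 "mark" characters) is filled in from the RFC rather than from the excerpt above, and is_opaque_part is just an illustrative name.

```python
import re

# unreserved = alphanum | mark, with mark = - _ . ! ~ * ' ( )  (RFC 2396)
UNRESERVED = r"A-Za-z0-9\-_.!~*'()"
# escaped = "%" hex hex
ESCAPED = r"%[0-9A-Fa-f]{2}"

# uric_no_slash: everything in uric except "/"
URIC_NO_SLASH = rf"(?:[{UNRESERVED};?:@&=+$,]|{ESCAPED})"
# uric = reserved | unreserved | escaped
URIC = rf"(?:[{UNRESERVED};/?:@&=+$,]|{ESCAPED})"

# opaque_part = uric_no_slash *uric
OPAQUE_PART = re.compile(rf"{URIC_NO_SLASH}{URIC}*")


def is_opaque_part(text: str) -> bool:
    """Return True if text matches the RFC 2396 opaque_part production."""
    return OPAQUE_PART.fullmatch(text) is not None


print(is_opaque_part("username+foo@example.com"))    # True: literal + is allowed
print(is_opaque_part("username%2Bfoo@example.com"))  # True: escaped + is allowed too
print(is_opaque_part("/username@example.com"))       # False: leading slash
```

Both spellings of the address pass, which matches the "you MAY encode, but don't have to" conclusion.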

Here's another RFC to support this. RFC 6068, the most recent specification of the mailto scheme (published in 2010), says:


Software creating 'mailto' URIs likewise has to be careful to encode
any reserved characters that are used. HTML forms are one kind of
software that creates 'mailto' URIs. Current implementations encode
a space as '+', but this creates problems because such a '+' standing
for a space cannot be distinguished from a real '+' in a 'mailto'
URI. When producing 'mailto' URIs, all spaces SHOULD be encoded as
%20, and '+' characters MAY be encoded as %2B. Please note that '+'
characters are frequently used as part of an email address to
indicate a subaddress, as for example in <bill+ietf@example.org>.
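As a sketch of what that paragraph asks of software producing mailto URIs, something like the following would do. build_mailto and its encode_plus switch are names I made up for illustration, and the subject header is just an example field.

```python
from typing import Optional
from urllib.parse import quote


def build_mailto(address: str, subject: Optional[str] = None,
                 encode_plus: bool = False) -> str:
    """Build a mailto URI with spaces as %20 and '+' optionally as %2B."""
    # '@' stays literal; '+' stays literal unless the caller opts in,
    # since RFC 6068 only says it MAY be encoded.
    safe = "@" if encode_plus else "@+"
    uri = "mailto:" + quote(address, safe=safe)
    if subject is not None:
        # quote() never emits '+' for a space, so spaces become %20.
        uri += "?subject=" + quote(subject, safe="")
    return uri


print(build_mailto("username+foo@example.com"))
# mailto:username+foo@example.com
print(build_mailto("username+foo@example.com", encode_plus=True))
# mailto:username%2Bfoo@example.com
print(build_mailto("bill+ietf@example.org", subject="Hello world + more"))
# mailto:bill+ietf@example.org?subject=Hello%20world%20%2B%20more
```

Inside header values I encode the plus unconditionally, because that is where a form-decoding client is most likely to mistake it for a space.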


@Kevin317

Per the newer RFC 6068 (tools.ietf.org/html/rfc6068#section-5):
... '+' characters MAY be encoded as %2B


So I guess the answer is don't, but maybe?


@Speyer207

From RFC 1738:


3.5. MAILTO

The mailto URL scheme is used to designate the Internet mailing address of an individual or service. No additional information other than an Internet mailing address is present or implied.

A mailto URL takes the form:

mailto:<rfc822-addr-spec>

where <rfc822-addr-spec> is (the encoding of an) addr-spec, as specified in RFC 822. Within mailto URLs, there are no reserved characters.

Note that the percent sign ("%") is commonly used within RFC 822 addresses and must be encoded.

Unlike many URLs, the mailto scheme does not represent a data object to be accessed directly; there is no sense in which it designates an object. It has a different use than the message/external-body type in MIME.


Since there are no reserved characters it should be encoded.


@Gretchen104

A strict reading of the relevant RFC says that the "+" should be encoded.

Section 2, top of page 2 on tools.ietf.org/html/rfc2368 says:


"Note that all URL reserved
characters in "to" must be encoded: in
particular, parentheses, commas,
and the percent sign ("%"), which
commonly occur in the "mailbox"
syntax."


The RFC for URIs (http://tools.ietf.org/html/rfc3986#section-2.2) lists "+" as a reserved character.

That said, what is "correct" is not necessarily what will work in all browsers. Some browsers will obviously always handle the correct things as if they were wrong and the incorrect as if they were right.

Edit: As for RFC 6068 and its "MAY", I would read that as context dependent. If you are writing the URL as plain text for people to read, a literal "+" makes more sense. If you're writing it in HTML, the stricter interpretation of RFC 3986 is more in line with "valid HTML" ideas, so anything consuming the value should expect it to be encoded.


@Caterina187

I think that encoding it or not won't make a real difference.
The problem is the mail clients. For example, Yahoo Mail only uses the hyphen for sub-addressing, whereas Gmail uses the plus.

That's my 2 cents...

EDIT: The response below has a solid point.


@Megan663

The plus is used to encode spaces in URLs, not in HTML and not in SMTP (RFC2821). However, since mailto:address@server.com is a URI (it has a protocol, the protocol separator and the protocol address) then it should be treated as a URI and it should be percent encoded.

Therefore, it is up to the client to resolve accurately the encoded representation and to decode it as far as is appropriate. Here is Microsoft's official take on the matter.

You should apply URL encoding on mailto: URLs embedded in HTML if the characters in the email address are URI reserved. This ensures that you are doing the correct thing. It is up to the client to decode the URI appropriately from whence it is received. Yes, this+address@gmail.com is a very valid email; yes this%2Baddress@gmail.com is also valid. Yes those two are different, but whether they'll be treated differently is up to the client...

As you previously noted, not all clients render this correctly. I suggest finding the client your users are most likely to use (Gmail? browser-based clients? Outlook?) and doing what that client does. You said you tested on Gmail? How did you test it? With a browser-based mailto: client (such as the add-ons offered for Firefox and Gmail), the URI is most likely not being decoded (as it should be).
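For what "decoding appropriately" could look like on the client side, here is a minimal sketch. parse_mailto is a hypothetical helper, not any real client's code, and it handles only the simple address-plus-headers shape.

```python
from urllib.parse import unquote


def parse_mailto(uri: str):
    """Split a mailto URI into (address, headers) using raw percent-decoding."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "mailto":
        raise ValueError("not a mailto URI")
    to_part, _, query = rest.partition("?")
    headers = {}
    if query:
        for field in query.split("&"):
            name, _, value = field.partition("=")
            # unquote() decodes %XX only; a literal '+' stays a plus.
            headers[unquote(name).lower()] = unquote(value)
    return unquote(to_part), headers


print(parse_mailto("mailto:username+foo@example.com"))
# ('username+foo@example.com', {})
print(parse_mailto("mailto:username%2Bfoo@example.com?subject=Hi%20there"))
# ('username+foo@example.com', {'subject': 'Hi there'})
```

A client written this way treats both spellings of the link identically. One that skips the decode step, or decodes with a form-style function, would explain the mixed test results.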
