Re: Special Characters


From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Wed Apr 19 2000 - 11:21:05 CDT


In <Pine.OSF.4.21.0004081822440.16964-100000@worf.netins.net> Jonathan Grobe <grobe@netins.net> writes:

>Well, I've just spent an hour or so reading through RFCs and all I can
>say is that I'm more confused now than when I started.

>I based my code on son-of-1036 and it's been rejecting those same
>message-IDs (i.e., the <blah_[1.2.3.4]@ns.sol.net> stuff).

>Looking at 822, 1036, son-of-1036, USEFOR, and DRUMS, I conclude that
>in all of those, the above ID is invalid because of the [ and ].

Well, we should assume that DRUMS will shortly supersede 822, and USEFOR
is supposed to be the same as DRUMS - but without the obsolete stuff -
(so we should fix it if we find any discrepancy). So let us interpret
these tricky cases with DRUMS (i.e. MESSFOR) and see where we get.

The DRUMS syntax is

msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]
id-left = dot-atom-text / no-fold-quote / obs-id-left
id-right = dot-atom-text / no-fold-literal / obs-id-right
no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE
no-fold-literal = "[" *(dtext / quoted-pair) "]"

Following that,
        <blah_[1.2.3.4]@ns.sol.net>
is no good because blah_[1.2.3.4] is not an id-left (not even an obsolete
one). However, if you quote it:
        <"blah_[1.2.3.4]"@ns.sol.net>
then it is OK.
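
As a rough illustration, the non-obsolete part of the grammar above can be sketched as a regular expression. This is only a sketch under assumptions: the character classes (atext, qtext, dtext, quoted-pair) are not given in this message and are borrowed from RFC 2822 as later published, so they may not match the draft exactly; CFWS and the obs- alternatives are left out; and the name is_valid_msg_id is made up for the example.

```python
import re

# Character classes assumed from RFC 2822; the draft under discussion may differ.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
NO_WS_CTL = r"\x01-\x08\x0b\x0c\x0e-\x1f\x7f"
QTEXT = rf"[{NO_WS_CTL}\x21\x23-\x5b\x5d-\x7e]"   # no DQUOTE, no backslash
DTEXT = rf"[{NO_WS_CTL}\x21-\x5a\x5e-\x7e]"       # no "[", "]", backslash
QUOTED_PAIR = r"\\[\x01-\x09\x0b\x0c\x0e-\x7f]"

NO_FOLD_QUOTE = rf'"(?:{QTEXT}|{QUOTED_PAIR})*"'
NO_FOLD_LITERAL = rf"\[(?:{DTEXT}|{QUOTED_PAIR})*\]"

# msg-id without CFWS and without obs-id-left / obs-id-right (simplification).
MSG_ID = re.compile(
    rf"<(?:{DOT_ATOM_TEXT}|{NO_FOLD_QUOTE})"
    rf"@(?:{DOT_ATOM_TEXT}|{NO_FOLD_LITERAL})>"
)


def is_valid_msg_id(value: str) -> bool:
    """Return True if value matches the simplified msg-id pattern."""
    return MSG_ID.fullmatch(value) is not None


print(is_valid_msg_id("<blah_[1.2.3.4]@ns.sol.net>"))    # unquoted [ ] in id-left
print(is_valid_msg_id('<"blah_[1.2.3.4]"@ns.sol.net>'))  # quoted id-left
```

Under these assumptions the first call reports False and the second True, matching the two cases above.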

>But all the standards are also inconsistent. What a mess. And even the
>new USEFOR seems to have problems. It starts by dropping one of the " in
>the double quote convention (a typo no doubt). Then makes reference
>to strict-qtext strict-quoted-pair which I can't find defined anywhere.

OK, I fixed the typo. You will find strict-qtext and strict-quoted-pair in
USEFOR 2.4. Essentially, they ensure that only 7-bit characters are
allowed (bringing it into strict accord with DRUMS - i.e. no UTF-8 stuff
in message-IDs).

>Basically, most of the standards define a few special characters which
>can only be used in the ID if they are quoted. [ ] are characters that
>must be quoted in all the standards.

Certainly in an id-left. In an id-right they may appear unquoted, but only
as the delimiters of a literal that is the whole of the id-right, so you
can have:
        <foo.bar@[1.2.3.4]>

>822 says that atoms must be quoted. So for example, you could do this:

> <"[special_stuff]".a.b@domain>

>that is, you don't have to quote the entire local part, but you can.

In MESSFOR, that is allowed under the obsolete syntax. I.e. MUST accept,
but MUST NOT generate.
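
The accept/generate split can be sketched as two patterns for the id-left: a strict one for what may be generated, and a wider one that also takes the obsolete word-dot-word form. This is a sketch under assumptions: the obsolete form is taken to be word *("." word) with word = atom / quoted-string, as in RFC 2822; the quoted word is simplified to printable ASCII plus backslash escapes; and the names may_generate and must_accept are invented for the example.

```python
import re

# atext assumed from RFC 2822; quoted words simplified to printable ASCII.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
ATOM = rf"{ATEXT}+"
QUOTED = r'"(?:[\x21\x23-\x5b\x5d-\x7e]|\\[\x21-\x7e])*"'
WORD = rf"(?:{ATOM}|{QUOTED})"

# Strict id-left: dot-atom-text or a single quoted string.
STRICT_LEFT = re.compile(rf"(?:{ATOM}(?:\.{ATOM})*|{QUOTED})")
# Obsolete id-left: any mix of atoms and quoted strings joined by dots.
OBS_LEFT = re.compile(rf"{WORD}(?:\.{WORD})*")


def may_generate(id_left: str) -> bool:
    """True if id_left fits the strict (non-obsolete) form."""
    return STRICT_LEFT.fullmatch(id_left) is not None


def must_accept(id_left: str) -> bool:
    """True if id_left fits the strict or the obsolete form."""
    return OBS_LEFT.fullmatch(id_left) is not None


for left in ['"[special_stuff]".a.b', '"[special_stuff].a.b"']:
    print(left, must_accept(left), may_generate(left))
```

The partly quoted form is accepted but not generated, while the fully quoted form passes both checks. A USEFOR-style checker would use only the strict pattern.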

>USEFOR seems to say that if there's a special character, you must quote
>the entire local part:

Indeed so, because USEFOR neither accepts nor generates the obsolete
syntax.

> <"[special_stuff].a.b"@domain>

>USEFOR seems to allow spaces if they are quoted:

> <" white space ok here"@domain>

Yes indeed. This has been spotted, and DRUMS have been asked to do
something about it (note to Pete Resnick - is anything happening there?).
If DRUMS do not fix it, then we may have to make a unilateral decision of
our own.

>Which is totally bogus if it does.

>USEFOR seems to allow use of the \ to escape special characters,
>as in:

> <\[special\].a.b@domain>

No, I don't see that USEFOR allows that. You can only use quoted-pairs
inside a no-fold-quote.

>son-of-1036 doesn't allow quoting (of any type) to be used in Message-IDs.
>So you just can't use [ and ], period. It also throws in ! as a special
>character which can't be used in Message-IDs. With son-of-1036,
>even this: <asdfasdf@[12.12.12.12]> is an invalid Message-ID.

I think deviating from son-of-1036 was the price we paid for getting DRUMS
to drop folding in message-IDs.

>At this point, I'm quite lost when it comes to figuring out what my
>news server should allow and/or reject when it comes to message-IDs....

Just follow USEFOR (or non-obsolete DRUMS).

>I allow [IP-ADDR] as the domain part of the ID now, but only if it's
>the entire domain name and only if it strictly follows the N.N.N.N format
>where N is 0 to 255.

And here we find an oddity in DRUMS (which we inherit), namely that within
[...] you can have almost anything you want:
        <foo.bar@[any_ascii_except_\[,\]_and\ \\_(even_NO-WS-CTL)]>
(pretty, isn't it?)

Essentially, MESSFOR decided not to get bogged down in defining what a
domain literal was, leaving it to the transport mechanism to decide
whether it was meaningful (yes, you can use all that "pretty" stuff in
addr-specs too). There might even be some (non-internet) transport
mechanism out there that understands it.
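
To make the contrast concrete, here is a sketch comparing the strict N.N.N.N check described in the quoted text with the permissive no-fold-literal rule. It rests on assumptions: the dtext and quoted-pair ranges are taken from RFC 2822 as later published and may differ from the draft, and both function names are invented for the example.

```python
import re

# dtext and quoted-pair ranges assumed from RFC 2822.
_DTEXT = r"[\x01-\x08\x0b\x0c\x0e-\x1f\x7f\x21-\x5a\x5e-\x7e]"
_QUOTED_PAIR = r"\\[\x01-\x09\x0b\x0c\x0e-\x7f]"
NO_FOLD_LITERAL = re.compile(rf"\[(?:{_DTEXT}|{_QUOTED_PAIR})*\]")

# Strict form: four decimal fields inside brackets, range-checked below.
DOTTED_QUAD = re.compile(
    r"\[([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\]"
)


def is_no_fold_literal(id_right: str) -> bool:
    """Permissive check: anything the literal syntax allows."""
    return NO_FOLD_LITERAL.fullmatch(id_right) is not None


def is_dotted_quad_literal(id_right: str) -> bool:
    """Strict check: [N.N.N.N] with each N between 0 and 255."""
    match = DOTTED_QUAD.fullmatch(id_right)
    return match is not None and all(int(n) <= 255 for n in match.groups())


for right in ["[1.2.3.4]", "[256.1.1.1]", "[not_an_address]"]:
    print(right, is_no_fold_literal(right), is_dotted_quad_literal(right))
```

All three samples pass the permissive check, but only the first passes the strict one. A server that wants the stricter behaviour would be applying a local policy on top of the syntax.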

So USEFOR follows MESSFOR here (though maybe a SHOULD NOT generate would
be in order).

Note that USEFOR no longer allows [1.2.3.4] in a path-identity (but it
does allow ...!1.2.3.4!...).

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Email:     chl@clw.cs.man.ac.uk  Web:   http://www.cs.man.ac.uk/~chl
Voice/Fax: +44 161 437 4506      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9     Fingerprint: 73 6D C2 51 93 A0 01 E7  65 E8 64 7E 14 A4 AB A5



