Header field name case in XML
At 09:27 PM 2/7/03 +0000, Chris Croome wrote:
Hi Graham
On Fri 07-Feb-2003 at 06:02:35PM +0000, Graham Klyne wrote:
>
> "But is it good practice to produce email headers that are lower case?"
>
> I don't think it's a *bad* practice.
OK, but it's so uncommon that it doesn't seem to be a sensible thing
to do...
> Maybe the question to ask is: "Are there any known problems caused
> by sending emails with lower case email headers?".
Good question. I don't know. However I do know that there are a
_lot_ of broken mail servers and mail clients out there and I can't
think of a good reason to risk upsetting other software when
producing software that sends out email. Sending things out in a
case sensitive way (using the norms) and accepting either seems the
best thing to do.
Since I posted that, I asked on IETF-822 [1], which has some folks who have
*real* experience with *real* email, and my take on the tone of the
responses was that there have been some rare examples of such problems, but
not enough that we should be concerned.
[1] http://www.imc.org/ietf-822/mail-archive/msg03094.html
> This is an issue to which I've not previously given adequate
> thought. I see two choices:
>
> (a) adopt some case-normalized form (e.g. all lowercase) for all
> header names in XML, and copy them as-is into RFC2822 messages.
> This has the advantage of being pretty simple to implement in a
> completely generic fashion. But note the question above.
>
> (b) adopt a standard spelling w.r.t. upper/lower case for all
> headers. This may be more stylish, but I think it raises problems
> in the scenarios you mention, particularly when trying to map from
> RFC2822 messages to XML in a generic fashion: what to do about
> previously unknown header field names? This looks rather
> problematic to me.
I'm not sure I really understand these two options, however the way
I'll try and explain by example. I think we really need to understand each
other here:
If we opt for all-lower-case, (a) would have us use:
<foo:from>...</foo:from>
<foo:to>...</foo:to>
<foo:cc>...</foo:cc>
<foo:message-id>...</foo:message-id>
(Another option might be to capitalize the first letter, but the result is
still different from what follows...)
The "standard spelling" approach (b) would have the xml use, say:
<foo:From>...</foo:From>
<foo:To>...</foo:To>
<foo:cc>...</foo:cc>
<foo:Message-ID>...</foo:Message-ID>
etc. In this case, one cannot easily construct the correct element name
from a given case-inconsiderate [*] input, such as a message containing the
first form of input.
[*] by which I mean an RFC2822 message that uses, as it is permitted to
do, something other than the "conventional" case mixture for header field
names.
I see it is that when producing plain text mail to put over the net
it's probably best to stick to case conventions. When this email is
being produced from an XML templating system one can match the case
in the XML or do something with regular expressions, like this Perl module
does:
http://search.cpan.org/author/NWIGER/Text-Header-1.03/Header.pm
Currently, due to being lazy we are doing the former, but using one
extra Perl module isn't a big deal so I'm easy on this issue.
If it were just a case of one extra Perl module, I might agree. But more
than that is needed: given any valid email message (which *may*
have header field names in all-lowercase, or all-uppercase, or anywhere
between), it would be necessary to know explicitly about each header field
used in order to construct the "conventional" spelling.
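To illustrate why explicit knowledge of each field is needed, here is a minimal Python sketch of a purely generic rule (capitalize each hyphen-separated part). The function name is hypothetical and the rule is only one plausible guess at a generic approach, not anything the proposed format specifies. It recovers "From" but not "Message-ID":

```python
def naive_conventional_case(name: str) -> str:
    """Capitalize each hyphen-separated part of a header field name.

    A generic rule like this needs no table of known fields, which is
    exactly why it cannot recover spellings such as "Message-ID".
    """
    return "-".join(part.capitalize() for part in name.lower().split("-"))


if __name__ == "__main__":
    for field in ("from", "FROM", "message-id", "mime-version"):
        print(field, "->", naive_conventional_case(field))
```

The rule yields "Message-Id" and "Mime-Version" rather than the conventional "Message-ID" and "MIME-Version", so a generic mapper would still need a per-field table for anything beyond simple cases.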
(Ironically, by following this path, I think we'd be in danger of causing
something like the very potential problem you indicate as a reason not to
"just use lowercase", because XML-based software could be tripped up by
emails that don't use standard case conventions.)
...
OK, here's a suggestion that partially addresses your concerns:
(1) the XML format always uses the lower-case form of header field names in
element names. Therefore, when mapping from RFC2822->XML, all field names
are converted to lowercase.
(2) software that generates RFC2822 from the XML creates "conventional
case" output for those header fields for which it knows the conventional
casing.
(3) software that generates RFC2822 from the XML creates all-lower-case
output for any other header field names.
I note that items (2) and (3) are not within the scope of the XML message
format specification.
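The three items above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names and the contents of the CONVENTIONAL_CASE table are hypothetical, and a real implementation would carry a much fuller table of known fields.

```python
# Hypothetical table of known conventional spellings, keyed by the
# lower-case form used in the XML. Entries here are illustrative only.
CONVENTIONAL_CASE = {
    "from": "From",
    "to": "To",
    "message-id": "Message-ID",
}


def rfc2822_to_xml_name(field_name: str) -> str:
    """Item (1): field names are always lower-cased for the XML."""
    return field_name.lower()


def xml_to_rfc2822_name(element_name: str) -> str:
    """Items (2) and (3): use the conventional case when it is known,
    otherwise emit the name in all-lower-case."""
    key = element_name.lower()
    return CONVENTIONAL_CASE.get(key, key)


if __name__ == "__main__":
    for name in ("MESSAGE-id", "From", "X-Unknown-Field"):
        xml_name = rfc2822_to_xml_name(name)
        print(name, "->", xml_name, "->", xml_to_rfc2822_name(xml_name))
```

Because unknown names simply pass through in lower case, the RFC2822->XML direction stays fully generic, and only the generating software needs to know any conventional spellings.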
#g
-------------------
Graham Klyne
<GK@xxxxxxxxxxxxxx>