
Header field name case in XML




At 09:27 PM 2/7/03 +0000, Chris Croome wrote:
Hi Graham

On Fri 07-Feb-2003 at 06:02:35PM +0000, Graham Klyne wrote:
>
> "But is it good practice to produce email headers that are lower case?"
>
> I don't think it's a *bad* practice.

OK, but it's so uncommon that it doesn't seem to be a sensible thing
to do...

> Maybe the question to ask is: "Are there any known problems caused
> by sending emails with lower case email headers?".

Good question. I don't know. However I do know that there are a
_lot_ of broken mail servers and mail clients out there and I can't
think of a good reason to risk upsetting other software when
producing software that sends out email. Sending things out in a
case sensitive way (using the norms) and accepting either seems the
best thing to do.

Since I posted that, I asked on IETF-822 [1], which has some folks who have *real* experience with *real* email. My take on the tone of the responses was that there have been some rare examples of such problems, but not enough for us to be concerned about.


[1] http://www.imc.org/ietf-822/mail-archive/msg03094.html

> This is an issue to which I've not previously given adequate
> thought.  I see two choices:
>
> (a) adopt some case-normalized form (e.g. all lowercase) for all
> header names in XML, and copy them as-is into RFC2822 messages.
> This has the advantage of being pretty simple to implement in a
> completely generic fashion.  But note the question above.
>
> (b) adopt a standard spelling w.r.t. upper/lower case for all
> headers.  This may be more stylish, but I think it raises problems
> in the scenarios you mention, particularly when trying to map from
> RFC2822 messages to XML in a generic fashion:  what to do about
> previously unknown header field names?   This looks rather
> problematic to me.

I'm not sure I really understand these two options, however the way

I'll try and explain by example. I think we really need to understand each other here:


If we opt for all-lower-case, (a) would have us use:

    <foo:from>...</foo:from>
    <foo:to>...</foo:to>
    <foo:cc>...</foo:cc>
    <foo:message-id>...</foo:message-id>

(Another option might be to capitalize the first letter, but the result is still different from what follows...)

The "standard spelling" approach (b) would have the xml use, say:

    <foo:From>...</foo:From>
    <foo:To>...</foo:To>
    <foo:cc>...</foo:cc>
    <foo:Message-ID>...</foo:Message-ID>

etc. In this case, one cannot easily construct the correct element name from a given case-inconsiderate [*] input, such as a message containing the first form of input.

[*] by which I mean an RFC2822 message that uses, as it is permitted to do, something other than the "conventional" case mixture for header field names.
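
To make the difficulty with (b) concrete, here is a minimal Python sketch. The function names are hypothetical, chosen only for this illustration. Lowercasing needs no knowledge of the individual field, whereas a generic capitalization rule (an assumed heuristic, not something anyone has proposed here) cannot recover a spelling such as "Message-ID":

```python
def lowercase_name(field_name):
    """Option (a): normalize any header field name to lower case."""
    return field_name.lower()


def naive_conventional_name(field_name):
    """A generic guess at option (b): capitalize each hyphen-separated part."""
    return "-".join(part.capitalize() for part in field_name.split("-"))


# Field names as they might arrive in a valid RFC2822 message.
for received in ("FROM", "message-id", "MIME-version"):
    print(received, "->", lowercase_name(received), "/",
          naive_conventional_name(received))
```

The guess yields "Message-Id" and "Mime-Version" rather than "Message-ID" and "MIME-Version", so getting (b) right would need a per-field table.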

I see it is that when producing plain text mail to put over the net
it's probably best to stick to case conventions. When this email is
being produced from an XML templating system one can match the case
in the XML or do something with regular expressions (like this Perl
module does):

http://search.cpan.org/author/NWIGER/Text-Header-1.03/Header.pm

Currently, due to being lazy we are doing the former, but using one
extra Perl module isn't a big deal so I'm easy on this issue.

If it were just a case of one extra Perl module, I might agree. But more than that is needed: given any valid email message (which *may* have header field names in all-lowercase, or all-uppercase, or anywhere between), it would be necessary to know explicitly about each header field used in order to construct the "conventional" spelling.


(Ironically, by following this path, I think we'd be in danger of causing something like the very potential problem you indicate as reason to not "just use lowercase", because XML-based software could be tripped up by emails that don't use standard case conventions.)

...

OK, here's a suggestion that partially addresses your concerns:

(1) the XML format always uses the lower-case form of header field names in element names. Therefore, when mapping from RFC2822->XML, all field names are converted to lowercase.

(2) software that generates RFC2822 from the XML creates "conventional case" output for those header fields for which it knows the conventional casing.

(3) software that generates RFC2822 from the XML creates all-lower-case output for any other header field names.

I note that items (2) and (3) are not within the scope of the XML message format specification.
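
As a rough sketch of how (1)-(3) might fit together, here is some illustrative Python. Everything in it is an assumption made for the example: the function names, the "foo" prefix borrowed from the examples above, and the small table of conventional spellings, which a real implementation would need to extend.

```python
# Hypothetical table of conventional spellings, keyed by lower-case name.
# Only a handful of RFC2822 fields are listed here.
CONVENTIONAL_CASE = {
    "from": "From",
    "to": "To",
    "cc": "Cc",
    "subject": "Subject",
    "date": "Date",
    "message-id": "Message-ID",
}


def xml_element_name(field_name, prefix="foo"):
    """Item (1): RFC2822 -> XML always uses the lower-case field name."""
    return f"{prefix}:{field_name.lower()}"


def rfc2822_field_name(element_local_name):
    """Items (2) and (3): conventional case if known, else all lower case."""
    name = element_local_name.lower()
    return CONVENTIONAL_CASE.get(name, name)


print(xml_element_name("MESSAGE-ID"))
print(rfc2822_field_name("message-id"))
print(rfc2822_field_name("x-unknown-field"))
```

The mapping into XML stays fully generic, and knowledge of individual fields is confined to the lookup table on the output side.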

#g



-------------------
Graham Klyne
<GK@xxxxxxxxxxxxxx>