From: Charles Lindsey (chl@clw.cs.man.ac.uk)
Date: Thu Mar 14 2002 - 11:28:54 CST
In <3C8BDE9E.13843FF5@erols.com> Bruce Lilly <blilly@erols.com> writes:
>I have once again revised the comments at
>http://users.erols.com/blilly/mailparse/draft-ietf-usefor-article-06.pdf
Arrrrgh! The first time Bruce posted about this I looked at his PDF file,
saw that it was just our draft in PDF, looked down the first few pages and
saw nothing else, and looked at the end of the file and saw no comments
there, and so decided to ignore it.
Now he has posted about it again, I looked again and saw that he had
indeed used some Acrobat magic to stick comments on the text.
PLEASE DO NOT DO THIS!
It took me hours of work to disentangle these comments, and obtain them in
a readable form, and I don't intend to go through that exercise again.
Comments on the drafts should be emailed, in plain text, to this list.
However, his comments make various points that we need to discuss, so I
reproduce the relevant ones below (many of his minor ones I have just dealt
with in the draft, and some of them we knew about already).
>Questions and comments regarding the comments should be
>addressed to me at blilly@erols.com.
If you want to be involved in these discussions, then please subscribe to
this list.
In Section 4, regarding parameter (both ours, and any we inherit from
MIME)
[Bruce Lilly
What about RFC 2231 chars used in attributes, esp. with continuation and
charsets: "*", "'", "%"]
Well does any Mail system actually implement RFC 2231? My feeling is that
we just don't want to know, so I have written an extra paragraph to follow
the existing paragraph about RFC 2047 in 4.4.1 (both paragraphs reproduced
below). I probably need to say something in gateways when I get there.
Where the use of non-ASCII characters, encoded in UTF-8, is permitted
as above, they MAY also be encoded using the MIME mechanism defined
in [RFC 2047], but this usage is deprecated within news articles
(even though it is required in email messages) since it is less
legible in older reading agents which support neither it nor UTF-8.
Nevertheless, reading agents SHOULD support this usage, but only in
those contexts explicitly mentioned in [RFC 2047].
Similar considerations apply to non-ASCII characters within the
values of parameters (which, according to the syntax, MUST be in the
form of quoted-strings in order for UTF8-xtra-chars to be
accommodated). There is NO requirement to support the extensions set
out in [RFC 2231] for specifying continuations, character sets or
languages in such values, though reading agents MAY support them.
Is that OK?
[RFC 2231] N. Freed and K. Moore, "MIME Parameter Value and Encoded
Word Extensions: Character Sets, Languages, and Continuations",
RFC 2231, November 1997.
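
To make the two encodings under discussion concrete, here is a minimal sketch using only the Python standard library. The helper names decode_encoded_word and decode_ext_value are hypothetical, the sample strings are invented for illustration, and RFC 2231 continuations ("*0", "*1") are deliberately not handled.

```python
from email.header import decode_header
from urllib.parse import unquote


def decode_encoded_word(word: str) -> str:
    """Decode a single RFC 2047 encoded-word such as =?charset?Q?text?=."""
    data, charset = decode_header(word)[0]
    # decode_header returns bytes plus a charset for an encoded-word,
    # and a plain str with charset None for unencoded text.
    return data.decode(charset) if isinstance(data, bytes) else data


def decode_ext_value(value: str) -> str:
    """Decode an RFC 2231 extended value of the form charset'language'%XX-text."""
    charset, _language, encoded = value.split("'", 2)
    return unquote(encoded, encoding=charset)


print(decode_encoded_word("=?ISO-8859-1?Q?Andr=E9?="))
print(decode_ext_value("UTF-8''na%C3%AFve.txt"))
```

The second form is the one that brings in the "'" and "%" characters Bruce mentions; the "*" appears in the parameter name (for example filename*) rather than in the value.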
Section 5 - Path
[Bruce Lilly
":.-_" is a valid path-identity. Perhaps path-identity should be
constrained to begin with ALPHA or DIGIT.
An upper bound (presumably no more than 73 characters) on the length of
path-identity and tail-entry should be specified.]
I agree with the first, and I now have
path-identity = ( ALPHA / DIGIT )
*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
Any objections?
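
As a quick check of what the revised rule accepts, the ABNF can be transcribed into a regular expression. This is only a sketch: the function name is_path_identity and the sample host names are made up for illustration.

```python
import re

# path-identity = ( ALPHA / DIGIT ) *( ALPHA / DIGIT / "-" / "." / ":" / "_" )
PATH_IDENTITY = re.compile(r"[A-Za-z0-9][A-Za-z0-9\-.:_]*")


def is_path_identity(s: str) -> bool:
    """Return True if the whole string matches the revised path-identity rule."""
    return PATH_IDENTITY.fullmatch(s) is not None


print(is_path_identity("news.example.com"))  # begins with ALPHA
print(is_path_identity(":.-_"))              # begins with punctuation
```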
But I see no point in a limit (short of that 998). This header is for
machines to read. We do encourage it to be folded (because humans
sometimes like to peer at it, but that is a secondary use). Fixed numbers
are, generally speaking, a Bad Thing.
0.0.1. Adding a path-identity to the Path-header
When an injecting, relaying or serving agent receives an article, it
MUST prepend its own path-identity followed by a path-delimiter to
the beginning of the Path-content. In addition, it SHOULD then add
CRLF and WSP if it would otherwise result in a line longer than 79
characters.
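
The prepend-and-fold behaviour in that paragraph might be sketched as follows. Several things here are assumptions rather than draft text: the function name prepend_path_identity, the use of "!" as the path-delimiter, a single SP as the WSP, and counting the "Path: " header name towards the 79 characters.

```python
def prepend_path_identity(path_content: str, identity: str, limit: int = 79) -> str:
    """Prepend identity and a "!" delimiter to the received Path-content.

    If the first line would then exceed `limit` characters (header name
    included, by assumption), fold immediately after our own entry.
    The received content itself is never refolded.
    """
    first_line = path_content.split("\r\n", 1)[0]
    unfolded = "Path: " + identity + "!" + first_line
    if len(unfolded) > limit:
        return identity + "!" + "\r\n " + path_content
    return identity + "!" + path_content


print(prepend_path_identity("b.example!not-for-mail", "a.example"))
```

Note that the fold can only ever land after the agent's own entry, which is exactly the limitation raised in the comment that follows.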
[Bruce Lilly
This implies that an agent may only break a long Path line immediately
after its entries. Provision should be made for breaking long lines
after any delimiter (e.g. a line over 998 characters received from a
site running old software that does not break Path lines).]
Well if every relaying agent en route adopted that SHOULD, the Path header
would always come through nicely folded. But if some of them don't, it
might not. I suppose I could add, to that paragraph, ", alternatively it
MAY refold the entire Path-header in order to fit within that length."
However, that would be tinkering with stuff already received, as opposed
to tinkering with the bit it was adding itself, which might be regarded as a
Bad Thing, so I shall not make that change unless I hear support.
In regard to <dotted-quad> and <ipv6-numeric>
[Bruce Lilly
RFC 820 cannot be correct. It was obsoleted by RFC 870 which was
obsoleted by RFC 900 which was obsoleted by .... RFC 1700 which was
obsoleted by RFC 3232. The correct RFC should appear in the
references.]
Well RFC 1700 obsoletes 820, 870 and 900, but RFC 3232 does not obsolete
1700 (it is only an experimental protocol). But RFC 1700 does not define
the "dotted-quad" notation (though it clearly uses it). In fact, it seems
to be one of those things which "everybody knows", because I cannot find
a formal definition anywhere :-( .
But help is at hand! RFC 2373, which officially defines the IPv6 notation,
actually includes decent syntax for IPv4address and IPv6address, so I now
refer just to RFC 2373, and have used that syntax at the proper
places.
[Bruce Lilly
RFC 1036 section 2.1.6 treats ':' , '_', and whitespace as delimiters.
Path-identifiers ("systems") consist solely of letters, digits, hyphens,
and periods. To avoid interoperability problems, a transition period
should be provided where:
1. use of ':' and '_' as delimiters is deprecated (via SHOULD NOT)
2. use of ':' or '_' in path-identifiers is discouraged (also via SHOULD
NOT)
3. use of whitespace as sole delimiter is deprecated (but not forbidden).
4. whitespace separating path-identifiers MAY be interpreted as a
delimiter (for backward compatibility)
5. ':' and '_' MAY be interpreted as delimiters (for backward
compatibility). Note should be made that future revisions may require the
specific delimiters mentioned.]
No! We discussed this a long way back. It is quite clear that most
relayers support arbitrary punctuation as delimiters, as required by RFC
1036, so our introduction of special meanings for '%', '/' etc does not
interfere with current practice. OTOH, no relayer that we know of uses
other than '!' for a delimiter, therefore no harm can arise if we assign
some of the former delimiters to other uses.
We HAVE to allow ':' within path-identities because IPv6addresses are
surely going to appear in them (though probably not in the very short
term). There is the slight possibility that an existing relayer will
interpret a ':' in an IPv6address as a delimiter, but the worst that can
lead to is that a site may sometimes be offered an article it has already
got. We can live with that.
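
The failure mode is easy to picture. The sketch below contrasts a relayer that splits only on "!" with a hypothetical legacy one that treats every character other than letters, digits, hyphens and periods as a delimiter; the function names and the sample Path are invented for illustration.

```python
import re


def bang_split(path: str) -> list:
    """Split a Path-content on "!" only."""
    return path.split("!")


def legacy_split(path: str) -> list:
    """Treat any run of other punctuation as a delimiter (RFC 1036 style)."""
    return [entry for entry in re.split(r"[^A-Za-z0-9.\-]+", path) if entry]


path = "relay.example!3ffe:ffff::1!not-for-mail"
print(bang_split(path))
print(legacy_split(path))
# A legacy relayer sees fragments of the address, so a site known by that
# address is not found in the Path and may be offered the article again.
print("3ffe:ffff::1" in legacy_split(path))
```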
Section 6.
References:
[Bruce Lilly
Any restrictions on where an agent may break or unfold a long References
line?:
when adding a reference
when trimming
when relaying (e.g. if there are very long lines)]
I see no problem. A followup agent may fold the References line anywhere it
likes (I see no requirement to use the exact folding in the References
line of the precursor). Of course, a relaying agent MUST NOT refold
anything.
Subject lines starting with "cmsg"
NOTE: The presence of a Subject-header starting with the string
"cmsg " and followed by a Control-content MUST NOT be construed,
in the absence of a proper Control-header, as a request to
perform that control action (as may have occurred in some legacy
software). See also section 5.4.
[Bruce Lilly
RFC 1036 section 2.2.6 recommends exactly what this draft forbids. An
overnight change is impractical; a transition period where the cmsg
interpretation is recommended against (i.e. SHOULD NOT rather than MUST
NOT) should be provided, with a warning to implementors that a future
revision may change that to MUST NOT. And the requirement or
recommendation itself should be moved from the NOTE into the body of the
document.]
I think we were agreed that the practice in question was so abhorrent, and
so likely to be misused by trolls and other maldoers, that its use had to
be curtailed immediately. No proper current practice is affected because
no legitimate issuer of control messages in his right mind uses it.
Moreover, all decent newsreaders make provision for a user to cancel his
own articles with a proper cancel message, and anyone who might
legitimately want to issue a 3rd party cancel and doesn't know how
probably shouldn't be doing it anyway.
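
In code terms the rule in the NOTE amounts to looking at the Control-header and nothing else. A minimal sketch, with headers modelled as a plain dict and the function name control_request chosen purely for illustration:

```python
from typing import Optional


def control_request(headers: dict) -> Optional[str]:
    """Return the Control-content if a Control-header is present, else None.

    The Subject-header is deliberately never consulted, so a Subject
    beginning "cmsg " does not trigger any control action.
    """
    control = headers.get("Control")
    return control.strip() if control else None


print(control_request({"Subject": "cmsg cancel <1234@example.invalid>"}))
print(control_request({"Control": "cancel <1234@example.invalid>"}))
```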
Supersedes:
header =/ Supersedes-header
Supersedes-header = "Supersedes" ":" SP Supersedes-content
*( ";" other-parameter )
Supersedes-content = msg-id
[Bruce Lilly
The original official Internet definition of Supersedes as it appears in
the defining document (RFC 2156) is: "Supersedes" ":" 1*msg-id. That
uses RFC 822 syntax, so CFWS is permitted (but not required) between
msg-ids.
This document should be consistent with that definition by permitting
more than one msg-id.
Since Supersedes was not in RFC 1036, there is no reason to require CFWS
between msg-ids (as is claimed for References).]
I would not regard RFC 2156 as "the" definition of Supersedes for Netnews,
though it is for Email. I have added a NOTE:
NOTE: The Supersedes-header defined here has no connection with
the Supersedes-header that sometimes appears in Email messages
converted from X.400 according to [RFC 2156]; in particular, the
syntax here permits only one msg-id in contrast to the multiple
msg-ids in that Email version.
[RFC 2156] S. Kille, "MIXER (Mime Internet X.400 Enhanced Relay):
Mapping between X.400 and RFC 822/MIME", RFC 2156, January 1998.
The posting-date-parameter:
This parameter identifies the time at which the article was injected
(as distinct from the Date-header, which indicates when it was
written). It is in the form of the number of seconds elapsed since
January 1st 1970, optionally followed by a date-time which MUST
indicate the same time.
[Bruce Lilly
seconds elapsed since Jan 1 1970 IN WHICH TIME ZONE?]
I think that is taking pedantry too far (he will be wanting me to exclude
those leap seconds next :-) ). If somebody can give me a reference to the
proper POSIX document, I will mention it. Otherwise, it is a thing that
"everybody just knows".
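
For what it is worth, the usual POSIX reading counts the seconds from 1970-01-01 00:00:00 UTC and ignores leap seconds; that is an assumption here, since the draft text quoted above does not say so. Under that reading the "MUST indicate the same time" check might look like this (the function name seconds_match_date_time is hypothetical):

```python
from datetime import datetime, timedelta, timezone


def seconds_match_date_time(seconds: int, date_time: datetime) -> bool:
    """True if date_time denotes the instant `seconds` after 1970-01-01T00:00:00 UTC."""
    if date_time.tzinfo is None:
        raise ValueError("date_time must carry a zone offset")
    # Aware datetimes compare as instants, regardless of the zone they are written in.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) == date_time


# The same instant written with a +0100 offset still matches.
plus_one = timezone(timedelta(hours=1))
print(seconds_match_date_time(1000000000, datetime(2001, 9, 9, 2, 46, 40, tzinfo=plus_one)))
```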
MIME:
The following headers, as defined within [RFC 2045] and its
extensions, may be used within articles conforming to this standard.
MIME-Version:
Content-Type:
Content-Transfer-Encoding:
Content-ID:
Content-Description:
Content-Disposition:
Content-MD5:
[Bruce Lilly
What about Content-Language (used in one of the examples!)?
Content-Duration? Content-Location?]
I'll buy Content-Language (and I will give proper RFC numbers for them
all). I am open to opinions regarding the other two.
The Content-Type: "multipart/digest" is commended for any article
composed of multiple messages more conveniently viewed as separate
entities, thus enabling reading agents to move rapidly between them.
The "boundary" should be composed of 28 hyphens (US-ASCII 45) (which
makes each boundary delimiter 30 hyphens, or 32 for the final one) so
as to enable reading agents which currently support the digest usage
described in [RFC 1153] to continue to operate correctly.
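
The arithmetic behind those hyphen counts follows from the MIME multipart rules, where each delimiter is "--" plus the boundary and the final one has a further "--" appended. A small sketch (the variable names are illustrative only):

```python
BOUNDARY = "-" * 28                        # the boundary parameter itself
delimiter = "--" + BOUNDARY                # precedes each enclosed message
close_delimiter = "--" + BOUNDARY + "--"   # follows the last message

print(len(delimiter), len(close_delimiter))
```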
[Actually, this conflicts with some present digest usage (such as the
news.answers rules), but should still be the right way to go. There
remains the possibility that future MIME-compliant readers could enable
one to proceed directly to some particular message by clicking on it in
a table of contents, but that feature is not yet supported by the
current MIME standards.]
[Bruce Lilly
See Content-Location.]
Yes, this was considered (and discussed on the FAQ-Maintainers list). The
problem is not with using Content-Location (or Content-ID) in each item in
the Digest, but in referring to that URL in the "Index" section of the
Digest, which is not in general written in HTML (indeed we strongly
discourage HTML within Netnews). Yes, I know that many newsreaders can
"guess" what is a URL when they see one, but no standard mechanism has
been defined for indicating them in text documents as yet, and I couldn't
find any system (even one that recognized those URLs) that would pick up
the Content-Location in another part. So I think that is future work.
But the mechanism suggested using multipart/digest with 30 hyphens seems
to work fine on many systems, as described.
Definition of some new Content-Types:
This standard defines (or redefines) several new Content-Types, which
require to be registered with IANA as provided for in [RFC 2048].
For "application/news-groupinfo" see 7.2.1.2, for
"application/news-checkgroups" see 7.2.4.1, and for
"application/news-transmission" see the following section.
[Bruce Lilly
Use "media type[s]" rather than "Content-Type[s]" to avoid confusion
with the header Content-Type.]
Bruce wants me to change that terminology in lots of places. I will grant
you that RFC 2045 uses the term "media type" to refer to the entities such
as "text/plain" or "application/news-transmission", but I remain
unconvinced. Surely, when we speak of the "Subject" on an article, we mean
the content of its Subject-header. When we speak of the "Date" of an
article, we mean the content of its Date-header. So what is wrong with
referring to whatever is written in its Content-Type-header as its
"Content-Type"?
So I would like to hear further opinions on that one.
Section 7
Newgroup:
The "newgroup" control message requests that the specified group be
created or changed. If the request is honoured, or if the group
already exists on the serving agent, and if the newgroup-flag
"moderated" is present, then the group MUST be marked as moderated,
and vice versa. "Moderated" is the only such flag defined by this
standard; other flags MAY be defined for use in cooperating subnets,
but newgroup messages containing them MUST NOT be acted on outside of
those subnets.
NOTE: Specifically, some alternative flags such as "y" and "m",
which are sent and recognised by some current software, are NOT
part of this standard. Moreover, some existing implementations
treat any flag other than "moderated" as indicating an
unmoderated newsgroup. Both of these usages are contrary to this
standard.
[Bruce Lilly
The penultimate sentence of the NOTE is rather curious. At most one
newgroup-flag is permitted. The group is moderated if and only if the
flag is "moderated". So if there is a flag other than "moderated", the
newsgroup is necessarily unmoderated. How can that be "contrary to this
standard"?
If treating a flag other than "moderated" as indicating an unmoderated
group is contrary to this document, the implication is that the presence
of ANY flag should be considered an indication of a moderated group.]
No, I think the MUST NOT in the normative text in the first paragraph
covers the situation. But to be quite sure, I have added to the NOTE
"and control messages with such non-standard flags should be ignored".
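
Put as a decision procedure for an agent outside any cooperating subnet, the intended reading might be sketched like this. The function name newgroup_action and the returned strings are hypothetical, and the exact-case comparison of the flag is an assumption.

```python
from typing import Optional


def newgroup_action(flag: Optional[str]) -> str:
    """Decide how to treat a newgroup message from its (at most one) flag."""
    if flag is None:
        return "unmoderated"   # no flag at all
    if flag == "moderated":
        return "moderated"     # the only flag defined by the standard
    return "ignore"            # "y", "m", etc.: do not act on the message


for flag in (None, "moderated", "y"):
    print(flag, "->", newgroup_action(flag))
```

The point of the third branch is that an unknown flag leads to the message being ignored, not to the group being treated as unmoderated.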
Section 8.
Duties of an Injecting Agent
[Bruce Lilly
Injecting agents are also responsible for ensuring that articles comply
with other standards, e.g. RFC 2298 (and the draft of its successor,
draft-vaudreuil-mdnbis-02.txt) strongly recommends that news articles not
include a Disposition-Notification-To header.]
Hmmm! I think if we start down that route we shall never end. There are
lots of headers defined in lots of other standards that may appear in
Netnews, where they may or may not have sensible usages. We have given a
blanket warning in Section 4 that all such headers MAY appear, but should
be ignored unless the agent thinks it knows what to do with them. Actually,
they are usually more useful to human readers who can sometimes garner
much information about how an article entered the system by perusing some
of those obscure headers.
Now the Disposition-Notification-To header is used in email to say "please
send me an automatic acknowledgement when this message arrives at its
recipient". Clearly, it is a nonsense in Netnews, and the authors of RFC
2298 have kindly made it a SHOULD NOT, which is fine.
But that is not going to stop it happening and, occasionally, in an
article gatewayed into news and then back into email it might even make
sense. Bruce suggests that injectors and gateways should be instructed to
remove it when they see it. I think I disagree, again on the grounds that
if we do it for one we have to do it for all.
Moreover, if a message is both posted and mailed (and a Posted-And-Mailed
header is used) then the two versions MUST be identical, according to our
current draft. I would not want to go to the trouble of declaring
Disposition-Notification-To to be a variant header.
But if anyone thinks further special (or general) action is needed, then
please speak up.
-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133 Web: http://www.cs.man.ac.uk/~chl
Email: chl@clw.cs.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5