Re: USEAGE split for section_5

From: Bruce Lilly (blilly@erols.com)
Date: Mon May 19 2003 - 10:29:47 CDT


Claus Färber wrote:
> Shmuel (Seymour J.) Metz <Shmuel+gen@patriot.net> schrieb/wrote:
>
>>In <3EC501F7.5070907@Sonietta.blilly.com>, on 05/16/2003
>> at 11:21 AM, Bruce Lilly <blilly@erols.com> said:
>>
>>>How about changing the obscure "graphemes" to "octets" or
>>>"characters".
>>
>>They are not the same thing.

Under the current (RFC 1036) rules, and under the current consensus of
this WG (most recently following discussion of Russ' table of issues),
and under the current focus of this WG (internationalization of newsgroup
names deferred, on-the-wire format compatible with RFCs [2]822), they
are in fact the same thing.

> The correct word, of course, is "column". Not all graphemes have a width
> of one column but the column width is what matters.

If "column" is meant to relate to display issues, that's not necessarily
correct, as display might use a proportional-spacing font, in which case
"column" has no meaning.

Are the restrictions (on newsgroup name component and overall newsgroup
name length) in fact related to display issues, or to something else
(header field line length, look-ahead for processing, server limitations,
NNTP protocol limits, etc.)? There's no mention of any rationale in the
Lindsey draft, nor is there any explanation of where the magic numbers
30 and 71 came from. Neither the numbers nor the restrictions come from RFC
1036.

My ISP lists a newsgroup named
alt.religion.kibology.the-not-funny-version-where-lee-can-xpost-not-funny-stupid-threads
which has a length of 88 characters (= octets) including the dots, which is
longer than the length 71 restriction in the Lindsey draft. That newsgroup
name also contains the longest component, length 66 octets (= characters),
which is more than double the limit of 30 in the Lindsey draft.
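
The counts above are easy to check mechanically. A minimal Python sketch,
assuming the name is pure US-ASCII (so characters and octets coincide) and
taking the limits 30 and 71 as quoted from the Lindsey draft earlier in this
message; the helper name `measure` is made up for illustration:

```python
# Limits as quoted from the Lindsey draft in this message (assumed values).
DRAFT_MAX_COMPONENT = 30
DRAFT_MAX_NAME = 71


def measure(name):
    """Return (total octets, longest component octets) for a newsgroup name."""
    octets = name.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    components = octets.split(b".")
    return len(octets), max(len(c) for c in components)


name = ("alt.religion.kibology."
        "the-not-funny-version-where-lee-can-xpost-not-funny-stupid-threads")
total, longest = measure(name)

print(total, longest)                     # 88 66
print(total > DRAFT_MAX_NAME)             # True
print(longest > 2 * DRAFT_MAX_COMPONENT)  # True
```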

I propose that we first establish what the functional limits are, since clearly
a component of length 66 and a newsgroup name of length 88 work fine.
Obviously, unless there's some provision for splitting and reassembling long names
(e.g. by insertion/removal of FWS), the longest name cannot be longer than 997
octets (RFC 2822 field line length minus 1 (for SP or HTAB to indicate
continuation)). NNTP probably imposes a lower limit due to its limit on
command line length. IMAP folder naming conventions in conjunction with a
255-character path length limit impose an upper bound of somewhat less than
249 characters (depending on where the implementation places folders). Some
UAs (e.g. those based on rn) will similarly impose a limit based on OS path
limits and KILL file paths. A practical limit is probably ca. 200 characters
for the total newsgroup name length, with no limit on component length.
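
To make the arithmetic explicit, a rough sketch in Python. The 998-octet figure
is the RFC 2822 line length limit excluding CRLF; the 249 and 200 figures are
simply the estimates given above, not measured values, and the function name
`fits_practical_limit` is hypothetical:

```python
RFC2822_MAX_LINE = 998   # octets per line, excluding CRLF (RFC 2822)
CONTINUATION_WSP = 1     # SP or HTAB marking a continuation line

# Hard ceiling when a name cannot be split across lines.
unfolded_ceiling = RFC2822_MAX_LINE - CONTINUATION_WSP

IMAP_PATH_ESTIMATE = 249  # estimate from this message; the real bound is lower
PRACTICAL_LIMIT = 200     # proposed working figure, an assumption


def fits_practical_limit(name):
    """True if the ASCII name is within the proposed total-length limit."""
    return len(name.encode("ascii")) <= PRACTICAL_LIMIT


print(unfolded_ceiling)                                         # 997
print(PRACTICAL_LIMIT < IMAP_PATH_ESTIMATE < unfolded_ceiling)  # True
```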

Does anybody know of a functional limit below 200 characters for the newsgroup name,
or any limit on component length (other than 14 or 8 characters on some older
operating systems (where the term "operating" is used loosely...))?

Once we establish what the real limits are, we should document the rationale
for the limits along with the actual values. Incidentally, the definition of
_POSIX_PATH_MAX (the minimum allowable value of PATH_MAX, viz. 255) in ISO
9945-1 (1990) is given in terms of bytes, so if that's what imposes the
bound, then "octets" is the appropriate unit.



