Re: Extended newsgroup tags; another approach


From: greg andruk (gja@meowing.net)
Date: Sat Oct 05 2002 - 01:19:08 CDT


[A bit of preface: this might all make a bit more sense if one considers
the possibility that the way newsgroups are named is fundamentally
broken, not just in the Usefor draft but in the real world too. Yes,
I'll explain below.]

Jean-Marc Desperrier wrote:
> greg andruk wrote:
>
>> [...] if only as a reminder that there are ways to do things that
>> don't involve breaking backward compatibility with anything.
>
>
> Are you joking ?

No, of course not.

> This draft is full of problems and of interoperability nonsense.

The only nonsense is the notion that interoperability has anything to do
with it. The whole point of the mechanism is, obviously, that no changes
at all are made to the way newsgroup headers are constructed.

The Usefor draft is chock full of interoperability nonsense, but the
draft says so which makes it all okay, so interoperability is of course
a non-issue.

>> - 100% backward compatible, existing clients and servers can
>> simply ignore it.

> The other side of it is that the support of the functionality must be
> added from scratch,

This is a feature. All the proposed in-band encodings break something,
either software that needs to interoperate or user interfaces. A
layered approach avoids all that, _and_ it addresses the fundamentally
broken nature of the Newsgroups header in a way that none of the proposed
changes really can (more on that later).

Yes, of _course_ clients need to be updated to make good use of _any_
change to the way newsgroups are named. And, of course, the claims made
by some on this list that there are pre-USEFOR clients that will
automatically "just work" are false.

Heck, even your own claim that Windows is a pure Unicode system is
demonstrably false. The libraries allowing Unicode integration didn't
really come into being until 1999, and then only really on the NT
systems, and then only with software that has been written in such a way
that it can make use of it. Forrest's code page adventures much more
accurately depict what happens in real life. In my own little corner of
the world, where I actually try to make use of Unicode on Windows where
possible, I distressingly often see accented Roman characters displayed
in what appear to be Greek and Cyrillic.

> and this, as I will show, in order to in fact
> duplicate an already existing functionality.

OK...

> And on the client side, the way it's implemented causes major problems.

There are actually some very nice ways to implement these kinds of
names, as seen in mail, and to a more primitive extent in news clients.

>> - Charset religious issues are left up to hierarchy maintainers
>> and server operators.
> Well, the true name of that is :
> - NO interoperability. Hierarchy maintainers and server operators are
> left free to choose whatever mechanism they want and *nothing* is done to
> transmit the choice. Also no client can be sure he has implemented
> enough encodings to be sure he will be able to communicate with the server.

Q- and B- encoding do, of course, contain charset information. And, of
course, anything that supports MIME has to worry about these issues
already. And, of course, the client is always free to show the display
names in any charset it deems appropriate, or to not bother displaying
them if it seems like too much trouble. And, of course, these names are
not transmitted back over the wire by the client, so local mangling to
obtain a representation that works on the local system is fine.
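
A minimal sketch of that point, in Python: an RFC 2047 encoded-word
carries its own charset label, so a client can decode a display name
without any out-of-band agreement. The sample header value and the
function name decode_display_name are made up for illustration; they are
not taken from the draft under discussion.

```python
from email.header import decode_header


def decode_display_name(raw: str) -> str:
    """Decode an RFC 2047 Q- or B-encoded display name to a str.

    Each encoded-word names its own charset, so nothing has to be
    negotiated separately with the server.
    """
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            try:
                parts.append(chunk.decode(charset or "ascii", errors="replace"))
            except LookupError:
                # Charset unknown to this client: local mangling is fine,
                # because the name is never sent back over the wire.
                parts.append(chunk.decode("ascii", errors="replace"))
        else:
            parts.append(chunk)
    return "".join(parts)


# Hypothetical display name as a server might list it.
print(decode_display_name("=?ISO-8859-1?Q?Caf=E9_fran=E7ais?="))  # Café français
```

A client that cannot render the result is free to substitute something
else or show nothing, since only the plain group name ever travels in
the Newsgroups header.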

And, of course, the use of Unicode also solves none of those problems on
the minimalist systems always dragged in as an excuse to just send 8. A
VT101 isn't going to magically become able to display Hebrew and
Chinese, or even a semi-readable approximation.

>> - It's been working just fine on corporate intranets for years.
> And because of nonexistent interoperability support, it's the only one
> place where it has any chance of working.

It works just fine with the public Usenet as well. I use them, and
associates and clients have used them as well. Nothing broke, and
indeed the rest of the net would never have known that such names were
in use.

Actually, AOL, the world's largest ISP, has used that kind of naming for
years and years as well. Have we seen a flood of messages with those
names being distributed throughout the network? Of course not, their
user interface doesn't work that way.

> It has NO support for the user to type the prettyname name of a group.

Of course not, because NNTP is not a user interface. It's a network
protocol.

Have you ever used an MUA with an address book? Have you ever used a
mailer that can find the best matches in the directory from only a few
characters or a nickname typed into the To: field? These things are
*common* in the real world. (Notice that distinction between names and
addresses. It's important.)
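
The same directory-style lookup could be applied to group display names.
A rough Python sketch, assuming a hypothetical local table that maps
display names to wire-format group names (the table contents and the
function names are invented for illustration):

```python
# Hypothetical local table: display name -> wire-format group name.
DISPLAY_NAMES = {
    "Cuisine française": "fr.rec.cuisine",
    "Cinéma": "fr.rec.cinema.discussion",
    "Cooking": "rec.food.cooking",
}


def match_groups(typed: str, table: dict[str, str]) -> list[tuple[str, str]]:
    """Return (display name, group) pairs whose name starts with what was typed."""
    needle = typed.casefold()
    return sorted(
        (name, group)
        for name, group in table.items()
        if name.casefold().startswith(needle)
    )


def newsgroups_header(groups: list[str]) -> str:
    """Build the header from wire names only; display names never leave the client."""
    return "Newsgroups: " + ",".join(groups)


matches = match_groups("cui", DISPLAY_NAMES)
print(newsgroups_header([group for _name, group in matches]))
```

What the user types stays in the user interface; what goes on the wire
is the unchanged group name.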

> The client has NO way to know what encoding, what way the prettyname
> must be encoded before comparing to the list the server has given.

Of course it has that information, that's what RFC 1522 (now 2047)
referred to in that draft handles.

(Note that it _is_ only a draft, and of course the wording can be
tightened up to, for example, note that Q/B-encoding really has to be
used for non-ASCII characters. A draft is, of course, a *draft*.)

> The user can not use the prettyname of a group that is not carried by
> his server,

A user also can't read any of the articles in a group that is not
carried locally, or know if the not-carried group specification is
valid, even under the traditional system. Oh well.

> the user can not use a client that will not systematically
> retrieve the full list of groups of the server before use.

In an offline client, yes, information needs to be downloaded before it
can be used. Online clients, of course, can use the single-group form
of LIST commands if they wish.

How does this differ from the need of a client to fetch and periodically
refresh the contents of active in order to tell the user that a group is
moderated? Or to display the old ASCII descriptions? Or to know what
the valid newsgroups are? Or to know what the range of existing article
numbers is?
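
For comparison, here is a small Python sketch of what a client already
extracts from each line of the active list, assuming the conventional
"group high low status" layout. The sample line is invented, and the
names ActiveEntry and parse_active_line are illustrative only.

```python
from typing import NamedTuple


class ActiveEntry(NamedTuple):
    group: str
    high: int     # highest article number
    low: int      # lowest article number
    status: str   # e.g. "y" (posting allowed), "n" (no posting), "m" (moderated)


def parse_active_line(line: str) -> ActiveEntry:
    """Split one whitespace-separated active-list line into its four fields."""
    group, high, low, status = line.split()
    return ActiveEntry(group, int(high), int(low), status)


entry = parse_active_line("comp.lang.c.moderated 0000012345 0000012000 m")
print(entry.group, "is moderated" if entry.status == "m" else "is not moderated")
```

A display name is just one more per-group attribute that a client would
fetch and cache in the same way.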

> Then there is NO description of how the prettyname will propagate and
> how to make sure everyone uses the same prettyname.

Right. As I already stated in my prior message, that has been waiting
for years and years on the Usefor WG to provide an extensible control
message mechanism.

> Then as there is no hierarchy, there is NO support for a way to ensure
> that there will not be prettyname collision and who will be able to
> assign what kind of prettyname.

There is also NO way for you to control what names I assign in my
address book, or in directories not under your control. Names are not
addresses.

> Basically when you get to the end, you realise that this prettyname
> thing is a way to reinvent exactly the same functionality as the group
> description,

A description is a description. A name is a name. They are different
things. That is why the publishers of dictionaries, encyclopedias and
directories have something to print.

> with just the _suggestion_ to describe the group in one
> word (but if a server carries 30 000 groups, then the prettyname will
> have to become the same thing as a group description because no short
> name will be explicit enough and avoid collision ) and exactly the same
> i18n interoperability problem the group description has today (I regret
> that this problem is not taken care by USEFOR, but it's not directly a
> USEFOR issue).

The newsgroups file, and the LIST NEWSGROUPS NNTP command, were never
subject to any attempts at formal specification; even RFC 2980 doesn't
help there. Yes, that can be fixed, but again, descriptions and names
are not the same, no more than names and addresses are the same.

> There was no need to invent this protocol, the administrators just
> needed to fill a short group description for all their private groups,
> and to use clients that would display that instead of the group name.

Okay, time to explain what I'm getting at with names and addresses.

The foo.bar.baz thing that appears in the Newsgroups: header is ancient
legacy stuff, directly lifted from A News. The A article format was
massively deficient, using a fixed, minimalist set of headers in an
unfortunate if well-intentioned attempt to save a few bytes on old pdp11
timesharing systems using very slow modems.

Those old headers were rather cryptic, owing to their miserliness. The
path was pushed into double service as a routing history and indication
of the message sender (and might, assuming bidirectional feeds, even
serve as a mailing address for the sender, but maybe not). The author's
name, of course, did not appear, so sender identities could be something
of a mystery. The adoption of mail-like headers within a couple years
helped with that particular failure, of course.

A similar unfortunate shortcut was taken with what became the
Newsgroups: header, but it is overloaded with not two but three
different meanings. First, it serves as an address of what amounts to a
public, shared mailbox. Second, it serves as a distribution flag (it
was later realized, albeit too late, that distributions should have been
separate entities). Third, it tries to serve as a name for the group
using the mailbox (and anyone who has read news.groups or its regional
counterparts will immediately recognize the contortions the overloading
forces, and why simply expanding the allowable range of glyphs won't
really fix those problems).

Mail has long had ways to avoid that kind of overloading. When my
friend sends me mail from work, the message is from
     Doe, Jane [XYZ] <zx345@xyz.example.com>

But when I send mail to her, it will appear as
     Jane <zx345@xyz.example.com>
or even just
     zx345@xyz.example.com

But the name I type into my mail program is just
     Trouble

...and I have the choice of seeing it displayed as the nickname or the
full version in my address book (or from the directory service, if at work).

What I type has nothing to do with the actual address, and has nothing
to do with how my mail is sent over the wire. In a world full of Jane
Does and Troubles, the message manages to find its way to the right
person, because names and addresses are not the same things.
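
A short Python sketch of that separation, assuming all three forms refer
to the same mailbox at a placeholder domain. The display name containing
a comma is written as a quoted string here, as the mail header syntax
requires; the variable names are illustrative.

```python
from email.utils import parseaddr

# Three ways the same mailbox might appear in a header (placeholder domain).
seen_on_the_wire = [
    '"Doe, Jane [XYZ]" <zx345@xyz.example.com>',
    "Jane <zx345@xyz.example.com>",
    "zx345@xyz.example.com",
]

# parseaddr splits each value into (display name, address).
parsed = [parseaddr(value) for value in seen_on_the_wire]
for name, address in parsed:
    print(repr(name), "->", address)

# The names differ; delivery depends only on the address part.
addresses = {address for _name, address in parsed}
```

Three different names, one address: only the part in angle brackets
matters for routing.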

Routing information and administrative control are, of course, contained
on each side of the "@" in the address, and if I started calling my
friend zx345, she'd look at me funny, because of course that's not her
name. It's the address of her mailbox.

News and mail and other media aren't just collections of protocols,
they're tools used by people. It's okay to provide hooks that enable
user interface things; the communication process is about a lot more
than transmitting bits between computers.

-- 
"I was looking at the man pages for Leafnode last night and at the
ones for Inn and my first thought was, 'What UBER GEEK wrote *this*
set up??! Sheesh!"' -- seen in alt.2600



