Re: #1047 Path field delimiters and syntax - status
On Thu August 18 2005 09:52, Charles Lindsey wrote:
>
> In <871x4sy5kd.fsf@xxxxxxxxxxxxxxxxxxxxx> Russ Allbery <rra@xxxxxxxxxxxx> writes:
>
> >Bruce Lilly <blilly@xxxxxxxxx> writes:
>
> >> No, we have not; there are no "sever interoperability" issues which are
> >> any different than are introduced by MISMATCH, POSTED, "!!", etc. --
> >> those are changes, and there will be some interoperability issues.
> >> Introducing ':' as other than a delimiter (its long-standing use) will
> >> introduce severe interoperability issues. There are no performance
> >> issues with introducing comments; that is a red herring.
>
> >There are no performance issues with comments provided that servers are
> >allowed to treat them as Path entries and the ()s as delimiters. There
> >are (mild) performance issues if servers are expected to ignore them.
> >There are (mostly mild but more annoying) performance issues if the full
> >RFC 2822 CFWS syntax is expected to be supported.
>
> I don't think it is as simple as that. If someone puts "(demon)" in the
> middle of the Path, then it is quite possible that demon customers will
> never get to see the article.
Iff it's not treated as a comment. Ditto for !POSTED! vs posted customers,
!MISMATCH! vs mismatch customers, and !dead:beef::cafe! vs dead, beef, and
cafe customers.
> So I think servers would have to ignore
> them, which means matching pairs of "(...)" inside them.
Yes; trivial.
> Bearing in mind that every relaying agent is supposed to locate the Path
> header within each article,
Invariant.
> scan it for valid entries (i.e. things between
> delimiters/WSP/folds,
Invariant.
> but not the <tail-entry>)
Invariant (and also rather pointless, as that entry serves no purpose).
> and check each entry found
> against the name(s) of the peer it is considering sending it to. Repeat
> for every peer.
Note that there are any number of ways to do so which do not require
parsing the Path field more than once, or other than left-to-right, at
any given server. It necessarily means comparing M non-comment, non-
diagnostic, non-keyword, non-bogus path entries to N peer names (although
clearly the tests can be short-circuited once a match for a specific
peer is found). For an article with M valid path entries and at a site
with N peers, none of which are listed in the path field, that is a
minimum complexity of O(N*M).
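
As a rough illustration of the point above, here is a minimal Python sketch that parses the Path field once, left to right, and then compares the resulting entries against each peer name, short-circuiting on the first match. The function names, the keyword set, and the choice of '!' and whitespace as the only delimiters are assumptions made for the example, not text from any draft; comments are not handled here.

```python
import re

# Assumed diagnostic keywords, for illustration only.
DIAGNOSTIC_KEYWORDS = {"POSTED", "MISMATCH"}

def path_entries(path_value):
    """Split a Path field body once; drop the tail entry and keywords."""
    # Assumption: '!' and whitespace (including folds) are the delimiters.
    parts = [p for p in re.split(r"[!\s]+", path_value.strip()) if p]
    # The last entry is the tail entry and is never compared with peers.
    return [p for p in parts[:-1] if p not in DIAGNOSTIC_KEYWORDS]

def peers_to_feed(path_value, peer_names):
    """Return the peers whose names do not appear in the Path."""
    entries = path_entries(path_value)  # parsed exactly once
    # any() stops at the first matching entry for a given peer.
    # If no peer is listed, this performs N*M comparisons.
    return [peer for peer in peer_names
            if not any(entry == peer for entry in entries)]
```

For example, peers_to_feed("a.example!POSTED!b.example!not-for-mail", ["b.example", "c.example"]) leaves only "c.example", since "b.example" already appears in the path.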
> any extra work put into
> that loop is going to impact performance
1. skipping from '(' to the matching unquoted ')' is outside of the
loop; it is one-time field parsing and none of the skipped content
is checked against any peer names. Nor does it affect the number
of non-comment entries to be checked. O(N*M) still applies.
2. skipping over comments is trivial (compared to e.g. parsing 2231
parameters or comparing keywords to a (possibly long) list of peer
names)
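
To make point 2 concrete, here is a minimal Python sketch of comment skipping as a single left-to-right pass done before any peer comparison. It assumes RFC 2822-style comments: parentheses may nest, and a backslash inside a comment quotes the next character. The function name is hypothetical.

```python
def strip_comments(path_value):
    """Remove parenthesised comments from a Path field body in one pass."""
    out = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(path_value):
        ch = path_value[i]
        if depth and ch == "\\" and i + 1 < len(path_value):
            i += 2  # quoted-pair inside a comment: skip both characters
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
        elif depth == 0:
            out.append(ch)  # outside any comment: keep the character
        i += 1
    return "".join(out)
```

The skipped text is never compared with any peer name, so the number of entries left to check is unchanged.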
> All of which is why we removed <comment>s from the Path quite some time ago.
"we" didn't remove them. It was a one-person, non-consensus, unilateral
action based on flawed reasoning.