From: Bruce Lilly (blilly@erols.com)
Date: Mon Mar 22 2004 - 00:44:51 CST
Bruce Lilly wrote:
> Charles Lindsey wrote:
>
>>In <40584453.4090206@erols.com> Bruce Lilly <blilly@erols.com> writes:
>
>
>>I was more concerned to know whether that information was essential for
>>threading, which is the main application for the References header.
>
>
> Not essential (and threading itself is not essential; it is a
> UA feature), but if a UA threads, some order of presentation
> is provided. In the case of a followup to multiple articles,
> where does one place the followup? In order to answer that,
> one needs to know exactly *which* articles it is a followup
> to, and References provides no mechanism to identify the
> immediate predecessors (vs. any other ancestors, with the
> exception of the root -- if there is a single root). The
> alternative is to pick an arbitrary predecessor as *the*
> predecessor, or to use some auxiliary information, such as
> a timestamp, for ordering. Either way, whether one picks one
> of the immediate predecessors or applies auxiliary ordering
> within subthreads (as opposed to an overall ordering
> within the entire thread), it is again necessary to be
> able to identify the immediate predecessors. So while
> identification of immediate predecessors cannot be said to
> be essential for purposes of protocol or network operations,
> in practice it is quite important for threading.
For more insight into what In-Reply-To adds, consider
http://users.erols.com/blilly/mparse/usefor/ref+irt.png
and
http://users.erols.com/blilly/mparse/usefor/ref-only.png
They are graphs of the predecessors of a message (indicated
by the letter 'z') in a thread where several messages are
followups to multiple messages. The graphs are constructed
from a) the combination of References and In-Reply-To
information from all articles, and b) information from
References only.
With both References and In-Reply-To, the thread can be
constructed accurately; with only References (and with
the assumption that the last ID in the References field
points to an immediate predecessor) 75% of the messages
do not show up as ancestors of message 'z' even though
they are listed in its References field. [Other
assumptions could be made to link them, but those
assumptions would lead to incorrect graphs.]
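
As a rough illustration of the difference, here is a minimal
Python sketch. It is not the code used to produce the graphs
above. The four-message thread, the message IDs, and the
function names (build_edges, ancestors) are all hypothetical,
and header parsing is reduced to splitting on whitespace. It
builds parent-to-child edges either from In-Reply-To (when
present) or from the last ID in References, then collects the
ancestors of 'z' reachable through those edges.

```python
def build_edges(messages, use_in_reply_to):
    """Return a set of (parent, child) edges for the thread graph.

    messages maps a message ID to a dict of header fields.
    With use_in_reply_to=True, every ID in In-Reply-To is taken
    as an immediate predecessor. Otherwise (or if the field is
    absent) the last ID in References is assumed to be the one
    immediate predecessor.
    """
    edges = set()
    for msg_id, headers in messages.items():
        in_reply_to = headers.get("In-Reply-To", "").split()
        references = headers.get("References", "").split()
        if use_in_reply_to and in_reply_to:
            parents = in_reply_to
        else:
            parents = references[-1:]  # empty list for a root
        for parent in parents:
            edges.add((parent, msg_id))
    return edges


def ancestors(edges, target):
    """Return every node from which target is reachable."""
    parents_of = {}
    for parent, child in edges:
        parents_of.setdefault(child, set()).add(parent)
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents_of.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical thread: b and c follow up a; z follows up both b and c.
messages = {
    "<a>": {},
    "<b>": {"References": "<a>", "In-Reply-To": "<a>"},
    "<c>": {"References": "<a>", "In-Reply-To": "<a>"},
    "<z>": {"References": "<a> <b> <c>", "In-Reply-To": "<b> <c>"},
}

with_both = ancestors(build_edges(messages, True), "<z>")
refs_only = ancestors(build_edges(messages, False), "<z>")
lost = with_both - refs_only  # ancestors that vanish without In-Reply-To
```

In this toy thread, 'b' appears in the References field of
'z' but is not linked to it when only the last-ID assumption
is available, which is the same kind of loss the two graphs
show on a larger scale.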
That of course presumes that all messages are available so
that each message's References and/or In-Reply-To fields
can be examined and used to build the dependency graph.
In practice, some messages are likely to be unavailable,
and that will lead to missing edges in the directed graph.
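
A small sketch of that effect, again with a hypothetical
four-message thread and a made-up helper name (edges_from),
and considering only In-Reply-To for brevity: dropping one
message from the available set removes the edges that only
that message's own fields could have supplied.

```python
def edges_from(messages):
    """Collect (parent, child) edges from the In-Reply-To fields
    of the messages that are actually available."""
    edges = set()
    for msg_id, headers in messages.items():
        for parent in headers.get("In-Reply-To", "").split():
            edges.add((parent, msg_id))
    return edges


# Hypothetical thread: b and c follow up a; z follows up both b and c.
full = {
    "<a>": {},
    "<b>": {"In-Reply-To": "<a>"},
    "<c>": {"In-Reply-To": "<a>"},
    "<z>": {"In-Reply-To": "<b> <c>"},
}

# Suppose message c never arrived or has expired.
partial = {k: v for k, v in full.items() if k != "<c>"}

missing = edges_from(full) - edges_from(partial)
```

Here the edge from 'c' to 'z' survives, because 'z' itself
names 'c', but the edge from 'a' to 'c' is lost along with
the fields of 'c'.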
It is possible to provide sufficient information in a
simple text-based format that would enable the entire graph
to be constructed from information carried in one message
(indeed, the graphs were constructed from a text-based
description), but not without a new header field design
(to supplement or replace References and/or In-Reply-To).