From: Bill Davidsen (davidsen@prodigy.com)
Date: Mon Mar 06 2000 - 15:15:11 CST
chl@clw.cs.man.ac.uk (Charles Lindsey) suggested:
> In <yl8zzzydqd.fsf@windlord.stanford.edu> Russ Allbery <rra@stanford.edu> writes:
>
> >INN's dbz code doesn't look much, if anything, like C News's at this
> >point.
>
> If dbz is basically a device for mapping Message-IDs into an offset within
> the history file, then it shouldn't be hard to fix it to do the "right
> thing" for duplicates. Note that it _doesn't_ have to retain any
> capability to retrieve the old history entry. Nobody cares about that
> anymore (except the Expires program).
You miss the point: you can't have duplicate entries in a history file
because they're not allowed. The way a server avoids accepting multiple
copies of an article from multiple feeds is to reject any article whose
message-id (or hash thereof) is already in history. In this case "the
right thing" is to prevent duplicates, so there's no logic to implement
what is forbidden.

That said, it may be possible to rewrite the token to point to another
article elsewhere in the spool. That would have less chance of messing
things up than any change to a basic assumption of the whole dbz
implementation.

Oh, and makedbz cares very much; it's not just expire.
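
To make the distinction concrete, here is a minimal sketch of the two
operations discussed above: rejecting a second article with the same
message-id, and rewriting the token of an existing entry in place. This is
not INN's dbz code. The class name History, its methods, the token strings,
and the choice of SHA-256 are all assumptions made up for illustration.

```python
import hashlib


class History:
    """Toy stand-in for a history database: message-id hash -> storage token."""

    def __init__(self):
        self._tokens = {}  # hash of message-id -> opaque storage token

    @staticmethod
    def _key(message_id: str) -> bytes:
        # Hash the message-id so every key has a fixed size.
        # SHA-256 is an arbitrary choice for this sketch.
        return hashlib.sha256(message_id.encode("utf-8")).digest()

    def offer(self, message_id: str, token: str) -> bool:
        """Accept an article only if its message-id has not been seen."""
        key = self._key(message_id)
        if key in self._tokens:
            return False  # duplicate from another feed: reject it
        self._tokens[key] = token
        return True

    def retarget(self, message_id: str, new_token: str) -> bool:
        """Point an existing entry at a different stored article, in place."""
        key = self._key(message_id)
        if key not in self._tokens:
            return False  # no entry to rewrite
        self._tokens[key] = new_token
        return True

    def lookup(self, message_id: str):
        """Return the token for a message-id, or None if it is unknown."""
        return self._tokens.get(self._key(message_id))


history = History()
first = history.offer("<a@example.com>", "token-1")    # new message-id
second = history.offer("<a@example.com>", "token-2")   # same id, other feed
moved = history.retarget("<a@example.com>", "token-3")  # in-place rewrite
```

In this sketch the history still holds exactly one entry per message-id
after the rewrite; only the token it maps to has changed.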
> >What INN's dbz code *will* allow in INN 2.4 or thereabouts (if I can ever
> >find time to work on anything other than fighting fires at work :/) is
> >in-place updates of the storage API token associated with a given hash.
> >So you can potentially solve the problem in INN that way. But it doesn't
> >support that currently either.
>
> OK, I think you need to explain to me exactly what this API token is. Do I
> gather that the history file essentially maps (a hash of) the Message-ID
> to an API token (plus an expiry date or whatever)? So presumably this API
> token is mapped to actual storage in some way, whether via article numbers
> or not. I think I need to understand that bit of the process.
It's a pseudo-physical address: an offset plus a cycbuff id.
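
As a rough illustration only, a token of that kind could be packed into a
few bytes and unpacked again as below. The field widths (a 1-byte cycbuff
id and a 4-byte offset), the byte order, and the function names are
assumptions for this sketch, not the actual INN token layout.

```python
import struct

# Hypothetical layout: 1-byte cycbuff id, then a 4-byte big-endian offset.
_TOKEN_FORMAT = ">BI"


def pack_token(cycbuff_id: int, offset: int) -> bytes:
    """Encode (cycbuff id, offset) as a fixed-size opaque token."""
    return struct.pack(_TOKEN_FORMAT, cycbuff_id, offset)


def unpack_token(token: bytes):
    """Decode a token back into (cycbuff id, offset)."""
    cycbuff_id, offset = struct.unpack(_TOKEN_FORMAT, token)
    return cycbuff_id, offset


token = pack_token(3, 4096)
where = unpack_token(token)
```

The history file only needs to store the opaque bytes; whatever manages the
storage is the only part that has to interpret them.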
> >Opening the article and parsing its headers on each Replaces is just
> >really ugly. Doable for a single news system, but really ugly. In a
> >clustered environment, I think you'd have to do this separately on each
> >slave since the article has likely already expired from the master, making
> >this even uglier.
>
> I don't see this as a problem. Replaces are going to be relatively rare
> beasts, so efficiency is not important. I agree that it seems that the
> slaves have to do the work in those systems, because Replaces are a form
> of Cancel.
I agree with both of you: ugly, but not a significant cost. Then again, I
have been dubious all along that Replaces was a good idea; it just isn't
an issue I feel is worth fighting (unlike some other recent issues, which
I feel are very important).
--
-bill davidsen (davidsen@prodigy.com)
"The secret to procrastination is to put things off until the last possible
moment - but no longer" -me