
Re: OT: Re: Less is more



> > If that field occurred in an actual message generated by an actual
> > MUA I'd claim it was a programmer error even if it was syntactically
> > valid:)  Anyone who put that in a shipping product ought to be
> > sacked.
> 
> And what about the protocol designer?

with 24 years' hindsight, the baroqueness of the message format is hard
to justify.
 
> > > If a program can't parse Frode's field, it can't parse the RFC822
> > > date field syntax as specified.
> 
> > True, but it could quite possibly parse 99.999% of the dates that
> > occur in actual use, including dates that aren't valid - at which
> > point the inability to parse dates is insignificant in comparison to
> > failures that are due to other problems.  If you're concerned about
> > reliability you care about how well it works in actual use, not
> > whether it handles really obscure corner cases.  (security concerns
> > are an exception - since crackers specifically look for corner
> > cases.)
> 
> This sounds like a decent pragmatic approach to protocols which are so
> complex or ambiguously defined that writing code that handles all
> possible permutations is infeasible. However, when building a new
> protocol it makes sense to create it such that the number of possible
> permutations is small enough that it's possible to fully implement and
> test them all.

well, a version of the current date field that didn't allow comments 
in the middle of a date, and that required numeric timezone offsets,
would come fairly close.
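
as a rough illustration of how small such a grammar could be, here is a
minimal sketch in Python.  the exact field layout (optional day name,
4-digit year, optional seconds) and the names STRICT_DATE and
parse_strict_date are assumptions made for this example, not anything
taken from RFC822 or a proposed replacement.

```python
import re
from datetime import datetime, timedelta, timezone

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

# Hypothetical strict grammar: single spaces only, no comments anywhere,
# and the zone must be a numeric offset such as -0500.
STRICT_DATE = re.compile(
    r"(?:(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), )?"               # optional day name
    r"(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4}) "      # day month year
    r"(\d{2}):(\d{2})(?::(\d{2}))? "                       # hh:mm[:ss]
    r"([+-])(\d{2})(\d{2})"                                # numeric offset
)

def parse_strict_date(text):
    """Parse the assumed strict format; raise ValueError on anything else."""
    m = STRICT_DATE.fullmatch(text)
    if m is None:
        raise ValueError("not a strict date: %r" % text)
    day, mon, year, hh, mm, ss, sign, off_h, off_m = m.groups()
    offset = timedelta(hours=int(off_h), minutes=int(off_m))
    if sign == "-":
        offset = -offset
    # datetime() itself rejects out-of-range fields (e.g. 30 Feb, hour 25).
    # The day name, if present, is not checked against the date.
    return datetime(int(year), MONTHS[mon], int(day),
                    int(hh), int(mm), int(ss or 0),
                    tzinfo=timezone(offset))
```

with a grammar this small, every branch (with or without day name, with
or without seconds, either offset sign) can be covered by a handful of
test cases.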

but I'm not really arguing (much) about how dates should be represented
- I'm arguing about how we should justify design decisions.

> An interesting question is whether it's better to implement a binary 
> date format as a timestamp or a concatenation of 
> year/month/day/hour/second fields. The advantage of timestamps is it 
> makes date comparisons easy and there is no ambiguity as every
> possible value is a valid date/time.

comparisons of dates are easy either way.  also it's certainly
possible to specify a "separate fields" format in such a way that
the potential for ambiguities is minimized.  you might still have
the possibility of feb 29 in a year that isn't a leap year or :60 
for a second that isn't a leap second, but those cases will be rare.
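
a minimal sketch of the "separate fields" idea in Python, assuming the
fields are ordered most-significant first and all values are in the same
zone (say UTC).  the function name is_valid_fields is made up for this
example.

```python
import calendar

def is_valid_fields(year, month, day, hour, minute, second):
    """Check a (year, month, day, hour, minute, second) record.

    Assumes year >= 1 and that all fields are in one fixed zone.
    """
    if not 1 <= month <= 12:
        return False
    # monthrange() returns (weekday of day 1, days in month), so this
    # catches feb 29 in a year that isn't a leap year.
    if not 1 <= day <= calendar.monthrange(year, month)[1]:
        return False
    # second == 60 is accepted: telling a real leap second from a bogus
    # one would need a leap second table, which this sketch doesn't have.
    return 0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 60

# With most-significant-first ordering, comparing two dates is just a
# lexicographic comparison of the field tuples.
earlier = (2004, 2, 29, 23, 59, 59)
later = (2004, 3, 1, 0, 0, 0)
is_before = earlier < later
```

the same ordering property holds for a fixed-width binary encoding of
the fields, where a plain byte comparison would do.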
 
> Add to that that a binary format with explicit length values is much 
> faster to parse (especially on disk where you can seek over large 
> uninteresting parts)

binary with explicit length is not -inherently- faster to parse,
though it's certainly possible to design a format that is fast to
parse.  somehow I don't think that's an overriding consideration
though.  I'm more concerned with robustness - including
minimizing implementation errors.
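
to make the robustness point concrete, here is a minimal sketch in
Python of reading length-prefixed records.  the 4-byte big-endian length
prefix and the name read_records are assumptions for this example, not
any particular format.  note that the checks on the declared length are
the part that's easy to get wrong or leave out.

```python
import struct

def read_records(buf):
    """Split buf into records, each preceded by a 4-byte big-endian length."""
    records = []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        # Never trust the declared length: it must fit in what is left.
        if length > len(buf) - pos:
            raise ValueError("declared length exceeds remaining data")
        records.append(buf[pos:pos + length])
        pos += length
    return records
```

skipping an uninteresting record is cheap here (just advance pos), but a
reader that omits either check will misbehave on malformed input.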

--
Regime change 2004 - better late than never.