
Re: Poll: consensus to change the encoded-character extension



On Tue, 2007-04-10 at 19:12 +0000, Aaron Stone wrote:
> Something like this:
> 
>    encoded-character    = "${" encoded-char-scheme ":" encoded-char-seq "}"
>    encoded-char-scheme  = hex / unicode
>    encoded-char-seq     = *(LWSP WSP 1*HEXDIG) LWSP

If we allow ${hex:100} in the grammar, we need to say something in the
text about the valid range.  I would prefer to stick to separate
productions for encoded-arb-octets and encoded-unicode-char to keep the
text simple and to minimise the change to the text.
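
To illustrate the range concern, here is a minimal Python sketch.  It
assumes that ${hex:...} values are single octets and that
${unicode:...} values are Unicode code points other than surrogates.
The function name and the exact limits are assumptions made for
illustration, not wording from the draft.

```python
def in_range(scheme: str, hex_digits: str) -> bool:
    """Return True if the hex value is acceptable for the given scheme.

    Assumed limits (not taken from the draft text):
      hex     -> one octet, 0x00..0xFF
      unicode -> a code point up to 0x10FFFF, excluding surrogates
    """
    value = int(hex_digits, 16)
    if scheme == "hex":
        return value <= 0xFF
    if scheme == "unicode":
        return value <= 0x10FFFF and not (0xD800 <= value <= 0xDFFF)
    raise ValueError(f"unknown scheme: {scheme}")
```

Under these assumptions, in_range("hex", "100") is False while
in_range("unicode", "100") is True, which is why a shared production
would need extra text describing a per-scheme range.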

> Note that LWSP is optional by definition,

ouch, good catch!

> so we have to include SP or WSP
> to force some kind of separator between 1*HEXDIG's. Note that this is not
> valid according to the syntax above,
> 
> ${unicode:
> 123
> ABC
> }
> 
> ..because 123 and ABC do not have WSP between them. Use WSP / CR / LF? Is
> there some variant of LWSP that mandates at least one character of
> something be present?

LWSP requires WSP after CRLF, too, so it's simply not what we want.  We
need to add another basic terminal, perhaps

   blank = WSP / CRLF

I suggest we stick to the poll question from Alexey, but with "1*blank"
replacing LWSP in his suggested new text.
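
A rough way to see the difference is to translate both forms into
regular expressions.  The Python sketch below is only an illustration:
the "1*blank" sequence shape in SEQ_BLANK is an assumed reading of the
replacement, and the names BLANK, LWSP, SEQ_BLANK, SEQ_LWSP and
matches are made up for this example.

```python
import re

# blank = WSP / CRLF, where WSP is SP or HTAB (ABNF core rules)
BLANK = r"(?:[ \t]|\r\n)"
# LWSP = *(WSP / CRLF WSP), the core rule being replaced
LWSP = r"(?:[ \t]|\r\n[ \t])*"
HEX = r"[0-9A-Fa-f]+"

# Assumed shape of the sequence with 1*blank as the separator:
#   *blank 1*HEXDIG *(1*blank 1*HEXDIG) *blank
SEQ_BLANK = re.compile(rf"{BLANK}*{HEX}(?:{BLANK}+{HEX})*{BLANK}*")

# The quoted proposal: *(LWSP WSP 1*HEXDIG) LWSP
SEQ_LWSP = re.compile(rf"(?:{LWSP}[ \t]{HEX})*{LWSP}")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text matches the sequence pattern."""
    return pattern.fullmatch(text) is not None


# Body of the multi-line ${unicode: ... } example, with CRLF line ends.
multiline = "\r\n123\r\nABC\r\n"
print(matches(SEQ_BLANK, multiline))  # accepted with 1*blank
print(matches(SEQ_LWSP, multiline))   # rejected by the LWSP-based form
```

With this reading, the multi-line example is accepted by the
blank-based pattern and rejected by the LWSP-based one, because a bare
CRLF between 123 and ABC counts as a separator only in the former.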

> I think there are three options for values that are out of range:
> 
>  1. Throw an error and reject the script.
>  2. Ignore the offending value.
>  3. Insert some placeholder like ' ' or '?'.

I don't think we need to revisit this question.

> I concur that comments should not be allowed.

-- 
Kjetil T.