Re: IMAP test clients and servers
Patrik Fältström wrote:
>
> At 09.27 -0800 1999-01-28, John Myers wrote:
> > My suggestion:
> >
> > * If your search string fits in us-ascii, don't use CHARSET.
> > * First try UTF-8
> > * If you get a NO completion response, try your local charset.
>
> So, a client supports a charset if:
>
> - The customer types in some glyphs
> - A search command is issued with the CHARSET option in that charset,
> having the glyphs in the correct code positions

Characters, not glyphs (to use Unicode terminology).
A client must also be able to display text that is encoded in that
charset.
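
A rough sketch of the fallback order suggested earlier in this thread,
in Python. The function name `search_attempts` and the `local_charset`
default are hypothetical, chosen only for illustration. A client would
send the attempts in order and move on to the next one only after a
tagged NO response.

```python
def search_attempts(text, local_charset="ISO-8859-1"):
    """Return (charset, octets) pairs to try, in order.

    A charset of None means: send the SEARCH without a CHARSET argument.
    """
    try:
        # Pure US-ASCII search strings need no CHARSET at all.
        return [(None, text.encode("ascii"))]
    except UnicodeEncodeError:
        pass

    attempts = []
    # UTF-8 first, then the local charset (skipped if it is also UTF-8).
    for charset in dict.fromkeys(["UTF-8", local_charset.upper()]):
        try:
            attempts.append((charset, text.encode(charset)))
        except UnicodeEncodeError:
            # The string cannot be represented in this charset; skip it.
            continue
    return attempts
```

For example, `search_attempts("hello")` yields a single attempt with no
CHARSET, while `search_attempts("Fältström")` yields a UTF-8 attempt
followed by an ISO-8859-1 attempt.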

> A server supports a charset if:
>
> - A client issues a SEARCH command, with the CHARSET option set
> to the specific character set
> - Matching is done in email that is in the same
> charset as specified, or in UTF-8
> - Matching is done in _other_ email ONLY if the characters are
> mapped to the correct positions in the charset the email is
> composed in
>
> Is that what you suggest John?

I'm not sure I understand what you're saying above. A correctly written
server should perform searches based on characters, not encodings of
characters. It should behave as if all content transfer encodings and
character encoding schemes are removed, converting all text in supported
charsets into a single canonical form before performing the search.
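
To make "behave as if the encodings are removed" concrete, here is a
minimal sketch using Python's standard `email` module. The helper names
`canonical` and `searchable_text` are hypothetical, and the canonical
form chosen here (NFC-normalized, case-folded Unicode) is an assumption;
any single form works as long as every supported charset is converted
into it. Case folding is included because IMAP substring matching is
case-insensitive.

```python
import base64
import email
import unicodedata

def canonical(text):
    # Assumed canonical form: NFC-normalized, case-folded Unicode.
    return unicodedata.normalize("NFC", text).casefold()

def searchable_text(raw_message):
    """Return the canonical text of every text part that can be decoded."""
    msg = email.message_from_bytes(raw_message)
    chunks = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        octets = part.get_payload(decode=True)
        charset = part.get_content_charset() or "us-ascii"
        try:
            chunks.append(octets.decode(charset))
        except (LookupError, UnicodeDecodeError):
            # Unsupported or mislabeled charset: this part cannot match.
            continue
    return canonical("\n".join(chunks))

# The same characters, stored with different charsets and transfer encodings.
latin1_qp = (b"Content-Type: text/plain; charset=iso-8859-1\r\n"
             b"Content-Transfer-Encoding: quoted-printable\r\n\r\n"
             b"Patrik F=E4ltstr=F6m\r\n")
utf8_b64 = (b"Content-Type: text/plain; charset=utf-8\r\n"
            b"Content-Transfer-Encoding: base64\r\n\r\n"
            + base64.b64encode("Patrik Fältström".encode("utf-8")) + b"\r\n")

needle = canonical("Fältström")
found_in_latin1 = needle in searchable_text(latin1_qp)
found_in_utf8 = needle in searchable_text(utf8_b64)
```

Both messages match the same search key, because the comparison is made
on characters after the transfer encoding and charset have been undone.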

> I.e. the question is whether a server supports ISO-8859-1 IF it is the
> case that searches can be done in that character set, and one gets no
> matches in ISO-8859-2 encoded email?

If a server does not support ISO-8859-2, then it is OK for one to get no
matches in ISO-8859-2-encoded email. If a server supports both
ISO-8859-1 and ISO-8859-2, then two character sequences should match, or
fail to match, regardless of which charset those sequences are encoded in.

In other words, supporting a charset means that the server behaves as if
it can convert text encoded in that charset, both in the SEARCH command
and in the MIME-labeled messages being searched, into canonical form
before searching.
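
As an illustration of that definition, the sketch below applies the same
conversion to the search key and to the message text. Everything named
here is hypothetical: the `SUPPORTED` set, the `search` function, and the
choice of NFC plus case folding as the canonical form. Returning `None`
stands in for a tagged NO response to an unsupported CHARSET.

```python
import unicodedata

# Hypothetical set of charsets this sketch claims to support.
SUPPORTED = {"us-ascii", "utf-8", "iso-8859-1", "iso-8859-2"}

def canonical(text):
    # Assumed canonical form: NFC-normalized, case-folded Unicode.
    return unicodedata.normalize("NFC", text).casefold()

def search(charset, key_octets, messages):
    """Return sequence numbers of matching messages, or None for NO.

    `messages` is a list of (charset, octets) pairs whose content
    transfer encoding has already been removed.
    """
    if charset.lower() not in SUPPORTED:
        return None
    needle = canonical(key_octets.decode(charset))
    hits = []
    for seq, (msg_charset, body) in enumerate(messages, start=1):
        if msg_charset.lower() not in SUPPORTED:
            # Text in an unsupported charset simply never matches.
            continue
        if needle in canonical(body.decode(msg_charset)):
            hits.append(seq)
    return hits

# Octet 0xE8 is "è" in ISO-8859-1 but "č" in ISO-8859-2.
mailbox = [
    ("iso-8859-1", b"caf\xe8"),
    ("iso-8859-2", b"\xe8aj"),
    ("utf-8", "čaj".encode("utf-8")),
]
```

With this mailbox, `search("ISO-8859-2", b"\xe8", mailbox)` returns
`[2, 3]` and `search("ISO-8859-1", b"\xe8", mailbox)` returns `[1]`: the
same octet in the SEARCH command matches different messages depending on
the declared charset, because the comparison is on characters.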