Re: Newsgroup names and Unicode, attempt 3

From: Clive D.W. Feather (clive@demon.net)
Date: Mon Jul 02 2001 - 09:00:13 CDT


Florian Weimer said:
> Unicode says that 'grapheme' really means 'what a user thinks of as a
> character'. In contrast, 'glyph' is the rendered version of a
> character or character combination, I think.

If I understand the terms correctly, and I might well not:

    In English, o-dieresis is two characters normally rendered as one glyph
    In French, c-cedilla is one character and one glyph
    In Arabic, "ibn" might be three characters that are rendered as three
      glyphs in some contexts, but as one in others
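
The character-count half of this can be checked mechanically. Below is a
minimal Python sketch using the standard unicodedata module; the variable and
function names are only illustrative, and it assumes o-dieresis may be stored
either as a base letter plus a combining mark or as one precomposed code
point. It says nothing about glyphs, which depend on the font and renderer.

```python
import unicodedata

# o-dieresis stored as a base letter plus a combining mark (two code points)
decomposed = "o\u0308"
# the same letter stored as a single precomposed code point
precomposed = "\u00f6"
# c-cedilla as a single precomposed code point
c_cedilla = "\u00e7"


def code_points(text):
    """Return a readable list of the code points that make up text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(code_points(decomposed))
print(code_points(precomposed))

# NFC normalization composes the two-code-point form into the single one.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# NFD normalization splits c-cedilla into a base letter and a combining mark.
print(code_points(unicodedata.normalize("NFD", c_cedilla)))
```

Whether a given string counts as "one character" or "two" therefore depends
on which form it happens to be stored in, before rendering is even considered.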

I'm not going to swear that there isn't a case where one character is
rendered as two glyphs (much as the C operator ?: comes in two separate parts).

Also, "glyph" takes font changes into account, so several different glyphs
can represent the same character or grapheme.

I suspect we may want to steer well clear of this confusion.

-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:  +44 20 8371 1138
Internet Expert     | Home:  <clive@davros.org>  | Fax:  +44 20 8371 1037
Demon Internet      | WWW: http://www.davros.org | DFax: +44 20 8371 4037
Thus plc            |                            | Mobile: +44 7973 377646

