- Subject: Re: Jed and utf-8... a pre-pre-pre-plea :-)
- From: "John E. Davis" <davis>
- Date: Tue, 17 Jun 2003 12:56:51 -0400
Romano Giannetti <romano@xxxxxxxxxxxxxxxx> wrote:
>fly). The actual contents of the buffer are irrelevant --- probably the only
>possible thing to do is to keep it consistent with the locale, otherwise
>you have to invent an input strategy from zero --- a la yudit.
Suppose that the file on disk uses an ISO-LATIN character set. In
particular, consider the upside-down exclamation point character (¡)
given by character code 161 (0xA1). When a file containing this character is
read under a UTF-8 locale, it will be displayed as the four-character
sequence <A1>, since the lone byte 0xA1 is an illegal UTF-8
sequence. A command such as
    set_charset ("iso-latin-1");
will convert an illegal sequence such as 161 to its two-byte UTF-8
equivalent 0xC2 0xA1. That is, specifying a character set could
actually cause the bytes in a buffer to change.
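
To make the byte-level change concrete, here is a small Python sketch
(not Jed or S-Lang code, and not a description of how set_charset would
be implemented). It only assumes Python's standard codecs, and shows
that byte 161 alone is rejected as UTF-8, while reinterpreting it as
ISO-8859-1 and re-encoding yields the two bytes 0xC2 0xA1:

```python
# Illustration only: the conversion described above, at the byte level.
raw = bytes([161])  # 0xA1, the inverted exclamation mark in ISO-8859-1

# A lone 0xA1 is a continuation byte, so it is not valid UTF-8 by itself.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Treat the byte as ISO-8859-1, then re-encode the character as UTF-8.
converted = raw.decode("iso-8859-1").encode("utf-8")

print(valid_utf8)       # False
print(converted.hex())  # c2a1
```

The buffer grows from one byte to two, which is the sense in which
declaring the character set changes the buffer contents.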
--John