jed-users mailing list

Re: Non-ascii chars in UTF-8 mode not bindable

Subject: Re: Non-ascii chars in UTF-8 mode not bindable
From: "John E. Davis" <davis@xxxxxxxxxxxxx>
Date: Fri, 1 Jun 2007 13:45:54 -0400

G. Milde <milde@xxxxxxxxxxxxxxxxxxxxx> wrote:
>I did some more testing to track down the problem:

There is really no mystery to what is happening.  The keymap routines
use byte semantics.  In general, a keymap is a series of 256-element
lookup tables.  If the tables were naively expanded from 256 entries
to one entry per Unicode code point (about 1.1 million), they would be
unacceptably large.  The work-around that I posted (and later
corrected) avoids this problem.
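
To put rough numbers on that, here is a small Python sketch comparing
the two table sizes.  The 8-byte entry size is an assumption (one
pointer per slot on a 64-bit build), not a figure taken from the jed
sources.

```python
# Back-of-the-envelope size of one keymap lookup table.
# ENTRY_BYTES is an assumption: one pointer per slot on a 64-bit build.
ENTRY_BYTES = 8

BYTE_SLOTS = 256                 # one slot per possible byte value
UNICODE_SLOTS = 0x10FFFF + 1     # one slot per Unicode code point

def table_bytes(slots, entry_bytes=ENTRY_BYTES):
    """Memory needed for a single flat lookup table."""
    return slots * entry_bytes

print(table_bytes(BYTE_SLOTS))      # 2048
print(table_bytes(UNICODE_SLOTS))   # 8912896
```

Every key prefix needs a table of its own, so under this assumption
the larger cost would be paid once per prefix rather than once per
editor session.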

In the case you considered, the four keys correspond to the following
byte strings:

   Key: ´ : "\c2\b4"
   Key: ¬ : "\c2\ac"
   Key: ° : "\c2\b0"
   Key: ¼ : "\c2\bc"
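
These byte strings are simply the UTF-8 encodings of the four
characters.  A short Python check (Python is used here only for
illustration; it is not part of jed) shows that all four share the
lead byte 0xc2:

```python
# UTF-8 encodings of the four keys from the report: ´ ¬ ° ¼
keys = ["\u00b4", "\u00ac", "\u00b0", "\u00bc"]

# Map each character to its encoded byte string.
encoded = {ch: ch.encode("utf-8") for ch in keys}

for ch, raw in encoded.items():
    # Each character encodes to two bytes, the first of which is 0xc2.
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    print(f"U+{ord(ch):04X} -> {hex_bytes}")
```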

The default bindings of the bytes 0xc2, 0xb4, 0xac, 0xb0, and 0xbc are
to "self_insert_cmd".  So when the editor sees a byte sequence such as
0xc2 0xb4, it simply inserts both bytes into the buffer.  When in
UTF-8 mode, this combination is interpreted as the single Unicode
character '´' (\u{00b4}).

When you bound the byte sequence "\c2\b4" to something, that
effectively created a keymap for sequences beginning with 0xc2.  As a
result, 0xc2 was no longer bound to "self_insert_cmd", and a sequence
such as "\c2\bc" would not do anything, since "\bc" is unbound in the
0xc2-based keymap.
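
The effect can be reproduced with a toy model in Python.  This is only
a sketch of the idea of nested 256-element tables, not jed's actual
implementation; the helper names (new_keymap, bind, dispatch) and the
command name "my_command" are hypothetical.

```python
SELF_INSERT = "self_insert_cmd"

def new_keymap(default=None):
    """One 256-element lookup table; each slot holds a command or None."""
    return [default] * 256

def bind(keymap, seq, command):
    """Bind the byte sequence `seq`, creating sub-tables for its prefixes."""
    table = keymap
    for byte in seq[:-1]:
        if not isinstance(table[byte], list):
            # The prefix byte loses whatever binding it had before.
            table[byte] = new_keymap()
        table = table[byte]
    table[seq[-1]] = command

def dispatch(keymap, data):
    """Walk `data` byte by byte and return the commands that fire."""
    fired = []
    table = keymap
    for byte in data:
        entry = table[byte]
        if isinstance(entry, list):
            table = entry        # prefix byte: wait for the next byte
            continue
        fired.append(entry)      # a command name, or None if unbound
        table = keymap           # start over at the top-level table
    return fired

root = new_keymap(SELF_INSERT)           # default: every byte self-inserts
before = dispatch(root, b"\xc2\xbc")     # both bytes are inserted

bind(root, b"\xc2\xb4", "my_command")    # hypothetical user binding
bound = dispatch(root, b"\xc2\xb4")      # the new binding fires
after = dispatch(root, b"\xc2\xbc")      # 0xbc is unbound under 0xc2

print(before, bound, after)
```

In this model, binding a single character whose encoding starts with
0xc2 changes the behaviour of every other character with the same lead
byte, which matches the symptom described above.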

At some point, I will integrate the work-around that I posted into the
setkey functions.  I posted the S-Lang version to give others an
immediate solution to the problem, although I suspect only a few will
ever run into this issue.

I hope this clarifies things a bit.

Thanks,
--John
