> -----Original Message-----
> From: owner-jed-users-l@xxxxxxxxxxxxxx
> [mailto:owner-jed-users-l@xxxxxxxxxxxxxx] On Behalf Of John E. Davis
> Sent: Tuesday, April 19, 2005 22:15
> Subject: Re: WJed key bindings and utf-8
>
> SANGOI DINO <SANGOID@xxxxxxxxxxxxxxxxx> wrote:
> >PS: BTW, how difficult will be supporting files in UTF-16? Those are
> >common on Windoze. I haven't tried it myself, I'm just wondering it
> >this can be done...
>
> This is a general question of supporting any character set.
> The answer would involve converting the character set to
> UTF-8 when the file is read and then converting it back when
> written out. This would have to be done as transparently as
> possible and would most likely involve hooks of some sort.
>

Yes, I see it... I have tried two different ways. The first was using
compress.sl (adding a command with iconv), but this doesn't work well,
mostly because we can't match files by extension, because we need a
different command for every possible conversion, and because in any
case jed_popen on Windows is implemented read-only (so I can read
files, but cannot write them).

The other way was creating a completely different sl file (though
highly inspired by compress.sl). But this way I would have to rewrite
the encoding/decoding routines in slang, or patch slang to export
SLutf8_encode() and SLutf8_decode(), and then convert UTF-32 to UTF-16
(far easier). But this misses other charsets.
[ Anyway, a patch that exports utf8_encode and utf8_decode is attached,
just in case... ]

After a lot of work, I found both ways difficult to implement. So now
I'm trying another thing: I wrote an iconv module. I tried it under
slsh, and for some small tests it seems to work.
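
For reference, the UTF-32 to UTF-16 step mentioned above is small enough
to sketch in C. This is only an illustration, not code from the attached
patch or module: the function name utf32_to_utf16 and its signature are
made up here, and it handles a single code point at a time.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point (a UTF-32 value) as UTF-16 code units.
 * Returns the number of units written to out (1 or 2), or 0 if cp is
 * not a valid scalar value (a surrogate, or above U+10FFFF).
 * Hypothetical helper: the name and signature are not part of S-Lang
 * or jed.
 */
size_t utf32_to_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    if (cp < 0x10000) {
        /* Basic Multilingual Plane: one unit, same value. */
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Supplementary planes: split the 20-bit offset into a pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

The resulting 16-bit units still have to be written out in a chosen byte
order (little-endian for typical Windows files), which this sketch
leaves to the caller.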

To use it with jed, I need to enable import in jed for Windows (it
remains disabled because HAVE_DLOPEN is not defined). A simple way is
adding this to jedconf.h:

    #ifdef _MSC_VER
    # define HAVE_DLOPEN 1
    #endif

but this is a bit ugly, because it lies :) It would be far better to
have SLANG_HAS_DYNAMIC_LINKING in slang.h (now it's only in
slimport.c), so programs can use it to check at compile time whether
slang supports import().

Either way, having an iconv library, compiling the iconv-module, and
using something like utf16.sl (attached) should do it. And yes,
utf16.sl should really be charset.sl and support other charsets, and it
should also allow changing the output charset (this is really easy: it
should be stored in a buffer-local variable).

All this is lightly tested on Windows, and untested on Linux. Note also
that every iconv version for Windows that is around behaves in a
strange way: it seems not to set errno, and UTF-16 encoding is
big-endian by default (but the machine is little-endian). I worked
around both in a way that should be compatible with well-behaved
libraries. I also tried building libiconv from sources, but I have not
tried to debug it (I will try when I have some time).

Comments?

Later,
Dino
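
The two iconv workarounds described in this message (not depending on
errno, and not depending on the default UTF-16 byte order) could look
roughly like the following in C. This is a sketch under assumptions, not
the code from the attached iconv-module.c: the helper name
utf16le_to_utf8 is made up, and it relies only on the standard
iconv_open(), iconv() and iconv_close() calls, assuming the library
accepts the "UTF-16LE" encoding name.

```c
#include <stddef.h>
#include <iconv.h>

/*
 * Convert a UTF-16LE byte buffer to UTF-8 (hypothetical helper).
 * Returns the number of bytes written to out, or -1 on any failure,
 * including truncated input or an output buffer that is too small.
 */
long utf16le_to_utf8(const char *in, size_t inlen, char *out, size_t outcap)
{
    /* Name the byte order explicitly instead of plain "UTF-16". */
    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;   /* iconv() takes a non-const pointer */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    /* Judge success by the return value and leftover input, not errno. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1 || inleft != 0)
        return -1;
    return (long)(outcap - outleft);
}
```

Checking both the return value and the remaining input count means the
helper behaves the same whether or not the library sets errno.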

Attachment: slang-export-utf8_funcs.diff
Attachment: iconv.sl
Attachment: iconv-module.c
Attachment: utf16.sl