jed-users mailing list


RE: WJed key bindings and utf-8


> -----Original Message-----
> From: owner-jed-users-l@xxxxxxxxxxxxxx 
> [mailto:owner-jed-users-l@xxxxxxxxxxxxxx] On Behalf Of John E. Davis
> Sent: Tuesday, April 19, 2005 22:15
> Subject: Re: WJed key bindings and utf-8
> 
> SANGOI DINO <SANGOID@xxxxxxxxxxxxxxxxx> wrote:
> >PS: BTW, how difficult would it be to support files in UTF-16? Those are 
> >common on Windoze. I haven't tried it myself, I'm just wondering if 
> >this can be done...
> 
> This is a general question of supporting any character set.  
> The answer would involve converting the character set to 
> UTF-8 when the file is read and then converting it back when 
> written out.  This would have to be done as transparently as 
> possible and would most likely involve hooks of some sort.
> 

Yes, I see it...

I have tried two different ways. The first was using compress.sl (adding a
command with iconv), but this doesn't work well: files can't be matched by
extension, a different command is needed for every possible conversion, and
in any case jed_popen on Windows is implemented read-only (so I can read
files, but cannot write them).

The other way was to create a completely separate sl file (heavily inspired
by compress.sl). But then I would have to rewrite the encoding and decoding
routines in slang, or patch slang to export SLutf8_encode() and
SLutf8_decode(), and then convert UTF-32 to UTF-16 (far easier). But this
approach leaves out other charsets.
[ Anyway, a patch that exports utf8_encode and utf8_decode is attached,
just in case... ]
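
The "far easier" UTF-32 to UTF-16 step can be sketched in a few lines of C.
This is only an illustration of the standard surrogate-pair encoding, not
code from the attached patch; the function name utf32_to_utf16 is made up
for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point (UTF-32) as UTF-16 code units.
 * Writes 1 or 2 units to out and returns how many were written,
 * or 0 if cp is a surrogate or lies above U+10FFFF.
 * Hypothetical helper, not part of slang. */
static size_t utf32_to_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp > 0x10FFFF)
        return 0;                       /* outside the Unicode range */

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* BMP: one unit, value unchanged */
        return 1;
    }

    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

The resulting units still have to be written out in the wanted byte order
(little endian for typical Windows files).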

After a lot of work, I found both ways difficult to implement, so I'm now
trying something else: I wrote an iconv module. I tried it under slsh, and
it seems to work in some small tests.
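
For readers unfamiliar with iconv, the read-side conversion that such a
module would wrap looks roughly like this in plain C. It is a minimal
sketch using the standard iconv_open()/iconv()/iconv_close() calls, not the
contents of the attached iconv-module.c; the helper name utf16le_to_utf8 is
an assumption, and it expects the whole input in one buffer.

```c
#include <iconv.h>
#include <stddef.h>

/* Convert inlen bytes of UTF-16LE text to UTF-8.
 * Returns the number of bytes written to out, or (size_t)-1 on failure
 * (unsupported conversion, invalid input, or output buffer too small).
 * Hypothetical helper for illustration only. */
static size_t utf16le_to_utf8(const char *in, size_t inlen,
                              char *out, size_t outcap)
{
    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");   /* (to, from) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;     /* iconv() takes a non-const pointer */
    char *outp = out;
    size_t inleft = inlen, outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    /* Judge success by the return value and the unconsumed input count. */
    if (rc == (size_t)-1 || inleft != 0)
        return (size_t)-1;
    return outcap - outleft;
}
```

Writing a file back would use the same pattern with the two encoding names
swapped in iconv_open().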

To use it with jed, I need to enable import in jed for Windows (it remains
disabled because HAVE_DLOPEN is not defined). A simple way is to add this to
jedconf.h:

#ifdef _MSC_VER
# define HAVE_DLOPEN 1
#endif

but this is a bit ugly, because it lies :)

It would be far better to have SLANG_HAS_DYNAMIC_LINKING in slang.h (right
now it's only in slimport.c), so that programs can check at compile time
whether slang supports import().
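
If slang.h did export that macro, the check in a client program could look
something like the fragment below. This is a sketch under that assumption:
the macro is not visible to clients today, so the fragment defaults it to 0
to stay self-contained, and JED_HAS_IMPORT and import_status() are made-up
names.

```c
/* Assumption: slang.h would define SLANG_HAS_DYNAMIC_LINKING as 0 or 1.
 * Default it to 0 here so this fragment compiles without slang.h. */
#ifndef SLANG_HAS_DYNAMIC_LINKING
# define SLANG_HAS_DYNAMIC_LINKING 0
#endif

/* Hypothetical client-side switch derived from the library's setting,
 * instead of pretending that HAVE_DLOPEN is defined. */
#if SLANG_HAS_DYNAMIC_LINKING
# define JED_HAS_IMPORT 1
#else
# define JED_HAS_IMPORT 0
#endif

/* Report the compile-time decision, e.g. for a startup message. */
static const char *import_status(void)
{
    return JED_HAS_IMPORT ? "enabled" : "disabled";
}
```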

Either way, having an iconv library, compiling the iconv-module, and using
something like utf16.sl (attached) should do it.

And yes, utf16.sl should really be charset.sl and support other charsets,
and it should also allow changing the output charset (this is really easy:
it should be stored in a buffer-local variable).

All this is lightly tested on Windows, and untested on Linux.

Note also that every iconv version available for Windows behaves strangely:
it does not seem to set errno, and its UTF-16 encoding is big endian by
default (even though the machine is little endian). I worked around both in
a way that should be compatible with well-behaved libraries. I also tried
building libiconv from source, but I have not tried to debug it (I will when
I have some time).
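
One portable way to sidestep the default-endianness question is to never
ask iconv for plain "UTF-16", and instead pick "UTF-16LE" or "UTF-16BE"
explicitly from the byte order mark. The sketch below shows that idea; it
is not necessarily the workaround used in the attached files. The function
name is made up, and the little-endian fallback for files without a BOM is
an assumption that suits typical Windows files.

```c
#include <stddef.h>

/* Choose an explicit iconv encoding name from a UTF-16 byte order mark.
 * Sets *skip to the number of BOM bytes the caller should drop.
 * Without a BOM, falls back to little endian (an assumption).
 * Note: FF FE is also the start of a UTF-32LE BOM; that case is ignored.
 * Hypothetical helper for illustration only. */
static const char *utf16_name_from_bom(const unsigned char *buf,
                                       size_t len, size_t *skip)
{
    *skip = 0;
    if (len >= 2) {
        if (buf[0] == 0xFF && buf[1] == 0xFE) {
            *skip = 2;
            return "UTF-16LE";
        }
        if (buf[0] == 0xFE && buf[1] == 0xFF) {
            *skip = 2;
            return "UTF-16BE";
        }
    }
    return "UTF-16LE";
}
```

For the errno problem, testing the return value of iconv() against
(size_t)-1 and checking how much input is left detects failure even when
errno is unreliable, though it cannot say why the call failed.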

Comments?

Later,
								Dino

Attachment: slang-export-utf8_funcs.diff
Description: Binary data

Attachment: iconv.sl
Description: Binary data

Attachment: iconv-module.c
Description: Binary data

Attachment: utf16.sl
Description: Binary data

