[oclug] unicode? file
Adrian Irving-Beer
wisq-oclug at wisq.net
Tue Feb 22 10:43:00 EST 2005
On Tue, Feb 22, 2005 at 10:31:21AM -0500, Stephen M. Webb wrote:
> UTF-16 is an encoding scheme that would allow Unicode text to be
> represented in a sequence of 16-bit values
[...]
> The BE is 'big-endian.'
Ah! Then that explains why iconv and vim were disagreeing when I said
'utf16' alone. Vim assumes big-endian, iconv assumes little-endian.
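To make the byte-order ambiguity concrete, here is a minimal sketch using Python's built-in codecs. It illustrates the encoding itself, not the defaults of iconv or vim, and the sample string is an arbitrary assumption:

```python
import codecs

text = "Hi"  # arbitrary sample text, chosen only for illustration

# With an explicit byte order, no byte-order mark (BOM) is written.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")
print(be.hex())  # 00480069
print(le.hex())  # 48006900

# Reading big-endian bytes as little-endian yields different code points,
# which is the kind of disagreement two tools with different defaults show.
wrong = be.decode("utf-16-le")
print([hex(ord(c)) for c in wrong])  # ['0x4800', '0x6900']

# A leading BOM lets a plain "utf-16" decoder pick the right byte order.
with_bom = codecs.BOM_UTF16_BE + be
print(with_bom.decode("utf-16"))  # Hi
```

Naming the byte order explicitly (utf-16be or utf-16le) on both sides avoids relying on either tool's default.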
> UCS-2 is another encoding scheme that represents a subset of Unicode
> as a stream of 16-bit values and is used by Microsoft software. I
> believe that all UCS-2 strings are a proper subset of UTF-16, just
> as ASCII strings are a proper subset of UTF-8, but I could be wrong.
No, you're correct, according to vim:
    ucs-2       16 bit UCS-2 encoded Unicode (ISO/IEC 10646-1)
    ucs-2le     like ucs-2, little endian
    utf-16      ucs-2 extended with double-words for more characters
    utf-16le    like utf-16, little endian
And I was wrong about utf-16 being an alias for ucs-2... that's just
what vim autodetects some files as, presumably ones that use none of
the UTF-16 double-word extensions.
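The "double-words" in vim's description are UTF-16 surrogate pairs. A small sketch of the difference, again in Python rather than vim or iconv; the helper names `utf16_units` and `fits_in_ucs2` are hypothetical and exist only for this example:

```python
def utf16_units(s):
    """Return the 16-bit code units of s when encoded as big-endian UTF-16."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def fits_in_ucs2(s):
    """True if every character is at or below U+FFFF, so one 16-bit unit each."""
    return all(ord(c) <= 0xFFFF for c in s)

bmp = "\u00e9"         # U+00E9, inside the 16-bit range
astral = "\U0001D11E"  # U+1D11E, outside the 16-bit range

print([hex(u) for u in utf16_units(bmp)])     # ['0xe9']
print([hex(u) for u in utf16_units(astral)])  # ['0xd834', '0xdd1e']
print(fits_in_ucs2(bmp), fits_in_ucs2(astral))  # True False
```

Text made only of characters like the first one has the same bytes in both encodings, which would fit a file being detected as ucs-2.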
> Anyways, using iconv would be a good solution assuming the utf16be
> locales are installed on the poster's system.
I think conversion support may be separate from locales. iconv here
can convert between a huge number of encodings, yet I only have the
UTF-8, ISO8859-1, and EUC-JP locales installed.
The conversion modules are installed by glibc and live in
/usr/lib/gconv on my system.
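A rough way to see which conversion modules are present, independent of the installed locales. This is only a sketch: the helper `list_gconv_modules` is hypothetical, the directory differs between distributions, and it assumes the modules are shared objects with a .so suffix:

```python
from pathlib import Path

def list_gconv_modules(gconv_dir="/usr/lib/gconv"):
    """Return the sorted module names (file names without .so) in gconv_dir.

    Returns an empty list if the directory does not exist.
    """
    directory = Path(gconv_dir)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.so"))

modules = list_gconv_modules()
print(len(modules), "conversion modules found")
```

Some encodings may be built into the C library rather than shipped as modules, so this listing is a lower bound at best.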