vim-multibyte · Vim (Vi IMproved) text editor special language list
#2607 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Mon Apr 13, 2009 8:24 pm
Subject: Re: The 'keymap' and 'iminsert' saga (cont.)
 
On 13/04/09 21:50, Bram Moolenaar wrote:
>
> Tony Mechelynck wrote:
>
>> gvim 7.2.148 (Huge)
>>
>> When splitting a window using ":new" or ":new filename" from a window
>> for which a keymap is defined, the 'iminsert' status is cloned but the
>> 'keymap' isn't. This looks inconsistent to me. I'm not sure whether
>> these options ought to be cloned or not, but I feel pretty certain that
>> it ought to be both or neither - not just one without the other.
>>
>> Opinions?
>
> I assume you have set 'keymap' with ":setlocal".  Then the global value
> will be used for ":new".  The same happens for 'iminsert'.  Perhaps you
> have somehow a global value of 'iminsert'?  I can't reproduce the effect
> you describe except when using ":setlocal keymap=name".
>

Ah, thanks for the clarification. Yes, I set 'keymap' locally, since I
have a number of files loaded in split-windows, and only one of them
uses a non-Latin script. As for 'iminsert', I'm less sure, since here
are the mappings by means of which I toggle it:

	 :noremap <F8> :let &l:imi = !&l:imi<CR>
	 :noremap! <F8> <C-^>

(I use F8 because I'm not sure there's a Ctrl-^ on my AZERTY keyboard.)

If the Ctrl-^ key toggles the global value in Insert mode, then that's
the culprit. Maybe it, too, ought to act only locally. But for the
moment, I'll copy my map to a map!, with a Ctrl-O in front of it.


Best regards,
Tony.
--
It is something to be able to paint a particular picture, or to carve a
statue, and so to make a few objects beautiful; but it is far more
glorious to carve and paint the very atmosphere and medium through
which we look, which morally we can do.  To affect the quality of the
day, that is the highest of arts.
		 -- Henry David Thoreau, "Where I Live"

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_multibyte" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---

#2608 From: Kenneth Reid Beesley <krbeesley@...>
Date: Wed Apr 15, 2009 11:57 pm
Subject: DejaVu Sans Mono, MacVim, Combining Diacritics
 
Problem with rendering Combining Diacritics, using DejaVuSansMono in
MacVim/gvim


I use MacVim/gvim with either

	 1.  DejaVuSansMono.ttf, currently 2.29, or
	 2.  BrighamVuSansMono.ttf (my modification of DejaVuSansMono.ttf
	     2.22 that I augmented, using FontForge, with Deseret Alphabet
	     glyphs)

With input sequences involving combining diacritics, and having
nothing to do with the added Deseret Alphabet glyphs, I somehow have
better results with BrighamVuSansMono.ttf (based on DejaVuSansMono.ttf
2.22) than with DejaVuSansMono.ttf 2.29.

For example, if, using DejaVuSansMono.ttf 2.29, I enter (in
MacVim/gvim) the sequence

0 0061 LATIN SMALL LETTER A
1 0328 COMBINING OGONEK
2 0301 COMBINING ACUTE ACCENT
3 0020 SPACE
4 0061 LATIN SMALL LETTER A
5 0301 COMBINING ACUTE ACCENT
6 0328 COMBINING OGONEK
7 0020 SPACE
8 000a

the gvim rendering is garbled.  (I should see two instances of 'a'
with ogonek below and an acute accent above.)
The 'a's are not displayed, and there are some floating accents.
Here's a picture of the result in my gvim window:


[screenshot not preserved in this archive]

I dumped the buffer to file, and the sequence of characters is as
shown above.  So it's a rendering (or font) problem.
I downloaded the old DejaVuSansMono.ttf v. 2.22 and got the same
results.

***********************

If, however, I use my BrighamVuSansMono.ttf (based on
DejaVuSansMono.ttf 2.22), and enter the same sequence above,
the rendering is very different.  In both cases, I see an acceptably
rendered 'a' with an ogonek below and an acute accent above.
Voila:

[screenshot not preserved in this archive]

The only change was this.  In the first (garbled) example, I specified

set anti guifont=DejaVu\ Sans\ Mono.h14

in my .gvimrc file.  In the second (successful) example, I specified
instead

set anti guifont=BrighamVu\ Sans\ Mono.h14

As far as I can remember, BrighamVuSansMono.ttf was created by adding
glyphs to DejaVuSansMono.ttf version 2.22.  These glyphs are in the
supplementary area, so perhaps I increased the encoding (in FontForge)
of the resulting font to cover all of Unicode.  But I'm sure I did
nothing directly to the combining diacritics.

I know that the rendering is being done by MacVim/gvim, but can anyone
tell me why it might work better with my modified v. 2.22 than with
2.29 or the unmodified 2.22?

(My MacVim is 7.2.148, snapshot 45)

Thanks,

Ken


******************************
Kenneth R. Beesley, D.Phil.
P.O. Box 540475
North Salt Lake, UT
84054  USA

#2609 From: James Cloos <cloos@...>
Date: Thu Apr 16, 2009 2:01 am
Subject: Re: [DejaVu-fonts] DejaVu Sans Mono, MacVim, Combining Diacritics
 
I tested that combination in my terminal (which uses DejaVu Sans Mono,
rendered via the bytecode), which was handy.

My version of DejaVu as installed is svn revision 2351 of
2009-03-20T02:27:23.189619Z.

I tried your example of U+61 with U+328 and U+301 in either order.
AKA: » ą́ « and » ą́ «.

When rendering U+301 before U+328 the two accents are centered on the
base letter.  However, if U+328 precedes U+301, then U+328 is located
at the right stem; U+301 remains centered.  (The vertical placement is
correct in both instances.)

Emacs has the same rendering as urxvt.

AIUI, the UCS provides no guidance on this, but Unicode says that U+301
has ccc="230" and U+328 has ccc="202", which means that the canonical
order is U+61 U+328 U+301.
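
The combining classes and the resulting canonical order can be checked with Python's standard `unicodedata` module. This is only a minimal sketch, assuming Python 3; the variable names are made up for illustration and nothing here says anything about how a font or terminal positions the marks.

```python
import unicodedata

OGONEK = "\u0328"  # COMBINING OGONEK
ACUTE = "\u0301"   # COMBINING ACUTE ACCENT

# Canonical combining classes of the two marks.
print(unicodedata.combining(OGONEK))
print(unicodedata.combining(ACUTE))

ogonek_first = "a" + OGONEK + ACUTE
acute_first = "a" + ACUTE + OGONEK

# NFD sorts adjacent combining marks by combining class, so both
# input orders normalize to the same code point sequence.
nfd_a = unicodedata.normalize("NFD", ogonek_first)
nfd_b = unicodedata.normalize("NFD", acute_first)
print([f"U+{ord(ch):04X}" for ch in nfd_b])
```

Because the two marks have different non-zero combining classes, the two input orders are canonically equivalent, so a renderer could in principle treat them identically.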

-JimC
--
James Cloos <cloos@...>         OpenPGP: 1024D/ED7DAEA6


#2610 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Fri Apr 17, 2009 11:55 am
Subject: Re: DejaVu Sans Mono, MacVim, Combining Diacritics
 
On 16/04/09 01:57, Kenneth Reid Beesley wrote:
> Problem with rendering Combining Diacritics, using DejaVuSansMono in
> MacVim/gvim
>
>
> I use MacVim/gvim with either
>
>  1.  DejaVuSansMono.ttf , currently 2.29, or
>  2.  BrighamVuSansMono.ttf (my modification of DejaVuSansMono.ttf
> 2.22 that I augmented, using FontForge, with Deseret Alphabet glyphs)
>
> With input sequences involving combining diacritics, and having
> nothing to do with the added Deseret Alphabet glyphs, I somehow have
> better results with BrighamVuSansMono.ttf (based on DejaVuSansMono.ttf
> 2.22) than with DejaVuSansMono.ttf 2.29.
>
> For example, if, using DejaVuSansMono.ttf 2.29, I enter (in MacVim/
> gvim) the sequence
>
> 0 0061 LATIN SMALL LETTER A
> 1 0328 COMBINING OGONEK
> 2 0301 COMBINING ACUTE ACCENT
> 3 0020 SPACE
> 4 0061 LATIN SMALL LETTER A
> 5 0301 COMBINING ACUTE ACCENT
> 6 0328 COMBINING OGONEK
> 7 0020 SPACE
> 8 000a
>
> the gvim rendering is garbled.  (I should see two instances of 'a'
> with ogonek below and an acute accent above.)
> The 'a's are not displayed, and there are some floating accents.
> Here's a picture of the result in my gvim window:
[...]

You're spurring me to run some more tests on my version of gvim.

I'm on Linux with GTK2 gvim, and I don't have BrighamVu Sans Mono
installed, but I have DejaVu Sans Mono and I normally use Bitstream Vera
Sans Mono (another avatar of DejaVu, I think). I tried your examples
with a very large font size (20) to avoid any possible errors due to
incorrectly seeing what was displayed. Also, I added several spaces
before, between and after the two complex characters so as not to miss
badly located combiners. I'm not sure which versions of the fonts are
installed, but gvim is 7.2.148 and GTK2 is 2.14.4.

Bitstream Vera Sans Mono 20: for both examples the first combining char.
is correctly located but the second one is one character cell to the
right of where it ought to be; when moving the block cursor over it by
one cell at a time, I see the following: with the cursor on the a, the
ill-placed combiner blinks in opposite phase with it; if I move the
cursor right from there, that ill-placed combiner disappears, but Ctrl-L
or focus off-on makes it reappear. Moving the cursor left from the a
makes the ill-placed combiner stay visible.

DejaVu Sans Mono 20: the first example is displayed correctly (with
acute above and ogonek below). The second one has its ogonek correctly
placed, but the acute is now one cell left of where it ought to be, and
the visual weirdness described above is reversed: displayed in opposite
phase when the block cursor blinks atop the a (but this time barely
visible against the background during the blinking cursor's "on" phase),
disappears if I move the cursor left from there, reappears by Ctrl-L or
focus off-on, remains shown if I move the cursor right from there.

Let's try another font: Courier New 20. Here the second example is
displayed correctly, the first one has its acute one cell right of where
it ought to be, same weird interaction with the blinking block cursor as
the Bitstream Vera ogonek.

And another: Lucida Typewriter 20. Same results as with Bitstream Vera.

And another: FZFangSong 20 (a Chinese font). Same results again (yes,
even ogoneks exist in this Chinese font).


I wonder what makes one character display correctly (for me) in DejaVu,
the other one in Courier, and neither in the other fonts. I don't think
it is Vim (though I might be wrong), but is it something in the font, or
something in the GUI interface (GTK2 in my case)? I don't know.


Best regards,
Tony.
--
Any fool can paint a picture, but it takes a wise person to be able to
sell it.


#2611 From: Bram Moolenaar <Bram@...>
Date: Tue May 12, 2009 8:07 pm
Subject: Vim charity update
 
Hello Vim users,

Vim comes for free.  I do ask you to consider helping Vim's charity;
please read on.

In April I visited the Kibaale Childrens Centre.  My overall
impression is that things are very organized, many children are given
proper education and other basic needs.  We are gradually able to help
the community improve the living conditions in this poor district.  You
can read my report, with lots of pictures, here:
http://iccf-holland.org/news.html#April2009

The new clinic has become very popular.  Every day many patients come
here for medical help that would otherwise not be available to them.
The Kibaale clinic saves lives!

This popularity comes at a price.  We struggle to pay for all the
medicine, salaries of the medical staff, maintenance of equipment, etc.
Therefore I have set up this page for sponsoring a room in the clinic:
http://iccf-holland.org/sponsorclinic.html

We need your help!

Happy Vimming!

--
hundred-and-one symptoms of being an internet addict:
226. You sit down at the computer right after dinner and your spouse
      says "See you in the morning."

  /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
  \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///


#2613 From: björn <bjorn.winckler@...>
Date: Sat Jun 20, 2009 5:58 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
2009/6/20 Tony Mechelynck:
>
> On Jun 19, 11:22 pm, björn <bjorn.winck...@...> wrote:
> [...]
>> I'm afraid I know way too little about text rendering to fix this, so
>> until somebody else fixes it in core Vim the problem will remain.  I
>> would highly suggest that you take up the rendering problem on the
>> vim_dev mailing list (just send a text file with 한 as the content and
>> ask why it renders as three glyphs).
>>
>> Sorry,
>> Björn
>
> The OP's problem was about a file with 한 as (part of) the filename,
> not as the content.

Tony,

I appreciate that you reply to my post, but there really is no need in
stating the painfully obvious.  The problem appears when the filename
is displayed in the command line (as a result of opening the file) and
as such you run into the same problem if you have that character as
part of the contents of a file.

> I'm on Linux, so what I see may be different from what you see on
> MacVim; but in gvim I see 한 (when in the content of a file) as one
> glyph (U+D55C, corresponding to three bytes, hex ED 95 9C). Are you
> sure you have 'encoding' correctly set? (I use utf-8).

I tried it myself on Linux and had the same problem and realized that
the problem has to do with how you represent 한.  If done as you
suggest with U+D55C it works (both Linux and MacVim), but if
represented by U+1112, U+1161, U+11AB then Vim will render it as three
glyphs but here the Cocoa text system combines these into one glyph
and that is where the problem in MacVim appears.  (By the way: MacVim
defaults to use utf-8 for 'encoding'.)

So in a way the problem is related to having 한 in a filename since Mac
OS X apparently represents it as U+1112, U+1161, U+11AB instead of as
U+D55C.  Still, if one were to enter those three characters separately
in a buffer the same problem would arise.  As far as I see it this
only means that the Cocoa text system is not suitable for this purpose
which only means that we will have to migrate to the ATSUI or CoreText
renderers sometime in the future.

To conclude: it seems that this is a problem with the Cocoa text
system and not that something in Vim has to be "fixed" as I stated in
my previous post (unless Vim should do the same as Cocoa and
automatically render ᄒ,ᅡ,ᆫ as 한).
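
The two representations of 한 discussed here are the NFC and NFD forms of the same text. A small sketch with Python's standard `unicodedata` module shows the round trip; the helper name `codepoints` is hypothetical, and the sketch only demonstrates the normalization itself, not what Cocoa or Vim do internally.

```python
import unicodedata

precomposed = "\uD55C"        # 한 as a single syllable code point
jamo = "\u1112\u1161\u11AB"   # the same syllable as three conjoining jamo


def codepoints(text):
    """Format each character as U+XXXX (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


decomposed = unicodedata.normalize("NFD", precomposed)
recomposed = unicodedata.normalize("NFC", jamo)

print(codepoints(decomposed))
print(codepoints(recomposed))
```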

Björn


#2614 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Sun Jun 21, 2009 12:14 am
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
On 20/06/09 19:58, björn wrote:
>
> 2009/6/20 Tony Mechelynck:
>>
>> On Jun 19, 11:22 pm, björn<bjorn.winck...@...>  wrote:
>> [...]
>>> I'm afraid I know way too little about text rendering to fix this, so
>>> until somebody else fixes it in core Vim the problem will remain.  I
>>> would highly suggest that you take up the rendering problem on the
>>> vim_dev mailing list (just send a text file with 한 as the content and
>>> ask why it renders as three glyphs).
>>>
>>> Sorry,
>>> Björn
>>
>> The OP's problem was about a file with 한 as (part of) the filename,
>> not as the content.
>
> Tony,
>
> I appreciate that you reply to my post, but there really is no need in
> stating the painfully obvious.  The problem appears when the filename
> is displayed in the command line (as a result of opening the file) and
> as such you run into the same problem if you have that character as
> part of the contents of a file.
>
>> I'm on Linux, so what I see may be different from what you see on
>> MacVim; but in gvim I see 한 (when in the content of a file) as one
>> glyph (U+D55C, corresponding to three bytes, hex ED 95 9C). Are you
>> sure you have 'encoding' correctly set? (I use utf-8).
>
> I tried it myself on Linux and had the same problem and realized that
> the problem has to do with how you represent 한.  If done as you
> suggest with U+D55C it works (both Linux and MacVim), but if
> represented by U+1112, U+1161, U+11AB then Vim will render it as three
> glyphs but here the Cocoa text system combines these into one glyph
> and that is where the problem in MacVim appears.  (By the way: MacVim
> defaults to use utf-8 for 'encoding'.)

Ah, I see. I entered it in Vim by copy-paste from your previous post in
the vim_mac Google Group page in my browser.

Vim is obviously unaware of hangul jamo decomposition / recomposition
and IIUC will render each of them as one glyph. I'm not sure how to have
them be treated as "one spacing + (in this case) 2 composing characters"
though IIUC it would be "the right way" to do it.
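
For reference, recomposing conjoining jamo into a syllable is plain arithmetic in the Unicode Standard's Hangul composition algorithm. The following is a sketch of that arithmetic only, assuming modern jamo in the U+1100, U+1161 and U+11A8 ranges; the function name `compose_hangul` is hypothetical and this is not something Vim does.

```python
# Constants from the Unicode Standard's Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_hangul(lead, vowel, trail=None):
    """Combine a leading consonant, a vowel and an optional trailing consonant."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT):
        raise ValueError("not a composable leading consonant / vowel pair")
    t_index = 0
    if trail is not None:
        t_index = ord(trail) - T_BASE
        # Index 0 means "no trailing consonant", so a real one starts at 1.
        if not 1 <= t_index < T_COUNT:
            raise ValueError("not a composable trailing consonant")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


print(compose_hangul("\u1112", "\u1161", "\u11AB"))
```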

>
> So in a way the problem is related to having 한 in a filename since Mac
> OS X apparently represents it as U+1112, U+1161, U+11AB instead of as
> U+D55C.  Still, if one were to enter those three characters separately
> in a buffer the same problem would arise.  As far as I see it this
> only means that the Cocoa text system is not suitable for this purpose
> which only means that we will have to migrate to the ATSUI or CoreText
> renderers sometime in the future.
>
> To conclude: it seems that this is a problem with the Cocoa text
> system and not that something in Vim has to be "fixed" as I stated in
> my previous post (unless Vim should do the same as Cocoa and
> automatically render ᄒ,ᅡ,ᆫ as 한).
>
> Björn

Well, sorry I can't help you.


Best regards,
Tony.
--
Really heard in court in the U.S.A.:
Q.: Doctor, before you started the autopsy, did you check the pulse?
A.: No, I didn't.
Q.: Did you test the blood pressure?
A.: No, I didn't.
Q.: Did you check the breathing?
A.: No, I didn't.
Q.: Then there is a possibility that you autopsied a living person?
A.: No, there isn't.
Q.: How can you be so sure, Doctor?
A.: Because his brain was in a jar on my desk.
Q.: I see. But couldn't the patient be still alive nevertheless?
A.: Hm, yes, he could still be alive, practicing as a lawyer.


#2615 From: Andrew Dunbar <hippytrail@...>
Date: Sun Jun 21, 2009 3:03 am
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
2009/6/20 Tony Mechelynck <antoine.mechelynck@...>:
>
> On 20/06/09 19:58, björn wrote:
>>
>> 2009/6/20 Tony Mechelynck:
>>>
>>> On Jun 19, 11:22 pm, björn<bjorn.winck...@...>  wrote:
>>> [...]
>>>> I'm afraid I know way too little about text rendering to fix this, so
>>>> until somebody else fixes it in core Vim the problem will remain.  I
>>>> would highly suggest that you take up the rendering problem on the
>>>> vim_dev mailing list (just send a text file with 한 as the content and
>>>> ask why it renders as three glyphs).
>>>>
>>>> Sorry,
>>>> Björn
>>>
>>> The OP's problem was about a file with 한 as (part of) the filename,
>>> not as the content.
>>
>> Tony,
>>
>> I appreciate that you reply to my post, but there really is no need in
>> stating the painfully obvious.  The problem appears when the filename
>> is displayed in the command line (as a result of opening the file) and
>> as such you run into the same problem if you have that character as
>> part of the contents of a file.
>>
>>> I'm on Linux, so what I see may be different from what you see on
>>> MacVim; but in gvim I see 한 (when in the content of a file) as one
>>> glyph (U+D55C, corresponding to three bytes, hex ED 95 9C). Are you
>>> sure you have 'encoding' correctly set? (I use utf-8).
>>
>> I tried it myself on Linux and had the same problem and realized that
>> the problem has to do with how you represent 한.  If done as you
>> suggest with U+D55C it works (both Linux and MacVim), but if
>> represented by U+1112, U+1161, U+11AB then Vim will render it as three
>> glyphs but here the Cocoa text system combines these into one glyph
>> and that is where the problem in MacVim appears.  (By the way: MacVim
>> defaults to use utf-8 for 'encoding'.)
>
> Ah, I see. I entered it in Vim by copy-paste from your previous post in
> the vim_mac Google Group page in my browser.
>
> Vim is obviously unaware of hangul jamo decomposition / recomposition
> and IIUC will render each of them as one glyph. I'm not sure how to have
> them be treated as "one spacing + (in this case) 2 composing characters"
> though IIUC it would be "the right way" to do it.
>
>>
>> So in a way the problem is related to having 한 in a filename since Mac
>> OS X apparently represents it as U+1112, U+1161, U+11AB instead of as
>> U+D55C.  Still, if one were to enter those three characters separately
>> in a buffer the same problem would arise.  As far as I see it this
>> only means that the Cocoa text system is not suitable for this purpose
>> which only means that we will have to migrate to the ATSUI or CoreText
>> renderers sometime in the future.
>>
>> To conclude: it seems that this is a problem with the Cocoa text
>> system and not that something in Vim has to be "fixed" as I stated in
>> my previous post (unless Vim should do the same as Cocoa and
>> automatically render ᄒ,ᅡ,ᆫ as 한).
>>
>> Björn

Hangul jamo (de)composition is part of Unicode normalization. Do we know
if OS X does Unicode normalization for all characters or just for Korean?
I suspect it is done for all characters, to prevent two identical-looking
filenames which differ only in Unicode normalization. A good language to
test this with would be Vietnamese, which uses Latin script with up to
three "accents" per character.

Unicode normalization might be a feature of the HFS+ filesystem, as
matching encodings to filesystems is a general problem in computing.
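
The "identical looking filenames" concern can be illustrated with a Vietnamese name. This is a sketch using Python's standard NFD; the filenames are invented for the example, and HFS+ may differ in details from standard NFD.

```python
import unicodedata

composed_name = "Vi\u1EC7t.txt"           # ệ as one code point (U+1EC7)
decomposed_name = "Vie\u0323\u0302t.txt"  # e + dot below + circumflex

# The two strings look the same on screen but compare unequal as-is.
same_raw = composed_name == decomposed_name

# After normalizing both to the same form they compare equal.
same_nfd = (unicodedata.normalize("NFD", composed_name)
            == unicodedata.normalize("NFD", decomposed_name))

print(same_raw, same_nfd)
print(len(composed_name), len(decomposed_name))
```

A filesystem that stores names in one fixed normalization form avoids having both variants exist side by side in one directory.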

Andrew Dunbar (hippietrail)

> Well, sorry I can't help you.
>
>
> Best regards,
> Tony.
> --
> Really heard in court in the U.S.A.:
> Q.: Doctor, before you started the autopsy, did you check the pulse?
> A.: No, I didn't.
> Q.: Did you test the blood pressure?
> A.: No, I didn't.
> Q.: Did you check the breathing?
> A.: No, I didn't.
> Q.: Then there is a possibility that you autopsied a living person?
> A.: No, there isn't.
> Q.: How can you be so sure, Doctor?
> A.: Because his brain was in a jar on my desk.
> Q.: I see. But couldn't the patient be still alive nevertheless?
> A.: Hm, yes, he could still be alive, practicing as a lawyer.
>
> >
>



--
http://wiktionarydev.leuksman.com http://linguaphile.sf.net


#2616 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Sat Jun 20, 2009 12:58 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
On Jun 19, 11:22 pm, björn <bjorn.winck...@...> wrote:
[...]
> I'm afraid I know way too little about text rendering to fix this, so
> until somebody else fixes it in core Vim the problem will remain.  I
> would highly suggest that you take up the rendering problem on the
> vim_dev mailing list (just send a text file with 한 as the content and
> ask why it renders as three glyphs).
>
> Sorry,
> Björn

The OP's problem was about a file with 한 as (part of) the filename,
not as the content.

I'm on Linux, so what I see may be different from what you see on
MacVim; but in gvim I see 한 (when in the content of a file) as one
glyph (U+D55C, corresponding to three bytes, hex ED 95 9C). Are you
sure you have 'encoding' correctly set? (I use utf-8).
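
The byte values quoted above are easy to confirm. A short Python sketch, which also shows the UTF-8 length of the three-jamo spelling for comparison (variable names are illustrative only):

```python
syllable = "\uD55C"           # 한 as one code point
jamo = "\u1112\u1161\u11AB"   # the same syllable as three jamo

syllable_bytes = syllable.encode("utf-8")
jamo_bytes = jamo.encode("utf-8")

# Hex dump of the precomposed syllable, then the byte counts of both forms.
syllable_hex = " ".join(f"{b:02X}" for b in syllable_bytes)
print(syllable_hex)
print(len(syllable_bytes), len(jamo_bytes))
```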


Best regards,
Tony.
--
Swipple's Rule of Order:
	 He who shouts the loudest has the floor.


#2617 From: björn <bjorn.winckler@...>
Date: Tue Jun 23, 2009 6:54 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
2009/6/21 Andrew Dunbar:
> 2009/6/20 Tony Mechelynck:
>> On 20/06/09 19:58, björn wrote:
>>>
>>> I tried it myself on Linux and had the same problem and realized that
>>> the problem has to do with how you represent 한.  If done as you
>>> suggest with U+D55C it works (both Linux and MacVim), but if
>>> represented by U+1112, U+1161, U+11AB then Vim will render it as three
>>> glyphs but here the Cocoa text system combines these into one glyph
>>> and that is where the problem in MacVim appears.  (By the way: MacVim
>>> defaults to use utf-8 for 'encoding'.)
>>
>> Ah, I see. I entered it in Vim by copy-paste from your previous post in
>> the vim_mac Google Group page in my browser.
>>
>> Vim is obviously unaware of hangul jamo decomposition / recomposition
>> and IIUC will render each of them as one glyph. I'm not sure how to have
>> them be treated as "one spacing + (in this case) 2 composing characters"
>> though IIUC it would be "the right way" to do it.
>
> Hangul jamo (de)composition is part of Unicode normalization. Do we know
> if OS X does Unicode normalization for all characters or just for Korean?
> I suspect it is done for all characters to prevent two identical looking
> filenames which differ only in Unicode normalization. A good language to
> test this with would be Vietnamese which uses Latin script with up to
> three "accents" per character.
>
> Unicode normalization might be a feature of the HFS+ filesystem as there is
> a general problem in computing of encodings vs. filesystems.

Hi Andrew,

As far as I can tell (from searching around) HFS+ always uses
normalization form D (NFD) for filenames.  So as a workaround for the
issue the OP had I now normalize filenames to compatibility form C
(NFKC) before passing the filename on to Vim and this takes care of
the OP's problem.

However, as I see it this really is a legitimate issue in Vim itself
in that it does not handle NFD properly (the example above should
always render as one glyph, not three as it does now if NFD is used).
Either Vim should ensure that all buffers are normalized to composed
form NFC/NFKC or it needs to be made "NFD aware".  Does anybody on the
vim_multibyte list (this mail goes to vim_mac as well) have any
comments on this?
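
The workaround of composing a filename before handing it to the editor might look roughly like this in Python. It is only a sketch: the helper names are hypothetical, MacVim's actual code is not shown in this thread, and the sketch uses plain NFC rather than NFKC so that compatibility characters are left alone.

```python
import unicodedata


def prepare_filename(name: str) -> str:
    """Return the filename in composed form (NFC). Hypothetical helper."""
    return unicodedata.normalize("NFC", name)


def is_composed(name: str) -> bool:
    """True if normalizing to NFC would leave the name unchanged."""
    return name == unicodedata.normalize("NFC", name)


from_filesystem = "\u1112\u1161\u11AB.txt"  # decomposed, as HFS+ would return it
for_editor = prepare_filename(from_filesystem)
print(is_composed(from_filesystem), is_composed(for_editor))
```

Note that the composed name no longer matches the stored bytes on a filesystem that does not normalize, so the original form may still be needed when the file is actually opened.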

Björn


#2618 From: "John (Eljay) Love-Jensen" <eljay@...>
Date: Tue Jun 23, 2009 7:46 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
Hi Björn,

> As far as I can tell (from searching around) HFS+ always uses
> normalization form D (NFD) for filenames.

HFS+ uses a variant of NFD for filenames.  (The HFS+ variant predates
standardization of NFD.)  This requirement is enforced by the OS.

http://developer.apple.com/technotes/tn/tn1150.html
http://developer.apple.com/technotes/tn/tn1150table.html
http://developer.apple.com/qa/qa2001/qa1235.html
http://www.unicode.org/reports/tr15/

Windows uses NFC for filenames.  I'm not sure if the Linux world settled on
NFC or NFK.

Amiga OS (at least the one I used) is ECMA 94 Latin 1 based (precursor to
ISO 8859-1).

> So as a workaround for the issue the OP had I now normalize filenames
> to compatibility form C (NFKC) before passing the filename on to Vim
> and this takes care of the OP's problem.

NFC or NFKC?  Those are different normalizations.
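
To make the difference concrete: NFC only composes canonically equivalent sequences, while NFKC additionally folds compatibility characters, which can change what a filename says. A small Python sketch with an invented filename:

```python
import unicodedata

# 'ﬁ' ligature (U+FB01), then "le", then CIRCLED DIGIT ONE (U+2460).
sample = "\uFB01le\u2460.txt"

nfc = unicodedata.normalize("NFC", sample)
nfkc = unicodedata.normalize("NFKC", sample)

print(nfc == sample)  # NFC leaves compatibility characters untouched
print(nfkc)           # NFKC replaces them with their plain equivalents
```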

Windows NTFS file system uses NFC.  But it isn't enforced by the OS, yet.

> However, as I see it this really is a legitimate issue in Vim itself
> in that it does not handle NFD properly (the example above should
> always render as one glyph, not three as it does now if NFD is used).
> Either Vim should ensure that all buffers are normalized to composed
> form NFC/NFKC or it needs to be made "NFD aware".

I agree with your assessment.

> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
> well) have any comments on this?

The relevant Mac OS X APIs are:

#include <CoreFoundation/CoreFoundation.h>

// cfstringFullPath is a CFStringRef holding the full POSIX path.
CFURLRef url =
CFURLCreateWithFileSystemPath(
   kCFAllocatorDefault,
   cfstringFullPath,
   kCFURLPOSIXPathStyle,
   false);  // false: the path is not a directory

char bufferUTF8[32768*4]; // Worst case scenario.
// As per Apple documentation, paths can be "up to 30,000 UTF-16
// encoding units long", with each component being up to 255 UTF-16
// encoding units long.  Too bad there isn't an API to specify the
// exact buffer size /a priori/.

Boolean success =
CFURLGetFileSystemRepresentation(
  url,
  true,  // resolve against the base URL
  (UInt8 *)bufferUTF8,
  sizeof bufferUTF8);

Sincerely,
--Eljay



#2619 From: "John (Eljay) Love-Jensen" <eljay@...>
Date: Tue Jun 23, 2009 8:02 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
> Windows uses NFC for filenames.  I'm not sure if the Linux world settled on
> NFC or NFK.

I meant:  ... NFC or NFD.

Fat fingers.

--Eljay



#2620 From: Andrew Dunbar <hippytrail@...>
Date: Wed Jun 24, 2009 1:34 am
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
2009/6/23 John (Eljay) Love-Jensen <eljay@...>:
>
> Hi Björn,
>
>> As far as I can tell (from searching around) HFS+ always uses
>> normalization form D (NFD) for filenames.
>
> HFS+ uses a variant of NFD for filenames. (The HFS+ variant predates
> standardization of NFD.) This requirement is enforced by the OS.
>
> http://developer.apple.com/technotes/tn/tn1150.html
> http://developer.apple.com/technotes/tn/tn1150table.html
> http://developer.apple.com/qa/qa2001/qa1235.html
> http://www.unicode.org/reports/tr15/
>
> Windows uses NFC for filenames. I'm not sure if the Linux world settled on
> NFC or NFK.

When I worked on AbiWord a few years ago Linux left filename encoding
up to the filesystem and the user. This may have changed since...

Linux supports many filesystems including Windows and Mac filesystems.
For filesystems which mandate a specific encoding Linux should follow
those rules. For older filesystems the encoding would generally be the
encoding of the OS but... Linux, as a Unix, is a multiuser OS and may have
various users using various languages in various encodings. Each user
gets to decide their language and encoding through environment
variables such as LANG, LC_ALL, LC_COLLATE etc. These vary by vintage
of the OS and may well vary for other Unixes too such as FreeBSD.

I think Linux generally uses extN filesystems as default. When I was
last working with it that was ext2 but ext3 has now been in use for
some time and ext4 is the current iteration which may or may not be in
general release. The ext3 or ext4 filesystems may mandate an encoding
that ext2 did not.

The general solution for the Unix/Linux world may be to honour the
user's locale settings and assume that the filesystem software will
convert to any specifically mandated encoding it requires when you
call the standard open() etc APIs.

But further research is definitely recommended!

Andrew Dunbar.


--
http://wiktionarydev.leuksman.com http://linguaphile.sf.net


#2621 From: Nico Weber <nicolasweber@...>
Date: Wed Jun 24, 2009 4:13 am
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
>> Windows uses NFC for filenames.  I'm not sure if the Linux world
>> settled on
>> NFC or NFK.
>
> When I worked on AbiWord a few years ago Linux left filename encoding
> up to the filesystem and the user. This may have changed since...


I'm pretty sure it hasn't. As far as I know, for Linux a filename is
just a bunch of bytes, and you only need to know the encoding for
lesser tasks such as file name display anyway ;-) In that case, the
recommended way is to get the encoding from an env var.
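
A small Python sketch of that idea, for illustration only. The helper name display_name and the sample bytes are assumptions. The kernel-side name is treated as opaque bytes, and an encoding is applied only when the name has to be shown. locale.getpreferredencoding() reports the encoding derived from the user's locale settings.

```python
import locale

def display_name(raw: bytes, encoding: str) -> str:
    """Decode raw filename bytes for display only; bad bytes become U+FFFD."""
    return raw.decode(encoding, errors="replace")

# The same stored bytes read differently depending on the assumed encoding.
raw = b"caf\xc3\xa9"
print(display_name(raw, "utf-8"))    # café
print(display_name(raw, "latin-1"))  # cafÃ©

# Encoding taken from the user's locale; the result depends on the system.
user_encoding = locale.getpreferredencoding(False)
shown = display_name(raw, user_encoding)
```

Opening the file never needs the decoded form; the original bytes are passed to open() unchanged.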

Nico


#2622 From: björn <bjorn.winckler@...>
Date: Wed Jun 24, 2009 12:00 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
Hi Eljay,

2009/6/23 John (Eljay) Love-Jensen:
>
>> As far as I can tell (from searching around) HFS+ always uses
>> normalization form D (NFD) for filenames.
>
> HFS+ uses a variant of NFD for filenames. (The HFS+ variant predates
> standardization of NFD.) This requirement is enforced by the OS.
>
> http://developer.apple.com/technotes/tn/tn1150.html
> http://developer.apple.com/technotes/tn/tn1150table.html
> http://developer.apple.com/qa/qa2001/qa1235.html
> http://www.unicode.org/reports/tr15/

Thanks for clarifying that (and for the links!).

> Windows uses NFC for filenames. I'm not sure if the Linux world settled on
> NFC or NFK.

I read that Windows uses NFKC.  Have you got a reference for the claim
that NFC is used?

>> So as a workaround for the issue the OP had I now normalize filenames
>> to compatibility form C (NFKC) before passing the filename on to Vim
>> and this takes care of the OP's problem.
>
> NFC or NFKC? Those are different normalizations.
>
> Windows NTFS file system uses NFC. But it isn't enforced by the OS, yet.

I did mean the compatibility form NFKC since I read somewhere that
NTFS uses NFKC, but I did not research that very carefully.


>> However, as I see it this really is a legitimate issue in Vim itself
>> in that it does not handle NFD properly (the example above should
>> always render as one glyph, not three as it does now if NFD is used).
>> Either Vim should ensure that all buffers are normalized to composed
>> form NFC/NFKC or it needs to be made "NFD aware".
>
> I agree with your assessment.
>
>> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
>> well) have any comments on this?
>
> The relevant Mac OS X routine APIs are:
>
> CFURLRef url =
> CFURLCreateWithFileSystemPath(
> kCFAllocatorDefault,
> cfstringFullPath,
> kCFURLPOSIXPathStyle,
> false));
>
> char bufferUTF8[32768*4]; // Worst case scenario.
> // As per Apple documentation, paths can be "up to 30,000 UTF-16
> // encoding units long", with each component being up to 255 UTF-16
> // encoding units long. Too bad there isn't an API to specify the
> // exact buffer size /a priori/.
>
> Boolean success =
> CFURLGetFileSystemRepresentation(
> url,
> true,
> &bufferUTF8[0],
> sizeof bufferUTF8);

Thanks.  NSString has a method called fileSystemRepresentation which
I'm guessing does the same thing(?).  I used the NSString method
precomposedStringWithCompatibilityMapping to convert to NFKC.

Björn


#2623 From: "John (Eljay) Love-Jensen" <eljay@...>
Date: Wed Jun 24, 2009 1:09 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
Hi Björn,

> I read that Windows uses NFKC.  Have you got a reference for the claim
> that NFC is used?

Drat, I cannot find the MSDN reference.  Maybe my memory has failed me.

NFKC is lossy.  NFC is non-lossy.

Perhaps you are remembering the security information:
http://msdn.microsoft.com/en-us/library/dd374047(VS.85).aspx#SC_Unicode

File Names, Paths, and Namespaces information is here:
http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx

Note that modern UNC (starts with "\\?\" for paths or with "\\.\" for
volumes, such as "\\?\C:\Dir\Sub\File.ext", and allows up to 32,767 UTF-16
encoding units (Vista) or UCS-2 characters (XP), using a 16-bit encoding of
Unicode) is different from older "short" UNC (DOS-era limit of 260 8-bit
characters, dependent on the OS code page setting).

NFC is mentioned here in an MSDN blog:
http://blogs.msdn.com/michkap/archive/2006/12/07/1232365.aspx

But I don't consider that canonical, since it was in a blog feedback
comment.

I asked for clarification on the MSDN "File Names, Paths, and Namespaces"
page, in the comments section.

NOTE:  "short" UNC and "old" DOS style has to abide by the OS code page
setting.  Even when using the FooW routines and wchar_t (16-bit) paths.

> Thanks.  NSString has a method called fileSystemRepresentation which
> I'm guessing does the same thing(?).  I used the NSString method
> precomposedStringWithCompatibilityMapping to convert to NFKC.

I presume so.  My Cocoa experience is not as extensive as my Carbon
experience.

Sincerely,
--Eljay



#2624 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Wed Jun 24, 2009 1:33 pm
Subject: Re: Failed to drag&drop-open a file with wide-chars in its filename
 
On 24/06/09 14:00, björn wrote:
>
> Hi Eljay,
>
> 2009/6/23 John (Eljay) Love-Jensen:
>>
>>> As far as I can tell (from searching around) HFS+ always uses
>>> normalization form D (NFD) for filenames.
>>
>> HFS+ uses a variant of NFD for filenames.  (The HFS+ variant predates
>> standardization of NFD.)  This requirement is enforced by the OS.
>>
>> http://developer.apple.com/technotes/tn/tn1150.html
>> http://developer.apple.com/technotes/tn/tn1150table.html
>> http://developer.apple.com/qa/qa2001/qa1235.html
>> http://www.unicode.org/reports/tr15/
>
> Thanks for clarifying that (and for the links!).
>
>> Windows uses NFC for filenames.  I'm not sure if the Linux world settled on
>> NFC or NFK.
>
> I read that Windows uses NFKC.  Have you got a reference for the claim
> that NFC is used?
>
>>> So as a workaround for the issue the OP had I now normalize filenames
>>> to compatibility form C (NFKC) before passing the filename on to Vim
>>> and this takes care of the OP's problem.
>>
>> NFC or NFKC?  Those are different normalizations.
>>
>> Windows NTFS file system uses NFC.  But it isn't enforced by the OS, yet.
>
> I did mean the compatibility form NFKC since I read somewhere that
> NTFS uses NFKC, but I did not research that very carefully.
>
>
>>> However, as I see it this really is a legitimate issue in Vim itself
>>> in that it does not handle NFD properly (the example above should
>>> always render as one glyph, not three as it does now if NFD is used).
>>> Either Vim should ensure that all buffers are normalized to composed
>>> form NFC/NFKC or it needs to be made "NFD aware".
>>
>> I agree with your assessment.
>>
>>> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
>>> well) have any comments on this?
>>
>> The relevant Mac OS X routine APIs are:
>>
>> CFURLRef url =
>> CFURLCreateWithFileSystemPath(
>>   kCFAllocatorDefault,
>>   cfstringFullPath,
>>   kCFURLPOSIXPathStyle,
>>   false));
>>
>> char bufferUTF8[32768*4]; // Worst case scenario.
>> // As per Apple documentation, paths can be "up to 30,000 UTF-16
>> // encoding units long", with each component being up to 255 UTF-16
>> // encoding units long.  Too bad there isn't an API to specify the
>> // exact buffer size /a priori/.
>>
>> Boolean success =
>> CFURLGetFileSystemRepresentation(
>>   url,
>>   true,
>>   &bufferUTF8[0],
>>   sizeof bufferUTF8);
>
> Thanks.  NSString has a method called fileSystemRepresentation which
> I'm guessing does the same thing(?).  I used the NSString method
> precomposedStringWithCompatibilityMapping to convert to NFKC.
>
> Björn

Hm, NFKC and NFKD sometimes fuse slightly different glyphs into a single
"normalized" form. For instance, NFKC(²) = 2, though both are
(different) Latin1 characters (0xB2 and 0x32). IIRC, DOS would have kept
them distinct.
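
The superscript-two case can be checked with Python's standard unicodedata module. This is only a sketch: the two filenames at the end are made-up examples of names that would collide after compatibility normalization.

```python
import unicodedata

superscript_two = "\u00b2"  # Latin-1 0xB2
digit_two = "\u0032"        # Latin-1 0x32

nfc = unicodedata.normalize("NFC", superscript_two)
nfkc = unicodedata.normalize("NFKC", superscript_two)

print(nfc == superscript_two)  # True: NFC leaves the character alone
print(nfkc == digit_two)       # True: NFKC folds it into a plain "2"

# Two distinct (hypothetical) filenames that become identical under NFKC.
a, b = "x\u00b2.txt", "x2.txt"
collide = unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)
print(collide)                 # True
```

Because the fold cannot be undone, NFKC is the lossy choice, while NFC keeps the two names apart.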

Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
56. You leave the modem speaker on after connecting because you think it
      sounds like the ocean wind...the perfect soundtrack for "surfing
the net".


#2625 From: KL <klu1024@...>
Date: Sat Jul 4, 2009 7:29 am
Subject: About line wrap when both English and Chinese Chars are present
 
Hi,

I hope this is the correct place to bring up this question. I love vim
very much! But I think that Vim's localization is still not perfect.

http://lh6.ggpht.com/_fHAIwtuRTpY/Sk8EiZSN0gI/AAAAAAAAABg/435m_aPipNA/LineWrap.png

Please take a look at the screenshot. The line should wrap at a
Chinese character which hits the right edge of the window. Vim does it
right for those lines with ONLY Chinese chars. However, obviously when
an English word is present in the line, Vim does the wrong thing: it
considers the English word AND the following Chinese chars as
unsplittable.

This is because I set "lbr" in Vim. Unsetting "lbr" will solve the
problem BUT when the English word hits the edge it will be split right
in the middle, which is not what I want, either.

Any suggestions?

Thanks!

KL


#2626 From: Bram Moolenaar <Bram@...>
Date: Sun Jul 5, 2009 10:41 am
Subject: Re: About line wrap when both English and Chinese Chars are present
 
KL wrote:

[ Please mention your name ]

> I hope this is the correct place to bring up this question. I love vim
> very much! But I think that Vim's localization is still not perfect.
>
> http://lh6.ggpht.com/_fHAIwtuRTpY/Sk8EiZSN0gI/AAAAAAAAABg/435m_aPipNA/LineWrap.png
>
> Please take a look at the screenshot. The line should wrap at a
> Chinese character which hits the right edge of the window. Vim does it
> right for those lines with ONLY chinese chars. However, obviously when
> an English word is present in the line, Vim does the wrong thing: it
> considers the English word AND the following Chinese chars as
> unsplittable.
>
> This is because I set "lbr" in Vim. Unsetting "lbr" will solve the
> problem BUT when the English word hits the edge it will be split right
> in the middle, which is not what I want, either.

Perhaps this code will help: http://vimgadgets.sourceforge.net/liblinebreak/
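
As a rough illustration of the behaviour being asked for, the Python sketch below marks where a line could be broken: after a space, and on either side of a wide (East Asian) character, but never inside a run of Latin letters. It is not liblinebreak's algorithm and not Vim's 'linebreak' logic. The function names are made up, and the real Unicode line breaking rules (punctuation, for instance) are ignored.

```python
import unicodedata

def is_wide(ch: str) -> bool:
    """True for characters with East Asian Width W or F (e.g. CJK ideographs)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")

def break_opportunities(text: str) -> list:
    """Return every index i such that the line may be broken before text[i]."""
    points = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if prev == " " or is_wide(prev) or is_wide(cur):
            points.append(i)
    return points

print(break_opportunities("abc\u4e2d\u6587def"))  # [3, 4, 5]
print(break_opportunities("hello world"))         # [6]
```

With these break points an English word stays in one piece, while the Chinese characters that follow it can still be split one by one at the window edge.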


--
ERROR 047: Keyboard not found.  Press RETURN to continue.

  /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
  \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///


#2627 From: Aleksey <alex.baibarin@...>
Date: Wed Jul 8, 2009 8:48 pm
Subject: BOM in QuickFix window
 
Hi.

I'm using a few Mono apps (the Nemerle compiler and the Coco/R parser
generator) through compiler plugins in Vim under Linux, and the problem is
that they produce output starting with FEFF.
I had this problem about half a year ago and, being (and staying) a
newbie, thought that the problem was with *errorformat*, but I didn't
manage to define it correctly; I'm not sure if that is even possible.
I found a workaround: setting LANG=en_US in the compiler call command.

I have 'encoding' and 'termencoding' set to 'utf-8', and the locale is
'en_US.utf8'.

Is there a better way to solve this problem?

Thanks


#2628 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Thu Jul 9, 2009 3:52 am
Subject: Re: BOM in QuickFix window
 
On 08/07/09 22:48, Aleksey wrote:
>
> Hi.
>
> I'm using few MONO apps (Nemerle compiler and COCO/R parser generator)
> through compiler plugins in VIM under Linux, and the problem is that
> they produce output starting with FEFF.
> I had this problem about a half a year ago and being (and staying) a
> newbie, thought that the problem was with *errorformat* but didn't
> cope defining it correctly - not sure if it is possible.
> I found workaround, setting LANG=en_US in compiler call command.
>
> I have encoding and termencoding set to 'utf-8' and locale is
> 'en_US.utf8'.
>
> Is there a better way to solve this problem?
>
> Thanks

Make sure your 'fileencodings' option (plural) starts with "ucs-bom". In
that case, for editfiles at least (not sure about quickfix error files),
Vim will recognise the Unicode codepoint U+FEFF (known as the BOM for
Byte Order Mark though actually it's more than that) when it happens at
the very start of a file, and set 'fileencoding' (singular) and 'bomb'
accordingly for that file, as follows:

Lead bytes (hex)     'fileencoding'      'bomb'
EF BB BF             utf-8               bomb
00 00 FE FF          ucs-4               bomb
FF FE 00 00          ucs-4le             bomb
FE FF                utf-16              bomb
FF FE                utf-16le            bomb
anything else        not yet known       nobomb

As you can see, the BOM allows us to identify all Unicode encodings
(endianness included, but treating UCS-2 as a particular case of UTF-16,
which it is) assuming that no little-endian UTF-16 file will ever start
with a NULL codepoint, which I think is a reasonable assumption. ("Not
yet known" in the table above means "try the next entry in
'fileencodings'".)
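
The table can be written down as a short Python sketch. This is an illustration of the lookup, not Vim's actual code, and the names BOMS and sniff_bom are made up. The four-byte signatures are tested first so that FF FE 00 00 is not taken for FF FE, which is the same assumption about a leading NULL mentioned above.

```python
# (lead bytes, 'fileencoding'), longest signatures first.
BOMS = [
    (b"\x00\x00\xfe\xff", "ucs-4"),
    (b"\xff\xfe\x00\x00", "ucs-4le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16"),
    (b"\xff\xfe", "utf-16le"),
]

def sniff_bom(data: bytes):
    """Return (fileencoding, bomb) for the start of a file, per the table."""
    for signature, name in BOMS:
        if data.startswith(signature):
            return name, True
    # Not yet known: the next entry in 'fileencodings' would be tried.
    return None, False

print(sniff_bom(b"\xef\xbb\xbfhello"))  # ('utf-8', True)
print(sniff_bom(b"\xff\xfeh\x00"))      # ('utf-16le', True)
print(sniff_bom(b"hello"))              # (None, False)
```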

If, after trying it, you find that it doesn't work for the quickfix
errorfile, come back to report it, and in that case maybe Bram will pass
by and add it to his TODO list. But even if he does, don't set your
hopes too high: it's a very long list, see |todo.txt| in the Vim help
(the file is currently 4705 lines long as of 2 July 2009, but I didn't
count how many todo items (which should be fewer ;-) ) are in it).

IIUC, even if 'fileencodings' doesn't work for quickfix error files,
setting LC_MESSAGES should be enough (LANG is used as a default for any
LC_<something> which is unset, or LC_ALL if present overrides any other
LC_* even if present). Maybe even just using ":language messages en-US"
or ":lang mess C" near the top of your vimrc could be enough.
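
The precedence among those variables can be sketched in a few lines of Python. The function name effective_locale is made up, and the final fallback to "C" is an assumption, since the default when nothing is set is left to the implementation.

```python
def effective_locale(category: str, env: dict) -> str:
    """Pick the locale for one category: LC_ALL, then LC_<category>, then LANG."""
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return value
    return "C"     # assumed default when nothing is set

env = {"LANG": "en_US.utf8"}
print(effective_locale("LC_MESSAGES", env))  # en_US.utf8

env["LC_MESSAGES"] = "C"
print(effective_locale("LC_MESSAGES", env))  # C

env["LC_ALL"] = "de_DE.utf8"
print(effective_locale("LC_MESSAGES", env))  # de_DE.utf8
```

This also shows why setting only LANG for the compiler command changes its messages: with no LC_ALL and no LC_MESSAGES in the environment, LANG decides.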



Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
66. You create a homepage with the impression to cure the afflicted...but
      your hidden agenda is to receive more e-mail.


#2629 From: Aleksey <alex.baibarin@...>
Date: Thu Jul 9, 2009 6:52 am
Subject: Re: BOM in QuickFix window
 
> Make sure your 'fileencodings' option (plural) starts with "ucs-bom".

> IIUC, even if 'fileencodings' doesn't work for quickfix error files,
> setting LC_MESSAGES should be enough (LANG is used as a default for any
> LC_<something> which is unset, or LC_ALL if present overrides any other
> LC_* even if present). Maybe even just using ":language messages en-US"
> or ":lang mess C" near the top of your vimrc could be enough.
>
> Best regards,
> Tony.
> --
> hundred-and-one symptoms of being an internet addict:
> 66. You create a homepage with the impression to cure the afflicted...but
>   your hidden agenda is to receive more e-mail.

Thanks for your answer.

fileencoding already had 'ucs-bom' at its start, so it didn't help.
I also have :language messages C, so it didn't help either.
I didn't quite understand what you wrote about LC_ stuff, because I
lack even basic knowledge, but does it mean my solution with setting
LANG=en_US is fine?

Reading docs for VIM 7.2 (:help quickfix) I found the following
paragraph:
     When 'encoding' differs from the locale, the error messages may have a
     different encoding from what Vim is using.  To convert the messages you can
     use this code:
	 function QfMakeConv()
	    let qflist = getqflist()
	    for i in qflist
	       let i.text = iconv(i.text, "cp936", "utf-8")
	    endfor
	    call setqflist(qflist)
	 endfunction

	 au QuickfixCmdPost make call QfMakeConv()

I've tried putting this in my .vimrc, replacing "cp936" with ucs-bom, but
I'm not sure if that was right. It didn't change anything.

I don't think this problem is worthy of Bram's attention since it's not
really widespread and a workaround is available.

#2630 From: Tony Mechelynck <antoine.mechelynck@...>
Date: Thu Jul 9, 2009 9:51 am
Subject: Re: BOM in QuickFix window
 
On 09/07/09 08:52, Aleksey wrote:
>
>> Make sure your 'fileencodings' option (plural) starts with "ucs-bom".
------------------------------------------^^^^^^
>
>> IIUC, even if 'fileencodings' doesn't work for quickfix error files,
>> setting LC_MESSAGES should be enough (LANG is used as a default for any
>> LC_<something>  which is unset, or LC_ALL if present overrides any other
>> LC_* even if present). Maybe even just using ":language messages en-US"
>> or ":lang mess C" near the top of your vimrc could be enough.
>>
>> Best regards,
>> Tony.
>> --
>> hundred-and-one symptoms of being an internet addict:
>> 66. You create a homepage with the impression to cure the afflicted...but
>>       your hidden agenda is to receive more e-mail.
>
> Thanks for your answer.
>
> fileencoding already had 'ucs-bom' in its start, so it didnt' help.

should be 'fileencodings' with s at the end. There are two different
options, with and without s, and they don't have the same meaning.
Without s, it's the encoding used on disk for one file at a time. With
s, it's the heuristics to find what to use when opening an existing file.

> I also have :language messages C, so it didn't help either.
> I didn't quite understand what you wrote about LC_ stuff, because I
> lack even basic knowledge, but does it mean my solution with setting
> LANG=en_US is fine?

yeah, sure.

>
> Reading docs for VIM 7.2 (:help quickfix) I found the following
> paragraph:
>      When 'encoding' differs from the locale, the error messages may
> have a
>      different encoding from what Vim is using.  To convert the
> messages you can
>      use this code:
>  function QfMakeConv()
> 	   let qflist = getqflist()
> 	   for i in qflist
> 	      let i.text = iconv(i.text, "cp936", "utf-8")
> 	   endfor
> 	   call setqflist(qflist)
>  endfunction
>
>  au QuickfixCmdPost make call QfMakeConv()
>
> I've tried putting this to .vimrc replacing "cp936" with ucs-bom, but
> not sure if that was right. It didn't change anything.

no, iconv doesn't know about ucs-bom.

>
> I don't think this problem is worth of Bram's attention since it's not
> really wide-spread and the workaround is available.

How can one know how widespread it is? Sounds like the age-old "Nobody
uses it" used without proof by developers of some applications (not by
Bram) when speaking of a feature they want to remove because they don't
use it themselves and they don't want to support it.

OK, so IIUC the problem seems to be: when a compiler errorfile starts
with a BOM, Vim doesn't know how to handle it. I'm not sure if there's
an 'errorfile' setting that would get around it, because I've never looked
into that option myself.
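
For what it's worth, outside Vim the leading BOM is easy to drop before the text reaches an error parser. The Python sketch below assumes the compiler output is UTF-8. The function name and the sample error line are invented for illustration and are not real Nemerle output.

```python
def strip_leading_bom(raw: bytes) -> str:
    """Decode UTF-8 output, dropping a leading U+FEFF if there is one."""
    # The "utf-8-sig" codec skips a BOM at the start and is otherwise plain UTF-8.
    return raw.decode("utf-8-sig")

# Hypothetical compiler output that begins with the UTF-8 BOM (EF BB BF).
output = b"\xef\xbb\xbfmain.n(3,5): error: unexpected token\n"
text = strip_leading_bom(output)

print(text.startswith("main.n"))          # True
print(strip_leading_bom(b"no bom here"))  # no bom here
```

A wrapper script of this kind around the compiler command would be one more workaround next to setting LANG. It does not fix anything inside Vim.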


Best regards,
Tony.
--
"Benson, you are so free of the ravages of intelligence"
		 -- Time Bandits


#2631 From: Aleksey <alex.baibarin@...>
Date: Thu Jul 9, 2009 10:06 am
Subject: Re: BOM in QuickFix window
 
On Jul 9, 1:51pm, Tony Mechelynck <antoine.mechely...@...>
wrote:

> should be 'fileencodings' with s at the end. There are two different
> options, with and without s, and they don't have the same meaning.
> Without s, it's the encoding used on disk for one file at a time. With
> s, it's the heuristics to find what to use when opening an existing file.

Sorry, mistyped it. It's 'fileencodings' certainly, and 'ucs-bom' is
at the start of the list.


> > I don't think this problem is worth of Bram's attention since it's not
> > really wide-spread and the workaround is available.
>
> How can one know how widespread it is? Sounds like the age-old "Nobody
> uses it" used without proof by developers of some applications (not by
> Bram) when speaking of a feature they want to remove because they don't
> use it themselves and they don't want to support it.

I didn't mean that. I came to that conclusion because I searched a lot
for similar problems and found almost nothing except your recent answers
to 'Match a BOM' in a file. That deals with a different issue, but it made
me think about mine.

I wanted to say that since this issue has a solution and isn't reported
by other developers, people either know how to live with it or don't care
about it. It's just about not wasting time on low priority stuff, IMHO.
I'm just glad to understand what the problem is.

>
> OK, so IIUC the problem seems to be: When a compiler errorfile starts
> with a BOM, Vim doesn't know how to handle it. I'm not sure if there's
> an 'errorfile' setting allowing to get over it because I've never looked
> into that option myself.
>

Yeah, you've got it right. Thank you for your time, I'll settle
on LANG=en_US, which is just OK for me.

#2632 From: Bram Moolenaar <Bram@...>
Date: Sun Oct 4, 2009 1:18 pm
Subject: Not a Vim announcement
 
Hello Vim users!

Vim bug fixes are coming out one by one, not very exciting but essential
maintenance.  Keeps me busy.  Nothing worth announcing though, that is
why this list has been quiet for a while.

I thought you might want to know that I'll be doing a talk at the
reflections/projections conference, October 17 in Illinois.  Not about
Vim but a new fun project I'm working on: Zimbu.
http://www.acm.uiuc.edu/conference/2009/index.html

- Bram

PS: Don't forget to do your Amazon orders through this ICCF web page, so
that a percentage goes to help needy children in Uganda:
http://www.iccf.nl/click1.html  It's a way of thanking me for Vim.

--
GALAHAD: No. Look, I can tackle this lot single-handed!
GIRLS:   Yes, yes, let him Tackle us single-handed!
                  "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

  /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
  \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///


#2634 From: Paolo Baruffa (wintec) <wintec@...>
Date: Thu Nov 5, 2009 11:56 am
Subject: ANSI<>UNICODE and reverse
 
Hi all!
I need ANSI/UNICODE conversion commands in my GVIM.
I read many docs on the web about it and set these:

"------------------------------------------
" .vimrc  includes
"----------------------
:set ffs=dos,unix,mac " autosense Dos,Unix,Mac
:set fileencodings=ucs-bom,utf-8,latin1 " autosense coding
" (no fileencoding is set in .vimrc)

"------------------------------------------
" my menu includes
"----------------------

:set fileencoding=latin1<CR><Esc>:set ff=dos<CR>:w!<CR> " ANSI Dos
:set fileencoding=utf-8<CR>:w!<CR><Esc> " UNICODE

"-------------------------------------------

but they don't work correctly (I check the files with another
editor)...

1) I open an ANSI file with GVim, I ask ":set fileencoding"
and the file appears as "utf-8"

2) conversion ANSI <> UNICODE and back looks
right ("converted" in bottom status line), but refreshing
the file in the other editor I see the same coding...

Can anybody help, please?



#2635 From: winterTTr <winterTTr.vim@...>
Date: Fri Nov 6, 2009 12:54 am
Subject: Re: ANSI<>UNICODE and reverse
 
On Thu, Nov 5, 2009 at 7:56 PM, Paolo Baruffa <wintec@...> wrote:

> Hi all!
> I need to convert files between ANSI and Unicode in gvim.
> I read many docs on the web about this and set up the following:
>
> "------------------------------------------
> "       .vimrc  includes
> "----------------------
> :set ffs=dos,unix,mac   "       autosense Dos,Unix,Mac
> :set fileencodings=ucs-bom,utf-8,latin1 "       autosense coding
> " (no fileencoding is set in .vimrc)
>
> "------------------------------------------
> "       my menu includes
> "----------------------
>
> :set fileencoding=latin1<CR><Esc>:set ff=dos<CR>:w!<CR> " ANSI Dos
> :set fileencoding=utf-8<CR>:w!<CR><Esc> " UNICODE
>
> "-------------------------------------------
>
> but they don't work correctly (I check the files with
> another editor)...
>
> 1) When I open an ANSI file in gvim and ask ":set fileencoding",
> the file is reported as "utf-8".

What is the value of the 'encoding' option?
If 'encoding' is set to utf-8 and you don't set 'fenc', the file is
opened with the same encoding as 'enc', so it should show up as utf-8.
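
One way to check this from inside Vim is sketched below. These are standard
options and commands, but the values you see depend on your own setup, and
the ++enc=latin1 reload assumes the file really is latin1.

```vim
" Show the current values and, thanks to :verbose, where each was last set.
:verbose set encoding? fileencodings? fileencoding? bomb?

" Assumption: the file is latin1 but was detected as something else.
" Reload it, forcing the encoding instead of relying on 'fileencodings'.
:edit ++enc=latin1
```

Note that with 'fileencodings' set to ucs-bom,utf-8,latin1, Vim tries the
entries in order and keeps the first one that converts without error, so a
file containing only ASCII bytes is likely to be reported as utf-8.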
 

> 2) Converting from ANSI to Unicode and back looks right
> ("converted" appears in the bottom status line), but when I refresh
> the file in the other editor I see the same encoding...

What do you mean by "the same coding"?
Is the file always shown as ANSI, or always as Unicode?

Before that, there is something that should be mentioned.
UTF-8 is a variable-length character encoding for Unicode. If your
original ANSI file contains only US-ASCII characters, then converting it
to UTF-8 (without a BOM) leaves the file byte-for-byte the same, so it
looks as if no conversion happened. That is because UTF-8 encodes the
128 US-ASCII characters as single octets, exactly as ANSI does; only
characters outside that range are encoded differently.
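
This can be checked from inside Vim with a small sketch. It relies on
iconv() and strlen(), which are built-in functions; the variable names and
sample strings are made up for illustration.

```vim
" Sketch only: compare byte lengths before and after conversion.
" strlen() counts bytes; "\xe9" is the single latin1 byte for e-acute.
let plain  = "plain text"
let accent = "caf\xe9"

" ASCII-only text: the two byte counts should be equal.
echo strlen(plain) strlen(iconv(plain, 'latin1', 'utf-8'))

" Text with one accented letter: the UTF-8 form should be one byte longer.
echo strlen(accent) strlen(iconv(accent, 'latin1', 'utf-8'))
```

If the other editor needs to see a difference even for an ASCII-only file,
one option is ":setlocal bomb" before writing as utf-8, so that Vim puts a
byte order mark at the start of the file.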
 

> Can anybody help, please?







#2636 From: "John Beckett" <johnb.beckett@...>
Date: Fri Nov 6, 2009 2:22 am
Subject: RE: ANSI<>UNICODE and reverse
 
Paolo wrote:
> I need to convert files between ANSI and Unicode in gvim.

See discussion in vim_use
http://groups.google.com/group/vim_use/browse_thread/thread/cc7e9076fa7d277d

John



Copyright 2009 Yahoo! Inc. All rights reserved.