Hi all, Talking of term extraction, what kind of algorithm is Rainbow using for this? Does it do any kind of part-of-speech tagging, morphological analysis...
Hi Martin, Sorry for the late answer (vacation). The Term Extraction utility uses a crude method: it breaks down the text into words, and builds a list of...
This looks good - I hope the docx filter will be ready as well, as it is a popular format. The pipeline will be ready - both event and filebase ...
When I import a tab-delimited file and later save it as TMX, Olifant will not escape any special characters in it; by special I mean in particular &, < and >,...
Hi Piotr -- I find what you are saying very strange. I have lots of &, < and > in my TMs and Olifant always writes them as &amp; &lt; &gt; etc. This is not a bug, it's...
Hi Piotr, ... I've tried: importing a tab-delimited file into an empty Olifant and the '&' seems to be read and escaped properly. The saved TMX looked ok as...
... Esther, What I mean is that they were not written as entities both in the imported txt file and in the exported TMX file. Regards, Piotr ... Esther...
... Yves, The file is too big to be sent by e-mail. First I converted it from windows-1250 to UTF-8. Then I imported it into an empty Olifant. I did some...
Hi all, I actually experienced the same behaviour the other day as Piotr is describing: imported a tab-delimited file with <, >, etc. as characters in it,...
Hi Piotr, ... Strange. I've done a few more tests with entries containing '<', '&', '>', and '"', in the UI, as well as in the TMX either saved or exported I...
Hi Joeri, ... So there has to be some kind of combination of context that creates this problem. Re-opening seems to change one thing: the '>' gets escaped to...
... try to reproduce the problem, but so far I'm stumped to guess how ... explicitly do a search and replace of "&" to "&". I think I can reproduce this...
... I've tried this and I get: <tu> <tuv xml:lang="EN-US"> <seg>eins & eins</seg> </tuv> <tuv xml:lang="FR-FR"> <seg>one & one</seg> </tuv> </tu> And...
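For anyone checking their own exports against this thread: a minimal sketch of what properly escaped segment content looks like. This is not Olifant's code; `make_tu` is a hypothetical helper, and Python's standard XML modules are used only to show that `&`, `<` and `>` in segment text have to come out as `&amp;`, `&lt;` and `&gt;` for the TMX to be well-formed.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def make_tu(segments):
    """Build a <tu> element from a {language code: segment text} mapping.

    Hypothetical helper for illustration only, not part of Olifant.
    """
    tu = ET.Element("tu")
    for lang, text in segments.items():
        tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
        seg = ET.SubElement(tuv, "seg")
        # Raw text goes in unescaped; the serializer escapes it on output.
        seg.text = text
    return tu


# Escaping a plain string by hand gives the same entities.
escaped = escape("a & b < c > d")

tu = make_tu({"EN-US": "eins & eins", "FR-FR": "a < b > c"})
xml_text = ET.tostring(tu, encoding="unicode")

# Reading the serialized text back returns the original characters.
round_trip = ET.fromstring(xml_text).find("tuv/seg").text
```

A quick way to test a saved TMX is the reverse of the last line: if an XML parser refuses the file, a raw `&` or `<` in a segment is a likely cause.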
Sure. I'll try to post a temporary RC today. -yves
There is a Candidate Release 22 now available here: https://sourceforge.net/project/showfiles.php?group_id=42949 It's actually quite stable since I only fixed...
Hi all, I know I should have entered this earlier, but possibly it is not too late for the final Release 22 ... 1. the "memory function" of the Search for and...
Hi Frank, ... This behavior is driven by the auto-complete of the combobox where the replace text is shown. Alas, there is nothing I can do about it. So I've...
Hi Yves -- I just want to jump in here to say I'm happy you're turning auto-complete off in the Find box. It's a real pain. Best, Esther
The same is true for me. The memory combo box without auto-replacement is the perfect solution :-) best regards, \frank
Hi all, There is a new build of Olifant here: http://sourceforge.net/project/showfiles.php?group_id=42949 (Candidate Release 22, dated March-02) It should have...
Hi Yves, I managed to find a repeatable way for you to track this bug: I had the same problem this morning on a very large TM (more than 250 MB), and while...
Hi Denis, Thanks for the example. I can reproduce the issue. Alas, looking at it again, I come to the same conclusion: it’s a problem with the underlying...
I am dealing with some htm files (ANSI) that contain both Cyrillic in plain form and escaped Cyrillic characters (e.g. &#1057; etc.). The problem I have is that when...
... Not sure :) If the file is ANSI it cannot have Cyrillic in plain text: they have to be escaped, if they are not they are ANSI chars. Maybe you meant the...
Hi Fabrice ... When your ANSI files are Windows-1251 (as Yves assumes) you can use any Unicode format as a 'pivot': 1. convert CP 1251 to Unicode (and resolve...
Thanks Frank, That did help, I converted a Windows-1251 ANSI file to Unicode (UTF-8), and then when I processed the file in Rainbow the file was properly...
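For the archive, a rough sketch of the pivot conversion discussed in this thread, done outside Rainbow. It assumes the input really is Windows-1251 and that "escaped" means numeric character references such as `&#1057;`; the function name `cp1251_html_to_utf8` is made up for this example. Only non-ASCII references are resolved, so markup-significant ones like `&#60;` and named entities like `&amp;` stay as they are.

```python
import re

# Matches decimal (&#1057;) and hexadecimal (&#x421;) character references.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def _resolve(match):
    """Replace a numeric reference with its character if it is non-ASCII."""
    hex_digits, dec_digits = match.groups()
    code = int(hex_digits, 16) if hex_digits else int(dec_digits)
    # Leave ASCII references alone (they may protect markup characters),
    # and skip surrogates and out-of-range values.
    if code < 128 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return match.group(0)
    return chr(code)


def cp1251_html_to_utf8(raw: bytes) -> bytes:
    """Decode Windows-1251 bytes, resolve non-ASCII references, encode UTF-8.

    Assumption: the caller knows the source encoding; nothing is detected.
    Any charset declaration inside the HTML is not updated here.
    """
    text = raw.decode("cp1251")
    text = _NCR.sub(_resolve, text)
    return text.encode("utf-8")
```

If the HTML carries a `charset=windows-1251` declaration, it would still need to be changed by hand (or by the tool) after the conversion.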