I just noticed that in the examples file, there are many duplicate id numbers. For example, in the 2010-04-25 file, line number 105311 is: A:...
3629
Jim Breen
breen_jim
Apr 5, 2010 12:19 am
2010/4/5 Stuart McGraw <smcg4191@...> ... No, as it says in the info on http://www.edrdg.org/wiki/index.php/Tanaka_Corpus "This sequence number is the...
3630
Francis Bond
f_c_bond
Apr 5, 2010 12:56 am
G'day, I would prefer that we use a combination of numbers jpn-id:eng-id so that every pair gets a unique ID. ... I will comment on that there :-). -- Francis...
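A minimal sketch of the jpn-id:eng-id idea, for illustration only. The ID values are made up and `pair_key` is a hypothetical helper, not something from the Tatoeba or WWWJDIC tools. It shows a Japanese ID that repeats across pairs while the combined key remains unique.

```python
from collections import Counter


def pair_key(jpn_id: int, eng_id: int) -> str:
    """Join the two sentence IDs into one key in the jpn-id:eng-id form."""
    return f"{jpn_id}:{eng_id}"


# Made-up (jpn_id, eng_id) pairs: Japanese ID 101 appears in two pairs.
pairs = [(101, 201), (101, 202), (103, 203)]

# Count how often each Japanese ID occurs on its own.
jpn_counts = Counter(jpn_id for jpn_id, _ in pairs)
duplicated_jpn = sorted(j for j, n in jpn_counts.items() if n > 1)

# Build the combined keys and check that none of them repeats.
keys = [pair_key(j, e) for j, e in pairs]
keys_are_unique = len(set(keys)) == len(keys)

print(duplicated_jpn)
print(keys)
print(keys_are_unique)
```

A combined key like this is only as stable as the two IDs it is built from, which is the question raised later in the thread.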
3631
Glenn Maynard
gfxmaynard
Apr 5, 2010 1:10 am
... I haven't tried to do this yet, but having a static, unique ID (preferably a single integer, not a tuple) for each entry is absolutely critical for...
3632
Francis Bond
f_c_bond
Apr 5, 2010 1:20 am
G'day, tatoeba thinks of the English and Japanese as first class objects, whereas the tuples are not --- therefore it does not have IDs for the tuples ...
3633
Glenn Maynard
gfxmaynard
Apr 5, 2010 1:37 am
... That's just a badly broken design. You can't just assign your own ID, because the next time you update, there's no reliable way to assign the same IDs to...
3634
Stuart McGraw
smcgr4444
Apr 5, 2010 2:14 am
... Harumph. :-( OK, thanks for the info. I guess I missed the tatoeba wiki page previously. I echo what Glenn and Francis said....
3635
Francis Bond
f_c_bond
Apr 5, 2010 2:58 am
G'day, ... I'm extremely grateful to Trang for building the infrastructure to edit the example sentences, as well as adding translations in new languages. So...
3636
Jim Breen
breen_jim
Apr 5, 2010 3:00 am
... Within the WWWJDIC system I could use a combination like that. The CSV dump I download includes the IDs of both the Japanese and English sentences. I chose...
3637
Jim Breen
breen_jim
Apr 5, 2010 3:45 am
... Her. トラングさんは女性だよ。 ... I can only guess Trang's DB design, but pretty much everything people want is in her downloads:...
3638
Stuart McGraw
smcgr4444
Apr 5, 2010 3:58 am
... Does not the usefulness of this depend on *how* the jpn-id's are maintained in Tatoeba? That is, the value of an id is that it tells that a specific thing...
3639
Francis Bond
f_c_bond
Apr 5, 2010 4:18 am
... I thought that the ids were stable over revisions, but I could be wrong. ... -- Francis Bond <http://www3.ntu.edu.sg/home/fcbond/> Division of Linguistics...
3640
Jim Breen
breen_jim
Apr 5, 2010 4:38 am
... [...] ... They are stable. See http://tatoeba.org/eng/sentences/show/162312 for an example of a recently-changed sentence. The "162312" stayed on as the...
3641
Jim Breen
breen_jim
Apr 10, 2010 4:32 am
Greetings, Fabrice Orgogozo (Fabrice.Orgogozo@...) emailed me about a Japanese/French mathematics glossary he has been compiling. It's at ...
3642
Paul Blay
blay_paul
Apr 11, 2010 7:55 am
I think I've suggested this before, but how about including the unique Entry ID numbers in the Edict2 file? (Or having an Edict3 file if necessary) I think...
3643
Jim Breen
breen_jim
Apr 11, 2010 9:35 am
... I could easily do this, as it's currently an option in the utility that makes the edict2 format. At present it can do it one of two ways: - simply dumps...
3644
James Rose
from_csted
Apr 11, 2010 10:36 pm
... As long as there is a file with a unique number for each entry, I'm a happy camper. Haven't built Ice Mocha 2 yet, but it's coming. It's coming. I swear...
3645
Jim Breen
breen_jim
Apr 12, 2010 11:51 am
Hi, I just want to let people know of a small change to the Translate Words function. A year or so back I fiddled the EDICT version that goes into the ...
3646
James Rose
from_csted
Apr 12, 2010 10:43 pm
... Looks very good to me. On Apr 12, 2010, at 1:51 AM, Jim Breen wrote: 勉強する from 勉強 [べんきょう] /(n,vs) (1) study/(2) diligence/.. .. and...
3647
Jim Breen
breen_jim
Apr 15, 2010 7:23 am
Greetings, I want to fill people in on where we are with moving to an online database for JMdict/EDICT, etc. We are getting close to the start of a period...
3648
Paul Blay
blay_paul
Apr 16, 2010 6:29 pm
... Well, nobody's said 'no' to this (as far as I know), so how about it? I think the 'EntL100100039' form would be more familiar to people and so less open to...
3649
Jim Breen
breen_jim
Apr 20, 2010 11:50 am
... OK. What the heck. From tomorrow the EDICT2 file will be distributed with the same final field as in the WWWJDIC edition (e.g. EntL1000130X). The "X" is...
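For anyone wanting to pick up the new final field, here is a rough sketch of pulling the ID out of a line. It assumes the field is the last slash-delimited item and has the shape "EntL", digits, then an optional "X", as in the EntL1000130X example above. The sample headword and gloss are placeholders, and `entry_id` is a hypothetical helper. The code only records whether the "X" is present and does not interpret it.

```python
import re

# Assumed shape of the final field: "EntL", digits, optional trailing "X".
ENTRY_ID = re.compile(r"/EntL(\d+)(X?)/\s*$")


def entry_id(line):
    """Return (sequence_number, has_x) for an EDICT2-style line, or None."""
    match = ENTRY_ID.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2) == "X"


# Hypothetical line: the headword and gloss are placeholders, and the
# ID is the example value quoted in the message above.
sample = "見出し [みだし] /(n) placeholder gloss/EntL1000130X/"

print(entry_id(sample))
print(entry_id("見出し [みだし] /(n) placeholder gloss/"))
```

Lines without the field return None, so the same function can be run over older copies of the file that lack the IDs.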