Does anyone know what sentence boundary algorithm LingPipe uses? Is there any paper that explains that algorithm and the implementation in LingPipe in...
... We're working on the user's guide now, and that'll contain a detailed text-based description. Should be out in a month or so. Luckily, sentence detection...
Bob Carpenter
carp@...
May 26, 2004 6:25 pm
48
Thank you for the explanation of sentence detection in LingPipe; it really helps me. Anyway, what is Andrei Mikheev's CL paper? Do I know there are two ...
... By all means, keep the questions coming. It helps us sort out our doc and interfaces. ... I should've found the reference for you; I meant the CL journal...
Bob Carpenter
carp@...
May 27, 2004 5:12 pm
52
Hi, I noticed a thread in the archives from Feb 13, 2004. I'll post a new message on this. In the text below, Mr. Carpenter claims: Adding Dictionary-Based Training Data ...
Dear Bogdan, Indeed you are correct. Thank you for your response. Unfortunately I do not know the answer to your question below. However I suspect that...
... Hello all, At this level of detail I cannot really help since this is really Bob's code. Until Bob gets back from Spain (ACL conference) all I can offer...
I believe this pertains to a number of messages that arrived when I was on an extended vacation. This is how things *should* work. I'm going to do a number ...
I just uploaded a new version of the site, including version 1.0.7 of LingPipe: http://www.aliasi.com/lingpipe It addresses some issues raised on this mailing ...
Hi, Try escaping all XML types of characters coming in since it appears LingPipe is trying to treat your document as XML. The way to do this is to convert all...
There are several parts to this answer. 1. Well-formed XML ... You need to replace instances of '&', '<', '>' or '"' with entity references "&amp;", "&lt;",...
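As a rough illustration of the escaping step described above, here is a minimal Java sketch that replaces the four characters named in the message with the predefined XML entity references. The class name XmlEscape and the method escape are hypothetical names chosen for this example; they are not part of LingPipe, and this is not necessarily how the poster or the library handles it.

```java
// Hypothetical helper, not part of LingPipe: escapes text before it is
// handed to a tool that parses its input as XML.
final class XmlEscape {

    private XmlEscape() {}

    /** Replaces '&', '<', '>' and '"' with predefined XML entity references. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;   // handled per character, so no double escaping
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: AT&amp;T said &quot;x &lt; y&quot;
        System.out.println(escape("AT&T said \"x < y\""));
    }
}
```

Because each input character is examined exactly once, the '&' that begins an emitted entity is never escaped a second time. Text that already contains entity references would be escaped again, so this should only be applied to raw, unescaped input.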
Hi, I wonder if I can find some examples that illustrate how to use NEEvaluateCommand. I'm also interested in using it for 10-fold cross validation... Please...
I saw from your next message you found what you were looking for, but just to clarify for everyone ... We can't distribute examples because we don't own any of...
I would like to retrain LingPipe for a digital library we are working on. Right now we are tagging all named entities in every document by hand. All of our...
Thanks Bob. I have got NEEvaluateCommand to work with -dictionary=DictionaryFilePath (the code of NEEvaluateCommand needs to be modified first). I wonder...
... Yes, that should be more than enough. Are you tagging different kinds of entities? The more entity types there are, the more training data you need to ...
... All of our documents are in XML format, but we are using TEI notation. Using NoteTabPro I wrote a simple clip that converts MUC to TEI notation. I believe...
... Great. That's the hardest part of the whole process. I have to do it all the time for evaluations. I looked up TEI and found more than I bargained for. I...
Bob, I have downloaded and have been experimenting with LingPipe 1.0.7, and I think I have run into a bug. If I take the demo bat file that processes a...
... Yes. Here's what Sun has to say: On Microsoft Windows platforms, the Java 2 SDK includes both the Java HotSpot Server VM and Java HotSpot Client VM....
... Sounds like the namespaces are a bug. Directory tagging may be tickling a "feature". ... LingPipe only looks at contiguous text content, and to find...
I finally got the train command to run :) Thanks a lot for your help Bob. I did receive an error after the command but I don't know if it's significant....
... Did the training work? This does not look good. What is the exact command you are using with all params? Does it work with just one training file? breck...
... The training command actually creates the model file. It will fail if the Java program is not running with sufficient permissions to create the file....
OK, I switched where I was running the command from and got it working again, but with the same error I got the first time it ran. Here is my command line and...
This is a known bug (my fault) with an easy workaround: Use "./NCHFDL" rather than "NCHFDL" in the file name. I patched it for the next release (2.0). The bug...