Hi, Having returned from vacation and almost dug my way out of my mail box, here are a couple of answers to your gracious ... The stats were computed using the...
Can these classes use getters and setters? I'm trying to script these into an output, but it doesn't work because it doesn't follow the JavaBean notation. ...
We have decided to add a new page to the web site listing LingPipe users. See http://alias-i.com/lingpipe/web/customers.html We cover commercial/paid...
Mention and MentionChain ... What exactly do you need in the way of get/set functionality? Our mention implementations are immutable, so there can't be...
I have a classification task where a body of text needs to be trained either as "favorable" or "not favorable". The vast majority (around 85%) of texts will...
... If you have a lot of data, you can reduce training load one of two ways. First, you can sample from the data (better than just taking the first 10%). If...
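For the archives, here is a minimal sketch of the first option (random sampling rather than taking the first 10%). It uses only the standard Java collections API, not LingPipe; the class name `TrainingSampler`, the method `sample`, and the fixed seed are illustrative assumptions, not anything from the LingPipe distribution.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper (not part of LingPipe): draws a random subset of training items.
public class TrainingSampler {

    // Returns a random subset holding about `fraction` of the items.
    // A fixed seed makes the sample reproducible between runs.
    public static <T> List<T> sample(List<T> items, double fraction, long seed) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0,1]");
        }
        List<T> shuffled = new ArrayList<T>(items);      // copy, so the input is untouched
        Collections.shuffle(shuffled, new Random(seed)); // uniform random permutation
        int keep = (int) Math.round(fraction * shuffled.size());
        return new ArrayList<T>(shuffled.subList(0, keep));
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<Integer>();
        for (int i = 0; i < 100; i++) {
            docs.add(i); // stand-ins for training documents
        }
        List<Integer> subset = sample(docs, 0.10, 42L);
        System.out.println(subset.size() + " of " + docs.size() + " documents kept");
    }
}
```

Shuffling a copy and taking a prefix avoids the ordering bias you get from the first 10% of a corpus, which is often sorted by date or source.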
The Named Entity Tutorial has an example program "TrainConll2002.java" where the class "Conll2002ChunkTagParser" is used. I did not see the APIs of the class...
I am new to LingPipe and trying to understand the various classes in the "com.aliasi.corpus.parsers" package. I am developing an application that requires...
... I like MUC-style XML-formatted data because you don't lose the information about whitespaces. Whitespace information is only used in the rescoring...
... Assuming this is the right reference: http://wiki.apache.org/jakarta-lucene/SpellChecker then there are substantial differences between what we're doing...
Hi Bob, Just a tiny addition: n-grams in Lucene SpellC are used to restrict search, not as Boolean filters but rather as "words" in standard Lucene TF/IDF...
... Right. This is the same thing that I did in our case study in the "Lucene in Action" book to find transliteration variants from Arabic in English (e.g. Al...
Thanks Bob, already deep in Gusfield's book (your tip a few weeks, months ago), really good one! Anyhow, back to "spell checking". This fuzzy phrase...
... I'm afraid I lost the thread here. I'm not sure where the SpanQuery comes in. For fuzzy matching, they pull out an OR (disjunctive) query over terms...
Hello, I've noticed that the MentionAnnotator.java in demos/tutorial/uima/src/com/aliasi/uima for UIMA integration seemed a little outdated and had some...
Florian Laws
florian@...
Sep 14, 2006 4:49 pm
362
Hi Bob, ... Sorry for being confused here. I went really fast forward on this one: if you search for "Fordo Baggins" you get "Frodo Baggins" correction, as it...
... Right. Basically, it's too many dependencies (quadratic) to model explicitly. With n tokens, you get n squared (n^2) dependencies without order, but only...
... Thanks so much. I forgot we even released the old UIMA code, but now we're just dumping our whole CVS snapshot into the release. I'll try to get a proper...
Thanks, I guess I now have a much better feeling for what LingPipe SCheck is doing. I skipped briefly through SpellChecking classes in LP, not too easy to...
Hi all, Please tell me if I'm missing something terribly obvious here, I'm hoping I am. Can the lingpipe.jar file be used (in my case for parts-of-speech...
... Yes, I think you're missing the lingpipe.jar and your classpath. ... Try: javac -classpath lingpipe-2.3.0.jar *.java assuming you have the lingpipe jar in...
Hi Bob, Ant, eh? Hmm, I'll download the files from Apache and take a look. Thanks! My first guess to below was that it was a classpath error as well, but I'm...
What is the best way to optimize the CorefDemo? I ran it through an open source profiler (http://jrat.sourceforge.net/) while running it on a sample document...
Looks to me like the method parseString is a little misleading. The method signature is calling for a character array in the first argument. If you put...
slo_java wrote: Now why'd you go and choose a login name like that? ... 1/4 * 100 years * 365 days/year * 24 hours/day * 60 minutes/hr * 1 doc/2.5 minutes = 5M...
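The back-of-the-envelope product above can be checked mechanically. This is only a sketch that multiplies the factors exactly as written in the message; the class and method names are made up for illustration.

```java
// Hypothetical helper: evaluates the estimate quoted in the message, factor by factor.
public class DocEstimate {

    // 1/4 * 100 years * 365 days/year * 24 hours/day * 60 minutes/hr * 1 doc/2.5 minutes
    public static double docs() {
        double years = 0.25 * 100;                 // 25 years
        double minutes = years * 365 * 24 * 60;    // 13,140,000 minutes
        return minutes / 2.5;                      // one document per 2.5 minutes
    }

    public static void main(String[] args) {
        System.out.println(docs()); // prints 5256000.0, i.e. roughly 5M
    }
}
```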
... Well, so much for our first guesses. I looked at your code more closely against our API and the problem's that you're sending in a String rather than...
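For anyone hitting the same thing: if a method wants a character array plus offsets (as the earlier message about parseString describes), a String has to be converted first. A minimal sketch in plain Java follows; `describe` is a hypothetical stand-in for such a method, not a LingPipe call, and its `(char[], int, int)` signature is an assumption.

```java
// Illustrates passing a String to a method that expects a char array slice.
public class CharArrayCall {

    // Hypothetical stand-in for an API method taking (char[] cs, int start, int end).
    // It simply returns the slice cs[start..end) as a String.
    static String describe(char[] cs, int start, int end) {
        return new String(cs, start, end - start);
    }

    public static void main(String[] args) {
        String text = "John ran home.";
        // describe(text, 0, text.length()) would not compile: String is not char[].
        char[] cs = text.toCharArray();                  // explicit conversion
        System.out.println(describe(cs, 0, cs.length));  // prints "John ran home."
    }
}
```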
Thanks everyone for the help. I did indeed get it working, and thought I'd post this additional little thought for the archives. I was fooling around and...
Hello, I've been exploring the contents of the uploaded data in the 'mention' table of the old Medline-Genia db tutorial. There seems to be a lot of: **" limb...
... First, the reason you're seeing this is that the models are very weak and local and don't do a good job of dealing with things even slightly different than...