Hi, My testing of the spellcheck part gives around 110 ms per spellcheck. The training set is pretty small and contains around 200 characters. Is it...
... We've trained spell checkers on dozens of GBs of data with a large lexicon and gotten run times of 100 queries/sec (10ms/query) for queries with averages...
Thank you very much for your reply. The data I used for the testing is as follows: " for (int i = 0; i < 100; ++i) { // trainer.train("abracadabra "); ...
... I'm not sure how eclipse is going to run the JUnit code. If it forks a new JVM for each test, I'd think it'd be even slower than you report. The test does...
Thank you very much. Following your suggestion, I re-ran the test in Eclipse and found the performance was around 18 ms per spellcheck. I will run the java ...
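A minimal sketch of how per-query timings like the ones in this thread could be measured inside a single JVM, with a warm-up phase so that class loading and JIT compilation are not counted. The class name SpellTiming, the method averageMillis, and the lambda standing in for the spell checker are all hypothetical; no LingPipe call is shown here.

```java
import java.util.function.Function;

public class SpellTiming {

    /** Returns the average milliseconds per call, measured after a warm-up phase. */
    static double averageMillis(Function<String, String> checker,
                                String query, int warmup, int runs) {
        long sink = 0;
        // Warm-up: give the JVM a chance to load classes and JIT-compile the hot path.
        for (int i = 0; i < warmup; ++i) {
            sink += checker.apply(query).length();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; ++i) {
            sink += checker.apply(query).length();
        }
        long elapsedNanos = System.nanoTime() - start;
        // Use the accumulated value so the calls cannot be optimized away.
        if (sink < 0) {
            System.out.println(sink);
        }
        return elapsedNanos / 1e6 / runs;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real spell checker (assumption, not LingPipe).
        Function<String, String> checker = s -> s.trim().toLowerCase();
        double ms = averageMillis(checker, "abracadabra ", 1000, 10000);
        System.out.printf("%.4f ms/query%n", ms);
    }
}
```

Timing the first call alone, or starting a fresh JVM per test, would fold start-up cost into the per-query figure, which may account for part of the gap between the numbers reported above.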
Dear all: I am working on a paper that needs to combine LingPipe with some other tools. However, the other tools run only on JDK 1.4.2. The latest LingPipe version can be...
Hello friends, I have been using the LingPipe demos for some time now, and this indeed seems a powerful product. But there are a few more capabilities that I would...
... What you want is to extract the relations as well as the entities in the text. This is a very hard problem in general but it can work reasonably well for...
... That's a very simple example. Real sentences look like this (first sentence of first article on the NY Times home page right now): The ruling was a rebuff...
LingPipe 3.5.0 is now out. Check it out on the home page: http://alias-i.com/lingpipe The major new addition is logistic regression (aka max entropy) ...
... You just deserialize it to produce an objective/subjective sentence classifier. ... There's nothing in LingPipe to do the crawling for you. The reason is...
Dear all, I want to use the partitionDistance(double) method for partitioning clusters. However, I cannot understand how the parameter value works for ...
... I see there's no javadoc for this method in LingPipe 3.5. Here's what I just added (it'll be in the next release). Note that I changed the argument's...
Sorry if reports to bugs@... have been getting misplaced. We switched mail providers and are still catching up with all of our forwarding. We're also...
... Doh! Cosine, as those who recall their trig probably already noticed, produces results in the range -1.0 to 1.0. I keep forgetting not all cosines are...
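To illustrate the range point with plain arithmetic, here is a small sketch that computes the cosine of two vectors directly. The class CosineRange and its cosine method are hypothetical helpers written for this example, not LingPipe code. Vectors with only non-negative components (such as raw term counts) give a cosine between 0.0 and 1.0, while vectors with mixed signs can go all the way down to -1.0.

```java
public class CosineRange {

    /** Cosine of the angle between x and y; the arrays must have equal length. */
    static double cosine(double[] x, double[] y) {
        double dot = 0.0;
        double xx = 0.0;
        double yy = 0.0;
        for (int i = 0; i < x.length; ++i) {
            dot += x[i] * y[i];
            xx += x[i] * x[i];
            yy += y[i] * y[i];
        }
        return dot / Math.sqrt(xx * yy);
    }

    public static void main(String[] args) {
        // Non-negative components: the cosine stays in [0, 1].
        System.out.println(cosine(new double[] {1, 2}, new double[] {2, 1}));
        // Opposite directions: the cosine reaches the lower bound of the range.
        System.out.println(cosine(new double[] {1, 2}, new double[] {-1, -2}));
    }
}
```

Any code that turns a cosine into a distance, for example as 1 minus the cosine, therefore needs to decide how negative values should be treated.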
I read the LingPipe documentation, which provides two classes, i.e. TfIdfClassifierTrainer and TfIdfDistance. How can I use these classes to compute TF-IDF?...
... I hope the following Java code helps: import com.aliasi.spell.TfIdfDistance; import com.aliasi.tokenizer.IndoEuropeanTokenizerFactory; import...
Dear youcef bey: Many thanks. With your help, I now know how to compute TF-IDF values in LingPipe. Best wishes to you. iwantnb ...
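For readers who only want to see what a TF-IDF weight is, here is a library-free sketch using the textbook definition: term count in the document times log(N / df), where N is the number of documents and df is the number of documents containing the term. This is an assumption made for illustration; it is not the code posted in this thread, and LingPipe's TfIdfDistance may weight terms differently. The class SimpleTfIdf and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SimpleTfIdf {

    /** Lowercases and splits on whitespace; a stand-in for a real tokenizer. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    /** Textbook TF-IDF for one document: count(term) * ln(N / df(term)). */
    static Map<String, Double> tfIdf(List<String> doc, List<List<String>> corpus) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : doc) {
            counts.merge(token, 1, Integer::sum);
        }
        Map<String, Double> weights = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int df = 0;
            for (List<String> other : corpus) {
                if (other.contains(e.getKey())) {
                    ++df;
                }
            }
            if (df == 0) {
                continue; // term never seen in the corpus, so IDF is undefined
            }
            double idf = Math.log((double) corpus.size() / df);
            weights.put(e.getKey(), e.getValue() * idf);
        }
        return weights;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
            tokenize("the cat sat"), tokenize("the dog sat"), tokenize("the bird flew"));
        System.out.println(tfIdf(corpus.get(0), corpus));
    }
}
```

A term that occurs in every document, like "the" in the sample corpus, gets weight zero because ln(N / N) is 0.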
We received a question by mail from someone who didn't mind us cross-posting the answer to the ... That's unrelated. The scalability issue there is large...
... You're right -- the current tutorial is wrong and I'll fix it for the next release. ... You're right. ... Ironically, I was just up at Columbia Wednesday...
Hello, I am trying to extract news content from HTML pages, which has turned out to be a difficult task. Each news page has content other than the news, such as related...
Hi, I'm new to LingPipe. I tried TestCluster.java with 4news-test and 4news-train data and it ran fine. However, when I use my own data, I get this error...
What's happening is that a pairwise proximity is winding up undefined rather than non-negative. If you're following our demos, the problem's most likely from a...
Hello Bob, Thanks for the prompt reply! You are right. One of the files I use is an empty file. After removing it, the demo program ran fine. TestCluster.java...
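One way an empty file could lead to an undefined proximity is sketched below, under the assumption that the proximity is a cosine over term-count vectors; the thread does not say which measure the demo actually uses. An empty document has an all-zero vector, so the cosine becomes 0.0 / 0.0, which is NaN in Java. The class EmptyDocCheck and its methods are hypothetical.

```java
public class EmptyDocCheck {

    /** Cosine over term-count vectors of equal length; NaN if either vector is all zeros. */
    static double cosine(double[] x, double[] y) {
        double dot = 0.0;
        double xx = 0.0;
        double yy = 0.0;
        for (int i = 0; i < x.length; ++i) {
            dot += x[i] * y[i];
            xx += x[i] * x[i];
            yy += y[i] * y[i];
        }
        return dot / Math.sqrt(xx * yy);
    }

    /** True if the vector has no non-zero entry, as for a document with no tokens. */
    static boolean isEmpty(double[] v) {
        for (double d : v) {
            if (d != 0.0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] emptyDoc = {0, 0, 0};
        double[] normalDoc = {1, 0, 2};
        // Undefined result: 0.0 / 0.0 evaluates to NaN.
        System.out.println(Double.isNaN(cosine(emptyDoc, normalDoc)));
        // Guard: detect empty documents before comparing them.
        System.out.println(isEmpty(emptyDoc));
    }
}
```

Filtering out empty inputs before clustering, as was done in this thread by removing the empty file, avoids the undefined value.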
Removing boilerplate is an interesting but challenging problem. And it's not one we've thought about deeply. Boilerplate detection is a confounding factor for...
... We may have had that in an older release, but it's not in our 3.5.0 release. If you're interested in clustering, there's been a lot added to the clustering...