We received a question by mail from someone who didn't mind us cross-posting the answer to the ... That's unrelated. The scalability issue there is large...
... You're right -- the current tutorial is wrong and I'll fix it for the next release. ... You're right. ... Ironically, I was just up at Columbia Wednesday...
Hello, I am trying to extract news content from HTML pages, which turned out to be a difficult task. Each news page has content other than news, such as related...
Hi, I'm new to LingPipe. I tried TestCluster.java with 4news-test and 4news-train data and it ran fine. However, when I use my own data, I get this error...
What's happening is that a pairwise proximity is winding up undefined rather than non-negative. If you're following our demos, the problem's most likely from a...
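One way a proximity can come out undefined is a document with no tokens: its vector has zero length, so the cosine becomes 0/0, which is NaN in floating point. Here is a minimal sketch of that failure and one possible guard. It is plain Java, not LingPipe code; the class name SafeCosine, its methods, and the choice to treat an empty document as maximally distant (distance 1.0) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of LingPipe.
final class SafeCosine {

    // Dot product of two sparse term-weight maps, iterating over the smaller one.
    static double dot(Map<String, Double> a, Map<String, Double> b) {
        Map<String, Double> small = a.size() <= b.size() ? a : b;
        Map<String, Double> large = (small == a) ? b : a;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : small.entrySet()) {
            Double w = large.get(e.getKey());
            if (w != null) sum += e.getValue() * w;
        }
        return sum;
    }

    // Unguarded cosine: evaluates 0.0 / 0.0 (NaN) when either map is empty.
    static double naiveCosine(Map<String, Double> a, Map<String, Double> b) {
        return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
    }

    // Guarded distance. ASSUMPTION: a zero-length vector is treated as
    // maximally distant (1.0) from everything, instead of undefined.
    static double guardedDistance(Map<String, Double> a, Map<String, Double> b) {
        double norm = Math.sqrt(dot(a, a) * dot(b, b));
        if (norm == 0.0) return 1.0;
        return 1.0 - dot(a, b) / norm;
    }

    public static void main(String[] args) {
        Map<String, Double> empty = new HashMap<>();   // stands in for an empty file
        Map<String, Double> doc = new HashMap<>();
        doc.put("cat", 2.0);
        System.out.println(Double.isNaN(naiveCosine(empty, doc)));  // prints true
        System.out.println(guardedDistance(empty, doc));            // prints 1.0
    }
}
```

Filtering empty files out before clustering is the other obvious option, and is probably the better one if an empty document carries no meaning in your data.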
Hello Bob, Thanks for the prompt reply! You are right. One of the files I use is an empty file. After removing it, the demo program ran fine. TestCluster.java...
Removing boilerplate is an interesting but challenging problem. And it's not one we've thought about deeply. Boilerplate detection is a confounding factor for...
... We may have had that in an older release, but it's not in our 3.5.0 release. If you're interested in clustering, there's been a lot added to the clustering...
Bob, ... Hmm ... odd. It's actually in my 3.5.0 download. This is the full path: C:\java\Clustering\lingpipe-3.5.0\demos\tutorial\cluster\src ... I will...
Just some feedback - I briefly tried the named entity recognition and Chinese word segmentation demos. Named entity recognition worked very well in my very brief...
Could anyone shed some light on my questions? And point it out if I'm missing something obvious, even briefly? Thanks! ===8<==============Original message...
... There is no TestCluster.java program in the 3.5.0 release. There's no reference to it in the clustering tutorial. There's not a version in our CVS...
Hello Bob, I just downloaded 3.5.0 again. You are right. It is not there. I might have created it or copied it from somewhere else. I'll email it to you...
We're happy to announce the availability of LingPipe 3.5.1. This is primarily a bug fix release and it is fully backward compatible with 3.5.0. The details of...
Hello Bob, I used LingPipe 2.4.1 to run an NER task. I found that something must be wrong with it. In my code, I used the ExactDictionaryChunker class and set the...
... something must ... class and ... case ... second ... 3.5. ... recognize ... ignored? ... Thanks for Breck's help. I must use LingPipe 2.4.1 for some...
... I can guess that'd be because in 3.0 we introduced generics and thus required the 1.5 JDK to compile. ... Sure -- just use the class from 3.5. You'll have...
Hi, I asked a similar question before, at which time the error was "Found cost=NaN" but this time it's a negative number. I double-checked and the files are...
... It's an arithmetic precision issue in computing distances. I won't be able to tell you exactly what's going on until I get more info about what you're...
Jack sent me his TestCluster.java, and the problem's in that class's definition of COSINE_DISTANCE. It turns out that TestCluster.java was just a working copy...
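The actual COSINE_DISTANCE definition from that file is not shown in this thread, so the following is only a generic sketch of how a small negative cost can arise and one common way to guard against it. Computing 1 - cosine for two nearly identical vectors can land a hair below zero, because rounding may leave the cosine slightly above 1. The class name ClampedCosineDistance and the decision to clamp at zero are assumptions, not the fix applied to TestCluster.java.

```java
// Hypothetical helper for illustration; not the TestCluster.java code.
final class ClampedCosineDistance {

    // Plain cosine of two dense vectors of equal length.
    static double cosine(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double dot = 0.0, xx = 0.0, yy = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            xx += x[i] * x[i];
            yy += y[i] * y[i];
        }
        return dot / (Math.sqrt(xx) * Math.sqrt(yy));
    }

    // 1 - cosine, clamped below at 0.0 so rounding can never produce a
    // negative distance. Zero-length vectors are NOT handled here and
    // still give NaN.
    static double distance(double[] x, double[] y) {
        return Math.max(0.0, 1.0 - cosine(x, y));
    }

    public static void main(String[] args) {
        double[] v = {0.1, 0.2, 0.3};
        System.out.println(distance(v, v) >= 0.0);  // prints true
    }
}
```

Clamping hides the rounding error rather than removing it, which is usually acceptable for clustering because the error is many orders of magnitude smaller than any meaningful distance.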
All, We are going to start hobby nights at Alias-i again. The goals are: 1) Provide help on how to best encode linguistic problems into NLP (Natural Language...
Any idea how to compute a word-vector matrix (for a given large document) using LingPipe? I checked out the package com.aliasi.spell and realized that I...
Hi Bob, First of all, thanks a bunch for your detailed response. It really helped a lot. I really don't need to compute the dot product (which is why I think the...
... You pretty much need sparse vectors for bags of words if you want to scale at all. For info on how to feed LDA, check out the last section of our...
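To make the sparse-vector point concrete: a bag of words only needs to store the tokens that actually occur in a document, not one slot per vocabulary entry. Below is a minimal sketch in plain Java. It does not use LingPipe's tokenizers or vector classes; the class name SparseBagOfWords, the whitespace tokenization, and the lowercasing are assumptions chosen to keep the example short. It needs Java 8 or later for Map.merge.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of LingPipe.
final class SparseBagOfWords {

    // Maps each token that occurs in the text to its count. Tokens that do
    // not occur take no space, which is what makes the representation sparse.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // ASSUMPTION: lowercase and split on whitespace; a real tokenizer
        // would handle punctuation and other languages.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) continue;  // split can yield an empty first token
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("the cat saw The dog"));
        // prints {cat=1, dog=1, saw=1, the=2}
    }
}
```

With a vocabulary of, say, a hundred thousand word types, a dense array per document would be almost entirely zeros, while a map like this stays proportional to the number of distinct tokens in the document.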
It seems that we can evaluate the probability of a new document? I would like to know how I can implement that using the class LatentDirichletAllocation.java...