... For our implementation, check out the com.aliasi.tokenizer package: http://alias-i.com/lingpipe/docs/api/com/aliasi/tokenizer/package-summary.html You'll...
Hi Bogdan, I am trying to get this code to work with LingPipe 2.0. But there is no "AbstractCorefCommand" class anymore. All I want is to split up a text in...
... <morgenshalbzehnindeutschland@g...> wrote: Have a look in lingpipe/demos/command/tutorial.html It lays out a bunch of common command line tasks. The...
An anonymous user sent mail asking how to compute character offsets for sentences. I decided to add some tutorial code that shows how to do this for sentences...
Hi, I've modified the source code so it handles texts with a lot of whitespace much better. You often get those texts when working with HTML files. Here is what I...
Hello, Is there a white paper specifically on LingPipe's named entity tagging method? Both of the papers at http://alias-i.com/lingpipe/papers.html seem to be just...
... Here are some other presentations available online. I'll add links and add the papers to the distro in the next release. In all cases, the LingPipe...
Hello, I'm replying to an older email from August 4th, 2005, because it contains some relevant context. ... e/lucene/analysis/StopFi ... Hear, hear :)...
... I'm working on that now. I have the code that I'll post as a demo with the next release. ... That's a good idea -- I can wrap a TokenStream to create a...
I've just discovered LingPipe; it has some great features! I have one question about usage, though: if I build a TokenizedLM, then compile it to a file and...
... Nope -- you're absolutely spot on correct in both the way the lm package does behave and how it should behave. Having just reviewed the code after reading...
LingPipe Version 2.1 Released New York, 3 October 2005 LingPipe 2.1.0 is out. This release is backward compatible with 2.0 (and in almost all cases, with 1.0...
Greetings. I was trying to run experiments with a very big model file (56.8MB) generated by a 69.9MB iob file. The training was OK (I had to increase the...
... Hi Breck, Thanks for getting back to me. Indeed, I should have provided more details. So here we go: I am using the NE detection model, using your command...
... This could be due to a lot of things. If you'd like me to try to debug it for you, could you put the BIO training file and the test file up somewhere on a...
... Hi Bob, I can put it on my webpage, but unless you prefer to do it this way, let's see if it can be solved without doing so. ... Thanks for the...
... Longer's more helpful for debugging. Thanks. ... It's a problem when the decoder can't find any viable paths. I'm going to rewrite the decoder so it can...
Found it. The problem isn't the size of the model per se, but rather the high skew (see description below) of a transition in the model interacting with...
... Yes, that's it probably! I think the way I create my training data results in skewed distributions, apart from large models. Thanks for sorting this out, ...
Final report on null returns from named-entity decoding. It does turn out to be the case that all possible continuations are being pruned. I fixed that, but...
Hi all, I recently found out about LingPipe. I implemented a naive Bayes classifier that follows the approach from Mitchell (Machine Learning) and tried it on the...
... Thanks for writing to the list -- this is a great case study. The bottom line answer to question two is that LingPipe's classifier is better because of...
Hi all, First I want to thank Bob for his detailed and competent answer. It really has been useful. However I want to make some clarifications: 1. I do not...
... That's an unusual approach, but mathematically sound. (In fact, more sound than many attempts to deal with unknown tokens.) ... This can skew the result...
LingPipe 2.1.1 is out (see the Download Page). This release patches all bugs of which we are aware. Major bug fixes include: (a) some MEDLINE handler...
... Most of the tools out there, like LingPipe, can be trained on arbitrary data. The reason we don't distribute models is that neither the CoNLL licenses nor...