Hi all, this is my first post, I'm a new LingPipe user, but very impressed so far. Kudos on an excellent piece of software! As an early exercise, I'm trying to...
... May I ask what the basis for the classification is? ... OK -- you've got the right intuition here. At a high level, our text classifiers (and everyone ...
Hi Bob, ... Yup: legit blog urls vs. spam-blog urls. I ran across a paper describing classification by training exclusively on tokenized versions of the URLs. ...
David, Do you have a link to the paper you are referring to? I could use this to enhance spammer detection in Simpy (see sig). Thanks, Otis -- Simpy --...
... Exactly right. My bad for not having the reference available. I wasn't trying to be mysterious, I didn't understand it would be of general interest here....
Bob, as a followup to your comments below, I've been looking at the javadoc for ScoredPrecisionRecallEvaluation. I see that it provides "an evaluation of...
... Nope -- that's the right definition. The "operating characteristic" is implicit -- it's just a ranked evaluation. It basically tells you what the...
LingPipe 2.4.1 Released ======================= This is a patch release replacing LingPipe 2.4.0. It patches all bugs that have been reported to us; thanks to...
Hi! I'm working on a system to try to automate the analysis of customer satisfaction based on a database of their e-mail correspondences. So far we've had good...
... We've had requests to do other scalar classifiers, like reading level classification (on a grade-in-school scale). This is a general problem in statistical...
Hi All, I am new to Dynamic Model Language Classification. In fact, my knowledge is very limited on how to use LingPipe for classification, so I am trying...
... Let me reformulate your question to see if I understand it. You have a set S1 of docs in language L1 and a set S2 of docs in language L2. Now you want to...
... It sounds like you're using the chunker: $LINGPIPE/demos/models/ne-en-bio-genia.TokenShapeChunker That is, indeed, a TokenShapeChunker. I'd suggest not...
Hi Bob, Thank you for the big tip earlier. I have some more questions about training Chunkers (LmRescoringChunker, TokenShapeChunker). I made a set of tagged ...
Hi there, I am pretty new to the NLP sector, so please forgive my perhaps dumb questions... What I am searching for is a way to retrieve all kinds of relations...
Sorry for not getting to this sooner -- I've been on vacation for the last week and a half. ... Let me help reset your expectations. First, if you only want to...
... What you're looking for is some kind of relational tagging or parsing. I'm afraid that we don't have anything to do that in LingPipe. The within-document...
Is anyone else having trouble compiling LingPipe 3.0? I was hoping it wouldn't be a disaster, but I'm now seeing the bugs in the various compilers as I get...
Hi All! I am starting to learn the Language Modeling approach for Classification. I understand that one of the main advantages of LM is that it does not require text...
... I'd say the main advantage of the character LM approach is that it can learn from parts of tokens as well as across tokens, which means it's far less...
Hi Lena, A solution could be to embed your LingPipe learner / classifier inside a framework like UIMA or GATE. That would allow you to run any number of processors...
... LingPipe was designed so that it'd be easy to integrate into larger integration frameworks like UIMA or GATE. LingPipe's models are all serializable and...
... Didn't you already publish a UIMA wrapper for at least the chunkers? I have successfully used the LingPipe NER components with UIMA 1.3. Regards, Florian...