Thank you very much. The application is to find law firm and accounting firm names in certain financial and legal documents. We are looking at training our...
985
Bob Carpenter
colloquialdo...
Aug 2, 2010 9:33 pm
Thanks for the rundown. CRFs are your best bet for accuracy, assuming you have enough training data. But they're much fiddlier to train, and exploiting the...
986
moges ahmed
moges_m
Aug 3, 2010 10:08 am
Dear members, I have a problem running the "Evaluatepos.java" class. Please see the following code: ... ... String...
987
moges ahmed
moges_m
Aug 4, 2010 11:45 am
Dear members, I have a problem running the "Evaluatepos.java" class. Please see the following code: ... ... String...
988
reckb
Aug 4, 2010 1:22 pm
Are you running our tutorial? Look at: http://alias-i.com/lingpipe/demos/tutorial/posTags/read-me.html It doesn't look like you are using that code from the...
989
Bob Carpenter
colloquialdo...
Aug 4, 2010 7:05 pm
Once you understand the demo, you just want to go ahead and build your own corpus. You don't need to use reflection like on the command line. It looks like...
990
Maneesha Jain
jain_maneesha
Aug 13, 2010 12:52 am
Hi, I'm working on a prototype where I need to extract the business/POI name from a user query that might have both name and the address of the POI. There ...
991
sanaz
sanazjabbari
Aug 13, 2010 3:36 pm
Dear Bob Carpenter, I have been using your classifiers for sentiment detection and have been very pleased with the quality of your code and the results. So...
992
colloquialdotcom
colloquialdo...
Aug 13, 2010 7:46 pm
I'm not clear what you mean by aspect here given your example. When people talk about aspects of sentiment, they usually mean aspects of what's being...
993
colloquialdotcom
colloquialdo...
Aug 13, 2010 7:47 pm
Is your application search or database creation? That is, how much recall and precision do you need? We don't have anything that's more scalable for queries....
994
reckb
Aug 14, 2010 2:38 pm
... Bob has already responded to the fuzzy named entity detection approach to this. Is it appropriate to view this as a query spell checking problem? That may...
995
reckb
Aug 14, 2010 2:43 pm
... Do you have evaluation/training data? That helps considerably in understanding what problem you are trying to solve. Have a look at my blog post on our...
996
kasparfischer
Aug 16, 2010 1:47 pm
Hi everybody, I have a list L of names and several plaintext documents. I want to find all occurrences of the names from L in the documents. LingPipe's...
997
kasparfischer
Aug 16, 2010 2:27 pm
The rich text editor messed this up, it seems. This link contains the post with the Chinese characters I used: http://pastebin.com/twGRUJ8h...
998
Bob Carpenter
colloquialdo...
Aug 16, 2010 5:06 pm
I think you're asking about the tokenizers that underlie the exact dictionary chunker. And yes, you will only be able to find phrases that start on tokens and...
999
xiaojia0459
Aug 20, 2010 1:52 pm
Hi, everyone! I am doing research on text subjectivity. I have experimented with the LingPipe Subjectivity Analysis tool (SubjectivityBasic.java), but I also...
1000
Bob Carpenter
colloquialdo...
Aug 20, 2010 9:47 pm
... How to do this will depend on your application. You could run the subjectivity classifier over whole documents instead of sentences. But I doubt that'll ...
1001
Bob Carpenter
colloquialdo...
Aug 25, 2010 8:09 pm
(<anon>: I cross-posted the answer to your question [attached below] to our mailing list.) Yes, it's possible to use LingPipe to extract feature vectors,...
1002
ross12177
Sep 17, 2010 2:15 am
Hi everyone, I am working on a research project which has to implement the following: 1. Allow the user to define topics of interest, where a "topic" is a set...
1003
Jay Bartot
jbartot
Sep 17, 2010 2:15 am
Thanks. I got this working with the following code. Next question is, how do I incorporate NGrams (tokens and characters) and TF/IDFs? //...
1004
Bob Carpenter
colloquialdo...
Sep 20, 2010 6:38 pm
Sorry the mailing list messes up code so badly. Luckily, the answer's simple. Just use the appropriate tokenizer factory. The next version of LingPipe will have a...
1005
Bob Carpenter
colloquialdo...
Sep 20, 2010 6:47 pm
Semi-supervised LDA isn't that hard, but it's not supported by LingPipe. All you need to do is just pin down some of the values (of the model's variables) and...
1006
gpalt
Sep 21, 2010 2:02 pm
Hello, using a prior version of LingPipe (3.8), I've built a language model classifier, based on the DynamicLMClassifier<TokenizedLM> class using 1-grams...
1007
Bob Carpenter
colloquialdo...
Sep 21, 2010 5:28 pm
The short answer is that no, Witten-Bell as employed in our Dynamic-LM-based naive-Bayes classifiers doesn't reduce to add-one (aka Laplace smoothing). ...
1008
gpalt
Sep 22, 2010 12:18 pm
Dear Bob, thank you very much for your extensive answer. It has been very informative. I realise that for Naive Bayes with Laplace smoothing, I would use the...
1009
Bob Carpenter
colloquialdo...
Sep 22, 2010 5:21 pm
I'm happy to try to clarify. You have complete control over all the configuration. The constructor for DynamicLM (as opposed to the convenience factory...
1010
Kaw Demba
kaw_trade
Sep 29, 2010 12:15 am
Hi all, I am testing the LingPipe classifier on my data. I mapped synsets from Wikipedia to WordNet and ran the classification. My results look like this: ...
1011
Bob Carpenter
colloquialdo...
Sep 29, 2010 12:32 am
No, it's very unlikely three different tasks would produce exactly the same result. Are you sure you didn't run the same data three times? I'm not sure where...
1012
Bob Carpenter
colloquialdo...
Sep 29, 2010 12:53 am
... Thanks for pointing it out. It'll be fixed in the next release, which should be in the next month or so. For now, I've put a new read-me.html and...
1013
Yogesh
yogesh.pandi...
Sep 29, 2010 7:02 pm
Hello, I need some help with finding patterns of tokens and labels from text. I have some number of documents. Each document has a set of tokens and labels...