The bug ... There are bugs in both single and complete link clustering that arise only when there are elements that are further than the distance bound from...
1093
Joyce Neroni
joyce.neroni
Mar 20, 2011 9:32 pm
Hello, I am working with LingPipe on a Dutch database and I have a question. I have a database with a lot of texts (12 million rows), and I want to classify...
1094
Bob Carpenter
colloquialdo...
Mar 21, 2011 4:04 am
You are right about the way LingPipe classifiers (and most machine-learning-based classifiers) work -- they assume a finite set of categories that are taken to...
1095
Joyce Neroni
joyce.neroni
Mar 21, 2011 9:26 am
Thank you Bob for your quick reply! My apologies, I think I didn't explain my problem very well. That's because I wanted to explain the problem in short, so I...
1096
Bob Carpenter
colloquialdo...
Mar 21, 2011 6:33 pm
These problems are notoriously difficult to talk about because of all the background assumptions and terminology. First, you're absolutely right about what's ...
1097
Bob Carpenter
colloquialdo...
Mar 21, 2011 11:00 pm
I just realized that you didn't want to have to rebuild everything with binary classifiers. The skew in counts is going to be a problem no matter what you do....
1098
Joyce Neroni
joyce.neroni
Mar 22, 2011 12:59 pm
Thank you so much, I think I understand better now why the way I wanted to use the program is a problem in this case. So now I decided to use the binary...
1099
reckb
Mar 22, 2011 2:14 pm
... OK, good, we have a clear problem definition. The goal of this project is to measure tweets for your categories--you are building a measuring device for...
1100
Ben McCann
chengas123
Mar 27, 2011 7:08 pm
Hi, I was trying to make sense of the named entity recognition examples. It ... Can that code be removed or am I missing how it is used? Thanks, Ben
1101
reckb
Mar 27, 2011 8:31 pm
... It sure looks like it can be removed. There are no references to the variable whitespaces outside of this section, so it must be from an earlier effort at...
1102
Bob Carpenter
colloquialdo...
Mar 27, 2011 10:33 pm
Breck's being polite -- that's my bad. I've been sloppy overall in writing parsers for particular corpora, which is one reason I took them out of the core of...
1103
charles_t_jackson
charles_t_ja...
Mar 28, 2011 8:08 pm
We're using a LogisticRegressionClassifier, and I'm trying to find a way to speed up the training process for it. At a corpus size of 10,000 documents, the...
1104
reckb
Mar 28, 2011 8:31 pm
... Unfortunately Logistic Regression training is single threaded. I believe there is some research on how to make it multi-threaded but it is a hard problem...
1105
reckb
Mar 28, 2011 8:35 pm
... You also might want to try either the Naive Bayes classifier or the language model classifiers which train up much quicker. It sounds like you might have...
1106
Bob Carpenter
colloquialdo...
Mar 28, 2011 10:04 pm
Before answering the technical question, let me make a couple of suggestions. First, if you have a K-ary classifier, you might want to consider using K binary...
1107
charles_t_jackson
charles_t_ja...
Mar 29, 2011 4:46 pm
Thanks for the responses Breck and Bob. - Charlie...
1108
nlp_hhcib
Mar 30, 2011 6:56 pm
I am able to compile a Naive Bayes classifier fine, but when I try to load the model for use I receive the runtime error: java.lang.ClassCastException:...
1109
Bob Carpenter
colloquialdo...
Mar 30, 2011 7:14 pm
First, you may want to check out the TradNaiveBayesClassifier, which is implemented more like a textbook naive Bayes classifier rather than being implemented...
1110
Heather Dewey-Hagborg
hdeweyh
Apr 2, 2011 10:09 pm
I am trying to create an application that builds a word-based language model incrementally as it is exposed to word sequences. The model needs to be saved in...
1111
reckb
Apr 2, 2011 10:27 pm
... There is a compileTo(ObjectOutput objOut) method for TokenizedLM that allows for serialization. Deserializing, however, does not yield a trainable TokenizedLM,...
1112
reckb
Apr 2, 2011 10:35 pm
1113
Heather Dewey-Hagborg
hdeweyh
Apr 2, 2011 11:17 pm
Yeah, unfortunately not. I need to keep training it after serializing...
1114
Bob Carpenter
colloquialdo...
Apr 4, 2011 4:14 pm
As I added more documentation for the next release, I realized that what I said last time was wrong. The compiled form of a NaiveBayesClassifier is an...
1115
reckb
Apr 4, 2011 11:08 pm
How Techniques Behind NLP (Natural Language Processing) Driven Observations Influence the Quality of Observations for Use in Financial Models Breck Baldwin,...
1116
Heather Dewey-Hagborg
hdeweyh
Apr 8, 2011 10:28 pm
I got about as far with this example as I am going to for a bit. It is posted here: ...
1117
reckb
Apr 13, 2011 3:30 pm
All, We have 5 MBA students focused on improving LingPipe products and services. They will need about 15 minutes of your time to fill out an anonymous online...
1118
reckb
Apr 13, 2011 3:41 pm
I forgot to mention that we are interested in all types of licensees, Royalty Free to Enterprise. Thanks, Breck...
1119
Dave Lange
melinasotnos
Apr 13, 2011 4:24 pm
Could this be consulting on NLP/machine learning implementations in general, or does it have to be on LingPipe details? From following the group I know you guys...
1120
reckb
Apr 13, 2011 5:15 pm
... I am a better computational linguist than machine learning guru but I would not restrict solutions/feedback to just LingPipe tools. We point customers to...
1121
sangeeta_ducs
Apr 19, 2011 5:04 pm
Hi, I am using the DLM classifier for my project. Can somebody tell me where I can find a tutorial on DLM (as used in LingPipe)? Thanks...