LingPipe

Messages

Messages 610 - 639 of 777
610
Hi, I asked a similar question before, at which time the error was "Found cost=NaN" but this time it's a negative number. I double-checked and the files are...
Jack L
jlist9
Jul 2, 2008
9:32 pm
611
... It's an arithmetic precision issue in computing distances. I won't be able to tell you exactly what's going on until I get more info about what you're...
Bob Carpenter
colloquialdo...
Jul 2, 2008
11:39 pm
612
Jack sent me his TestCluster.java, and the problem's in that class's definition of COSINE_DISTANCE. It turns out that TestCluster.java was just a working copy...
Bob Carpenter
colloquialdo...
Jul 3, 2008
12:10 am
613
Hello Bob, Thank you for the quick response. This fixed the problem. -- Best regards, Jack...
Jack L
jlist9
Jul 3, 2008
6:36 am
614
All, We are going to start hobby nights at Alias-i again. The goals are: 1) Provide help on how to best encode linguistic problems into NLP (Natural Language...
reckb
Jul 11, 2008
4:58 pm
615
Any idea how to compute a word-vector matrix (for a given large document) using LingPipe? I checked out the package com.aliasi.spell and realized that I...
prasen_bea
Jul 12, 2008
4:39 pm
616
The simplest implementation, which is actually implicit (doesn't create a Matrix or even Vector objects of the features), is in: ...
Bob Carpenter
colloquialdo...
Jul 14, 2008
4:10 pm
617
Hi Bob, First of all, thanks a bunch for your detailed response. It really helped a lot. I really don't need to compute the dot product (which is why I think the...
prasenjit mukherjee
prasen_bea
Jul 15, 2008
12:49 pm
618
... You pretty much need sparse vectors for bags of words if you want to scale at all. For info on how to feed LDA, check out the last section of our...
Bob Carpenter
colloquialdo...
Jul 15, 2008
2:59 pm
619
It seems that we can evaluate the probability of a new document? I would like to know how I can implement that using the class LatentDirichletAllocation.java...
yezh0716
Jul 20, 2008
5:27 pm
620
Did you mean how to get the topic probability distribution for a given unseen/new document? The following are the steps to get that: First you need to get an...
prasenjit mukherjee
prasen_bea
Jul 21, 2008
2:00 pm
621
Prasenjit's right about getting a point topic estimate for new docs. I'm afraid I didn't implement the probability estimate for new docs. Nor do I have any...
Bob Carpenter
colloquialdo...
Jul 21, 2008
5:57 pm
622
Hi there, we use the dictionary chunkers as UIMA components with very large dictionaries. We plan to use a dictionary as a singleton in a multithreaded setting...
rico.la77
Jul 23, 2008
3:28 pm
623
... Nice. Singleton annotators are our recommended deployment strategy for all of our components. ... Like all the LingPipe components, they're read-write...
Bob Carpenter
colloquialdo...
Jul 23, 2008
5:44 pm
624
... Yes. But it has to be defined at the factory level. We really need a tutorial on this. If you have a filtered tokenizer FilterTokenizer with the obvious ...
Bob Carpenter
colloquialdo...
Aug 20, 2008
8:09 pm
625
Hello, Currently I have a scenario where a list of curse words is put into a trie, and then words entered by users (not text, but individual words) are run...
cambazz
Sep 2, 2008
8:08 am
626
I'm afraid edit distance won't be sufficient due to the example you mention. This isn't an implementation issue, though it will slow down as you get more and...
Bob Carpenter
colloquialdo...
Sep 2, 2008
2:59 pm
627
I'm a little confused. I understand that when using the approximate matcher, the speed goes down as the size of the dictionary grows. Question: approximate matcher tries to...
cambazz
Sep 4, 2008
2:25 am
628
May I add that the language is Turkish. I have looked at both Soundex and Metaphone, but I think these are language dependent. I could have a list of clean words,...
cambazz
Sep 4, 2008
2:38 am
629
... The results won't be equivalent. The string vs. string edit distance matches the whole strings against each other, whereas the approximate chunker looks...
Bob Carpenter
colloquialdo...
Sep 4, 2008
5:57 pm
630
Dear Bob, Thank you for your explanatory post. What I have done is to only check words with more than 6+ inputs, with a max distance of 2. It works fairly...
cambazz
Sep 4, 2008
9:43 pm
631
Thanks to an anonymous bug reporter for submitting these issues with $LINGPIPE/demos/tutorial/posTags First the typo: The read-me.html gives the wrong...
Bob Carpenter
colloquialdo...
Sep 15, 2008
6:35 pm
632
Has anyone successfully compiled LingPipe 3.5.1 using GNU's JDK? We received an error report for some classes that work just fine in the Sun JDKs. Here are...
Bob Carpenter
colloquialdo...
Sep 24, 2008
4:22 pm
633
We just rolled out LingPipe 3.6.0. As usual, find it on our homepage: http://alias-i.com/lingpipe Here are the details: Intermediate Release The latest...
Bob Carpenter
colloquialdo...
Oct 2, 2008
7:43 pm
634
Dear All, my name is Marco, and I'm new to this group and to LingPipe, so please do not hate me for some silly questions :-) My research project requires to...
marco turchi
marco.turchi2
Oct 14, 2008
5:12 pm
635
... We don't distribute any data, but our named entity tutorial points to some sources of data. ELRA and LDC also distribute data, but it's expensive. Most...
Bob Carpenter
colloquialdo...
Oct 14, 2008
5:17 pm
636
Dear Bob, my problem is multilingual in the sense that I handle documents that can be written in German, French, Spanish and so on; each document is written...
marco turchi
marco.turchi2
Oct 14, 2008
5:33 pm
637
... Nothing free and fast, I'm afraid. We don't have corpora in French or German. Spanish is easy -- you can get it from the CoNLL data. You can get Spanish...
Bob Carpenter
colloquialdo...
Oct 14, 2008
5:45 pm
638
Hello, I'm drawing a blank. In the context of spell checking, is the edit distance used between the user-entered term and the suggested term, or the reverse? I...
imoulinier
Oct 23, 2008
2:42 pm
639
... I should make this clearer in the doc for each operation. ... Right. It's the noisy-channel setup, so it's edits going from the suggested term to the...
Bob Carpenter
colloquialdo...
Oct 23, 2008
5:00 pm