LingPipe
Messages

Messages 624 - 653 of 777
624
... Yes. But it has to be defined at the factory level. We really need a tutorial on this. If you have a filtered tokenizer FilterTokenizer with the obvious ...
Bob Carpenter
colloquialdo...
Aug 20, 2008
8:09 pm
625
Hello, Currently I have a scenario where a list of curse words is put to a trie, and then words entered by users (not text, but individual words) are run...
cambazz
Sep 2, 2008
8:08 am
626
I'm afraid edit distance won't be sufficient due to the example you mention. This isn't an implementation issue, though it will slow down as you get more and...
Bob Carpenter
colloquialdo...
Sep 2, 2008
2:59 pm
627
Little confused, I understand when using approximate matcher, as the size of the dictionary grows the speed goes down. Question: approximate matcher tries to...
cambazz
Sep 4, 2008
2:25 am
628
May I add that the language is Turkish. I have looked at both Soundex and Metaphone, but I think these are language dependent. I could have a list of clean words,...
cambazz
Sep 4, 2008
2:38 am
629
... The results won't be equivalent. The string vs. string edit distance matches the whole strings against each other, whereas the approximate chunker looks...
Bob Carpenter
colloquialdo...
Sep 4, 2008
5:57 pm
630
Dear Bob, Thank you for your explanatory post. What I have done was to only check words with more than 6+ inputs, with a max distance of 2. It works fairly...
cambazz
Sep 4, 2008
9:43 pm
631
Thanks to an anonymous bug reporter for submitting these issues with $LINGPIPE/demos/tutorial/posTags First the typo: The read-me.html gives the wrong...
Bob Carpenter
colloquialdo...
Sep 15, 2008
6:35 pm
632
Has anyone successfully compiled LingPipe 3.5.1 using GNU's JDK? We received an error report for some classes that work just fine in the Sun JDKs. Here are...
Bob Carpenter
colloquialdo...
Sep 24, 2008
4:22 pm
633
We just rolled out LingPipe 3.6.0. As usual, find it on our homepage: http://alias-i.com/lingpipe Here are the details: Intermediate Release The latest...
Bob Carpenter
colloquialdo...
Oct 2, 2008
7:43 pm
634
Dear All, my name is Marco, and I'm new to this group and a new LingPipe user, so do not hate me for some silly questions :-) My research project requires to...
marco turchi
marco.turchi2
Oct 14, 2008
5:12 pm
635
... We don't distribute any data, but our named entity tutorial points to some sources of data. ELRA and LDC also distribute data, but it's expensive. Most...
Bob Carpenter
colloquialdo...
Oct 14, 2008
5:17 pm
636
Dear Bob, my problem is multilingual in the sense that I handle documents that can be written in German, French, Spanish and so on; each document is written...
marco turchi
marco.turchi2
Oct 14, 2008
5:33 pm
637
... Nothing free and fast, I'm afraid. We don't have corpora in French or German. Spanish is easy -- you can get it from the CoNLL data. You can get Spanish...
Bob Carpenter
colloquialdo...
Oct 14, 2008
5:45 pm
638
Hello- I'm having a blank. In the context of spell checking, is the edit distance used between the user-entered term and the suggested term, or the reverse? I...
imoulinier
Oct 23, 2008
2:42 pm
639
... I should make this clearer in the doc for each operation. ... Right. It's the noisy-channel setup, so it's edits going from the suggested term to the...
Bob Carpenter
colloquialdo...
Oct 23, 2008
5:00 pm
640
As of 3.0, the chunking interface completely changed so it's no longer backward compatible with 2.x code. The last version of LingPipe to support the...
Bob Carpenter
colloquialdo...
Oct 28, 2008
3:25 pm
641
I recently downloaded LingPipe 3.6.0 and tried the ChineseToken tutorial sample. I always got a NullPointerException. My environment is: - Windows XP - Java...
Sue Chen
suelingpipe
Nov 12, 2008
8:21 pm
642
... It sure does. Thanks for the detailed bug report. The culprit is the following file: $LINGPIPE/src/com/aliasi/spell/CompiledSpellChecker.java The method...
Bob Carpenter
colloquialdo...
Nov 13, 2008
9:12 pm
643
Thanks much, Bob. The patch fixed my problem. ...
Sue Chen
suelingpipe
Nov 14, 2008
4:38 pm
644
Hi Bob, I have a question on model quality. I used the ChineseToken sample to generate a words-zh-as.CompiledSpellChecker model, which has size 78,303KB. I...
Sue Chen
suelingpipe
Nov 14, 2008
6:11 pm
645
... The other way to control model size is to take longer n-grams and prune out low-count sequences. If you follow the tutorial, you'll see where we run standard...
Bob Carpenter
colloquialdo...
Nov 15, 2008
2:08 am
646
Hi Bob, Thanks for replying. Does a longer n-gram model mean more accuracy? How do I prune out low-count sequences from a model using LingPipe? I have some...
Sue Chen
suelingpipe
Nov 18, 2008
7:49 pm
647
... Usually longer n-grams means more accuracy up to a point at which accuracy plateaus. Longer n-grams can overfit in some situations compared to shorter...
Bob Carpenter
colloquialdo...
Nov 19, 2008
6:12 pm
648
Thanks, Bob. The goal of making English Chinese word alignment is to create some TMX files for "translation memory" tools used by translators. We have some MT...
Sue Chen
suelingpipe
Nov 20, 2008
3:36 pm
649
... We did this for Chinese in the past by extending sentences.HeuristicSentenceModel with the appropriate end tokens for Chinese and using the...
Bob Carpenter
colloquialdo...
Nov 20, 2008
7:28 pm
650
LingPipe 3.7.0 is now available from: http://alias-i.com/lingpipe The only significant change is an update to the MEDLINE DTDs used by the MEDLINE parser....
Bob Carpenter
colloquialdo...
Dec 12, 2008
9:45 pm
651
LingPipe 3.7.0 will generate a warning when compiling all jars because of an issue with non-ASCII chars in one of our java files. It works with Windows...
Bob Carpenter
colloquialdo...
Jan 7, 2009
11:38 pm
652
Hello, I was looking at NGramBoundaryLM and wondering why exactly those begin and end characters are needed and when you'd use NGramBoundaryLM instead of...
Otis Gospodnetic
otis_gospodn...
Jan 29, 2009
9:55 pm
653
... The spelling corrector and HMMs wrap this all up the right way by default. Classifiers require you to specify which to use, and the decision is based on...
Bob Carpenter
colloquialdo...
Jan 29, 2009
11:53 pm