LingPipe

Group Information

  • Members: 470
  • Category: Open Source
  • Founded: Oct 8, 2003
  • Language: English

Messages

Messages 623 - 652 of 1477
623 Bob Carpenter
colloquialdo...
Jul 23, 2008
5:44 pm
... Nice. Singleton annotators are our recommended deployment strategy for all of our components. ... Like all the LingPipe components, they're read-write...
624 Bob Carpenter
colloquialdo...
Aug 20, 2008
8:09 pm
... Yes. But it has to be defined at the factory level. We really need a tutorial on this. If you have a filtered tokenizer FilterTokenizer with the obvious ...
625 cambazz Sep 2, 2008
8:08 am
Hello, currently I have a scenario where a list of curse words is put into a trie, and then words entered by users (not text, but individual words) are run...
626 Bob Carpenter
colloquialdo...
Sep 2, 2008
2:59 pm
I'm afraid edit distance won't be sufficient due to the example you mention. This isn't an implementation issue, though it will slow down as you get more and...
627 cambazz Sep 4, 2008
2:25 am
I'm a little confused. I understand that when using the approximate matcher, the speed goes down as the size of the dictionary grows. Question: approximate matcher tries to...
628 cambazz Sep 4, 2008
2:38 am
May I add that the language is Turkish. I have looked at both Soundex and Metaphone, but I think these are language dependent. I could have a list of clean words,...
629 Bob Carpenter
colloquialdo...
Sep 4, 2008
5:57 pm
... The results won't be equivalent. The string vs. string edit distance matches the whole strings against each other, whereas the approximate chunker looks...
630 cambazz Sep 4, 2008
9:43 pm
Dear Bob, Thank you for your explanatory post. What I have done was to only check words with more than 6+ inputs, with a max distance of 2. It works fairly...
631 Bob Carpenter
colloquialdo...
Sep 15, 2008
6:35 pm
Thanks to an anonymous bug reporter for submitting these issues with $LINGPIPE/demos/tutorial/posTags First the typo: The read-me.html gives the wrong...
632 Bob Carpenter
colloquialdo...
Sep 24, 2008
4:22 pm
Has anyone successfully compiled LingPipe 3.5.1 using GNU's JDK? We received an error report for some classes that work just fine in the Sun JDKs. Here are...
633 Bob Carpenter
colloquialdo...
Oct 2, 2008
7:43 pm
We just rolled out LingPipe 3.6.0. As usual, find it on our homepage: http://alias-i.com/lingpipe Here are the details: Intermediate Release The latest...
634 marco turchi
marco.turchi2
Oct 14, 2008
5:12 pm
Dear All, my name is Marco, and I'm new to this group and a new LingPipe user, so please do not hate me for some silly questions :-) My research project requires to...
635 Bob Carpenter
colloquialdo...
Oct 14, 2008
5:17 pm
... We don't distribute any data, but our named entity tutorial points to some sources of data. ELRA and LDC also distribute data, but it's expensive. Most...
636 marco turchi
marco.turchi2
Oct 14, 2008
5:33 pm
Dear Bob, my problem is multilingual in the sense that I handle documents that can be written in German, French, Spanish and so on; each document is written...
637 Bob Carpenter
colloquialdo...
Oct 14, 2008
5:45 pm
... Nothing free and fast, I'm afraid. We don't have corpora in French or German. Spanish is easy -- you can get it from the CoNLL data. You can get Spanish...
638 imoulinier Oct 23, 2008
2:42 pm
Hello- I'm having a blank. In the context of spell checking, is the edit distance used between the user-entered term and the suggested term, or the reverse? I...
639 Bob Carpenter
colloquialdo...
Oct 23, 2008
5:00 pm
... I should make this clearer in the doc for each operation. ... Right. It's the noisy-channel setup, so it's edits going from the suggested term to the...
640 Bob Carpenter
colloquialdo...
Oct 28, 2008
3:25 pm
As of 3.0, the chunking interface completely changed so it's no longer backward compatible with 2.x code. The last version of LingPipe to support the...
641 Sue Chen
suelingpipe
Nov 12, 2008
8:21 pm
I recently downloaded LingPipe 3.6.0 and tried the ChineseToken tutorial sample. I always got a NullPointerException. My environment is: - Windows XP - Java...
642 Bob Carpenter
colloquialdo...
Nov 13, 2008
9:12 pm
... It sure does. Thanks for the detailed bug report. The culprit is the following file: $LINGPIPE/src/com/aliasi/spell/CompiledSpellChecker.java The method...
643 Sue Chen
suelingpipe
Nov 14, 2008
4:38 pm
Thanks much, Bob. The patch fixed my problem. ...
644 Sue Chen
suelingpipe
Nov 14, 2008
6:11 pm
Hi Bob, I have a question on model quality. I used the ChineseToken sample to generate a words-zh-as.CompiledSpellChecker model, which has size 78,303KB. I...
645 Bob Carpenter
colloquialdo...
Nov 15, 2008
2:08 am
... The other way to control model size is take longer n-grams and prune out low-count sequences. If you follow the tutorial, you'll see where we run standard...
646 Sue Chen
suelingpipe
Nov 18, 2008
7:49 pm
Hi Bob, Thanks for replying. Does a longer n-gram model mean more accuracy? How do I prune out low-count sequences from a model using LingPipe? I have some...
647 Bob Carpenter
colloquialdo...
Nov 19, 2008
6:12 pm
... Usually longer n-grams means more accuracy up to a point at which accuracy plateaus. Longer n-grams can overfit in some situations compared to shorter...
648 Sue Chen
suelingpipe
Nov 20, 2008
3:36 pm
Thanks, Bob. The goal of making an English-Chinese word alignment is to create some TMX files for "translation memory" tools used by translators. We have some MT...
649 Bob Carpenter
colloquialdo...
Nov 20, 2008
7:28 pm
... We did this for Chinese in the past by extending sentences.HeuristicSentenceModel with the appropriate end tokens for Chinese and using the...
650 Bob Carpenter
colloquialdo...
Dec 12, 2008
9:45 pm
LingPipe 3.7.0 is now available from: http://alias-i.com/lingpipe The only significant change is an update to the MEDLINE DTDs used by the MEDLINE parser....
651 Bob Carpenter
colloquialdo...
Jan 7, 2009
11:38 pm
LingPipe 3.7.0 will generate a warning when compiling all jars because of an issue with non-ASCII characters in one of our Java files. It works with Windows...
652 Otis Gospodnetic
otis_gospodn...
Jan 29, 2009
9:55 pm
Hello, I was looking at NGramBoundaryLM and wondering why exactly those begin and end characters are needed and when you'd use NGramBoundaryLM instead of...

Copyright © 2010 Yahoo! Inc. All rights reserved.