Hello, I am using CharLmHmmChunker for tagging. The score for Chunking that I get is Infinity. Is there some way to get the actual probabilities? Thanks, ...
The interfaces for chunking are explained in: http://alias-i.com/lingpipe/demos/tutorial/ne/read-me.html Chunks only get scores in the confidence chunking...
1059
eckstein@...
Dec 16, 2010 6:26 pm
In com.aliasi.lingmed.medline.parser.MedlineCitationSet, ELECTRONIC_PUB_DATE_ELT is not defined correctly. It should be: /** * The <code>ArticleDate</code>...
... Which year's MEDLINE is this a fix for? We have not updated to the 2011 DTD and are not planning on it currently. best Breck ... -- Breck Baldwin ...
The classpath indicates it's the version from the sandbox project lingmed. We had already fixed that problem as of the latest version of the MedlineCitationSet...
Hi, I'm new to LingPipe and new to the user group and need some advice and troubleshooting (not solved by a general Google search so far): My interest is in NLP...
Yes, the demo should point to the 2010 medsamp. Thanks for pointing out the bug. It looks like it also needs a URL, not just a file path. We'll sort it out...
Many Thanks Bob, I look forward to the next release! I am new to this area, so I'm sure to not get the terminology straight. We have transcribed/typewritten...
Seeing a sample of data always helps. There's been a huge amount of work in medical terminology extraction. There are large databases of terminology like...
Simple question - I have trained a TokenizedLM on a text and I want to analyze some stats of the text. The first thing I want to know is how many n-grams are in...
You're right that there's not a straightforward way to determine the number of unique n-grams found in a text. You can get the size of the trie through the ...
On second look, if you only want the counts, I have already built that. It's in the sequence counter method uniqueSequenceCount(int), where the argument is...
Great blog post! What about using model.sequenceCounter().nGramCounts(NGRAM, 1); ? I am trying to find something simple. I am using the number as a parameter ...
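For readers who only need the number of distinct n-grams of one length, here is a minimal sketch of the idea in plain Java. It is not LingPipe's implementation and does not call the LingPipe API; the class name `UniqueNGrams`, the method `uniqueNGramCount`, and the use of a pre-tokenized `String[]` are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class UniqueNGrams {

    // Hypothetical helper: counts the distinct token n-grams of length n.
    // The NUL character is used as a separator on the assumption that it
    // never occurs inside a token.
    static int uniqueNGramCount(String[] tokens, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Set<String> seen = new HashSet<>();
        for (int i = 0; i + n <= tokens.length; i++) {
            String[] window = Arrays.copyOfRange(tokens, i, i + n);
            seen.add(String.join("\u0000", window));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "cat", "the", "cat", "sat"};
        System.out.println(uniqueNGramCount(tokens, 2));
    }
}
```

A hash set keeps the sketch short; a trie, as in the sequence counter discussed above, avoids building a string per n-gram and is the better choice for large corpora.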
Hi, for a research project at university, I've trained my own NER for recognizing German entities in the domain of Software Engineering. So I used the...
There are two standard ways to include the extra information, only one of which is supported by LingPipe. The first, which LingPipe doesn't support, is the...
... It looks like the Mac has tossed a bunch of files into the unpacked directory structure. You can try and clean out the junk in .DS_Store or put a file...
... Sorry, I made a mistake. The filter should be: if (languageDirNames[i].startsWith(".")) { continue; } That will prevent exploring the various '.'...
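As a rough illustration of the corrected filter in context, here is a self-contained sketch. Only `languageDirNames` and the `startsWith(".")` test come from the message above; the class `HiddenDirFilter`, the method `visibleNames`, and the sample names in `main` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class HiddenDirFilter {

    // Hypothetical helper: returns the entries that do not start with '.',
    // skipping names such as .DS_Store that Mac OS adds to unpacked directories.
    static List<String> visibleNames(String[] languageDirNames) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < languageDirNames.length; ++i) {
            if (languageDirNames[i].startsWith(".")) {
                continue; // the filter from the message above
            }
            kept.add(languageDirNames[i]);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample names are made up for the example.
        String[] names = {".DS_Store", "en", "._en", "de"};
        System.out.println(visibleNames(names));
    }
}
```

In real code the array would typically come from `java.io.File.list()` on the data directory, which can return `null` if the path is not a directory, so a null check before the loop is worth adding.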
Looking at the code on the line throwing the exception, it looks like your x data matrix (what you call "inputs") has some null elements in it. Make sure your...
That was it - an off-by-one error that I just missed. Initial performance looks good, I am going to do some more testing and let you know. Happy to have you re-post...
Hello, We are using LingPipe's DynamicLMClassifier to classify news articles. It is working fairly well, but we would like to have a usable score (between 0 and...
... The LM classifiers push the estimates to 0 and 1 aggressively. At my last tutorial I showed the ranked estimates for whether a tweet was about David Bowie...
If the log joint probabilities are very far apart (differing by around 50 or more), then the conditional probability of the first category up to machine precision...
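To make the arithmetic concrete, here is a small sketch of turning log joint probabilities into conditional probabilities by subtracting the maximum before exponentiating. It assumes the scores are base-2 logs, which the message above does not state; the class `ConditionalFromLogJoint` and the method `conditionalProbs` are hypothetical names and this is not LingPipe's own code.

```java
import java.util.Arrays;

public class ConditionalFromLogJoint {

    // log2Joint[i] is assumed to hold log2 p(category i, text).
    // Returns p(category i | text) for every category.
    static double[] conditionalProbs(double[] log2Joint) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : log2Joint) {
            max = Math.max(max, v);
        }
        double[] probs = new double[log2Joint.length];
        double sum = 0.0;
        for (int i = 0; i < probs.length; i++) {
            // Subtracting the max keeps the exponentiation from underflowing.
            probs[i] = Math.pow(2.0, log2Joint[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        // A gap of 3 bits leaves a usable estimate.
        System.out.println(Arrays.toString(conditionalProbs(new double[] {-1000.0, -1003.0})));
        // A gap of 60 bits is beyond double precision for the top category.
        System.out.println(Arrays.toString(conditionalProbs(new double[] {-1000.0, -1060.0})));
    }
}
```

With a gap of 60 bits the second term is smaller than the rounding error of a double near 1, so the top category's conditional probability comes out as exactly 1.0, which is the saturation described above.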
Hi all- I was wondering what is the best practice with regard to licenses when including a LingPipe-dependent module in a larger open source application...
Pablo, You are free to distribute the LingPipe jar. What we do is have a licenses directory in the distribution with the various licenses in there. So you...
Thanks so far. I got the CRF approach working and already created some shape features. But I still have problems using my pre-computed tokens and POS tags (as...