Dear Mr. Bob Carpenter, A few weeks ago I posted the following email (see above), in which I asked you about the possible use of LingPipe for sentence or string...
Dear Mr. Bob Carpenter, A few weeks ago I posted the following email (see below), in which I asked you about the possible use of LingPipe for sentence or string...
I'm afraid the short answer is no -- I haven't written the tutorial for string comparison and don't currently have any code samples to send you that aren't...
Hi - I'm working on Language Id and was trying to download the Leipzig Corpora from http://corpora.informatik.uni-leipzig.de/download.html. To the surprise...
... I just tried it out and had no problem (i.e. it "works on my machine", where my machine is running Firefox 2.0.0.12 on Win x64 over a local ethernet...
Hello All, I have been looking over the demo for the named entity recognizer. I have read and understood the material on Rule-Based Named Entity Detection and Exact ...
The best place to find info on building your own data and training a NE recognizer is in our "citationEntities" sandbox project. Instructions are at: ...
So I have been playing with citationEntities, and I ran the GUI tool to do the manual annotation. It produced some files like: <?xml version="1.0"...
Hello, When I am training a named entity annotator and I have input like: "INTEL CORE 2 QUAD Q6700 2.66 GHZ 8 MB CACHE 1066 MHZ LGA775 Processor" intel ->...
... You should look at the doc in citationsEntities/readme.html The entities you annotate inside of is available on the command line. ... You can specify the...
... Product identification "in the wild" is one of the harder named entity tasks. If you are tagging uniform data like a catalog or single website then it may...
Hi, I recently started using LingPipe for my research. I have texts from wine menus, and I use both rule-based and dictionary-based named entity recognizers for them. My...
Hello, I made multiple chunkers, both dictionary-based and regexp-based, and I have input coming in line by line. How can I use them in a combined manner? ...
... You still have to be careful with things like punctuation and capitalization (unless you case normalize the data). For instance, consider a wine the...
... I'm not sure what you intend here. If "input" is a string, what does the chunk() method do? It's not something from LingPipe. What you really need to do...
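A language-neutral sketch of the pattern under discussion in this thread (running several chunkers over input that arrives line by line and pooling their results) might look like the following. It is plain Python, not LingPipe's API: `dictionary_chunker`, `regex_chunker`, `chunk_all`, the chunk types, and the sample wine line are all hypothetical names and data invented for illustration.

```python
import re
from typing import Callable, Dict, List, Tuple

# A chunk is (start, end, type) with an exclusive end offset.
Chunk = Tuple[int, int, str]
Chunker = Callable[[str], List[Chunk]]


def dictionary_chunker(entries: Dict[str, str]) -> Chunker:
    """Hypothetical case-insensitive dictionary matcher (ASCII phrases assumed)."""
    def chunk(text: str) -> List[Chunk]:
        found = []
        lowered = text.lower()
        for phrase, kind in entries.items():
            needle = phrase.lower()
            start = lowered.find(needle)
            while start != -1:
                found.append((start, start + len(needle), kind))
                start = lowered.find(needle, start + 1)
        return found
    return chunk


def regex_chunker(pattern: str, kind: str) -> Chunker:
    """Hypothetical regex matcher: every match becomes a chunk of one type."""
    compiled = re.compile(pattern)
    def chunk(text: str) -> List[Chunk]:
        return [(m.start(), m.end(), kind) for m in compiled.finditer(text)]
    return chunk


def chunk_all(chunkers: List[Chunker], text: str) -> List[Chunk]:
    """Run every chunker on the same text and return the pooled chunks in order."""
    merged = set()
    for chunker in chunkers:
        merged.update(chunker(text))
    return sorted(merged)


# Illustrative data only; not taken from the thread.
chunkers = [
    dictionary_chunker({"merlot": "GRAPE"}),
    regex_chunker(r"\b(?:19|20)\d{2}\b", "VINTAGE"),
]
lines = ["2005 Napa Valley Merlot"]
results = [chunk_all(chunkers, line) for line in lines]
```

This sketch simply takes the union of all chunks and does not resolve overlaps between chunkers; a real pipeline would need a policy for that, such as preferring the longest match.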
Thanks. I did some list processing, and now I can use the chunkers in a chain. For an HMM named entity recognizer to work properly, how many lines are required and...
... It depends on the task, of course. For persons, locations and organizations in newswire text, you start getting decent performance at 50K tokens and see...
... That's too bad. In English, there's a lot of information in capitalization for everything from sentence boundaries to spelling checking to proper noun...
Hello Bob, Some time ago I asked about ways for multi-label (when input may fall into more than one category) classification. Your answer was that regular...
... The exclusivity shows up in two category distributions in the model, the prior and posterior: p(cat|ws) = p(ws|cat) * p(cat) / p(ws) posterior = likelihood...
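As a small worked illustration of the formula quoted above, p(cat|ws) = p(ws|cat) * p(cat) / p(ws), the sketch below computes the posterior over categories, taking p(ws) as the sum of p(ws|cat) * p(cat) over all categories. The category names and all probability values are made up for illustration, and `posterior` is a hypothetical helper, not a LingPipe function.

```python
from typing import Dict


def posterior(priors: Dict[str, float], likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule over mutually exclusive categories.

    priors[cat]      = p(cat)
    likelihoods[cat] = p(ws | cat)
    returns          p(cat | ws) for every category
    """
    # Joint probability p(ws, cat) = p(ws | cat) * p(cat).
    joint = {cat: likelihoods[cat] * priors[cat] for cat in priors}
    # Evidence p(ws) is the sum of the joint over all categories.
    evidence = sum(joint.values())
    return {cat: value / evidence for cat, value in joint.items()}


# Made-up numbers, purely illustrative.
priors = {"sports": 0.5, "politics": 0.5}
likelihoods = {"sports": 0.002, "politics": 0.006}
post = posterior(priors, likelihoods)
total = sum(post.values())
```

Because the posterior is normalized by p(ws), the values always sum to 1, so raising one category's probability necessarily lowers the others. That is the sense in which the categories compete with one another.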
Hi, I recently started evaluating the LingPipe SDK and I have a few problems and questions. I built a big corpus, with around 50 text categories that represent ...
... What exactly do you mean by text styles? As in writing styles like letter, e-mail, technical article, etc.? Are those styles mutually exclusive? If not,...
I will try to clarify my issue in regard to your questions. I have around 60 text categories; for each category I have between 200 and 1000 text files that...
... It's the amount of text, not number of files that determines sizing. ... OK -- that's fairly small. ... You probably want an NGramProcess classifier unless...
Hi Bob, I used your advice and pruned the small n-grams. I use 3 as a counter and it solved the Java heap problem. You wrote: You need to train the individual...
Overall, you might want to rethink your classifier. Are the categories mutually exclusive and do they collectively cover all input types? Is there any ...
Hi Bob, First I would like to put in a good word. I have worked with open source projects in the past, but never did I get such a response to technical questions, ...