|
I would like to retrain LingPipe for a digital library we are working on.
Right now we are tagging all named entities in every document by
hand. All of our documents relate to North Carolina history and
fiction, so the newswire model isn't very useful. So far we've
tagged about 3200 pages.
What I need to know is how to retrain LingPipe using the documents
that we've tagged already. I'm not familiar at all with the actual
process of retraining. Also, I'm guessing that there are about 30,000
entities in these documents. Is that enough to use for retraining?
I'd appreciate any help.
|