Re: [LingPipe] Re: LingPipe 1.0.7
> Bob, I have downloaded and have been experimenting with LingPipe
> 1.0.7 and I think I have run into a bug.
Sounds like the namespaces are a bug. Directory tagging may be
tickling a "feature".
> If I take the demo bat file that processes a directory and add
> -elements= to tag some of my XML documents only within some of my
> specific elements, it no longer does named entity tagging. Instead,
> all it does is sentence boundary detection within the tags that I
> named (i.e. <sent> tags are produced but no ENAMEX tags).
LingPipe only looks at contiguous text content, and to find entities, it
usually needs some context. Even so, I'd expect high-frequency
lexical name components to get tagged.
Could you send me an example file?
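
To illustrate what "contiguous text content" means here, below is a
minimal sketch using the standard Java SAX API. It is not LingPipe's
actual code: the class name ContiguousTextRuns, the helper textRuns,
and the sample document are all made up for illustration. It assumes
that a text run ends at every element boundary, so a name that spans
inline markup reaches the tagger in pieces with little context.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not part of LingPipe.
public class ContiguousTextRuns extends DefaultHandler {
    private final List<String> runs = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one stretch of text in several chunks; accumulate them.
        current.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush(); // an opening tag ends the current contiguous run
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush(); // a closing tag ends the current contiguous run
    }

    private void flush() {
        String text = current.toString().trim();
        if (!text.isEmpty()) {
            runs.add(text);
        }
        current.setLength(0);
    }

    // Returns the non-empty text runs of the document, in document order.
    public static List<String> textRuns(String xml) throws Exception {
        ContiguousTextRuns handler = new ContiguousTextRuns();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.runs;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document.
        System.out.println(
            textRuns("<doc><p>Mr. <b>John</b> Smith visited Boston.</p></doc>"));
        // prints [Mr., John, Smith visited Boston.]
    }
}
```

In this sample the name "John Smith" is split across two runs, which is
the kind of situation where a tagger that sees each run on its own has
little context to work with.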
> BTW, you would make my life a lot easier if you preserved XML
> namespace specifications.
That's my fault. The commands don't hook up handlers other
than the content handler. I've never actually run across any docs
with namespace specs, so it hasn't surfaced. I'll try to fix it for
the next release. If the example file had the namespace spec,
that'd be helpful, too.
Most helpful would be a very small test that tickles the bug,
but I'm willing to wade through big files.
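
For reference, here is a rough sketch of one way namespace declarations
surface in the standard Java SAX API, via startPrefixMapping on a
namespace-aware parser. This is only an illustration and not the
planned LingPipe fix: the class name NamespaceDeclarations, the helper
declarations, and the example URIs are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not part of LingPipe.
public class NamespaceDeclarations extends DefaultHandler {
    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // Called for each xmlns declaration; the default namespace has prefix "".
        prefixToUri.put(prefix, uri);
    }

    // Collects every prefix-to-URI declaration seen while parsing the document.
    public static Map<String, String> declarations(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefix-mapping events
        NamespaceDeclarations handler = new NamespaceDeclarations();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.prefixToUri;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document with one prefixed and one default namespace.
        String xml = "<a:doc xmlns:a=\"http://example.org/a\" "
            + "xmlns=\"http://example.org/default\"><a:p>text</a:p></a:doc>";
        System.out.println(declarations(xml));
    }
}
```

A filter that wants to preserve the declarations would need to write
these mappings back out as xmlns attributes on the element that follows
them, rather than dropping them.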
- Bob