search_dev · Independent Search Engine Developers
Message #719 of 869
Japanese search in Autonomy


Does anyone have a proven project or program for improving relevance in Japanese language queries in IDOL? Or a set of resources to point me at so I can get my hands around the size of the problem?

Thank you,

Ed



Wed Mar 11, 2009 8:47 pm

arentanji

Has anyone had any experience with breaking out the databases so that the PushStats could track individual databases?...
armstrkat101
Mar 10, 2009
3:01 pm

Does anyone have a proven project or program for improving relevance in Japanese language queries in IDOL? Or a set of resources to point me at so I can get my...
ed.dale@...
arentanji
Mar 11, 2009
8:44 pm

Hi Ed, I don't have answers specifically for Japanese, though we have dealt a bit with CJK in the past. We do have an article on Autonomy relevance in general:...
Mark Bennett
ttennebkram
Mar 11, 2009
8:54 pm

Mark: Thanks for the quick reply. These are some interesting and valuable articles. We are applying some of these techniques now with our English queries. My...
ed.dale@...
arentanji
Mar 11, 2009
10:35 pm

In my experience with Japanese search in Ultraseek, the customer needs to add terms to the user stemming dictionary. Terms like place names and product names...
Walter Underwood
walter_under...
Mar 11, 2009
10:41 pm

I can offer a few general Japanese / Chinese comments, though not specific to your engine: 0: You might try to quantify if they are having trouble with...
Mark Bennett
ttennebkram
Mar 11, 2009
11:01 pm

We can offer a few more thoughts about Japanese. We have been doing a lot of work with Japanese users of our information discovery toolkit (OrcaTec). Japanese...
orcatecherb
Mar 12, 2009
5:02 pm

... Here's an open source CJK tokeniser: http://code.google.com/p/cjk-tokenizer/ originally developed for use with Xapian, but should be adaptable. You can...
Charlie Hull
charliejuggler
Mar 12, 2009
5:07 pm

Hi Ed, Mark's question about determining what kind of problem you think you have (recall vs. precision) is important, and it relates to the linguistic aspect of...
stevec22
Mar 12, 2009
5:12 pm

Hi Ed, just out of curiosity, is your content a mix of English and Japanese or just Japanese, and what is the charset in the metadata fields? We have problem with...
Kalyan Srinivas
kalyan_kuram
Mar 11, 2009
11:03 pm

Kalyan: We have a mixture of English and other language content. Japanese is one of our top 10 content languages. English accounts for 90% of our content in...
ed.dale@...
arentanji
Mar 12, 2009
2:29 pm

Thanks for replying, Ed. By the way, the link sent by Mark (thanks, Mark) on relevancy tuning does work. I read his article yesterday and we do use Method 1 in his...
Kalyan Srinivas
kalyan_kuram
Mar 12, 2009
5:23 pm

This is going to be a tough one for the IDOL engine, as its general approach is to minimize language-specific approaches tokenization, document-specific...
Chris Biow
chris.biow@...
Mar 12, 2009
7:22 pm
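
Several replies above touch on CJK tokenization and on the recall vs. precision question. As a rough, engine-independent illustration (not IDOL's behaviour, and not claimed to be how the linked cjk-tokenizer works), here is a minimal Python sketch of overlapping character-bigram tokenization for runs of Japanese text. The function names `is_cjk`, `char_class`, and `tokenize` are hypothetical, and the Unicode ranges are a deliberately coarse assumption.

```python
from itertools import groupby


def is_cjk(ch):
    """Coarse check for Hiragana, Katakana, and common Han characters (assumed ranges)."""
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x30FF      # Hiragana and Katakana blocks
        or 0x3400 <= cp <= 0x4DBF   # CJK Unified Ideographs Extension A
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs
    )


def char_class(ch):
    """Classify a character as CJK text, other word text, or a separator."""
    if is_cjk(ch):
        return "cjk"
    if ch.isalnum():
        return "word"
    return "sep"


def tokenize(text):
    """Emit overlapping bigrams for CJK runs and lowercased words for everything else."""
    tokens = []
    for cls, group in groupby(text, key=char_class):
        run = "".join(group)
        if cls == "cjk":
            if len(run) == 1:
                tokens.append(run)  # a lone character is kept as a unigram
            else:
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        elif cls == "word":
            tokens.append(run.lower())
        # separators (spaces, punctuation) are dropped
    return tokens


if __name__ == "__main__":
    print(tokenize("IDOL 日本語検索"))
    print(tokenize("東京都"))
```

With this sketch, "東京都" (Tokyo Metropolis) yields the bigrams 東京 and 京都, so a query for 京都 (Kyoto) would also match documents about Tokyo. That is the usual trade-off of n-gram indexing: recall is high, and precision then depends on phrase matching, dictionaries, or a morphological analyser layered on top.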