icsi-speech-tools
Another FAQ: How should the input be randomized when training?
Message #146 of 162
(This is another question & answer copied from ICSI's new internal FAQ.)

How should the input be randomized when training?

This article in the old speech FAQ discusses randomization:

http://www.icsi.berkeley.edu/speech/faq/nn-rand.html

The following email message provides some extra detail:

From: David Johnson
To: David Gelbart

David Gelbart writes:

David> The qnstrn tool has a built-in train_cache_frames
David> option which controls randomization of presentation during
David> training.
David>
David> However, I think we sometimes randomize file lists when
David> creating pfiles for qnstrn. Why is this useful when qnstrn
David> has randomization built in? Is it because of the finite
David> size of the qnstrn randomization cache?

qnstrn reads in a set of contiguous "sentences" from a feature file
and then randomly picks frames from this set of sentences to train
the net. If you use a big cache size and your sentences are randomized
in the feature file, this is a reasonable approximation to picking a
random frame from the whole dataset. If your sentences aren't
randomized (i.e. you didn't randomize your file list), this isn't a
very good approximation to picking a random frame and is likely to
lead to trouble.
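
To make the point above concrete, here is a minimal sketch of
cache-based randomization. It is a toy model written for illustration,
not qnstrn's actual code: the function name, the way the cache is
filled, and the per-cache shuffle are all assumptions.

```python
import random


def cached_frame_order(sentence_lengths, cache_frames, rng):
    """Yield (sentence_index, frame_index) pairs in cache-shuffled order.

    Hypothetical model: whole sentences are read in file order until the
    cache holds at least `cache_frames` frames, then the cached frames
    are shuffled and presented. Frames never mix across cache loads.
    """
    cache = []
    for sent_idx, n_frames in enumerate(sentence_lengths):
        cache.extend((sent_idx, f) for f in range(n_frames))
        if len(cache) >= cache_frames:
            rng.shuffle(cache)
            yield from cache
            cache = []
    if cache:  # present whatever is left at the end of the file
        rng.shuffle(cache)
        yield from cache


# Sentence lengths in file order (made-up numbers).
lengths = [3, 2, 4, 1]
order = list(cached_frame_order(lengths, cache_frames=5, rng=random.Random(0)))
first_load = {s for s, _ in order[:5]}   # only sentences 0 and 1
second_load = {s for s, _ in order[5:]}  # only sentences 2 and 3
```

In this model each cache load only ever contains neighbouring sentences
from the file. If the file list was not randomized, neighbouring
sentences tend to be similar (same speaker, same session), so each load
is a biased sample no matter how well it is shuffled internally.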

We use this approximation because picking a truly random frame is
expensive. With big feature files, it requires a disk seek, which
takes a few ms (assuming no one else is accessing the same disk!).
Training at a rate of a few hundred frames/second isn't acceptable in
many situations.
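
The arithmetic behind that rate can be sketched as follows. The 4 ms
seek time and the 10-million-frame training set are made-up figures
chosen only to illustrate the order of magnitude, and the function
names are hypothetical.

```python
def frames_per_second(seek_ms):
    """Upper bound on training rate if every frame costs one disk seek."""
    return 1000.0 / seek_ms


def epoch_hours(total_frames, seek_ms):
    """Time spent seeking during one pass over the data, in hours."""
    return total_frames * seek_ms / 1000.0 / 3600.0


rate = frames_per_second(4.0)             # assumed 4 ms per seek
hours = epoch_hours(10_000_000, 4.0)      # assumed 10 million frames
```

Under these assumptions the seek time alone caps training at 250
frames/second, or roughly 11 hours per pass over the data.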




Tue Dec 4, 2007 9:06 pm

zizazze