(This is another question & answer copied from ICSI's new internal FAQ.)
How should the input be randomized when training?
This article in the old speech FAQ discusses randomization:
http://www.icsi.berkeley.edu/speech/faq/nn-rand.html
The following email message provides some extra detail:
From: David Johnson
To: David Gelbart
David Gelbart writes:
David> The qnstrn tool has a built-in train_cache_frames
David> option which controls randomization of presentation during
David> training.
David>
David> However, I think we sometimes randomize file lists when
David> creating pfiles for qnstrn. Why is this useful when qnstrn
David> has randomization built in? Is it because of the finite
David> size of the qnstrn randomization cache?
qnstrn reads in a set of contiguous "sentences" from a feature file
and then randomly picks frames from this set of sentences to train
the net. If you use a big cache size and your sentences are randomized
in the feature file, this is a reasonable approximation to picking a
random frame from the whole dataset. If your sentences aren't
randomized (i.e. you didn't randomize your file list), this isn't a
very good approximation to picking a random frame and is likely to
lead to trouble.
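The caching scheme described above can be sketched roughly as follows.
This is only an illustration, not qnstrn's actual implementation: the
function name cached_frame_order, the cache_frames parameter (a
stand-in for train_cache_frames), and the representation of sentences
as Python lists of frames are all assumptions made for the example.

```python
import random


def cached_frame_order(sentences, cache_frames, rng):
    """Yield frames in the order a cache-based shuffler might present them.

    sentences:    list of sentences in file order, each a list of frames.
    cache_frames: hypothetical cache size in frames (assumed to play the
                  role of train_cache_frames).
    rng:          a random.Random instance, passed in for reproducibility.
    """
    cache = []
    for sentence in sentences:
        # Fill the cache with whole, contiguous sentences from the file.
        cache.extend(sentence)
        if len(cache) >= cache_frames:
            # Randomize only within the current cacheful.
            rng.shuffle(cache)
            yield from cache
            cache = []
    # Present whatever is left in a final, partially filled cache.
    rng.shuffle(cache)
    yield from cache


# Toy data: 4 sentences of 3 frames each; a frame is (sentence_id, frame_id).
sentences = [[(s, f) for f in range(3)] for s in range(4)]
order = list(cached_frame_order(sentences, cache_frames=6, rng=random.Random(0)))

# Frames are shuffled only within a cacheful, so the first 6 frames
# presented all come from sentences 0 and 1.
first_cache_sentences = {s for s, _ in order[:6]}
```

In this toy run the net never sees a frame from sentence 2 or 3 until
everything from sentences 0 and 1 has been presented. If the file list
were sorted (say, by speaker), each cacheful would be similarly
one-sided, which is why shuffling the file list beforehand helps.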
We do this because actually picking a random frame is expensive. With
big feature files, each pick requires a disk seek, which takes a few
ms (assuming no one else is accessing the same disk!). That limits
training to a few hundred frames/second, which isn't acceptable in
many situations.
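The arithmetic behind that rate can be checked with a short
calculation. The seek times below are illustrative assumptions, not
measurements, and the helper name max_frames_per_second is made up
for this example. It assumes one disk seek per frame and ignores all
other costs.

```python
def max_frames_per_second(seek_ms):
    """Upper bound on training rate if every frame costs one disk seek.

    seek_ms: assumed time for a single disk seek, in milliseconds.
    """
    return 1000.0 / seek_ms


# Illustrative seek times only; real disks and workloads will differ.
for seek_ms in (2, 4, 8):
    rate = max_frames_per_second(seek_ms)
    print(f"{seek_ms} ms/seek -> at most {rate:.0f} frames/second")
```

Under these assumptions a 4 ms seek caps training at 250
frames/second, before counting any time spent on the net itself.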