>>>>> "Petr" == Petr Fousek <p.fousek@...> writes:
Petr> Hi all, IN SHORT: mlp_bunch_size > 256 does not train
Petr> properly
Petr> I have been using qnstrn/qnmultitrn for years with a bunch
Petr> size of 256, which trains much faster than updating on
Petr> every sample. It gives reasonable accuracy. Now I tried to
Petr> set mlp_bunch_size to 1024 and I am no longer able to train
Petr> the MLP - the train/CV scores stay very low until the end of
Petr> the training.
I talked to Arlo and he's trained some 4-layer nets with big bunch
sizes. So in theory this works. I've done nothing with 4-layer nets
myself.
I've definitely seen smaller datasets go downhill with 3-layer nets
and bunch sizes >256, but it's been a more gradual decline than you're
seeing.
One issue is the bug we've seen when the number of reject frames is a
significant percentage of the total number of frames.
David.
Petr> I hoped it was due to a very small training corpus (10^4
Petr> frames for some 10^6 MLP parameters), but the same holds for
Petr> the full-scale training (details below). Does anyone have a
Petr> guess what I am doing wrong?
Petr> Small experiment:
Petr>  - MLP size 826x3500x39x210 (826 inputs, 210 phoneme-state
Petr>    outputs)
Petr>  - 30665 train frames, 3354 CV frames
Petr>  - 4-threads training with Atlas, @Xeon (64bit)
Petr>  * mlp_bunch_size=256  -> 53/43% Acc (train/CV), 10 epochs
Petr>  * mlp_bunch_size=512  -> 15/16% Acc (train/CV), 6 epochs
Petr>  * mlp_bunch_size=1024 -> 5/0.00% Acc (train/CV), 2 epochs
Petr> Normal size experiment:
Petr>  - MLP size 351x3500x39x210 (351 inputs, 210 phoneme-state
Petr>    outputs)
Petr>  - 20,382,407 train + 2,206,465 CV frames
Petr>  * mlp_bunch_size=256  -> 52/50% Acc (train/CV), 7 epochs
Petr>  * mlp_bunch_size=1024 -> 4.7/5.0% Acc (train/CV), 4 epochs
Petr> Petr.
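
To put the sizes quoted above into numbers, here is a minimal sketch of the arithmetic. It is not taken from qnstrn/qnmultitrn. It assumes a fully connected net with one bias per non-input unit, one weight update per bunch, and a final partial bunch that still counts as an update; the function names are made up for illustration.

```python
import math


def mlp_param_count(layer_sizes, with_biases=True):
    """Count the parameters of a fully connected MLP.

    Assumption: every non-input unit has one bias term.
    """
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:]) if with_biases else 0
    return weights + biases


def updates_per_epoch(num_frames, bunch_size):
    """Weight updates per epoch, assuming one update per bunch and
    that a final partial bunch is still used for an update."""
    return math.ceil(num_frames / bunch_size)


# Figures quoted in the message above.
experiments = {
    "small": ([826, 3500, 39, 210], 30665),
    "normal": ([351, 3500, 39, 210], 20382407),
}

for name, (layers, train_frames) in experiments.items():
    params = mlp_param_count(layers)
    print(f"{name}: {params} parameters, "
          f"{train_frames / params:.3f} train frames per parameter")
    for bunch in (256, 512, 1024):
        n = updates_per_epoch(train_frames, bunch)
        print(f"  mlp_bunch_size={bunch}: {n} updates per epoch")
```

Under these assumptions the small experiment gets about 120 updates per epoch at a bunch size of 256 but only about 30 at 1024. That is plain arithmetic on the numbers in the post, not a diagnosis of the failure.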