Author: Alpaydın, Ahmet İbrahim Ethem
Date accessioned: 2022-09-07
Date available: 2022-09-07
Date issued: 2021-01-02
ISSN: 0925-2312
Handle: http://hdl.handle.net/10679/7840
DOI: https://doi.org/10.1016/j.neucom.2020.08.052

Abstract: Dropout is a very effective method for preventing overfitting and has become the go-to regularizer for multi-layer neural networks in recent years. The hierarchical mixture of experts is a hierarchically gated model that defines a soft decision tree: leaves correspond to experts and decision nodes correspond to gating models that softly choose between their children, so the model defines a soft hierarchical partitioning of the input space. In this work, we propose a variant of dropout for the hierarchical mixture of experts that is faithful to the tree hierarchy defined by the model, as opposed to the flat, unit-wise independent application of dropout used with multi-layer perceptrons. We show on synthetic regression data and on the MNIST, CIFAR-10, and SSTB datasets that our proposed dropout mechanism prevents overfitting in trees with many levels, improving generalization and providing smoother fits.

Language: eng
Access: restrictedAccess
Title: Dropout regularization in hierarchical mixture of experts
Type: article
Volume: 419
Pages: 148-156
WOS ID: 000590175500013
Keywords: Dropout; Hierarchical models; Mixture of experts; Regularization
Scopus ID: 2-s2.0-85091258086
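For intuition about the model the abstract describes, below is a minimal sketch of a soft binary decision tree (a hierarchical mixture of experts) with a tree-respecting dropout, written in PyTorch. The dropout rule used here, occasionally removing a whole child subtree at a gating node and routing all mass to its sibling, is an illustrative guess consistent with the abstract, not the paper's exact mechanism; the names SoftTree and p_drop and the per-batch (rather than per-example) dropping are assumptions made for brevity.

```python
# Sketch of a hierarchical mixture of experts (soft decision tree) with a
# tree-respecting dropout. Hypothetical illustration of the idea in the
# abstract, not the paper's exact formulation.
import torch
import torch.nn as nn


class SoftTree(nn.Module):
    """A soft binary tree: internal nodes gate between children, leaves are experts."""

    def __init__(self, in_dim, out_dim, depth, p_drop=0.2):
        super().__init__()
        self.depth = depth
        self.p_drop = p_drop
        if depth == 0:
            # Leaf: a simple linear expert.
            self.expert = nn.Linear(in_dim, out_dim)
        else:
            # Internal node: a sigmoid gate softly splits the input
            # between the left and right subtrees.
            self.gate = nn.Linear(in_dim, 1)
            self.left = SoftTree(in_dim, out_dim, depth - 1, p_drop)
            self.right = SoftTree(in_dim, out_dim, depth - 1, p_drop)

    def forward(self, x):
        if self.depth == 0:
            return self.expert(x)
        g = torch.sigmoid(self.gate(x))  # soft routing probability, shape (batch, 1)
        if self.training and self.p_drop > 0 and torch.rand(()) < self.p_drop:
            # Tree-respecting dropout (assumed form): drop one child
            # subtree entirely and route everything to the other, so whole
            # branches are removed rather than independent units.
            # Simplification: one draw per forward pass, not per example.
            if torch.rand(()) < 0.5:
                return self.left(x)
            return self.right(x)
        return g * self.left(x) + (1 - g) * self.right(x)


# Example usage: a depth-3 tree (8 leaf experts) on a toy regression batch.
tree = SoftTree(in_dim=4, out_dim=1, depth=3, p_drop=0.2)
tree.train()
y = tree(torch.randn(8, 4))  # (8, 1) predictions with random subtree dropout
```

Note the contrast with unit-wise dropout in a multi-layer perceptron: here an entire subtree, with all its gates and experts, is dropped together, which respects the hierarchical partitioning of the input space that the gating structure defines.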