3.1.8. unit_scaling.Dropout
- class unit_scaling.Dropout(p: float = 0.5, inplace: bool = False)[source]
A unit-scaled implementation of Dropout.
During training, elements of the input tensor are randomly zeroed with probability p, using samples from a Bernoulli distribution. The zeroed elements are chosen independently for each forward call, and each element is zeroed out independently of the others.
This has proven to be an effective technique for regularization and for preventing the co-adaptation of neurons, as described in the paper Improving neural networks by preventing co-adaptation of feature detectors.
Furthermore, the outputs are scaled by a factor of \(\frac{1}{1-p}\) during training, which means that during evaluation the module simply computes an identity function.
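As a quick illustration of this convention, consider standard torch.nn.Dropout, whose behaviour the description above refers to (a minimal sketch; whether the unit-scaled class uses exactly the \(\frac{1}{1-p}\) factor internally is not specified here):

>>> import torch
>>> from torch import nn
>>> m = nn.Dropout(p=0.5)
>>> x = torch.ones(8)
>>> y = m(x)  # training mode (the default): each kept element becomes 1/(1-0.5) = 2.0
>>> m.eval()
Dropout(p=0.5, inplace=False)
>>> torch.equal(m(x), x)  # in evaluation mode the module is an identity function
True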
- Parameters:
p – probability of an element to be zeroed. Default: 0.5
inplace – [not supported by unit-scaling] If set to True, will do this operation in-place. Default: False
- Shape:
Input: \((*)\). Input can be of any shape
Output: \((*)\). Output is of the same shape as input
Examples
>>> import torch
>>> import unit_scaling as uu
>>> m = uu.Dropout(p=0.2)
>>> input = torch.randn(20, 16)
>>> output = m(input)
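As a follow-up sanity check, a unit-scaled dropout should approximately preserve the unit standard deviation of its input during training. A minimal sketch of such a check (the uu alias and the tolerance are assumptions; the near-unit output scale follows from the library's stated goal of unit-scaled operations, not from anything documented above):

>>> import torch
>>> import unit_scaling as uu
>>> m = uu.Dropout(p=0.2)
>>> x = torch.randn(2**16)     # large, approximately unit-variance input
>>> y = m(x)                   # training mode is the default
>>> bool(0.9 < y.std() < 1.1)  # output scale should remain near 1 (assumed property)
True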