3.1.8. unit_scaling.Dropout

class unit_scaling.Dropout(p: float = 0.5, inplace: bool = False)[source]

A unit-scaled implementation of Dropout.

During training, elements of the input tensor are randomly zeroed with probability p. The zeroed elements are sampled from a Bernoulli distribution, independently for each element and on every forward call.

This has proven to be an effective technique for regularization and for preventing the co-adaptation of neurons, as described in the paper "Improving neural networks by preventing co-adaptation of feature detectors".

Furthermore, the outputs are scaled by a factor of \(\frac{1}{1-p}\) during training, so that the expected value of each element is unchanged by the zeroing. This means that during evaluation no rescaling is needed and the module simply computes an identity function.
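The \(\frac{1}{1-p}\) scaling described above can be illustrated with a minimal pure-Python sketch. The helper name inverted_dropout and its rng argument are hypothetical and are not part of unit_scaling or PyTorch. The sketch shows only the standard scaling rule stated in the text, not the library's internal implementation or any additional unit-scaling factors it may apply.

```python
import random


def inverted_dropout(values, p=0.5, training=True, rng=None):
    """Hypothetical helper: zero each element with probability p and
    scale the surviving elements by 1 / (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in the range [0, 1]")
    if not training or p == 0.0:
        # Evaluation mode (or p = 0): identity function.
        return list(values)
    if p == 1.0:
        # Every element is dropped; avoid dividing by zero.
        return [0.0 for _ in values]
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # rng.random() < p happens with probability p, so that element is zeroed.
    return [0.0 if rng.random() < p else v * scale for v in values]


xs = [1.0] * 10
train_out = inverted_dropout(xs, p=0.2, rng=random.Random(0))
eval_out = inverted_dropout(xs, p=0.2, training=False)
print(train_out)  # a mix of zeros and values scaled by 1 / (1 - 0.2)
print(eval_out)   # unchanged input
```

Because each surviving element is multiplied by \(\frac{1}{1-p}\) and survives with probability \(1-p\), the mean of the training output over many elements stays close to the mean of the input, which is why evaluation mode can leave the input untouched.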

Parameters:
  • p – probability of an element being zeroed. Default: 0.5

  • inplace – [not supported by unit-scaling] If set to True, will do this operation in-place. Default: False

Shape:
  • Input: \((*)\). Input can be of any shape

  • Output: \((*)\). Output is of the same shape as input

Examples

>>> import torch
>>> import unit_scaling as uu
>>> m = uu.Dropout(p=0.2)
>>> input = torch.randn(20, 16)
>>> output = m(input)