3.1.16. unit_scaling.RMSNorm

class unit_scaling.RMSNorm(normalized_shape: int | Tuple[int, ...], eps: float = 1e-05, elementwise_affine: bool = False)

Applies a unit-scaled RMS normalisation over trailing dimensions.

This layer implements the operation as described in the paper Root Mean Square Layer Normalization.

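\[y = \frac{x}{\sqrt{\mathrm{mean}(x^2) + \epsilon}} * \gamma\]

Here the mean is taken over the trailing dimensions given by normalized_shape, and \(\gamma\) is the optional learnable weight.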

Note that this layer sets elementwise_affine=False by default.
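As a rough illustration of the forward computation, the following pure-Python sketch normalises a single vector by its root mean square, assuming the mean-of-squares form from the cited paper. The function name `rms_norm_reference` and its `weight` argument are hypothetical and not part of unit_scaling; the sketch does not attempt to reproduce any unit-scaling-specific behaviour such as gradient scaling.

```python
import math
from typing import List, Optional, Sequence


def rms_norm_reference(
    x: Sequence[float],
    eps: float = 1e-5,
    weight: Optional[Sequence[float]] = None,
) -> List[float]:
    """Hypothetical reference: RMS-normalise one vector of normalised elements."""
    # Mean of squares over the normalised elements.
    mean_square = sum(v * v for v in x) / len(x)
    # eps is added inside the square root for numerical stability.
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    if weight is None:
        # Corresponds to elementwise_affine=False: no per-element gain.
        return [v * inv_rms for v in x]
    # Corresponds to elementwise_affine=True: per-element gain gamma.
    return [v * inv_rms * g for v, g in zip(x, weight)]


y = rms_norm_reference([3.0, 4.0], eps=0.0)
y_scaled = rms_norm_reference([3.0, 4.0], eps=0.0, weight=[2.0, 0.5])
```

With `eps=0.0` and no weight, the output has a root mean square of 1, which is the property the normalisation is meant to provide; a weight then rescales each element independently.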

Parameters:
  • normalized_shape (int | Tuple[int, ...]) – shape of the trailing dimensions to normalise over, for an expected input tensor of shape (*, *normalized_shape).

  • elementwise_affine (bool) – if True, the module has learnable per-element weight parameters initialized to ones. Default: False.

  • eps (float) – a value added to the denominator for numerical stability. Default: 1e-5.

weight

the learnable weights of the module, of shape normalized_shape. Present only when elementwise_affine is set to True. The values are initialized to 1.