3.1.22.18. unit_scaling.functional.silu_glu

unit_scaling.functional.silu_glu(input: Tensor, gate: Tensor, mult: float = 1.0) → Tensor

Applies a unit-scaled gated linear unit for input * silu(gate).

\[\text{silu_glu}(x, g) = x \cdot g \cdot \sigma(g), \quad \text{where } \sigma(g) = \frac{1}{1 + e^{-g}} \text{ is the logistic sigmoid.}\]
Parameters:
  • input (Tensor) – linear input

  • gate (Tensor) – gate (SiLU) input

  • mult (float, optional) – a multiplier applied to change the shape of the nonlinear function. Typically, high multipliers (> 1) give a ‘sharper’ (low-temperature) function, while low multipliers (< 1) give a ‘flatter’ (high-temperature) function.

Returns:

a scaled output, the same shape as input

Return type:

Tensor
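
Example:

The elementwise formula above can be illustrated with a minimal plain-Python sketch. This is only a reference for the unscaled expression x * g * sigmoid(g). It does not reproduce the scaling factor that silu_glu applies, nor the effect of mult, and the names sigmoid and silu_glu_reference are hypothetical helpers introduced for illustration, not part of unit_scaling.

```python
import math
from typing import List


def sigmoid(g: float) -> float:
    """Logistic sigmoid, arranged to avoid overflow for large |g|."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)


def silu_glu_reference(x: List[float], g: List[float]) -> List[float]:
    """Elementwise x * g * sigmoid(g), with no unit-scaling factor applied.

    Hypothetical helper: mirrors the formula only, not the library output.
    """
    if len(x) != len(g):
        raise ValueError("input and gate must have the same length")
    return [xi * gi * sigmoid(gi) for xi, gi in zip(x, g)]


# The output has the same length (shape) as the linear input.
out = silu_glu_reference([1.0, -2.0, 0.5], [0.0, 1.0, -1.0])
```

A gate value of 0 zeroes the corresponding output element, since silu(0) = 0. The library function should differ from this sketch by the scaling it applies to the output.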