3.1.22.18. unit_scaling.functional.silu_glu

unit_scaling.functional.silu_glu(input: Tensor, gate: Tensor, mult: float = 1.0) → Tensor

Applies a unit-scaled gated linear unit that computes input * silu(gate).

silu_glu(x, g) = x · g · σ(g), where σ(g) = 1 / (1 + exp(−g)) is the logistic sigmoid.

Parameters:
  • input (Tensor) – linear input

  • gate (Tensor) – gate (SiLU) input

  • mult (float, optional) – a multiplier to be applied to change the shape of a nonlinear function. Typically, high multipliers (> 1) correspond to a ‘sharper’ (low temperature) function, while low multipliers (< 1) correspond to a ‘flatter’ (high temperature) function.

Returns:

a scaled output, the same shape as input

Return type:

Tensor
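
Example:

The definition silu_glu(x, g) = x · g · σ(g) can be illustrated with a minimal pure-Python sketch on scalars. It covers only the unscaled formula: the unit-scaling factor and the mult argument are deliberately left out, and the helper names sigmoid and silu_glu_reference are hypothetical, not part of unit_scaling.

```python
import math


def sigmoid(g: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-g)), written to avoid overflow for large |g|."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)


def silu_glu_reference(x: float, g: float) -> float:
    """Unscaled x * g * sigmoid(g).

    Hypothetical reference helper: it omits the library's unit scaling
    and the `mult` parameter, so its outputs will generally differ from
    unit_scaling.functional.silu_glu by more than rounding error.
    """
    return x * g * sigmoid(g)


# A zero gate closes the unit entirely.
closed = silu_glu_reference(2.0, 0.0)

# A large positive gate makes sigmoid(g) close to 1, so the result nears x * g.
nearly_open = silu_glu_reference(2.0, 10.0)

print(closed, nearly_open)
```

With real tensors the same expression applies elementwise, which is why the output has the same shape as input.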