3.1.22. unit_scaling.functional
Unit-scaled versions of common torch.nn.functional functions.
Functions
| Function | Description |
| --- | --- |
| `add` | Applies a unit-scaled addition. |
| `conv1d` | Applies a unit-scaled 1D convolution. |
| `cross_entropy` | Computes the unit-scaled cross entropy loss between input logits and target. |
| `dropout` | Applies a unit-scaled dropout function. |
| `embedding` | A unit-scaled lookup table that looks up embeddings in a fixed dictionary and size. |
| `gelu` | Applies a unit-scaled GELU function. |
| `layer_norm` | Applies a unit-scaled Layer Normalization over the last certain number of dimensions. |
| `linear` | Applies a unit-scaled linear transformation. |
| `linear_readout` | Applies a unit-scaled linear transformation for the final network output. |
| `matmul` | A unit-scaled matrix product of two tensors. |
| `mse_loss` | Computes the unit-scaled element-wise mean squared error. |
| `residual_add` | Adds the residual and skip connections together, with a relative weighting `tau` applied to the residual branch. |
| `residual_apply` | Applies a weighted residual branch, maintaining unit scale. |
| `residual_split` | Splits a tensor into a residual tensor and a skip tensor prior to use in a residual layer, with a relative weighting `tau` applied to the residual branch. |
| `rms_norm` | Applies a unit-scaled RMS Normalization over the last certain number of dimensions. |
| `scaled_dot_product_attention` | A unit-scaled dot-product attention function. |
| `silu` | Applies a unit-scaled SiLU function. |
| `silu_glu` | Applies a unit-scaled gated linear unit, computing `input * silu(gate)`. |
| `softmax` | Applies a unit-scaled softmax function. |
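These functions are designed as drop-in replacements for their `torch.nn.functional` counterparts. A minimal sketch of basic usage, assuming `unit_scaling` is installed and that `linear` and `gelu` follow the positional signatures of their torch equivalents (optional arguments such as a scaling constraint are omitted):

```python
import torch

import unit_scaling.functional as U

# Unit scaling assumes unit-variance inputs and unscaled (unit-variance)
# weights; the ops themselves apply the corrective scaling factors, so no
# fan-in factor is baked into the initialisation.
x = torch.randn(64, 256)   # activations, std ~= 1
w = torch.randn(512, 256)  # weight, std ~= 1 (no fan-in scaling)
b = torch.zeros(512)

y = U.gelu(U.linear(x, w, b))
print(float(y.std()))  # expected to stay close to 1.0
```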
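The three `residual_*` functions implement the pattern for residual layers: split the stream, transform the residual branch, then recombine, with `tau` setting the relative weight of the residual branch. A sketch under the assumption that `residual_split(input, tau)` returns a `(residual, skip)` pair and `residual_add(residual, skip, tau)` recombines them:

```python
import torch

import unit_scaling.functional as U

def ffn_block(x: torch.Tensor, w: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    # Fork the stream into the branch to be transformed and the skip path.
    residual, skip = U.residual_split(x, tau)
    # Any unit-scaled transformation of the residual branch.
    residual = U.gelu(U.linear(residual, w))
    # Recombine; tau weights the residual branch relative to the skip,
    # while the sum is rescaled to keep the output at unit scale.
    return U.residual_add(residual, skip, tau)

x = torch.randn(64, 256)
w = torch.randn(256, 256)
print(float(ffn_block(x, w).std()))  # roughly 1.0
```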
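At the network output, `linear_readout` replaces a final `linear`, and its logits feed `cross_entropy`. A sketch assuming both mirror the torch signatures `linear(input, weight, bias=None)` and `cross_entropy(input, target)`:

```python
import torch

import unit_scaling.functional as U

batch, hidden, vocab = 64, 256, 1000
x = torch.randn(batch, hidden)
w_out = torch.randn(vocab, hidden)
targets = torch.randint(0, vocab, (batch,))

logits = U.linear_readout(x, w_out)      # unit-scaled final projection
loss = U.cross_entropy(logits, targets)  # unit-scaled loss and gradients
print(float(loss))
```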