4.1.22.8. unit_scaling.functional.linear
- unit_scaling.functional.linear(input: Tensor, weight: Tensor, bias: Tensor | None, constraint: str | None = 'to_output_scale', scale_power: Tuple[float, float, float] = (0.5, 0.5, 0.5)) → Tensor
Applies a unit-scaled linear transformation.
Applies a linear transformation to the incoming data: \(y = xA^T + b\).
This operation supports 2-D weight with sparse layout.
Warning: Sparse support is a beta feature and some layout(s)/dtype/device combinations may not be supported, or may not have autograd support. If you notice missing functionality, please open a feature request.
This operator supports TensorFloat32.
- Parameters:
constraint (Optional[str]) – The name of the constraint function to be applied to the outputs and input gradient. The constraint name must be one of: [None, 'gmean', 'hmean', 'amean', 'to_output_scale', 'to_grad_input_scale'] (see unit_scaling.constraints for details on these constraint functions). Defaults to 'to_output_scale', as given in the signature.
scale_power ((float, float, float)) – Scaling power for each of (output, grad(input), grad(weight|bias)).
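To make the interaction between constraint and scale_power concrete, the following is a minimal pure-Python sketch. It does not call the library. It assumes (this page does not state it) that each scale is a size raised to the negative of its power: fan-in for the output, fan-out for grad(input), and batch size for grad(weight|bias), and that the constraint then replaces the output and grad(input) scales with one shared value. The name sketch_linear_scales and the constraint functions below are hypothetical stand-ins, named after the options listed above; the real ones live in unit_scaling.constraints.

```python
import math
from typing import Callable, Dict, Optional, Tuple


# Hypothetical stand-ins for the constraint names listed in the parameter docs.
def gmean(*scales: float) -> float:
    """Geometric mean of the given scales."""
    return math.prod(scales) ** (1.0 / len(scales))


def hmean(*scales: float) -> float:
    """Harmonic mean of the given scales."""
    return len(scales) / sum(1.0 / s for s in scales)


def amean(*scales: float) -> float:
    """Arithmetic mean of the given scales."""
    return sum(scales) / len(scales)


def to_output_scale(output_scale: float, *grad_scales: float) -> float:
    """Use the output (forward) scale for everything."""
    return output_scale


def to_grad_input_scale(output_scale: float, grad_input_scale: float) -> float:
    """Use the grad(input) scale for everything."""
    return grad_input_scale


CONSTRAINTS: Dict[str, Callable[..., float]] = {
    "gmean": gmean,
    "hmean": hmean,
    "amean": amean,
    "to_output_scale": to_output_scale,
    "to_grad_input_scale": to_grad_input_scale,
}


def sketch_linear_scales(
    fan_in: int,
    fan_out: int,
    batch_size: int,
    constraint: Optional[str] = "to_output_scale",
    scale_power: Tuple[float, float, float] = (0.5, 0.5, 0.5),
) -> Tuple[float, float, float]:
    """Return (output, grad_input, grad_weight) scales under the stated assumptions."""
    # Assumption: each scale is size ** -power, one power per tensor.
    output_scale = fan_in ** -scale_power[0]
    grad_input_scale = fan_out ** -scale_power[1]
    grad_weight_scale = batch_size ** -scale_power[2]
    if constraint is not None:
        # Assumption: the constraint ties the output and grad(input) scales together.
        shared = CONSTRAINTS[constraint](output_scale, grad_input_scale)
        output_scale = grad_input_scale = shared
    return output_scale, grad_input_scale, grad_weight_scale


for name in (None, "gmean", "to_output_scale", "to_grad_input_scale"):
    print(name, sketch_linear_scales(fan_in=256, fan_out=64, batch_size=16, constraint=name))
```

With constraint=None the output and grad(input) scales stay independent. Any named constraint forces them to a single value, and under these assumptions the grad(weight|bias) scale is left untouched.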
- Shape:
Input: \((*, in\_features)\) where * means any number of additional dimensions, including none
Weight: \((out\_features, in\_features)\) or \((in\_features)\)
Bias: \((out\_features)\) or \(()\)
Output: \((*, out\_features)\) or \((*)\), based on the shape of the weight
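The shape rules above can be sketched as a small pure-Python helper. The function linear_output_shape is hypothetical and not part of unit_scaling; it only encodes the Input/Weight/Output rules listed in this section and ignores the bias, whose shape is implied by the weight.

```python
from typing import Tuple


def linear_output_shape(
    input_shape: Tuple[int, ...], weight_shape: Tuple[int, ...]
) -> Tuple[int, ...]:
    """Output shape for a linear op, following the Shape rules listed above."""
    if len(weight_shape) == 2:
        # Weight (out_features, in_features) -> output (*, out_features)
        out_features, in_features = weight_shape
        tail: Tuple[int, ...] = (out_features,)
    elif len(weight_shape) == 1:
        # Weight (in_features) -> output (*)
        (in_features,) = weight_shape
        tail = ()
    else:
        raise ValueError("weight must be 1-D or 2-D")
    if len(input_shape) == 0 or input_shape[-1] != in_features:
        raise ValueError("last input dimension must equal in_features")
    # Leading dimensions (*) pass through unchanged.
    return tuple(input_shape[:-1]) + tail


print(linear_output_shape((8, 32, 256), (64, 256)))
print(linear_output_shape((8, 256), (256,)))
```

The first call uses a 2-D weight, so the last input dimension is replaced by out_features. The second uses a 1-D weight, so the last dimension is dropped.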