unit-scaling
3.1.23.1. unit_scaling.optim.lr_scale_for_depth
unit_scaling.optim.lr_scale_for_depth(param: ParameterData) → float
Calculate the LR scaling factor for depth only.
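
The reference entry does not state the formula used, so the following is only a standalone sketch of what a depth-only learning-rate scale could look like. It does not import unit_scaling. The class `SketchParam`, its `depth` attribute, the function `depth_lr_scale_sketch`, and the 1/sqrt(depth) rule are all assumptions made for illustration, not the library's confirmed behaviour.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParam:
    """Hypothetical stand-in for a parameter tagged with depth information."""

    # Assumed attribute: the depth the parameter is scaled for,
    # or None when no depth scaling applies to it.
    depth: Optional[int] = None


def depth_lr_scale_sketch(param: SketchParam) -> float:
    """Assumed rule: multiply the LR by 1/sqrt(depth); no depth means no scaling."""
    if param.depth is None:
        return 1.0
    return 1.0 / math.sqrt(param.depth)


# Apply the depth-only factor to a base learning rate for a few parameters.
base_lr = 1e-2
for p in (SketchParam(), SketchParam(depth=4), SketchParam(depth=16)):
    print(p.depth, base_lr * depth_lr_scale_sketch(p))
```

Under this assumed rule, a parameter with no depth tag keeps the base learning rate, while deeper stacks receive proportionally smaller steps. Consult the linked source of lr_scale_for_depth for the rule the library actually applies.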