Gradient Clipping
A technique that limits the magnitude of gradients during training to prevent exploding gradients. If a gradient exceeds a chosen threshold, it is scaled down, either by rescaling the whole gradient vector (clipping by norm) or by capping individual components (clipping by value).
Why It Matters
Gradient clipping is a simple but essential safeguard when training deep networks, especially recurrent networks and large language models. It is widely supported in modern training frameworks and commonly enabled in training recipes.
Example
Setting a gradient clip norm of 1.0 means that any gradient vector whose L2 norm exceeds 1.0 is rescaled to have norm exactly 1.0 while preserving its direction.
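The example above can be sketched in plain NumPy. The helper name `clip_by_norm` is illustrative, not a framework API:

```python
import numpy as np

def clip_by_norm(grad, max_norm=1.0):
    """Rescale grad so its L2 norm is at most max_norm, keeping its direction."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])              # L2 norm = 5.0, exceeds the threshold
clipped = clip_by_norm(g, max_norm=1.0)
print(clipped)                        # [0.6, 0.8]: norm is 1.0, same direction
```

Gradients below the threshold pass through unchanged, so clipping only intervenes in the problematic cases.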
Think of it like...
Like a speed limiter on a car — it lets you drive normally but prevents dangerous speeds, keeping the system stable.
Related Terms
Exploding Gradient
A training problem where gradients become extremely large during backpropagation, causing weight updates to be so drastic that the model becomes unstable and training diverges.
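The mechanism can be sketched numerically: if each layer multiplies the backpropagated gradient by a factor greater than 1 (the factor 2.5 here is purely illustrative), the gradient grows exponentially with depth.

```python
# Sketch of gradient explosion: repeated per-layer scaling compounds exponentially.
grad = 1.0
scale_per_layer = 2.5   # illustrative per-layer amplification factor
depth = 20              # number of layers the gradient flows back through

for _ in range(depth):
    grad *= scale_per_layer

print(grad)  # roughly 9.1e7: an update this large destabilizes training
```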
Backpropagation
The core algorithm for computing gradients when training neural networks. It uses the chain rule to calculate how much each weight contributed to the error, working backward from the output layer; these gradients are then used to adjust the weights and reduce future errors.
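A minimal one-weight sketch of the idea (illustrative, not a full network): compute the prediction, measure the error, derive the gradient via the chain rule, and adjust the weight against it.

```python
# One weight, one training example: model is y_pred = w * x, loss = (y_pred - y)^2.
w = 0.5                  # initial weight
x, y = 2.0, 3.0          # input and target (ideal weight would be 1.5)

y_pred = w * x           # forward pass: prediction = 1.0
error = y_pred - y       # error = -2.0

# Backward pass: chain rule gives dL/dw = 2 * error * x
grad = 2 * error * x     # = -8.0

# Gradient descent step: move the weight against the gradient
lr = 0.05
w = w - lr * grad        # 0.5 + 0.4 = 0.9, closer to the ideal 1.5
print(w)
```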