Gradient Clipping
A technique that limits the magnitude of gradients during training to prevent exploding gradients. If a gradient exceeds a chosen threshold, it is scaled down, either by rescaling the whole gradient vector (clipping by norm) or by capping individual components (clipping by value).
Why It Matters
Gradient clipping is a simple but essential safeguard when training deep networks, especially recurrent networks and large language models. It is widely supported in modern training frameworks and commonly enabled in training recipes.
Example
Setting a gradient clip norm of 1.0 means that any gradient vector whose L2 norm exceeds 1.0 is rescaled to have norm exactly 1.0 while preserving its direction.
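The example above can be sketched in plain NumPy. The helper name `clip_by_norm` is illustrative, not a framework API:

```python
import numpy as np

def clip_by_norm(grad, max_norm=1.0):
    """Rescale grad so its L2 norm is at most max_norm, keeping its direction."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])              # L2 norm = 5.0, exceeds the threshold
clipped = clip_by_norm(g, max_norm=1.0)
print(clipped)                        # [0.6, 0.8]: norm is 1.0, same direction
```

Gradients below the threshold pass through unchanged, so clipping only intervenes in the problematic cases.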
Think of it like...
Like a speed limiter on a car — it lets you drive normally but prevents dangerous speeds, keeping the system stable.
Related Terms
Exploding Gradient
A training problem where gradients become extremely large during backpropagation, causing weight updates to be so drastic that the model becomes unstable and training diverges.
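The mechanism can be sketched numerically: if each layer multiplies the backpropagated gradient by a factor greater than 1 (the factor 2.5 here is purely illustrative), the gradient grows exponentially with depth.

```python
# Sketch of gradient explosion: repeated per-layer scaling compounds exponentially.
grad = 1.0
scale_per_layer = 2.5   # illustrative per-layer amplification factor
depth = 20              # number of layers the gradient flows back through

for _ in range(depth):
    grad *= scale_per_layer

print(grad)  # roughly 9.1e7: an update this large destabilizes training
```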
Backpropagation
The core algorithm for computing gradients when training neural networks. It uses the chain rule to calculate how much each weight contributed to the error, working backward from the output layer; these gradients are then used to adjust the weights and reduce future errors.
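A minimal one-weight sketch of the idea (illustrative, not a full network): compute the prediction, measure the error, derive the gradient via the chain rule, and adjust the weight against it.

```python
# One weight, one training example: model is y_pred = w * x, loss = (y_pred - y)^2.
w = 0.5                  # initial weight
x, y = 2.0, 3.0          # input and target (ideal weight would be 1.5)

y_pred = w * x           # forward pass: prediction = 1.0
error = y_pred - y       # error = -2.0

# Backward pass: chain rule gives dL/dw = 2 * error * x
grad = 2 * error * x     # = -8.0

# Gradient descent step: move the weight against the gradient
lr = 0.05
w = w - lr * grad        # 0.5 + 0.4 = 0.9, closer to the ideal 1.5
print(w)
```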