Adversarial Training
A defense technique where adversarial examples are included in the training data to make the model more robust against attacks. The model learns to handle both normal and adversarial inputs.
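This setup is commonly written as a min-max optimization: an inner maximization finds the worst-case perturbation within a budget, and the outer minimization trains the model against it. A sketch using the conventional notation (these symbols are standard in the literature, not taken from this entry):

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
\left[ \max_{\|\delta\| \le \epsilon} \mathcal{L}\big(f_{\theta}(x + \delta),\, y\big) \right]
```

Here \(f_{\theta}\) is the model, \(\mathcal{L}\) the loss, and \(\epsilon\) bounds how large the adversarial perturbation \(\delta\) may be.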
Why It Matters
Adversarial training is among the most effective known defenses against adversarial attacks, making models significantly more robust and therefore a standard choice for hardening systems in safety-critical applications.
Example
Generating adversarial versions of training images and including them in training, teaching the classifier to correctly identify objects even when adversarial noise is present.
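The loop described above can be sketched in a few lines. This is a minimal illustration, assuming FGSM-style (fast gradient sign) perturbations and a toy logistic-regression classifier; all function names and hyperparameters here are illustrative, not from any particular library:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Craft an adversarial version of input x by stepping in the
    direction that most increases the loss (the FGSM attack)."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w  # gradient of cross-entropy loss w.r.t. x
    return x + eps * np.sign(grad_x)

def adversarial_train(X, y, eps=0.1, lr=0.1, epochs=200, seed=0):
    """Train a logistic-regression model on both the clean examples
    and FGSM-perturbed copies of them."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Generate adversarial versions of the current training data.
        X_adv = np.array([fgsm_perturb(x, t, w, b, eps)
                          for x, t in zip(X, y)])
        # Include them alongside the normal inputs.
        X_all = np.vstack([X, X_adv])
        y_all = np.concatenate([y, y])
        # One gradient-descent step on the combined batch.
        p = sigmoid(X_all @ w + b)
        w -= lr * (p - y_all) @ X_all / len(y_all)
        b -= lr * np.mean(p - y_all)
    return w, b
```

The adversarial examples are regenerated each epoch against the current model, so the defense tracks the model as it trains; a model trained this way should keep its accuracy even on inputs perturbed at the same budget `eps`.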
Think of it like...
Like a martial artist who practices against opponents who use unconventional techniques — the unexpected practice makes them better prepared for real fights.
Related Terms
Adversarial Attack
An input deliberately crafted to fool an AI model into making incorrect predictions. Adversarial examples often look normal to humans but cause models to fail spectacularly.
Robustness
The ability of an AI model to maintain reliable performance when faced with unexpected inputs, adversarial attacks, data distribution changes, or edge cases.
Data Augmentation
Techniques for artificially expanding a training dataset by creating modified versions of existing data. This helps models generalize better, especially when training data is limited.
Regularization
Techniques used to prevent overfitting by adding constraints or penalties to the model during training. Regularization discourages the model from becoming too complex or fitting noise in the training data.
AI Safety
The research field focused on ensuring AI systems operate reliably, predictably, and without causing unintended harm. It spans from technical robustness to long-term existential risk concerns.