Weight
A numerical parameter in a neural network that is learned during training. Weights determine the strength of connections between neurons and collectively encode the model's knowledge.
Why It Matters
The weights are literally what the model has learned: a GPT-4-class model is estimated to contain hundreds of billions of weights that together encode its capabilities.
Example
A 7-billion-parameter model like Llama 2 7B has roughly 7 billion learnable values, each one adjusted during training to minimize prediction error.
Think of it like...
Like the strength settings on a mixing board — each slider (weight) controls how much one signal influences the final output, and training adjusts all the sliders.
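The mixing-board analogy can be sketched as a single neuron, where each weight scales one input's influence on the output (the numbers below are made up for illustration, not taken from any real model):

```python
# A single neuron: each weight scales one input's contribution,
# like a slider on a mixing board. Values are hypothetical.
inputs  = [0.5, -1.0, 2.0]
weights = [0.8,  0.3, 0.5]   # learned during training
bias    = 0.1

# Weighted sum of inputs, plus a bias term.
output = sum(x * w for x, w in zip(inputs, weights)) + bias
print(round(output, 6))  # 0.5*0.8 + (-1.0)*0.3 + 2.0*0.5 + 0.1 = 1.2
```

Training adjusts `weights` (and `bias`) so that outputs like this one move closer to the desired targets.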
Related Terms
Parameter
Any learnable value in a machine learning model that is adjusted during training. Parameters include weights and biases in neural networks. Model size is often described by parameter count.
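Since parameters include both weights and biases, the parameter count of a simple fully connected network can be tallied layer by layer (the layer sizes here are hypothetical):

```python
# Parameter count for a small fully connected network (hypothetical sizes).
# Each dense layer has n_in * n_out weights plus n_out biases.
layer_sizes = [784, 128, 10]  # e.g. input, hidden, output

total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    total += n_in * n_out  # weights
    total += n_out         # biases

print(total)  # 784*128 + 128 + 128*10 + 10 = 101770
```

The same bookkeeping, scaled up across many large layers, is where figures like "7 billion parameters" come from.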
Neural Network
A computing system inspired by the biological neural networks in the human brain. It consists of interconnected nodes (neurons) organized in layers that process information and learn to recognize patterns.
Backpropagation
The primary algorithm used to train neural networks. It calculates how much each weight in the network contributed to the error, then adjusts weights backward from the output layer to reduce future errors.
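The "how much did each weight contribute to the error" step is the chain rule. A minimal sketch on a single linear neuron with squared error, using made-up numbers:

```python
# Backpropagation on one linear neuron with squared-error loss,
# worked by hand via the chain rule (illustrative numbers only).
x, w, target = 2.0, 0.5, 3.0

y = w * x                    # forward pass: prediction
loss = (y - target) ** 2     # squared error

# Backward pass: how much did w contribute to the loss?
dloss_dy = 2 * (y - target)  # d(loss)/dy
dy_dw = x                    # dy/dw
dloss_dw = dloss_dy * dy_dw  # chain rule: d(loss)/dw

print(dloss_dw)  # 2*(1.0 - 3.0)*2.0 = -8.0
```

The negative gradient says increasing `w` would reduce the loss, which is exactly the signal gradient descent uses to adjust the weight.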
Gradient Descent
An optimization algorithm used to minimize the error (loss) of a model by iteratively adjusting parameters in the direction opposite the gradient, the direction of steepest descent. Gradient descent and its variants (such as stochastic gradient descent and Adam) are the primary methods for training machine learning models.
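The update loop can be sketched on a toy one-parameter problem, minimizing f(w) = (w - 3)^2, whose minimum is at w = 3 (the loss function and learning rate are hypothetical choices for illustration):

```python
# Gradient descent on f(w) = (w - 3)^2, minimized at w = 3.
# The gradient is f'(w) = 2*(w - 3); we step opposite the gradient.
w = 0.0      # initial guess
lr = 0.1     # learning rate

for step in range(100):
    grad = 2 * (w - 3)   # direction of steepest increase
    w -= lr * grad       # move against it to reduce the loss

print(round(w, 4))  # converges toward 3.0
```

Training a real network is the same loop, except the single value `w` becomes billions of weights and the gradient for each one is supplied by backpropagation.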