Causal Language Model
A training approach (also called autoregressive language modeling) where the model predicts the next token given only the preceding tokens, working strictly left-to-right. This is how GPT models are trained and is the basis for text generation.
Why It Matters
Causal LMs are the foundation of most modern text-generating AI, including the GPT family. They predict one token at a time, and this simple objective scales to remarkably sophisticated capabilities.
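The training objective behind this can be sketched in a few lines: at each position, the model is penalized by the negative log-probability it assigned to the actual next token, given only the tokens to its left. The token ids and probabilities below are made-up toy values for illustration; a real model produces these probabilities via a softmax over a vocabulary of tens of thousands of tokens.

```python
import math

# Hypothetical toy sequence of token ids (an assumption for illustration).
tokens = [3, 7, 1, 4]

# Pretend the model assigned these probabilities to each correct next token,
# indexed by prediction step. In a real model, each comes from a softmax over
# the full vocabulary, conditioned only on the preceding tokens.
predicted_probs = {0: 0.50, 1: 0.25, 2: 0.10}

# Causal LM loss: average negative log-likelihood of each next token.
# A sequence of N tokens yields N - 1 next-token predictions.
num_predictions = len(tokens) - 1
loss = -sum(math.log(predicted_probs[t]) for t in range(num_predictions)) / num_predictions
print(round(loss, 4))  # → 1.4607
```

Minimizing this loss pushes the model to assign high probability to whatever token actually comes next, which is the entire training signal.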
Example
Given 'The cat sat on the', the model predicts 'mat'. Given that full sequence, it might then predict a period. Each prediction sees only the tokens that came before it.
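Generation from this example can be sketched with a deliberately tiny stand-in for a model: a hypothetical bigram lookup table that maps each word to its most likely successor. This is an assumption made for illustration only; a real causal LM conditions on the entire preceding context, not just the last word, but the left-to-right loop is the same.

```python
# Hypothetical toy "model": each word maps to its most likely next word.
NEXT = {
    "The": "cat", "cat": "sat", "sat": "on",
    "on": "the", "the": "mat", "mat": ".",
}

def generate(prompt: str, steps: int) -> str:
    """Greedily append the predicted next token, one at a time.

    Each step looks only at what has been generated so far — never ahead.
    """
    tokens = prompt.split()
    for _ in range(steps):
        tokens.append(NEXT[tokens[-1]])
    return " ".join(tokens)

print(generate("The cat sat on the", 2))  # → The cat sat on the mat .
```

The loop structure is the key point: the output of each step is fed back in as context for the next, which is exactly how causal LMs produce arbitrarily long text from a single next-token predictor.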
Think of it like...
Like writing a story one word at a time without being able to peek ahead — each word choice is based only on what has been written so far.
Related Terms
GPT
Generative Pre-trained Transformer — a family of large language models developed by OpenAI. GPT models are trained to predict the next token in a sequence and can generate coherent, contextually relevant text across many tasks.
Pre-training
The initial phase of training a model on a large, general-purpose dataset before specializing it for specific tasks. Pre-training gives the model broad knowledge and capabilities.