AI Glossary
The definitive dictionary for AI, Machine Learning, and Governance terminology. From Flash Attention to RAG — look up any term.
C
Capability Elicitation
Techniques for discovering and activating latent capabilities in AI models — abilities that exist but are not obvious from standard testing or usage.
Catastrophic Forgetting
The tendency of neural networks to completely forget previously learned information when trained on new data or tasks. New learning overwrites old knowledge.
Catastrophic Interference
When learning new information in a neural network severely disrupts previously learned knowledge. It is the underlying mechanism behind catastrophic forgetting.
Catastrophic Risk
The potential for AI systems to cause large-scale, irreversible harm to society. This includes risks from misuse (bioweapons), accidents (autonomous systems), and structural disruption (mass unemployment).
CatBoost
A gradient boosting library by Yandex that handles categorical features natively without requiring manual encoding. CatBoost also addresses prediction shift and target leakage.
Causal Inference
Statistical methods for determining cause-and-effect relationships from data, going beyond correlation to understand whether X actually causes Y.
Causal Language Model
A training approach where the model predicts the next token given only the preceding tokens (left-to-right). This is how GPT models are trained and is the basis for text generation.
Chain-of-Thought
A prompting technique where the model is encouraged to show its step-by-step reasoning process before arriving at a final answer. This improves accuracy on complex reasoning tasks.
Chatbot
An AI application designed to simulate conversation with human users through text or voice. Modern chatbots use LLMs to provide natural, contextually aware responses.
ChatGPT
OpenAI's consumer-facing AI chatbot powered by GPT models. ChatGPT brought LLMs to the mainstream when it launched in November 2022, reaching 100 million users in two months.
Chinchilla Scaling
Research by DeepMind showing that many LLMs were significantly undertrained — for a given compute budget, training a smaller model on more data yields better performance.
Chunking
The process of breaking large documents into smaller pieces (chunks) before creating embeddings for use in RAG systems. Chunk size and strategy significantly impact retrieval quality.
CI/CD for ML
Continuous Integration and Continuous Deployment applied to machine learning — automating the testing, validation, and deployment of ML models whenever code or data changes.
Citizen Data Scientist
A business professional who creates ML models and analytics using no-code or low-code tools, without formal data science training. They bridge the gap between business and technical teams.
Classification
A type of supervised learning task where the model predicts which category or class an input belongs to. The output is a discrete label rather than a continuous value.
Claude
Anthropic's family of AI assistants known for their focus on safety, helpfulness, and honesty. Claude models are designed with Constitutional AI principles for safer, more reliable AI interactions.
CLIP
Contrastive Language-Image Pre-training — an OpenAI model trained to understand the relationship between images and text. CLIP can match images to text descriptions without being trained on specific image categories.
Closed Source AI
AI models where the architecture, weights, and training details are proprietary and not publicly available. Users access them only through APIs or products controlled by the developer.
Cloud Computing
On-demand access to computing resources (servers, storage, databases, AI services) over the internet. Cloud providers like AWS, Azure, and GCP offer scalable infrastructure without owning physical hardware.
Clustering
An unsupervised learning technique that groups similar data points together based on their characteristics, without predefined labels. The algorithm discovers natural groupings in the data.
Code Generation
The AI capability of producing functional source code from natural language descriptions, specifications, or partial code. Modern LLMs can generate code in dozens of programming languages.
Cognitive Architecture
A framework or blueprint for building AI systems that mimics aspects of human cognition, including perception, memory, reasoning, learning, and action.
Cold Start Problem
The challenge of making recommendations for new users (who have no history) or new items (which have no ratings). Cold start is a fundamental difficulty in recommendation systems.
Collaborative Filtering
A recommendation technique that predicts a user's interests based on the preferences of similar users. It assumes people who agreed in the past will agree again in the future.
Compliance
The process of ensuring AI systems meet regulatory requirements, industry standards, and organizational policies. AI compliance is becoming increasingly complex as regulations proliferate.
Compute
The computational resources (processing power, memory, time) required to train or run AI models. Compute is measured in FLOPs (floating-point operations) and is a primary constraint and cost in AI development.
Compute-Optimal Training
Allocating a fixed compute budget optimally between model size and training data quantity, based on scaling law research like the Chinchilla findings.
Computer Vision
A field of AI that trains computers to interpret and understand visual information from the world — images, videos, and real-time camera feeds. It enables machines to 'see' and make decisions based on what they see.
Concept Bottleneck
A model architecture that forces predictions through a set of human-interpretable concepts. The model first predicts concepts, then uses those concepts to make the final prediction.
Concept Drift
A change in the underlying relationship between inputs and outputs over time. Unlike data drift, concept drift means the rules of the game have changed, not just the distribution of inputs.
Confidence Score
A numerical value (typically 0-1) indicating how certain a model is about its prediction. Higher scores indicate greater confidence in the output.
Confusion Matrix
A table that summarizes the performance of a classification model by showing true positives, true negatives, false positives, and false negatives. It reveals the types of errors a model makes.
Confusion Matrix Metrics
The set of performance metrics derived from the confusion matrix including true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
Constitutional AI
An alignment approach developed by Anthropic where AI models are guided by a set of principles (a 'constitution') that help them self-evaluate and improve their responses without relying solely on human feedback.
Constitutional AI Principles
The specific set of rules and values embedded in a Constitutional AI system that guide its self-evaluation and response generation. These principles define what 'good' behavior means.
Constrained Generation
Techniques that force LLM output to conform to specific formats, schemas, or grammars. This ensures outputs are always valid JSON, SQL, or match a defined structure.
Constraint Satisfaction
The problem of finding values for variables that satisfy a set of constraints. In AI, it is used in scheduling, planning, and configuration tasks.
Content Moderation
The process of monitoring and filtering user-generated or AI-generated content to ensure it meets platform guidelines and legal requirements. AI is increasingly used to automate content moderation.
Content-Based Filtering
A recommendation technique that suggests items similar to those a user has previously liked, based on the items' features and attributes rather than other users' behavior.
Context Distillation
A technique where the behavior of a model prompted with detailed instructions is distilled into a model that exhibits the same behavior without the instructions.
Context Management
Strategies for efficiently using an LLM's limited context window, including what information to include, how to compress it, and when to summarize or truncate.
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. It includes both the input prompt and the generated output. Larger context windows allow models to handle longer documents.
Contextual Bandits
An extension of multi-armed bandits where the agent observes context (features) before making a decision, enabling personalized choices based on the current situation.
Continual Learning
Training a model on new data or tasks over time without forgetting previously learned knowledge. Also called lifelong learning or incremental learning.
Continual Pre-Training
Extending a pre-trained model's training on new domain-specific data without starting from scratch. It adapts the model to a new domain while preserving general capabilities.
Continuous Batching
A serving technique where new requests are added to an in-progress batch as existing requests complete, maximizing GPU utilization rather than waiting for an entire batch to finish.
Contrastive Learning
A self-supervised technique where the model learns by comparing similar (positive) and dissimilar (negative) pairs of examples. It learns representations where similar items are close and different items are far apart.
Conversational AI
AI technology that enables natural, multi-turn conversations between humans and machines. It combines NLU, dialog management, and NLG to maintain coherent, contextual interactions.
Convolutional Neural Network
A type of neural network specifically designed for processing grid-like data such as images. CNNs use convolutional layers that apply filters to detect patterns like edges, textures, and shapes at different scales.
Cosine Similarity
A metric that measures the similarity between two vectors by calculating the cosine of the angle between them. Values range from -1 (opposite) to 1 (identical), with 0 meaning unrelated.
Counterfactual Explanation
An explanation of an AI decision that describes what would need to change in the input for the model to produce a different output. It answers 'What if?' questions about predictions.
Cross-Encoder
A model that takes two texts as input simultaneously and outputs a relevance or similarity score. Unlike bi-encoders, cross-encoders consider the full interaction between both texts.
Cross-Entropy
A loss function commonly used in classification tasks that measures the difference between the predicted probability distribution and the actual distribution. Lower cross-entropy means better predictions.
Cross-Validation
A model evaluation technique that splits data into multiple folds, trains on some folds and tests on the held-out fold, repeating so every fold serves as the test set. It provides a robust estimate of model performance.
Crowdsourcing
Using a large group of distributed workers (often through platforms like Amazon Mechanical Turk or Scale AI) to perform data annotation and labeling tasks.
CUDA
Compute Unified Device Architecture — NVIDIA's parallel computing platform that enables GPU programming for AI workloads. CUDA is the dominant software ecosystem for AI computation.
Curriculum Learning
A training strategy inspired by human education where the model is exposed to training examples in a meaningful order — starting with easier examples and gradually increasing difficulty.