AI Glossary
The definitive dictionary for AI, Machine Learning, and Governance terminology. From Flash Attention to RAG — look up any term.
Q
QLoRA
Quantized Low-Rank Adaptation. A fine-tuning method that combines LoRA with quantization to further reduce memory requirements. The base model's weights are quantized to 4-bit precision and kept frozen, while small LoRA adapters are trained in higher precision (typically 16-bit).
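As a rough illustration of how the pieces fit together, the sketch below stores a frozen weight matrix as 4-bit integer codes and adds a small low-rank update on top. It is a simplified stand-in, not the published method: it assumes uniform absmax quantization per row instead of the NF4 data type, and the names `quantize_4bit`, `qlora_forward`, and `scaling` are hypothetical.

```python
import random


def quantize_4bit(matrix):
    """Uniform 4-bit absmax quantization per row (assumed stand-in for NF4)."""
    rows = []
    for row in matrix:
        max_abs = max(abs(v) for v in row) or 1.0
        scale = max_abs / 7                      # signed 4-bit codes in [-7, 7]
        rows.append(([round(v / scale) for v in row], scale))
    return rows


def matvec(matrix, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def qlora_forward(q_rows, A, B, x, scaling=1.0):
    """y = dequant(W_q) x + scaling * B (A x); only A and B would be trained."""
    base = [sum(q * scale * xi for q, xi in zip(codes, x))
            for codes, scale in q_rows]
    update = matvec(B, matvec(A, x))
    return [b + scaling * u for b, u in zip(base, update)]


random.seed(0)
d_out, d_in, rank = 4, 6, 2
W = [[random.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
q_rows = quantize_4bit(W)                        # frozen base, stored as 4-bit codes
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]         # zero init: adapter starts as a no-op
x = [1.0, -2.0, 0.5, 0.0, 3.0, -1.0]
print(qlora_forward(q_rows, A, B, x))
```

Here the adapter holds rank × (d_in + d_out) = 20 trainable values against 24 frozen ones; at realistic layer sizes the gap is far larger, which is where the memory savings come from.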
Quantization
The process of reducing the numerical precision of a model's weights, and often its activations (e.g., from 32-bit floating point to 8-bit or 4-bit integers). This makes the model smaller and usually faster at inference, at the cost of some accuracy, which is often small.
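A minimal sketch of one common scheme, symmetric absmax quantization, may make the trade-off concrete. The function names are hypothetical and the example assumes a single scale for the whole list of weights; real toolkits typically use per-channel or per-block scales.

```python
def quantize_absmax(weights, bits=8):
    """Map floats to signed integers using one shared scale."""
    qmax = 2 ** (bits - 1) - 1                   # 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.12, -0.5, 0.33, 0.9, -0.07]
for bits in (8, 4):
    q, scale = quantize_absmax(weights, bits)
    restored = dequantize(q, scale)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(bits, q, max_err)
```

The rounding error of each weight is bounded by half the scale, so dropping from 8 to 4 bits shrinks storage but widens the gap between the original and restored values.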
Quantization-Aware Training
Training a model while simulating the effects of quantization in the forward pass, so the model learns to maintain accuracy when its weights (and often its activations) are later stored at lower precision.
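The idea can be sketched with a single weight, assuming "fake" quantization in the forward pass and a straight-through estimator for the gradient, which is one common way to implement it. The function names, the quantization step `scale`, and the toy data are all hypothetical.

```python
def fake_quantize(w, scale, qmax=7):
    """Snap w to the nearest 4-bit level but keep it as a float."""
    q = max(-qmax, min(qmax, round(w / scale)))
    return q * scale


def train_qat(data, scale=0.25, lr=0.05, epochs=200):
    w = 0.0                                      # full-precision "shadow" weight
    for _ in range(epochs):
        for x, y in data:
            w_q = fake_quantize(w, scale)        # forward pass sees the quantized weight
            err = w_q * x - y
            grad = 2 * err * x                   # d(loss)/d(w_q) for squared error
            w -= lr * grad                       # straight-through: treat d(w_q)/d(w) as 1
    return w, fake_quantize(w, scale)


# Toy regression data with a target slope of 1.1, which is not a representable level.
data = [(x, 1.1 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
w_float, w_quant = train_qat(data)
print(w_float, w_quant)
```

Because the loss is always computed with the quantized weight, the full-precision weight is pushed toward a value whose quantized version fits the data; in this toy case it settles near the boundary between the two representable levels closest to the target.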
Question Answering
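An NLP task where the model provides direct answers to questions. In extractive QA the answer is a span taken from a given context passage; in open-domain QA no passage is supplied, so the system must retrieve evidence from a large corpus or rely on knowledge stored in the model itself (the latter is often called closed-book QA).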