Data Science

Data Quality

The degree to which data is accurate, complete, consistent, timely, and fit for its intended use. Data quality directly impacts the reliability and performance of AI models.

Why It Matters

Data quality is the foundation of trustworthy AI. Models trained on high-quality data consistently outperform models trained on larger but lower-quality datasets.

Example

An audit revealing that a customer database has 15% duplicate records, 8% missing email addresses, and inconsistent date formats — all of which must be fixed before ML training.

Think of it like...

Like the ingredients in cooking — a five-star recipe with rotten ingredients produces a terrible dish. The best model architecture cannot compensate for bad data.

Related Terms