What is a confusion matrix?
The confusion matrix (or “error matrix”) is a table that describes the performance of a classification model by comparing its predictions against a data set whose true values are known. In a binary classification task, the confusion matrix is a 2×2 table; extending it to K classes is analogous and gives a K×K table.
A binary classifier can make two types of errors:
- Predicting positive while the true value is negative (false positive)
- Predicting negative while the true value is positive (false negative)
A binary classifier can also be correct in two ways:
- Predicting positive when the true value is positive (true positive)
- Predicting negative when the true value is negative (true negative)
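The four outcomes above can be counted directly from paired lists of true and predicted labels. The following is a minimal sketch in plain Python; the function name `count_outcomes`, the `positive` parameter, and the sample labels are made up for illustration and do not come from any particular library.

```python
def count_outcomes(y_true, y_pred, positive=1):
    """Count the four outcomes of a binary classifier.

    `positive` is the label treated as the positive class (an assumption
    of this sketch); every other label counts as negative.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1  # predicted positive, truly positive
        elif pred == positive:
            counts["FP"] += 1  # predicted positive, truly negative
        elif truth == positive:
            counts["FN"] += 1  # predicted negative, truly positive
        else:
            counts["TN"] += 1  # predicted negative, truly negative
    return counts


# Illustrative labels only, not real data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

outcomes = count_outcomes(y_true, y_pred)
print(outcomes)  # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
```

The four counts always sum to the number of observations, which is a quick sanity check on any confusion matrix.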
A typical confusion matrix arranges these four outcomes in a grid, with the predicted values on one axis and the true values on the other. There is no general agreement about which goes on the rows and which on the columns. “An Introduction to Statistical Learning” puts the predictions on the rows and the true values on the columns, while “Python Machine Learning” puts the predictions on the columns and the true values on the rows.
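Because the two layouts differ only in which axis holds the predictions, one is simply the transpose of the other. The sketch below builds the table both ways in plain Python; the function `confusion_matrix`, its `predictions_on_rows` flag, and the sample labels are hypothetical names chosen for this illustration.

```python
def confusion_matrix(y_true, y_pred, labels, predictions_on_rows=False):
    """Build a len(labels) x len(labels) table of counts.

    By default rows are true values and columns are predictions;
    set predictions_on_rows=True for the opposite layout.
    """
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for truth, pred in zip(y_true, y_pred):
        t, p = index[truth], index[pred]
        if predictions_on_rows:
            matrix[p][t] += 1
        else:
            matrix[t][p] += 1
    return matrix


# Illustrative labels only, not real data.
y_true = ["pos", "neg", "pos", "pos", "neg", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg", "pos", "pos"]
labels = ["neg", "pos"]

true_on_rows = confusion_matrix(y_true, y_pred, labels)
pred_on_rows = confusion_matrix(y_true, y_pred, labels, predictions_on_rows=True)

print(true_on_rows)  # [[2, 2], [1, 3]]
print(pred_on_rows)  # [[2, 1], [2, 3]]
```

Since the off-diagonal cells swap between the two layouts, it is worth checking which convention a book or library uses before reading false positives and false negatives off a table. The same function works unchanged for more than two classes if `labels` lists all of them.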
The four counts in a confusion matrix can be combined into a variety of metrics and ratios, such as accuracy, precision, and recall, to assess a classification algorithm’s performance.
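As a rough illustration of such ratios, the sketch below derives a few commonly used metrics from the four cell counts. The function name `classification_metrics` and the example counts are assumptions made for this example, and returning NaN for an undefined ratio is one possible convention among several.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common ratios from the four cells of a 2x2 confusion matrix."""

    def ratio(numerator, denominator):
        # An undefined ratio (zero denominator) is reported as NaN here;
        # this is a choice of this sketch, not a universal rule.
        return numerator / denominator if denominator else float("nan")

    return {
        # Share of all predictions that were correct.
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        # Share of predicted positives that were truly positive.
        "precision": ratio(tp, tp + fp),
        # Share of true positives that were found (also called sensitivity).
        "recall": ratio(tp, tp + fn),
        # Share of true negatives that were correctly rejected.
        "specificity": ratio(tn, tn + fp),
        # Harmonic mean of precision and recall.
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Illustrative counts only, not real results.
metrics = classification_metrics(tp=3, fp=2, fn=1, tn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Which metric matters most depends on the relative cost of false positives and false negatives in the application.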