What is the Matthews Correlation Coefficient?
Matthews Correlation Coefficient (MCC) has many names:
- Phi Coefficient
- Pearson’s Phi Coefficient
- Yule Phi Coefficient
Unlike many other performance metrics (such as the F1-score), the MCC is widely regarded as one of the best measures for evaluating class predictions in a binary setting, even under severe class imbalance. One reason is that it takes all four cells of the confusion matrix into account, whereas the F1-score ignores true negatives. Although no single measure fully describes model performance, the MCC is often the best option.
In essence, the MCC is a correlation coefficient between the predicted values and the true values, which is why it always lies between -1 and +1. When the predictions are perfect, the MCC is +1. When the model does no better than random prediction, the MCC is 0. Finally, when the predictions and observations disagree completely, the MCC is -1. In most situations, you want this value to be as close to +1 as possible.
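To make the "correlation coefficient" reading concrete, here is a minimal sketch that computes the plain Pearson correlation between 0/1 labels and 0/1 predictions. The helper name `pearson` and the example vectors are made up for illustration; for binary inputs, the result coincides with the MCC.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical labels: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predictions: TP = 2, FN = 1, FP = 1, TN = 4.
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

perfect = pearson(y_true, y_true)                    # predictions match exactly
inverted = pearson(y_true, [1 - y for y in y_true])  # every prediction flipped
partial = pearson(y_true, y_pred)                    # some mistakes

print(perfect, inverted, partial)
```

With these vectors, the perfect case gives +1, the fully inverted case gives -1, and the partially correct case lands in between, at 7/15 (roughly 0.47).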
The formula for the Matthews Correlation Coefficient:
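In terms of the confusion-matrix counts, the standard definition can be written as follows. The symbols TP, TN, FP and FN (true positives, true negatives, false positives, false negatives) are the usual abbreviations and are assumed here:

```latex
\mathrm{MCC} =
  \frac{TP \cdot TN - FP \cdot FN}
       {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```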
Note that if any of the four sums in the denominator is 0, the denominator itself is 0 and the MCC is undefined; by convention, it is then set to 0. This happens when:
- One of the classes never occurs in the data (e.g. TP + FN = 0)
- All predictions return the same value (e.g. TP + FP = 0)
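The edge cases above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `mcc_from_counts` and the example counts are hypothetical, and returning 0 for a zero denominator is the convention described above rather than a mathematical necessity.

```python
from math import sqrt

def mcc_from_counts(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is 0."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Undefined case: a class is missing or all predictions are identical.
        return 0.0
    return (tp * tn - fp * fn) / denominator

# No positive examples in the data: TP + FN = 0.
no_positives_in_data = mcc_from_counts(tp=0, tn=5, fp=2, fn=0)
# The model only ever predicts the negative class: TP + FP = 0.
only_negative_preds = mcc_from_counts(tp=0, tn=5, fp=0, fn=3)
# A regular case where all four sums are non-zero.
regular_case = mcc_from_counts(tp=2, tn=4, fp=1, fn=1)

print(no_positives_in_data, only_negative_preds, regular_case)
```

Both degenerate cases fall back to 0, while the regular case evaluates to 7/15.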
It is worth noting that the formula can also be written using ratios only:
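One common way to write this ratio form is shown below. The abbreviations are assumed here and follow the usual definitions: PPV = TP / (TP + FP), TPR = TP / (TP + FN), TNR = TN / (TN + FP), NPV = TN / (TN + FN), and their complements FDR = 1 - PPV, FNR = 1 - TPR, FPR = 1 - TNR, FOR = 1 - NPV.

```latex
\mathrm{MCC} =
  \sqrt{PPV \cdot TPR \cdot TNR \cdot NPV}
  - \sqrt{FDR \cdot FNR \cdot FPR \cdot FOR}
```

Substituting the definitions, the first square root reduces to TP·TN divided by the square root of the four sums, and the second to FP·FN divided by the same quantity, so the difference matches the count-based formula.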
Further reading: