Performance Metrics: Matthews Correlation Coefficient

roelpi

3 years ago

What is the Matthews Correlation Coefficient?

Matthews Correlation Coefficient (MCC) has many names:

Unlike some other performance metrics (such as the F1 score), the MCC is regarded as one of the best measures for evaluating class predictions in a binary setting, even under severe class imbalance. It takes all four cells of the confusion matrix into account, whereas the F1 score ignores true negatives. Although no single measure fully describes model performance, the MCC is often your best option.
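To illustrate the imbalance point, here is a small sketch that computes both metrics from the same confusion matrix, using the standard count-based definitions. The counts are made up for illustration (95 positives, 5 negatives) and do not come from a real model.

```python
import math

# Hypothetical imbalanced test set: 95 positives, 5 negatives.
tp, fn = 90, 5  # most positives are found
fp, tn = 4, 1   # most negatives are misclassified

# F1 only looks at TP, FP and FN, so the poorly handled negatives barely matter.
f1 = 2 * tp / (2 * tp + fp + fn)

# MCC uses all four cells of the confusion matrix.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"F1 = {f1:.3f}, MCC = {mcc:.3f}")  # F1 = 0.952, MCC = 0.135
```

The F1 score looks excellent here, while the MCC shows that the model is only slightly better than chance once the minority class is taken into account.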

In essence, the MCC is a correlation coefficient between the predicted values and the true values. That’s why it returns a value between -1 and 1. When the predictions are perfect, the MCC is +1. When the classifier does no better than random prediction, it is 0. Finally, when the predictions and observations disagree completely, the MCC is -1. In most situations, you’d want this value to be as close to 1 as possible.
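The correlation view can be checked directly: for 0/1-encoded labels, the Pearson correlation between true and predicted values reproduces the three landmark values above. This is a minimal sketch in plain Python; the helper name `pearson` and the toy label vectors are assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

y_true = [1, 1, 0, 0]

perfect = pearson(y_true, [1, 1, 0, 0])   # every prediction is correct
chance = pearson(y_true, [1, 0, 1, 0])    # predictions unrelated to the labels
inverted = pearson(y_true, [0, 0, 1, 1])  # every prediction is wrong

print(perfect, chance, inverted)  # 1.0 0.0 -1.0
```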

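The formula for the Matthews Correlation Coefficient, where TP, TN, FP and FN are the counts of true positives, true negatives, false positives and false negatives:

$$
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
$$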

Interestingly, if any of the four sums in the denominator is 0, the denominator itself is 0 and the MCC is strictly undefined (the numerator is 0 as well). By convention, the MCC is then set to 0. This happens when:

One of the classes never occurs in the data (e.g. TP + FN = 0)
All predictions return the same value (e.g. TP + FP = 0)
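A minimal sketch of how the count-based formula and the zero-denominator cases above might be handled in code. The function name `mcc` and the example counts are illustrative assumptions, and returning 0 for a vanishing denominator is a common convention, not a mathematical necessity.

```python
import math

def mcc(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; 0.0 when the denominator vanishes."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Undefined 0/0 case: fall back to the conventional value of 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator

regular = mcc(tp=6, tn=3, fp=1, fn=2)             # all four sums are non-zero
no_positives = mcc(tp=0, tn=7, fp=3, fn=0)        # TP + FN = 0
all_negative_preds = mcc(tp=0, tn=6, fp=0, fn=4)  # TP + FP = 0

print(round(regular, 3), no_positives, all_negative_preds)  # 0.478 0.0 0.0
```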

It is worth noting that the formula can also be written using ratios only:
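One commonly cited ratio-only form is MCC = sqrt(PPV · TPR · TNR · NPV) - sqrt(FDR · FNR · FPR · FOR), where PPV is precision, TPR is recall, TNR is specificity, NPV is the negative predictive value, and the second group contains their complements. The sketch below assumes that this is the form meant here and checks numerically that it agrees with the count-based definition; the counts are hypothetical.

```python
import math

tp, tn, fp, fn = 6, 3, 1, 2  # hypothetical confusion-matrix counts

# The four "success" ratios.
ppv = tp / (tp + fp)  # precision (positive predictive value)
tpr = tp / (tp + fn)  # recall (true positive rate)
tnr = tn / (tn + fp)  # specificity (true negative rate)
npv = tn / (tn + fn)  # negative predictive value

# Their complements: false discovery, false negative,
# false positive and false omission rates.
fdr, fnr, fpr, f_or = 1 - ppv, 1 - tpr, 1 - tnr, 1 - npv

mcc_from_ratios = math.sqrt(ppv * tpr * tnr * npv) - math.sqrt(fdr * fnr * fpr * f_or)

mcc_from_counts = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(math.isclose(mcc_from_ratios, mcc_from_counts))  # True
```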