Machine Learning

Confusion Matrix

What is a confusion matrix? The confusion matrix (or “error matrix”) is a table that describes the performance of a classification model by comparing its predictions to a data set whose true values are known. In a binary classification task, the confusion matrix is a…

Data Leakage

What is data leakage? In machine learning, data leakage describes the use of data from outside the training data set to create the model. This is a problem because our goal is to develop a model that is good…

Data Shift

What is data shift? Data shift (also called dataset shift, model drift, or data drift) is a change over time in the input data your model receives, relative to the data it was trained on. It is one of the most common reasons for degrading model accuracy. That’s why…

Performance Metrics: Accuracy

What is accuracy? Accuracy is a performance metric that tells you the fraction of predictions that were correct, without distinguishing between positive and negative predictions. Accuracy can be a very misleading metric when the data set is unbalanced (when the prevalence is either very high or very…
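
As a rough illustration of the definition above, the sketch below computes accuracy as the fraction of correct predictions and shows how it can mislead on an unbalanced data set. The `accuracy` helper, the labels, and the always-negative "model" are hypothetical and made up for this example; they are not taken from the original post.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical unbalanced data set: 5 positives among 100 samples.
y_true = [1] * 5 + [0] * 95

# A trivial "model" that always predicts the negative class.
y_pred = [0] * 100

# High accuracy, even though not a single positive case is detected.
print(accuracy(y_true, y_pred))  # 0.95
```

The score does not distinguish between positive and negative predictions, so a model that never finds a positive case can still look strong when positives are rare.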

Performance Metrics: F1-Score

What is the F1-Score? The F1-Score has many names: F-Score, F-Measure, Sørensen’s Similarity Coefficient, Sørensen–Dice Coefficient, Dice Similarity Coefficient (DSC), Dice’s Coincidence Index, and Hellden’s Mean Accuracy Index. The F1-Score is a metric to evaluate the performance of a binary classifier. It is calculated as the harmonic mean of the precision…