sklearn

Solving “Found unknown categories […] in column” with sklearn OneHotEncoder

In this short blog post, I tackle an error related to a classic problem within machine learning: how to treat unseen categorical values and solve the “found unknown categories” error. Imagine you have a train and a test data set with the following values in a column: Train: [‘green’, ‘red’,… 
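To make the problem concrete, here is a minimal sketch of how the error can arise and one common way around it. This is not necessarily the approach taken in the full post. The colour values 'blue' and 'purple' and all variable names are made up for illustration; the only scikit-learn feature relied on is the `handle_unknown` parameter of `OneHotEncoder`.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: 'purple' appears in the test set but never in training.
train = np.array([["green"], ["red"], ["blue"]], dtype=object)
test = np.array([["green"], ["purple"]], dtype=object)

# Default behaviour (handle_unknown="error"): transform raises a ValueError
# when it meets a category that was not seen during fit.
strict = OneHotEncoder()
strict.fit(train)
try:
    strict.transform(test)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# With handle_unknown="ignore", an unseen category is encoded as a row of zeros.
tolerant = OneHotEncoder(handle_unknown="ignore")
tolerant.fit(train)
encoded = tolerant.transform(test).toarray()

print(error_message)
print(tolerant.categories_)
print(encoded)
```

Ignoring unknown values keeps the pipeline running, but an all-zero row carries no information about the category, so whether that is acceptable depends on the model and the data.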

Using scikit’s OneHotEncoder only on categorical variables of a data frame

I’ve been trying to build a model using machine learning today and I bumped into an error when I wanted to dummify my categorical predictors. It seemed I didn’t really know how Scikit’s OneHotEncoder worked. But I do now. And I want to share it with you. I had a…
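As a rough illustration of encoding only the categorical columns of a data frame, the sketch below uses scikit-learn's `ColumnTransformer`. This is one possible approach and may differ from the one described in the full post. The data frame, its column names, and the variable names are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data frame with two categorical columns and one numeric column.
df = pd.DataFrame({
    "color": ["green", "red", "green"],
    "size": ["S", "M", "S"],
    "price": [1.0, 2.5, 3.0],
})

# Columns to one-hot encode; listed explicitly here (an assumption for this example).
categorical_columns = ["color", "size"]

transformer = ColumnTransformer(
    transformers=[
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_columns),
    ],
    remainder="passthrough",  # leave the numeric column untouched
    sparse_threshold=0,       # always return a dense array
)

encoded = transformer.fit_transform(df)
print(encoded)
```

The encoded columns come first, followed by the passed-through numeric column, so the column order of the result differs from that of the original data frame.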