machine learning

Undersampling a Pandas DataFrame

  • 2 min read

In a previous post, I explained how you can sample two Pandas DataFrames in exactly the same way. In this blog post, I want to use that helper function to undersample your predictors and target variable. When you are working with an imbalanced data set, it’s often good practice to under-… 

Dealing with right-censored data in machine learning: Random Survival Forests

  • 4 min read

A couple of weeks ago, I started working with survival analysis. It was fairly new to me, so I had to dig into some new methods. There was one method that captured my attention: random survival forests (RSFs). It’s one of many statistical learning techniques designed to work with right-censored… 

randomForest gives NA/NaN/Inf in foreign function call and how to solve it

  • 3 min read

Personally, Random Forest is one of my favorite algorithms for supervised learning. It’s quick and dirty and still allows for some interpretation. However, R and the randomForest package are somewhat cryptic about which requirements must be met to train the algorithm properly. I ran into this error a lot…