machine learning

Undersampling a Pandas DataFrame

In a previous post, I explained how you can sample two Pandas DataFrames in exactly the same way. In this blog post, I want to use that helper function to undersample your predictors and target variable. When you are working with an imbalanced data set, it’s often good practice to under-… 

Dealing with right-censored data in machine learning: Random Survival Forests

A couple of weeks ago, I started working with survival analysis. It was fairly new to me, so I had to dig into some new methods. There was one method that captured my attention: random survival forests (RSFs). It’s one of many statistical learning techniques designed to work with right-censored… 

randomForest gives NA/NaN/Inf in foreign function call and how to solve it

Personally, Random Forest is one of my favorite algorithms for supervised learning. It’s quick and dirty and still allows for some interpretation. However, R and the randomForest package are somewhat cryptic about which requirements for training the algorithm have not been met. I ran into this error a lot…