
pandas

Pandas’ pivot_table vs. pivot

  • 4 min read

When you’re an R power user, pivoting tables in pandas feels unnecessarily complex. Why are there two pivot functions? Why does pivoting return an index when you wanted a column? Why does it generate multi-index columns? Those are the questions I tackle in this blog post. To answer some questions… 
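
As a quick taste of the difference, here is a minimal sketch. The dummy data and the column names (city, year, sales) are made up for illustration and are not taken from the post.

```python
import pandas as pd

# Hypothetical dummy data; note the duplicated (B, 2021) combination.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "B"],
    "year": [2020, 2021, 2020, 2021, 2021],
    "sales": [10, 20, 30, 40, 60],
})

# pivot_table aggregates duplicate index/column pairs (here with the mean).
wide = df.pivot_table(index="city", columns="year", values="sales", aggfunc="mean")

# pivot only reshapes, so the duplicated (B, 2021) pair raises a ValueError.
try:
    df.pivot(index="city", columns="year", values="sales")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# reset_index() turns the "city" index back into an ordinary column.
flat = wide.reset_index()
```

Passing a single column name as values keeps the resulting columns single-level in this sketch.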

Working with NaN’s (nulls/NA’s) in pandas: per column, per row and per group

  • 2 min read

Getting a firm understanding of the NaNs in your dataset ensures you don’t draw wrong conclusions from incomplete data. In this blog post I show how you can count the NaNs per column, per row, and per group. First, let’s create some dummy data, and add some NaNs.… 
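
The three counts can be sketched roughly as follows. The dummy data and the column names (group, x, y) are assumptions for illustration, and this is not necessarily the approach the post itself takes.

```python
import numpy as np
import pandas as pd

# Hypothetical dummy data with a few NaNs sprinkled in.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x": [1.0, np.nan, 3.0, np.nan],
    "y": [np.nan, np.nan, 1.0, 2.0],
})

# Per column: isna() gives a boolean frame, sum() counts the True values.
nan_per_column = df.isna().sum()

# Per row: the same idea, summed across the columns.
nan_per_row = df.isna().sum(axis=1)

# Per group: group the boolean frame by the group labels, then sum.
nan_per_group = df[["x", "y"]].isna().groupby(df["group"]).sum()
```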

Replacing multiple values in a pandas DataFrame column

  • 2 min read

Without going into detail, here’s something I truly hate in R: replacing multiple values. In Python’s pandas, it’s really easy. In this blog post I try several methods: a list comprehension, apply(), replace() and map(). First, let’s create some dummy data. Then, let’s try a list comprehension. The get() function tries… 
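
For reference, the four methods named above could look something like this. The column name and the mapping dictionary are hypothetical, so treat it as a sketch rather than the post’s own code.

```python
import pandas as pd

# Hypothetical dummy data and mapping; "x" has no replacement on purpose.
df = pd.DataFrame({"color": ["r", "g", "b", "x"]})
mapping = {"r": "red", "g": "green", "b": "blue"}

# List comprehension: dict.get() falls back to the original value.
df["via_listcomp"] = [mapping.get(v, v) for v in df["color"]]

# apply(): the same lookup, applied element by element.
df["via_apply"] = df["color"].apply(lambda v: mapping.get(v, v))

# replace(): values that are not in the mapping are left untouched.
df["via_replace"] = df["color"].replace(mapping)

# map(): values that are not in the mapping become NaN.
df["via_map"] = df["color"].map(mapping)
```

The main difference to keep in mind is how unmapped values are treated: replace() keeps them, while map() with a dictionary turns them into NaN.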

Pandas: Solve ‘You are trying to merge on object and int64 columns’

Pandas is the go-to package for anything data science in Python. However, if you’re used to R and the convenience of dplyr or data.table, pandas can be confusing now and then. For example, the following error is a real newbie issue. ValueError: You are trying to merge on object and…
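
A minimal way to reproduce and fix this kind of error, assuming two made-up frames whose key column id is stored as strings on one side and as integers on the other:

```python
import pandas as pd

# Hypothetical frames: "id" holds strings on the left, integers on the right.
left = pd.DataFrame({"id": ["1", "2", "3"], "a": [10, 20, 30]})
right = pd.DataFrame({"id": [1, 2, 3], "b": ["x", "y", "z"]})

# Merging on keys with mismatched dtypes raises a ValueError.
try:
    left.merge(right, on="id")
    raised = False
except ValueError:
    raised = True

# One possible fix: cast the key to the same dtype on both sides first.
left["id"] = left["id"].astype("int64")
merged = left.merge(right, on="id")
```

Casting the integer side to strings instead would work just as well; which side to convert depends on what the key really represents.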