**Getting a firm understanding of NaNs in your dataset ensures you don’t draw wrong conclusions from an incomplete dataset. In this blog post I show how you can list the amount of NaNs per column, per row, and per group.**

First, let’s create some dummy data, and add some NaNs.

```
import pandas as pd
from timeit import timeit
from random import randint, seed
import numpy as np
seed(1988)
taste = ['sweet','sour','sweet','bitter'] * 1000
color = ['red','green','yellow','red'] * 1000
fruit = ['apple','pear','banana','cherry'] * 1000
data = {'taste': taste, 'color': color, 'fruit': fruit}
df = pd.DataFrame(data)
xy_rand = [(randint(0,3999),randint(0,2)) for i in range(1,100)]
for xy in xy_rand:
df.iloc[xy] = np.nan
```

Getting **the number of NaN values per column** is fairly straightforward. You can list all the NaN values (true/false) and then sum them. The difference between *isnull()* and *isna()*? *isnull()* and *isna()* are the same functions (an alias), so you can choose either one.

```
df.isnull().sum()
df.isna().sum()
```

Finding **which index (or row number) contains missing values** can be done analogously to the previous example, simply by adding *axis=1*. However, it will return all the rows. If you have a large dataset, you can filter it using *.index*.

```
df.isnull().sum(axis=1) # all indices
df.index[df.isnull().sum(axis=1) > 0] # only indices with NaN
```

Finally, you can check **the amount of NaN’s per group** (or class) as follows. In this example I want to find the number of missing values per color. It can be done for every column and also for one column specifically.

```
df.isnull().groupby(df.color, sort=False).sum() # all columns
df.taste.isnull().groupby(df.color, sort = False).sum() # 'taste' column only
```

Great success!