Subsetting a Pandas DataFrame using multiple conditions, Part 1: Bitwise operators

This is the first post in a two-part series on subsetting the rows of a Pandas DataFrame using chained conditions. In this post, we tackle the following ValueError.

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


Filtering (or subsetting) a DataFrame is easily done using the loc property, which accesses a group of rows and columns by label(s) or a boolean array. To filter rows, one can also drop loc completely and put the boolean condition directly between square brackets, which selects the same rows.

💥 Watch out: if you pass a list of strings instead of booleans, the square brackets select columns, not rows.

# explicit row and column filter using loc
df.loc[row_condition, column_condition]

# implicit row filter by passing a list of booleans
condition = [True, False, True]
df[condition] # will filter rows

# implicit column filter by passing a list of column labels
condition = ['column_a', 'column_b']
df[condition] # will filter columns
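The snippets above can be tried on a small DataFrame. This is a minimal sketch: the data is made up for illustration, and only the column names follow the examples in this post.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the post.
df = pd.DataFrame({
    "column_a": ["x", "y", "x"],
    "column_b": [1, 2, 3],
})

# A list of booleans (one per row) selects rows.
rows = df[[True, False, True]]

# A list of strings selects columns.
cols = df[["column_b"]]

# loc takes a row selector and a column selector at once.
both = df.loc[[True, False, True], ["column_b"]]

print(rows)
print(cols)
print(both)
```

Note that the boolean list must have exactly one entry per row of the DataFrame.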

Of course, one can filter on multiple conditions by chaining them. The obvious choice is Python’s and/or operators, but there’s a catch.

In the following example, I’m trying to filter a DataFrame on both column_a and column_b.

df.loc[(df.column_a == 'some_value') and (df.column_b == 'another_value')]

As you can see, I’m using the boolean and operator. It is useful when chaining two (or more) conditions that each return a single True or False.

However, comparing a Pandas Series to a value produces a whole Series of True and False values, one per row. The boolean operator needs a single truth value for its operand, which a Series cannot provide, so the following error is raised.

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
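The error is easy to reproduce. The sketch below uses a made-up DataFrame (the values are assumptions, chosen to match the column names used above) and catches the exception so the script keeps running.

```python
import pandas as pd

# Hypothetical data matching the column names used in this post.
df = pd.DataFrame({
    "column_a": ["some_value", "other", "some_value", "other"],
    "column_b": ["another_value", "another_value", "other", "other"],
})

# Each comparison returns a boolean Series, not a single True/False.
mask_a = df.column_a == "some_value"
mask_b = df.column_b == "another_value"

try:
    # `and` asks Python for the truth value of mask_a as a whole,
    # which pandas refuses to guess.
    mask_a and mask_b
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message can differ between pandas versions, but the cause is the same.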

To prevent this from happening, use the bitwise AND (&), which combines the two conditions element by element. The bitwise OR (|) works the same way.

df.loc[(df.column_a == 'some_value') & (df.column_b == 'another_value')]
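Here is the same filter on a small, made-up DataFrame, as a sketch of how the bitwise operators behave. The data and the variable names both_match and either_match are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data matching the column names used in this post.
df = pd.DataFrame({
    "column_a": ["some_value", "other", "some_value", "other"],
    "column_b": ["another_value", "another_value", "other", "other"],
})

# Bitwise AND: a row is kept only if both conditions hold for that row.
both_match = df.loc[(df.column_a == "some_value") & (df.column_b == "another_value")]

# Bitwise OR: a row is kept if at least one condition holds.
either_match = df.loc[(df.column_a == "some_value") | (df.column_b == "another_value")]

print(both_match)
print(either_match)
```

Only the first row satisfies both conditions, while the first three rows satisfy at least one.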

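Furthermore, don’t forget to wrap each condition in parentheses: & binds more tightly than comparison operators such as ==, so without parentheses the expression is grouped differently than intended. This is discussed in part 2 of this series.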
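As a brief illustration of why the parentheses matter, the sketch below uses hypothetical numeric columns a and b (not from the examples above). Which exception the unparenthesized version raises can depend on the column types, so both likely candidates are caught.

```python
import pandas as pd

# Hypothetical numeric data, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1]})

# With parentheses, each comparison is evaluated first, then combined.
with_parens = df.loc[(df.a > 1) & (df.b < 3)]

try:
    # Without parentheses, Python reads this as df.a > (1 & df.b) < 3,
    # a chained comparison that again needs a single truth value.
    df.loc[df.a > 1 & df.b < 3]
    failed = False
except (TypeError, ValueError):
    failed = True

print(with_parens)
print(failed)
```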
