This blog post is the second post in a two-part series on subsetting Pandas DataFrame rows using chained conditions. In this post, we tackle the following TypeError.
TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]
- Part 1: Bitwise operators
- Part 2: Parentheses
Filtering (or subsetting) a DataFrame can easily be done using the loc property, which can access a group of rows and columns by label(s) or a boolean array. To filter rows, you can also drop loc entirely and put the boolean conditions directly between square brackets.
💥 Watch out: if you pass a list of strings between the square brackets instead of booleans, it will select columns, not rows.
# explicit row and column filter using loc
df.loc[row_condition, column_condition]

# implicit row filter by passing a list of booleans
condition = [True, False, True]
df[condition]  # will filter rows

# implicit column filter by passing a list of strings (column labels)
condition = ['column_a', 'column_b']
df[condition]  # will filter columns
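To make the difference between row and column filtering concrete, here is a minimal runnable sketch. The toy DataFrame and its values are made up for illustration; only the column names follow the ones used in this post.

```python
import pandas as pd

# Hypothetical toy data, invented for this example.
df = pd.DataFrame({
    "column_a": ["x", "y", "z"],
    "column_b": [1, 2, 3],
    "column_c": [10.0, 20.0, 30.0],
})

# Explicit row and column filter using loc.
rows_and_cols = df.loc[[True, False, True], ["column_a", "column_b"]]

# Implicit row filter: a list of booleans keeps the rows marked True.
rows_only = df[[True, False, True]]

# Implicit column filter: a list of strings selects columns by label.
cols_only = df[["column_a", "column_b"]]

print(rows_and_cols.shape)  # (2, 2): two rows, two columns
print(rows_only.shape)      # (2, 3): two rows, all columns
print(cols_only.shape)      # (3, 2): all rows, two columns
```

The same square-bracket syntax filters rows in one case and columns in the other, purely depending on what the list contains.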
By chaining conditions, you can filter on multiple conditions, all at once. Nevertheless, make sure to use proper parentheses.
First, let’s try not using parentheses around our conditions, as can be seen in the chunk below.
df.loc[df.column_a == 'some_value' & df.column_b == 'another_value']
The error you’ll run into is the following:
TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]
This happens because & has higher precedence than ==, so Python first tries to evaluate 'some_value' & df.column_b instead of the two comparisons. Here’s what the Pandas documentation has to say about it.
Another common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses, since by default Python will evaluate an expression such as df['A'] > 2 & df['B'] < 3 as df['A'] > (2 & df['B']) < 3, while the desired evaluation order is (df['A'] > 2) & (df['B'] < 3).
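If you want to see this grouping for yourself, Python's built-in ast module can show how an expression is parsed without running it. The sketch below is only an illustration: a and b are placeholder names standing in for df.column_a and df.column_b, and top_level_node is a helper invented for this example.

```python
import ast


def top_level_node(expression: str) -> ast.expr:
    """Parse an expression and return its outermost AST node."""
    return ast.parse(expression, mode="eval").body


# a and b are placeholders for df.column_a and df.column_b.
without = top_level_node("a == 'some_value' & b == 'another_value'")
with_parens = top_level_node("(a == 'some_value') & (b == 'another_value')")

# Without parentheses, the outermost node is a chained comparison whose
# middle operand is the bitwise expression 'some_value' & b.
print(type(without).__name__)                 # Compare
print(type(without.comparators[0]).__name__)  # BinOp

# With parentheses, the outermost node is the & itself,
# applied to two separate comparisons.
print(type(with_parens).__name__)             # BinOp
print(type(with_parens.left).__name__)        # Compare
```

In the first case & binds the string to the column before any comparison happens, which is exactly the grouping the documentation warns about.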
The correct way to combine multiple conditions, whether with & (and) or | (or), is to wrap each condition in its own parentheses, as follows.
df.loc[(df.column_a == 'some_value') & (df.column_b == 'another_value')]
This will make sure that == is processed before & and that no errors are thrown.
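Here is a small end-to-end sketch of the parenthesized version, covering both & and |. The data is invented for illustration; the column names and values simply mirror the snippet above.

```python
import pandas as pd

# Hypothetical toy data, invented for this example.
df = pd.DataFrame({
    "column_a": ["some_value", "some_value", "other", "other"],
    "column_b": ["another_value", "other", "another_value", "other"],
})

# and: both conditions must hold for a row to be kept
both = df.loc[(df.column_a == "some_value") & (df.column_b == "another_value")]

# or: at least one condition must hold for a row to be kept
either = df.loc[(df.column_a == "some_value") | (df.column_b == "another_value")]

print(len(both))    # 1
print(len(either))  # 3
```

Each comparison produces its own boolean Series first, and only then are the two Series combined element by element.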
As you can see, I’m using the bitwise operator, and not the boolean operator. More details can be found in part 1 of this post.