A pandas Series is a one-dimensional labelled data structure that can store text, numbers, and other Python objects. It is one of the fundamental data structures in pandas for holding one-dimensional data and is built on top of a NumPy array. In many situations you may need to use it together with the boolean operators and, or, or not. When you do, you may encounter the following error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
This article explains the reason behind the "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()" error, as well as possible ways to fix it.
What does "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()" really mean?
Whenever Python needs a yes/no answer from an object, for example in the condition of an if or while statement or as an operand of and, or, or not, it converts the object to a boolean. Values that convert to True are called "truthy", and values that convert to False are called "falsy".
In Python, all values are considered "truthy" except for the following, which are "falsy":
- None
- False
- 0
- 0.0
- 0j
- decimal.Decimal(0)
- fractions.Fraction(0, 1)
- []: an empty list
- {}: an empty dict
- (): an empty tuple
- '': an empty str
- b'': an empty bytes
- set(): an empty set
- an empty range, like range(0)
- objects for which obj.__bool__() returns False, or for which obj.__len__() returns 0 when __bool__ is not defined
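The list above can be checked directly with bool(). The following is a minimal sketch; the two classes, EmptyBasket and AlwaysOff, are hypothetical names made up for illustration and are not part of Python or pandas.

```python
from decimal import Decimal
from fractions import Fraction

# Every built-in value from the list above converts to False.
falsy_values = [None, False, 0, 0.0, 0j, Decimal(0), Fraction(0, 1),
                [], {}, (), "", b"", set(), range(0)]
all_falsy = all(bool(value) is False for value in falsy_values)
print(all_falsy)


class EmptyBasket:
    """Hypothetical container: falsy because __len__ returns 0."""

    def __len__(self):
        return 0


class AlwaysOff:
    """Hypothetical object: falsy because __bool__ returns False."""

    def __bool__(self):
        return False


print(bool(EmptyBasket()), bool(AlwaysOff()))
```

A Series defines its own truth-value hook as well, but instead of returning True or False it raises the ValueError discussed in this article.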
Deciding whether a Series is truthy or falsy is tricky. A Series can contain both True and False values, so Python cannot know which one should stand for the whole object; the Series has an ambiguous truth value. Any default truthy/falsy value for a Series would surprise some users, so pandas raises an error instead. Look at the example below to see what I mean:
In [1]: if pd.Series([False, True, False]):
   ...:     print("The Series is true")
   ...:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-fa1755415a60> in <module>
----> 1 if pd.Series([False, True, False]):
2 print("The Series is true")
3
~/.local/share/virtualenvs/testing-p6fO7ldL/lib/python3.8/site-packages/pandas/core/generic.py in __nonzero__(self)
1535 @final
1536 def __nonzero__(self):
-> 1537 raise ValueError(
1538 f"The truth value of a {type(self).__name__} is ambiguous. "
1539 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
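The same failure can be reproduced without an if statement, because if simply asks for the boolean value of its condition. This is a small sketch that calls bool() directly and catches the exception; the variable names are illustrative only.

```python
import pandas as pd

flags = pd.Series([False, True, False])

try:
    bool(flags)  # the same conversion that `if flags:` performs
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

Catching the exception like this is only useful for demonstration. In real code the fix is to state what you mean, using one of the approaches described in the rest of the article.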
Avoid "ValueError: The truth value of a Series is ambiguous"
Because a Series has no single truth value, you have to state explicitly what you mean. Depending on what you want to do, you can use one of these attributes/methods: Series.empty, Series.bool(), Series.item(), Series.any() or Series.all(). You can also use the bitwise operators | (or) and & (and) to combine two Series element by element.
Compare two Series of the same size using bitwise operations
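Let’s say s1 and s2 are two Series of the same size. s1 | s2 performs an element-wise bitwise OR and s1 & s2 performs an element-wise bitwise AND. On boolean Series these act as logical OR and AND; on integer Series they operate on the individual bits of each pair of values, as in the example below.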
In [13]: s1 = pd.Series([1, 2, 3, 4])
In [14]: s2 = pd.Series([5, 6, 7, 8])
In [15]: s1|s2
Out[15]:
0 5
1 6
2 7
3 12
dtype: int64
In [16]: s1&s2
Out[16]:
0 1
1 2
2 3
3 0
dtype: int64
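On boolean Series the same operators behave like element-wise logical operators, which is the usual replacement for and, or, and not. A minimal sketch with two made-up boolean Series, a and b:

```python
import pandas as pd

a = pd.Series([True, True, False, False])
b = pd.Series([True, False, True, False])

both = a & b     # element-wise AND, replaces `a and b`
either = a | b   # element-wise OR, replaces `a or b`
not_a = ~a       # element-wise NOT, replaces `not a`

print(both.tolist())
print(either.tolist())
print(not_a.tolist())
```

Each result is again a Series, so it can be used as a filter mask or reduced to one value with .any() or .all().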
Alternatively, you can import numpy and use its logical_or and logical_and functions, which treat every non-zero value as True and return a boolean Series.
In [35]: import numpy as np
In [36]: np.logical_and(s1, s2)
Out[36]:
0 True
1 True
2 True
3 True
dtype: bool
In [37]: np.logical_or(s1, s2)
Out[37]:
0 True
1 True
2 True
3 True
dtype: bool
Filter DataFrame column values based on a condition
Let’s say you have a DataFrame like the one below, and you want to get the rows that are priced below 50000.
import pandas as pd
df = pd.DataFrame.from_dict({
    'manufacturer': ['HP', 'DELL', 'LENOVO', 'MSI'],
    'processor_i': ['3', '5', '7', '5'],
    'price': [30000, 45000, 80000, 55000],
})
New users often extract the price column and put the comparison into an if/else block, but the comparison returns a Series of booleans rather than a single value, so it produces the "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()" error.
In [1]: if df['price'] < 40000:
   ...:     print(df)
   ...:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-21-4411513fbfb9> in <module>
----> 1 if df['price'] < 40000:
2 print(df)
3
~/.local/share/virtualenvs/testing-p6fO7ldL/lib/python3.8/site-packages/pandas/core/generic.py in __nonzero__(self)
1535 @final
1536 def __nonzero__(self):
-> 1537 raise ValueError(
1538 f"The truth value of a {type(self).__name__} is ambiguous. "
1539 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What you should do instead is apply the condition directly to the DataFrame by putting it inside square brackets, where it acts as a boolean mask that keeps only the rows for which the condition is True.
In [23]: df[df['price'] < 50000]
Out[23]:
  manufacturer processor_i  price
0           HP           3  30000
1         DELL           5  45000
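Filters often need more than one condition, and this is where and/or tend to creep back in. The sketch below combines two conditions with & on the same example data; the 60000 threshold and the variable names mask and cheap_i5 are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "manufacturer": ["HP", "DELL", "LENOVO", "MSI"],
    "processor_i": ["3", "5", "7", "5"],
    "price": [30000, 45000, 80000, 55000],
})

# Parentheses are required: & binds more tightly than < and ==.
mask = (df["price"] < 60000) & (df["processor_i"] == "5")
cheap_i5 = df.loc[mask]
print(cheap_i5)
```

Writing the same filter with the keyword and would raise the ambiguous truth value error, because and asks each side for a single boolean.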
Use Series.any() to evaluate Series values
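Series.any() is similar to Python’s built-in any(). It returns True if at least one element in the Series is True or "truthy", and returns False otherwise. Note that the if block in the example runs once for the whole DataFrame, so it prints every row, not only the matching ones.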
In [26]: df = pd.DataFrame.from_dict({
    ...:     'manufacturer': ['HP', 'DELL', 'LENOVO', 'MSI'],
    ...:     'processor_i': ['3', '5', '7', '5'],
    ...:     'price': [30000, 45000, 80000, 55000],
    ...: })
In [27]: if (df['price'] < 50000).any():
    ...:     print(df)
    ...:
  manufacturer processor_i  price
0           HP           3  30000
1         DELL           5  45000
2       LENOVO           7  80000
3          MSI           5  55000
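To make the reduction itself visible, here is a small sketch that stores the result of .any() instead of using it in an if statement. The prices mirror the example DataFrame; the names has_cheap and has_luxury are illustrative.

```python
import pandas as pd

prices = pd.Series([30000, 45000, 80000, 55000])

has_cheap = (prices < 50000).any()     # True: two prices are below 50000
has_luxury = (prices > 100000).any()   # False: no price is above 100000

print(has_cheap, has_luxury)
```

Each result is a single boolean, so it is safe to use in an if statement.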
Use Series.all() to obtain truthy value from Series
Series.all() works the same way as Python’s built-in all(). It returns True if every element in the Series is True or "truthy", and returns False otherwise. Below is how we can apply it to our example.
In [29]: if (df['price'] < 100000).all():
    ...:     print(df)
    ...:
  manufacturer processor_i  price
0           HP           3  30000
1         DELL           5  45000
2       LENOVO           7  80000
3          MSI           5  55000
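One edge case is worth knowing when choosing between the two reductions: on an empty Series, .all() is True and .any() is False, just like Python's built-in all() and any() on an empty list. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

prices = pd.Series([30000, 45000, 80000, 55000])

all_under_100k = (prices < 100000).all()   # True: every price qualifies
all_under_50k = (prices < 50000).all()     # False: 80000 and 55000 do not

# An empty boolean Series: nothing contradicts all(), nothing satisfies any().
empty = pd.Series([], dtype="bool")
empty_all = empty.all()
empty_any = empty.any()

print(all_under_100k, all_under_50k, empty_all, empty_any)
```

If an empty result should not count as a pass, it may help to check .empty first.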
Use Series.empty, Series.bool() and Series.item()
If you work with loops and conditions, Series.empty, Series.bool() and Series.item() can be helpful.
For example, if you just want to check whether a Series contains no elements at all, you can use .empty, which is True for an empty Series and False otherwise.
>>> series_x = pd.Series([])
>>> print(series_x.empty)
True
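A common place for .empty is right after a filter, to find out whether anything matched. The sketch below reuses the example data; the 10000 threshold and the names matches and result are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "manufacturer": ["HP", "DELL", "LENOVO", "MSI"],
    "processor_i": ["3", "5", "7", "5"],
    "price": [30000, 45000, 80000, 55000],
})

matches = df[df["price"] < 10000]   # no laptop is this cheap

# .empty is a plain boolean, so it is safe inside an if statement.
if matches.empty:
    result = "no laptops under 10000"
else:
    result = f"{len(matches)} laptop(s) found"

print(result)
```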
.bool() is used when the Series contains exactly one element and that element is a boolean, whereas .item() returns the only element of a single-element Series, whatever its type. Note that Series.bool() has been deprecated since pandas 2.1.
>>> series_x = pd.Series([0])
>>> series_y = pd.Series([1])
>>> series_z = pd.Series([True])
>>> series_u = pd.Series([False])
>>> series_v = pd.Series([50])
>>> series_w = pd.Series([50, 60])
>>> print(series_x.bool())  # raises ValueError: .bool() accepts only a single boolean element
>>> print(series_z.bool())
True
>>> print(series_u.bool())
False
>>> print(series_x.item())
0
>>> print(series_w.item())  # raises ValueError: .item() accepts only a single-element Series
>>> print(series_z.item())
True
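In practice, .item() is handy when a filter is expected to return exactly one row and you need a plain Python value for a condition. This is a minimal sketch on the example data; the variable names are illustrative, and the try/except block only demonstrates the failure on a longer Series.

```python
import pandas as pd

df = pd.DataFrame({
    "manufacturer": ["HP", "DELL", "LENOVO", "MSI"],
    "processor_i": ["3", "5", "7", "5"],
    "price": [30000, 45000, 80000, 55000],
})

# The filter yields a one-element Series; .item() unwraps it to a scalar.
msi_price = df.loc[df["manufacturer"] == "MSI", "price"]
price = msi_price.item()
is_expensive = price > 50000   # an ordinary bool, safe to use in an if statement

# .item() refuses a Series that holds more than one element.
try:
    df["price"].item()
    multi_ok = True
except ValueError:
    multi_ok = False

print(price, is_expensive, multi_ok)
```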
Conclusion
We hope that the explanation above helps you understand why the "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()" error occurs, as well as how to avoid it in the future. If you notice any problem with the article, don’t hesitate to let us know in the comments below.