Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "a": [-1., 2., 3., 4., 5.],
        "b": [1., 2., 3., 4., 5.],
    },
    dtype=pd.Float64Dtype(),
)
# sqrt(-1.) produces NaN in column "a", row 0
df = np.sqrt(df)

# Prints False
print(df.isna().any().any())
# Prints True
print(pd.isna(df.loc[0, "a"]))
Issue Description
Applying a NumPy operation that yields NaN for some values of a DataFrame with dtype Float64Dtype produces NaN values that the pandas null-checking functions (isna, isnull, notna) do not detect when called on the DataFrame. However, the NaN is detected if we index the individual value and check it directly.
Expected Behavior
Both print statements above should print True.
Installed Versions
Comment From: gake572
I confirmed on pandas 2.3.0 + NumPy 2.1.2 that after applying a NumPy ufunc the nullable Float64 dtype mask isn’t catching NaNs:
df_plain = df.astype(float)
print(df_plain.isna().any().any()) # True
This suggests Float64Dtype.isna() isn’t recognizing the NaN created by np.sqrt.
Can someone confirm this is a bug in the nullable array mask logic? Thanks.
Comment From: jbrockmendel
> Can someone confirm this is a bug in the nullable array mask logic? Thanks.
It is not a bug but a design choice that frequently causes confusion (#60106, #59891, #56451, #53887). The original discussion of how to handle this is in #32265, and more recently in #61618.
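For anyone landing here who wants NaN and NA both treated as missing, here is a minimal sketch of one possible workaround. The helper name `isna_or_nan` is hypothetical (it is not a pandas API), and the sketch assumes every column is numeric. It converts the frame to a plain float64 NumPy array, mapping pd.NA to NaN via `na_value`, and then checks with `np.isnan`, so it should not depend on whether the NaN lives in the data or in the mask.

```python
import numpy as np
import pandas as pd


def isna_or_nan(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: flag entries that are either pd.NA or NaN.

    Assumes all columns are numeric and can be cast to float64.
    """
    # pd.NA entries become NaN; NaN entries already in the data stay NaN.
    values = frame.to_numpy(dtype="float64", na_value=np.nan)
    return pd.DataFrame(np.isnan(values), index=frame.index, columns=frame.columns)


df = pd.DataFrame(
    {
        "a": [-1.0, 2.0, 3.0, 4.0, 5.0],
        "b": [1.0, 2.0, 3.0, 4.0, 5.0],
    },
    dtype=pd.Float64Dtype(),
)

# Silence the "invalid value encountered in sqrt" warning for sqrt(-1.0).
with np.errstate(invalid="ignore"):
    df = np.sqrt(df)

mask = isna_or_nan(df)
print(mask.any().any())  # True
```

Whether this is the right behavior depends on whether you consider a computed NaN to be "missing" in your data; the linked discussions cover that trade-off.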