Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np
data = {"x": [1, 0], "y": [1, 0]}
df = pd.DataFrame(data, dtype="Float64")
df["z"] = df["y"] / df["x"]
df['z'].isna()
Issue Description
The pandas isna() method does not flag NaN values (np.nan) in a column that uses the nullable Float64 dtype. The call df['z'].isna() returns a series with the following rows.
0 False
1 False
Name: z, dtype: bool
With the code above, both rows come back as False. By contrast, df['z'].apply(np.isnan) correctly returns False for the first row and True for the second row.
0 False
1 True
Name: z, dtype: boolean
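A minimal sketch of a check that does not depend on how isna() treats NaN inside a Float64 column: it converts the column to a plain NumPy float array first and tests that array. The helper name nan_or_na is hypothetical and is not part of pandas.

```python
import numpy as np
import pandas as pd


def nan_or_na(series: pd.Series) -> pd.Series:
    """Flag entries that are either pd.NA or a floating-point NaN.

    Hypothetical helper for illustration only; not a pandas API.
    """
    # to_numpy maps pd.NA to np.nan, so both kinds of missing value
    # end up as NaN in an ordinary float64 NumPy array.
    values = series.to_numpy(dtype="float64", na_value=np.nan)
    return pd.Series(np.isnan(values), index=series.index, name=series.name)


df = pd.DataFrame({"x": [1, 0], "y": [1, 0]}, dtype="Float64")
df["z"] = df["y"] / df["x"]  # second row is 0 / 0
print(nan_or_na(df["z"]))
```

Unlike apply(np.isnan), this sketch should also cope with columns that contain pd.NA, since those entries are turned into NaN before the test runs.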
Expected Behavior
I would expect the pandas isna() method to also treat np.nan as a missing value when using the Float64 dtype. The returned value of df['z'].isna() should be a series with the following rows.
0 False
1 True
Name: z, dtype: bool
Installed Versions
Comment From: mortnstak
Also, this is handled correctly if the dtype is the non-nullable "float64".
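A minimal sketch of that comparison, reusing the data from the report but with NumPy's non-nullable float64 dtype. The variable names df64 and mask are chosen here for illustration.

```python
import pandas as pd

# Same data as in the report, but with the non-nullable float64 dtype.
df64 = pd.DataFrame({"x": [1, 0], "y": [1, 0]}, dtype="float64")
df64["z"] = df64["y"] / df64["x"]  # second row is 0.0 / 0.0, which is NaN

# With float64, NaN is the missing-value marker, so isna() flags it.
mask = df64["z"].isna()
print(mask)
```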
Comment From: rhshadrach
cc @jbrockmendel @jorisvandenbossche @mroeschke - there is one test that hits this, where we check that NaN in the float values gives False with pd.isna:
https://github.com/pandas-dev/pandas/blob/68d9dcab5b543adb3bfe5b83563c61a9b8afae77/pandas/tests/arrays/floating/test_contains.py#L6
But it seems unexpected to me that float64 and Float64 act differently here. I can't find a discussion of the current behavior; has this been considered before?
Comment From: asishm
It would be good to get a decision here: https://github.com/pandas-dev/pandas/issues/53887
Comment From: jbrockmendel
This has been considered before, many times. xref #32265. PDEP-16 is the solution.
Comment From: rhshadrach
Thanks, this is certainly a duplicate of #53887, if not #32265. Closing.