Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd, numpy as np
arr1 = pd.array([],pd.StringDtype("pyarrow", na_value=pd.NA))
arr2 = pd.array([], pd.StringDtype("python", na_value=np.nan))
arr1 == arr2 # NotImplementedError: eq not implemented for <class 'pandas.core.arrays.string_.StringArrayNumpySemantics'>
Issue Description
This appears to be the kind of issue discussed in https://github.com/pandas-dev/pandas/issues/60639. That issue was closed, but I still got the error above when I ran the reproducer, which is based on the example given in the whatsnew for release 2.3.
My understanding is that the issue was closed when https://github.com/pandas-dev/pandas/pull/61138 was merged to main, but it's unclear whether the fix was successfully backported to the 2.3.x branch. I haven't yet had time to build pandas from main and try it there.
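For anyone who wants to check whether a particular install has the fix, here is a minimal sketch. It assumes pyarrow is installed and that the pandas version accepts the `na_value` keyword on `pd.StringDtype`. The helper name `check_mixed_string_eq` is made up for illustration and is not part of pandas. It reports what happens rather than asserting a particular result.

```python
def check_mixed_string_eq():
    """Return (pandas_version, outcome) for the mixed string-dtype comparison.

    Hypothetical diagnostic helper, not part of pandas.
    """
    try:
        import numpy as np
        import pandas as pd
    except ImportError as exc:
        return None, f"missing dependency: {exc}"

    try:
        arr1 = pd.array([], dtype=pd.StringDtype("pyarrow", na_value=pd.NA))
        arr2 = pd.array([], dtype=pd.StringDtype("python", na_value=np.nan))
    except Exception as exc:
        # For example: pyarrow is missing, or this pandas version does not
        # accept the na_value keyword.
        return pd.__version__, f"could not build arrays: {exc!r}"

    try:
        result = arr1 == arr2
    except NotImplementedError as exc:
        # This is the failure reported in this issue.
        return pd.__version__, f"bug present: {exc}"
    except Exception as exc:
        # Report anything else instead of guessing what it means.
        return pd.__version__, f"comparison raised: {exc!r}"

    return pd.__version__, f"ok, result dtype: {result.dtype}"


version, outcome = check_mixed_string_eq()
print(version, "->", outcome)
```

Running this against both a released 2.3.x wheel and a build from main should show whether the backport landed.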
Expected Behavior
Comparisons of string arrays/series with different dtypes should not error, and the return dtype should follow the behavior laid out in #60639.
Installed Versions
Comment From: a-reich
I’m also fine with re-opening #60639 if that’s better.
Comment From: chilin0525
@a-reich Thanks for raising the issue — it has been resolved on the main branch:
>>> import pandas as pd, numpy as np
>>> arr1 = pd.array([],pd.StringDtype("pyarrow", na_value=pd.NA))
>>> arr2 = pd.array([], pd.StringDtype("python", na_value=np.nan))
>>> arr1 == arr2
<ArrowExtensionArray>
[]
Length: 0, dtype: bool[pyarrow]
Comment From: a-reich
Yeah, so it boils down to this: the PR was intended to be backported to 2.3 but was not.
Comment From: a-reich
Fixed by #61649 - thank you!