Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [x] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

# test_mask_stringdtype
obj = pd.DataFrame(
    {"A": ["foo", "bar", "baz", pd.NA]},
    index=["id1", "id2", "id3", "id4"],
    dtype=pd.StringDtype(),
)
filtered_obj = pd.DataFrame(
    {"A": ["this", "that"]}, index=["id2", "id3"], dtype=pd.StringDtype()
)
expected = pd.DataFrame(
    {"A": [pd.NA, "this", "that", pd.NA]},
    index=["id1", "id2", "id3", "id4"],
    dtype=pd.StringDtype(),
)

filter_ser = pd.Series([False, True, True, False])
obj.mask(filter_ser, filtered_obj)
#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

filter_ser = pd.Series([True, False, False, True])
obj.mask(filter_ser, filtered_obj)
#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

filter_ser = pd.Series([False, False, False, False])
obj.mask(filter_ser, filtered_obj)
#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

filter_ser = pd.Series([True, True, True, True])
obj.mask(filter_ser, filtered_obj)
#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>
Issue Description
Found during #60772.
I suppose the purpose of this test is to check that mask works as expected with pd.StringDtype() (see #40824), but the test returns the same result regardless of cond, because cond fails to align in _where: filter_ser has the default integer index (0 to 3), which shares no labels with the index of obj (id1 to id4).
If we want to check that mask replaces values with other only where cond is True and keeps the original values where cond is False, I think filter_ser should have the same index as obj so that mask can recognize the corresponding other value.
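The index mismatch described above can be seen in isolation with a small sketch. It uses Series.reindex as a stand-in for the alignment step and does not reproduce the internal _where code path; the names unlabeled and labeled are made up for illustration.

```python
import pandas as pd

obj = pd.DataFrame(
    {"A": ["foo", "bar", "baz", pd.NA]},
    index=["id1", "id2", "id3", "id4"],
    dtype=pd.StringDtype(),
)

# Condition built without an index: it gets the default integer labels 0..3.
unlabeled = pd.Series([False, True, True, False])
# Condition built with the same labels as obj.
labeled = pd.Series([False, True, True, False], index=obj.index)

# Reindexing onto obj's labels mimics (as an assumption) what alignment does:
# labels that are absent from the condition come back as missing values.
unlabeled_on_obj = unlabeled.reindex(obj.index)
labeled_on_obj = labeled.reindex(obj.index)

print(unlabeled_on_obj.isna().all())  # True: no label matched
print(labeled_on_obj.tolist())        # [False, True, True, False]
```

Because every entry of the unlabeled condition is missing after alignment, the original True/False values no longer influence the result, which is consistent with the four identical outputs in the reproducible example.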
Expected Behavior
filter_ser = pd.Series([False, True, True, False], index=["id1", "id2", "id3", "id4"])
obj.mask(filter_ser, filtered_obj)
#         A
# id1   foo
# id2  this
# id3  that
# id4  <NA>
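One possible shape for a corrected test is sketched below. It is only a suggestion, not the actual fix: it uses the public pd.testing.assert_frame_equal helper, and the inverted-condition check with its expected_inverted frame is an extra assumption added here to confirm that the result now depends on cond.

```python
import pandas as pd

index = ["id1", "id2", "id3", "id4"]
obj = pd.DataFrame(
    {"A": ["foo", "bar", "baz", pd.NA]}, index=index, dtype=pd.StringDtype()
)
filtered_obj = pd.DataFrame(
    {"A": ["this", "that"]}, index=["id2", "id3"], dtype=pd.StringDtype()
)

# The condition carries obj's labels, so it aligns row by row.
filter_ser = pd.Series([False, True, True, False], index=index)

result = obj.mask(filter_ser, filtered_obj)
expected = pd.DataFrame(
    {"A": ["foo", "this", "that", pd.NA]}, index=index, dtype=pd.StringDtype()
)
pd.testing.assert_frame_equal(result, expected)

# Inverting the condition should now change the outcome: id1 and id4 are
# replaced, but filtered_obj has no value for them, so they become <NA>.
inverted = obj.mask(~filter_ser, filtered_obj)
expected_inverted = pd.DataFrame(
    {"A": [pd.NA, "bar", "baz", pd.NA]}, index=index, dtype=pd.StringDtype()
)
pd.testing.assert_frame_equal(inverted, expected_inverted)
```

Checking both the condition and its inverse guards against the original problem, where any cond produced the same frame.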
Installed Versions
commit : c888af6d0bb674932007623c0867e1fbd4bdc2c6
python : 3.12.7
python-bits : 64
OS : Windows
OS-release : 11
Version : 10.0.26100
machine : AMD64
processor : AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Korean_Korea.949
pandas : 2.3.1
numpy : 2.3.1
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 24.2
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None
Comment From: arthurlw
Confirmed on main. PRs and investigations are welcome.
Thanks for raising this!