Pandas version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

pd.Series(["a", "a"]).astype("category").map(lambda x: x == "c")

Issue Description

The above snippet erroneously returns category dtype:

0    False
1    False
dtype: category
Categories (1, bool): [False]

Expected Behavior

As soon as there are at least two categories, one gets the expected bool dtype:

pd.Series(["a", "b"]).astype("category").map(lambda x: x == "c")

returns:

0    False
1    False
dtype: bool

I would expect the same result if there is only one category involved.

Installed Versions

INSTALLED VERSIONS
------------------
commit                : 2cc37625532045f4ac55b27176454bbbc9baf213
python                : 3.12.9
python-bits           : 64
OS                    : Windows
OS-release            : 11
Version               : 10.0.26100
machine               : AMD64
processor             : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder             : little
LC_ALL                : None
LANG                  : None
LOCALE                : English_Belgium.1252

pandas                : 2.3.0
numpy                 : 2.3.1
pytz                  : 2025.2
dateutil              : 2.9.0.post0
pip                   : None
Cython                : None
sphinx                : None
IPython               : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : None
blosc                 : None
bottleneck            : 1.5.0
dataframe-api-compat  : None
fastparquet           : None
fsspec                : 2025.5.1
html5lib              : None
hypothesis            : None
gcsfs                 : None
jinja2                : None
lxml.etree            : 5.4.0
matplotlib            : 3.10.3
numba                 : None
numexpr               : 2.11.0
odfpy                 : None
openpyxl              : 3.1.5
pandas_gbq            : None
psycopg2              : None
pymysql               : None
pyarrow               : 20.0.0
pyreadstat            : None
pytest                : 8.4.1
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : 1.15.3
sqlalchemy            : None
tables                : None
tabulate              : 0.9.0
xarray                : 2025.3.1
xlrd                  : None
xlsxwriter            : 3.2.5
zstandard             : None
tzdata                : 2025.2
qtpy                  : None
pyqt5                 : None

Comment From: kernelism

take

Comment From: kernelism

@kdebrab The issue is happening here:

https://github.com/pandas-dev/pandas/blob/35b0d1dcadf9d60722c055ee37442dc76a29e64c/pandas/core/arrays/categorical.py#L1583-L1585

In the first case, new_categories is Index([False], dtype='bool'), and since it is unique, map ends up returning a result with a CategoricalDtype. Note that the issue depends on whether the categories are still unique after the mapping is applied, not on how many categories there were originally. For example, in this code snippet:

pd.Series(["a", "a", "a", "b"]).astype("category").map(lambda x: x == "b")

even though there are at least 2 categories, the result is still:

0    False
1    False
2    False
3     True
dtype: category
Categories (2, bool): [False, True]

This is because the mapping does not produce duplicate categories here: "a" maps to False and "b" maps to True. I think this specific code block was added for efficiency, as a fast path for 1:1 mappings.
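To make the uniqueness condition concrete, here is a minimal sketch that applies the mapping function to the categories only and checks whether the results are distinct. It uses only public pandas API. `mapped_categories_unique` is a hypothetical helper name, not something in pandas, and it only approximates the check in the linked block (missing values are ignored).

```python
import pandas as pd


def mapped_categories_unique(values, func):
    """Hypothetical helper: do the categories of `values` stay distinct under `func`?"""
    ser = pd.Series(values).astype("category")
    # Apply func to the categories themselves, not to every element.
    new_categories = ser.cat.categories.map(func)
    return bool(new_categories.is_unique)


# One category: ["a"] -> [False], which is unique.
one_category = mapped_categories_unique(["a", "a"], lambda x: x == "c")

# Two categories collapsing to one value: ["a", "b"] -> [False, False].
two_collapsing = mapped_categories_unique(["a", "b"], lambda x: x == "c")

# Two categories staying distinct: ["a", "b"] -> [False, True].
two_distinct = mapped_categories_unique(["a", "a", "a", "b"], lambda x: x == "b")

print(one_category, two_collapsing, two_distinct)  # True False True
```

The two cases that print True are the ones reported in this thread as returning category dtype. The case that prints False is the one that returned bool.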

A simple workaround is to use a direct comparison instead of map:

pd.Series(["a", "a"]).astype("category") == "c"

or

pd.Series(["a", "a"]).astype("category").eq("c")

which correctly returns:

0    False
1    False
dtype: bool
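If the function has to stay a map call (for example, because it is not a plain comparison), another possible workaround is to cast a categorical result back to the dtype of its categories. This is only a sketch: `map_without_category` is a hypothetical helper, not a pandas function, and it assumes the data contains no missing values.

```python
import pandas as pd


def map_without_category(series, func):
    """Hypothetical helper: map a Series, then undo a categorical result dtype."""
    result = series.map(func)
    if isinstance(result.dtype, pd.CategoricalDtype):
        # Cast back to the dtype of the mapped categories (bool in this issue).
        # Assumption: no missing values, so every code points at a category.
        result = result.astype(result.cat.categories.dtype)
    return result


one = map_without_category(
    pd.Series(["a", "a"]).astype("category"), lambda x: x == "c"
)
two = map_without_category(
    pd.Series(["a", "b"]).astype("category"), lambda x: x == "c"
)

print(one.dtype, two.dtype)  # bool bool
```

The isinstance check makes the helper a no-op when map already returns a non-categorical result, so both cases end up with the same dtype.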

Comment From: jbrockmendel

Dtype inference at the end of map calls is a really tricky problem that has come up before. Maybe someone will find an elegant solution, but this is a "don't get your hopes up" situation.

Comment From: kernelism

Yes, I agree.