Pandas version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame(
    [
        (1, "1900-01-01", "a"),
        (2, "1900-01-01", "b"),
    ],
    columns=["id", "date", "val"],
).astype(
    {
        "id": "int64[pyarrow]",
        "date": "timestamp[ns][pyarrow]",
        "val": "string[pyarrow]",
    }
)

df = df.set_index(["id", "date"])

idx_val = df.index[0]

idx_val in df.index  # True

df.index.difference([idx_val])  # Bug: both entries are still present in the result, including idx_val

Issue Description

Note that the code works if we use datetime64[ns] instead of timestamp[ns][pyarrow] for the date column.

The code also works if we convert the index to a non-MultiIndex.
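For comparison, here is a minimal sketch of the working case. It assumes plain NumPy-backed dtypes for all three columns so that it runs without pyarrow installed; the report itself only swaps the date column to datetime64[ns].

```python
import pandas as pd

# Same data as the reproducer, but with NumPy-backed dtypes
# (assumption: all three columns are switched, not just "date").
df = pd.DataFrame(
    [
        (1, "1900-01-01", "a"),
        (2, "1900-01-01", "b"),
    ],
    columns=["id", "date", "val"],
).astype({"id": "int64", "date": "datetime64[ns]", "val": "object"})

df = df.set_index(["id", "date"])

idx_val = df.index[0]

# With a datetime64 level, the matching entry is removed as expected.
result = df.index.difference([idx_val])
print(result)
```

Here `result` holds only the entry for id 2, which is the behavior expected from the pyarrow-backed index as well.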

Expected Behavior

We expect the same behavior with timestamp[ns][pyarrow] as with other types: the element passed to difference should be removed from the resulting index.

Installed Versions

INSTALLED VERSIONS
------------------
commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.10.12
python-bits : 64
OS : Linux
OS-release : 5.15.167.4-microsoft-standard-WSL2
Version : #1 SMP Tue Nov 5 00:21:55 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.3
numpy : 1.26.4
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 22.0.2
Cython : None
sphinx : 8.1.3
IPython : 8.30.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.4
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 18.1.0
pyreadstat : None
pytest : 8.3.4
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

Comment From: arthurlw

Thanks for raising this! Confirmed on main.

MultiIndex.difference doesn't exclude the matching row when the index includes a timestamp[ns][pyarrow] column.

Comment From: rhshadrach

MultiIndex._convert_can_do_setop creates a MultiIndex internally from the provided list, which results in a DatetimeIndex level. That level then doesn't compare against the PyArrow-backed level. It seems to me we should enable comparisons between the two.
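The conversion described above can be illustrated with a small sketch. It assumes the list of tuples is turned into a MultiIndex via the public `MultiIndex.from_tuples` constructor, and uses a hand-written tuple in place of `df.index[0]`; the private method's exact internals are not reproduced here.

```python
import pandas as pd

# Hypothetical stand-in for the list passed to difference():
# one tuple holding an integer id and a Timestamp.
tuples = [(1, pd.Timestamp("1900-01-01"))]

# Building a MultiIndex from plain tuples lets pandas infer the level types.
other = pd.MultiIndex.from_tuples(tuples, names=["id", "date"])

# The inferred "date" level is a NumPy-backed DatetimeIndex,
# not a timestamp[ns][pyarrow] level like the one in the original index.
print(type(other.levels[1]).__name__)
print(other.levels[1].dtype)
```

The second level is inferred as a datetime64 level, so the set operation ends up comparing it against the pyarrow-backed level of the original index.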