Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from datetime import date
import pandas as pd
mi_a = pd.MultiIndex.from_tuples([(date(2001, 1, 1), "foo")], names=["first", "second"])
mi_b = pd.MultiIndex.from_tuples([(pd.Timestamp(date(2001, 1, 1)), "asdf")], names=["first", "second"])
mi_a.union(mi_b)
Issue Description
Calling mi_a.union(mi_b) raises the following exception:
InvalidIndexError Traceback (most recent call last)
...
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Expected Behavior
I would have expected the two values date(2001, 1, 1) and pd.Timestamp(date(2001, 1, 1)) to be treated as different values, which is how I believe pd.DataFrame.drop_duplicates acts. Treating the two values as identical could also be valid, but I don't think raising an exception is.
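A quick way to check the drop_duplicates comparison is sketched below. The column name and the explicit dtype=object are my own choices for illustration, and the printed count may differ between pandas versions, so no particular output is claimed here.

```python
from datetime import date

import pandas as pd

# Keep the column as object dtype so pandas does not coerce the values
# to datetime64 while building the frame (an assumption made so that the
# two original objects are the ones being compared).
values = pd.Series(
    [date(2001, 1, 1), pd.Timestamp(date(2001, 1, 1))],
    dtype=object,
    name="first",
)
df = values.to_frame()

deduped = df.drop_duplicates()

# 2 rows means the values were treated as different, 1 row means identical.
print(len(deduped))
```

Either outcome completes without an exception, which is the behavior I would expect from union as well.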
Installed Versions
Comment From: jwg4
This apparent bug also affects DataFrame.combine_first, another basic operation that could be expected to succeed, when the frames use a MultiIndex as described above:
from datetime import date
import pandas as pd
df_a = pd.DataFrame(
[
(date(2001, 1, 1), "foo", 11),
],
columns=["a", "b", "c"]
)
df_a = df_a.set_index(["a", "b"])
df_b = pd.DataFrame(
[
(pd.Timestamp(date(2001, 1, 1)), "bar", 33),
],
columns=["a", "b", "c"]
)
df_b = df_b.set_index(["a", "b"])
df_a.combine_first(df_b)
Comment From: rhshadrach
Thanks for the report. pandas is converting the Index to a DatetimeIndex here:
https://github.com/pandas-dev/pandas/blob/7c2796d134e74f613cbfd85137d6809f5abf39a4/pandas/core/indexes/base.py#L6217-L6221
Further investigations are welcome!
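In the meantime, a possible workaround (a sketch only, not a confirmed fix) is to make the first level the same type on both sides before calling union. The helper name normalize_first_level is hypothetical, and the approach assumes that converting the date values to Timestamps is acceptable for the use case.

```python
from datetime import date

import pandas as pd

mi_a = pd.MultiIndex.from_tuples(
    [(date(2001, 1, 1), "foo")], names=["first", "second"]
)
mi_b = pd.MultiIndex.from_tuples(
    [(pd.Timestamp(date(2001, 1, 1)), "asdf")], names=["first", "second"]
)


def normalize_first_level(mi):
    """Hypothetical helper: rebuild the MultiIndex with level 0 as datetime64."""
    return pd.MultiIndex.from_arrays(
        [
            # pd.to_datetime accepts datetime.date objects as well as Timestamps.
            pd.to_datetime(mi.get_level_values(0)),
            mi.get_level_values(1),
        ],
        names=mi.names,
    )


# Both inputs now share a datetime64 first level, so union no longer has to
# reconcile date objects with Timestamps.
combined = normalize_first_level(mi_a).union(normalize_first_level(mi_b))
print(combined)
```

With both first levels normalized, the union should contain both rows. The same normalization could be applied to the index of each frame before combine_first, though I have not verified that case.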