Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
index1_without_name = pd.Index([1, 2])
index1_with_name = pd.Index([1, 2], name="x")
index2_without_duplicates = pd.Index([2, 3])
index2_with_duplicates = pd.Index([2, 3, 2])
multi_index1_without_name = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")])
multi_index1_with_name = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")], names=["x", "y"])
multi_index2_without_duplicates = pd.MultiIndex.from_tuples([(2, "b"), (3, "c")])
multi_index2_with_duplicates = pd.MultiIndex.from_tuples([(2, "b"), (3, "c"), (2, "b")])
# These work
print(index1_without_name.union(index2_without_duplicates))
print(index1_without_name.union(index2_with_duplicates))
print(index1_with_name.union(index2_without_duplicates))
print(index1_with_name.union(index2_with_duplicates))
print(multi_index1_without_name.union(multi_index2_without_duplicates))
print(multi_index1_without_name.union(multi_index2_with_duplicates))
print(multi_index1_with_name.union(multi_index2_without_duplicates))
# This one raises
print(multi_index1_with_name.union(multi_index2_with_duplicates))
Issue Description
For 2 MultiIndex instances i1 and i2, i1.union(i2) behaves inconsistently depending on whether i1 has names and whether i2 has duplicates:
- If i1 has no names or i2 has no duplicates, then i1.union(i2) works as expected.
- If i1 has names and i2 has duplicates, then i1.union(i2) raises ValueError: cannot join with no overlapping index names
In addition, if i1 and i2 are plain Index instances, then the case that is problematic for MultiIndex (names and duplicates) works as expected.
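The four MultiIndex combinations described above can be checked in one pass. The following is a minimal sketch, not part of the original report: the helper `try_union` and the `outcomes` dict are made up for illustration, and whether the last combination raises depends on the installed pandas version.

```python
import itertools

import pandas as pd

# First operand: the same tuples, with and without level names.
left = {
    "no names": pd.MultiIndex.from_tuples([(1, "a"), (2, "b")]),
    "names": pd.MultiIndex.from_tuples([(1, "a"), (2, "b")], names=["x", "y"]),
}
# Second operand: with and without a duplicated entry.
right = {
    "no duplicates": pd.MultiIndex.from_tuples([(2, "b"), (3, "c")]),
    "duplicates": pd.MultiIndex.from_tuples([(2, "b"), (3, "c"), (2, "b")]),
}


def try_union(i1, i2):
    """Hypothetical helper: return 'ok', or the name of the exception raised."""
    try:
        i1.union(i2)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


# Record the outcome of every (left, right) combination.
outcomes = {
    (left_key, right_key): try_union(left[left_key], right[right_key])
    for left_key, right_key in itertools.product(left, right)
}
for combination, outcome in outcomes.items():
    print(combination, "->", outcome)
```

On an affected version, only the ("names", "duplicates") combination should report ValueError.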
Expected Behavior
I expect no exception to be raised. The result should contain the duplicate values of the second MultiIndex as duplicates, just as in the other cases for consistency (although personally this did surprise me, but that's a different topic).
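Since the union works when the second MultiIndex has no duplicates, one possible stopgap is to deduplicate it first. This is only a sketch under that assumption: `union_without_duplicates` is a hypothetical helper, and it deliberately drops the duplicates from the second index, so its result differs from the expected behavior described above.

```python
import pandas as pd


def union_without_duplicates(i1, i2):
    """Hypothetical helper: drop duplicates from i2 before taking the union."""
    return i1.union(i2.unique())


i1 = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")], names=["x", "y"])
i2 = pd.MultiIndex.from_tuples([(2, "b"), (3, "c"), (2, "b")])

# Avoids the named-plus-duplicates combination that raises in the report.
result = union_without_duplicates(i1, i2)
print(result)
```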
Installed Versions
Comment From: WillAyd
Thanks for the find - there is something strange going on here.
> The result should contain the duplicate values of the second MultiIndex as duplicates, just as in the other cases for consistency (although personally this did surprise me, but that's a different topic).
This sounds like a separate issue with Index.union not returning unique values. Could you check for an existing issue on that and, if there is none, open a new one with a minimal reproducer?
Comment From: torfsen
> This sounds like a separate issue with Index.union not returning unique values.
@WillAyd I'm not sure, but #56137 sounds like this is the intended (but undocumented) behavior.
Comment From: WillAyd
Ah, nice find! In that case, it does seem intentional.