Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [x] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
df1 = pd.DataFrame({"col1":[1], "col2":[2]})
df2 = pd.DataFrame({"col1":[1], "col2":[2], "col2_dup":[3]})
pd.merge(df1, df2, on="col1", suffixes=("_dup", ""))
# Observe (1)
pd.merge(df1, df2, on="col1", suffixes=("", "_dup"))
# Observe (2)
Issue Description
Case 1 provides the following result:
   col1  col2_dup  col2  col2_dup
0     1         2     2         3
Case 2 results in an exception:
pandas.errors.MergeError: Passing 'suffixes' which cause duplicate columns {'col2_dup'} is not allowed.
The MergeError in case 2 makes sense (ideally duplicate columns should not be allowed, as they might cause confusion). However, case 1 produces the same duplicate column, `col2_dup`, and no exception is raised.
Expected Behavior
Since this bug is about consistency, one of the following two should happen:
- An error should be raised in both cases.
- An error should not be raised in either case, and the duplicate column should be allowed.
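To illustrate the first option, here is a minimal sketch of a symmetric duplicate check. It works on plain lists of column labels and does not call pandas at all. The helpers `suffixed_columns` and `duplicated_labels` are hypothetical names made up for this example, not pandas internals, and the suffixing logic is only an approximation of what `merge` does for this simple case.

```python
from collections import Counter


def suffixed_columns(left, right, on, suffixes):
    """Hypothetical helper: approximate the output labels of a merge.

    Overlapping non-key columns get the left or right suffix; the key
    columns in `on` appear once, taken from the left side.
    """
    lsuffix, rsuffix = suffixes
    overlap = (set(left) & set(right)) - set(on)
    left_out = [c + lsuffix if c in overlap else c for c in left]
    right_out = [
        c + rsuffix if c in overlap else c for c in right if c not in on
    ]
    return left_out + right_out


def duplicated_labels(labels):
    """Return the set of labels that occur more than once."""
    return {label for label, n in Counter(labels).items() if n > 1}


left = ["col1", "col2"]
right = ["col1", "col2", "col2_dup"]

# Case 1: the suffix on the left column collides with a right column.
case1 = suffixed_columns(left, right, on=["col1"], suffixes=("_dup", ""))
# Case 2: the suffix on the right column collides with a right column.
case2 = suffixed_columns(left, right, on=["col1"], suffixes=("", "_dup"))

print(case1, duplicated_labels(case1))
print(case2, duplicated_labels(case2))
```

Because the check looks at the final list of labels rather than at one side only, it flags `col2_dup` in both cases, which is the consistent behavior requested above.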
Installed Versions
Comment From: nikaltipar
Possibly related #13659
Comment From: rhshadrach
Thanks for the report! Agreed both should raise. PRs to fix are welcome!
Comment From: nikaltipar
Ok, I'll see if I can squeeze this in this month. It should be easy enough to fix. Of course, if anyone else wants to take it up, feel free to!
Comment From: samruddhibaviskar11
take
Comment From: samruddhibaviskar11
Hi, I'm new to open source and pandas internals, but I'd love to try fixing this. I might ask a few beginner questions as I go; I hope that's okay!
Comment From: chilin0525
@nikaltipar @samruddhibaviskar11 Just a reminder: the issue already has a PR to address it — https://github.com/pandas-dev/pandas/pull/61422
Comment From: Farsidetfs
Sorry I didn't make a note here that I was working on this. Apologies @samruddhibaviskar11. I'm just finalizing the documentation and have finished the recommended revisions. Do I need to ask to "take" here?
Comment From: Farsidetfs
take