Pandas BUG: Duplicate columns allowed on merge if originating from separate dataframes

Pandas version checks

[x] I have checked that this issue has not already been reported.
[x] I have confirmed this bug exists on the latest version of pandas.
[x] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
df1 = pd.DataFrame({"col1":[1], "col2":[2]})
df2 = pd.DataFrame({"col1":[1], "col2":[2], "col2_dup":[3]})

pd.merge(df1, df2, on="col1", suffixes=("_dup", ""))
# Observe (1)

pd.merge(df1, df2, on="col1", suffixes=("", "_dup"))
# Observe (2)

Issue Description

Case 1 provides the following result:

   col1  col2_dup  col2  col2_dup
0     1         2     2         3

Case 2 results in an exception:

pandas.errors.MergeError: Passing 'suffixes' which cause duplicate columns {'col2_dup'} is not allowed.

The MergeError in Case 2 makes sense, since duplicate columns are confusing and ideally should not be allowed. However, Case 1 produces the same duplicate column (col2_dup) and no exception is raised.
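Until the two cases behave the same way, the duplicate can at least be detected on the caller side. Below is a minimal sketch of such a check. The helper name `duplicate_columns` is hypothetical, and the frame is built by hand to mirror the Case 1 output rather than produced by `pd.merge`, so the snippet does not depend on which pandas version is installed.

```python
import pandas as pd


def duplicate_columns(frame: pd.DataFrame) -> list[str]:
    """Return the column labels that occur more than once in `frame`."""
    cols = frame.columns
    # Index.duplicated() marks every repeat of a label after its first occurrence.
    return cols[cols.duplicated()].unique().tolist()


# Hand-built frame with the same labels as the Case 1 result (an assumption
# made for illustration; it is not the output of an actual merge call).
case1_like = pd.DataFrame(
    [[1, 2, 2, 3]], columns=["col1", "col2_dup", "col2", "col2_dup"]
)
print(duplicate_columns(case1_like))
```

Running the same helper on the result of a real merge would flag the silent duplicate from Case 1.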

Expected Behavior

Since this bug is about consistency, one of the following two things should happen:

An error should be raised in both cases.
No error should be raised in either case, and the duplicate column should be allowed.
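To illustrate the first option, here is a rough pure-Python sketch of a check that treats both suffix orders the same way by looking at the combined output labels. It is not pandas's implementation: `merged_labels` is a hypothetical helper, it raises a plain `ValueError` instead of `MergeError`, it handles only a single `on` key, and it assumes neither input already contains duplicate labels.

```python
from collections import Counter


def merged_labels(left_cols, right_cols, on, suffixes):
    """Compute the output labels of a merge on `on`, raising on any duplicate."""
    lsuffix, rsuffix = suffixes
    # Only non-key columns present on both sides receive a suffix.
    overlap = (set(left_cols) & set(right_cols)) - {on}
    left_out = [c + lsuffix if c in overlap else c for c in left_cols]
    right_out = [
        c + rsuffix if c in overlap else c for c in right_cols if c != on
    ]
    labels = left_out + right_out
    # Check duplicates across the combined labels, not per side.
    dups = sorted(name for name, n in Counter(labels).items() if n > 1)
    if dups:
        raise ValueError(f"suffixes cause duplicate columns: {dups}")
    return labels


left = ["col1", "col2"]
right = ["col1", "col2", "col2_dup"]
for suffixes in [("_dup", ""), ("", "_dup")]:
    try:
        merged_labels(left, right, "col1", suffixes)
    except ValueError as exc:
        print(suffixes, "->", exc)
```

With this check, both suffix orders from the reproducible example are rejected, because col2_dup appears twice in the combined labels either way.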

Installed Versions

INSTALLED VERSIONS
------------------
commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.11.7
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 170 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 2.2.3
numpy : 2.2.5
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 23.2.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None

Comment From: nikaltipar

Possibly related #13659

Comment From: rhshadrach

Thanks for the report! Agreed both should raise. PRs to fix are welcome!

Comment From: nikaltipar

Ok, I'll see if I can squeeze this in this month. It should be easy enough to fix. Of course, if anyone else wants to take it up, feel free to!

Comment From: samruddhibaviskar11

take

Comment From: samruddhibaviskar11

Hi, I'm new to open source and pandas internals, but I'd love to try fixing this. I might ask a few beginner questions as I go; I hope that's okay!

Comment From: chilin0525

@nikaltipar @samruddhibaviskar11 Just a reminder: the issue already has a PR to address it — https://github.com/pandas-dev/pandas/pull/61422

Comment From: Farsidetfs

Sorry, I didn't note here that I was working on this. Apologies, @samruddhibaviskar11. I'm just finalizing the documentation and have finished the recommended revisions. Do I need to ask to "take" here?

Comment From: Farsidetfs

take