Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
pd.DataFrame()
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

Issue Description

Using filterwarnings with action 'once' should print a warning of a given category and text only once. But calling pd.DataFrame() or other pandas functions (like pd.read_csv) causes both warnings to be shown twice. Removing the pd.DataFrame() call yields the expected behaviour.

I read issue #31978, which was closed as a PyCharm issue, but I am using VS Code and I verified my example in a terminal on both Windows and Ubuntu.

Expected Behavior

Both warnings ("This is a warning" and "This is a second warning") should be shown only once each.

Installed Versions

INSTALLED VERSIONS
------------------
commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.10.12
python-bits : 64
OS : Linux
OS-release : 5.15.153.1-microsoft-standard-WSL2
Version : #1 SMP Fri Mar 29 23:14:13 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.3
numpy : 2.1.2
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 24.3.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : 3.9.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 17.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

Comment From: Tuckersteward

I am unable to reproduce this. I am also using VS Code, and I checked in a terminal as well.

My outputs are identical, and appear to be correct, for the following two snippets:

import warnings

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning")
warnings.warn("This is a warning")
warnings.warn("This is a second warning")
warnings.warn("This is a second warning")

and

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

pd.DataFrame()
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

Do you have any other information? It could be tied to your version of Python, but I tested this on 3.10.11, 3.13.11 and 3.12.4.

Comment From: StrawberryOwl

Did you run the code as you wrote it? Or did you run my example completely unchanged? If you just issue the warnings as you did, my output is the same as well. The problem appears if you raise the warnings before pd.DataFrame() AND after it.

Running my code WITH pd.DataFrame()

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
pd.DataFrame()
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

results in

warningtest.py:6: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
warningtest.py:8: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)
warningtest.py:11: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
warningtest.py:13: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)

Running the example WITHOUT pd.DataFrame

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

Results in:

warningtest.py:6: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
warningtest.py:8: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)

The second output is as I expected filterwarnings to work. The docs say:

"once" print only the first occurrence of matching warnings, regardless of location

When I call a pandas function, the filter seems to be reset: both warnings are shown again, even though they were already issued before the call.
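One way to check this without reading stderr by eye is to count the warnings that are actually shown. The following is only a sketch: `count_emissions`, `emit_burst` and `trigger` are names made up for illustration, and it relies on `warnings.catch_warnings(record=True)` from the standard library to collect the warnings instead of printing them.

```python
import warnings


def emit_burst():
    # Same four calls as in the report: each message is issued twice.
    warnings.warn("This is a warning", UserWarning)
    warnings.warn("This is a warning", UserWarning)
    warnings.warn("This is a second warning", UserWarning)
    warnings.warn("This is a second warning", UserWarning)


def count_emissions(trigger):
    """Return the messages actually shown when `trigger` runs between two bursts."""
    # record=True collects shown warnings in a list; the context manager is
    # entered before any warning is issued and restores the filters on exit.
    with warnings.catch_warnings(record=True) as caught:
        warnings.filterwarnings("once", category=UserWarning)
        emit_burst()
        trigger()  # hypothetical hook: the call under suspicion goes here
        emit_burst()
    return [str(w.message) for w in caught]


# With a no-op trigger, "once" should let each message through a single time.
print(count_emissions(lambda: None))
```

Passing `pd.DataFrame` as the trigger should, going by the outputs above, return four entries instead of two on an affected setup.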

Comment From: Tuckersteward

Ah I see it now.

Upon further investigation, it appears the warnings registry is being reset. pd.DataFrame() only triggers this when it creates an empty DataFrame; if we create it with data, we do not see the issue. While testing further, I found that merge has a similar issue, but there it always occurs.

You can see the issue with merge using the code below.

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
pd.merge(pd.DataFrame({"A": [1, 2], "B": [3, 4]}), pd.DataFrame({"A": [5, 6], "C": [7, 8]}), on="A", how="inner")
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

returns

test3.py:6: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
test3.py:8: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)
test3.py:11: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
test3.py:13: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)

Below we check the warnings filters before and after using merge/DataFrame:

import warnings
import pandas as pd
import copy

warnings.filterwarnings("once", category=UserWarning)
before = copy.deepcopy(warnings.filters)
pd.DataFrame()
after = copy.deepcopy(warnings.filters)
print(before == after)

returns

True

Since the filters themselves are unchanged, the cause is most likely the warnings registry being reset. If anyone has any insight, it would be greatly appreciated, as I've hit a bit of a wall.

Comment From: rhshadrach

I believe this is https://github.com/python/cpython/issues/73858
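If that is the cause, pandas should not be needed to reproduce it. A minimal sketch, assuming (the thread does not confirm this) that the pandas call enters `warnings.catch_warnings()` somewhere internally; the `record` helper and the `shown` list are illustrative names, and the `"version"` entry in `__warningregistry__` is a CPython implementation detail:

```python
import warnings

shown = []


def record(message, category, filename, lineno, file=None, line=None):
    # Replacement for warnings.showwarning that records instead of printing.
    shown.append(str(message))


previous_showwarning = warnings.showwarning
warnings.showwarning = record
warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
# Per-module registry that remembers which warnings were already shown.
version_before = globals()["__warningregistry__"].get("version")

# Stand-in for the pandas call (assumption: it enters catch_warnings internally).
with warnings.catch_warnings():
    pass

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
version_after = globals()["__warningregistry__"].get("version")

warnings.showwarning = previous_showwarning
print(shown)
print(version_before, version_after)
```

On CPython I would expect `shown` to contain the message twice and the two version values to differ, matching the behaviour reported above.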

Comment From: StrawberryOwl

Is there any hope that this can be fixed in the foreseeable future, given that the issue has already been open for a few years? Have there been any updates on python/cpython#8232?
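Until then, one possible workaround is to deduplicate in user code rather than relying on the "once" action. This is only a sketch under that assumption: `warn_once` and `_seen` are hypothetical names, not part of the `warnings` module or pandas, and the helper only covers warnings that you issue yourself.

```python
import warnings

# (category, text) pairs that have already been issued through warn_once.
_seen = set()


def warn_once(message, category=UserWarning, stacklevel=2):
    """Issue the warning only the first time this (category, text) pair is seen.

    Returns True if the warning was passed on to warnings.warn, False otherwise.
    """
    key = (category, str(message))
    if key in _seen:
        return False
    _seen.add(key)
    # stacklevel=2 attributes the warning to the caller of warn_once.
    warnings.warn(message, category, stacklevel=stacklevel)
    return True


warn_once("This is a warning")
with warnings.catch_warnings():
    pass  # a registry reset here no longer matters
warn_once("This is a warning")  # skipped: already in _seen
```

Because the bookkeeping lives in `_seen` rather than in the warnings registry, a reset of the registry does not bring the message back. The trade-off is that the usual filters still apply to the first emission, and warnings raised by third-party code are not covered.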

Comment From: mroeschke

Closing as a cpython issue