Pandas BUG: numerical inconsistency in calculating rolling kurtosis

Pandas version checks

[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

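import pandas as pd

# One -1 followed by nineteen 1s
series_1 = pd.Series([-1] + ([1] * 19))
print(series_1.kurt())                    # reported: 20.00000000000001
print(series_1.rolling(20).kurt().max())  # reported: 20.0

# Seven -1s followed by nineteen 1s
series_2 = pd.Series(([-1] * 7) + ([1] * 19))
print(series_2.rolling(20).kurt().max())  # reported: 20.00000000000001

# Six -1s followed by nineteen 1s
series_3 = pd.Series(([-1] * 6) + ([1] * 19))
print(series_3.rolling(20).kurt().max())  # reported: 20.0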

Issue Description

I ran into a problem when calculating rolling kurtosis for a specific kind of data.

For series_1 = pd.Series([-1] + ([1] * 19)), I checked the source code and expected the kurtosis to be 20.00000000000001 because of binary rounding error. This holds for series_1.kurt(), but the rolling version behaves oddly and returns exactly 20.0.

The inconsistency also appears with another series, series_2 = pd.Series(([-1] * 7) + ([1] * 19)). This time the max rolling kurtosis is 20.00000000000001, which is not equal to the max rolling kurtosis of series_1. However, series_3 gives 20.0 again.

You can construct similar series to see the different behaviors. What is the rationale for this? Why does pandas sometimes return 20.0?
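
As a reference point, the value for a window holding one -1 and nineteen 1s can be checked in exact rational arithmetic. The sketch below assumes the standard bias-corrected sample excess kurtosis formula (that pandas uses this formula is an assumption here, not something confirmed in this report), and exact_sample_kurtosis is a hypothetical helper, not a pandas function.

```python
from fractions import Fraction


def exact_sample_kurtosis(values):
    """Bias-corrected sample excess kurtosis, computed without floating point.

    Hypothetical helper for illustration; assumes the standard formula
    n(n+1)(n-1)*m4 / ((n-2)(n-3)*m2^2) - 3(n-1)^2 / ((n-2)(n-3)),
    where m2 and m4 are sums of squared and fourth-power deviations.
    """
    xs = [Fraction(v) for v in values]
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)
    m4 = sum((x - mean) ** 4 for x in xs)
    numerator = n * (n + 1) * (n - 1) * m4
    denominator = (n - 2) * (n - 3) * m2 ** 2
    adjustment = Fraction(3 * (n - 1) ** 2, (n - 2) * (n - 3))
    return numerator / denominator - adjustment


window = [-1] + [1] * 19
print(exact_sample_kurtosis(window))  # prints 20
```

Under that assumption the exact value is 20, so both 20.0 and 20.00000000000001 are floating-point renderings of the same quantity, and the question is only why the rolling path switches between them.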

Expected Behavior

Expected all results to be 20.00000000000001.

Installed Versions

INSTALLED VERSIONS
------------------
commit : 0f437949513225922d851e9581723d82120684a6
python : 3.11.5.final.0
python-bits : 64
OS : Darwin
OS-release : 23.4.0
Version : Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:49 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6020
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 2.0.3
numpy : 1.24.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.2.1
Cython : None
pytest : 7.4.0
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.15.0
pandas_datareader: None
bs4 : 4.12.2
bottleneck : 1.3.5
brotli :
fastparquet : None
fsspec : 2023.4.0
gcsfs : None
matplotlib : 3.7.2
numba : 0.57.1
numexpr : 2.8.4
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2023.4.0
scipy : 1.11.1
snappy :
sqlalchemy : 1.4.39
tables : 3.8.0
tabulate : 0.8.10
xarray : 2023.6.0
xlrd : None
zstandard : 0.19.0
tzdata : 2023.3
qtpy : 2.2.0
pyqt5 : None

Comment From: auderson

The rolling algorithms in pandas generally use an online (incremental) update for performance, so results can differ slightly due to floating-point artifacts. If you really need consistent results, you can try rolling.apply(lambda x: x.kurt()), but this is much slower.
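
A minimal sketch of that workaround, using two of the series from the report. The helper name slow_rolling_kurt is made up for illustration; it only wraps rolling.apply with Series.kurt as suggested above.

```python
import pandas as pd

series_1 = pd.Series([-1] + ([1] * 19))
series_2 = pd.Series(([-1] * 7) + ([1] * 19))


def slow_rolling_kurt(series, window):
    # Hypothetical helper: recompute the kurtosis from scratch for every
    # window through Series.kurt() instead of the online update.
    return series.rolling(window).apply(lambda x: x.kurt())


slow_1 = slow_rolling_kurt(series_1, 20).max()
slow_2 = slow_rolling_kurt(series_2, 20).max()

# The window with the largest kurtosis holds the same values in both series
# (one -1 followed by nineteen 1s), so the same function sees the same input.
print(slow_1, slow_2)
print(slow_1 == slow_2)
```

Because every window is passed through the same non-rolling code path, the two maxima should agree with each other, at the cost of one Python-level call per window.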

Comment From: HaloCollider

I understand that the online updating method may cause numerical instability along a time series. But whether the result is 20.0 or slightly larger than 20.0 is a property of the series as a whole. In other words, you never get both 20.0 and a value larger than 20.0 within a single series.

For example: pd.Series([1] * 19 + [-1] * 7 + [1] * 1).rolling(20).kurt().max() gives 20.00000000000001, while pd.Series([1] * 19 + [-1] * 7 + [1] * 2).rolling(20).kurt().max() gives 20.0. The only difference is one additional 1 at the tail, which should not affect the max kurtosis from the point of view of online updating.

(That's also why I'm calling it inconsistency rather than instability.)

Comment From: auderson

Looks like it's due to a demean operation prior to calculation:

https://github.com/pandas-dev/pandas/blob/283a2dcb2f91db3452a9d2ee299632a109b224f4/pandas/_libs/window/aggregations.pyx#L828-L838

Comment From: HaloCollider

Thanks a lot. This solves my issue. I had checked the source before but missed the demean operation, so my own version produced consistent results, which is what caused the confusion.

I found that the exact thresholds for the proportion of 1s in a series are 0.25 and 0.75, i.e., a series mean of -0.5 and 0.5. Series whose proportion of 1s falls outside the range 0.25 to 0.75 give 20.0.
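
The thresholds can be illustrated with a small sketch. It assumes the demean step subtracts the mean of the whole series rounded to the nearest integer; this is my reading of why the cut-offs sit at a mean of -0.5 and 0.5, not a copy of the pandas source, and assumed_shift is a hypothetical helper.

```python
series_examples = {
    "series_1": [-1] + [1] * 19,
    "series_2": [-1] * 7 + [1] * 19,
    "series_3": [-1] * 6 + [1] * 19,
}


def assumed_shift(values):
    # ASSUMPTION: the offset removed before the rolling computation is the
    # mean of the whole series rounded to the nearest integer.
    # Ties at exactly +/-0.5 are not modeled: Python's round() may break
    # them differently from the compiled code.
    return round(sum(values) / len(values))


for name, values in series_examples.items():
    share_of_ones = values.count(1) / len(values)
    print(name, round(share_of_ones, 3), assumed_shift(values))
```

With these inputs the assumed shift is 1 for series_1 and series_3 and 0 for series_2, which lines up with the reported 20.0 for the first pair and 20.00000000000001 for series_2.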