Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
from datetime import datetime
datetime(2020, 1, 1) + pd.DateOffset(days=1.5)
datetime(2020, 1, 1) + pd.DateOffset(hours=1.5)
Issue Description
After updating pandas from 1.5.3 to 2.1.2, the behaviour of DateOffset has changed. Executing datetime(2020, 1, 1) + pd.DateOffset(days=1.5) now returns datetime(2020, 1, 2) rather than datetime(2020, 1, 2, 12), so the fractional half day is silently dropped. By contrast, datetime(2020, 1, 1) + pd.DateOffset(hours=1.5) returns the expected datetime(2020, 1, 1, 1, 30).
Expected Behavior
Either pd.DateOffset(days=1.5) should raise an exception because the input is not an integer, or the addition should return datetime(2020, 1, 2, 12).
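The two acceptable outcomes can be sketched with the standard library alone. This is only an illustration, not pandas code: the helper name `validated_days` is hypothetical, and `datetime.timedelta` stands in for an offset that honours the fractional part.

```python
import datetime as dt


def validated_days(days):
    """Return days as an int, raising if it has a fractional part.

    Hypothetical helper: it only illustrates the "raise on non-integer
    input" option and is not part of pandas.
    """
    if days != int(days):
        raise ValueError(f"days must be an integer, got {days!r}")
    return int(days)


start = dt.datetime(2020, 1, 1)

# Option 1: reject the fractional value up front.
try:
    validated_days(1.5)
except ValueError as exc:
    print(exc)  # days must be an integer, got 1.5

# Option 2: honour the fraction, as timedelta does.
print(start + dt.timedelta(days=1.5))  # 2020-01-02 12:00:00
```

Either outcome would avoid the silent truncation described above.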
Installed Versions
INSTALLED VERSIONS
commit : a60ad39b4a9febdea9a59d602dad44b1538b0ea5
python : 3.9.13.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : de_DE.cp1252
pandas : 2.1.2
numpy : 1.26.1
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : 3.0.5
pytest : 7.4.3
hypothesis : None
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.17.2
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.1
numba : 0.58.1
numexpr : 2.8.7
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 14.0.1
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.1
sqlalchemy : None
tables : 3.9.1
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2023.3
qtpy : 2.4.1
pyqt5 : None
Comment From: jbrockmendel
I'd expect that to raise in the constructor.
Comment From: jrmylow
Bisecting finds that this first cropped up in PR 53681.
Summary: the line that causes this behaviour is offsets.pyx#L1356. The cause is that relativedelta from dateutil handles multiplication incorrectly when a field is fractional. I'm raising a bug report with them.
Tracing down, there is some interesting behaviour starting with the following setup:
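import datetime as dt

import dateutil.relativedelta  # import the submodule explicitly
import pandas as pd

start = dt.datetime(2020, 1, 1)
# The same 1.5-day offset expressed with three different types
offset_1 = dt.timedelta(days=1.5)
offset_2 = pd.DateOffset(days=1.5)
offset_3 = dateutil.relativedelta.relativedelta(days=1.5)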
Testing each offset reveals that multiplying the relativedelta is the cause here:
>>> start + offset_1
datetime.datetime(2020, 1, 2, 12, 0)
>>> start + offset_2
Timestamp('2020-01-02 00:00:00')
>>> start + offset_3
datetime.datetime(2020, 1, 2, 12, 0)
>>> start + offset_3 * 1
datetime.datetime(2020, 1, 2, 0, 0)
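The last two results are consistent with the multiplication casting each scaled field to an integer. That is an assumption drawn from the output above, not something confirmed here against the dateutil source. A minimal standard-library model of that behaviour, with the hypothetical helper `scaled_days_truncating` standing in for the suspected cast:

```python
import datetime as dt


def scaled_days_truncating(days, factor):
    # Assumption: models an int() cast applied to a field after scaling,
    # which would discard the fractional half day.
    return int(days * factor)


start = dt.datetime(2020, 1, 1)

# With truncation, 1.5 days * 1 collapses to 1 day.
truncated = start + dt.timedelta(days=scaled_days_truncating(1.5, 1))
print(truncated)  # 2020-01-02 00:00:00

# Without truncation, the half day survives the multiplication.
preserved = start + dt.timedelta(days=1.5 * 1)
print(preserved)  # 2020-01-02 12:00:00
```

Under this model, any fractional day count multiplied by 1 loses its fraction, which matches the midnight result seen for offset_3 * 1.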
Comment From: jrmylow
Raised here: dateutil/issues/1318