Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
pd.Series(pd.Timedelta(1)) + pd.Series(['1'])
Issue Description
The example above should produce pd.Series(pd.Timedelta(2)), but it raises a TypeError instead.
There are some other cases that don't work either:
- pd.Series(['1']) + pd.Series(pd.Timedelta(1))
- pd.Timedelta(1) + pd.Series(['1'])
- pd.Series(pd.Timedelta(1)) + '1'
- '1' + pd.Series(pd.Timedelta(1))
- pd.Timedelta(1) + pd.Series('1')
But some cases do work:
- pd.DataFrame(['1']) + pd.DataFrame([pd.Timedelta(1)])
- pd.DataFrame([pd.Timedelta(1)]) + pd.DataFrame(['1'])
- pd.Series(['1']) + pd.Timedelta(1)
- pd.Series('1') + pd.Timedelta(1)
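To compare the combinations listed above on a given pandas build, a small harness can evaluate each expression and record whether it returns or raises. This is only a sketch and not part of the original report: `outcome` and `report_cases` are hypothetical helper names, and the results depend on the pandas version and the `infer_string` setting, so no expected output is shown.

```python
def outcome(thunk):
    """Return 'ok' if thunk() returns, else the name of the exception raised."""
    try:
        thunk()
    except Exception as exc:
        return type(exc).__name__
    return "ok"


def report_cases():
    """Evaluate the expressions from the report and print one outcome per line."""
    import pandas as pd  # imported here so `outcome` is usable without pandas

    td = pd.Timedelta(1)
    cases = {
        # Combinations reported as raising TypeError on pandas 2.2.2
        "Series[td] + Series[str]": lambda: pd.Series(td) + pd.Series(["1"]),
        "Series[str] + Series[td]": lambda: pd.Series(["1"]) + pd.Series(td),
        "Timedelta + Series[str]": lambda: td + pd.Series(["1"]),
        "Series[td] + str": lambda: pd.Series(td) + "1",
        "str + Series[td]": lambda: "1" + pd.Series(td),
        # Combinations reported as returning a result on pandas 2.2.2
        "DataFrame[str] + DataFrame[td]": lambda: pd.DataFrame(["1"]) + pd.DataFrame([td]),
        "DataFrame[td] + DataFrame[str]": lambda: pd.DataFrame([td]) + pd.DataFrame(["1"]),
        "Series[str] + Timedelta": lambda: pd.Series(["1"]) + td,
    }
    results = {label: outcome(thunk) for label, thunk in cases.items()}
    for label, result in results.items():
        print(f"{label:32} -> {result}")
    return results
```

Calling `report_cases()` on two different builds and comparing the returned dictionaries shows which combinations changed behavior between them.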
Expected Behavior
Given that Timedelta itself seems to support arithmetic with scalar strings, it seems that DataFrames and Series of timedeltas should support arithmetic with strings or data structures containing strings.
Installed Versions
INSTALLED VERSIONS
------------------
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.9.18.final.0
python-bits : 64
OS : Darwin
OS-release : 23.6.0
Version : Darwin Kernel Version 23.6.0: Mon Jul 29 21:13:04 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6020
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.2
numpy : 1.26.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 8.18.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.4
qtpy : None
pyqt5 : None
Comment From: sfc-gh-mvashishtha
Also, only some operators work: pd.Series(['1']) + pd.Timedelta(1) and pd.Series(['1']) == pd.Timedelta(1) work, but pd.Series(['1']) <= pd.Timedelta(1) does not.
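The operator inconsistency described in this comment can be checked with a short loop over the three operators mentioned. This is a sketch under assumptions: `try_op` and `compare_operators` are hypothetical names, and which operators raise depends on the pandas version, so the output is left unstated.

```python
import operator


def try_op(op, left, right):
    """Apply a binary operator and report 'ok' or 'TypeError'."""
    try:
        op(left, right)
    except TypeError:
        return "TypeError"
    return "ok"


def compare_operators():
    """Try +, == and <= between a Series of strings and a Timedelta."""
    import pandas as pd  # imported here so `try_op` is usable without pandas

    left = pd.Series(["1"])
    right = pd.Timedelta(1)
    ops = {"+": operator.add, "==": operator.eq, "<=": operator.le}
    return {symbol: try_op(op, left, right) for symbol, op in ops.items()}
```

`compare_operators()` returns a mapping from operator symbol to outcome, which makes it easy to see whether the three operators agree on a given build.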
Comment From: rhshadrach
Thanks for the report, confirmed on main. Further investigations and PRs to fix are welcome!
Comment From: mingjji
take
Comment From: jbrockmendel
None of the operations listed as "should work" in the OP should work. All of the ones listed as working do not work on main (correctly). Adding a string to a Timedelta should raise, and correctly does.
Comment From: rhshadrach
On main I'm seeing:
print(pd.DataFrame(['1']) + pd.DataFrame([pd.Timedelta(1)]))
# 0
# 0 0 days 00:00:00.000000002
Is that what the result should be? If not, I don't think this issue should be closed.
Comment From: jbrockmendel
On main that raises TypeError for me. I'm on Mac, py3.11.7, numpy 2.3.1
Comment From: rhshadrach
Ah, I hadn't pulled in the most recent commits, including the infer_string change. Agreed, this now raises a TypeError, but shouldn't
print('1' + pd.Timedelta(1))
# 0 days 00:00:00.000000002
raise as well? Equivalently, on object dtype.
Comment From: jbrockmendel
yes it should. good catch