Feature Type
- [ ] Adding new functionality to pandas
- [x] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
I wish pandas supported arithmetic operations between a DatetimeArray and a native Python list.
Feature Description
I wish the following code could run.
from datetime import datetime
import pandas as pd
arr = pd.to_datetime(["2020-01-01", "2020-01-02"]).array
assert isinstance(arr, pd.arrays.DatetimeArray)
arr - [datetime(2019, 12, 31), datetime(2020, 1, 1)] # TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'list'
Alternative Solutions
from datetime import datetime
import pandas as pd
arr = pd.to_datetime(["2020-01-01", "2020-01-02"]).array
assert isinstance(arr, pd.arrays.DatetimeArray)
arr - pd.to_datetime([datetime(2019, 12, 31), datetime(2020, 1, 1)]).array
Additional Context
No response
Comment From: cmp0xff
Following this comment, the following code snippet runs fine:
from datetime import datetime
import pandas as pd
arr = pd.to_datetime(["2020-01-01", "2020-01-02"]).array
assert isinstance(arr, pd.arrays.DatetimeArray)
pd.Series(arr).sub([datetime(2019, 12, 31), datetime(2020, 1, 1)])
Comment From: jbrockmendel
Not supporting this is an intentional decision. We do all wrapping-lists-in-arrays at a higher level before getting to the DTA method.
Comment From: cmp0xff
Hi @jbrockmendel, thank you for the quick reply. I am not an expert in pandas internals, but the following code works, which might be related:
import pandas as pd
pd.Series([1]).array - [1] # no error at run time
Comment From: Dr-Irv
Also, pandas is inconsistent because you can subtract a single datetime but not a list of datetime. You can also subtract a DatetimeIndex and a Series.
from datetime import datetime
import pandas as pd
arr = pd.to_datetime(["2020-01-01", "2020-01-02"]).array
assert isinstance(arr, pd.arrays.DatetimeArray)
arr - datetime(2019, 12, 31) # This works
Comment From: jbrockmendel
Scalars are supported intentionally.
Comment From: Dr-Irv
> Scalars are supported intentionally.
Sure, but I think it's inconsistent that we support the operation on a scalar, but not on a list of scalars, while we do support it for a DatetimeIndex, a Series with datetimes, or a NumPy array.
Why differentiate the list case?
Comment From: jbrockmendel
Because doing the wrapping-in-an-array and length checks in every single EA subclass would be a hassle, duplicative, a perf penalty, etc. We do that upfront at the beginning of the Index/Series/DataFrame method, and support only scalars and right-length ArrayLike at the EA level.
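For readers following along, here is a rough sketch of the layering described above. It is not pandas' actual implementation: `sub_listlike` is a hypothetical helper name, and the error message is an assumption. It only illustrates the idea that list wrapping and length checks happen once, before the array-level operation sees the operand.

```python
from datetime import datetime

import pandas as pd


def sub_listlike(arr, other):
    """Hypothetical helper (not a pandas API): subtract `other` from a DatetimeArray."""
    if isinstance(other, (list, tuple)):
        # Step 1: wrap the plain list in an array, using the same conversion
        # as the workaround shown earlier in this issue.
        other = pd.to_datetime(other).array
        # Step 2: check the length once, up front.
        if len(other) != len(arr):
            raise ValueError("Lengths must match")
    # Step 3: the array-level op now only sees a scalar or a right-length array.
    return arr - other


arr = pd.to_datetime(["2020-01-01", "2020-01-02"]).array
result = sub_listlike(arr, [datetime(2019, 12, 31), datetime(2020, 1, 1)])
```

In this sketch a scalar `datetime` passes straight through to `arr - other`, which matches the scalar case shown earlier in the thread.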
Comment From: Dr-Irv
We also have this kind of inconsistency:
from datetime import datetime
import pandas as pd
left = pd.Series([pd.Timestamp(2025, 8, 20)])
s = [datetime(2025, 8, 20)]
left - s # fails
left.sub(s) # works
So why should the operator fail, but the sub() method work?
Comment From: jbrockmendel
I'd expect both Series operations to work.