- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] (optional) I have confirmed this bug exists on the master branch of pandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
import pandas
from pandas.tseries.frequencies import to_offset
pandas.Timestamp('2000-01-10 10:00') - to_offset('BH')
Problem description
Subtracting one business hour from 10:00 skips the 09:00 business hour and returns the previous business day's closing time instead. The current output is: Timestamp('2000-01-07 17:00'). This only happens at 10:00; other times appear to work as expected. It also only occurs with subtraction, not addition.
Expected Output
Timestamp('2000-01-10 09:00')
Output of pd.show_versions()
Comment From: CloseChoice
take
Comment From: CloseChoice
As far as I can see, this results from the following line: https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/offsets.py#L931
else:
    while bhour_remain != timedelta(0):
        # business hour left in this business time interval
        bhour = self._next_opening_time(other) - other
        if (
            bhour_remain > bhour  # change > to >= to get the desired behaviour
            or bhour_remain == bhour
            and nanosecond != 0
        ):
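To make the effect of the proposed change concrete, here is a minimal standalone sketch of just the boolean condition. The function names are hypothetical, and the input values are an assumption about the reported case (both durations equal to minus one hour, nanosecond equal to 0), not something taken from a debugger session.

```python
from datetime import timedelta


def condition_as_quoted(bhour_remain, bhour, nanosecond):
    # Same boolean expression as the quoted snippet. `and` binds tighter
    # than `or`, so this reads: a > b or (a == b and nanosecond != 0).
    return bhour_remain > bhour or bhour_remain == bhour and nanosecond != 0


def condition_with_ge(bhour_remain, bhour, nanosecond):
    # Hypothetical variant with `>` changed to `>=`, as the inline comment suggests.
    return bhour_remain >= bhour or bhour_remain == bhour and nanosecond != 0


# Assumed values for the reported case: one business hour is subtracted and
# exactly one hour separates 10:00 from the 09:00 opening time.
bhour_remain = timedelta(hours=-1)
bhour = timedelta(hours=-1)

print(condition_as_quoted(bhour_remain, bhour, nanosecond=0))  # False
print(condition_with_ge(bhour_remain, bhour, nanosecond=0))    # True
```

With `>=`, the `bhour_remain == bhour and nanosecond != 0` clause no longer changes the result, so it could be dropped in that variant.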
Note the converse example:
pd.Timestamp('2000-01-07 16:00') + pd.tseries.frequencies.to_offset('BH')
Timestamp('2000-01-10 09:00:00') # result
This might not even be an undesired effect. Does anybody know whether this behaviour is intended?
Comment From: CloseChoice
Some more thoughts on this issue:
We could (as intended by @ahyerman) change the business hour offset so that, when an operation would cross a day boundary (pd.Timestamp('2000-01-07 16:00') + pd.tseries.frequencies.to_offset('BH')), the result stays on the day we are coming from (Timestamp('2000-01-07 17:00:00')). Both timestamps are equivalent in the sense that we can add or subtract further business hours and still arrive at the same result: pd.Timestamp('2000-01-07 17:00') + pd.tseries.frequencies.to_offset('BH') is equal to pd.Timestamp('2000-01-10 09:00') + pd.tseries.frequencies.to_offset('BH').
What we need to clarify first is:
- How do other offsets handle such cases?
- Should this be the standard behaviour or an additional option set by a parameter?
Comment From: ahyerman
@CloseChoice I'm not sure what the standard is for other offsets, but it feels a bit strange to treat 17:00 and 9:00 as equivalent. In my opinion, business hours should end before 17:00, and then the ambiguity goes away: 9:00 - 16:59, or more correctly [9:00, 17:00).
In my opinion, the ambiguity makes the behaviour inconsistent, which results in the confusion.
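The half-open convention [9:00, 17:00) can be illustrated with a small pure-Python sketch. This is not how pandas implements BusinessHour; the helper names `is_business_time` and `roll_forward` are hypothetical, and the Monday to Friday, 09:00 to 17:00 schedule is assumed.

```python
from datetime import datetime, time, timedelta

OPEN, CLOSE = time(9, 0), time(17, 0)  # assumed business hours


def is_business_time(dt):
    """True if dt lies in the half-open interval [09:00, 17:00) on Mon-Fri."""
    return dt.weekday() < 5 and OPEN <= dt.time() < CLOSE


def roll_forward(dt):
    """Return dt if it is a business time, otherwise the next opening time."""
    if is_business_time(dt):
        return dt
    candidate = datetime.combine(dt.date(), OPEN)
    if candidate < dt:
        # Today's opening has already passed, so try tomorrow's.
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:
        # Skip Saturday and Sunday.
        candidate += timedelta(days=1)
    return candidate


friday_close = datetime(2000, 1, 7, 17, 0)
print(is_business_time(friday_close))  # False
print(roll_forward(friday_close))      # 2000-01-10 09:00:00
```

Under this convention Friday 17:00 is not itself a business time and has exactly one canonical equivalent, Monday 09:00, so the boundary has a single representation.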