• [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [ ] (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pandas
from pandas.tseries.frequencies import to_offset
pandas.Timestamp('2000-01-10 10:00') - to_offset('BH')

Problem description

Subtracting one business hour from 10:00 skips the 09:00 opening time and rolls back to the close of the previous business day. The current output is Timestamp('2000-01-07 17:00'). This only happens at 10:00; other times appear to work as expected. It also only affects subtraction, not addition.

Expected Output

Timestamp('2000-01-10 09:00')

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.10.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
machine          : AMD64
processor        : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : None.None

pandas           : 1.0.3
numpy            : 1.18.2
pytz             : 2019.3
dateutil         : 2.8.1
pip              : 20.0.2
setuptools       : 46.1.3.post20200330
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : None
matplotlib       : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : None
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None
numba            : None

Comment From: CloseChoice

take

Comment From: CloseChoice

As far as I can see, this results from the following line: https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/offsets.py#L931

else:
    while bhour_remain != timedelta(0):
        # business hour left in this business time interval
        bhour = self._next_opening_time(other) - other
        if (
            bhour_remain > bhour  # change > to >= to get the desired behaviour
            or bhour_remain == bhour
            and nanosecond != 0
        ):

Note the converse example with addition:

pd.Timestamp('2000-01-07 16:00') + pd.tseries.frequencies.to_offset('BH')
Timestamp('2000-01-10 09:00:00')  # result

This might not even be a bug. Does anybody know whether this behaviour is intended?
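To make the boundary case concrete, here is a toy model of the comparison using only the standard library. It is not the pandas implementation: the helper names (`previous_close`, `subtract_business_time`) and the `inclusive` flag are hypothetical, and it assumes 09:00 to 17:00 business hours on weekdays with no holidays. It only shows why a strict comparison rolls 10:00 back to the previous close while an inclusive one stops at 09:00.

```python
from datetime import datetime, timedelta

OPEN_HOUR, CLOSE_HOUR = 9, 17  # assumed default business hours


def previous_close(day_open):
    """Return 17:00 on the last weekday before day_open."""
    day = day_open - timedelta(days=1)
    while day.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
        day -= timedelta(days=1)
    return day.replace(hour=CLOSE_HOUR)


def subtract_business_time(ts, delta, inclusive):
    """Toy model: ts is a weekday time in [09:00, 17:00], delta is at most 8 hours."""
    day_open = ts.replace(hour=OPEN_HOUR, minute=0, second=0, microsecond=0)
    available = ts - day_open  # business time between the opening and ts
    fits = delta <= available if inclusive else delta < available
    if fits:
        return ts - delta
    # Otherwise carry the remainder over to the previous business day's close.
    return previous_close(day_open) - (delta - available)


monday_10 = datetime(2000, 1, 10, 10, 0)
one_hour = timedelta(hours=1)
strict = subtract_business_time(monday_10, one_hour, inclusive=False)
relaxed = subtract_business_time(monday_10, one_hour, inclusive=True)
print(strict)   # 2000-01-07 17:00:00
print(relaxed)  # 2000-01-10 09:00:00
```

In this model only the exact boundary is affected: subtracting one hour from 11:00 gives 10:00 under either comparison, which matches the report that other times behave as expected.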

Comment From: CloseChoice

Some more thoughts on this issue:

We could (as @ahyerman suggests) change the business hour offset so that when an operation would cross a day boundary (pd.Timestamp('2000-01-07 16:00') + pd.tseries.frequencies.to_offset('BH')), the result stays on the day we are coming from (Timestamp('2000-01-07 17:00:00')). Both timestamps are equivalent in the sense that adding or subtracting further business hours gives the same result from either one: pd.Timestamp('2000-01-07 17:00') + pd.tseries.frequencies.to_offset('BH') is equal to pd.Timestamp('2000-01-10 09:00') + pd.tseries.frequencies.to_offset('BH').

What we need to clarify first:
- How do other offsets handle such cases?
- Should this be the default behaviour, or an additional option set by a parameter?

Comment From: ahyerman

@CloseChoice I'm not sure what the standard is for other offsets, but it feels a bit strange to treat 17:00 and 9:00 as equivalent. In my opinion the business hours should end before 17:00, and then the ambiguity goes away: 9:00 to 16:59, or more correctly [9:00, 17:00).

In my opinion the current behaviour is somewhat inconsistent, which is what causes the confusion.
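A minimal sketch of the half-open interval idea, again with the standard library only and not taken from pandas. The names `is_business_time` and `roll_forward` are hypothetical, and the model assumes weekdays from 09:00 to 17:00 with no holidays. Under [9:00, 17:00), 17:00 is simply not a business time, so Friday 17:00 normalises to Monday 09:00 and the two are no longer distinct valid results.

```python
from datetime import datetime, time, timedelta

OPEN, CLOSE = time(9), time(17)  # assumed business hours, treated as [OPEN, CLOSE)


def is_business_time(ts):
    """True if ts falls on a weekday inside the half-open interval [09:00, 17:00)."""
    return ts.weekday() < 5 and OPEN <= ts.time() < CLOSE


def roll_forward(ts):
    """Return ts unchanged if it is a business time, else the next opening time."""
    if is_business_time(ts):
        return ts
    candidate = datetime.combine(ts.date(), OPEN)
    if ts.time() >= CLOSE:
        candidate += timedelta(days=1)  # today's interval is already over
    while candidate.weekday() >= 5:  # skip Saturday and Sunday
        candidate += timedelta(days=1)
    return candidate


friday_close = datetime(2000, 1, 7, 17, 0)
print(is_business_time(friday_close))  # False
print(roll_forward(friday_close))      # 2000-01-10 09:00:00
```

With this convention every instant has exactly one representation, at the cost of never returning a timestamp equal to the closing time.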