>>> import pandas as pd
>>> dtype1 = "datetime64[ns, US/Eastern]"
>>> dtype2 = "timestamp[ns, US/Eastern][pyarrow]"
>>> ts = pd.Timestamp("2025-07-03 18:10")
>>> pd.Series([ts], dtype=dtype1)[0]
Timestamp('2025-07-03 18:10:00-0400', tz='US/Eastern')
>>> pd.Series([ts], dtype=dtype2)[0]
Timestamp('2025-07-03 14:10:00-0400', tz='US/Eastern')
Long ago we decided that when passing tz-naive datetimes and specifying a tz-aware dtype, we treat the input as a wall time. It looks like the pyarrow path (which I'm pretty sure just ends up calling pa.array([ts], type=...)) treats it as a UTC time instead.
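To make the two interpretations concrete, here is a minimal sketch using only pandas scalar methods. It does not go through either Series constructor path; it only reproduces the two readings of the same naive timestamp, and the variable names are illustrative.

```python
import pandas as pd

ts = pd.Timestamp("2025-07-03 18:10")  # tz-naive input

# Wall-time reading: the naive value is the local clock time in the target zone.
as_wall = ts.tz_localize("US/Eastern")

# UTC reading: the naive value is a UTC instant, then displayed in the target zone.
as_utc = ts.tz_localize("UTC").tz_convert("US/Eastern")

print(as_wall)  # 2025-07-03 18:10:00-04:00
print(as_utc)   # 2025-07-03 14:10:00-04:00
```

The two results differ by the UTC offset of the zone at that date (four hours here), which matches the discrepancy between the two outputs above.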
cc @jorisvandenbossche
Comment From: jorisvandenbossche
I see it is not documented very well (the array constructor docstring does mention something about timezones, but only for the case of inferring the type, not when a type is specified), but AFAIK the behaviour of pyarrow is indeed to assume naive data to be UTC (so it chooses to interpret it as the underlying storage, not as wall-clock time).
I assume that for converting object to a timestamp type, we might need to use our own to_datetime first (which is what you were trying to do, I think?)
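A rough sketch of that idea, from the user side rather than as the actual pandas-internal fix: parse with pd.to_datetime and localize as wall time before anything is handed to pyarrow. The helper name localize_wall_time is hypothetical, and the final cast to the pyarrow dtype is only mentioned in a comment as an assumption, not executed here.

```python
import pandas as pd


def localize_wall_time(values, tz):
    """Hypothetical helper: parse with pandas, then attach tz as wall time."""
    parsed = pd.to_datetime(pd.Series(values, dtype=object))
    if parsed.dt.tz is None:
        # tz-naive input: treat the values as local clock time in tz.
        parsed = parsed.dt.tz_localize(tz)
    else:
        # Already tz-aware: keep the instant, only change the display zone.
        parsed = parsed.dt.tz_convert(tz)
    return parsed


ser = localize_wall_time([pd.Timestamp("2025-07-03 18:10")], "US/Eastern")
print(ser[0])

# Assumption: because the values are now tz-aware, a later cast such as
# ser.astype("timestamp[ns, US/Eastern][pyarrow]") should keep the same instant.
```

Localizing first leaves no naive values for pyarrow to reinterpret as UTC.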