Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np
import datetime
pd.DataFrame([np.nan], dtype='datetime64[ns]').replace(np.nan, datetime.datetime(3000,1,1))
Issue Description
The code above raises the following error, likely because the datetime is too large to fit in datetime64[ns]:
pd.DataFrame([np.nan], dtype='datetime64[ns]').replace(np.nan, datetime.datetime(3000,1,1))
File "C:\Users\mdarnall\mdarnall-local-dev\tma-venv-prod.venv\lib\site-packages\pandas\core\generic.py", line 8141, in replace
    new_data = self._mgr.replace(
File "C:\Users\mdarnall\mdarnall-local-dev\tma-venv-prod.venv\lib\site-packages\pandas\core\internals\base.py", line 249, in replace
    return self.apply_with_block(
File "C:\Users\mdarnall\mdarnall-local-dev\tma-venv-prod.venv\lib\site-packages\pandas\core\internals\managers.py", line 363, in apply
    applied = getattr(b, f)(**kwargs)
File "C:\Users\mdarnall\mdarnall-local-dev\tma-venv-prod.venv\lib\site-packages\pandas\core\internals\blocks.py", line 924, in replace
    blk = self.coerce_to_target_dtype(value)
File "C:\Users\mdarnall\mdarnall-local-dev\tma-venv-prod.venv\lib\site-packages\pandas\core\internals\blocks.py", line 490, in coerce_to_target_dtype
    raise AssertionError(
AssertionError: Something has gone wrong, please report a bug at https://github.com/pandas-dev/pandas/issues
Expected Behavior
Either raise a more informative error message, or change the column's dtype so the replacement value fits.
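For comparison, pandas already raises a dedicated error for out-of-bounds nanosecond timestamps in other code paths; something along these lines would be a less surprising outcome (a sketch assuming pandas 2.x, where `Timestamp.as_unit` exists):

```python
import datetime

import pandas as pd
from pandas.errors import OutOfBoundsDatetime

# datetime64[ns] can only represent dates up to 2262-04-11.
print(pd.Timestamp.max)  # 2262-04-11 23:47:16.854775807

# In pandas 2.x a Timestamp built from a datetime is stored with
# microsecond resolution, and an explicit cast to nanoseconds
# raises cleanly instead of overflowing:
ts = pd.Timestamp(datetime.datetime(3000, 1, 1))
try:
    ts.as_unit("ns")
    raised = False
except OutOfBoundsDatetime:
    raised = True
print(raised)  # True
```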
Installed Versions
Comment From: iabhi4
take
Comment From: iabhi4
I looked into this and confirmed that `.to_datetime64()` on `Timestamp(datetime(3000, 1, 1))` returns a `datetime64[us]`, and casting that to `datetime64[ns]` silently overflows, resulting in a corrupted timestamp like `1830-11-23T00:50:52`. Since no error is raised, `infer_dtype_from_scalar` infers an invalid dtype, and we later hit an `AssertionError` when `new_dtype == self.dtype`.

Would adding a bounds check after `.to_datetime64()` in `infer_dtype_from_scalar` make sense here to catch this early?
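The silent overflow described above can be reproduced with NumPy alone (a minimal sketch; the exact wrapped timestamp comes from int64 wraparound):

```python
import numpy as np

# Year 3000 fits in datetime64[us] but not in datetime64[ns]
# (datetime64[ns] tops out at 2262-04-11).
us = np.datetime64("3000-01-01", "us")

# Casting to nanoseconds multiplies the underlying int64 by 1000,
# which wraps around silently instead of raising.
ns = us.astype("datetime64[ns]")

print(us)                      # 3000-01-01T00:00:00
print(ns)                      # a wrapped, corrupted timestamp in the 1800s
print(ns.astype("int64") < 0)  # True: the int64 value overflowed negative
```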
Comment From: jbrockmendel
Looks like this needs special handling in dtypes.cast.find_result_type. At the end of the function we call find_common_type with dt64[ns] and dt64[us] which gives dt64[ns]. For arithmetic-like use cases that is the right thing to do for find_common_type, but not here.
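For reference, the common-type resolution in question mirrors NumPy's promotion rules, which pick the finer resolution of the two units (a sketch of the behavior, not the pandas internals themselves):

```python
import numpy as np

# Promoting datetime64[ns] with datetime64[us] yields the finer unit,
# datetime64[ns] -- correct for arithmetic, but here it produces a
# dtype the replacement value cannot actually fit into.
common = np.promote_types(np.dtype("datetime64[ns]"), np.dtype("datetime64[us]"))
print(common)  # datetime64[ns]
```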
Comment From: iabhi4
That makes sense. I'll move the bounds check into `find_result_type`, right after `find_common_type`, and special-case the `datetime64[ns]`/`[us]` scenario. Thanks for the clarification!