Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
from pandas.api.types import infer_dtype
assert infer_dtype(pd.Series([1., 2., .3, pd.NA], dtype=object)) == infer_dtype(pd.Series([1., 2., .3, np.nan], dtype=object))
Issue Description
Dear pandas folks,
This was checked with pandas 2.3.0 and 2.2.x.
When `infer_dtype` is called on an object array of floats with an embedded `pd.NA`, the result is `mixed-integer-float`, even though `skipna` defaults to `True`. The same test with an embedded `np.nan` returns `floating`.
>>> from pandas.api.types import infer_dtype
>>> infer_dtype(pd.Series([1,2,3,pd.NA], dtype=object))
'integer'
>>> infer_dtype(pd.Series([1,2,3,np.nan], dtype=object))
'integer'
>>> infer_dtype(pd.Series([1.,2.,.3,pd.NA], dtype=object))
'mixed-integer-float'   <<< should be `floating`
>>> infer_dtype(pd.Series([1.,2.,.3,np.nan], dtype=object))
'floating'
>>> infer_dtype(pd.Series(['1.0', np.nan],dtype=object))
'string'
>>> infer_dtype(pd.Series(['1.0', pd.NA],dtype=object))
'string'
For other types, such as integers or strings, the function's output does not depend on the NA type.
For context: I maintain a small project that ensures integer columns stay integers, a commonly known issue. If you know of a well-established extension for this purpose, feel free to point me towards it.
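Until this is fixed, one possible workaround is to drop the missing values before calling `infer_dtype`, so that the kind of NA marker never reaches the inference. This is only a minimal sketch: `infer_dtype_ignoring_na` is a hypothetical helper name of mine, not part of pandas, and I have not checked it against every input type.

```python
import numpy as np
import pandas as pd
from pandas.api.types import infer_dtype


def infer_dtype_ignoring_na(values: pd.Series) -> str:
    """Hypothetical helper (not part of pandas): infer the dtype after dropping NA values."""
    # dropna() removes both np.nan and pd.NA from an object Series,
    # so the kind of missing-value marker no longer influences the result.
    return infer_dtype(values.dropna())


with_pd_na = pd.Series([1.0, 2.0, 0.3, pd.NA], dtype=object)
with_np_nan = pd.Series([1.0, 2.0, 0.3, np.nan], dtype=object)

# Both calls now see the same float values and should agree.
print(infer_dtype_ignoring_na(with_pd_na))
print(infer_dtype_ignoring_na(with_np_nan))
```

This sidesteps the `pd.NA` handling rather than fixing it, and a Series that contains only missing values becomes empty before inference, so it may be reported differently than today.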
Expected Behavior
>>> infer_dtype(pd.Series([1.,2.,.3,pd.NA], dtype=object))
should return `floating`.
Installed Versions
Comment From: arthurlw
Confirmed on main! Investigations and PRs are welcome.
Thanks for raising this!
Comment From: heoh
I want to contribute to this. Thank you for explaining the issue.
Comment From: heoh
take
Comment From: MarkusZimmerDLR
Since this seems to be a very simple and minor fix, is it possible to not wait for the 3.0 release? Or is the release imminent?