Feature Type
- [x] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
Sometimes my data contains object columns that are entirely null. When I use such a column as a groupby key, it is converted to float64 in the result, for no acceptable reason. It makes no sense to change the dtypes of the groupby keys, especially when I didn't ask for it. Edit: this happens with object dtype, not string.
Feature Description
A new parameter for the groupby function, keep_key_types (or whatever name fits), that forces groupby to keep the exact dtypes of the key columns.
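Until such a parameter exists, the intended behavior can be approximated by recording the key dtypes before grouping and casting the key columns back afterwards. The following is only a rough sketch: `groupby_size_keep_key_types` is a hypothetical helper name, not a pandas API, and it only covers the `size()` aggregation.

```python
import pandas as pd


def groupby_size_keep_key_types(df, keys, **groupby_kwargs):
    """Hypothetical helper (not part of pandas): group by `keys`, count
    rows per group, then cast the key columns back to their original dtypes."""
    # Remember the dtype of every key column before grouping.
    original_dtypes = df[keys].dtypes.to_dict()
    out = df.groupby(keys, **groupby_kwargs).size().reset_index(name="size")
    # Cast only the key columns; the "size" column is left untouched.
    return out.astype(original_dtypes)


df1 = pd.DataFrame({"str_mixed": ["x", None], "str_all_null": [None, None]})
out = groupby_size_keep_key_types(df1, ["str_mixed", "str_all_null"], dropna=False)
print(out.dtypes)
```

This is a workaround applied after the fact, not a fix: the key columns are still converted inside groupby and are simply cast back on the way out.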
Alternative Solutions
N/A
Additional Context
No response
Comment From: samukweku
Kindly share an example
Comment From: simonaubertbd
import pandas as pd

df1 = pd.DataFrame({"str_mixed": ["x", None], "str_all_null": [None, None]})
print(df1.dtypes)
# str_mixed object
# str_all_null object 👈 still object here
out = df1.groupby(["str_mixed", "str_all_null"], dropna=False).size().reset_index()
print(out.dtypes)
# str_mixed object
# str_all_null float64 👈 changed because it was all-null
Comment From: simonaubertbd
For more context: combined with https://github.com/apache/arrow/issues/47460, my string column in Parquet becomes float64 in pandas after a groupby. These are two issues in two different packages, but combined, it's like the plague and cholera for type consistency.