Feature Type

  • [x] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

Sometimes my data contains object columns that are entirely null. When I use such a column as a groupby key, its dtype is changed to float64 for no apparent reason. It makes no sense to change the dtypes of the groupby keys, especially when I didn't ask for it. Edit: this happens with object columns, not string columns.

Feature Description

A new parameter for the groupby function, keep_key_types (or whatever name fits), that forces groupby to keep the exact dtypes of the key columns.
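Until something like this exists, the requested behavior can be approximated outside of pandas. The following is only a minimal sketch: `groupby_size_keep_key_types` is a hypothetical helper name, not a pandas API, and it assumes that casting the key columns back to their input dtypes with `astype` is acceptable for the data at hand.

```python
import pandas as pd


def groupby_size_keep_key_types(df, keys, **groupby_kwargs):
    """Hypothetical helper (not part of pandas): group by `keys`, count rows,
    then cast each key column back to the dtype it had in `df`."""
    out = df.groupby(keys, **groupby_kwargs).size().reset_index(name="size")
    # Restore the original dtype of every key column.
    return out.astype({key: df[key].dtype for key in keys})


df1 = pd.DataFrame({"str_mixed": ["x", None], "str_all_null": [None, None]})
out = groupby_size_keep_key_types(df1, ["str_mixed", "str_all_null"], dropna=False)
print(out.dtypes)
```

The cast only restores the dtype. Missing values in the key columns may come back as NaN rather than None, so this is a workaround and not a substitute for the proposed parameter.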

Alternative Solutions

N/A

Additional Context

No response

Comment From: samukweku

Kindly share an example

Comment From: simonaubertbd

import pandas as pd

df1 = pd.DataFrame({"str_mixed": ["x", None], "str_all_null": [None, None]})
print(df1.dtypes)
# str_mixed       object
# str_all_null    object  👈 still object here

out = df1.groupby(["str_mixed", "str_all_null"], dropna=False).size().reset_index()
print(out.dtypes)
# str_mixed        object
# str_all_null    float64 👈 changed because it was all-null

Comment From: simonaubertbd

For more context: combined with https://github.com/apache/arrow/issues/47460, my string column in Parquet becomes float64 in pandas after a groupby. These are two separate issues in two different packages, but combined they are like the plague and cholera for type consistency.