Suppose the aggregate function returns an int or a float. If it returns only 0s and 1s, the result is converted to a BooleanArray. Otherwise, the result is an int or float array (as expected).

This happens because the code preserves the original dtype when the series values are not an instance of np.ndarray, and BooleanArray is not: https://github.com/pandas-dev/pandas/blob/b552dc95c9fa50e9ca2a0c9f9cdb8757f794fedb/pandas/core/groupby/ops.py#L917 So the code tries to preserve the dtype whenever it can.
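A minimal sketch of the distinction the linked check relies on, assuming only public pandas and NumPy APIs. It does not reproduce the groupby internals; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Nullable "boolean" dtype is backed by a pandas extension array.
nullable = pd.Series([True, False, None], dtype="boolean")
# Plain "bool" dtype is backed by a NumPy array.
plain = pd.Series([True, False, False], dtype="bool")

print(type(nullable.values).__name__)
print(isinstance(nullable.values, np.ndarray))
print(isinstance(plain.values, np.ndarray))
```

Only the "boolean" column fails the np.ndarray check, so only it takes the dtype-preserving path described above.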

Code to reproduce:

import pandas as pd

df = pd.DataFrame({0: [1, 2, 2], 1: [True, False, None]})
df[1] = df[1].astype("boolean")
print(df.groupby(by=0).aggregate(lambda s: s.fillna(False).mean()).dtypes.values[0])

prints boolean.

If we change the values in the array

df = pd.DataFrame({0: [1, 2, 2], 1: [True, True, None]})
df[1] = df[1].astype("boolean")
print(df.groupby(by=0).aggregate(lambda s: s.fillna(False).mean()).dtypes.values[0])

then it prints float64.

If the dtype is "bool" (not "boolean"), then groupby always returns the expected float result.

df = pd.DataFrame({0: [1, 2, 2], 1: [True, False, None]})
df[1] = df[1].astype("bool")
print(df.groupby(by=0).aggregate(lambda s: s.fillna(False).mean()).dtypes.values[0])

prints float64.
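The difference between the first two examples is consistent with a lossless-cast check on the per-group means. The sketch below is not the pandas implementation: `fits_boolean` is a hypothetical helper, and the means are worked out by hand from the data in this report.

```python
import numpy as np

def fits_boolean(values):
    """Hypothetical helper: True if every value is exactly 0 or 1,
    so the values could be cast to a boolean dtype without loss."""
    arr = np.asarray(values, dtype="float64")
    return bool(np.isin(arr, [0.0, 1.0]).all())

# Per-group means after fillna(False), computed by hand.
# First example:  group 1 -> [True] -> 1.0, group 2 -> [False, False] -> 0.0
first_example = [1.0, 0.0]
# Second example: group 1 -> [True] -> 1.0, group 2 -> [True, False] -> 0.5
second_example = [1.0, 0.5]

print(fits_boolean(first_example))   # True
print(fits_boolean(second_example))  # False
```

In the first example every mean happens to be 0 or 1, so a cast back to boolean loses nothing; in the second, the 0.5 rules it out and the float result survives.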

Comment From: emmacherrin

Hi, I'm a student in a University of Michigan Software Engineering course tasked with fixing a bug in the next couple of weeks. My partner, @longovin, and I would like to fix this issue!

Comment From: Aloqeely

Good luck. For future reference, you can claim an issue by commenting exactly "take" on the issue.

Comment From: echerrin

take

Comment From: longovin

take