Feature Type

  • [ ] Adding new functionality to pandas

  • [x] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

In pd.DataFrame.describe(), the most frequent value is labelled 'top'.

However, statistics already has a term for this, 'mode' (https://en.wikipedia.org/wiki/Mode_(statistics)). To reduce ambiguity, I propose renaming 'top' to 'mode', both in the docs and in the function's output.
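As a small illustration of the point above, the sketch below compares the 'top' entry from describe() with the result of Series.mode() on made-up data. The series `colors` is purely hypothetical. Note that when several values tie for most frequent, mode() returns all of them while describe() reports only one, so the two agree cleanly only when the mode is unique, as it is here.

```python
import pandas as pd

# Hypothetical sample data with a single most frequent value ("red").
colors = pd.Series(["red", "blue", "red", "green"])

# describe() on non-numeric data reports count, unique, top and freq.
summary = colors.describe()
top_value = summary["top"]
top_freq = summary["freq"]

# Series.mode() returns a Series of the most frequent value(s).
mode_values = colors.mode()

print(summary)
print("top:", top_value, "| mode:", list(mode_values))
```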

Feature Description

I guess it would start here (replacing top with mode):

def describe_categorical_1d(
    data: Series,
    percentiles_ignored: Sequence[float],
) -> Series:
    """Describe series containing categorical data.

    Parameters
    ----------
    data : Series
        Series to be described.
    percentiles_ignored : list-like of numbers
        Ignored, but in place to unify interface.
    """
    names = ["count", "unique", "mode", "freq"]
    objcounts = data.value_counts()
    count_unique = len(objcounts[objcounts != 0])
    if count_unique > 0:
        mode, freq = objcounts.index[0], objcounts.iloc[0]
        dtype = None
    else:
        # If there are no values to count, set 'mode' and 'freq' to NaN
        # to maintain output shape consistency
        mode, freq = np.nan, np.nan
        dtype = "object"

    result = [data.count(), count_unique, mode, freq]

    from pandas import Series

    return Series(result, index=names, name=data.name, dtype=dtype)

Alternative Solutions

Leave as it is.

Additional Context

No response

Comment From: mroeschke

Thanks for the suggestion, but this has been long-standing behavior, and changing it would be a large breaking change for users expecting "top". I would suggest renaming this label if you prefer "mode". Closing.
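One way to follow the suggestion above without changing pandas is to relabel the row in the output of describe(). This is a minimal sketch on made-up data. The frame `df` and its `fruit` column are hypothetical, and it assumes the frame holds only non-numeric columns, so describe() returns the count/unique/top/freq rows by default.

```python
import pandas as pd

# Hypothetical data: one non-numeric column, "apple" is the most frequent value.
df = pd.DataFrame({"fruit": ["apple", "pear", "apple", "fig"]})

# With no numeric columns, describe() summarises the object/categorical columns.
summary = df.describe()

# Relabel the "top" row as "mode" in the result only; pandas itself is unchanged.
renamed = summary.rename(index={"top": "mode"})

print(renamed)
```

The same rename(index=...) call works on the Series returned by Series.describe().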