Pandas ENH: unset_index method

Feature Type

[x] Adding new functionality to pandas
[ ] Changing existing functionality in pandas
[ ] Removing existing functionality in pandas

Problem Description

In many contexts I've had the need to "unset" a data frame index prior to passing the data frame to an index-blind method. My main requirements for such an index unset are:

- the output should have the most trivial index possible, i.e. an unnamed range index
- if the input has only an unnamed range index, the "unset" operation should be a no-op
- if the input has a named index, it should be kept as a new column (or columns for a multi-index)
- if the input index is not a simple RangeIndex and is unnamed, raise a value error to remind me to decide whether that index is supposed to be meaningful (typically I'd name it if so, or do .reset_index(drop=True) if not).

Using the .reset_index method to cover all of these requirements in a general sense requires a bit of nuance:

- calling df.reset_index() introduces a new column named simply "index" when users forget to pass drop=True, even when the index is a trivial unnamed range index. In a stricter/safer world, you might want an error instead when you're about to construct a data frame column from an unnamed index.
- the requirement to specify drop at all is sometimes onerous; it would typically be preferable to have a method that's a no-op when the existing index is already an unnamed range index.
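Both pitfalls can be reproduced with a small frame. This is only an illustrative sketch: the frame contents and the names `plain` and `keyed` are made up for the example.

```python
import pandas as pd

# A frame with the default, unnamed RangeIndex.
plain = pd.DataFrame({"a": [10, 20, 30]})

# reset_index() defaults to drop=False, so the trivial index
# becomes a column named "index".
print(plain.reset_index().columns.tolist())  # ['index', 'a']

# A frame whose index is named and presumably meaningful.
keyed = pd.DataFrame(
    {"a": [10, 20, 30]},
    index=pd.Index(["x", "y", "z"], name="key"),
)

# drop=True avoids the extra column but silently discards the "key" values.
print(keyed.reset_index(drop=True).columns.tolist())  # ['a']

# drop=False keeps them, which is the right choice only for this second frame.
print(keyed.reset_index(drop=False).columns.tolist())  # ['key', 'a']
```

Neither value of drop is right for both frames, so the caller has to inspect the index before choosing.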

Feature Description

Define a new unset_index method on DataFrame something like

def unset_index(self) -> Self:
    # is_unnamed_range_index and index_has_any_unnamed_col are placeholder helpers.
    if is_unnamed_range_index(self.index):
        return self
    if index_has_any_unnamed_col(self.index):
        raise ValueError(
            "At least one column of the index is unnamed while the index itself is not a RangeIndex. "
            "Set the names of the index columns before calling unset_index, or just call reset_index(drop=True) directly."
        )
    return self.reset_index(drop=False, allow_duplicates=False)
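For experimenting outside of pandas, the same logic can be written as a free function. The following is a rough sketch, not the pandahandler implementation or a pandas API. It assumes "unnamed range index" means a RangeIndex with no name that starts at 0 with step 1, it replaces the second helper with an inline check on `index.names`, and it needs pandas 1.5 or later for `allow_duplicates`.

```python
import pandas as pd


def is_unnamed_range_index(index: pd.Index) -> bool:
    """Assumed definition: a RangeIndex counting 0, 1, 2, ... with no name."""
    return (
        isinstance(index, pd.RangeIndex)
        and index.name is None
        and index.start == 0
        and index.step == 1
    )


def unset_index(df: pd.DataFrame) -> pd.DataFrame:
    """Free-function sketch of the proposed method."""
    if is_unnamed_range_index(df.index):
        return df  # already trivial: no-op
    if any(name is None for name in df.index.names):
        raise ValueError(
            "Index is non-trivial but has an unnamed level; "
            "name it or call reset_index(drop=True)."
        )
    return df.reset_index(drop=False, allow_duplicates=False)


plain = pd.DataFrame({"a": [1, 2]})
assert unset_index(plain) is plain  # the same object comes back

keyed = plain.set_index(pd.Index(["x", "y"], name="key"))
print(unset_index(keyed).columns.tolist())  # ['key', 'a']

try:
    unset_index(pd.DataFrame({"a": [1, 2]}, index=["x", "y"]))
except ValueError as err:
    print("refused:", err)
```

Whether a RangeIndex with a non-zero start should count as trivial is a design choice. This sketch treats it as non-trivial, so an unnamed one raises.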

Alternative Solutions

Until something like this is available in pandas, I'm using the unset method in pandahandler.

Additional Context

No response

Comment From: jbrockmendel

IIUC reset_index works for your use case but passing keep=False is inconvenient?

Comment From: zkurtz

[Revised OP to use drop not keep; there isn't actually an arg called keep]

The inconvenience is not in passing drop=False so much as deciding when to pass drop=False.

The proposed unset_index method uses drop=False or a ValueError to avoid lossy behavior in case the existing index is a potentially informative index, but also avoids creating a new meaningless dataframe column if the index is already an unnamed range index (providing idempotence).
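The idempotence point can be checked directly. This is a small sketch in which `reset_unless_trivial` is a hypothetical stand-in that keeps only the no-op guard and omits the ValueError branch of the proposal.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# Plain reset_index() is not idempotent: each call adds another column.
twice = df.reset_index().reset_index()
print(twice.columns.tolist())  # ['level_0', 'index', 'a']


def reset_unless_trivial(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical guard: skip the reset when the index is already trivial."""
    trivial = isinstance(frame.index, pd.RangeIndex) and frame.index.name is None
    return frame if trivial else frame.reset_index()


keyed = df.set_index(pd.Index(["x", "y"], name="key"))
once = reset_unless_trivial(keyed)
again = reset_unless_trivial(once)
print(once.columns.tolist(), again is once)  # ['key', 'a'] True
```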