Feature Type
- [x] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
In many contexts I've had the need to "unset" a data frame index prior to passing the data frame to an index-blind method. My main requirements for such an index unset are:
- the output should have the most trivial index possible, i.e. an unnamed range index
- if the input has only an unnamed range index, the "unset" operation should be a no-op
- if the input has a named index, it should be kept as a new column (or columns for a multi-index)
- if the input index is unnamed and is not a simple RangeIndex, raise a ValueError to remind me to decide whether that index is supposed to be meaningful (typically I'd name it if so, or call .reset_index(drop=True) if not).
Using the .reset_index method to cover all of these requirements in a general sense requires a bit of nuance:
- calling df.reset_index() introduces a new column named simply "index" when users forget to pass drop=True, even when the index is a trivial unnamed range index. In a stricter/safer world, you might want an error instead when you're about to construct a data frame column from an unnamed index.
- the requirement to specify drop at all is sometimes onerous; it would typically be preferable to have a method that's a no-op when the existing index is already an unnamed range index.
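The two points above can be illustrated with a small sketch. The frame and the "key" index name are made up for illustration; only existing reset_index and set_index behavior is used.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30]})  # default index: unnamed RangeIndex

# The default drop=False turns the trivial index into a column named "index".
kept = df.reset_index()
print(list(kept.columns))  # ['index', 'a']

# drop=True discards the index instead, so no column is added here...
dropped = df.reset_index(drop=True)
print(list(dropped.columns))  # ['a']

# ...but drop=True also silently discards a named, possibly meaningful index.
named = df.set_index(pd.Index(["x", "y", "z"], name="key"))
lossy = named.reset_index(drop=True)
print(list(lossy.columns))  # ['a'] (the "key" values are gone)
```

Neither value of drop is right for every input, which is why the caller has to inspect the index before choosing.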
Feature Description
Define a new unset_index method on DataFrame, something like:
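    def unset_index(self) -> Self:
        # No-op when the index is already an unnamed RangeIndex.
        if is_unnamed_range_index(self.index):
            return self
        # Refuse to guess whether an unnamed, non-trivial index is meaningful.
        if index_has_any_unnamed_col(self.index):
            raise ValueError(
                "At least one column of the index is unnamed while the index itself is not a RangeIndex. "
                "Set the names of the index columns before calling unset, or just call reset_index(drop=True) directly."
            )
        # Every index level is named: keep each one as a regular column.
        return self.reset_index(drop=False, allow_duplicates=False)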
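For experimenting outside pandas, the same logic can be sketched as a free function. This is only an illustration, not a proposed implementation. The helper is_unnamed_range_index is hypothetical, and reading "trivial" as "starts at 0 with step 1" is an assumption; the unnamed-level check is inlined via index.names.

```python
import pandas as pd


def is_unnamed_range_index(index: pd.Index) -> bool:
    """Hypothetical helper: True for an unnamed RangeIndex with start 0 and step 1.

    Requiring start 0 and step 1 is an assumption about what "trivial" means.
    """
    return (
        isinstance(index, pd.RangeIndex)
        and index.name is None
        and index.start == 0
        and index.step == 1
    )


def unset_index(df: pd.DataFrame) -> pd.DataFrame:
    """Free-function sketch of the proposed DataFrame.unset_index."""
    if is_unnamed_range_index(df.index):
        return df  # already trivial: no-op
    # index.names has one entry per level, for plain and multi-level indexes alike.
    if any(name is None for name in df.index.names):
        raise ValueError(
            "At least one level of the index is unnamed while the index is not "
            "a trivial RangeIndex. Name the index levels first, or call "
            "reset_index(drop=True) directly."
        )
    return df.reset_index(drop=False, allow_duplicates=False)


plain = pd.DataFrame({"a": [1, 2]})
named = plain.set_index(pd.Index(["x", "y"], name="key"))

out = unset_index(named)
print(list(out.columns))       # ['key', 'a']
print(unset_index(out) is out)  # True: a second call is a no-op
```

Under these assumptions, a frame whose index is an unnamed non-range index (for example, the result of a boolean filter) raises ValueError rather than silently gaining an "index" column.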
Alternative Solutions
Until something like this is available in pandas, I'm using the unset method in pandahandler.
Additional Context
No response
Comment From: jbrockmendel
IIUC reset_index works for your use case but passing keep=False is inconvenient?
Comment From: zkurtz
[Revised OP to use drop, not keep; there isn't actually an arg called keep]
The inconvenience is not in passing drop=False so much as deciding when to pass drop=False.
The proposed unset_index method uses drop=False or a ValueError to avoid lossy behavior in case the existing index is a potentially informative index, but also avoids creating a new meaningless dataframe column if the index is already an unnamed range index (providing idempotence).