Feature Type
- [ ] Adding new functionality to pandas
- [x] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
I wish I could pass method and tolerance kwargs to pandas.Index.slice_indexer, as I can to pandas.Index.get_indexer.
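To make the gap concrete, here is a minimal sketch of current behaviour, assuming a recent pandas release and a small made-up integer index. The index values and variable names are illustrative only.

```python
import pandas as pd

idx = pd.Index([10, 20, 30, 40])

# get_indexer: inexact labels can be resolved via method/tolerance.
exact_only = idx.get_indexer([25])                              # no exact match
padded = idx.get_indexer([25], method="pad")                    # previous label (20)
nearest = idx.get_indexer([19], method="nearest", tolerance=2)  # 20 is within 2 of 19
print(exact_only, padded, nearest)  # [-1] [1] [1]

# slice_indexer: label bounds only, with the end label included.
plain_slice = idx.slice_indexer(start=20, end=30)
print(plain_slice)  # slice(1, 3, None)

# The proposed keywords are not part of the current signature,
# so passing them raises a TypeError.
try:
    idx.slice_indexer(start=19, end=31, method="nearest", tolerance=2)
    kwargs_error = None
except TypeError as err:
    kwargs_error = err
print("TypeError:", kwargs_error)
```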
Feature Description
Add new parameters to pandas.Index.slice_indexer, so that it looks like
class Index:
    def slice_indexer(self, start=None, end=None, step=None, method=None, tolerance=None):
        """
        Compute the slice indexer for input labels and step.

        Index needs to be ordered and unique.

        Parameters
        ----------
        start : label, default None
            If None, defaults to the beginning.
        end : label, default None
            If None, defaults to the end.
        step : int, default None
        method : {None, 'pad'/'ffill', 'backfill'/'bfill', 'nearest'}, optional
            - default: exact matches only.
            - pad / ffill: find the PREVIOUS index value if no exact match.
            - backfill / bfill: use NEXT index value if no exact match.
            - nearest: use the NEAREST index value if no exact match. Tied
              distances are broken by preferring the larger index value.
        tolerance : optional
            Maximum distance between original and new labels for inexact
            matches. The values of the index at the matching locations must
            satisfy the equation abs(index[indexer] - target) <= tolerance.

            Tolerance may be a scalar value, which applies the same tolerance
            to all values, or list-like, which applies variable tolerance per
            element. List-like includes list, tuple, array, Series, and must be
            the same size as the index and its dtype must exactly match the
            index's type.

        Returns
        -------
        slice
        """
Alternative Solutions
Currently I've effectively written a custom version of .slice_indexer that calls .get_indexer on the start and end bounds, but I'm worried that doing this downstream will miss edge cases compared to solving it upstream in pandas.
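A rough sketch of what such a downstream workaround can look like is below. The helper name fuzzy_slice_indexer is hypothetical and this is not the actual xarray code; it assumes a unique, monotonically increasing index and a positive step, and it does not attempt to cover the edge cases mentioned above.

```python
import pandas as pd


def fuzzy_slice_indexer(index, start=None, end=None, step=None,
                        method=None, tolerance=None):
    """Hypothetical helper: resolve each bound with get_indexer, then build a slice.

    Assumes `index` is unique and monotonically increasing.
    """

    def locate(label):
        # get_indexer returns -1 when no label matches within the tolerance.
        pos = index.get_indexer([label], method=method, tolerance=tolerance)[0]
        if pos == -1:
            raise KeyError(label)
        return int(pos)

    start_pos = None if start is None else locate(start)
    # Label-based slices include the end label, so stop is one past its position.
    stop_pos = None if end is None else locate(end) + 1
    return slice(start_pos, stop_pos, step)


idx = pd.Index([10, 20, 30, 40])
sl = fuzzy_slice_indexer(idx, start=19, end=31, method="nearest", tolerance=2)
print(sl)             # slice(1, 3, None)
print(list(idx[sl]))  # [20, 30]
```

Both bounds here share one method and tolerance. Whether the start and end should instead be matched in different directions is one of the design questions such a helper leaves open.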
Additional Context
This would be nice for use within the internals of xarray's .sel() methods. See https://github.com/pydata/xarray/issues/10710.
Comment From: rhshadrach
Thanks for the request. I'm confused as to what the desired behavior of the additional arguments is. Are you possibly suggesting that slice_indexer not return a slice? Or that the provided arguments only apply to what is determined to be the start / end?
Closing until further details are provided - happy to reopen!
Comment From: TomNicholas
> Are you possibly suggesting that slice_indexer not return a slice?
No, I still want it to return a slice.
> Or that the provided arguments only apply to what is determined to be the start / end?
Exactly - so that the start and end points of the slice can be fuzzily specified.
In my xarray PR here you can see the consequences of these kwargs not being available in pandas. Inside my _query_slice function you can see that I am now forced to have two totally different codepaths: one that uses slice_indexer but doesn't support method and tolerance, and one that does support method and tolerance, but only through much more complicated logic involving multiple calls to get_indexer. It would be nice to consolidate these codepaths. There is further discussion about the desired behaviour in that PR too.
Comment From: rhshadrach
So let's say the user is using ffill on the index 5, 3, 1, 4 and specifies start=2. What start will be chosen? And similarly with bfill.
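For reference, a minimal sketch of how current pandas treats this unsorted case, assuming a recent pandas release. Neither method picks a start here; both raise, which is consistent with the docstring's note that the index needs to be ordered and unique.

```python
import pandas as pd

unsorted = pd.Index([5, 3, 1, 4])

# Fill methods in get_indexer rely on a monotonic index.
try:
    unsorted.get_indexer([2], method="ffill")
    ffill_error = None
except ValueError as err:
    ffill_error = err
print("get_indexer:", ffill_error)

# slice_indexer cannot place the missing label 2 in an unsorted index either.
try:
    unsorted.slice_indexer(start=2)
    slice_error = None
except KeyError as err:
    slice_error = err
print("slice_indexer:", repr(slice_error))
```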
Comment From: rhshadrach
Unfortunately the desire here is still not clear to me. Closing until details are added.