Pandas Suggestion: Pop row or multiple rows from a DataFrame

Problem description

I'd like to suggest a modification to df.pop(item). Currently, pop(item) deletes the column from the dataframe it's being called on and returns that column as a series. It doesn't accept multiple items.
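For reference, a minimal sketch of the current behaviour described above, using the same small example frame that appears later in this issue:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# pop takes a single column label, removes that column from df in place,
# and returns the removed column as a Series.
popped = df.pop('B')

print(type(popped).__name__)  # Series
print(list(df.columns))       # ['A', 'C']
```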

It might be a nice convenience to:

pop multiple columns at once (ex: pop(['A', 'B']))
specify an axis parameter (default: axis=1) to allow popping rows as well as columns (ex: pop(1, axis=0))
pop slices (ex: pop([1:3], axis=1))

Thought I'd throw it out there to the pandas gods and see if it is interesting. If it's not the best API design decision for pop, I completely understand.

Common use-case

You have one or more problem rows that you want to delete from a dataframe but keep for later evaluation. You'd just pop the rows: they'd be deleted from the existing dataframe and saved to a new variable.
People often seem to need to pop the last row or the second row. Dropping the last row is easy with .iloc[:-1], but popping the second row in one step isn't as easy, I think. It could be if you could just pop it out of there using pop.
Sometimes people loop through a dataframe. That's not recommended, I understand, but in such a scenario you could pop a row based on a condition while looping, perhaps in a complex manner.

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  columns=['A', 'B', 'C'])

def pop(df, values, axis=1):
    if axis == 0:
        if isinstance(values, (list, tuple)):
            # convert to a list: .loc would read a tuple as a (row, column) key
            labels = list(values)
        elif isinstance(values, int):
            # a one-element list makes .loc return a DataFrame, not a Series
            labels = [values]
        else:
            raise TypeError('values parameter needs to be a list, tuple or int.')
        # copy the rows out first, then delete them from df in place
        popped_rows = df.loc[labels]
        df.drop(labels, axis=0, inplace=True)
        return popped_rows
    elif axis == 1:
        # current df.pop(values) logic here
        return df.pop(values)

Example Usage

# example df
>>> df
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

# pop multiple indices, delete from df inplace, return popped rows
# the df param wouldn't exist in the pop method; it'd be self
# df param just shown here to illustrate the idea
>>> pop(df, [0, 2], axis=0)
   A  B  C
0  1  4  7
2  3  6  9

# pop one index value, delete from df, return row as a dataframe (not series)
>>> pop(df, 1, axis=0)
   A  B  C
1  2  5  8

Demand for such a feature

How to pop rows from a dataframe?

Comment From: TomAugspurger

In-place popping of rows is going to be very inefficient, and I don't think we should encourage that.

I think the best way to do this is with boolean masking. That covers your use cases 1 and 2, and I don't think we should encourage 3 :)
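A rough sketch of what boolean masking could look like for use cases 1 and 2. The example frame, the `problem_labels` list, and the variable names are assumptions made for illustration, not part of any pandas API:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Use case 1: set aside "problem" rows chosen by index label.
problem_labels = [0, 2]  # hypothetical labels, for illustration only
mask = df.index.isin(problem_labels)
problem_rows = df[mask]   # rows to keep for later evaluation
clean = df[~mask]         # everything else

# Use case 2: set aside the second row, chosen by position.
is_second = np.arange(len(df)) == 1
second_row = df[is_second]
without_second = df[~is_second]
```

Neither selection modifies df; each one builds a new frame, so the "popped" rows and the remaining rows are both available afterwards.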

This could be useful as a cookbook entry, "How do I pop rows from a DataFrame?", answering that you don't.

Comment From: jaradc

I don't disagree with anything you said here @TomAugspurger :) I believe pop is usually reserved for a concept of popping 1 of something (ex: 1 row, 1 item, 1 column, etc.) so I'm not sure modifying the existing pop function is appropriate in that context.

A cookbook entry would also be a great help if there's a better way to do this. My main idea is having a convenience method to be able to do this kind of action in one call - pop rows or columns in-place (delete from existing dataframe) and return

Comment From: ghost

@jaradc I would like to add that I too thought this would be useful.

I just distributed my first package https://github.com/kdggavkc/pandas-refract for this purpose, but would prefer to see syntax in pandas like:

target_df = df.pop(df['target_column'] == 'target_value', axis=0)

It's not providing functionality that doesn't exist in pandas, but to me it's syntax I would have thought existed already. Currently you have to slice based on a condition, and then slice on the inverse (mask and ~mask) to split a df this way.
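The mask and ~mask split can be wrapped in a small helper. This is only a sketch: `split_by_mask` is a hypothetical name, not a pandas function and not the API of the package linked above, and the example data is made up:

```python
import pandas as pd

def split_by_mask(df, mask):
    """Return (rows where mask is True, rows where mask is False)."""
    return df[mask], df[~mask]

df = pd.DataFrame({'target_column': ['target_value', 'other', 'target_value'],
                   'x': [1, 2, 3]})

# Rebind df to the remainder, so the result resembles a "pop" without
# mutating the original frame in place.
target_df, df = split_by_mask(df, df['target_column'] == 'target_value')
```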

@TomAugspurger any thoughts here? How open are we to allowing the above syntax?

Comment From: TomAugspurger

@kdggavkc I may misunderstand, but your pop looks different.

dict.pop / DataFrame.pop take a label. This issue was about expanding pop to take multiple keys and an axis argument. Your pop seems to take a mask.


Comment From: ghost

@TomAugspurger no you are correct. I piped in here because I saw someone had a similar idea and didn't want to make a separate issue. If you feel it's more appropriate for me to do so I certainly can.

Comment From: jmarshall9120

December calling.. did we ever get a cookbook entry for this?

Comment From: kevinbird15

I know this is pretty old, but this is how I solved this problem for me:

def pop_first_row(df):
    first_row = df.iloc[[0]]
    df = df.iloc[1:]
    return first_row, df

I only cared about the top row. This could also take an index argument, but I'm not 100% sure how I would rebuild the df when the index isn't 0; I'd need to think about that a bit. If performance weren't a concern, you could pd.concat the two sides of the popped index, but I bet there is a smarter way to deal with it.
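One possible way to generalise this to an arbitrary position without pd.concat is a boolean keep-mask. This is a sketch under assumptions: `pop_row_at` is a hypothetical helper name, and like the function above it returns a new frame rather than modifying df in place:

```python
import numpy as np
import pandas as pd

def pop_row_at(df, pos):
    """Return (row at integer position pos as a one-row DataFrame, remaining rows)."""
    keep = np.ones(len(df), dtype=bool)
    keep[pos] = False          # negative positions work too, e.g. -1 for the last row
    return df.iloc[[pos]], df[keep]

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
second_row, df = pop_row_at(df, 1)
```

Selecting by position rather than by label means this still removes exactly one row when the index contains duplicate labels.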

Comment From: jbrockmendel

I agree with @tomaugspurger: we don't want to encourage popping on rows. Closing.