Feature Type
- [X] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
I suggest adding first_inverted and last_inverted as keep options for pandas.DataFrame.duplicated. Below is an example of how it would work and what it would return.
import pandas as pd

df = pd.DataFrame({
    'brand': ['Yum Yum', 'Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'style': ['cup', 'cup', 'cup', 'cup', 'pack', 'pack'],
    'rating': [4, 4, 4, 3.5, 15, 5],
})
df.duplicated(keep='first_inverted')
0     True
1    False
2    False
3    False
4    False
5    False
dtype: bool
Feature Description
.
Alternative Solutions
.
Additional Context
No response
Comment From: KevsterAmp
take
Comment From: rhshadrach
Is this request the same as doing df.duplicated(keep=False) & ~df.duplicated(keep="first")?
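For reference, here is a minimal sketch of that combination applied to the example frame from the issue. The helper names first_inverted and last_inverted are hypothetical (they are not part of pandas), and the last_inverted variant assumes the request mirrors the first case with keep="last".

```python
import pandas as pd

df = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "cup", "pack", "pack"],
    "rating": [4, 4, 4, 3.5, 15, 5],
})


def first_inverted(frame):
    # Hypothetical helper: flag only the first row of each duplicated group.
    # keep=False marks every member of a duplicated group; keep="first"
    # marks every member except the first, so negating it leaves the first.
    return frame.duplicated(keep=False) & ~frame.duplicated(keep="first")


def last_inverted(frame):
    # Hypothetical helper (assumed mirror image): flag only the last row
    # of each duplicated group.
    return frame.duplicated(keep=False) & ~frame.duplicated(keep="last")


first_mask = first_inverted(df)
last_mask = last_inverted(df)

print(first_mask.tolist())  # [True, False, False, False, False, False]
print(last_mask.tolist())   # [False, False, True, False, False, False]
```

Both helpers call duplicated twice, which is the cost the original request was hoping to avoid.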
Comment From: tommycarstensen
@rhshadrach Yes, that is correct. I just wanted to avoid two loops over a very large dataframe.
Comment From: rhshadrach
I do not think we should expand the API to include a specific implementation for this operation. There are many different ways users may want to flag duplicates, and it's unsustainable to try to have specific implementations for each one.
Comment From: mroeschke
Agreed, closing