I often need to change the value of an element in a pd.Index, but it's difficult without the replace method that pd.Series has. I usually have a bunch of Jupyter cells depending on a particular pd.DataFrame or pd.Series, with complicated transformations and a graph at some point. In most cases, I have to change a few of the labels to clean them up for publication-quality figures. Currently, I have to use an if/else expression with the map method, but it would be nice to be able to just call replace.
import pandas as pd
print(pd.__version__)
print("_________________________")
idx = pd.Index([0,1,2])
Se = pd.Series(list("abc"), index=idx)
# Can do it to pd.Series
print(Se)
print(Se.replace({"b":"replaced_b"}))
# but not pd.Index
print("_________________________")
print(idx)
d = {1:100}
print(idx.map(lambda x: d[x] if x in d else x))
print(idx.replace({1:100}))
# 0.21.1
# _________________________
# 0 a
# 1 b
# 2 c
# dtype: object
# 0 a
# 1 replaced_b
# 2 c
# dtype: object
# _________________________
# Int64Index([0, 1, 2], dtype='int64')
# Int64Index([0, 100, 2], dtype='int64')
# ---------------------------------------------------------------------------
# AttributeError Traceback (most recent call last)
# <ipython-input-383-0b8eacd67c54> in <module>()
# 11 d = {1:100}
# 12 print(idx.map(lambda x: d[x] if x in d else x))
# ---> 13 print(idx.replace({1:100}))
# 14
# AttributeError: 'Int64Index' object has no attribute 'replace'
Comment From: ZhuBaohe
You can do a type conversion.
>>> idx = pd.Index([0,1,2])
>>> d = {1:100}
>>> idx = pd.DataFrame(idx).replace(d).set_index(0).index
>>> print(idx)
Int64Index([0, 100, 2], dtype='int64', name=0)
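A lighter variant of the same round trip, sketched under the assumption that going through a Series is acceptable: Index.to_series() gives access to Series.replace, and wrapping the result in pd.Index rebuilds the index without picking up name=0. The helper name replace_in_index is hypothetical, not a pandas API.

```python
import pandas as pd

def replace_in_index(idx, mapping):
    """Hypothetical helper: return a new Index with values swapped per `mapping`."""
    # Index has no .replace, so borrow Series.replace via to_series()
    replaced = idx.to_series().replace(mapping)
    # Rebuild an Index from the raw values, keeping the original name
    return pd.Index(replaced.values, name=idx.name)

idx = pd.Index([0, 1, 2])
new_idx = replace_in_index(idx, {1: 100})
print(list(new_idx))
```

The original idx is left untouched, since every step returns a new object.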
Comment From: jolespin
Thanks @ZhuBaohe, there are a couple of workarounds, but I think it would make sense to have a general method, just as .to_csv is in both pd.Series and pd.DataFrame.
Comment From: jolespin
I've been using the following:
replace = lambda x,d: d[x] if x in d else x
df.index.map(lambda x:replace(x, d_cpd_newcpdname))
Not sure if this is helpful at all.
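The same idea can also be written with dict.get, which falls back to the original label whenever it is missing from the mapping. This is only a sketch: df and label_map below are made-up stand-ins for the real frame and for d_cpd_newcpdname. Because Index.map returns a new Index, the result has to be assigned back.

```python
import pandas as pd

# Made-up data standing in for the real DataFrame
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=["cpd_a", "cpd_b", "cpd_c"])
# Made-up mapping standing in for d_cpd_newcpdname
label_map = {"cpd_b": "Compound B"}

# dict.get(x, x) returns the original label when x is not a key
df.index = df.index.map(lambda x: label_map.get(x, x))
print(list(df.index))
```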
Comment From: jorisvandenbossche
@jolespin is Index.rename what you want?
Comment From: jolespin
@jorisvandenbossche I don't think it works the same way with pd.Index objects:
>>> import pandas as pd
>>> idx = pd.Index([1,2,3])
>>> idx.rename({1:"a"})
Int64Index([1, 2, 3], dtype='int64', name={1: 'a'})
Comment From: jorisvandenbossche
Sorry, I meant DataFrame.rename(index={1: 'a'})
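For reference, a minimal sketch of that call on a made-up frame: DataFrame.rename with index= accepts a dict and leaves labels that are not keys untouched, which is close to the replace-like behaviour asked for, although it needs a DataFrame (or Series) rather than a bare Index.

```python
import pandas as pd

# Made-up frame whose index matches the pd.Index([1, 2, 3]) example
df = pd.DataFrame({"x": [10, 20, 30]}, index=[1, 2, 3])

# Only the label 1 is in the mapping; 2 and 3 pass through unchanged
renamed = df.rename(index={1: "a"})
print(list(renamed.index))
```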
Comment From: jolespin
It's nice doing it to the actual index objects as well because I use them to store iterables of strings that I manipulate often. Is there a core method like the plans for to_csv for series and dataframes that could be applied to replace?
Comment From: jorisvandenbossche
Hmm, we could consider adding replace to Index to make it consistent with Series?
Comment From: oguzhanogreden
No objections to @jorisvandenbossche's suggestion?
I "suffered" from this myself. Less dramatically, I expected that I could of course use a replace method on an index, so I'd like to give this a go if a PR is welcome.
Comment From: oguzhanogreden
take