Feature Type

  • [x] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

Hi all!

I'm currently working on a forecasting model and had to create multiple lags of many columns. Doing this made me realize that the shift method does not have prefix, prefix_sep, suffix, or suffix_sep arguments. I think adding some (or all) of these arguments would be super useful and would make the method more consistent with others such as pd.get_dummies. Additionally, this is already implemented to some extent: when periods is a list, shift appends a suffix to each lagged column.

Feature Description

Obviously this sketch is circular, because it calls shift itself, but I think it conveys the idea well.

Suppose suffix and suffix_sep are strings (e.g. 'lag' and '_') and that columns and periods are iterables.

if suffix and suffix_sep:
    for column in columns:
        for period in periods:
            # e.g. 'sales' + '_' + 'lag' + '1' -> 'sales_lag1'
            data[f'{column}{suffix_sep}{suffix}{period}'] = data[column].shift(period)
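The same idea can be written as a standalone, runnable sketch. Note that shift_with_suffix is a hypothetical helper name, not an existing pandas API, and the default values for suffix and suffix_sep are just the examples from above:

```python
import pandas as pd


def shift_with_suffix(data, columns, periods, suffix="lag", suffix_sep="_"):
    """Hypothetical helper (not part of pandas): lag each column by each period."""
    shifted = [
        data[column].shift(period).rename(f"{column}{suffix_sep}{suffix}{period}")
        for column in columns
        for period in periods
    ]
    # Build the result with a single concat instead of inserting
    # columns into the frame one at a time.
    return pd.concat(shifted, axis=1)


# Toy data; the column names are illustrative only.
df = pd.DataFrame({"sales": [10, 20, 30, 40], "revenue": [1.0, 2.0, 3.0, 4.0]})
lagged = shift_with_suffix(df, ["sales", "revenue"], [1, 2])
print(lagged.columns.tolist())
```

Collecting the shifted Series and concatenating once keeps the original frame untouched, which is closer to how shift already behaves than assigning into data in a loop.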

Alternative Solutions

Here's what I'm currently doing to add _lagX as a suffix:

lags = [1, 2, 3, 6, 9, 12]
_temp = df[cols_og_feats].shift(periods=lags)  # Lag each column by each lag in lags
# shift() names the new columns '<col>_<lag>'; split on the last '_' only
_temp.columns = [
    '_lag'.join(col.rsplit('_', 1)) for col in _temp.columns
]  # add '_lagX' suffix

Additional Context

No response

Comment From: sajansshergill

lags = [1, 2, 3]
cols = ['sales', 'revenue']

# Efficiently create lagged columns with suffix _lag{n}
df_lagged = pd.concat(
    [df[col].shift(lag).rename(f"{col}_lag{lag}") for col in cols for lag in lags],
    axis=1,
)

# Optionally combine with original dataframe
df = pd.concat([df, df_lagged], axis=1)

Comment From: ArturoSbr

Thank you! I see what you mean, @sajansshergill, but I wanted to pass an iterable to the method. Is a list comprehension more efficient than passing an iterable to shift?