Tuesday, 14 February 2023

Named rolling aggregate custom functions in Pandas

I can't find a way to properly name custom aggregate functions applied to rolling windows. This answer explains it well for groupby aggregates. I've tried using pd.NamedAgg, like so:

df.rolling(f"{num_days_window + 1}D", min_periods=day_length).aggregate(
    time_mean=pd.NamedAgg(column="time", aggfunc=lambda w: window_daily_stats(w, np.mean)),
    time_std=pd.NamedAgg(column="time", aggfunc=lambda w: window_daily_stats(w, np.std)),
)

Nested dictionaries for naming are deprecated, so that's not an option. Passing in tuples doesn't work either:

df.rolling(f"{num_days_window + 1}D", min_periods=day_length).aggregate(
    time_mean=("time", lambda w: window_daily_stats(w, np.mean)),
    time_std=("time", lambda w: window_daily_stats(w, np.std)),
)

In both cases the error is the same:

TypeError: aggregate() missing 1 required positional argument: 'func'

My current workaround is to pass aggregate a dict that maps each column to a list of functions, but in that case the resulting columns are named:

('time', '<lambda>'),
('time', '<lambda>'), 

Both labels are identical because every lambda's __name__ is '<lambda>', so unfortunately the columns Index does not have unique values.

In short: how do I create named aggregates for custom functions on rolling windows?
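
One possible workaround, offered as a sketch rather than a definitive answer: the dict-of-lists form labels each result with the function's __name__, so giving every custom function a distinct name should produce distinct columns. Everything below is assumed for illustration: the toy df, the "3D" window, and the helper named(), which is hypothetical and not part of pandas. Plain np.mean and np.std stand in for the question's window_daily_stats, which is not shown.

```python
import numpy as np
import pandas as pd


def named(name, func):
    """Hypothetical helper: wrap func and give the wrapper a distinct __name__."""
    def wrapper(window):
        return func(window)
    wrapper.__name__ = name
    return wrapper


# Toy data standing in for the question's frame (assumption: one "time" column
# on a daily DatetimeIndex).
idx = pd.date_range("2023-01-01", periods=6, freq="D")
df = pd.DataFrame({"time": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

# Same dict-of-lists call as in the question, but each lambda is renamed first.
result = df.rolling("3D", min_periods=1).aggregate(
    {
        "time": [
            named("time_mean", lambda w: np.mean(np.asarray(w))),
            named("time_std", lambda w: np.std(np.asarray(w))),
        ]
    }
)

# The columns come back as (column, function name) pairs; keep only the
# function-name level to get flat, unique labels.
result.columns = result.columns.get_level_values(-1)
print(result.columns.tolist())
```

The same effect can be had by writing ordinary def functions with the desired names, or by assigning to __name__ on a lambda directly; the wrapper just keeps the call site close to the original lambda style.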



