Hemant Vishwakarma: groupby rolling date window sum with duplicate dates

Wednesday, 22 December 2021

This answer provides a way to compute a rolling sum of one column, grouped by another column, over a date window. To reproduce it here:

import datetime

import pandas as pd

df = pd.DataFrame(
    {
        'ID': {0: 10001, 1: 10001, 2: 10001, 3: 10001, 4: 10002, 5: 10002, 6: 10002},
        'Date': {
            0: datetime.datetime(2019, 7, 1),
            1: datetime.datetime(2019, 5, 1),
            2: datetime.datetime(2019, 6, 25),
            3: datetime.datetime(2019, 5, 27),
            4: datetime.datetime(2019, 6, 29),
            5: datetime.datetime(2019, 7, 18),
            6: datetime.datetime(2019, 7, 15)
        },
        'Amount': {0: 50, 1: 15, 2: 10, 3: 20, 4: 25, 5: 35, 6: 40},
    }
)
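# Per ID: sort by date, then take a 28-day rolling sum over the Date column.
amounts = df.groupby(["ID"]).apply(lambda g: g.sort_values('Date').rolling('28d', on='Date').sum())
# Map the rolling sums back onto the original rows by Date.
df['amount_4wk_rolling'] = df["Date"].map(amounts.set_index('Date')['Amount'])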

Output:

+-------+------------+--------+--------------------+
|  ID   |    Date    | Amount | amount_4wk_rolling |
+-------+------------+--------+--------------------+
| 10001 | 01/07/2019 |     50 |                 60 |
| 10001 | 01/05/2019 |     15 |                 15 |
| 10001 | 25/06/2019 |     10 |                 10 |
| 10001 | 27/05/2019 |     20 |                 35 |
| 10002 | 29/06/2019 |     25 |                 25 |
| 10002 | 18/07/2019 |     35 |                100 |
| 10002 | 15/07/2019 |     40 |                 65 |
+-------+------------+--------+--------------------+

However, if two of the dates are the same then I get the error:

pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

This makes sense, as I can see that the final line uses Date to set an index, which is no longer unique. However, as I don't really understand what that final line does, I'm a little stumped on how to develop an alternative solution.
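For context, the final line passes a Series to Series.map, which looks up each value of the calling Series in the index of the Series it is given. A minimal sketch of that lookup with a unique and a duplicated index, using made-up values (the names dates, unique_lookup and duplicate_lookup are illustrative, not from the code above):

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2019-07-01", "2019-05-01"]))

# Unique index: each date resolves to exactly one value.
unique_lookup = pd.Series(
    [60, 15], index=pd.to_datetime(["2019-07-01", "2019-05-01"])
)
mapped = dates.map(unique_lookup)

# Duplicated index: the lookup for 2019-07-01 is ambiguous, so pandas refuses.
duplicate_lookup = pd.Series(
    [60, 15], index=pd.to_datetime(["2019-07-01", "2019-07-01"])
)
try:
    dates.map(duplicate_lookup)
    raised = False
except pd.errors.InvalidIndexError:
    raised = True
```

With the duplicated index, raised ends up True, which matches the error shown above.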

Could someone help out?
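One possible way around this is to align the rolling sums back to the rows by their original row labels instead of by Date, so duplicate dates never end up in a lookup index. The sketch below assumes df has a unique row index (the default RangeIndex here). The helper name add_rolling_sum and the modified copy df_dup are made up for illustration.

```python
import datetime

import pandas as pd


def add_rolling_sum(df, window="28D"):
    """Return a copy of df with a per-ID rolling sum of Amount over `window`."""
    parts = []
    for _, group in df.groupby("ID"):
        # Time-based windows need the dates in sorted order.
        ordered = group.sort_values("Date", kind="mergesort")
        sums = ordered.set_index("Date")["Amount"].rolling(window).sum()
        # Restore the original row labels so results align by row, not by date.
        sums.index = ordered.index
        parts.append(sums)
    out = df.copy()
    # Column assignment aligns on the row labels, so row order is preserved.
    out["amount_4wk_rolling"] = pd.concat(parts)
    return out


df = pd.DataFrame(
    {
        "ID": [10001, 10001, 10001, 10001, 10002, 10002, 10002],
        "Date": [
            datetime.datetime(2019, 7, 1),
            datetime.datetime(2019, 5, 1),
            datetime.datetime(2019, 6, 25),
            datetime.datetime(2019, 5, 27),
            datetime.datetime(2019, 6, 29),
            datetime.datetime(2019, 7, 18),
            datetime.datetime(2019, 7, 15),
        ],
        "Amount": [50, 15, 10, 20, 25, 35, 40],
    }
)
result = add_rolling_sum(df)

# Hypothetical variant: give the last row the same date as the row before it.
df_dup = df.copy()
df_dup.loc[6, "Date"] = datetime.datetime(2019, 7, 18)
result_dup = add_rolling_sum(df_dup)
```

On the original data this should reproduce the amount_4wk_rolling column from the table above, and the variant with a repeated date runs without the InvalidIndexError. Rows that share a date within one ID are not guaranteed to receive identical sums, so if that matters it may be safer to aggregate to one row per ID and Date first.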
