Wednesday, 15 December 2021

Is there a way to calculate the ratio between a crosstab dataframe and another dataframe in pandas?

Summary - the end goal is to express the output of the pandas crosstab function as a percentage of the values in another dataframe that shares the same index.

What I've tried - I tried using the original crosstab dataframe as the numerator and dividing it by the other dataframe, but that doesn't seem to work: the result was all NaN.

Code

import pandas as pd
import numpy as np 

# This is the df we want to use as the denominator
df1 = pd.DataFrame({"Vntg": ["2020-01", "2020-02", "2020-03"],
                    "Funded": [1000, 2000, 4000]})
df2 = pd.DataFrame({"Vntg": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-03"],
                    "Funded": [1000, 1000, 1000, 2000, 2000, 4000],
                    "Payment": [10, 20, 20, 30, 15, 30],
                    "Timing": [0, 1, 2, 0, 1, 0]})
ct_df = pd.crosstab(df2["Vntg"], df2["Timing"], values=df2["Payment"], aggfunc="sum", margins=False)
# This is the crosstab df we want to use as the numerator, on a cumulative basis
ct_df = ct_df.cumsum(axis=1)

Starting from the cumulative sum of the payments, is there a way to convert each dollar value into a percentage of the funded amount in df1 for the same vintage? Thanks in advance, and I appreciate all the help.

I've also looked at the thread below and it doesn't seem to solve my issue: Customized normalization of pd.crosstab()

Edit:

So I think some folks are confused about the ask. To clarify, the final result would take 10 from df2 at timing 0 and divide it by the funded amount, which is 1000 from df1 for vintage 2020-01. For the subsequent timing 1, it would just be (10+20) from df2 divided by the same funded amount from df1 for the same vintage, because the funded amount doesn't change. The results for the other vintages would be populated by the same logic.
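For reference, here is a minimal sketch of one way the calculation described above could be done. It assumes that "Vntg" is unique in df1, so it can serve as the shared index. The `naive` line is only a guess at what the failed attempt looked like: dividing one dataframe by another aligns on both row and column labels, and since the timing columns 0, 1, 2 never match the "Funded" column, every cell comes out NaN. Dividing by a Series with `DataFrame.div(..., axis=0)` aligns on the row index only. The names `funded`, `ratio`, `pct` and `naive` are illustrative and not from the original code.

```python
import pandas as pd

df1 = pd.DataFrame({"Vntg": ["2020-01", "2020-02", "2020-03"],
                    "Funded": [1000, 2000, 4000]})
df2 = pd.DataFrame({"Vntg": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-03"],
                    "Funded": [1000, 1000, 1000, 2000, 2000, 4000],
                    "Payment": [10, 20, 20, 30, 15, 30],
                    "Timing": [0, 1, 2, 0, 1, 0]})

# Cumulative payments per vintage (rows) and timing (columns)
ct_df = pd.crosstab(df2["Vntg"], df2["Timing"],
                    values=df2["Payment"], aggfunc="sum").cumsum(axis=1)

# Assumed reproduction of the failed attempt: frame / frame aligns on
# row AND column labels, and no column label is shared, so all cells are NaN.
naive = ct_df / df1.set_index("Vntg")

# Denominator as a Series indexed by vintage (assumes Vntg is unique in df1)
funded = df1.set_index("Vntg")["Funded"]

# axis=0 matches the Series index against the row index of ct_df,
# so every timing column is divided by that vintage's funded amount.
ratio = ct_df.div(funded, axis=0)
pct = ratio * 100  # same values expressed as percentages

print(pct)
```

Cells with no payment at a given timing (for example 2020-03 at timings 1 and 2) stay NaN in the result; whether to forward-fill them is a separate choice that the question does not specify.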



from Is there a way to calculate ratio between a crosstab dataframe with another dataframe in pandas?
