Summary - the end goal is to calculate percentages by dividing the output of a Pandas crosstab by a column from another dataframe that shares the same index values (Vntg).
What I've tried - I tried using the crosstab dataframe as the numerator and calling div with the other dataframe as the denominator, but it doesn't seem to work that way because the result was all NaN.
Code
import pandas as pd
import numpy as np
# This is the df we want to use as denominator
df1 = pd.DataFrame({"Vntg": ["2020-01", "2020-02", "2020-03"],
                    "Funded": [1000, 2000, 4000]})
df2 = pd.DataFrame({"Vntg": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-03"],
                    "Funded": [1000, 1000, 1000, 2000, 2000, 4000],
                    "Payment": [10, 20, 20, 30, 15, 30],
                    "Timing": [0, 1, 2, 0, 1, 0]})
ct_df = pd.crosstab(df2["Vntg"], df2["Timing"], values=df2["Payment"], aggfunc="sum", margins=False)
ct_df = ct_df.cumsum(axis=1) # This is the crosstab df we want to use as numerator on a cumulative basis
Starting from the cumsum step that accumulates the payments, is there a way to convert each dollar value into a percentage of the funded amount in df1? Thanks in advance; I appreciate all the help.
I've also looked at the thread below and it doesn't seem to solve my issue: Customized normalization of pd.crosstab()
Edit:
So I think some folks are confused about the ask. To clarify, the final result would take the 10 from df2 at timing 0 and divide it by the funded amount, which is 1000 from df1 for vintage 2020-01. For the subsequent timing at 1, it would be (10+20) from df2 divided by the same funded amount from df1 for the same vintage, because the funded amount doesn't change. The results for the other vintages would be populated by the same logic.
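The calculation described above could be sketched as follows. This is only one possible approach, and it assumes Vntg is unique in df1 so it can serve as the index. The all-NaN result is most likely an alignment issue: dividing one dataframe by another matches on both row and column labels, and the crosstab's columns (0, 1, 2) share nothing with df1's columns. Dividing by a Series indexed by Vntg with axis=0 aligns on the rows only. The names funded and pct_df are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"Vntg": ["2020-01", "2020-02", "2020-03"],
                    "Funded": [1000, 2000, 4000]})
df2 = pd.DataFrame({"Vntg": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-03"],
                    "Funded": [1000, 1000, 1000, 2000, 2000, 4000],
                    "Payment": [10, 20, 20, 30, 15, 30],
                    "Timing": [0, 1, 2, 0, 1, 0]})

# Cumulative payments per vintage (rows) across timing (columns)
ct_df = pd.crosstab(df2["Vntg"], df2["Timing"], values=df2["Payment"], aggfunc="sum")
ct_df = ct_df.cumsum(axis=1)

# Assumption: one funded amount per vintage, so Vntg can be used as the index
funded = df1.set_index("Vntg")["Funded"]

# axis=0 matches the Series index against the crosstab's row labels (Vntg)
pct_df = ct_df.div(funded, axis=0) * 100
print(pct_df)
```

For vintage 2020-01 this gives 10/1000, (10+20)/1000 and (10+20+20)/1000, i.e. 1%, 3% and 5%. Vintage/timing combinations with no payment in df2 stay NaN.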
from Is there a way to calculate ratio between a crosstab dataframe with another dataframe in pandas?