Friday, 8 April 2022

Evaluate Time Dependent Rate of Return to create Pandas DataFrame

Suppose that I have a Pandas dataframe as follows:

+------------+--------+
|    Date    | Price  |
+------------+--------+
| 2021-07-30 | 438.51 |
| 2021-08-02 | 437.59 |
| 2021-08-03 | 441.15 |
| 2021-08-04 | 438.98 |
+------------+--------+

The above data frame can be genreated using the code below:

data = {'Date': ['2021-07-30', '2021-08-02', '2021-08-03', '2021-08-04'],
        'Price': [438.51, 437.59, 441.15, 438.98]
        }

df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
normalisation_days = 365.25
compounding_days = 365.25

For the given timeseries I want to compute the rate_of_return which is time dependent and problem here is to identify the time periods between which the best or worst values of rate_of_return is achieved.

One could simply calculate rate_of_return on all possible combinations and then create a data frame containing period_start period_end and rate_of_return and sort in descending (best) or ascending (worst) order and then exclude any periods where there is overlap.

rate_of_return = ((period_end_price/period_start_price)^(compounding_days/(days_in_between))-1 * (normalisation_days/compounding_days)

On above data frame I've computed rate_of_return using code below

df['rate_of_return_l1'] = ((((df.Price /
                                   df.Price[0]) **
                                  (compounding_days /
                                   (df.Date - df.Date[0]).dt.days) - 1) *
                                 (normalisation_days /
                                  compounding_days)))
df['rate_of_return_l1'].iloc[0] = np.nan

df['rate_of_return_l2'] = ((((df.Price /
                                   df.Price[1]) **
                                  (compounding_days /
                                   (df.Date - df.Date[1]).dt.days) - 1) *
                                 (normalisation_days /
                                  compounding_days)))
df['rate_of_return_l2'].iloc[:2] = np.nan

df['rate_of_return_l3'] = ((((df.Price /
                                   df.Price[2]) **
                                  (compounding_days /
                                   (df.Date - df.Date[2]).dt.days) - 1) *
                                 (normalisation_days /
                                  compounding_days)))
df['rate_of_return_l3'].iloc[:3] = np.nan

Based on the results the best/worst cases period are as below

+--------------+------------+----------------+
| Period Start | Period End | Rate of Return |
+--------------+------------+----------------+
| 2021-08-02   | 2021-08-03 |    18.28751739 |
| 2021-08-02   | 2021-08-04 |    0.784586925 |
| 2021-07-30   | 2021-08-03 |    0.729942907 |
| 2021-07-30   | 2021-08-04 |    0.081397181 |
| 2021-07-30   | 2021-08-02 |   -0.225626914 |
| 2021-08-03   | 2021-08-04 |   -0.834880227 |
+--------------+------------+----------------+

Expected Output

If I want to see the best of rate_of_return the resulting dataframe would be

+--------------+------------+----------------+
| Period Start | Period End | Rate of Return |
+--------------+------------+----------------+
| 2021-08-02   | 2021-08-03 |    18.28751739 |
+--------------+------------+----------------+

If I want to see the worst of rate_of_return the resulting dataframe would be

+--------------+------------+----------------+
| Period Start | Period End | Rate of Return |
+--------------+------------+----------------+
| 2021-08-03   | 2021-08-04 |   -0.834880227 |
| 2021-07-30   | 2021-08-02 |   -0.225626914 |
+--------------+------------+----------------+
  • What we be the optimal approach to test out all scenarios to compute rate_of_return?
  • How can I achieve the expected output such that periods don't overlap? (seen in expected output)
  • Best/Worst data frame is not sign dependent best data frame can contain negative rate_of_returns given that there is no time period overlap.
  • What would the approach be if formula changes to (period_end_price/period_start_price) - 1(not time dependent)?


from Evaluate Time Dependent Rate of Return to create Pandas DataFrame

No comments:

Post a Comment