Tuesday, 1 October 2019

Deleting values conditional on large values of another column

I have a timeseries df comprised of daily Rates in column A and the relative change from one day to the next in column B.

DF looks something like the below:

                   IR      Shift
May/24/2019        5.9%    - 
May/25/2019        6%      1.67%      
May/26/2019        5.9%    -1.67
May/27/2019        20.2%   292%
May/28/2019        20.5%   1.4% 
May/29/2019        20%    -1.6% 
May/30/2019        5.1%   -292%
May/31/2019        5.1%     0%

I would like to delete all values in column A which occur between between large relative shifts,> +/- 50%.

So the above DF should look as the below:

                      IR      Shift
May/24/2019        5.9%    - 
May/25/2019        6%       1.67%      
May/26/2019        5.9%    -1.67
May/27/2019        np.nan   292%
May/28/2019        np.nan   1.4% 
May/29/2019        np.nan  -1.6% 
May/30/2019        5.1%    -292%
May/31/2019        5.1%      0%

This is where I've got to so far.... would appreciate some help

 for i, j in df1.iterrows():
      if df1['Shift'][i] > .50 :
          x = df1['IR'][i]
      if df1['Shift'][j] < -.50 :
          y = df1['IR'][j]
      df1['IR'] = np.where(df1['Shift'].between(x,y), df1['Shift'], 
      np.nan)                                                                                                                                  

Error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().



from Deleting values conditional on large values of another column

No comments:

Post a Comment