Suppose I have a df like the following:
col1 | type | date_1 | date_2 | date_3 |.... | date_n
ab | A | -10 | | -10
ab | B | 100 | 99 | -12
cd | A | 0 | -25 | 6
cd | B | -1 | 8 | -34
ab | A | 98 | -9 | 0
ab | B | -7 | -2 | 0
The first step is to remove all non-negative numbers (positive numbers and 0), leaving only the negatives.
The df should then look like this,
col1 | type | date_1 | date_2 | date_3 | .... | date_n
ab | A | -10 | | -10 |
ab | B | | | -12 |
cd | A | | -25 | |
cd | B | -1 | | -34 |
ab | A | | -9 | |
ab | B | -7 | -2 | |
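A minimal sketch of this first step, assuming blank cells are stored as NaN and that the first two columns are always `col1` and `type`, so the date columns can be picked by position rather than by header. The names `date_cols` and `step1` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the first table; NaN stands in for a blank cell.
df = pd.DataFrame(
    {
        "col1": ["ab", "ab", "cd", "cd", "ab", "ab"],
        "type": ["A", "B", "A", "B", "A", "B"],
        "date_1": [-10, 100, 0, -1, 98, -7],
        "date_2": [np.nan, 99, -25, 8, -9, -2],
        "date_3": [-10, -12, 6, -34, 0, 0],
    }
)

# Assumption: everything after the first two columns is a date column.
# Selecting by position avoids depending on the changing headers.
date_cols = df.columns[2:]

# Keep strictly negative values; zeros and positives become NaN (blank).
step1 = df.copy()
step1[date_cols] = step1[date_cols].where(step1[date_cols] < 0)

print(step1)
```

`DataFrame.where` keeps a value where the condition is true and writes NaN elsewhere, so cells that were already blank stay blank.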
The second step is to compare, for each 'date' col, the values of the 'type' A row and the 'type' B row in each pair,
- If the 'type' A row has a negative number and 'type' B is blank, then remove the negative number of 'type' A in that 'date' col
- If the 'type' B row has a negative number and 'type' A is blank, then do nothing
- If both types are blank, do nothing
After this step, the df should look like this,
col1 | type | date_1 | date_2 | date_3 | .... | date_n
ab | A | | | -10 |
ab | B | | | -12 |
cd | A | | | |
cd | B | -1 | | -34 |
ab | A | | -9 | |
ab | B | -7 | -2 | |
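One possible way to express this second step, sketched under two assumptions: blanks are NaN, and the rows always come in consecutive pairs with 'type' A first and 'type' B directly below it (col1 repeats, so it cannot serve as a group key on its own). The input is the table from after the first step; `vals`, `a`, `b` and `step2` are illustrative names.

```python
import numpy as np
import pandas as pd

nan = np.nan
# State after the first step (copied from the table); NaN marks a blank.
df = pd.DataFrame(
    {
        "col1": ["ab", "ab", "cd", "cd", "ab", "ab"],
        "type": ["A", "B", "A", "B", "A", "B"],
        "date_1": [-10, nan, nan, -1, nan, -7],
        "date_2": [nan, nan, -25, nan, -9, -2],
        "date_3": [-10, -12, nan, -34, nan, nan],
    }
)

date_cols = df.columns[2:]  # positional selection, headers are not used

# Work on a float copy of the date block.
vals = df[date_cols].to_numpy(dtype=float, copy=True)
a = vals[0::2]  # view onto the 'type' A rows (assumed to be rows 0, 2, 4, ...)
b = vals[1::2]  # view onto the 'type' B rows (assumed to be rows 1, 3, 5, ...)

# A has a number but B is blank: blank out A. Every other case is left alone.
a[~np.isnan(a) & np.isnan(b)] = np.nan

step2 = df.copy()
step2[date_cols] = pd.DataFrame(vals, index=df.index, columns=date_cols)

print(step2)
```

If the rows are not guaranteed to alternate A then B, the pairing would have to be built some other way before the comparison.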
Final step,
- If both types are negative in the current 'date' col, then for each set of col1 (ab, cd, ab), check the values one column to the left in the same A and B rows,
1) If both the 'type' A and 'type' B values on the left are blank, then remove the negative number of the current 'type' A and keep the negative number of 'type' B
2) If either of the two is blank, then remove the negative number of the current 'type' B and keep the negative number of 'type' A
Finally, the final_df should look like this,
col1 | type | date_1 | date_2 | date_3 | .... | date_n
ab | A | | | |
ab | B | | | -12 |
cd | A | | | |
cd | B | -1 | | -34 |
ab | A | | -9 | |
ab | B | -7 | | |
For the final step, the comparison should start from 'date_2'.
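A rough sketch of the final step, with several assumptions that go beyond what is stated: blanks are NaN, rows alternate A then B, "either is blank" means exactly one of the two left-hand values is blank, the left-hand values are read from the table as it stood after the second step (not from cells already changed in this step), and a pair whose left-hand values are both non-blank is left unchanged. The input is the table from after the second step; all variable names are illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan
# State after the second step (copied from the table); NaN marks a blank.
df = pd.DataFrame(
    {
        "col1": ["ab", "ab", "cd", "cd", "ab", "ab"],
        "type": ["A", "B", "A", "B", "A", "B"],
        "date_1": [nan, nan, nan, -1, nan, -7],
        "date_2": [nan, nan, nan, nan, -9, -2],
        "date_3": [-10, -12, nan, -34, nan, nan],
    }
)

date_cols = df.columns[2:]  # positional selection, headers are not used

vals = df[date_cols].to_numpy(dtype=float, copy=True)
a = vals[0::2]  # view onto the 'type' A rows
b = vals[1::2]  # view onto the 'type' B rows

# Snapshot of which cells are blank before anything is changed.
a_blank = np.isnan(a)
b_blank = np.isnan(b)

# After the earlier steps, any non-blank cell is negative.
both_neg = ~a_blank & ~b_blank

# Compare each column from the second date column onward with its left neighbour.
cur = both_neg[:, 1:]
left_a_blank = a_blank[:, :-1]
left_b_blank = b_blank[:, :-1]

drop_a = np.zeros_like(both_neg)
drop_b = np.zeros_like(both_neg)
drop_a[:, 1:] = cur & left_a_blank & left_b_blank    # rule 1: both left cells blank
drop_b[:, 1:] = cur & (left_a_blank ^ left_b_blank)  # rule 2: exactly one left cell blank

a[drop_a] = np.nan
b[drop_b] = np.nan

final_df = df.copy()
final_df[date_cols] = pd.DataFrame(vals, index=df.index, columns=date_cols)

print(final_df)
```

Because the first date column is never written to (`drop_a[:, 1:]`, `drop_b[:, 1:]`), the comparison effectively starts at the second date column.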
What would be the best way to solve this problem? Any help would be greatly appreciated!
Note: I cannot use the column headers (the date ones) to manipulate data because they will keep changing.
from How to perform cell by cell comparison using pandas?