I have the following problem set which I have spent days on trying to find the optimal solution:
giving a country evaluating process having 3 parameters (V1,V2,V3), which may be recorded in sperate dates (Date1,Date2,Date3) repsectively, each record of a single parameter is stored in one row, with date and value, as illustrated in the picture below.
I need to fill empty cells in a row with two parameters recorded in other rows based on two rules
-
for the country (ISO), and the recorded parameter in current row, if the other two parameters' recordings of the same country exist in the table in other rows, chose the recorded parameter value/date for the same country in other rows with the date most close to the parameter recording date in current row.
-
otherwise, if the same country doesn't have other one or two parameters recorded in other rows yet, use the recorded parameter value and date in the current row to fill the empty cells of the other two parameters in the current row.
for example:
Record NO.1, country "AND" only have one parameter V1 recorded in the table (C2,D2), thus, (E2,F2) and (G2,H2) should be filled with value from (C2,D2),as 123 and 2022/4/12
Record NO.2, Country "COR" has only one V2 recorded (E3,F3), and two V1 recorded in Record NO.4 and NO.5, then, in Record NO.2, V1(C3,D3) should be filled with value in Record NO.5 (C6,D6), since Record NO.5's date (D6,2022.07.12) is most close to Record NO.2(D3,2022.07.13).
The process has to loop through the dataframe to fill all the empty cells.
Please HELP!
from Python fill empty cells in dataframe based on the values of other column values and conditions
No comments:
Post a Comment