I am trying to apply a while statement to my code in order to run it until all the elements in the lists below (in the column Check) are in column Source.
My code so far is:
import pandas as pd
while set_condition:  # the condition still has to be defined
    df = df.explode('Check')  # one row per (Source, Check) pair, so the values are hashable
    # elements which are not currently included in the column Source
    newCol = pd.Series(list(set(df['Check'].dropna()) - set(df['Source'])))
    # my_function generates the Check list for each new element, which is why I need a while statement
    newList1 = newCol.apply(my_function)
    # append the new rows (DataFrame.append was removed in pandas 2.0, so pd.concat is used)
    df = pd.concat([df, pd.DataFrame({'Source': newCol, 'Check': newList1})], ignore_index=True)
Here is an example of the process and of how my_function works. Let's say that my initial dataset is:
Source Check
mouse [dog, horse, cat]
horse [mouse, elephant]
tiger []
elephant [horse, bird]
After exploding the Check column and appending the results to Source, I will have:
Source Check
mouse [dog, horse, cat]
horse [mouse, elephant]
tiger []
elephant [horse, bird]
dog [] # this will be filled in after applying the function
cat [] # this will be filled in after applying the function
bird [] # this will be filled in after applying the function
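The step above can be sketched on the example data as follows. This is only a minimal sketch, assuming the Check column holds Python lists; the names `flat`, `missing` and `new_rows` are introduced here for illustration and are not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({
    "Source": ["mouse", "horse", "tiger", "elephant"],
    "Check": [["dog", "horse", "cat"], ["mouse", "elephant"], [], ["horse", "bird"]],
})

# Flatten the lists; an empty list becomes NaN, so drop those entries.
flat = df["Check"].explode().dropna()

# Elements mentioned in Check that do not yet have a row in Source.
missing = sorted(set(flat) - set(df["Source"]))

# Add one row per missing element with an empty list, to be filled in later.
new_rows = pd.DataFrame({"Source": missing, "Check": [[] for _ in missing]})
df = pd.concat([df, new_rows], ignore_index=True)
```

Sorting the missing elements only makes the row order reproducible, so the new rows come out alphabetically rather than in the order shown in the table.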
Every element in the lists should be added to the Source column before applying the function. When I apply the function, I populate the lists of the new elements; so, for example, I can have:
Source Check
mouse [dog, horse, cat]
horse [mouse, elephant]
tiger []
elephant [horse, bird]
dog [mouse, fish] # they are filled in
cat [mouse]
bird [elephant, penguin]
fish [dog]
Since fish and penguin are not in Source, I will need to run the code again in order to get the expected output (all the elements in the lists are already in the Source column):
Source Check
mouse [dog, horse, cat]
horse [mouse, elephant]
tiger []
elephant [horse, bird]
dog [mouse, fish]
cat [mouse]
bird [elephant, penguin]
fish [dog]
penguin [bird]
As both dog and bird are already in Source, I will not need to apply the function again, because all the lists are populated with elements already in the Source column. The code can stop running.
I cannot provide the code for my_function, but I hope it is clear how it works, so that it is possible to figure out how to set the while statement.
What I would like to do is to stop the loop when all the elements in the lists are in the column Source and the function has been applied to populate all the lists.
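One possible way to express that stopping condition is to loop until the set of elements that appear in Check but not in Source is empty. The following is a minimal sketch under stated assumptions, not a definitive implementation: because my_function is not available, a hypothetical stand-in backed by a dictionary (`LOOKUP`) reproduces the example tables above, and the helper name `expand_until_closed` is also made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for my_function, built from the example tables above.
LOOKUP = {
    "dog": ["mouse", "fish"],
    "cat": ["mouse"],
    "bird": ["elephant", "penguin"],
    "fish": ["dog"],
    "penguin": ["bird"],
}

def my_function(name):
    # Assumption: the real function returns a list of related elements.
    return LOOKUP.get(name, [])

def expand_until_closed(df, func):
    """Add rows until every element of every Check list is also in Source."""
    while True:
        seen = set(df["Source"])
        # Flatten the lists; empty lists become NaN and are dropped.
        in_lists = set(df["Check"].explode().dropna())
        missing = sorted(in_lists - seen)
        if not missing:  # stopping condition: nothing left to add
            return df
        new_rows = pd.DataFrame({
            "Source": missing,
            "Check": [func(name) for name in missing],
        })
        df = pd.concat([df, new_rows], ignore_index=True)

df = pd.DataFrame({
    "Source": ["mouse", "horse", "tiger", "elephant"],
    "Check": [["dog", "horse", "cat"], ["mouse", "elephant"], [], ["horse", "bird"]],
})
result = expand_until_closed(df, my_function)
```

With the example data the loop adds bird, cat and dog on the first pass, fish and penguin on the second, and stops on the third because nothing is missing. The Check column keeps its lists, so df.explode('Check') can still be applied once at the end if one row per pair is wanted. The loop only terminates if the real my_function eventually stops returning elements that are not yet in Source.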
Thank you for all the help you will provide.
from How can I iterate until all entries are in a given column?