Monday, 26 April 2021

Adding other variables with key and value and appending of a dict to a dataframe

I need to pass columns from a dataframe to a function that returns a dictionary. The contents of that dictionary then need to be appended to the same dataframe, in its two columns result and cost.

For example the function is:

    def costsplit(acc, srv, owner, cost):
        # splitter().split() divides the cost and returns a dictionary
        test = splitter().split(acc, srv, owner, cost)
        return test

Suppose the dictionary returned by test for dev is test = {'dps':32, 'dd':21, 'ct':92, 'cc':32}.

That is, {'dps':32, 'dd':21, 'ct':92, 'cc':32} is returned by test when acc = dev, srv = instance, owner = dpc and cost = 30 are passed (i.e. row 1 of the dataframe below). Similarly, a different output, {'dps':20, 'dd':21, 'ct':92, 'cc':2}, is returned by test when acc = prd, srv = instance, owner = abs and cost = 35 are passed (i.e. row 4). Both results should be appended to the result and cost columns of the dataframe.
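To make the mapping concrete, here is a minimal sketch of how one returned dictionary could be turned into (result, cost) pairs. The variable names `acc`, `test` and `rows` are only illustrative, and the dictionary is the dev example from above rather than real output of splitter().split.

```python
# Example values taken from the question; 'rows' is a hypothetical name.
acc = 'dev'
test = {'dps': 32, 'dd': 21, 'ct': 92, 'cc': 32}

# One entry per dictionary key: the key goes into 'result', the value into 'cost'.
rows = [
    {'result': f'gcp.{acc}.{key}', 'cost': value}
    for key, value in test.items()
]

for row in rows:
    print(row['result'], row['cost'])
```

Each dictionary key becomes one new row, so a four-key dictionary contributes four rows.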

The current dataframe looks like:

    date         acc   srv        owner   result        cost
    2021-03-01   dev   bucket     dps     gcp.dev.dps   177
    2021-03-01   prd   instance   abs     gcp.prd.abs   35
    2021-03-01   dev   spanner    cc      gcp.dev.cc    98
    2021-03-01   prd   instance   it      gcp.prd.it    135

The output dataframe should have new rows appended, with the result and cost columns filled from the key-value pairs of the dictionary.

The output should be like:

    date         acc   srv        owner   result        cost
    2021-03-01   dev   bucket     dps     gcp.dev.dps   177
    2021-03-01   prd   instance   abs     gcp.prd.abs   35
    2021-03-01   dev   spanner    cc      gcp.dev.cc    98
    2021-03-01   prd   instance   it      gcp.prd.it    135
    2021-03-01                            gcp.dev.dps   32
    2021-03-01                            gcp.dev.dd    21
    2021-03-01                            gcp.dev.ct    92
    2021-03-01                            gcp.dev.cc    32
    2021-03-01                            gcp.prd.dps   20
    2021-03-01                            gcp.prd.dd    21
    2021-03-01                            gcp.prd.ct    92
    2021-03-01                            gcp.prd.cc    2

In other words, a loop runs over each row of the current dataframe and passes the acc, srv, owner and cost values to the costsplit function. For each key in the dictionary returned by test, a new row should be appended with gcp.{acc}.{key} in the result column and the corresponding value in the cost column.

The splitter().split function divides the cost and renames the owner based on each row that is sent from the dataframe.

With the command below I am only able to fill the result column, not the cost column.

    acc['result'] = acc.apply(lambda x: [f'gcp.{acc}.{squ}' for squ, cost in test.items()], axis=1)
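One possible way to get both columns is to build the new rows explicitly and concatenate them onto the original frame. The sketch below is only that, a sketch under assumptions: splitter().split is not shown, so `costsplit` here is a hypothetical stand-in that returns the two example dictionaries keyed by acc, and the names `df`, `new_rows` and `out` are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the real costsplit(): the real splitter().split()
# is not shown, so this simply returns the two example dicts, chosen by acc.
def costsplit(acc, srv, owner, cost):
    examples = {
        'dev': {'dps': 32, 'dd': 21, 'ct': 92, 'cc': 32},
        'prd': {'dps': 20, 'dd': 21, 'ct': 92, 'cc': 2},
    }
    return examples[acc]

# The current dataframe from the question.
df = pd.DataFrame({
    'date': ['2021-03-01'] * 4,
    'acc': ['dev', 'prd', 'dev', 'prd'],
    'srv': ['bucket', 'instance', 'spanner', 'instance'],
    'owner': ['dps', 'abs', 'cc', 'it'],
    'result': ['gcp.dev.dps', 'gcp.prd.abs', 'gcp.dev.cc', 'gcp.prd.it'],
    'cost': [177, 35, 98, 135],
})

# Call costsplit once per row and collect one new row per dictionary entry.
new_rows = []
for row in df.itertuples(index=False):
    split = costsplit(row.acc, row.srv, row.owner, row.cost)
    for key, value in split.items():
        new_rows.append({
            'date': row.date,
            'result': f'gcp.{row.acc}.{key}',
            'cost': value,
        })

# Append the new rows; acc, srv and owner are left empty (NaN) for them.
out = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
print(out)
```

With this stand-in every one of the four input rows contributes four new rows. Which rows actually produce a split depends on what the real splitter().split returns.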

