Tuesday, 9 November 2021

How to add a percentage computation in pandas result

I have the following working code and need to add a percentage column to monitor changes. I don't know much about how to do this in pandas, so I'm looking for ideas on which part needs to be modified.

import pandas as pd
dl = []
with open('sampledata.txt') as f:
    for line in f:
        parts = line.split()
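        # Clean the data: rows without a "($...)" amount get a placeholder '0'.
        if not parts[3].startswith('($'):
            parts.insert(3, '0')
        # Re-join a trailing label that contains spaces (e.g. "ABC DSW2S").
        if len(parts) > 5:
            temp = ' '.join(parts[4:])
            parts = parts[:4] + [temp]
        # Convert the numeric fields to int/float.
        parts[1] = int(parts[1])
        parts[2] = float(parts[2].replace(',', ''))  # drop thousands separators
        parts[3] = float(parts[3].strip('($)'))      # "($250.45)" -> 250.45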
        dl.append(parts)
headers = ['col1', 'col2', 'col3', 'col4', 'col5']
df = pd.DataFrame(dl,columns=headers)
df = df.groupby(['col1','col5']).sum().reset_index()
df = df.sort_values('col2',ascending=False)
df['col4'] =  '($' + df['col4'].astype(str) + ')'
df = df[headers]
print(df)

sampledata.txt #-- Sample Data Source file

alpha   1   54,00.01                    ABC DSW2S
bravo   3   500,000.00                  ACDEF
charlie 1   27,722.29 ($250.45)         DGAS-CAS
delta   2   11 ($10)                    SWSDSASS-CCSSW
echo    5   143,299.00 ($101)           ACS34S1
lima    6   45.00181 ($38.9)            FGF5GGD-DDD
falcon  3   0.1234                      DSS2SFS3
echo    8   145,300 ($125.01)           ACS34S1
charlie 10  252,336,733.383 ($492.06)   DGAS-CAS
romeo   12  980                         ASDS SSSS SDSD
falcon  5   9.19                        DSS2SFS3

Current Output: #-- working result

      col1  col2          col3       col4            col5
4     echo    13  2.885990e+05  ($226.01)         ACS34S1
7    romeo    12  9.800000e+02     ($0.0)  ASDS SSSS SDSD
2  charlie    11  2.523645e+08  ($742.51)        DGAS-CAS
5   falcon     8  9.313400e+00     ($0.0)        DSS2SFS3
6     lima     6  4.500181e+01    ($38.9)     FGF5GGD-DDD
1    bravo     3  5.000000e+05     ($0.0)           ACDEF
3    delta     2  1.100000e+01    ($10.0)  SWSDSASS-CCSSW
0    alpha     1  5.400010e+03     ($0.0)       ABC DSW2S

Desired Output: #-- with an additional column for %

      col1  col2          col3       col4            col5    col6
4     echo    13  2.885990e+05  ($226.01)         ACS34S1     60%  #-- (5 + 8) = 13
7    romeo    12  9.800000e+02     ($0.0)  ASDS SSSS SDSD      0%
2  charlie    11  2.523645e+08  ($742.51)        DGAS-CAS    900%  #-- (1 + 10) = 11
5   falcon     8  9.313400e+00     ($0.0)        DSS2SFS3  66.67%  #-- (3 + 5) = 8
6     lima     6  4.500181e+01    ($38.9)     FGF5GGD-DDD      0%
1    bravo     3  5.000000e+05     ($0.0)           ACDEF      0%
3    delta     2  1.100000e+01    ($10.0)  SWSDSASS-CCSSW      0%
0    alpha     1  5.400010e+03     ($0.0)       ABC DSW2S      0%
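
One possible way to get such a column, as a rough sketch only. It assumes the percentage is the change in col2 from the first to the last row of each (col1, col5) group, which is inferred from the numbers above (echo: 5 to 8 is 60%, charlie: 1 to 10 is 900%) and not stated anywhere. The `rows` list and the helper `pct_change_first_to_last` are hypothetical stand-ins for the cleaned data, not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned rows (col1, col2, col5 only),
# kept in the same order as sampledata.txt.
rows = [
    ("alpha", 1, "ABC DSW2S"),
    ("bravo", 3, "ACDEF"),
    ("charlie", 1, "DGAS-CAS"),
    ("delta", 2, "SWSDSASS-CCSSW"),
    ("echo", 5, "ACS34S1"),
    ("lima", 6, "FGF5GGD-DDD"),
    ("falcon", 3, "DSS2SFS3"),
    ("echo", 8, "ACS34S1"),
    ("charlie", 10, "DGAS-CAS"),
    ("romeo", 12, "ASDS SSSS SDSD"),
    ("falcon", 5, "DSS2SFS3"),
]
raw = pd.DataFrame(rows, columns=["col1", "col2", "col5"])


def pct_change_first_to_last(s):
    """Assumed formula: percent change from the group's first to last value."""
    first, last = s.iloc[0], s.iloc[-1]
    # A group with a single row yields 0; a first value of 0 would need
    # special handling, which this sketch does not attempt.
    return (last - first) * 100 / first


# Named aggregation: sum col2 as before and compute the change alongside it.
summary = (
    raw.groupby(["col1", "col5"])["col2"]
       .agg(col2="sum", col6=pct_change_first_to_last)
       .reset_index()
       .sort_values("col2", ascending=False)
)

# Format as text, e.g. 66.666... -> "66.67%", 60.0 -> "60%".
summary["col6"] = summary["col6"].round(2).map("{:g}%".format)
print(summary)
```

In the original script the percentage would have to be computed before the `groupby(...).sum()` step collapses the rows, for example by building `summary[["col1", "col5", "col6"]]` as above and merging it into `df` on `col1` and `col5`.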

