Tuesday, 23 March 2021

Fill NaN using group-by on other columns in Spark

Data

  Col1 Col2  result
0    a    x   123.0
1    a    y     NaN
2    a    x   453.0
3    a    y   675.0
4    b    z   786.0
5    b    z   332.0

I want to fill the NaN in row 1 with 675.0: group by Col1, then by Col2, and fill the missing value from the other rows in the same group.

In Pandas

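# Backward-fill within each (Col1, Col2) group
df['result'] = df['result'].fillna(df.groupby(['Col1', 'Col2'])['result'].bfill())

# Forward-fill whatever is still missing within each group
df['result'] = df['result'].fillna(df.groupby(['Col1', 'Col2'])['result'].ffill())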
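
To make the pandas behaviour reproducible, here is a small self-contained sketch that rebuilds the sample data and applies the same two fills. It assumes the column names are Col1 and Col2 as in the table, and that the missing value is a float NaN (numpy.nan).

```python
import numpy as np
import pandas as pd

# Sample data from the table (column names assumed to be Col1 / Col2)
df = pd.DataFrame({
    "Col1": ["a", "a", "a", "a", "b", "b"],
    "Col2": ["x", "y", "x", "y", "z", "z"],
    "result": [123.0, np.nan, 453.0, 675.0, 786.0, 332.0],
})

# Backward-fill inside each (Col1, Col2) group, then forward-fill the rest
df["result"] = df["result"].fillna(df.groupby(["Col1", "Col2"])["result"].bfill())
df["result"] = df["result"].fillna(df.groupby(["Col1", "Col2"])["result"].ffill())

print(df)
```

Row 1 belongs to the group (a, y), whose only other row is row 3, so the backward fill picks up its value.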

How can I implement this in PySpark?



