I have a dataframe df that corresponds to the edge list of a network, together with the values of the nodes themselves, like the following:
df
   node_i  node_j  value_i  value_j
0       3       4       89       33
1       3       2       89      NaN
2       3       5       89       69
3       0       2       45      NaN
4       0       3       45       89
5       1       2      109      NaN
6       1       8      109      NaN
I want to add a column w that equals value_j when that value is present. If value_j is NaN, I would like to set w to the average of the values of the nodes adjacent to node_i. If node_i has only adjacent nodes with NaN values, set w = 1.
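The rule can be sketched for the neighbours of a single node as a small helper. The name fill_w is hypothetical and used only for illustration; the sketch assumes the value_j entries belonging to one node_i arrive as a pandas Series.

```python
import numpy as np
import pandas as pd

def fill_w(values: pd.Series) -> pd.Series:
    """Return w for the neighbours of one node (hypothetical helper)."""
    avg = values.mean()          # Series.mean skips NaN by default
    if pd.isna(avg):             # every neighbour value is NaN
        return pd.Series(1.0, index=values.index)
    return values.fillna(avg)    # keep known values, fill gaps with the mean

# Neighbours of node 3 in the example: 33, NaN, 69
print(fill_w(pd.Series([33, np.nan, 69])).tolist())

# Neighbours of node 1 in the example: all NaN
print(fill_w(pd.Series([np.nan, np.nan])).tolist())
```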
So the final dataframe should look like the following:
df
   node_i  node_j  value_i  value_j   w
0       3       4       89       33  33
1       3       2       89      NaN  51  # average of adjacent nodes
2       3       5       89       69  69
3       0       2       45      NaN  89  # average of adjacent nodes
4       0       3       45       89  89
5       1       2      109      NaN   1  # 1
6       1       8      109      NaN   1  # 1
I am doing a loop like the following, but I would like to use apply:
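import numpy as np
import pandas as pd

nodes = pd.unique(df['node_i'])
df['w'] = 0.0
for i in nodes:
    idx = df.index[df['node_i'] == i]        # rows whose source node is i
    avg_w = df.loc[idx, 'value_j'].mean()    # mean of the non-NaN neighbour values
    if np.isnan(avg_w):
        df.loc[idx, 'w'] = 1                 # every neighbour value is NaN
    else:
        # keep value_j where present, replace NaN with the group mean
        df.loc[idx, 'w'] = df.loc[idx, 'value_j'].fillna(avg_w)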
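One possible way to avoid the explicit loop is sketched below. Note that it uses groupby with transform rather than apply, which is a substitution on my part and not something the loop above does. The frame is rebuilt from the example so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    'node_i':  [3, 3, 3, 0, 0, 1, 1],
    'node_j':  [4, 2, 5, 2, 3, 2, 8],
    'value_i': [89, 89, 89, 45, 45, 109, 109],
    'value_j': [33, np.nan, 69, np.nan, 89, np.nan, np.nan],
})

# Mean of value_j per node_i, broadcast back to every row (NaN is skipped;
# a group with only NaN values yields NaN)
group_mean = df.groupby('node_i')['value_j'].transform('mean')

# Use value_j where present, else the group mean, else 1
df['w'] = df['value_j'].fillna(group_mean).fillna(1)
print(df)
```

The two chained fillna calls mirror the three cases in the description: a known value_j, a NaN with at least one known neighbour value, and a node whose neighbours are all NaN.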
from Python: how to replace NaN with conditions in a dataframe?