Is there a way to do a group-by aggregation over multiple columns in NumPy? I'm trying to do it with the numpy-groupies module (https://github.com/ml31415/numpy-groupies). The goal is a group-by that is faster than pandas. For example:
import numpy as np
from numpy_groupies import aggregate

# one row of group keys per grouping column
group_idx = np.array([
    np.array([4, 3, 3, 4, 4, 1, 1, 1, 7, 8, 7, 4, 3, 3, 1, 1]),
    np.array([4, 3, 2, 4, 7, 1, 4, 1, 7, 8, 7, 2, 3, 1, 14, 1]),
    np.array([1, 2, 3, 4, 5, 1, 1, 2, 3, 4, 5, 4, 2, 3, 1, 1])
])
# values to aggregate, one per entry of each key row
a = np.array([1, 2, 1, 2, 1, 2, 1, 2, 3, 4, 5, 4, 2, 3, 1, 1])
result = aggregate(group_idx, a, func='sum')
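The result should match what pandas returns for df.groupby(['column1','column2','column3']).sum().reset_index(), that is, one row per unique combination of the three key columns together with the summed value.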
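One possible way to get that pandas-style output with plain NumPy is sketched below. It is not taken from numpy-groupies; it assumes the keys are integers, reads the `14 1` in the second key row as `14, 1`, and uses a hypothetical helper name `groupby_sum`. The idea is to let `np.unique` with `axis=0` assign one integer label per unique key combination, then sum the values per label with `np.bincount`.

```python
import numpy as np

# Key columns from the question, one row per grouping column.
# Assumption: the "14 1" in the second row is read as "14, 1".
keys = np.array([
    [4, 3, 3, 4, 4, 1, 1, 1, 7, 8, 7, 4, 3, 3, 1, 1],
    [4, 3, 2, 4, 7, 1, 4, 1, 7, 8, 7, 2, 3, 1, 14, 1],
    [1, 2, 3, 4, 5, 1, 1, 2, 3, 4, 5, 4, 2, 3, 1, 1],
])
a = np.array([1, 2, 1, 2, 1, 2, 1, 2, 3, 4, 5, 4, 2, 3, 1, 1])


def groupby_sum(keys, values):
    """Hypothetical helper: sum `values` per unique column of `keys`."""
    # Transpose so each row is one (column1, column2, column3) tuple,
    # then label every row with the index of its unique tuple.
    unique_keys, inverse = np.unique(keys.T, axis=0, return_inverse=True)
    # Flatten defensively; some NumPy versions return a 2-D inverse here.
    inverse = inverse.ravel()
    # bincount adds up the weights that share a label (result is float64).
    sums = np.bincount(inverse, weights=values, minlength=len(unique_keys))
    return unique_keys, sums


unique_keys, sums = groupby_sum(keys, a)
# One row per group: the three key columns followed by the sum.
result = np.column_stack([unique_keys, sums.astype(a.dtype)])
print(result)
```

The rows come out sorted by key, much like the pandas call above. Whether this is faster than pandas depends on the data; `np.unique` sorts the keys, so it is worth timing on real input. The 1-D labels in `inverse` should also be usable as a flat `group_idx` for `aggregate`, which avoids the dense multi-dimensional output.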