Tuesday 27 October 2020

How to speed up array masking from the results of numpy.searchsorted in Python?

I want to generate a mask from the results of numpy.searchsorted():

import numpy as np

# generate test examples
x = np.random.rand(1000000)
y = np.random.rand(200)

# sort x
idx = np.argsort(x)
sorted_x = np.take_along_axis(x, idx, axis=-1)

# searchsort y in x
pt = np.searchsorted(sorted_x, y)

pt is an array of 200 insertion indices. I want to create a boolean mask of shape (200, 1000000) in which row i is True at the column indices idx[0:pt[i]]. I came up with a for-loop like this:

mask = np.zeros((200, 1000000), dtype='bool')
for i in range(200):
    mask[i, idx[0:pt[i]]] = True

Does anyone have an idea how to speed up the for-loop?
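
One possible way to drop the loop, sketched here under the assumption that the setup is exactly as above (1-D x, default side='left' in searchsorted): compute the rank of every element of x in sorted order, then compare ranks against pt by broadcasting. The helper name build_mask and the small test sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def build_mask(x, y):
    """Hypothetical helper: loop-free equivalent of the for-loop in the question."""
    idx = np.argsort(x)
    pt = np.searchsorted(x[idx], y)  # default side='left'

    # rank[j] is the position of x[j] in sorted order (inverse permutation of idx)
    rank = np.empty_like(idx)
    rank[idx] = np.arange(idx.size)

    # Column j belongs to idx[0:pt[i]] exactly when rank[j] < pt[i]
    return rank[None, :] < pt[:, None]

# Small sizes for a quick check; the question uses 1000000 and 200
rng = np.random.default_rng(0)
x = rng.random(1000)
y = rng.random(20)
mask = build_mask(x, y)
```

With the default side='left', pt[i] is the number of elements of x strictly less than y[i], so for data without NaNs the same mask should also come out of the direct comparison x[None, :] < y[:, None], with no sort at all. Either way the full-size result is a 200 x 1000000 boolean array (about 200 MB), so writing it out is likely to account for much of the remaining run time.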


