Tuesday, 31 July 2018

Why this numba code is 6x slower than numpy code?

Is there any reason why the following code run in 2s,

def euclidean_distance_square(x1, x2):
    return -2*np.dot(x1, x2.T) + np.expand_dims(np.sum(np.square(x1), axis=1), axis=1) + np.sum(np.square(x2), axis=1)

while the following numba code run in 12s?

@jit(nopython=True)
def euclidean_distance_square(x1, x2):
   return -2*np.dot(x1, x2.T) + np.expand_dims(np.sum(np.square(x1), axis=1), axis=1) + np.sum(np.square(x2), axis=1)

My x1 is a matrix of dimension (1, 512) and x2 is a matrix of dimension (3000000, 512). It is quite weird that numba can be so much slower. Am I using it wrong?

I really need to speed this up because I need to run this function 3 million times and 2s is still way too slow.

I need to run this on CPU because as you can see the dimension of x2 is so huge, it cannot be loaded onto a GPU (or at least my GPU), not enough memory.



from Why this numba code is 6x slower than numpy code?

No comments:

Post a Comment