Wednesday, 7 August 2019

Why does numpy not short-circuit on non-contiguous arrays?

Consider the following simple test:

import numpy as np
from timeit import timeit

a = np.random.randint(0,2,1000000,bool)

Let us find the index of the first True:

timeit(lambda:a.argmax(), number=1000)
# 0.000451055821031332

This is reasonably fast because numpy short-circuits.
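To see the short-circuiting at work, compare a case where the first True sits near the start of the array with one where it sits near the end (a quick sanity check of my own, not from the original post):

```python
import numpy as np
from timeit import timeit

n = 1_000_000
early = np.zeros(n, bool); early[10] = True    # first True near the start
late  = np.zeros(n, bool); late[-10] = True    # first True near the end

t_early = timeit(lambda: early.argmax(), number=200)
t_late  = timeit(lambda: late.argmax(),  number=200)
# With short-circuiting, t_early should be far smaller than t_late,
# since argmax can stop after inspecting only 11 elements.
```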

It also works on contiguous slices:

timeit(lambda:a[1:-1].argmax(), number=1000)
# 0.0006490410305559635

But not, it seems, on non-contiguous ones. I was mainly interested in finding the last True:

timeit(lambda:a[::-1].argmax(), number=1000)
# 0.3737605109345168
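As an aside, the reversed argmax gives an offset from the end, so it still needs converting back to an index into the original array; a small snippet of my own (which assumes a contains at least one True, since argmax returns 0 on an all-False array):

```python
import numpy as np

a = np.random.randint(0, 2, 1000000, bool)

# a[::-1].argmax() counts from the end of a, so flip it back
# (assumes a has at least one True)
last = len(a) - 1 - a[::-1].argmax()
```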

UPDATE: My assumption that the observed slowdown was due to the absence of short-circuiting is inaccurate (thanks, @Victor Ruiz). Indeed, even in the worst-case scenario of an all-False array

b=np.zeros_like(a)
timeit(lambda:b.argmax(), number=1000)
# 0.04321779008023441

we are still an order of magnitude faster than in the non-contiguous case. I'm ready to accept Victor's explanation that the actual culprit is a copy being made (the timings of forcing a copy with .copy() are suggestive). Once a copy has been made, it no longer matters whether short-circuiting happens or not.
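One way to probe the copy hypothesis is to time the copy on its own and compare it with the reversed argmax (my own check; if the copy dominates, the two timings should be of the same order):

```python
import numpy as np
from timeit import timeit

a = np.random.randint(0, 2, 1000000, bool)

t_argmax = timeit(lambda: a[::-1].argmax(), number=100)
t_copy   = timeit(lambda: a[::-1].copy(),   number=100)
# If making a contiguous copy is the real cost, t_copy should
# account for most of t_argmax.
```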

Other step sizes != 1 yield similar behavior:

timeit(lambda:a[::2].argmax(), number=1000)
# 0.19192566303536296

Question: Why does numpy not short-circuit (UPDATE: without making a copy) in the last two examples?

And, more importantly: is there a workaround, i.e. some way to force numpy to short-circuit (UPDATE: without making a copy) on non-contiguous arrays as well?
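One possible workaround, sketched under my own assumptions rather than taken from numpy: walk the array in contiguous blocks from the end, so that no full-size copy is ever made and all-False blocks are rejected with a cheap .any(). The helper name last_true_index and the block size are my own choices, not a numpy API:

```python
import numpy as np

def last_true_index(a, blocksize=4096):
    """Index of the last True in a 1-D bool array, or -1 if none.

    Scans contiguous slices from the end, so only a few KB are
    examined at a time and no full-size copy is made.
    (Hypothetical helper, not part of numpy.)
    """
    n = len(a)
    for stop in range(n, 0, -blocksize):
        start = max(0, stop - blocksize)
        block = a[start:stop]            # contiguous view, no copy
        if block.any():                  # reject all-False blocks cheaply
            return start + np.flatnonzero(block)[-1]
    return -1
```

For a random half-True array the last True almost always falls in the final block, so only a few thousand elements are touched instead of the whole million.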



