Saturday, 24 October 2020

Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?

I was intrigued by a blog post by Mike Croucher where he compared the time needed for the two functions numpy.zeros((N,N)) and numpy.empty((N,N)) for N=200 and N=1000. I ran a little loop in a jupyter notebook using the %timeitmagic. The graph below gives the ratio of the time needed for numpy.zero to numpy.empty. For N=346, numpy.zero is about 125 times slower than numpy.empty. At N=361 and up, both functions require almost the same amount of time.

Later, a discussion on Twitter led to the assumptions that either numpy does something special for small allocations to avoid a malloc call or that the OS might take the initiative to zero-out an allocated memory page.

What would be the cause of this difference for smaller N and the almost equal time needed for larger N?

enter image description here

Start of edit by Heap Overflow: I can reproduce it (that's why I got here in the first place), here's a plot for np.zeros and np.empty separately. The ratio would look like GertVdE's original plot:

enter image description here

Done with Python 3.9.0 64-bit, NumPy 1.19.2, Windows 10 Pro 2004 64-bit using this script to produce the data:

from timeit import repeat
import numpy as np

funcs = np.zeros, np.empty

number = 10
index = range(501)

# tsss[n][f] = list of times for shape (n, n) and function f, one time for each round.
tsss = [[[] for _ in funcs] for _ in index]

for round_ in range(10):
    print('Round', round_)
    for n, tss in zip(index, tsss):
        for func, ts in zip(funcs, tss):
            t = min(repeat(lambda: func((n, n)), number=number)) / number
            t = round(t * 1e6, 3)
            ts.append(t)
    
# bss[f][n] = best time for function f and shape (n, n).
bss = [[min(tss[f]) for tss in tsss]
       for f in range(len(funcs))]

print('tss =', bss)
print('index =', index)
print('names =', [func.__name__ for func in funcs])

And then this script (at colab) to plot:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.font_manager as font_manager
from google.colab import files

tss = ... (copied from above script's output)
index = range(0, 501)
names = ['np.zeros', 'np.empty']

df = pd.DataFrame(dict(zip(names, tss)), index=index)
ax = df.plot(ylim=0, grid=True)
ax.set(xlabel='n', ylabel='time in μs for shape (n, n)')
ax.legend(prop=font_manager.FontProperties(family='monospace'))
if 0:  # Make this true to create and download image files.
    plt.tight_layout()
    filename = f'np_zeros_vs_empty{cut}.png'
    ax.get_figure().savefig(filename, dpi=200)
    files.download(filename)

End of edit by Heap Overflow.



from Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?

No comments:

Post a Comment