Hemant Vishwakarma: Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?

Saturday, 24 October 2020

Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?

I was intrigued by a blog post by Mike Croucher where he compared the time needed for the two functions numpy.zeros((N,N)) and numpy.empty((N,N)) for N=200 and N=1000. I ran a little loop in a jupyter notebook using the %timeitmagic. The graph below gives the ratio of the time needed for numpy.zero to numpy.empty. For N=346, numpy.zero is about 125 times slower than numpy.empty. At N=361 and up, both functions require almost the same amount of time.

Later, a discussion on Twitter led to the assumptions that either numpy does something special for small allocations to avoid a malloc call or that the OS might take the initiative to zero-out an allocated memory page.

What would be the cause of this difference for smaller N and the almost equal time needed for larger N?

Start of edit by Heap Overflow: I can reproduce it (that's why I got here in the first place), here's a plot for np.zeros and np.empty separately. The ratio would look like GertVdE's original plot:

Done with Python 3.9.0 64-bit, NumPy 1.19.2, Windows 10 Pro 2004 64-bit using this script to produce the data:

from timeit import repeat
import numpy as np

funcs = np.zeros, np.empty

number = 10
index = range(501)

# tsss[n][f] = list of times for shape (n, n) and function f, one time for each round.
tsss = [[[] for _ in funcs] for _ in index]

for round_ in range(10):
    print('Round', round_)
    for n, tss in zip(index, tsss):
        for func, ts in zip(funcs, tss):
            t = min(repeat(lambda: func((n, n)), number=number)) / number
            t = round(t * 1e6, 3)
            ts.append(t)
    
# bss[f][n] = best time for function f and shape (n, n).
bss = [[min(tss[f]) for tss in tsss]
       for f in range(len(funcs))]

print('tss =', bss)
print('index =', index)
print('names =', [func.__name__ for func in funcs])

And then this script (at colab) to plot:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.font_manager as font_manager
from google.colab import files

tss = ... (copied from above script's output)
index = range(0, 501)
names = ['np.zeros', 'np.empty']

df = pd.DataFrame(dict(zip(names, tss)), index=index)
ax = df.plot(ylim=0, grid=True)
ax.set(xlabel='n', ylabel='time in μs for shape (n, n)')
ax.legend(prop=font_manager.FontProperties(family='monospace'))
if 0:  # Make this true to create and download image files.
    plt.tight_layout()
    filename = f'np_zeros_vs_empty{cut}.png'
    ax.get_figure().savefig(filename, dpi=200)
    files.download(filename)

End of edit by Heap Overflow.

from Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?

Hemant Vishwakarma

Saturday, 24 October 2020

Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?

No comments:

Post a Comment