I was intrigued by a blog post by Mike Croucher where he compared the time needed for the two functions numpy.zeros((N,N))
and numpy.empty((N,N))
for N=200
and N=1000
. I ran a little loop in a jupyter notebook using the %timeit
magic. The graph below gives the ratio of the time needed for numpy.zero
to numpy.empty
. For N=346
, numpy.zero
is about 125 times slower than numpy.empty
. At N=361
and up, both functions require almost the same amount of time.
Later, a discussion on Twitter led to the assumptions that either numpy
does something special for small allocations to avoid a malloc
call or that the OS might take the initiative to zero-out an allocated memory page.
What would be the cause of this difference for smaller N
and the almost equal time needed for larger N
?
Start of edit by Heap Overflow: I can reproduce it (that's why I got here in the first place), here's a plot for np.zeros
and np.empty
separately. The ratio would look like GertVdE's original plot:
Done with Python 3.9.0 64-bit, NumPy 1.19.2, Windows 10 Pro 2004 64-bit using this script to produce the data:
from timeit import repeat
import numpy as np
funcs = np.zeros, np.empty
number = 10
index = range(501)
# tsss[n][f] = list of times for shape (n, n) and function f, one time for each round.
tsss = [[[] for _ in funcs] for _ in index]
for round_ in range(10):
print('Round', round_)
for n, tss in zip(index, tsss):
for func, ts in zip(funcs, tss):
t = min(repeat(lambda: func((n, n)), number=number)) / number
t = round(t * 1e6, 3)
ts.append(t)
# bss[f][n] = best time for function f and shape (n, n).
bss = [[min(tss[f]) for tss in tsss]
for f in range(len(funcs))]
print('tss =', bss)
print('index =', index)
print('names =', [func.__name__ for func in funcs])
And then this script (at colab) to plot:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.font_manager as font_manager
from google.colab import files
tss = ... (copied from above script's output)
index = range(0, 501)
names = ['np.zeros', 'np.empty']
df = pd.DataFrame(dict(zip(names, tss)), index=index)
ax = df.plot(ylim=0, grid=True)
ax.set(xlabel='n', ylabel='time in μs for shape (n, n)')
ax.legend(prop=font_manager.FontProperties(family='monospace'))
if 0: # Make this true to create and download image files.
plt.tight_layout()
filename = f'np_zeros_vs_empty{cut}.png'
ax.get_figure().savefig(filename, dpi=200)
files.download(filename)
End of edit by Heap Overflow.
from Why is the speed difference between Python's Numpy zeros and empty functions gone for larger array sizes?
No comments:
Post a Comment