Friday 20 July 2018

cProfile implies significant overhead when calling numba jit functions

Compare a pure Python no-op function with a no-op function decorated with @numba.njit, that is:

import numba

@numba.njit
def boring_numba():
    pass

def call_numba(x):
    for t in range(x):
        boring_numba()

def boring_normal():
    pass

def call_normal(x):
    for t in range(x):
        boring_normal()

If we time this with %timeit, we get the following:

%timeit call_numba(int(1e7))

792 ms ± 5.51 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit call_normal(int(1e7))

737 ms ± 2.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

All perfectly reasonable; there's a small overhead for the numba function, but not much.
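For anyone reproducing this outside IPython, here is a minimal sketch of the same measurement using the standard library's timeit module. The helper name best_time is made up for illustration, and only the pure Python pair is defined here so the snippet runs without numba installed.

```python
import timeit


def boring_normal():
    pass


def call_normal(x):
    for t in range(x):
        boring_normal()


def best_time(func, arg, repeat=5):
    """Return the fastest wall-clock time (in seconds) of func(arg).

    Hypothetical helper: runs func(arg) once per measurement and keeps
    the minimum over `repeat` measurements.
    """
    return min(timeit.repeat(lambda: func(arg), number=1, repeat=repeat))


if __name__ == "__main__":
    print(f"call_normal: {best_time(call_normal, int(1e6)):.3f} s")
```

Passing call_numba instead of call_normal should give the numba figure, provided the jitted function is called once beforehand so that compilation time is not included in the measurement.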

If however we use cProfile to profile this code, we get the following:

import cProfile

cProfile.run('call_numba(int(1e7)); call_normal(int(1e7))', sort='cumulative')

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     76/1    0.003    0.000    8.670    8.670 {built-in method builtins.exec}
        1    6.613    6.613    7.127    7.127 experiments.py:10(call_numba)
        1    1.111    1.111    1.543    1.543 experiments.py:17(call_normal)
 10000000    0.432    0.000    0.432    0.000 experiments.py:14(boring_normal)
 10000000    0.428    0.000    0.428    0.000 experiments.py:6(boring_numba)
        1    0.000    0.000    0.086    0.086 dispatcher.py:72(compile)

cProfile thinks there is a massive overhead in calling the numba function. This extends to "real" code: I had a function that simply called my expensive computation (the computation being numba-JIT-compiled), and cProfile reported that the wrapper function was taking around a third of the total time.
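One way to check whether the reported time is real or comes from the profiler's own instrumentation is to measure the wall-clock time of the same call with and without cProfile active. The sketch below does that with time.perf_counter and cProfile.Profile.runcall; the helper name wall_time and its profiled flag are invented for this example, and it makes no claim about why the two numbers differ.

```python
import cProfile
import time


def wall_time(func, *args, profiled=False):
    """Return the wall-clock seconds taken by func(*args).

    Hypothetical helper: when profiled=True the call runs under a
    cProfile.Profile instance, so the result includes profiler overhead.
    """
    profiler = cProfile.Profile() if profiled else None
    start = time.perf_counter()
    if profiler is not None:
        profiler.runcall(func, *args)
    else:
        func(*args)
    return time.perf_counter() - start


def overhead_ratio(func, *args):
    """Ratio of profiled to unprofiled wall time for one call of func."""
    plain = wall_time(func, *args)
    with_profiler = wall_time(func, *args, profiled=True)
    # Guard against a zero denominator for extremely fast calls.
    return with_profiler / plain if plain > 0 else float("inf")
```

With the functions from the question, overhead_ratio(call_numba, int(1e7)) and overhead_ratio(call_normal, int(1e7)) can be compared directly. Going by the numbers above (792 ms under %timeit against 7.127 s cumulative under cProfile), the numba ratio would be expected to be several times larger than the pure Python one.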

I don't mind cProfile adding a bit of overhead, but if it's massively inconsistent about where it adds that overhead it's not very helpful. Does anyone know why this happens, whether there is anything that can be done about it, and/or if there are any alternative profiling tools that don't interact badly with numba?



