I am trying to broadcast the simple elementwise comparison `>` over two 3D arrays. One has dimensions (m, 1, n), the other (1, m, n). If I change the value of the third dimension (n), I would naively expect the computation time to scale linearly with n.
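For concreteness, a minimal sketch of the broadcast in question (the small values of m and n here are illustrative, not the ones timed below): the two size-1 axes are stretched against the size-m axes, so the comparison produces an (m, m, n) boolean array.

```python
import numpy as np

# Illustrative sizes standing in for m and n.
m, n = 4, 3
x = np.random.uniform(size=(m, 1, n))
y = np.random.uniform(size=(1, m, n))

# Broadcasting pairs each size-1 axis with the size-m axis of the
# other array, so the result has shape (m, m, n).
out = x > y
print(out.shape)  # (4, 4, 3)
```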
However, when I measure this explicitly, I find that the computation time increases by roughly a factor of 10 when n goes from 1 to 2, after which the scaling is linear.
Why does the computation time increase so drastically from n=1 to n=2? I assume it is an artifact of memory management in numpy, but I'm looking for more specifics.
The code is attached below with the resulting plot.
import numpy as np
import time
import matplotlib.pyplot as plt

def compute_time(n):
    # Two arrays whose size-1 axes broadcast against each other (m = 1000).
    x, y = (np.random.uniform(size=(1, 1000, n)),
            np.random.uniform(size=(1000, 1, n)))
    t = time.time()
    x > y
    return time.time() - t

# Average 100 trials for each n.
a = [[n, np.asarray([compute_time(n) for _ in range(100)]).mean()]
     for n in range(1, 30)]
a = np.asarray(a)
plt.plot(a[:, 0], a[:, 1])
plt.xlabel('n')
plt.ylabel('time (s)')  # time.time() returns seconds, not milliseconds
plt.show()
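As an aside on the measurement itself: a single time.time() delta per trial is coarse at these scales. A sketch of the same measurement using the standard-library timeit module (the helper name and the repeat count are my choices, not from the original):

```python
import numpy as np
import timeit

def broadcast_time(n, m=1000, repeats=100):
    """Mean time in seconds of one x > y broadcast, averaged over repeats."""
    x = np.random.uniform(size=(1, m, n))
    y = np.random.uniform(size=(m, 1, n))
    # timeit.timeit runs the callable `repeats` times and returns total time.
    return timeit.timeit(lambda: x > y, number=repeats) / repeats

# Example: compare n=1 against n=2 to look at the jump described above.
print(broadcast_time(1), broadcast_time(2))
```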
[Plot: time to broadcast the operation, as a function of n]
from Scaling of time to broadcast an operation on 3D arrays in numpy
