Tuesday, 12 November 2019

Why is np.dot imprecise? (n-dim arrays)

Suppose we take np.dot of two 'float32' 2D arrays:

res = np.dot(a, b)   # see CASE 1
print(list(res[0]))  # list shows more digits
[-0.90448684, -1.1708503, 0.907136, 3.5594249, 1.1374011, -1.3826287]

Numbers. Except, they can change:


CASE 1: slice a

import numpy as np

np.random.seed(1)
a = np.random.randn(9, 6).astype('float32')
b = np.random.randn(6, 6).astype('float32')

for i in range(1, len(a)):
    print(list(np.dot(a[:i], b)[0])) # full shape: (i, 6)
[-0.9044868,  -1.1708502, 0.90713596, 3.5594249, 1.1374012, -1.3826287]
[-0.90448684, -1.1708503, 0.9071359,  3.5594249, 1.1374011, -1.3826288]
[-0.90448684, -1.1708503, 0.9071359,  3.5594249, 1.1374011, -1.3826288]
[-0.90448684, -1.1708503, 0.907136,   3.5594249, 1.1374011, -1.3826287]
[-0.90448684, -1.1708503, 0.907136,   3.5594249, 1.1374011, -1.3826287]
[-0.90448684, -1.1708503, 0.907136,   3.5594249, 1.1374011, -1.3826287]
[-0.90448684, -1.1708503, 0.907136,   3.5594249, 1.1374011, -1.3826287]
[-0.90448684, -1.1708503, 0.907136,   3.5594249, 1.1374011, -1.3826287]

Results differ, even though the printed row derives from the exact same numbers being multiplied.
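
As a hypothetical check (my addition, not part of the original observation), the drift can be quantified by comparing each float32 row against a float64 reference of the same product, with a and b as defined in CASE 1 above:

ref = np.dot(a.astype('float64'), b.astype('float64'))[0]  # float64 reference row
for i in range(1, len(a)):
    row = np.dot(a[:i], b)[0].astype('float64')
    print(i, np.abs(row - ref).max())  # worst-case absolute deviation per slice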


CASE 2: flatten a, take a 1D version of b, then slice a:

np.random.seed(1)
a = np.random.randn(9, 6).astype('float32')
b = np.random.randn(1, 6).astype('float32')

for i in range(1, len(a)):
    a_flat = np.expand_dims(a[:i].flatten(), -1) # keep 2D
    print(list(np.dot(a_flat, b)[0])) # full shape: (i*6, 6)
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
[-0.3393164, 0.9528787, 1.3627989, 1.5124314, 0.46389243, 1.437775]
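
One way to see why CASE 2 stays consistent (my reading, not stated in the original): with inner dimension 1, each output element is a single float32 product, so there is no summation whose order could vary. A quick check that should hold exactly:

a_flat = np.expand_dims(a.flatten(), -1)  # shape (54, 1)
res = np.dot(a_flat, b)                   # shape (54, 6): an outer product
assert np.array_equal(res, a_flat * b)    # each element is one multiply; matches elementwise products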

CASE 3: stronger control; set all non-involved entries to zero by adding a[1:] = 0 to the CASE 1 code (sketch below). Result: discrepancies persist.
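
A sketch of the CASE 3 setup, reconstructed from the description above (the original did not show the code):

np.random.seed(1)
a = np.random.randn(9, 6).astype('float32')
b = np.random.randn(6, 6).astype('float32')
a[1:] = 0  # zero every row not involved in computing res[0]

for i in range(1, len(a)):
    print(list(np.dot(a[:i], b)[0]))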


CASE 4: check indices other than [0]; as with [0], results begin to stabilize a fixed number of array enlargements past their point of creation (output omitted):

np.random.seed(1)
a = np.random.randn(9, 6).astype('float32')
b = np.random.randn(6, 6).astype('float32')

for j in range(len(a) - 2):
    for i in range(1, len(a)):
        res = np.dot(a[:i], b)
        try:
            print(list(res[j]))
        except IndexError:  # res has fewer than j+1 rows when i <= j
            pass
    print()

Hence, for the 2D * 2D case, results differ, but they are consistent for 1D * 1D. From some of my readings, this appears to stem from 1D-1D using simple sequential addition, whereas 2D-2D uses 'fancier', performance-boosting addition that may be less precise (pairwise addition, by contrast, does the opposite and improves precision). Nonetheless, I cannot understand why the discrepancies in CASE 1 vanish once a is sliced past a set 'threshold'; the larger a and b, the later this threshold seems to lie, but it always exists.
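
To illustrate the mechanism, here is a minimal sketch of my own (not from the question) showing that accumulation order alone changes a float32 sum; pairwise (divide-and-conquer) summation yields a different, typically more accurate result than naive sequential summation:

np.random.seed(1)
x = np.random.randn(10000).astype('float32')

# Naive left-to-right accumulation in float32
naive = np.float32(0)
for v in x:
    naive = np.float32(naive + v)

# Pairwise (divide-and-conquer) accumulation in float32
def pairwise(arr):
    if arr.size == 1:
        return arr[0]
    mid = arr.size // 2
    return np.float32(pairwise(arr[:mid]) + pairwise(arr[mid:]))

print(naive)                      # sequential float32 sum
print(pairwise(x))                # pairwise float32 sum
print(x.astype('float64').sum())  # float64 reference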

All said: why is np.dot imprecise (and inconsistent) for ND-ND arrays?


Additional info:

  • Environment: Win-10 OS, Python 3.7.4, Spyder 3.3.6 IDE, Anaconda 3.0 2019/10
  • CPU: i7-7700HQ 2.8 GHz
  • Numpy v1.16.5

Possible culprit library: NumPy's MKL build - also BLAS libraries; thanks to Bi Rico for noting
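
To check which backend a given build links against, np.show_config() prints the detected BLAS/LAPACK sections (e.g. blas_mkl_info for MKL builds, openblas_info for OpenBLAS):

import numpy as np

np.show_config()  # lists the BLAS/LAPACK libraries this NumPy build was compiled against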


Stress-test code: as noted, discrepancies grow more frequent with larger arrays; if the above isn't reproducible, the below should be (if not, try larger dims). My output is omitted here.

np.random.seed(1)
a = (0.01*np.random.randn(9, 9999)).astype('float32') # first multiply then type-cast
b = (0.01*np.random.randn(9999, 6)).astype('float32') # *0.01 to bound mults to < 1

for i in range(1, len(a)):
    print(list(np.dot(a[:i], b)[0]))

Problem severity: the discrepancies shown are 'small', but no longer so when operating on a neural network, with billions of numbers multiplied over a few seconds and trillions over the entire runtime; reported model accuracy differs by entire tens of percents, per this thread.
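
As a toy illustration of compounding (a hypothetical sketch of my own, not the model from the thread), running the same small stack of matmuls in float32 vs. float64 shows the per-op error snowballing:

np.random.seed(1)
w32 = (0.1 * np.random.randn(64, 64)).astype('float32')
w64 = w32.astype('float64')
x32 = np.ones((1, 64), dtype='float32')
x64 = np.ones((1, 64), dtype='float64')

for _ in range(100):  # 100 'layers' of matmul + nonlinearity
    x32 = np.tanh(np.dot(x32, w32))
    x64 = np.tanh(np.dot(x64, w64))

print(np.abs(x32.astype('float64') - x64).max())  # accumulated drift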

Below is a gif of arrays resulting from feeding a model what is basically a[0], with len(a)==1 vs. len(a)==32:

[gif omitted]

OTHER PLATFORMS results, per (and with thanks to) Paul's testing:

Case 1 reproduced (partly):

  • Google Colab VM -- Intel Xeon 2.3 GHz -- Jupyter -- Python 3.6.8
  • Win-10 Pro Docker Desktop -- Intel i7-8700K -- jupyter/scipy-notebook -- Python 3.7.3
  • Ubuntu 18.04.2 LTS + Docker -- AMD FX-8150 -- jupyter/scipy-notebook -- Python 3.7.3

Note: these yield much lower error than shown above; two entries on the first row are off by 1 in the least significant digit from corresponding entries in the other rows.

Case 1 not reproduced:

  • Ubuntu 18.04.3 LTS -- Intel i7-8700K -- IPython 5.5.0 -- Python 2.7.15+ and 3.6.8 (2 tests)
  • Ubuntu 18.04.3 LTS -- Intel i5-3320M -- IPython 5.5.0 -- Python 2.7.15+
  • Ubuntu 18.04.2 LTS -- AMD FX-8150 -- IPython 5.5.0 -- Python 2.7.15rc1

Notes:

  • The linked Colab notebook and Jupyter environments show a far smaller discrepancy (and only for the first two rows) than is observed on my system. Also, Case 2 never (yet) showed imprecision.
  • Within this very limited sample, the current (Dockerized) Jupyter environment is more susceptible than the IPython environment.
  • np.show_config() is too long to post here, but in short: it is very bare-bones in the IPython environments (plain BLAS/LAPACK based), and OpenBLAS-based on Google Colab. In the IPython Linux environments the BLAS libraries are system-installed, whereas in the Jupyter and Colab environments they come from /opt/conda/lib.


from Why is np.dot imprecise? (n-dim arrays)
