I have gone through similar questions that have been asked before (for example [1] [2]). However, none of them is completely relevant to my problem.
I am trying to calculate a dot product between two large matrices under a memory constraint that I have to meet.
I have a sparse matrix, stored as a dense numpy array, of shape (10000, 600000). For example,
from scipy import sparse as sps
x = sps.random(m=10000, n=600000, density=0.1).toarray()
The second numpy matrix has shape (600000, 256) and consists only of -1 and 1 values.
import numpy as np
y = np.random.choice([-1,1], size=(600000, 256))
I need the dot product of x and y at the lowest possible memory cost. Speed is not the primary concern.
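For scale, here is my rough back-of-the-envelope memory budget (my own arithmetic, assuming float64 for x and the int64 that np.random.choice returns by default on most platforms):
# dense x: 10000 x 600000 float64
print(10000 * 600000 * 8 / 1e9)   # 48.0 GB
# x as CSR at density 0.1: 6e8 nonzeros, 8-byte values + 4-byte int32 column indices
print(int(6e8) * (8 + 4) / 1e9)   # 7.2 GB, plus a tiny indptr array
# dense y: 600000 x 256 int64
print(600000 * 256 * 8 / 1e9)     # ~1.23 GB (~0.15 GB if cast to int8)
# output z: 10000 x 256 float64
print(10000 * 256 * 8 / 1e6)      # ~20.5 MB
So the dominant cost is the dense copy of x itself.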
Here is what I have tried so far:
Scipy Sparse Format:
Naturally, I converted the numpy array to a scipy csr_matrix. However, the task still gets killed due to a memory issue. There is no error; the process just gets killed on the terminal.
from scipy import sparse as sps
sparse_x = sps.csr_matrix(x, copy=False)
z = sparse_x.dot(y)
# killed
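One thing I notice (my own observation): converting to csr_matrix only helps after the 48 GB dense array already exists, so the peak memory includes both representations. If the data can be produced in sparse form from the start, the dense detour disappears. A minimal sketch for the synthetic example above:
from scipy import sparse as sps
# build directly in CSR form; the 48 GB dense array is never materialized
sparse_x = sps.random(m=10000, n=600000, density=0.1, format="csr")
z = sparse_x.dot(y)
For real data the equivalent would be loading it incrementally into a COO/CSR structure rather than going through a dense array first.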
Decreasing dtype precision + Scipy Sparse Format:
from scipy import sparse as sps
x = x.astype("float16", copy=False)
y = y.astype("int8", copy=False)
sparse_x = sps.csr_matrix(x, copy=False)
z = sparse_x.dot(y)
# Increases the memory requirement for some reason and dies
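My guess (an assumption I have not verified) is that scipy's sparse routines upcast dtypes they do not natively support, such as float16, so the conversion or the dot ends up storing larger temporaries than expected. Checking what the CSR matrix actually holds makes this visible:
# inspect what the CSR matrix really stores after the conversion
for name in ("data", "indices", "indptr"):
    arr = getattr(sparse_x, name)
    print(name, arr.dtype, arr.nbytes / 1e9, "GB")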
np.einsum:
Not sure if it helps/works with sparse matrices. I found something interesting in this answer. However, the following doesn't help either:
z = np.einsum('ij,jk->ik', x, y)
# similar memory requirement as the scipy sparse dot
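As far as I know, np.einsum does not dispatch to scipy sparse matrices, so x has to stay dense here, which would explain the similar footprint. The only small saving I can see is preallocating the (tiny) output via the out= argument:
z = np.empty((x.shape[0], y.shape[1]), dtype=np.result_type(x, y))
np.einsum('ij,jk->ik', x, y, out=z)  # writes into z instead of allocating a new array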
Suggestions?
If you have any suggestions to improve any of these, please let me know. Further, I am thinking in the following directions:
- It would be great if I could get rid of the dot product itself somehow. My second matrix (i.e. y) is randomly generated and contains only -1 and 1. I am hoping there is a way to take advantage of that structure.
- Maybe dividing the dot product into several small dot products and then aggregating the results, as sketched below.
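To make the second idea concrete, here is a minimal sketch of what I mean by chunking (chunked_dot and chunk_rows are illustrative names): only one small block of rows is multiplied at a time, and each partial result is written straight into a preallocated output, so the transient memory is bounded by one chunk:
import numpy as np

def chunked_dot(x, y, chunk_rows=1000):
    # x: sparse (m, n) matrix; y: dense (n, k) array
    m, k = x.shape[0], y.shape[1]
    out = np.empty((m, k), dtype=np.result_type(x.dtype, y.dtype))
    for start in range(0, m, chunk_rows):
        stop = min(start + chunk_rows, m)
        # small dot on a row slice, written directly into the output
        out[start:stop] = x[start:stop].dot(y)
    return out

z = chunked_dot(sparse_x, y)
Since the full output is only ~20 MB here, the chunking mainly bounds the intermediate buffers rather than the result itself.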