Monday, 24 May 2021

ALS algorithm in Dask optimization

I am trying to implement the ALS algorithm in Dask, but I am having trouble figuring out how to compute the latent features in one step. What I have so far calculates each row of the matrix iteratively:

    import dask.array as da

    # Update each user's factors from the items that user rated (mask Ri).
    for i, Ri in enumerate(R):
        A = da.dot(Items, da.dot(da.diag(Ri), Items.T)) + lambda_ * da.eye(n_factors)
        b = da.dot(Items, da.dot(da.diag(Ri), X_train[:, i]))
        Users[i] = da.linalg.solve(A, b).compute()

    # Update each item's factors from the users who rated it (mask Rj).
    for j, Rj in enumerate(R.T):
        A = da.dot(Users.T, da.dot(da.diag(Rj), Users)) + lambda_ * da.eye(n_factors)
        b = da.dot(Users.T, da.dot(da.diag(Rj), X_train[j]))
        Items[:, j] = da.linalg.solve(A, b).compute()

Is it possible to compute the Users and Items matrices without the for loops? The approach above is very slow and inefficient.
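One way the two loops might be collapsed, sketched here in plain NumPy rather than Dask: build every per-user (or per-item) normal-equation matrix in one `einsum` call and pass the whole stack to `np.linalg.solve`, which broadcasts over the leading axis. The helper names `update_users` and `update_items` are hypothetical, and the shapes are assumptions read off the indexing in the loops: `X_train` is items x users, `R` is users x items, `Users` is users x factors and `Items` is factors x items.

```python
import numpy as np

def update_users(Items, R, X_train, lambda_):
    """Solve all per-user regularized systems at once.

    Returns an array of shape (n_users, n_factors).
    """
    n_factors = Items.shape[0]
    # A[i] = Items @ diag(R[i]) @ Items.T + lambda_ * I, for every user i
    A = np.einsum("fj,ij,gj->ifg", Items, R, Items) + lambda_ * np.eye(n_factors)
    # b[i] = Items @ diag(R[i]) @ X_train[:, i]
    b = np.einsum("fj,ij,ji->if", Items, R, X_train)
    # Batched solve over the leading (user) axis
    return np.linalg.solve(A, b[..., None])[..., 0]

def update_items(Users, R, X_train, lambda_):
    """Solve all per-item regularized systems at once.

    Returns an array of shape (n_factors, n_items).
    """
    n_factors = Users.shape[1]
    # A[j] = Users.T @ diag(R[:, j]) @ Users + lambda_ * I, for every item j
    A = np.einsum("if,ij,ig->jfg", Users, R, Users) + lambda_ * np.eye(n_factors)
    # b[j] = Users.T @ diag(R[:, j]) @ X_train[j]
    b = np.einsum("if,ij,ji->jf", Users, R, X_train)
    return np.linalg.solve(A, b[..., None])[..., 0].T

# Small random problem with the sizes from the question (6 users, 4 items, 2 factors)
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 6, size=(4, 6)).astype(float)   # items x users
R_demo = (X_demo.T > 0).astype(float)                    # users x items mask
Users_demo = rng.random((6, 2))
Items_demo = rng.random((2, 4))

Users_new = update_users(Items_demo, R_demo, X_demo, 0.1)
Items_new = update_items(Users_new, R_demo, X_demo, 0.1)
```

Dask provides `da.einsum`, so the `A` and `b` stacks could probably be built lazily in the same way; the batched solve would then have to run per chunk of users or items (for example through `da.map_blocks`), which is not tested here.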

Example input:

n_factors = 2
lambda_ = 0.1
# We have 6 users and 4 items

The matrices X_train (4x6, items x users), R (6x4, users x items), Users (6x2) and Items (2x4) look like this:

X_train:
1  0  0  0  5  2
0  0  0  0  4  0
0  3  0  0  4  0
0  3  0  0  0  0

R:
1 0 0 0
0 0 1 1
0 0 0 0
0 0 0 0
1 1 1 0
1 0 0 0

Users:
0.8  1.3
3.9  4.3
2.9  1.5
0.2  4.7
0.9  1.1
4.8  3.0

Items:
1.1  0.2  4.1  1.6
3.5  2.7  4.3  0.5

EDIT 1: My first guess is that I need to prepare da.diag(Ri) and da.diag(Rj) in advance, but I don't see how to do that in a way that yields the same result.
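A small check of one way the diagonal matrices might be avoided altogether: multiplying by `diag(Ri)` on the right is the same as scaling the columns by `Ri`, so broadcasting can stand in for it, and adding a leading axis applies every user's mask at once. This is a NumPy sketch with random `Items`; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
Items = rng.random((2, 4))                    # (n_factors, n_items)
R = np.array([[1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0],
              [1, 1, 1, 0],
              [1, 0, 0, 0]], dtype=float)     # mask from the example input

# One user: explicit diagonal matrix versus column scaling by broadcasting
Ri = R[1]
with_diag = Items @ np.diag(Ri) @ Items.T
with_broadcast = (Items * Ri) @ Items.T

# All users at once: stack of masked copies of Items, shape (n_users, n_factors, n_items)
masked_items = Items[None, :, :] * R[:, None, :]
A_all = masked_items @ Items.T                # shape (n_users, n_factors, n_factors)

print(np.allclose(with_diag, with_broadcast))
print(np.allclose(A_all[1], with_diag))
```

The stacked form trades the loop for memory: `masked_items` holds one masked copy of `Items` per user, which may or may not be acceptable for a large matrix.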

EDIT 2: I followed the formulas in this Stack Overflow thread and came up with this code:

    Items = da.linalg.solve(da.add(da.dot(Users.T, Users), lambda_ * da.eye(n_factors)), 
                            da.dot(Users.T, X_train.T)).compute()
    Items = np.where(Items < 0, 0, Items)
    
    Users = da.linalg.solve(da.add(da.dot(Items, Items.T), lambda_ * da.eye(n_factors)), 
                            da.dot(Items, X_train)).T.compute()
    Users = np.where(Users < 0, 0, Users)

But I don't think this works correctly, because the MSE is not decreasing. Any help would be appreciated.
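For comparison, note that the EDIT 2 formulas drop the `diag(R)` masks, so each solve minimizes the regularized squared error over every entry of `X_train`, unrated zeros included, and the clipping with `np.where` then moves the factors away from that minimizer. A MSE measured on rated entries only is therefore not guaranteed to go down. Which MSE was measured is not stated in the post, so the sketch below computes both variants on the example input; the function names `masked_mse` and `full_mse` are hypothetical.

```python
import numpy as np

def masked_mse(X_train, R, Users, Items):
    """Mean squared error over rated entries only (where R == 1)."""
    pred = Users @ Items                  # (n_users, n_items)
    err = R * (X_train.T - pred)          # zero out unrated entries
    return float((err ** 2).sum() / R.sum())

def full_mse(X_train, Users, Items):
    """Mean squared error over every entry, treating unrated zeros as ratings of 0."""
    pred = Users @ Items
    return float(((X_train.T - pred) ** 2).mean())

# Example input from the question: X_train is items x users.
X_train = np.array([[1, 0, 0, 0, 5, 2],
                    [0, 0, 0, 0, 4, 0],
                    [0, 3, 0, 0, 4, 0],
                    [0, 3, 0, 0, 0, 0]], dtype=float)
R = (X_train.T > 0).astype(float)         # users x items mask
Users = np.array([[0.8, 1.3], [3.9, 4.3], [2.9, 1.5],
                  [0.2, 4.7], [0.9, 1.1], [4.8, 3.0]])
Items = np.array([[1.1, 0.2, 4.1, 1.6],
                  [3.5, 2.7, 4.3, 0.5]])

print("masked MSE:", masked_mse(X_train, R, Users, Items))
print("full MSE:  ", full_mse(X_train, Users, Items))
```

Tracking both numbers across iterations should show which objective the updates are actually reducing.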



