I am trying to implement the ALS algorithm in Dask, but I am having trouble figuring out how to compute the latent features in one step. What I have so far computes each row of the matrix iteratively:
    import dask.array as da

    # Update each user's factors from the items that user rated
    for i, Ri in enumerate(R):
        Users[i] = da.linalg.solve(
            da.add(da.dot(Items, da.dot(da.diag(Ri), Items.T)), lambda_ * da.eye(n_factors)),
            da.dot(Items, da.dot(da.diag(Ri), X_train[:, i].T))).T.compute()
    # Update each item's factors from the users who rated it
    for j, Rj in enumerate(R.T):
        Items[:, j] = da.linalg.solve(
            da.add(da.dot(Users.T, da.dot(da.diag(Rj), Users)), lambda_ * da.eye(n_factors)),
            da.dot(Users.T, da.dot(da.diag(Rj), X_train[j])))
Is it possible to compute the Users and Items matrices without the for loops? The approach above is very slow and inefficient.
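For reference, here is a minimal sketch of what a loop-free version of the two updates could look like. It is written in plain NumPy rather than Dask, and the helper names `update_users` and `update_items` are hypothetical. It assumes the shapes implied by the loops above (X_train is items x users, R is users x items, Users is users x factors, Items is factors x items) and relies on `np.linalg.solve` accepting a stack of systems. Whether this pattern carries over to `da.linalg.solve` is not checked here.

```python
import numpy as np

def update_users(X_train, R, Items, lambda_):
    """Batched version of the per-user loop; returns Users (n_users, n_factors)."""
    n_factors = Items.shape[0]
    # A[i] = Items @ diag(R[i]) @ Items.T + lambda_ * I, stacked over users i
    A = np.einsum("fk,ik,gk->ifg", Items, R, Items) + lambda_ * np.eye(n_factors)
    # b[i] = Items @ diag(R[i]) @ X_train[:, i]
    b = np.einsum("fk,ik,ki->if", Items, R, X_train)
    # np.linalg.solve treats the leading axis as a stack of independent systems
    return np.linalg.solve(A, b[..., None])[..., 0]

def update_items(X_train, R, Users, lambda_):
    """Batched version of the per-item loop; returns Items (n_factors, n_items)."""
    n_factors = Users.shape[1]
    # A[j] = Users.T @ diag(R[:, j]) @ Users + lambda_ * I, stacked over items j
    A = np.einsum("if,ij,ig->jfg", Users, R, Users) + lambda_ * np.eye(n_factors)
    # b[j] = Users.T @ diag(R[:, j]) @ X_train[j]
    b = np.einsum("if,ij,ji->jf", Users, R, X_train)
    return np.linalg.solve(A, b[..., None])[..., 0].T

# Example data from the question: 4 items, 6 users, 2 factors
lambda_ = 0.1
X_train = np.array([[1, 0, 0, 0, 5, 2],
                    [0, 0, 0, 0, 4, 0],
                    [0, 3, 0, 0, 4, 0],
                    [0, 3, 0, 0, 0, 0]], dtype=float)
R = (X_train > 0).T.astype(float)          # 1 where a rating is observed
Items = np.array([[1.1, 0.2, 4.1, 1.6],
                  [3.5, 2.7, 4.3, 0.5]])

Users = update_users(X_train, R, Items, lambda_)
Items_new = update_items(X_train, R, Users, lambda_)
```

Each `einsum` builds all the small n_factors x n_factors systems at once, so the Python-level loop disappears and only one batched solve per side remains.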
Example input:
    n_factors = 2
    lambda_ = 0.1
    # We have 6 users and 4 items

The matrices look like this (shapes are written rows x columns here, which is how the code above indexes them):

X_train (4x6):

    1 0 0 0 5 2
    0 0 0 0 4 0
    0 3 0 0 4 0
    0 3 0 0 0 0

R (6x4):

    1 0 0 0
    0 0 1 1
    0 0 0 0
    0 0 0 0
    1 1 1 0
    1 0 0 0

Users (6x2):

    0.8 1.3
    3.9 4.3
    2.9 1.5
    0.2 4.7
    0.9 1.1
    4.8 3.0

Items (2x4):

    1.1 0.2 4.1 1.6
    3.5 2.7 4.3 0.5
EDIT 1: My first guess is that I need to prepare da.diag(Ri) and da.diag(Rj) in advance. But how can I do that in a way that yields the same result?
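As a small illustration of the diag question: multiplying by diag(Ri) on the right only scales the columns of Items, so the diagonal matrix may not need to be built at all. The sketch below uses NumPy with one row of R from the example; the same elementwise product should carry over to Dask arrays, since they follow NumPy broadcasting, but that is an assumption and is not tested here.

```python
import numpy as np

Items = np.array([[1.1, 0.2, 4.1, 1.6],
                  [3.5, 2.7, 4.3, 0.5]])    # (n_factors, n_items)
Ri = np.array([0.0, 0.0, 1.0, 1.0])          # one user's row of R

# Original form: explicit diagonal matrix
with_diag = Items @ np.diag(Ri) @ Items.T

# Equivalent form: scale column k of Items by Ri[k] via broadcasting
without_diag = (Items * Ri) @ Items.T
```

Both expressions give the same n_factors x n_factors matrix; the second avoids allocating an n_items x n_items diagonal for every user.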
EDIT 2: I followed the formulas in this Stack Overflow thread and came up with this code:
    Items = da.linalg.solve(
        da.add(da.dot(Users.T, Users), lambda_ * da.eye(n_factors)),
        da.dot(Users.T, X_train.T)).compute()
    Items = np.where(Items < 0, 0, Items)
    Users = da.linalg.solve(
        da.add(da.dot(Items, Items.T), lambda_ * da.eye(n_factors)),
        da.dot(Items, X_train)).T.compute()
    Users = np.where(Users < 0, 0, Users)
But I don't think this works correctly, because the MSE is not decreasing. Any help would be appreciated.
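One thing visible in the EDIT 2 snippet is that R no longer appears, so those formulas fit every zero in X_train as if it were a rating. The clipping with np.where is also not part of the plain least-squares step, so a decrease after it may not be guaranteed. When checking convergence it may help to measure the error on observed entries only. Below is a minimal NumPy sketch; `masked_mse` and `full_mse` are hypothetical helper names, and the tiny matrices are made up purely for illustration.

```python
import numpy as np

def masked_mse(X_train, R, Users, Items):
    """Mean squared error over observed ratings only.

    X_train: (n_items, n_users); R: (n_users, n_items), 1 where observed;
    Users: (n_users, n_factors); Items: (n_factors, n_items).
    """
    pred = Users @ Items               # predicted ratings, (n_users, n_items)
    err = R * (X_train.T - pred)       # zero out the unobserved entries
    return (err ** 2).sum() / R.sum()

def full_mse(X_train, Users, Items):
    """Mean squared error over all entries, treating zeros as ratings."""
    return ((X_train.T - Users @ Items) ** 2).mean()

# Illustrative toy case: 1 item, 2 users, only the first user rated the item
X_toy = np.array([[2.0, 0.0]])
R_toy = np.array([[1.0], [0.0]])
Users_toy = np.array([[1.0], [1.0]])
Items_toy = np.array([[2.0]])

observed_error = masked_mse(X_toy, R_toy, Users_toy, Items_toy)
all_error = full_mse(X_toy, Users_toy, Items_toy)
```

Here the model reproduces the single observed rating exactly, so the masked error is zero, while the unmasked error still penalizes the prediction for the missing entry.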
from ALS algorithm in Dask optimization