Wednesday, 12 January 2022

Compute R^2 Score for Lasso Regression Against Specific Model in scikit-learn

I have fit regularized Lasso models on PolynomialFeatures of four degrees (1, 3, 7, 11) to the training data in scikit-learn. I generated predictions for 100 evenly spaced points on the interval [0, 20] and stored the results in a NumPy array. My task is to return the R^2 score for each of the Lasso models relative to a new 'gold standard' test set generated from the true underlying cubic polynomial model without noise. The original model, on which my code below is based, includes a noise term. I have to build this new test set by evaluating the true noiseless underlying function t^3/20 - t^2 - t at each of 100 evenly spaced points on the interval [0, 20], and ultimately select the degree whose R^2 indicates the best fit to that function. Here is my code so far:

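import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures

# X_train, X_test, y_train, y_test are assumed to be defined earlier (not shown)
degs = (1, 3, 7, 11)

las_r2 = []

preds = np.zeros((4, 100))

for i, deg in enumerate(degs):
    poly = PolynomialFeatures(degree=deg)
    X_poly = poly.fit_transform(X_train)
    linlasso = Lasso(alpha=0.01, max_iter=10000).fit(X_poly, y_train)
    # Predict on 100 evenly spaced points in [0, 20]; reuse the fitted transformer
    y_poly = linlasso.predict(poly.transform(np.linspace(0, 20, 100).reshape(-1, 1)))
    preds[i, :] = y_poly

    # R^2 on the (noisy) test set
    X_test_poly = poly.transform(X_test)
    las_r2.append(linlasso.score(X_test_poly, y_test))

# las_r2 is a plain list, so use the built-in max() rather than .max()
answer = max(las_r2)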
    

What I don't know is how to incorporate the "gold standard" function described above into my for-loop.
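
One way this could be approached, as a minimal sketch rather than a definitive solution: evaluate the noiseless function once on the same 100-point grid, then compare each model's predictions on that grid against it with r2_score. The training data below is a synthetic stand-in (an assumption, since the real X_train and y_train are not shown), and the names t, y_gold, gold_r2 and best_deg are illustrative, not part of the assignment.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.preprocessing import PolynomialFeatures

# ASSUMPTION: synthetic stand-in for the real training data (noisy cubic)
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 20, size=(60, 1))
x = X_train.ravel()
y_train = x**3 / 20 - x**2 - x + rng.normal(0, 3, size=60)

degs = (1, 3, 7, 11)

# Gold standard: the true function, without noise, on the prediction grid
t = np.linspace(0, 20, 100)
y_gold = t**3 / 20 - t**2 - t

gold_r2 = []
for deg in degs:
    poly = PolynomialFeatures(degree=deg)
    X_poly = poly.fit_transform(X_train)
    model = Lasso(alpha=0.01, max_iter=10000).fit(X_poly, y_train)
    # Predict on the same grid that y_gold was evaluated on
    y_pred = model.predict(poly.transform(t.reshape(-1, 1)))
    # r2_score takes the reference values first, then the predictions
    gold_r2.append(r2_score(y_gold, y_pred))

# Degree whose predictions track the noiseless function most closely
best_deg = degs[int(np.argmax(gold_r2))]
```

Here gold_r2 holds one R^2 value per degree and best_deg is the degree with the highest one. With unscaled high-degree features, Lasso may emit convergence warnings; the sketch leaves alpha and max_iter as in the question.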


