I'm trying to modify the sklearn VotingClassifier so it can be used in a RandomizedSearchCV. The idea is that, for a larger number of classifiers, the number of possible weight combinations explodes and is better represented by individual weight parameters than by many different tuples. This also enables a switch to a smarter hyperparameter-tuning method, since there is information in how each weight changes.
So how do I correctly subclass the VotingClassifier class? The code below results in either None being passed as the weights or the default being used, and the search complains that weights is not controlled by the constructor parameters (which it is).
class VotingClassifier2(VotingClassifier):
    def __init__(self, estimators, w1, w2, voting='soft', weights=None,
                 n_jobs=None, flatten_transform=True):
        super().__init__(estimators, voting, weights, n_jobs, flatten_transform)
        if w1:
            tot = w1 + w2
        else:
            breakpoint()
        self.weights = (w1 / tot, w2 / tot)
pipe = Pipeline(
    [
        (
            "vc",
            VotingClassifier2(
                estimators=[
                    ("xgb", XGBClassifier()),
                    ("lr", LogisticRegression(fit_intercept=True, max_iter=300, solver="lbfgs")),
                ],
                voting="soft",
                weights=None,
                w1=1,
                w2=0,
            ),
        )
    ]
)
opt = RandomizedSearchCV(
    pipe,
    {
        "vc__w1": uniform(0.1, 1),
        "vc__w2": uniform(0.1, 1),
    },
    n_iter=5,
    cv=5,
    n_jobs=25,
    return_train_score=False,
    error_score="raise",
)
When the classifier is first constructed, w1 and w2 come back as None, but weights has already been calculated as desired from the inputs. Then the search runs and fails to set them:
RuntimeError: Cannot clone object VotingClassifier2(estimators=[('xgb', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, gamma=0, learning_rate=0.1,
max_delta_step=0, max_depth=3, min_child_weight=1, missing=None,
n_estimators=100, n_jobs=1, nthread=None,
objectiv...alty='l2', random_state=None, solver='warn',
tol=0.0001, verbose=0, warm_start=False))]))],
flatten_transform=True, n_jobs=None, voting='soft', w1=None,
w2=None, weights=(1.0, 0.0)), as the constructor either does not set or modifies parameter weights
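For context on what that error is checking: sklearn's clone contract requires that __init__ store every constructor argument verbatim, so that get_params() returns exactly what was passed in, and that any derived state be computed later (e.g. in fit). Below is a minimal sketch of the subclass rewritten that way; storing w1/w2 unmodified and deferring the normalization to a fit-time override of self.weights is my assumption about one possible fix, not something from the original post.

```python
# Sketch: satisfy sklearn's clone contract by storing w1/w2 verbatim in
# __init__ and deriving the normalized weight tuple only in fit.
from sklearn.base import clone
from sklearn.ensemble import VotingClassifier


class VotingClassifier2(VotingClassifier):
    def __init__(self, estimators, w1=1.0, w2=1.0, voting='soft',
                 weights=None, n_jobs=None, flatten_transform=True):
        super().__init__(estimators, voting=voting, weights=weights,
                         n_jobs=n_jobs, flatten_transform=flatten_transform)
        # store hyperparameters unmodified so get_params()/clone() round-trip
        self.w1 = w1
        self.w2 = w2

    def fit(self, X, y, sample_weight=None):
        # derived state: compute the normalized weight tuple at fit time,
        # after the search has had a chance to set w1/w2 via set_params
        tot = self.w1 + self.w2
        self.weights = (self.w1 / tot, self.w2 / tot)
        return super().fit(X, y, sample_weight=sample_weight)
```

With this shape, RandomizedSearchCV can sample vc__w1 and vc__w2 directly, and cloning between folds preserves them; only the ratio of the two matters, which is why the normalization can safely wait until fit.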
from Subclassing a Classifier in sklearn