I have the following data: x_train is a np.ndarray, y_train is a np.ndarray, and model is an xgboost.sklearn.XGBClassifier. The types are:
print(type(x_train))
print(x_train.dtype)
>> <class 'numpy.ndarray'>
>> float64
print(type(y_train))
print(y_train.dtype)
>> <class 'numpy.ndarray'>
>> float64
print(type(model))
>> <class 'xgboost.sklearn.XGBClassifier'>
I am using Databricks Runtime 12.2 LTS ML, which corresponds to xgboost==1.7.2.
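For completeness, the version can be confirmed from inside the notebook (a trivial check, nothing assumed beyond the xgboost import):
import xgboost as xgb
print(xgb.__version__)  # prints 1.7.2 on this runtime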
I get the following error:
model.fit(x_train, y_train)
>> XGBoostError: [09:28:22] ../src/data/data.cc:254: All feature_types must be one of {int, float, i, q, c}.
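To rule out the arrays themselves, here is a minimal sanity check (a sketch; x_check and y_check are made-up stand-ins for my data) that fits a default XGBClassifier on arrays with the same dtypes. I would expect this to succeed, which points at the parameters rather than the data:
import numpy as np
import xgboost as xgb
# Same dtypes as my real data (float64 features, float64 0/1 labels),
# but default model parameters this time.
x_check = np.random.normal(0, 1, (100, 10)).astype(np.float64)
y_check = np.random.randint(0, 2, 100).astype(np.float64)
xgb.XGBClassifier().fit(x_check, y_check)  # fits without error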
y_train is actually a vector of 1s and 0s. I have also tried casting it to np.int32 or np.int64. Then I tried casting to the built-in int and float, like this:
x_train = np.array(x_train, dtype=float)
y_train = np.array(y_train, dtype=int)
print(x_train.dtype)
print(y_train.dtype)
>> float64
>> int64
Same error as before.
I have checked this post, but it does not help me as my types are different. I would prefer not to convert away from numpy dtypes, since these have worked in the past and my config files are set up in such a way ..
Other relevant packages: sklearn==0.0.post7 and scikit-learn==1.0.2. You can reproduce the error as follows:
import numpy as np
import xgboost as xgb
params = {'base_score': 0.5,
'booster': 'gbtree',
'callbacks': 'null',
'colsample_bylevel': 1,
'colsample_bynode': 1,
'colsample_bytree': 1,
'early_stopping_rounds': 'null',
'enable_categorical': False,
'eval_metric': 'aucpr',
'feature_types': 'null',
'gamma': 7,
'gpu_id': -1,
'grow_policy': 'lossguide',
'importance_type': 'null',
'interaction_constraints': '',
'learning_rate': 0.05610004032698376,
'max_bin': 256,
'max_cat_threshold': 64,
'max_cat_to_onehot': 4,
'max_delta_step': 0,
'max_depth': 2,
'max_leaves': 0,
'min_child_weight': 1,
'monotone_constraints': (),
'n_estimators': 1275,
'n_jobs': 4,
'num_parallel_tree': 1,
'objective': 'binary:logistic',
'predictor': 'auto',
'random_state': 0,
'reg_alpha': 0,
'reg_lambda': 60,
'sampling_method': 'uniform',
'scale_pos_weight': 11.507905606798213,
'subsample': 1,
'tree_method': 'hist',
'use_label_encoder': False,
'validate_parameters': 1,
'verbosity': 0}
model = xgb.XGBClassifier(**params)
x = np.random.normal(0,1,(100,10)).astype(np.float64)
y = (np.random.uniform(0,1,100) > 0.5).astype(np.int64)  # actual 0/1 labels; astype on raw uniforms would give all zeros
model.fit(x,y)
So XGBoost complains that it requires int or float when int and float are exactly what I have.
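One observation, in case it helps: the error message complains about feature_types, and the params dict above contains 'feature_types': 'null' as a literal string (along with 'callbacks', 'early_stopping_rounds', and 'importance_type'), presumably an artifact of the params having gone through a JSON config file, where Python's None is serialized as null. A hedged sketch of the workaround I would try, mapping every 'null' string back to None before constructing the model:
# Sketch: 'null' is a JSON artifact, not a valid XGBoost value; the string
# is not a legal feature_types specification. Restore the original None.
clean_params = {k: (None if v == 'null' else v) for k, v in params.items()}
model = xgb.XGBClassifier(**clean_params)
model.fit(x, y)  # I would expect this to run without the XGBoostError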