I was trying to verify that I had correctly understood how SVM OVA (One-versus-All) works by comparing the function OneVsRestClassifier with my own implementation.
In the following code, I trained num_classes binary classifiers, one per class, then ran all of them on the test set and selected the class whose classifier returned the highest probability.
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score,classification_report
from sklearn.preprocessing import scale
# Read dataset
df = pd.read_csv('In/winequality-white.csv', delimiter=';')
X = df.loc[:, df.columns != 'quality']
Y = df.loc[:, df.columns == 'quality']
my_classes = np.unique(Y)
num_classes = len(my_classes)
# Train-test split
np.random.seed(42)
msk = np.random.rand(len(df)) <= 0.8
train = df[msk]
test = df[~msk]
# From dataset to features and labels
X_train = train.loc[:, train.columns != 'quality']
Y_train = train.loc[:, train.columns == 'quality']
X_test = test.loc[:, test.columns != 'quality']
Y_test = test.loc[:, test.columns == 'quality']
# Models
clf = [None] * num_classes
for k in np.arange(0,num_classes):
    # One binary SVC per class: class k versus the rest
    my_model = SVC(gamma='auto', C=1000, kernel='rbf', class_weight='balanced', probability=True)
    clf[k] = my_model.fit(X_train, Y_train==my_classes[k])
# Prediction
prob_table = np.zeros((len(Y_test), num_classes))
for k in np.arange(0,num_classes):
    p = clf[k].predict_proba(X_test)
    # Keep the probability of the positive ("is class k") column
    prob_table[:,k] = p[:,list(clf[k].classes_).index(True)]
Y_pred = prob_table.argmax(axis=1)
print("Test accuracy = ", accuracy_score( Y_test, Y_pred) * 100,"\n\n")
The test accuracy is 0.21, whereas the function OneVsRestClassifier gives 0.59. For completeness, I also report the other code (the pre-processing steps are the same as before):
....
from sklearn.multiclass import OneVsRestClassifier
clf = OneVsRestClassifier(SVC(gamma='auto', C=1000, kernel='rbf', class_weight='balanced'))
clf.fit(X_train, Y_train)
Y_pred = clf.predict(X_test)
print("Test accuracy = ", accuracy_score( Y_test, Y_pred) * 100,"\n\n")
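When comparing the two approaches, it may matter that OneVsRestClassifier, as far as I know, ranks classes by the base estimator's decision_function when one is available and only falls back to predict_proba otherwise. A hand-rolled version built on Platt-scaled probabilities is therefore not guaranteed to pick the same class. The sketch below is an illustration under assumptions: it uses synthetic data from make_classification rather than the wine dataset, and the variable names are hypothetical. It builds a manual one-vs-rest on decision_function and measures how often it agrees with OneVsRestClassifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic 4-class data standing in for the real dataset (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

classes = np.unique(y_train)

# Manual one-vs-rest: one binary SVC per class, scored by decision_function.
scores = np.zeros((len(X_test), len(classes)))
for k, label in enumerate(classes):
    model = SVC(gamma='auto', C=1000, kernel='rbf', class_weight='balanced')
    model.fit(X_train, y_train == label)
    # For a binary SVC, positive values favour classes_[1], which is True here.
    scores[:, k] = model.decision_function(X_test)

# argmax gives column positions; index into classes to get actual labels.
manual_pred = classes[scores.argmax(axis=1)]

ovr = OneVsRestClassifier(
    SVC(gamma='auto', C=1000, kernel='rbf', class_weight='balanced'))
ovr_pred = ovr.fit(X_train, y_train).predict(X_test)

agreement = np.mean(manual_pred == ovr_pred)
print("Agreement between manual OvR and OneVsRestClassifier:", agreement)
```

If the agreement is high here but low with the probability-based version, the scoring function is a plausible source of the gap.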
Is there something wrong in my own implementation of SVM - OVA?
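One likely culprit, offered as a sketch rather than a confirmed diagnosis: prob_table.argmax(axis=1) returns column positions (0 to num_classes - 1), while Y_test holds the actual quality labels. Comparing the two directly would count most correct predictions as wrong. Indexing my_classes with the argmax result maps positions back to labels. The label values and probabilities below are made up for illustration.

```python
import numpy as np

# Hypothetical label set and probability table, for illustration only.
my_classes = np.array([3, 4, 5, 6, 7, 8, 9])
prob_table = np.array([
    [0.05, 0.10, 0.60, 0.15, 0.05, 0.03, 0.02],  # highest value in column 2
    [0.02, 0.03, 0.10, 0.25, 0.50, 0.07, 0.03],  # highest value in column 4
])

col_idx = prob_table.argmax(axis=1)   # column positions, not labels
Y_pred = my_classes[col_idx]          # map positions back to class labels

print(col_idx)  # [2 4]
print(Y_pred)   # [5 7]
```

In the code above, this would mean replacing the argmax line with Y_pred = my_classes[prob_table.argmax(axis=1)] before calling accuracy_score.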
from Something wrong when implementing SVM One-vs-all in python