Wednesday, 4 August 2021

Sklearn - Multi-class confusion matrix for ordinal data

I've written a model that predicts ordinal data. At the moment, I'm evaluating it using quadratic-weighted Cohen's kappa. I'm looking for a way to visualize the results with a confusion matrix, and then calculate recall, precision and F1 score in a way that takes the prediction distance into account.

I.e., predicting 2 when the actual class was 1 is better than predicting 3 when the actual class was 1.
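To make the distance idea concrete, here is a small sketch using scikit-learn's `cohen_kappa_score` with `weights='quadratic'`. The label arrays are made up for illustration; the only difference between the two prediction vectors is whether a single sample of class 1 is predicted as 2 or as 3.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Made-up ordinal labels, purely for illustration.
y_true = np.array([1, 1, 1, 2, 3, 0, 2, 3])

# Same predictions except for the first sample (actual class 1).
pred_near = y_true.copy()
pred_near[0] = 2  # off by one
pred_far = y_true.copy()
pred_far[0] = 3   # off by two

# Quadratic weights penalize a disagreement by its squared distance.
kappa_near = cohen_kappa_score(y_true, pred_near, weights="quadratic")
kappa_far = cohen_kappa_score(y_true, pred_far, weights="quadratic")

# Unweighted kappa treats every disagreement the same way.
plain_near = cohen_kappa_score(y_true, pred_near)
plain_far = cohen_kappa_score(y_true, pred_far)

print("quadratic kappa:", kappa_near, kappa_far)
print("unweighted kappa:", plain_near, plain_far)
```

On this toy data the quadratic-weighted score is higher for the off-by-one prediction, while the unweighted score does not distinguish between the two cases.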

I've written the following code to plot and calculate the results:

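import seaborn as sns
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

def plot_cm(df, ax):
    # df.x is passed as y_true (actual), df.y as y_pred (predicted).
    # normalize='true' scales each row (actual class) to sum to 1.
    cf_matrix = confusion_matrix(df.x, df.y, normalize='true', labels=[0, 1, 2, 3, 4, 5, 6, 7, 8])

    ax = sns.heatmap(cf_matrix, linewidths=1, annot=True, ax=ax, fmt='.2f')
    ax.set_ylabel('Actual')
    ax.set_xlabel('Predicted')

    # Support-weighted averages; only exact matches count as correct here.
    print('Recall score:', recall_score(df.x, df.y, average='weighted', zero_division=0))
    print('Precision score:', precision_score(df.x, df.y, average='weighted', zero_division=0))
    print('F1 score:', f1_score(df.x, df.y, average='weighted', zero_division=0))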

[Image: confusion-matrix heatmap, actual vs. predicted classes]

Recall score: 0.53505
Precision score: 0.5454783454981732
F1 score: 0.5360650278722704

The visualization is fine; however, the calculation ignores predictions that were "almost" correct, e.g. predicting 8 when the actual class was 9.
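One simple way to quantify "almost" correct is an accuracy with a tolerance band. The sketch below is only an illustration: `within_tolerance_accuracy` is a hypothetical helper, not a scikit-learn function, and it assumes the classes are integer-coded in their ordinal order.

```python
import numpy as np

def within_tolerance_accuracy(y_true, y_pred, tolerance=1):
    """Share of predictions at most `tolerance` classes away from the actual class.

    Hypothetical helper; assumes integer-coded, ordered class labels.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred) <= tolerance))

# Made-up labels for illustration.
actual = [1, 1, 5, 8, 3]
predicted = [2, 1, 5, 6, 3]

exact = within_tolerance_accuracy(actual, predicted, tolerance=0)
near = within_tolerance_accuracy(actual, predicted, tolerance=1)
print(exact, near)
```

With `tolerance=0` this is ordinary accuracy; raising the tolerance counts near misses as hits, at the cost of treating an off-by-one error the same as an exact match.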

Is there a way to calculate Recall, Precision and F1 taking into account the ordinal behavior of the data?
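As far as I know, scikit-learn has no built-in ordinal version of these metrics, so the following is only a sketch of one possible definition. It assumes a linear partial-credit scheme: each confusion-matrix cell gets credit 1 on the diagonal, decreasing linearly to 0 at the maximum class distance. The function name `ordinal_prf` and the credit scheme are my own assumptions, not an established API.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def ordinal_prf(y_true, y_pred, labels):
    """Distance-weighted precision, recall and F1 (illustrative sketch only).

    Assumes `labels` is listed in ordinal order and has at least two entries.
    """
    cm = confusion_matrix(y_true, y_pred, labels=labels).astype(float)
    k = len(labels)
    idx = np.arange(k)
    # Linear partial credit: 1 for an exact match, 0 at the largest distance.
    credit = 1.0 - np.abs(idx[:, None] - idx[None, :]) / (k - 1)
    credited = cm * credit

    support = cm.sum(axis=1)    # samples per actual class (rows)
    predicted = cm.sum(axis=0)  # samples per predicted class (columns)

    # Classes with an empty denominator score 0, similar to zero_division=0.
    recall = np.divide(credited.sum(axis=1), support,
                       out=np.zeros(k), where=support > 0)
    precision = np.divide(credited.sum(axis=0), predicted,
                          out=np.zeros(k), where=predicted > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros(k), where=denom > 0)

    # Support-weighted average, analogous to average='weighted'.
    weights = support / support.sum()
    return {
        "precision": float(precision @ weights),
        "recall": float(recall @ weights),
        "f1": float(f1 @ weights),
    }

# Made-up labels: one sample of class 1 predicted as 2 vs. as 3.
y_true = [1, 1, 1, 2, 3, 0, 2, 3]
scores_near = ordinal_prf(y_true, [2, 1, 1, 2, 3, 0, 2, 3], labels=[0, 1, 2, 3])
scores_far = ordinal_prf(y_true, [3, 1, 1, 2, 3, 0, 2, 3], labels=[0, 1, 2, 3])
print(scores_near)
print(scores_far)
```

If the credit matrix is replaced by the identity matrix, the per-class values reduce to the usual exact-match precision and recall, so this can be read as a relaxation of the weighted scores printed above. A quadratic credit, closer in spirit to quadratic kappa, would be an equally reasonable choice.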


