Monday, 18 January 2021

Keras' model.predict() produces very different accuracy when calling on multiple batches at once VS calling on individual batches one by one

This is the model I've used:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, BatchNormalization, Dense

# drop_out, unit_per_layer and opti_func are hyperparameters defined elsewhere
model = Sequential()
model.add(LSTM(units=200, input_shape=(15, 17), return_sequences=True))
model.add(Dropout(drop_out))
model.add(BatchNormalization())
model.add(LSTM(units=unit_per_layer))
model.add(Dropout(drop_out))
model.add(BatchNormalization())
model.add(Dense(units=unit_per_layer, activation='tanh'))
model.add(Dropout(drop_out))
model.add(BatchNormalization())
model.add(Dense(units=1, activation='sigmoid'))

model.compile(optimizer=opti_func, loss='binary_crossentropy', metrics=['binary_accuracy'])
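For completeness, fit() and predict() are then called on 3D arrays shaped (num_windows, 15, 17). The sketch below uses placeholder data and placeholder training settings, not my real ones:

import numpy as np

# Placeholder arrays with the shapes the model expects
X_train = np.random.rand(1024, 15, 17).astype("float32")
Y_train = np.random.randint(0, 2, size=(1024,)).astype("float32")

model.fit(X_train, Y_train, epochs=10, batch_size=32, validation_split=0.1)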

Now, after model.fit(), when I call model.predict(X_dataset_multiple_batch) I get good predictions. However, if I divide X_dataset_multiple_batch into a series of individual batches (call each of them X_dataset_single_batch) and call model.predict(X_dataset_single_batch) on them one by one (i.e. call model.predict(X_dataset_single_batch) multiple times), the predictions become much worse than in the former case.
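To show exactly what I am comparing, here is a minimal sketch of the two calling styles on the same input (the array and the chunk size below are placeholders, not my real data):

import numpy as np

# Placeholder 3D input: every row is one window of 15 time steps x 17 features
X_dataset_multiple_batch = np.random.rand(128, 15, 17).astype("float32")

# Style 1: a single predict() call over everything
preds_all_at_once = model.predict(X_dataset_multiple_batch)

# Style 2: split the same array into individual batches and predict one by one
chunk_size = 32  # placeholder batch size
preds_one_by_one = np.concatenate(
    [model.predict(X_dataset_multiple_batch[i:i + chunk_size])
     for i in range(0, len(X_dataset_multiple_batch), chunk_size)]
)

# Dropout and BatchNormalization run in inference mode inside predict(),
# so with identical inputs these two results should agree up to float noise.
print(np.allclose(preds_all_at_once, preds_one_by_one, atol=1e-5))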

Additional Note: Honestly, I only need the last prediction, but because of how Keras expects its input, I cannot request just one single prediction; the input has to come in batches. So I predict on batches and then extract the last prediction. That is fine, but the question now becomes: what is the optimum number of batches to give model.predict()?
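In other words, what I end up doing each time is something like this (X_one_batch_of_data is one batch shaped (n_batch, time_step, n_features), as in the update below):

# Predict on a whole batch, then keep only the last window's output
batch_predictions = model.predict(X_one_batch_of_data)  # shape (n_batch, 1)
last_prediction = batch_predictions[-1, 0]               # scalar sigmoid output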

Update 1

This is how I pick individual batches from my test.csv:
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data/Test_file.csv")

X_ = df.loc[:, 'b':'dm'].to_numpy()   # feature columns
Y_ = df.loc[:, 'dn'].to_numpy()       # target column

predictions = []

# n_batch and time_step are defined earlier (time_step matches the model's input_shape)
for i in range(0, 2 + X_.shape[0] - (n_batch + time_step)):
    X = X_[i:i + n_batch + time_step]
    Y = Y_[i:i + n_batch + time_step]

    # Scale this slice of the data (the scaler is re-fit on every slice)
    scaler = MinMaxScaler()
    X = scaler.fit_transform(X)

    # Build one batch of overlapping windows, each time_step rows long
    X_one_batch_of_data = np.zeros((n_batch, time_step, X.shape[1]))

    for j in range(0, X_one_batch_of_data.shape[0]):
        X_one_batch_of_data[j] = X[j:j + time_step, :]

    # Keep only the prediction for the last window in the batch
    predictions.append(model.predict(X_one_batch_of_data, batch_size=n_batch)[-1, 0])
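For comparison, here is a rough sketch of how a single all-at-once predict() call over the same file could look. This is a simplification rather than my exact code; note that in this sketch the scaler is fitted once over the whole feature array instead of once per slice as above:

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data/Test_file.csv")
X_ = df.loc[:, 'b':'dm'].to_numpy()

# Fit the scaler a single time over the full feature array
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_)

# Build every overlapping window of length time_step
n_windows = X_scaled.shape[0] - time_step + 1
X_dataset_multiple_batch = np.zeros((n_windows, time_step, X_scaled.shape[1]))
for j in range(n_windows):
    X_dataset_multiple_batch[j] = X_scaled[j:j + time_step, :]

# One predict() call over all windows at once
all_predictions = model.predict(X_dataset_multiple_batch, batch_size=n_batch)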


