Hemant Vishwakarma: Keras - Hyper Tuning the initial state of the model

Monday, 21 June 2021

Keras - Hyper Tuning the initial state of the model

I've written an LSTM model that predicts sequential data.

import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import LSTM, BatchNormalization, Dense, Dropout
from tensorflow.keras.optimizers import Adam

# get_deep(config, 'a.b') is my own helper (not shown here); it reads a nested config value by dotted path.

def get_model(config, num_features, output_size):
    opt = Adam(learning_rate=get_deep(config, 'hp.learning_rate'), beta_1=get_deep(config, 'hp.beta_1'))

    # Variable-length sequences arrive as a ragged batch; pad to dense and mask the padding.
    inputs = Input(shape=[None, num_features], dtype=tf.float32, ragged=True)
    layers = LSTM(get_deep(config, 'hp.lstm_neurons'), activation=get_deep(config, 'hp.lstm_activation'))(
        inputs.to_tensor(), mask=tf.sequence_mask(inputs.row_lengths()))

    layers = BatchNormalization()(layers)
    if 'dropout_rate' in config['hp']:
        layers = Dropout(get_deep(config, 'hp.dropout_rate'))(layers)

    # Optional stack of dense blocks, each described by a dict in the config.
    for layer in get_deep(config, 'hp.dense_layers'):
        layers = Dense(layer['neurons'], activation=layer['activation'])(layers)
        layers = BatchNormalization()(layers)
        if 'dropout_rate' in layer:
            layers = Dropout(layer['dropout_rate'])(layers)

    layers = Dense(output_size, activation='sigmoid')(layers)
    model = Model(inputs, layers)
    model.compile(loss='mse', optimizer=opt, metrics=['mse'])
    model.summary()
    return model

I've tuned some of the layers' hyperparameters. While validating the model, I ran one specific configuration several times. Most runs gave similar results, but one run was much better than the others, which led me to think that the model's initial state is probably crucial to getting the best performance.
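
One way to check whether initialization really explains the lucky run is to make the initial weights reproducible and then vary only the seed. The snippet below is a minimal sketch, not part of my model: `build_dense` and `initial_kernel` are hypothetical helper names, and it assumes that passing an explicit non-zero `seed` to a Keras initializer fixes the values it draws.

```python
import numpy as np
import tensorflow as tf


def build_dense(seed, num_features=4, units=8):
    """Build a Dense layer whose initial kernel is determined by `seed`."""
    # Non-zero seeds are used on purpose: some older Keras versions
    # treat seed=0 as "no seed given".
    init = tf.keras.initializers.GlorotUniform(seed=seed)
    layer = tf.keras.layers.Dense(units, kernel_initializer=init)
    layer.build((None, num_features))  # create the weights without running data
    return layer


def initial_kernel(seed):
    """Return the freshly initialized kernel matrix for a given seed."""
    return build_dense(seed).get_weights()[0]


# Same seed: identical starting point. Different seed: different starting point.
same = np.allclose(initial_kernel(1), initial_kernel(1))
different = not np.allclose(initial_kernel(1), initial_kernel(2))
```

With the starting weights pinned like this, repeating training over a handful of seeds shows how much of the spread in validation loss comes from initialization. Other sources of randomness, such as dropout and data shuffling, would need to be seeded too for a fully repeatable run.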

As suggested in this video, weight initialization can provide a performance boost. I googled around and found Keras's layer weight initializers, but I'm not sure which ranges I should tune.
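
Most of the built-in initializers (Glorot, He, LeCun, orthogonal) scale themselves from the layer shape, so there is no numeric range to search. What can be tuned is the choice of scheme, as a categorical hyperparameter. Below is a rough sketch of how that could be wired into the LSTM layer. The `INITIALIZERS` table and `build_lstm` are hypothetical names for illustration. The defaults shown are the ones the Keras `LSTM` layer already uses (`glorot_uniform` for the input kernel, `orthogonal` for the recurrent kernel).

```python
import tensorflow as tf

# Candidate schemes to search over (a categorical choice, not a range).
INITIALIZERS = {
    "glorot_uniform": tf.keras.initializers.GlorotUniform,
    "he_normal": tf.keras.initializers.HeNormal,
    "lecun_normal": tf.keras.initializers.LecunNormal,
    "orthogonal": tf.keras.initializers.Orthogonal,
}


def build_lstm(units, kernel_init="glorot_uniform",
               recurrent_init="orthogonal", seed=1):
    """Create an LSTM layer with named, seeded weight initializers."""
    return tf.keras.layers.LSTM(
        units,
        kernel_initializer=INITIALIZERS[kernel_init](seed=seed),
        # A different seed keeps the two weight matrices from sharing one seed.
        recurrent_initializer=INITIALIZERS[recurrent_init](seed=seed + 1),
    )


# Example: try He-normal input weights on a dummy batch of
# 2 sequences, 3 time steps, 4 features.
layer = build_lstm(5, kernel_init="he_normal")
output = layer(tf.zeros((2, 3, 4)))
```

In my `get_model`, the two names could come from the config like the other hyperparameters (for example a made-up key such as `hp.lstm_kernel_initializer`). If a continuous knob is wanted, `tf.keras.initializers.VarianceScaling` exposes a `scale` argument, and `RandomNormal` exposes `stddev`.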
