Hemant Vishwakarma: InvalidArgumentError on Decoder Model during Inference, for LSTM-based Seq2Seq on Tensorflow 2.0

Friday, 22 November 2019

InvalidArgumentError on Decoder Model during Inference, for LSTM-based Seq2Seq on Tensorflow 2.0

Versions: Python 3.6.9, TensorFlow 2.0.0, CUDA 10.0, cuDNN 7.6.1, NVIDIA driver version 410.78.

I'm trying to port an LSTM-based Seq2Seq tf.keras model to TensorFlow 2.0.

Right now I'm facing the following error when I call predict on the decoder model (see below for the actual inference setup code).

It is as if it were expecting a single word as argument, but I need it to decode a full sentence (my sentences are right-padded sequences of word indices, of length 24).

P.S.: This code used to work exactly as is on TF 1.15.

InvalidArgumentError:  [_Derived_]  Inputs to operation while/body/_1/Select_2 of type Select must have the same size and shape.
Input 0: [1,100] != input 1: [24,100]
     [[]]
     [[lstm_1_3/StatefulPartitionedCall]] [Op:__inference_keras_scratch_graph_45160]

Function call stack:
keras_scratch_graph -> keras_scratch_graph -> keras_scratch_graph

FULL MODEL

ENCODER inference model

DECODER inference model

Inference Setup (line where error actually happens)

Important information: sequences are right-padded to 24 elements and 100 is the number of dimensions for each word embedding. This is why the error message (and the prints) show that the input shapes are (24,100).
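To make the mismatch in the error message easier to see, here is a minimal sketch that uses NumPy arrays as stand-ins for the real tensors (the actual models are not reproduced here). The shapes are copied from the prints and from the error text. `leading_dims` is a hypothetical helper introduced only for this illustration. One assumption worth stating: Keras LSTM states normally have shape (batch_size, units), so if that convention applies here, the first axis of each array is what has to agree across all decoder inputs.

```python
import numpy as np

# Stand-ins for the real tensors; shapes taken from the prints and the error.
state_h = np.zeros((24, 100))              # first array returned by encoder_model.predict
state_c = np.zeros((24, 100))              # second array returned by encoder_model.predict
empty_target_sequence = np.zeros((1, 1))   # one sequence holding one token


def leading_dims(arrays):
    """Return the size of the first axis of every array (hypothetical helper)."""
    return [a.shape[0] for a in arrays]


dims = leading_dims([empty_target_sequence, state_h, state_c])
print(dims)                   # [1, 24, 24]
print(len(set(dims)) == 1)    # False: the first axes disagree (1 vs 24)
```

A check like this can be run right before the decoder call to see which input carries the unexpected leading dimension.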

Note that this code runs on a CPU. Running it on a GPU leads to a different error, detailed here.

import numpy as np
import tensorflow as tf

# original_keyword is a sample text string

with tf.device("/device:CPU:0"):

    # this method turns the raw string into a right-padded sequence
    query_sequence = keyword_to_padded_sequence_single(original_keyword)

    # no problems here
    initial_state = encoder_model.predict(query_sequence)

    print(initial_state[0].shape)  # prints (24, 100)
    print(initial_state[1].shape)  # prints (24, 100)

    # target sequence holding only the "sos" token
    empty_target_sequence = np.zeros((1, 1))
    empty_target_sequence[0, 0] = word_dict_titles["sos"]

    # ERROR HAPPENS HERE:
    # InvalidArgumentError:  [_Derived_]  Inputs to operation while/body/_1/Select_2 of type Select
    # must have the same size and shape.  Input 0: [1,100] != input 1: [24,100]
    decoder_outputs, h, c = decoder_model.predict([empty_target_sequence] + initial_state)

Things I have tried

disabling eager mode (this just made training much slower and the error during inference stayed the same)
reshaping the input prior to feeding it to the predict function
manually computing (embedding_layer.compute_mask(inputs)) and setting masks when calling the LSTM layers
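
For reference, the reshaping and masking attempts listed above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code actually used: the padding index is assumed to be 0, the word indices are made up, and the mask is computed as "token differs from padding", which is how the Keras Embedding layer builds its mask when mask_zero=True.

```python
import numpy as np

MAX_LEN = 24   # padded length stated in the post
PAD_ID = 0     # assumption: index 0 is the padding token

# Hypothetical right-padded sequence: three real word indices, then padding.
flat_sequence = np.array([5, 17, 42] + [PAD_ID] * (MAX_LEN - 3))

# Reshape so the leading axis is the batch: one sample of 24 timesteps.
batched_sequence = flat_sequence.reshape(1, MAX_LEN)

# Boolean mask, True wherever a real (non-padding) token is present.
mask = batched_sequence != PAD_ID

print(batched_sequence.shape)  # (1, 24)
print(mask.shape)              # (1, 24)
print(int(mask.sum()))         # 3
```

Printing the shapes before and after the reshape is a cheap way to confirm whether a model sees one sample of 24 timesteps or 24 samples of one timestep.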
