Friday, 22 November 2019

InvalidArgumentError on Decoder Model during Inference, for LSTM-based Seq2Seq on Tensorflow 2.0

versions: Python 3.6.9, Tensorflow 2.0.0, CUDA 10.0, CUDNN 7.6.1, Nvidia driver version 410.78.

I'm trying to port a LSTM-based Seq2Seq tf.keras model to tensorflow 2.0

Right now I'm facing the following error when I try to call predict on the decoder model (see below for the actual inference setup code)

It is as if it were expecting a single word as argument, but I need it to decode a full sentence (my sentences are right-padded sequences of word indices, of length 24)

P.S.: This code used to work exactly as it is on TF 1.15

InvalidArgumentError:  [_Derived_]  Inputs to operation while/body/_1/Select_2 of type Select must have the same size and shape.
Input 0: [1,100] != input 1: [24,100]
     [[]]
     [[lstm_1_3/StatefulPartitionedCall]] [Op:__inference_keras_scratch_graph_45160]

Function call stack:
keras_scratch_graph -> keras_scratch_graph -> keras_scratch_graph

FULL MODEL

enter image description here

ENCODER inference model

enter image description here

DECODER inference model

enter image description here

Inference Setup (line where error actually happens)

Important information: sequences are right-padded to 24 elements and 100 is the number of dimensions for each word embedding. This is why the error message (and the prints) show that the input shapes are (24,100).

note that this code runs on a CPU. running it on a GPU leads to another error detailed here

# original_keyword is a sample text string

with tf.device("/device:CPU:0"):

    # this method turns the raw string into a right-padded sequence
    query_sequence = keyword_to_padded_sequence_single(original_keyword)

    # no problems here
    initial_state = encoder_model.predict(query_sequence)

    print(initial_state[0].shape) # prints (24, 100)
    print(initial_state[1].shape) # (24, 100)

    empty_target_sequence = np.zeros((1,1))

    empty_target_sequence[0,0] = word_dict_titles["sos"]

    # ERROR HAPPENS HERE:
    # InvalidArgumentError:  [_Derived_]  Inputs to operation while/body/_1/Select_2 of type Select 
    # must have the same size and shape.  Input 0: [1,100] != input 1: [24,100]
    decoder_outputs, h, c = decoder_model.predict([empty_target_sequence] + initial_state)

Things I have tried

  • disabling eager mode (this just made training much slower and the error during inference stayed the same)

  • reshaping the input prior to feeding it to the predict function

  • manually computing (embedding_layer.compute_mask(inputs)) and setting masks when calling the LSTM layers



from InvalidArgumentError on Decoder Model during Inference, for LSTM-based Seq2Seq on Tensorflow 2.0

No comments:

Post a Comment