Monday, 17 June 2019

Creating tensor of dynamic shape from python lists to feed tensorflow RNN

I'm building an end-to-end speech recognition architecture in which my data is a list of segmented spectrograms. The data has shape (batch_size, timesteps, 8, 65, 1), where batch_size is fixed but timesteps varies from sample to sample. I can't figure out how to put this data into a tensor with the appropriate shape to feed my model. Here is a piece of code that shows my problem:

import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Dropout, Flatten, TimeDistributed
from tensorflow.keras.layers import SimpleRNN, LSTM
from tensorflow.keras import Input, layers
from tensorflow.keras import backend as K

segment_width = 8
segment_height = 65
segment_channels = 1

batch_size = 4

segment_lengths = [28, 33, 67, 43]
label_lengths = [16, 18, 42, 32]

TARGET_LABELS = np.arange(35)

# Generating data
X = [np.random.uniform(0,1, size=(segment_lengths[k], segment_width, segment_height, segment_channels))
     for k in range(batch_size)]

y = [np.random.choice(TARGET_LABELS, size=label_lengths[k]) for k in range(batch_size)]

# Model definition
input_segments_data = tf.keras.Input(name='input_segments_data', shape=(None, segment_width, segment_height, segment_channels),
                               dtype='float32')
input_segment_lengths = tf.keras.Input(name='input_segment_lengths', shape=[1], dtype='int64')
input_label_lengths = tf.keras.Input(name='input_label_lengths', shape=[1], dtype='int64')
# More complex architecture comes here
outputs = Flatten()(input_segments_data)

model = tf.keras.Model(inputs=[input_segments_data, input_segment_lengths, input_label_lengths], outputs=outputs)

def dummy_loss(y_true, y_pred):
  return y_pred

model.compile(optimizer="Adam", loss=dummy_loss)
model.summary()

Output:

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_segments_data (InputLayer [(None, None, 8, 65, 0                                            
__________________________________________________________________________________________________
input_segment_lengths (InputLay [(None, 1)]          0                                            
__________________________________________________________________________________________________
input_label_lengths (InputLayer [(None, 1)]          0                                            
__________________________________________________________________________________________________
flatten (Flatten)               (None, None)         0           input_segments_data[0][0]        
==================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
__________________________________________________________________________________________________

Now when I try to predict from my random data:

model.predict([X, segment_lengths, label_lengths])

I get this error:

ValueError: Error when checking input: expected input_segments_data to have 5 dimensions, but got array with shape (4, 1)
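A likely reason for this error is that the arrays in X cannot be combined into one dense 5-D array, because their first (time) axis differs. The following is a minimal NumPy-only sketch of that observation; it hardcodes the segment shape (8, 65, 1) from the code above and does not reproduce what Keras does internally.

```python
import numpy as np

segment_lengths = [28, 33, 67, 43]

# Same kind of data as X above: one 4-D array per sample.
X = [np.random.uniform(0, 1, size=(n, 8, 65, 1)) for n in segment_lengths]

# Each element is a valid 4-D array, but the time axis has a different size.
shapes = [x.shape for x in X]
print(shapes)

# Stacking into a single dense 5-D array fails because the shapes disagree.
try:
    np.stack(X)
    stack_error = None
except ValueError as err:
    stack_error = str(err)
print(stack_error)
```

So a list like X has no single rectangular shape of the form (4, timesteps, 8, 65, 1) unless the samples are padded, grouped by length, or fed one at a time.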

How can I convert X (which is a list of arrays) to a tensor of shape (None, None, 8, 65, 1) and feed it to my model? I don't want to use zero padding!
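For reference, here is a hedged sketch of two ways to hold such data without zero padding, written with NumPy only. Option A gives every sample its own batch of size 1, which fits an input whose batch and time axes are both None. Option B concatenates all samples along the time axis and keeps the per-sample lengths. This is the same flat-values-plus-row-lengths layout that tf.RaggedTensor.from_row_lengths accepts, but whether a given model can consume a ragged tensor depends on the TensorFlow version and on the layers used, so treat that part as an assumption. The names single_batches, flat_values, and row_lengths are illustrative.

```python
import numpy as np

segment_lengths = [28, 33, 67, 43]
X = [np.random.uniform(0, 1, size=(n, 8, 65, 1)) for n in segment_lengths]

# Option A: one sample per batch. Adding a leading axis turns each
# (timesteps, 8, 65, 1) array into a (1, timesteps, 8, 65, 1) batch.
single_batches = [x[np.newaxis, ...] for x in X]

# Option B: ragged layout. Concatenate along the time axis and remember
# how many timesteps belong to each sample.
flat_values = np.concatenate(X, axis=0)          # (sum of lengths, 8, 65, 1)
row_lengths = np.asarray(segment_lengths, dtype=np.int64)

# The original samples can be recovered from the flat layout at any time.
split_points = np.cumsum(row_lengths)[:-1]
recovered = np.split(flat_values, split_points, axis=0)

print([b.shape for b in single_batches])
print(flat_values.shape, row_lengths)
```

With Option A, each element of single_batches could be passed to the model on its own, together with the matching length entries reshaped to (1, 1). With Option B, no values are added to the data, so nothing needs to be masked later.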


