Thursday, 2 January 2020

keras LSTM get hidden-state (converting sentence-sequence to document context vectors)

I'm trying to create document context vectors from sentence vectors via an LSTM using Keras (so each document consists of a sequence of sentence vectors).

My goal is to replicate the following blog post using keras: https://andriymulyar.com/blog/bert-document-classification

I have a (toy) tensor that looks like this: X = np.array(features).reshape(5, 200, 768). So: 5 documents, each a sequence of 200 sentence vectors, with each sentence vector having 768 features.
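To make the shapes concrete, here is a stand-in for that tensor (random features, purely to pin down the dimensions; the real features would come from a sentence encoder):

```python
import numpy as np

# 5 documents x 200 sentence vectors x 768 features per sentence vector
X = np.random.rand(5, 200, 768)
print(X.shape)  # (5, 200, 768)
```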

So to get an embedding from my sentence vectors, I encoded my documents as one-hot vectors to train an LSTM:

from keras.utils import to_categorical

y = [1, 2, 3, 4, 5]  # 5 documents in the toy tensor
y = np.array(y)
yy = to_categorical(y)  # shape (5, 6): columns for classes 0..5
yy = yy[0:5, 1:6]       # drop the unused class-0 column -> 5x5 one-hot labels
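The slicing works because to_categorical over labels 1..5 produces six columns (classes 0 through 5), and dropping column 0 leaves one one-hot row per document. A NumPy-only sketch of the same result:

```python
import numpy as np

y = np.array([1, 2, 3, 4, 5])
# one-hot encode over classes 0..5, same result as to_categorical(y)
yy = np.eye(6)[y]        # shape (5, 6)
yy = yy[0:5, 1:6]        # drop the unused class-0 column
print(yy.shape)          # (5, 5), a 5x5 identity: one one-hot row per document
```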

Until now, my code looks like this:

from keras.layers import Input, LSTM
from keras.models import Model

inputs1 = Input(shape=(200, 768))
lstm1, states_h, states_c = LSTM(5, dropout=0.3, recurrent_dropout=0.2, return_state=True)(inputs1)
model1 = Model(inputs1, lstm1)  # only the LSTM output is used as the model output
model1.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['acc'])
model1.summary()
model1.fit(x=X, y=yy, batch_size=100, epochs=10, verbose=1, shuffle=True, validation_split=0.2)

When I print states_h I get a tensor of shape (?, 5), and I don't really know how to access the vectors inside that tensor, which should represent my documents.

print(states_h)
Tensor("lstm_51/while/Exit_3:0", shape=(?, 5), dtype=float32)

Or am I doing something wrong? To my understanding, there should be 5 document vectors, e.g. doc1 = [...]; ...; doc5 = [...], so that I can reuse the document vectors for a classification task.
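For reference, one way to get concrete values out of that symbolic tensor (a sketch, assuming tensorflow.keras and random stand-in data; dropout is omitted since it is inactive at prediction time anyway): wrap states_h in a second model that shares the same input, then call predict. Note also that with return_sequences=False the final hidden state states_h equals the layer output lstm1.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

# stand-in data matching the shapes in the question
X = np.random.rand(5, 200, 768).astype("float32")

inputs1 = Input(shape=(200, 768))
lstm1, states_h, states_c = LSTM(5, return_state=True)(inputs1)

# states_h is only a symbolic tensor; to read actual values,
# build a model whose output is states_h and run predict:
state_model = Model(inputs1, states_h)
doc_vectors = state_model.predict(X)  # shape (5, 5): one 5-dim vector per document
print(doc_vectors.shape)
```

The resulting doc_vectors array can then be fed to any downstream classifier.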



