Hemant Vishwakarma: How to fill in the blank using bidirectional RNN and pytorch?

Tuesday, 12 February 2019

I am trying to fill in the blank in a sentence using a bidirectional RNN and PyTorch.

The input will be like: The dog is _____, but we are happy he is okay.

The output will be like:

1. hyper (Perplexity score here) 
2. sad (Perplexity score here) 
3. scared (Perplexity score here)

I discovered this idea here: https://medium.com/@plusepsilon/the-bidirectional-language-model-1f3961d1fb27

import torch
import torch.nn as nn

text = ['BOS', 'How', 'are', 'you', 'EOS']
seq_len = len(text)
batch_size = 1
embedding_size = 1
hidden_size = 1
output_size = 1

# random stand-in for the embedded tokens, shape (seq_len, batch_size, embedding_size)
random_input = torch.randn(seq_len, batch_size, embedding_size)

bi_rnn = nn.RNN(
    input_size=embedding_size, hidden_size=hidden_size, num_layers=1, batch_first=False, bidirectional=True)

# bi_output has shape (seq_len, batch_size, 2 * hidden_size): forward states, then backward states
bi_output, bi_hidden = bi_rnn(random_input)

# stagger: pair the forward state before each word with the backward state after it
forward_output, backward_output = bi_output[:-2, :, :hidden_size], bi_output[2:, :, hidden_size:]
staggered_output = torch.cat((forward_output, backward_output), dim=-1)

linear = nn.Linear(hidden_size * 2, output_size)

# only predict on words (drop the BOS and EOS positions)
labels = random_input[1:-1]

# for language models, use cross-entropy :)
loss = nn.MSELoss()
output = loss(linear(staggered_output), labels)
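
The comment in the snippet above suggests cross-entropy for a real language model. A minimal sketch of what that could look like follows, assuming the random float input is replaced by integer token ids and the linear layer projects to the vocabulary size. The names `token_ids`, `to_vocab`, and `training_step`, the sizes, and the choice of SGD are illustrative assumptions, not taken from the blog post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not from the blog post)
vocab_size, embedding_size, hidden_size = 10, 8, 16
seq_len, batch_size = 5, 1

# Stand-in for an encoded sentence: BOS, three words, EOS as integer ids
token_ids = torch.randint(0, vocab_size, (seq_len, batch_size))

embed = nn.Embedding(vocab_size, embedding_size)
bi_rnn = nn.RNN(embedding_size, hidden_size, bidirectional=True)
to_vocab = nn.Linear(hidden_size * 2, vocab_size)

params = list(embed.parameters()) + list(bi_rnn.parameters()) + list(to_vocab.parameters())
optimizer = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def training_step():
    """Run one gradient step on the single toy sentence and return the loss."""
    optimizer.zero_grad()
    bi_output, _ = bi_rnn(embed(token_ids))
    # Same staggering as above: forward state before the word, backward state after it
    staggered = torch.cat(
        (bi_output[:-2, :, :hidden_size], bi_output[2:, :, hidden_size:]), dim=-1)
    logits = to_vocab(staggered)        # (seq_len - 2, batch_size, vocab_size)
    targets = token_ids[1:-1]           # the words between BOS and EOS
    # CrossEntropyLoss expects (N, vocab_size) logits and (N,) class indices
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()

losses = [training_step() for _ in range(20)]
```

Repeating the step on one sentence only shows that the pieces fit together; training on a corpus would loop over many encoded sentences instead.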

I am trying to reimplement the code above, which comes from the bottom of the blog post. I am new to PyTorch and NLP, and I can't work out what the input and output of the code are.

Question about the input: I am guessing the input is the handful of words given in `text`. Why does one need beginning-of-sentence and end-of-sentence tags in this case? Why isn't the input a corpus on which the model is trained, as in other classic NLP problems? I would like to use the Enron email corpus to train the RNN.
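
One way to look at the input question is sketched below, assuming a tiny made-up corpus in place of real emails. Each sentence is wrapped in BOS/EOS and mapped to integer ids. Because the staggering pairs the forward state at position i-1 with the backward state at position i+1, the first real word needs BOS to its left and the last real word needs EOS to its right. The names `corpus`, `word_to_idx`, and `encode` are hypothetical helpers, not part of the blog code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy corpus; a real corpus would be tokenized into word lists the same way
corpus = [["How", "are", "you"], ["The", "dog", "is", "happy"]]

# Vocabulary with explicit boundary tokens
vocab = ["BOS", "EOS"] + sorted({w for sentence in corpus for w in sentence})
word_to_idx = {w: i for i, w in enumerate(vocab)}

def encode(sentence):
    """Wrap a sentence in BOS/EOS and return ids of shape (seq_len, 1)."""
    tokens = ["BOS"] + sentence + ["EOS"]
    return torch.tensor([word_to_idx[t] for t in tokens]).unsqueeze(1)

embedding_size, hidden_size = 8, 16
embed = nn.Embedding(len(vocab), embedding_size)
bi_rnn = nn.RNN(embedding_size, hidden_size, bidirectional=True)

ids = encode(corpus[0])                   # (5, 1): BOS How are you EOS
bi_output, _ = bi_rnn(embed(ids))         # (5, 1, 2 * hidden_size)

# Context for word i: forward state at i-1 and backward state at i+1
forward_ctx = bi_output[:-2, :, :hidden_size]
backward_ctx = bi_output[2:, :, hidden_size:]
context = torch.cat((forward_ctx, backward_ctx), dim=-1)   # (3, 1, 2 * hidden_size)

# One target per real word; BOS and EOS only ever serve as context
targets = ids[1:-1]                       # (3, 1)
```

Without the boundary tokens, the slices `[:-2]` and `[2:]` would leave no context row for the first and last words of the sentence.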

Question about the output: I see that the output is a tensor. My understanding is that a tensor is a vector, so perhaps a word vector in this case. How can I use the tensor to output the words themselves?
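
A possible route from tensor to words is sketched below, assuming the final linear layer outputs one score per vocabulary word rather than a single number. A softmax turns the scores into probabilities, and `torch.topk` picks the best candidates. The vocabulary, the `fill_blank` helper, and its `placeholder` argument are illustrative assumptions. The weights here are untrained, so the ranking is meaningless until the model has been fitted, and the scores are probabilities rather than perplexities.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical vocabulary for illustration only
vocab = ["BOS", "EOS", "The", "dog", "is", "hyper", "sad", "scared", "happy"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

embedding_size, hidden_size = 8, 16
embed = nn.Embedding(len(vocab), embedding_size)
bi_rnn = nn.RNN(embedding_size, hidden_size, bidirectional=True)
to_vocab = nn.Linear(hidden_size * 2, len(vocab))   # one score per vocabulary word

def fill_blank(left, right, k=3, placeholder="BOS"):
    """Return the k most probable (word, probability) pairs for the blank."""
    tokens = ["BOS"] + left + [placeholder] + right + ["EOS"]
    blank_pos = len(left) + 1
    ids = torch.tensor([word_to_idx[t] for t in tokens]).unsqueeze(1)
    with torch.no_grad():
        bi_output, _ = bi_rnn(embed(ids))
        # The forward state stops before the blank and the backward state starts
        # after it, so the placeholder token itself never influences the prediction
        fwd = bi_output[blank_pos - 1, :, :hidden_size]
        bwd = bi_output[blank_pos + 1, :, hidden_size:]
        logits = to_vocab(torch.cat((fwd, bwd), dim=-1))    # (1, vocab_size)
        probs = torch.softmax(logits, dim=-1).squeeze(0)    # (vocab_size,)
    top = torch.topk(probs, k)
    return [(vocab[i], p) for p, i in zip(top.values.tolist(), top.indices.tolist())]

candidates = fill_blank(["The", "dog", "is"], ["happy"])
for rank, (word, prob) in enumerate(candidates, start=1):
    print(f"{rank}. {word} ({prob:.3f})")
```

In practice one would probably also mask out BOS and EOS before taking the top k, since they are not sensible fillers for a blank.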

See my IPython notebook on GitHub, which shows each of the steps above and its output: https://github.com/InsightDataCommunity/vectorizer/blob/4387a2b277f089a8f05ce6adb1cd71cecc98c00c/notebooks/exploratory/0.3-rnn-model.ipynb
