Sunday, 13 December 2020

LSTM Autoencoder problems

TLDR:

Autoencoder underfits timeseries reconstruction and just predicts average value.

Question Set-up:

Here is a summary of my attempt at a sequence-to-sequence autoencoder. This image was taken from this paper: https://arxiv.org/pdf/1607.00148.pdf enter image description here

Encoder: Standard LSTM layer. Input sequence is encoded in the final hidden state.

Decoder: LSTM Cell (I think!). Reconstruct the sequence one element at a time, starting with the last element x[N].

Decoder algorithm is as follows for a sequence of length N:

  1. Get Decoder initial hidden state hs[N]: Just use encoder final hidden state.
  2. Reconstruct last element in the sequence: x[N]= w.dot(hs[N]) + b.
  3. Same pattern for other elements: x[i]= w.dot(hs[i]) + b
  4. use x[i] and hs[i] as inputs to LSTMCell to get x[i-1] and hs[i-1]

Minimum Working Example:

Here is my implementation, starting with the encoder:

class SeqEncoderLSTM(nn.Module):
    def __init__(self, n_features, latent_size):
        super(SeqEncoderLSTM, self).__init__()
        
        self.lstm = nn.LSTM(
            n_features, 
            latent_size, 
            batch_first=True)
        
    def forward(self, x):
        _, hs = self.lstm(x)
        return hs

Decoder class:

class SeqDecoderLSTM(nn.Module):
    def __init__(self, emb_size, n_features):
        super(SeqDecoderLSTM, self).__init__()
        
        self.cell = nn.LSTMCell(n_features, emb_size)
        self.dense = nn.Linear(emb_size, n_features)
        
    def forward(self, hs_0, seq_len):
        
        x = torch.tensor([])
        
        # Final hidden and cell state from encoder
        hs_i, cs_i = hs_0
        
        # reconstruct first element with encoder output
        x_i = self.dense(hs_i)
        x = torch.cat([x, x_i])
        
        # reconstruct remaining elements
        for i in range(1, seq_len):
            hs_i, cs_i = self.cell(x_i, (hs_i, cs_i))
            x_i = self.dense(hs_i)
            x = torch.cat([x, x_i])
        return x

Bringing the two together:

class LSTMEncoderDecoder(nn.Module):
    def __init__(self, n_features, emb_size):
        super(LSTMEncoderDecoder, self).__init__()
        self.n_features = n_features
        self.hidden_size = emb_size

        self.encoder = SeqEncoderLSTM(n_features, emb_size)
        self.decoder = SeqDecoderLSTM(emb_size, n_features)
    
    def forward(self, x):
        seq_len = x.shape[1]
        hs = self.encoder(x)
        hs = tuple([h.squeeze(0) for h in hs])
        out = self.decoder(hs, seq_len)
        return out.unsqueeze(0)        

Data:

Large dataset of events scraped from the news (ICEWS). Various categories are assigned. I have encoded (one-hot) and aggregated it to daily rates for these categories. This reduces my dataset down to a multi-variate timeseries of 2,310 samples (~6 years worth of days). This number is reduced further when I break up my sequences into 7-day subsequences. So at the end of it all, I have 330 subsequences, each one containing 7 days worth of data, where each day includes rates of multiple variables.

I know this isn't very much, but overfitting would be welcome in the face my current problem.

Problem:

This model trains well enough. I've seldomly had a loss curve so nice with so little tuning... but the model is being trained to predict the average value of the time series rather than reconstruct it.

My research:

This problem is identical to the one discussed in this question: LSTM autoencoder always returns the average of the input sequence

The problem in that case ended up being that the objective function was averaging the target timeseries before calculating loss. This was due to some broadcasting errors because the author didn't have the right sized inputs to the objective function.

In my case, I do not see this being the issue. I have checked and double checked that all of my dimensions/sizes line up. I am at a loss.

Question:

What is causing my model to predict the average and how do I fix it?



from LSTM Autoencoder problems

No comments:

Post a Comment