Thursday, 24 June 2021

RuntimeError during one-hot encoding

I have a dataset whose class values range from -2 to 2 in steps of 1 (i.e., -2, -1, 0, 1, 2), with 9 marking the unlabelled data. When I run the one-hot encoding step

self._one_hot_encode(labels)

I get the following error: RuntimeError: index 1 is out of bounds for dimension 1 with size 1

due to

self.one_hot_labels = self.one_hot_labels.scatter(1, labels.unsqueeze(1), 1)

The error presumably comes from the labels [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 9, 1, 1, 1, 1, 1, 1], where the 9 marking the unlabelled entry makes scatter try to set column index 9 to 1. It is unclear to me how to fix it, even after going through past questions and answers on similar errors (e.g., "index 1 is out of bounds for dimension 0 with size 1"). The part of the code involved in the error is the following:
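To illustrate the failure mode: scatter along dim 1 requires every index to lie in [0, size of dim 1), so any label value outside that range blows up. A small standalone snippet (not from my code, just a reproduction of the same kind of error):

```python
import torch

# Valid case: 5 class columns, so indices 0..4 are in range
one_hot = torch.zeros((3, 5))
idx = torch.tensor([0, 4, 2])
one_hot = one_hot.scatter(1, idx.unsqueeze(1), 1.0)

# Invalid case: 9 is not a valid column index for a tensor of width 5
err = None
try:
    torch.zeros((3, 5)).scatter(1, torch.tensor([0, 9, 2]).unsqueeze(1), 1.0)
except RuntimeError as e:
    err = e
# err -> "index 9 is out of bounds for dimension 1 with size 5"
```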

def _one_hot_encode(self, labels):
    # Get the number of classes
    classes = torch.unique(labels)
    classes = classes[classes != 9] # unlabelled 
    self.n_classes = classes.size(0)

    # One-hot encode labeled data instances and zero rows corresponding to unlabeled instances
    unlabeled_mask = (labels == 9)
    labels = labels.clone()  # defensive copying
    labels[unlabeled_mask] = 0
    self.one_hot_labels = torch.zeros((self.n_nodes, self.n_classes), dtype=torch.float)
    self.one_hot_labels = self.one_hot_labels.scatter(1, labels.unsqueeze(1), 1)  # <-- line raising the RuntimeError
    self.one_hot_labels[unlabeled_mask, 0] = 0

    self.labeled_mask = ~unlabeled_mask

def fit(self, labels, max_iter, tol):
    
    self._one_hot_encode(labels)

    self.predictions = self.one_hot_labels.clone()
    prev_predictions = torch.zeros((self.n_nodes, self.n_classes), dtype=torch.float)

    for i in range(max_iter):
        # Stop iterations if the system is considered at a steady state
        variation = torch.abs(self.predictions - prev_predictions).sum().item()

        if variation < tol:
            break

        prev_predictions = self.predictions
        self._propagate()
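From what I understand so far, one way to avoid out-of-range indices would be to remap the raw class values (-2..2) to contiguous indices 0..n_classes-1 before scattering. A sketch of that idea (the name remap_labels is my own, not from the referenced notebook):

```python
import torch

def remap_labels(labels, unlabeled_value=9):
    """Map arbitrary class values (e.g. -2..2) to contiguous indices 0..n_classes-1.

    Unlabelled entries (marked with `unlabeled_value`) are mapped to 0 and
    reported via a mask so their one-hot rows can be zeroed afterwards.
    """
    unlabeled_mask = labels == unlabeled_value
    classes = torch.unique(labels[~unlabeled_mask])  # e.g. [-2, -1, 0, 1, 2]
    remapped = torch.zeros_like(labels)
    # searchsorted gives each label its position in the sorted class list
    remapped[~unlabeled_mask] = torch.searchsorted(classes, labels[~unlabeled_mask])
    return remapped, unlabeled_mask, classes.size(0)

labels = torch.tensor([-2, -1, 0, 1, 2, 9])
remapped, mask, n_classes = remap_labels(labels)
# remapped -> [0, 1, 2, 3, 4, 0], n_classes -> 5
one_hot = torch.zeros((labels.size(0), n_classes)).scatter(1, remapped.unsqueeze(1), 1.0)
one_hot[mask] = 0  # zero the rows of unlabelled instances
```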

Example of dataset:

   ID Target  Weight  Label  Score  Scale_Cat  Scale_num
0   A      D    65.1      1     87         Up          1
1   A      X    35.8      1     87         Up          1
2   B      C    34.7      1   37.5       Down         -2
3   B      P    33.4      1   37.5       Down         -2
4   C      B    33.1      1   37.5       Down         -2
5   S      X    21.4      0   12.5         NA          9
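For completeness, the labels_t tensor that appears in the traceback below is built from the Scale_num column of this dataset. A sketch, assuming the DataFrame above is df:

```python
import pandas as pd
import torch

# Rebuild just the relevant column of the example dataset
df = pd.DataFrame({"Scale_num": [1, 1, -2, -2, -2, 9]})

# scatter requires int64 indices, hence dtype=torch.long
labels_t = torch.tensor(df["Scale_num"].to_numpy(), dtype=torch.long)
```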

The source code I am using as reference is here: https://mybinder.org/v2/gh/thibaudmartinez/label-propagation/master?filepath=notebook.ipynb

Full traceback of the error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-126-792a234f63dd> in <module>
      4 label_propagation = LabelPropagation(adj_matrix_t)
----> 6 label_propagation.fit(labels_t) # causing error
      7 label_propagation_output_labels = label_propagation.predict_classes()
      8 

<ipython-input-115-54a7dbc30bd1> in fit(self, labels, max_iter, tol)
    100 
    101     def fit(self, labels, max_iter=1000, tol=1e-3):
--> 102         super().fit(labels, max_iter, tol)
    103 
    104 ## Label spreading

<ipython-input-115-54a7dbc30bd1> in fit(self, labels, max_iter, tol)
     58             Convergence tolerance: threshold to consider the system at steady state.
     59         """
---> 60         self._one_hot_encode(labels)
     61 
     62         self.predictions = self.one_hot_labels.clone()

<ipython-input-115-54a7dbc30bd1> in _one_hot_encode(self, labels)
     42         labels[unlabeled_mask] = 0
     43         self.one_hot_labels = torch.zeros((self.n_nodes, self.n_classes), dtype=torch.float)
---> 44         self.one_hot_labels = self.one_hot_labels.scatter(1, labels.unsqueeze(1), 1)
     45         self.one_hot_labels[unlabeled_mask, 0] = 0
     46 

RuntimeError: index 1 is out of bounds for dimension 1 with size 1


