Hemant Vishwakarma: How to set multiple sequences as features in KERAS

Saturday, 15 May 2021

How to set multiple sequences as features in KERAS

I want to build a Named Entity Recognition (NER) model with Keras. These are the tutorials I have followed:

https://valueml.com/named-entity-recognition-using-lstm-in-keras/
https://djajafer.medium.com/named-entity-recognition-and-classification-with-keras-4db04e22503d

The data looks like this:

                word label
0          Thousands     O
1                 of     O
2      demonstrators     O
3               have     O
4            marched     O
...              ...   ...
44187          there     O
44188   accidentally     O
44189             or     O
44190   deliberately     O
44191              .     O

Both tutorials map words and labels to integer indices, so that X holds my features (index sequences of words) and y holds my targets (index sequences of labels):

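from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

max_len = 30
# Words -> indices, post-padded with the last vocabulary index (num_words-1)
X = [[word2idx[w[0]] for w in s] for s in list_of_sentances]
X = pad_sequences(maxlen=max_len, sequences=X, padding="post", value=num_words-1)

# Labels -> indices, post-padded with the "O" label, then one-hot encoded
y = [[label2idx[w[1]] for w in s] for s in list_of_sentances]
y = pad_sequences(maxlen=max_len, sequences=y, padding="post", value=label2idx["O"])
y = [to_categorical(i, num_classes=num_labels) for i in y]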
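
The snippet above uses word2idx, label2idx, num_words and num_labels without defining them. A minimal sketch of how they might be built, assuming list_of_sentances is a list of sentences and each sentence is a list of (word, label) tuples. The toy sentences, the "B-geo" label and the "ENDPAD" padding token are illustrative assumptions, not taken from my data:

```python
# Toy data (hypothetical): each sentence is a list of (word, label) tuples.
list_of_sentances = [
    [("Thousands", "O"), ("of", "O"), ("demonstrators", "O")],
    [("London", "B-geo"), ("marched", "O")],
]

# Collect the vocabulary. "ENDPAD" is an assumed padding token appended last,
# so its index equals num_words - 1 (the padding value used for X).
words = sorted({w for s in list_of_sentances for w, _ in s})
words.append("ENDPAD")
labels = sorted({l for s in list_of_sentances for _, l in s})

num_words = len(words)
num_labels = len(labels)

# Map every word and every label to an integer index.
word2idx = {w: i for i, w in enumerate(words)}
label2idx = {l: i for i, l in enumerate(labels)}
```

With these dictionaries in place, padding X with num_words-1 means padding with the index of "ENDPAD".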

But what if my dataset has an additional column, POS (part of speech)? How can I add the values of the POS column to my features? I do not want only word values in my X; I also want POS values (or any other values) in my X. What if I have multiple columns, such as:

word
POS
is_capital_letter
word_length

...

How can I add all of these columns to my features?
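
To make the question concrete, here is a rough sketch of what I mean by "multiple sequences as features": one padded sequence per column, all aligned token by token. It is plain Python and rests on several assumptions: the toy (word, POS, label) tuples are made up, build_index and pad are hypothetical helpers, "ENDPAD" is an assumed padding token, and is_capital_letter is taken to mean "the first character is uppercase".

```python
# Hypothetical toy data: each token is a (word, POS, label) tuple.
sentences = [
    [("Thousands", "NNS", "O"), ("of", "IN", "O"), ("demonstrators", "NNS", "O")],
    [("London", "NNP", "B-geo"), ("marched", "VBD", "O")],
]
max_len = 4


def build_index(values, pad_token="ENDPAD"):
    """Map each distinct value to an integer; the pad token gets the last index."""
    vocab = sorted(set(values)) + [pad_token]
    return {v: i for i, v in enumerate(vocab)}


def pad(seq, pad_value, length=max_len):
    """Keep the first `length` items and right-pad shorter sequences."""
    return list(seq[:length]) + [pad_value] * max(0, length - len(seq))


word2idx = build_index(w for s in sentences for w, _, _ in s)
pos2idx = build_index(p for s in sentences for _, p, _ in s)

# Categorical columns: one padded index sequence per column.
X_word = [pad([word2idx[w] for w, _, _ in s], word2idx["ENDPAD"]) for s in sentences]
X_pos = [pad([pos2idx[p] for _, p, _ in s], pos2idx["ENDPAD"]) for s in sentences]

# Numeric columns: [is_capital_letter, word_length] per token, zero-padded.
X_num = [
    pad([[float(w[0].isupper()), float(len(w))] for w, _, _ in s], [0.0, 0.0])
    for s in sentences
]
```

One possible way to feed these to Keras (my assumption, not something the linked tutorials show) would be to give X_word and X_pos each their own Input and Embedding layer, pass X_num through a third Input of shape (max_len, 2), and join the three with a Concatenate layer before the LSTM.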
