I want to build a Named Entity Recognition model with Keras. These are the links that I have followed:
https://valueml.com/named-entity-recognition-using-lstm-in-keras/
https://djajafer.medium.com/named-entity-recognition-and-classification-with-keras-4db04e22503d
Data looks like this:
word label
0 Thousands O
1 of O
2 demonstrators O
3 have O
4 marched O
... ... ...
44187 there O
44188 accidentally O
44189 or O
44190 deliberately O
44191 . O
They convert words to vectors by indexing the words and labels, so that X holds my features (index sequences of words) and y holds my targets (index sequences of labels):
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

max_len = 30
# Features: one index sequence per sentence, padded to max_len with "ENDPAD" (index num_words - 1)
X = [[word2idx[w[0]] for w in s] for s in list_of_sentances]
X = pad_sequences(maxlen=max_len, sequences=X, padding="post", value=num_words - 1)
# Targets: label index sequences, padded with "O" and one-hot encoded
y = [[label2idx[w[1]] for w in s] for s in list_of_sentances]
y = pad_sequences(maxlen=max_len, sequences=y, padding="post", value=label2idx["O"])
y = [to_categorical(i, num_classes=num_labels) for i in y]
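For reference, word2idx, label2idx, num_words and num_labels come from an indexing step like this, following the linked tutorials. This is a minimal sketch assuming list_of_sentances is a list of sentences, each a list of (word, label) tuples:

# Vocabulary over all words plus an explicit "ENDPAD" padding token,
# so the padding value num_words - 1 points at "ENDPAD"
words = list({w[0] for s in list_of_sentances for w in s}) + ["ENDPAD"]
labels = list({w[1] for s in list_of_sentances for w in s})
num_words, num_labels = len(words), len(labels)
word2idx = {w: i for i, w in enumerate(words)}
label2idx = {l: i for i, l in enumerate(labels)}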
But what if I have a dataset like this, with an additional POS column next to word and label? How can I add the values of the POS column to my features? So basically, I do not want only word values in my X; I also want POS values in my X (or any other values). What if I have multiple columns, such as:
word
POS
is_capital_letter
word_length
...
How can I add all of these columns to my features?
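My assumption is that each extra categorical column can be indexed the same way as the words. Here is my sketch; pos2idx, num_pos and X_pos are names I made up, mirroring word2idx, num_words and X, and I assume each sentence is now a list of (word, POS, label) tuples:

# The POS column gets its own vocabulary and its own padded index
# sequences, built exactly like the word features above
pos_tags = list({w[1] for s in list_of_sentances for w in s}) + ["ENDPAD"]
num_pos = len(pos_tags)
pos2idx = {p: i for i, p in enumerate(pos_tags)}
X_pos = [[pos2idx[w[1]] for w in s] for s in list_of_sentances]
X_pos = pad_sequences(maxlen=max_len, sequences=X_pos, padding="post", value=num_pos - 1)

Numeric columns like word_length (or booleans like is_capital_letter) would presumably not need a vocabulary; they could be padded directly and fed in without an Embedding.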
This is my model:

import numpy as np
from sklearn.model_selection import train_test_split

X = np.array(X)
y = np.array(y)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
print("x_train shape", x_train.shape)
print("x_test shape", x_test.shape)
# x_train shape (750, 30)
# x_test shape (250, 30)
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Embedding, SpatialDropout1D,
                                     Bidirectional, LSTM, TimeDistributed, Dense)

# vocab_len is my word vocabulary size; num_labels is the number of tags
input_word = Input(shape=(max_len,))
model = Embedding(input_dim=vocab_len + 1, output_dim=75, input_length=max_len)(input_word)
model = SpatialDropout1D(0.25)(model)
model = Bidirectional(LSTM(units=25, return_sequences=True, recurrent_dropout=0.2))(model)
out = TimeDistributed(Dense(num_labels, activation="softmax"))(model)
model = Model(input_word, out)
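And this is how I imagine the multi-feature version: one Input and one Embedding per categorical column, joined per time step with the standard Keras Concatenate layer before the BiLSTM. A sketch under my assumptions above (num_pos and X_pos come from my POS sketch):

from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Embedding, SpatialDropout1D,
                                     Bidirectional, LSTM, TimeDistributed,
                                     Dense, Concatenate)

# One input per feature column, each a padded index sequence of length max_len
input_word = Input(shape=(max_len,), name="word_in")
input_pos = Input(shape=(max_len,), name="pos_in")

# Separate embeddings per categorical feature, concatenated per time step
emb_word = Embedding(input_dim=vocab_len + 1, output_dim=75)(input_word)
emb_pos = Embedding(input_dim=num_pos + 1, output_dim=10)(input_pos)
x = Concatenate()([emb_word, emb_pos])
x = SpatialDropout1D(0.25)(x)
x = Bidirectional(LSTM(units=25, return_sequences=True, recurrent_dropout=0.2))(x)
out = TimeDistributed(Dense(num_labels, activation="softmax"))(x)

model = Model([input_word, input_pos], out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# Training would then take a list of feature arrays, e.g.
# model.fit([x_train_word, x_train_pos], y_train, ...)

Is this the right way to feed multiple sequence features into the model?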