I have built a Keras model that detects whether a string value is an Address, a Company or a Date. I trained it only on company names, date formats and street addresses, so each row in my dataset contains between 1 and 5 words (some of which can be numbers).
For preprocessing I have used vectorizers:
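from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer

# `features` is a DataFrame whose 'text' column holds the raw strings
transformerVectoriser = ColumnTransformer(
    transformers=[
        # character n-grams of length 3 to 6
        ('vector char', CountVectorizer(analyzer='char', ngram_range=(3, 6), max_features=2000), 'text'),
        # single-word counts
        ('vector word', CountVectorizer(analyzer='word', ngram_range=(1, 1), max_features=4000), 'text'),
    ],
    remainder='passthrough')  # Default is to drop untransformed columns
features = transformerVectoriser.fit_transform(features)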
This is my model:
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(100, input_dim = features.shape[1], activation = 'relu')) # input layer requires input_dim param
model.add(Dense(200, activation = 'relu'))
model.add(Dense(100, activation = 'relu'))
model.add(Dense(50, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
I have achieved an accuracy of 93%. Is it possible to use this model to detect where such a string (Address, Company or Date) occurs in a larger text? I think models of that kind are called NER (named entity recognition) models.
My model takes a string input of 1 to 5 words and decides whether it is a company, an address or a date.
For example, if I have this text:
text = "What do you think about Amazon and their customer policy?"
How can I detect the start index and end index of "Amazon", or how can I extract only "Amazon" from this text? The required result would be:
start_index = 24
end_index = 30
Or:
result = 'Amazon'
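To make the target output concrete, here is a small check of the offsets using plain string search. It assumes zero-based indexing with an exclusive end index (the usual Python slicing convention), under which "Amazon" spans 24 to 30 in this sentence. It only works because the entity is already known, so it illustrates the desired result rather than a way to find it.

```python
text = "What do you think about Amazon and their customer policy?"
target = "Amazon"  # known in advance here; the real task is to discover it

# str.find returns the zero-based offset of the first match, or -1 if absent
start_index = text.find(target)
end_index = start_index + len(target)  # exclusive end, so text[start:end] is the entity
result = text[start_index:end_index]

print(start_index, end_index, result)  # 24 30 Amazon
```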
I want to use my model to extract specific entities from text, in this case a company, an address or a date. Basically I need some kind of location search that is based on the Keras model, or a way of searching for the pattern in a large text.
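One way such a location search might be sketched is a sliding window: score every run of 1 to 5 consecutive words with the classifier and keep the confident, non-overlapping ones. Everything named here is an assumption for illustration: `find_entities`, the `classify` callback, the 0.9 threshold and the toy classifier are hypothetical, not part of Keras. In practice `classify` would wrap `transformerVectoriser.transform` and `model.predict`. Because the softmax layer has only three outputs and no "none of these" class, the scores for ordinary words may be unreliable, so treat this as a rough starting point only.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str, str, float]  # (start, end, text, label, score)

def find_entities(text: str,
                  classify: Callable[[str], Tuple[str, float]],
                  max_words: int = 5,
                  threshold: float = 0.9) -> List[Span]:
    """Score every window of 1..max_words words; keep confident, non-overlapping spans."""
    # Character offsets of each word, so results map back into the original text
    words = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    candidates: List[Span] = []
    for i in range(len(words)):
        for j in range(i, min(i + max_words, len(words))):
            start, end = words[i][0], words[j][1]
            label, score = classify(text[start:end])
            if score >= threshold:
                candidates.append((start, end, text[start:end], label, score))
    # Greedy selection: take the highest-scoring spans first, skip overlaps
    candidates.sort(key=lambda c: c[4], reverse=True)
    chosen: List[Span] = []
    for cand in candidates:
        if all(cand[1] <= c[0] or cand[0] >= c[1] for c in chosen):
            chosen.append(cand)
    return sorted(chosen)  # back into reading order

def toy_classify(s: str) -> Tuple[str, float]:
    # Hypothetical stand-in for the trained model: confident only about "Amazon"
    return ("Company", 0.99) if s == "Amazon" else ("Company", 0.10)

text = "What do you think about Amazon and their customer policy?"
spans = find_entities(text, toy_classify)
print(spans)
```

With the toy classifier, the only span returned is "Amazon" with its start and end offsets. A real classifier is called once per window, so batching the windows into a single `predict` call would likely be worthwhile on long texts.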
from Creating NER model with Keras and Python