Say I have three example strings:
text1 = "Patient has checked in for abdominal pain which started 3 days ago. Patient was prescribed idx 20 mg every 4 hours."
text2 = "The time of discomfort was 3 days ago."
text3 = "John was given a prescription of idx, 20mg to be given every four hours"
If I collected all the exactly matching substrings between each pair of these strings, I would get:
text1_text2_common = [
'3 days ago.',
]
text2_text3_common = [
'of',
]
text1_text3_common = [
'was',
'idx',
'every',
'hours',
]
What I am looking for is fuzzy matching, using something like the Levenshtein distance [ https://ift.tt/5kYXErf ]. Even if two substrings are not identical, they should be selected as a match when they are similar enough according to some criterion.
So ideally I am looking for something like:
text1_text3_common_fuzzy = [
'prescription of idx, 20mg to be given every four hours'
]
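To make the idea concrete, here is a minimal brute-force sketch of what such a fuzzy match could look like. It is not a definitive solution. The function names (`levenshtein`, `similarity`, `fuzzy_common_substrings`), the word-window sizes, and the 0.8 threshold are all assumptions chosen for illustration. It compares every word window of one string against every word window of the other and keeps the pairs whose normalised Levenshtein similarity clears the threshold.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # drop ca
                               current[j - 1] + 1,       # drop cb
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def word_windows(text: str, min_words: int, max_words: int) -> list:
    """All runs of min_words..max_words consecutive whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + n])
            for n in range(min_words, max_words + 1)
            for i in range(len(words) - n + 1)]


def fuzzy_common_substrings(source, target, min_words=2, max_words=6,
                            threshold=0.8):
    """Return (target_window, best_source_window, score), best score first.

    Hypothetical helper: the parameters are illustrative assumptions.
    """
    source_windows = word_windows(source, min_words, max_words)
    matches = []
    for candidate in word_windows(target, min_words, max_words):
        if not source_windows:
            break
        best = max(source_windows, key=lambda s: similarity(s, candidate))
        score = similarity(best, candidate)
        if score >= threshold:
            matches.append((candidate, best, score))
    return sorted(matches, key=lambda m: m[2], reverse=True)


if __name__ == "__main__":
    for hit in fuzzy_common_substrings("take idx 20 mg daily",
                                       "given idx 20mg daily now"):
        print(hit)
```

This compares every pair of windows, so it is only practical for short texts, and overlapping windows produce overlapping matches that would still need to be merged. Differences such as "4" versus "four" cost several character edits, so a long match like the one above would need a lower threshold or some normalisation first.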
from How to get all fuzzy matching substrings between two strings in python?