My inputs look like key: "a word" or anotherkey: "a word (1234)". I am using the following grammar:
import pyparsing as pp
word = pp.Word(pp.printables, excludeChars=":")
word = ("[" + pp.Word(pp.printables + " ", excludeChars=":[]") + "]") | word
non_tag = word + ~pp.FollowedBy(":")
# tagged value is two words with a ":"
tag = pp.Group(word + ":" + word)
# one or more non-tag words - use originalTextFor to get back
# a single string, including intervening white space
phrase = pp.originalTextFor(non_tag[1, ...])
parser = (phrase | tag)[...]
When my input is like key: "value1" and hey you how are you? the parser produces the expected output, which is ([(['key', ':', '"value1"'], {}), 'and hey you how are you?'], {}), but a problem occurs when the value after key: contains a space:
parser.parseString('key: "Microsoft windows (12932)" and hey you how are you?')
([(['key', ':', '"Microsoft'], {}), 'windows (12932)" and hey you how are you?'], {})
It breaks between Microsoft and windows. I know pyparsing skips whitespace, but how can I solve this issue and capture the value up to its end, which is the closing double quote?
EDIT-1: I tried to work around this problem by adding another definition of word, like below:
word = ('"' + pp.Word(pp.printables + " ", excludeChars=':"') + '"') | word
It works on queries like key: "windows server (23232)" but not on more complex queries like key1: value and key2: "windows server (1212)". Does anyone have a clue about this issue and how I can work around this behavior?
EDIT-2: What do I expect? I need to extend my grammar so that a query like the one below is parsed correctly:
'key: "Microsoft windows (12932)" and hey you how are you?'
It should NOT be:
([(['key', ':', '"Microsoft'], {}), 'windows (12932)" and hey you how are you?'], {})
It should be:
([(['key', ':', '"Microsoft windows (12932)"'], {}), 'and hey you how are you?'], {})
Such a query can be combined with more keys and with free-text search, like below:
A free text search and key1: "Microsoft windows (12312)" and key2: "Sample2" or key3: "Another sample (121212)"
This should also get parsed like below:
part1: A free text search and
part2: ['key1', ':', '"Microsoft windows (12312)"']
part3: ['key2', ':', '"Sample2"']
part4: ['key3', ':', '"Another sample (121212)"']
NOTE: It is OK for me if and/or stay attached to the neighboring tokens. I just need to separate the free-text search from the key:value queries.
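For illustration, here is a minimal sketch of one direction that might give the output described above. It assumes that letting the tag value be either a quoted string or a plain word is acceptable, and it uses pyparsing's built-in pp.quotedString for the quoted case. Everything else is the grammar from the top of the question; this is an untested-in-production guess at a fix, not a confirmed answer.

```python
import pyparsing as pp

# Same word definitions as in the question.
word = pp.Word(pp.printables, excludeChars=":")
word = ("[" + pp.Word(pp.printables + " ", excludeChars=":[]") + "]") | word
non_tag = word + ~pp.FollowedBy(":")

# Assumption: a tag value may be a whole quoted string (spaces included),
# falling back to a single word when it is not quoted.
value = pp.quotedString | word
tag = pp.Group(word + ":" + value)

# One or more non-tag words, returned as a single string.
phrase = pp.originalTextFor(non_tag[1, ...])
parser = (phrase | tag)[...]

query = 'key: "Microsoft windows (12932)" and hey you how are you?'
print(parser.parseString(query).asList())
# [['key', ':', '"Microsoft windows (12932)"'], 'and hey you how are you?']
```

With this sketch, and/or between tags come back as short free-text phrases of their own, which fits the note above that attached and/or tokens are acceptable.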
from Why search term with space does not parse correctly in pyparsing?