Thursday, 18 April 2019

Splitting a boolean search string into its constituent parts / counting what causes SQL SELECT matches

As per my previous question, a software recruiter can enter a boolean text string, such as C++ AND ((UML OR Python) OR (not Perl)), which I will translate to SELECT * FROM candidates WHERE skill=C++ AND ((skill=UML OR skill=Python) OR (not skill=Perl)).

I could like to COUNT(*) the number of hits, but I would also be very interested to know how much each "sub-clause" (if that's the correct term) of the query contributed to the result.

E.g. there might have been 200 candidates with C++, but 50 were not suitable because they have neither UML nor Python experience.

So, using either PHP (and rexex?) or MySql, how can I break that down to see which parts of the search term contribute to the result?

I.e, break down skill=C++ AND ((skill=UML OR skill=Python) OR (not skill=Perl)) into COUNT(*) WHERE skill=C++ and `COUNT (*) WHERE (skill=UML OR skill=Python), etc

I don't know if MySql has some sort of EXPLAIN for this, but suspect not, so that I will have to break out the SELECT as described and COUNT each sub-clause separately.

I hope that I have explained this clearly; if not, please ask for clarification. I simply don't know where to begin



from Splitting a boolean search string into its constituent parts / counting what causes SQL SELECT matches

No comments:

Post a Comment