Monday 31 May 2021

How to automate parameters passed into pandas.read_sql?

I am trying to create a methodology for passing parameters automatically through something like locals(), similarly to how f-strings work.

How it currently works

import pandas as pd

def my_func(conn, string_id, date, integer_ids):
    sql = f"""    
    select * from RISK a
    where STRING_ID = '{string_id}'
    and DATE = {date}
    and INTEGER_ID in ({','.join(map(str, integer_ids))})"""
    df = pd.read_sql(sql, conn)
    return df

However, this approach means I cannot copy-paste the SQL into SQL developer or similar, and run it from there. So I would like an approach that makes use of parameters instead.

There seems to be two problems with that

  1. Parameters must be literals, so its not possible to pass along lists
  2. I need to create a dictionary manually, and cannot simply pass something like locals()

How I would like it to work would be something like the example below (which obviously doesn't work)

import pandas as pd

def my_func(conn, string_id, date, integer_ids):
    sql = """    
    select * from RISK
    where STRING_ID = :string_id
    and DATE = :date
    and INTEGER_ID in :integer_ids"""
    df = pd.read_sql(sql, conn, params=locals())
    return df

EDIT: After testing a bit, maybe I could use a regex to find all instances of :param and replace them with the parameter value, e.g.

import re
pattern = '[\n\r].*:\s*([^\n\r]*)'
matches = re.findall(pattern,sql)
for match in matches:
    sql = sql.replace(':'+match, eval(match))

It's just not very pretty, and it introduces issues related to object types. E.g. the string_id should be encapsulated by ', and the date needs to be converted to a string object as well



from How to automate parameters passed into pandas.read_sql?

No comments:

Post a Comment