Wednesday, 6 July 2022

Use cursor.executemany instead of cursor.execute with tying panda header to sql variables

I have a python script that updates rows in an oracle sql table correctly, however I am using cursor.execute and try/except so if one update fails, it kills the whole run.

I want to be able to have it run through the whole update and just log the error and move onto the next one, which is where cursor.executemany comes in.

https://cx-oracle.readthedocs.io/en/latest/user_guide/batch_statement.html

Here is the script, it works great, except for the all or nothing error approach.

#oracle sql update statement for SHRTCKN
banner_shrtckn_update = """
UPDATE SATURN.SHRTCKN A
SET A.SHRTCKN_COURSE_COMMENT = :course_comment,
    A.SHRTCKN_REPEAT_COURSE_IND = :repeat_ind,
    A.SHRTCKN_ACTIVITY_DATE = SYSDATE,
    A.SHRTCKN_USER_ID = 'STU00940',
    A.SHRTCKN_DATA_ORIGIN = 'APPWORX'
WHERE A.SHRTCKN_PIDM = gb_common.f_get_pidm(:id) AND
      A.SHRTCKN_TERM_CODE = :term_code AND
      A.SHRTCKN_SEQ_NO = :seqno AND
      A.SHRTCKN_CRN = :crn AND
      A.SHRTCKN_SUBJ_CODE = :subj_code AND
      A.SHRTCKN_CRSE_NUMB = :crse_numb
"""

def main():
    # get time of run and current year
    now = datetime.datetime.now()
    year = str(now.year)+"40"
    # configure connection strings for banner PROD
    db_pass = os.environ['DB_PASSWORD']
    dsn = cx_Oracle.makedsn(host='FAKE', port='1521', service_name='TEST.FAKE.BLAH')
    try: # initiate banner db connection -- PROD
        banner_cnxn = cx_Oracle.connect(user=config.db_test['user'], password = db_pass, dsn=dsn)
        writeLog("---- Oracle Connection Made ----")
        insertCount = 0
        for index, row in df.iterrows():
            shrtcknupdate(row,banner_cnxn)
            insertCount = insertCount + 1
        banner_cnxn.commit()
        banner_cnxn.close()
        writeLog(str(insertCount)+" rows updated")
    except Exception as e:
        print("Error: "+str(e))
        writeLog("Error: "+str(e))

def writeLog(content):
    print(content)
    log.write(str(datetime.date.today())+" "+content+"\n")

#define the variable connection between panda/csv and SHRTCKN table
def shrtcknupdate(row, connection):
    sql = banner_shrtckn_update
    variables = {
        'id'             : row.Bear_Nbr,
        'term_code'      : row.Past_Term,
        'seqno'          : row.Seq_No,
        'crn'            : row.Past_CRN,
        'subj_code'      : row.Past_Prefix,
        'crse_numb'      : row.Past_Number,
        'course_comment' : row.Past_Course_Comment,
        'repeat_ind'     : row.Past_Repeat_Ind
    }
    cursor = connection.cursor()
    cursor.execute(sql, variables)

if __name__ == "__main__":
    writeLog("-------- Process Start --------")
    main()
    writeLog("-------- Process End --------")

The executemany option, I can turn on batcherrors=True

and it will do exactly what I need.

The problem I am running into, is if I get rid of the for loop that runs through the excel/panda dataframe to update the oracle sql rows, which is not needed when doing the update in batch, then how do I attach the column headers to the sql update statement.

If I leave in the for loop, I get this error when using executemany:

Error: parameters should be a list of sequences/dictionaries or an integer specifying the number of times to execute the statement



from Use cursor.executemany instead of cursor.execute with tying panda header to sql variables

No comments:

Post a Comment