Tuesday, 15 December 2020

How to speed up returning a 20MB Json file from a Python-Flask application?

I am trying to call an API which in turn triggers a store procedure from our sqlserver database. This is how I coded it.

class Api_Name(Resource):

    def __init__(self):
        pass

    @classmethod
    def get(self):
        try:
            engine = database_engine
            connection = engine.connect()
            sql = "DECLARE @return_value int EXEC @return_value = [dbname].[dbo].[proc_name])
            return call_proc(sql, apiname, starttime, connection)
        except Exception as e:
            return {'message': 'Proc execution failed with error => {error}'.format(error=e)}, 400
        pass

call_proc is the method where I return the JSON from database.

def call_proc(sql: str, connection):
    try:
        json_data = []
        rv = connection.execute(sql)
        for result in rv:
            json_data.append(dict(zip(result.keys(), result)))
        return Response(json.dumps(json_data), status=200)
    except Exception as e:
        return {'message': '{error}'.format(error=e)}, 400
    finally:
        connection.close()

The problem with the output is the way JSON is returned and the size of it. At first the API used to take 1minute 30seconds: when the return statement was like this:

case1: return Response(json.dumps(json_data), status=200, mimetype='application/json')

After looking online, I found that the above statement is trying to prettify JSON. So I removed mimetype from the response & made it as

case2: return Response(json.dumps(json_data), status=200)

The API runs for 30seconds, although the JSON output is not aligned properly but its still JSON. I see the output size of the JSON returned from the API is close 20MB. I observed this on postman response:

Status: 200 OK    Time: 29s    Size: 19MB

The difference in Json output:

case1:

[   {
        "col1":"val1",
        "col2":"val2"
    },
    {
        "col1":"val1",
        "col2":"val2"
    }
]

case2:

[{"col1":"val1","col2":"val2"},{"col1":"val1","col2":"val2"}]

Will the difference in output from the two aforementioned cases are different ? If so, how can I fix the problem ? If there is no difference, is there any way I speed up this further and reduce the run time further more, like compressing the JSON which I am returning ?



from How to speed up returning a 20MB Json file from a Python-Flask application?

No comments:

Post a Comment