Tuesday 3 November 2020

AWS Lambda - read csv and convert to pandas dataframe

I have a simple Lambda function that reads a CSV file from an S3 bucket. That part works fine, but when I try to load the CSV data into a pandas data frame I get the error: string indices must be integers

My code is bog-standard; I just need the CSV as a data frame for further manipulation. The hashed line is the source of the error. I can print data with no problems, so the bucket and file details are configured properly.

Updated code:

import json
import pandas as pd
import numpy as np
import requests
import glob
import time
import os
from datetime import datetime
from csv import reader
import boto3
import traceback
import io

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    try:
            
        bucket_name = event["Records"][0]["s3"]["bucket"]["name"]
        s3_file_name = event["Records"][0]["s3"]["object"]["key"]
        resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
        
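        data = resp['Body'].read().decode('utf-8')
        # csv.reader iterates a plain string one character at a time,
        # so wrap the text in a file-like object and let pandas parse it.
        df = pd.read_csv(io.StringIO(data))
        print(df.head())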

    except Exception as err:
        print(err)
        # Print the full stack trace to the logs to locate the failing line.
        traceback.print_exc()

    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
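
A minimal sketch of why passing the decoded string straight to csv.reader does not give CSV rows, using only the standard library. The sample text is a hypothetical stand-in for the S3 object body, not data from the original post. Iterating a str yields one character at a time, so reader treats each character as its own line; wrapping the text in io.StringIO makes iteration yield whole lines instead.

```python
import io
from csv import reader

# Hypothetical stand-in for resp['Body'].read().decode('utf-8').
sample = "name,score\nalice,1\nbob,2\n"

# Iterating a str yields single characters, so every character
# is parsed as a separate "line" of CSV input.
char_rows = list(reader(sample))

# StringIO is a file-like object: iterating it yields whole lines,
# which is what csv.reader expects.
line_rows = list(reader(io.StringIO(sample)))

print(len(char_rows), "rows when reading the raw string")
print(len(line_rows), "rows when reading through StringIO")
print(line_rows)
```

The same io.StringIO(data) wrapper can be handed to pd.read_csv, which also takes the first line as the column header by default.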


from AWS Lambda - read csv and convert to pandas dataframe
