Levelset Engineering Download S3 CSV to Pandas Dataframe

Created by Patrick Marlow

import pandas as pd
import io
import boto3

def s3_download_csv_to_df(bucket, key):
    """Download the specified CSV from S3 and load it into a Pandas DataFrame.

    bucket -- name of the S3 bucket to connect to
    key -- path of the CSV file within the bucket (the S3 object key)

    Example of how an S3 URI splits into bucket and key:
    
    s3://advanced-data-algorithms/data/somefolder/anotherfolder/pairs_data_output_05112020.csv
         ^-------bucket---------^ ^--------------------------key-----------------------------^
         
    In the above example, the value for key would be:
        data/somefolder/anotherfolder/pairs_data_output_05112020.csv
        
    More detailed information about S3 Object Keys can be found here:
    https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html
    """
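
    # Fetch the object; boto3 resolves credentials through its default chain
    s3 = boto3.client('s3')
    obj = s3.get_object(Bucket=bucket, Key=key)

    # Read the whole response body into memory and parse it as CSV
    df = pd.read_csv(io.BytesIO(obj['Body'].read()))

    return df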
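
The docstring above describes how an S3 URI breaks down into a bucket and a key. If you only have the full `s3://` URI on hand, a small helper along these lines could do the split for you. This is a minimal sketch, not part of the original snippet: the name `split_s3_uri` is hypothetical, and it assumes the URI follows the plain `s3://bucket/key` form shown in the docstring.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into a (bucket, key) tuple.

    Hypothetical helper, not part of the original snippet.
    """
    parsed = urlparse(uri)

    # Only plain s3:// URIs are handled in this sketch
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")

    # The bucket is the host part; the key is the path without its leading slash
    bucket = parsed.netloc
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError(f"S3 URI has no object key: {uri!r}")

    return bucket, key
```

The resulting pair can then be passed straight through, for example `s3_download_csv_to_df(*split_s3_uri(uri))`.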
