Upload files to S3 with Python (keeping the original folder structure)

This is a sample script for uploading multiple files to S3 while keeping the original folder structure. Doing this manually can be a bit tedious, especially if there are many files to upload spread across different folders. This code does the hard work for you: just call upload_files('/path/to/my/folder'), passing the path of the local folder that contains the files.

Install Boto3

You will need to install Boto3 first:

pip install boto3
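
Boto3 also needs AWS credentials. The script below passes them explicitly to boto3.Session, but if you prefer not to hard-code keys, boto3 can also pick them up from environment variables or from the shared credentials file. A minimal sketch, assuming one of those is already configured:

import boto3

# Assumes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (and a default region)
# are set in the environment, or that a default profile exists in
# ~/.aws/credentials.
session = boto3.Session()   # uses boto3's default credential chain
s3 = session.resource('s3')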

Script

import boto3
import os

def upload_files(path):
    # Replace the placeholders with your own credentials, region and bucket name.
    session = boto3.Session(
        aws_access_key_id='YOUR_AWS_ACCESS_KEY_ID',
        aws_secret_access_key='YOUR_AWS_SECRET_ACCESS_KEY',
        region_name='YOUR_AWS_ACCOUNT_REGION'
    )
    s3 = session.resource('s3')
    bucket = s3.Bucket('YOUR_BUCKET_NAME')

    # Walk the local folder tree and upload every file it contains.
    for subdir, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(subdir, file)
            with open(full_path, 'rb') as data:
                # Strip the base path (and its trailing separator) so the S3 key
                # only reproduces the structure inside the given folder.
                bucket.put_object(Key=full_path[len(path)+1:], Body=data)

if __name__ == "__main__":
    upload_files('/path/to/my/folder')
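
A couple of caveats about how the key is built: os.path.join uses the platform separator, so on Windows the generated keys would contain backslashes, and a trailing slash in the argument shifts the full_path[len(path)+1:] slice by one character. Below is a slightly more defensive sketch of the same idea; the function and bucket names are illustrative, and credentials are assumed to be configured as above.

import boto3
import os

def upload_files_portable(path, bucket_name='YOUR_BUCKET_NAME'):
    # Same walk as above, but the key is built with os.path.relpath and
    # normalised to forward slashes, so it also behaves correctly on Windows
    # and when `path` is passed with a trailing slash.
    s3 = boto3.Session().resource('s3')   # assumes configured credentials
    bucket = s3.Bucket(bucket_name)

    for subdir, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(subdir, file)
            key = os.path.relpath(full_path, path).replace(os.sep, '/')
            bucket.upload_file(full_path, key)   # opens and streams the file itself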

The script strips the local base path when creating the objects on S3. For example, if we execute upload_files('/my_data') with the following local structure:

/my_data/photos00/image1.jpg
/my_data/photos01/image1.jpg

The resulting structure on S3 will be:

/photos00/image1.jpg
/photos01/image1.jpg
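
To double-check the result, you can list the keys that ended up in the bucket. A quick sketch, reusing the bucket object from the script above:

# Print every key currently stored in the bucket.
for obj in bucket.objects.all():
    print(obj.key)   # e.g. photos00/image1.jpg, photos01/image1.jpg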

Comments

  1. This code greatly helped me to upload files to S3. Thank you! However, I want to upload the files to a specific subfolder on S3, and I'm not quite sure how to do it.

    For example: “datawarehouse” is my main bucket, where I can upload easily with the above code. But I want to upload to this path: datawarehouse/Import/networkreport.

    Can you please help me do it within this code?

    1. You should be able to do it by prepending the subfolder you want to start in when the S3 key is built.

      In the example code, keep
      full_path = os.path.join(subdir, file)
      as it is, and change the upload line to:
      bucket.put_object(Key='Import/networkreport/' + full_path[len(path)+1:], Body=data)

    2. Another way is to prepend the subfolder directly in the put_object call:

      bucket.put_object(Key='Subfolder/' + full_path[len(path)+1:], Body=data)

      Hope it helps! (A fuller sketch of this idea appears after the comments.)

  2. I don't know why I am getting this error:
    EndpointConnectionError: Could not connect to the endpoint URL:

    Please help me to solve this.

    1. This usually happens when boto3 cannot reach the S3 endpoint at all, for example because region_name is misspelled or still set to a placeholder, or there is no network access. Check the region first; a missing bucket permission or IAM policy would normally show up as an AccessDenied error instead.

  3. Hi,

    This is very helpful, but I need to upload the files to another bucket, and I would like to create that bucket if it does not exist and then upload the file.
    Source S3 bucket name: ABC/folder1/file1
    New S3 bucket name (create if it does not exist): folder1/file1

  4. I am very new to Python and I wanted to use the code above as a template to upload files from a directory to an S3 bucket.
    In the code above, where do I put the path to my source directory?
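
Since the first comment above asks how to upload into a specific subfolder of the bucket, here is a fuller sketch of that idea as an optional key prefix. The key_prefix parameter and its default are illustrative, and credentials are assumed to be configured as in the script above.

import boto3
import os

def upload_files(path, key_prefix=''):
    # key_prefix is prepended to every S3 key, e.g. 'Import/networkreport/'.
    session = boto3.Session()   # assumes configured credentials
    bucket = session.resource('s3').Bucket('YOUR_BUCKET_NAME')

    for subdir, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(subdir, file)
            key = key_prefix + os.path.relpath(full_path, path).replace(os.sep, '/')
            with open(full_path, 'rb') as data:
                bucket.put_object(Key=key, Body=data)

# Example: upload into datawarehouse/Import/networkreport/...
# upload_files('/path/to/my/folder', key_prefix='Import/networkreport/')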
