This is a sample script for uploading multiple files to S3 while keeping the original folder structure. Doing this manually can be a bit tedious, especially if there are many files to upload located in different folders. This code does the hard work for you: just call upload_files('/path/to/my/folder'), where the argument is the path of the folder on your local machine that contains the files.
Install Boto3
You will need to install Boto3 first:
pip install boto3
Script
import boto3
import os

def upload_files(path):
    session = boto3.Session(
        aws_access_key_id='YOUR_AWS_ACCESS_KEY_ID',
        aws_secret_access_key='YOUR_AWS_SECRET_ACCESS_KEY_ID',
        region_name='YOUR_AWS_ACCOUNT_REGION'
    )
    s3 = session.resource('s3')
    bucket = s3.Bucket('YOUR_BUCKET_NAME')

    # Walk the local tree and upload every file found
    for subdir, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(subdir, file)
            with open(full_path, 'rb') as data:
                # Strip the base path (and its trailing slash) so only the
                # relative structure becomes the S3 key
                bucket.put_object(Key=full_path[len(path)+1:], Body=data)

if __name__ == "__main__":
    upload_files('/path/to/my/folder')
The script ignores the local base path when creating the objects on S3. For example, if we execute upload_files('/my_data') with the following local structure:
/my_data/photos00/image1.jpg
/my_data/photos01/image1.jpg
The resulting structure on S3 will be:
/photos00/image1.jpg
/photos01/image1.jpg
Comments
This code greatly helped me to upload files to S3. Thank you! However, I want to upload the files to a specific subfolder on S3, and I'm not quite sure how to do it.
ex: “datawarehouse” is my main bucket, where I can upload easily with the above code. But I want to upload into this path: datawarehouse/Import/networkreport.
Can you please help me do it within this code?
You should be able to just prepend the subfolder you want to the S3 key. Note that prepending it to full_path itself would break the open() call, since that combined path does not exist locally, so build the key separately.
In the example code, change:
full_path = os.path.join(subdir, file)
to:
full_path = os.path.join(subdir, file)
key = 'Import/networkreport/' + full_path[len(path)+1:]
and then pass Key=key to put_object.
You can also do it directly in the put_object call to upload files to a subfolder on S3:
bucket.put_object(Key='Subfolder/' + full_path[len(path)+1:], Body=data)
Hope it helps!
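Putting both suggestions together, here is a self-contained sketch; the function name upload_files_to_prefix and its arguments are illustrative, not part of the original script, and it assumes your AWS credentials are already configured in the environment:

import os
import boto3

def upload_files_to_prefix(path, bucket_name, prefix):
    # Illustrative helper: like upload_files, but nests every key
    # under `prefix` (e.g. 'Import/networkreport') in the bucket
    bucket = boto3.Session().resource('s3').Bucket(bucket_name)
    for subdir, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(subdir, file)
            # full_path stays a local path; only the S3 key gets the prefix
            key = prefix.rstrip('/') + '/' + full_path[len(path)+1:]
            with open(full_path, 'rb') as data:
                bucket.put_object(Key=key, Body=data)

# upload_files_to_prefix('/my_data', 'datawarehouse', 'Import/networkreport')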
I don't know why I am getting an error:
EndpointConnectionError: Could not connect to the endpoint URL:
Please help me to solve this.
This usually means boto3 could not reach the endpoint at all: most often region_name is not set to a valid region, or there is a network problem reaching S3. If you can connect but requests are denied instead, check that your IAM policy is set correctly for S3 operations.
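A quick way to tell the two cases apart is a sketch like the following, assuming credentials are configured ('us-east-1' and 'YOUR_BUCKET_NAME' are placeholders):

import boto3
from botocore.exceptions import ClientError, EndpointConnectionError

client = boto3.client('s3', region_name='us-east-1')  # use a region you know is valid
try:
    client.head_bucket(Bucket='YOUR_BUCKET_NAME')
    print('Bucket reachable')
except EndpointConnectionError as e:
    # Could not reach the endpoint: check region_name / network
    print('Endpoint problem:', e)
except ClientError as e:
    # Reached S3, but the request was rejected: check permissions/IAM
    print('Request failed:', e)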
Hi,
This is very helpful, but I need to upload the files to another bucket, and I would like to create the bucket if it does not exist and then upload the files.
Source S3 bucket name: ABC/folder1/file1
New S3 bucket name (create if it does not exist): folder1/file1
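A possible sketch for the create-if-missing part, assuming credentials are configured (the helper name ensure_bucket is illustrative; note that in regions other than us-east-1, create_bucket needs a LocationConstraint):

import boto3
from botocore.exceptions import ClientError

def ensure_bucket(bucket_name, region):
    # Illustrative helper: create the bucket only if it does not exist yet
    s3 = boto3.resource('s3', region_name=region)
    try:
        s3.meta.client.head_bucket(Bucket=bucket_name)  # raises if missing/forbidden
    except ClientError:
        if region == 'us-east-1':
            s3.create_bucket(Bucket=bucket_name)
        else:
            s3.create_bucket(Bucket=bucket_name,
                             CreateBucketConfiguration={'LocationConstraint': region})
    return s3.Bucket(bucket_name)

# bucket = ensure_bucket('my-new-bucket', 'eu-west-1'), then upload as above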
I am very new to Python and I wanted to use the code above as a template to upload files from a directory to an S3 bucket.
In the code above, where do I put in the path to my source files (the directory)?
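For what it's worth, the source directory is just the argument passed to upload_files in the __main__ block at the bottom of the script; using the '/my_data' example from the post:

upload_files('/my_data')  # walks everything under /my_data and uploads it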
How do I perform a multipart upload with the above code for files bigger than 5 GB?
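put_object issues a single PUT, and S3 caps a single PUT at 5 GB. One option is boto3's managed transfer (upload_file), which switches to multipart automatically above a threshold; here is a sketch with illustrative, untuned TransferConfig values, assuming credentials are configured:

import os
import boto3
from boto3.s3.transfer import TransferConfig

# Files larger than multipart_threshold are uploaded in parts automatically
config = TransferConfig(multipart_threshold=100 * 1024 * 1024,   # 100 MB
                        multipart_chunksize=100 * 1024 * 1024)

def upload_files_multipart(path, bucket_name):
    bucket = boto3.Session().resource('s3').Bucket(bucket_name)
    for subdir, dirs, files in os.walk(path):
        for file in files:
            full_path = os.path.join(subdir, file)
            # Same key layout as the original script, but multipart-capable
            bucket.upload_file(full_path, full_path[len(path)+1:], Config=config)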
Works well, but this is quite slow.
Maybe we can upload multiple files concurrently?
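That should help; here is a sketch using a thread pool across files. boto3 clients are documented as thread-safe (resources are not), so one shared client is used, and max_workers=8 is an arbitrary starting point:

import os
from concurrent.futures import ThreadPoolExecutor
import boto3

def upload_files_concurrently(path, bucket_name, max_workers=8):
    client = boto3.client('s3')  # clients, unlike resources, can be shared across threads

    def upload_one(full_path):
        client.upload_file(full_path, bucket_name, full_path[len(path)+1:])

    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for subdir, dirs, files in os.walk(path):
            for file in files:
                futures.append(pool.submit(upload_one, os.path.join(subdir, file)))
    for f in futures:
        f.result()  # re-raise any upload error instead of swallowing it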