Amazon Simple Storage Service (S3) is an object storage service that stores data as objects inside containers called buckets. Each object is identified by a key that is unique within its bucket, and bucket names themselves are globally unique across all AWS accounts.

S3 is commonly used for backup and archival, data lakes, static website hosting, content distribution, and as storage for application data. It provides high durability (99.999999999%), availability, and scalability at a low cost.

In this post, we will explore common S3 operations using boto3, the Python SDK for AWS. We will cover bucket management, policies, encryption, file uploads, versioning, lifecycle policies, and static website hosting.

Boto3 Installation

We can install boto3 via pip:

pip install boto3

Before using boto3, we need to configure AWS credentials. This can be done via the AWS CLI (aws configure) or by setting environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY).

S3 Client vs Resource

Boto3 provides two interfaces for interacting with S3:

  • Client (boto3.client('s3')) — A low-level interface that maps directly to the AWS API. It returns raw dictionary responses and gives fine-grained control over API calls.
  • Resource (boto3.resource('s3')) — A higher-level, object-oriented interface. It provides abstractions like Bucket and Object that are easier to work with.

import boto3

# Low-level client interface
s3_client = boto3.client('s3')

# High-level resource interface
s3_resource = boto3.resource('s3')

In general, we use the client interface when we need precise control over API calls, and the resource interface when we want cleaner, more Pythonic code. Most examples in this post use the client interface.

Bucket Operations

Create a Bucket

When creating a bucket outside of us-east-1, we must specify a LocationConstraint in the configuration.

import logging
import boto3
from botocore.exceptions import ClientError


def create_bucket(bucket_name, region=None):
    """Create an S3 bucket in a specified region

    If a region is not specified, the bucket is created in the S3 default
    region (us-east-1).

    :param bucket_name: Bucket to create
    :param region: String region to create bucket in, e.g., 'us-west-2'
    :return: True if bucket created, else False
    """

    try:
        if region is None:
            s3_client = boto3.client('s3')
            s3_client.create_bucket(Bucket=bucket_name)
        else:
            s3_client = boto3.client('s3', region_name=region)
            location = {'LocationConstraint': region}
            s3_client.create_bucket(Bucket=bucket_name,
                                    CreateBucketConfiguration=location)
    except ClientError as e:
        logging.error(e)
        return False
    return True

List Buckets

We can retrieve a list of all buckets in our account using list_buckets().

import boto3

# Retrieve the list of existing buckets
s3 = boto3.client('s3')
response = s3.list_buckets()

# Output the bucket names
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')

Delete a Bucket

A bucket must be empty before it can be deleted. If the bucket contains objects, we must delete them first.

def delete_bucket(bucket_name):
    s3_client = boto3.client('s3')
    return s3_client.delete_bucket(Bucket=bucket_name)

Bucket Policies

Bucket policies are JSON-based access control configurations that define what actions are allowed or denied on a bucket and its objects. They are useful for granting cross-account access or making objects publicly readable.

Create a Bucket Policy

The following example creates a policy that allows all S3 actions on objects within the bucket. Note that "Principal": "*" grants this access to everyone, so in practice we would scope the principal and actions down.

import json
import boto3

def create_bucket_policy(bucket_name):
    bucket_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AddPerm",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:*"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"]
            }
        ]
    }

    policy_string = json.dumps(bucket_policy)

    s3_client = boto3.client('s3')
    return s3_client.put_bucket_policy(
        Bucket=bucket_name,
        Policy=policy_string
    )

Update a Bucket Policy

We can update a policy to restrict actions to specific operations like GetObject, PutObject, and DeleteObject.

def update_bucket_policy(bucket_name):
    bucket_policy = {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Sid': 'AddPerm',
                'Effect': 'Allow',
                'Principal': '*',
                'Action': [
                    's3:DeleteObject',
                    's3:GetObject',
                    's3:PutObject'
                ],
                'Resource': 'arn:aws:s3:::' + bucket_name + '/*'
            }
        ]
    }

    policy_string = json.dumps(bucket_policy)

    s3_client = boto3.client('s3')
    return s3_client.put_bucket_policy(
        Bucket=bucket_name,
        Policy=policy_string
    )

Read a Bucket Policy

def get_bucket_policy(bucket_name):
    s3_client = boto3.client('s3')
    return s3_client.get_bucket_policy(Bucket=bucket_name)

Server-Side Encryption

S3 supports server-side encryption to protect data at rest. We can enable default encryption on a bucket so that all objects are automatically encrypted when stored.

Enable Encryption

The following example enables AES-256 (SSE-S3) encryption on a bucket.

def server_side_encrypt_bucket(bucket_name):
    s3_client = boto3.client('s3')
    return s3_client.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration={
            'Rules': [
                {
                    'ApplyServerSideEncryptionByDefault': {
                        'SSEAlgorithm': 'AES256'
                    }
                }
            ]
        }
    )

Check Encryption Status

def get_bucket_encryption(bucket_name):
    s3_client = boto3.client('s3')
    return s3_client.get_bucket_encryption(Bucket=bucket_name)

Uploading Objects

S3 supports two upload approaches depending on file size: a simple upload for small files and multipart upload for large files.

Small File Upload

For small files, we can use upload_file() directly. (A single PUT request is limited to 5 GB, but upload_file() uses boto3's managed transfer under the hood and automatically switches to multipart upload above a default threshold of 8 MB.)

import os
import boto3

def upload_small_file(bucket_name, file_path, object_key):
    s3_client = boto3.client('s3')
    return s3_client.upload_file(file_path, bucket_name, object_key)

Multipart Upload with Progress Tracking

For large files, we can configure multipart uploads with concurrent transfers. The TransferConfig allows us to set the threshold for switching to multipart, chunk sizes, and concurrency. We can also attach a callback to track upload progress.

import os
import sys
import threading
import boto3
from boto3.s3.transfer import TransferConfig


class ProgressPercentage(object):
    def __init__(self, filename):
        self._filename = filename
        self._size = float(os.path.getsize(filename))
        self._seen_so_far = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        with self._lock:
            self._seen_so_far += bytes_amount
            percentage = (self._seen_so_far / self._size) * 100
            sys.stdout.write(
                "\r%s  %s / %s  (%.2f%%)" % (
                    self._filename, self._seen_so_far, self._size, percentage
                )
            )
            sys.stdout.flush()


def upload_large_file(bucket_name, file_path, object_key):
    config = TransferConfig(
        multipart_threshold=1024 * 1024 * 25,  # switch to multipart above 25 MB
        max_concurrency=10,                    # parallel upload threads
        multipart_chunksize=1024 * 1024 * 25,  # 25 MB per part (S3 minimum is 5 MB)
        use_threads=True
    )

    s3_resource = boto3.resource('s3')
    s3_resource.meta.client.upload_file(
        file_path, bucket_name, object_key,
        ExtraArgs={'ContentType': 'application/octet-stream'},
        Config=config,
        Callback=ProgressPercentage(file_path)
    )

Reading Objects

We can read objects from S3 using get_object(). This returns a response that includes the object body as a streaming object, which we can read using .read().

def read_object_from_bucket(bucket_name, object_key):
    s3_client = boto3.client('s3')
    response = s3_client.get_object(Bucket=bucket_name, Key=object_key)

    # Read the object content
    content = response['Body'].read().decode('utf-8')
    return content

Versioning

Versioning allows us to keep multiple variants of an object in the same bucket. Once enabled, S3 preserves every version of every object stored in the bucket. This is useful for recovering from unintended user actions and application failures.

Enable Versioning

def enable_versioning(bucket_name):
    s3_client = boto3.client('s3')
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={
            'Status': 'Enabled'
        }
    )

Upload a New Version

Once versioning is enabled, uploading a file with the same key automatically creates a new version. Each version gets a unique version ID.

def upload_new_version(bucket_name, file_path, object_key):
    s3_client = boto3.client('s3')
    return s3_client.upload_file(file_path, bucket_name, object_key)

Lifecycle Policies

Lifecycle policies allow us to automate transitioning objects between storage classes or expiring objects after a certain period. This is commonly used to move infrequently accessed data to cheaper storage classes like S3 Glacier.

The following example creates two rules:

  1. Move objects with the prefix readme to Glacier after a specific date
  2. Move non-current (old) versions of all objects to Glacier after 2 days

def put_lifecycle_policy(bucket_name):
    lifecycle_policy = {
        "Rules": [
            {
                "ID": "Move readme file to Glacier",
                "Prefix": "readme",
                "Status": "Enabled",
                "Transitions": [
                    {
                        "Date": "2019-01-01T00:00:00.000Z",
                        "StorageClass": "GLACIER"
                    }
                ]
            },
            {
                "Status": "Enabled",
                "Prefix": "",
                "NoncurrentVersionTransitions": [
                    {
                        "NoncurrentDays": 2,
                        "StorageClass": "GLACIER"
                    }
                ],
                "ID": "Move old versions to Glacier"
            }
        ]
    }

    s3_client = boto3.client('s3')
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=lifecycle_policy
    )

Static Website Hosting

S3 can host static websites directly from a bucket. To set this up, we need to create the bucket, apply a public-read policy, configure the website settings, and upload our HTML files. Note that newly created buckets block public access by default, so the bucket's Block Public Access settings (and ACL settings) may need to be adjusted before the public policy and public-read ACLs below will take effect.

import os
import boto3

def host_static_website(bucket_name, region='eu-central-1'):
    s3_client = boto3.client('s3', region_name=region)

    # Create the bucket
    s3_client.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={
            'LocationConstraint': region
        }
    )

    # Apply a public-read policy
    update_bucket_policy(bucket_name)

    # Configure website hosting
    website_configuration = {
        'ErrorDocument': {'Key': 'error.html'},
        'IndexDocument': {'Suffix': 'index.html'}
    }

    s3_client.put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=website_configuration
    )

    # Upload index and error pages
    index_file = os.path.join(os.path.dirname(__file__), 'index.html')
    error_file = os.path.join(os.path.dirname(__file__), 'error.html')

    with open(index_file) as f:
        s3_client.put_object(
            Bucket=bucket_name, ACL='public-read',
            Key='index.html',
            Body=f.read(),
            ContentType='text/html'
        )
    with open(error_file) as f:
        s3_client.put_object(
            Bucket=bucket_name, ACL='public-read',
            Key='error.html',
            Body=f.read(),
            ContentType='text/html'
        )

Once hosted, the website is accessible at http://<bucket-name>.s3-website.<region>.amazonaws.com (some regions use a dash instead of a dot before the region, e.g. s3-website-us-east-1).

Useful Resources

What is Amazon S3?

Amazon S3 buckets

Boto3 S3 Documentation

S3 Storage Classes

Common S3 Operations