May 27, 2016

Sync S3 bucket to multiple S3 buckets in different region

Cross-Region Replication for Amazon S3 was introduced last year; it replicates objects from an S3 bucket to a different S3 bucket located in a different region (in the same or a different AWS account). It was one of the most sought-after features for a long time, but it doesn't cover the scenarios listed below.

  • Synchronizing to a bucket in same region.
  • Synchronizing to multiple destination buckets.
  • Object ACLs need to be set at the source bucket, so multiple different ACLs need to be managed when copying across different accounts.
  • Replication doesn't change the ownership of the objects, so a bucket policy applied at the destination bucket won't be honored.

Refer to the feedback from the AWS support team regarding the last point mentioned above.

When copying files from one S3 bucket to another, it is imperative that the IAM user specified in the “--profile” parameter is a user in the Destination AWS account. From what I’ve gathered, your previous attempts have been copying the objects from the Source S3 bucket to the Destination S3 bucket using an IAM user under the Source account. In this scenario, the created object in the Destination bucket is still marked as “Owned” by the source account, rather than the destination. As a result, any bucket policy on the Destination bucket will not take effect on the object, as it is protected from any permission changes made at the bucket level by the Destination account, as described in the documentation.

However, the resolution is simple to get the behavior you are looking for. Ensure the Source S3 bucket allows your Destination IAM User access to retrieve and copy files, and use it to copy the object(s) into the Destination S3 bucket. (Again, this only works if using an IAM User from the Destination account, not the Source account.) This will cause the newly-copied objects to be marked “Owned” by the Destination account, and thereby allowing the Bucket Policy to take effect, and allow only the specified IP addresses to access the object, while denying all others.

As you have identified, the use of the “--acl” parameter is not necessary here, as I was able to get the bucket policy to work without it, but as a general rule it is a good idea to include “--acl bucket-owner-full-control” to ensure the Destination account owner can make any changes to the object and to its permissions as necessary in the future. Either way, the bucket policy will still apply as the newly-copied object is now owned by the Destination account.
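
To make the ownership mechanics concrete, here is a minimal sketch of that copy in boto3, assuming the client is built from credentials of an IAM user in the destination account (the object key is a placeholder; the bucket names are the samples used later in this post):

    import boto3

    # These credentials must belong to the *destination* account so that
    # the copied object ends up owned by the destination account
    s3 = boto3.client('s3')
    s3.copy_object(
        Bucket='me-pprakash-destination-bucket',
        Key='path/to/object',
        CopySource={'Bucket': 'me-pprakash-source-bucket', 'Key': 'path/to/object'},
        # Optional here, but a good general rule per the note above
        ACL='bucket-owner-full-control'
    )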

I have a use case where I want to synchronize a bucket with multiple buckets in different regions (including a bucket in the same region as the source bucket) in a different AWS account.

Lambda is again my friend here; the design for this solution is below.

[Diagram: S3 Sync]

There are multiple connecting points to make this work, and they are listed below.

  • Bucket policy on the source bucket that allows GetObject to the AWS account of the destination buckets. Sample bucket policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "ReadAccess",
          "Effect": "Allow",
          "Principal": {
            "AWS": [ "arn:aws:iam::123456789012:root" ]
          },
          "Action": "s3:GetObject",
          "Resource": [ "arn:aws:s3:::me-pprakash-source-bucket/*" ]
        },
        {
          "Sid": "ListAccess",
          "Effect": "Allow",
          "Principal": {
            "AWS": [ "arn:aws:iam::123456789012:root" ]
          },
          "Action": "s3:ListBucket",
          "Resource": [ "arn:aws:s3:::me-pprakash-source-bucket" ]
        }
      ]
    }
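
    If you prefer to script this step rather than use the console, the policy can be applied with boto3; a minimal sketch, assuming the JSON above is saved as bucket-policy.json (a hypothetical file name):

    import boto3

    # Apply the sample bucket policy above to the source bucket
    with open('bucket-policy.json') as f:
        policy = f.read()

    s3 = boto3.client('s3')
    s3.put_bucket_policy(Bucket='me-pprakash-source-bucket', Policy=policy)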
    
  • Lambda execution IAM role in the source AWS account with permission to call sts:AssumeRole. Sample role policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowAssumeRole",
          "Effect": "Allow",
          "Action": [ "sts:AssumeRole" ],
          "Resource": [ "*" ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "arn:aws:logs:*:*:*"
        }
      ]
    }
    

    Trust Relationships:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "lambda.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
    
  • Cross Region IAM role in the destination AWS account with GetObject permission on the source S3 bucket and Get/Put/Delete object permissions on the destination S3 buckets.

    Sample Cross Region IAM role policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject*",
            "s3:PutObject*",
            "s3:DeleteObject",
            "s3:DeleteObjectVersion"
          ],
          "Resource": [ "arn:aws:s3:::me-pprakash-destination-bucket/*" ]
        },
        {
          "Effect": "Allow",
          "Action": [ "s3:ListBucket" ],
          "Resource": [ "arn:aws:s3:::me-pprakash-destination-bucket" ]
        },
        {
          "Sid": "ReadAccess",
          "Effect": "Allow",
          "Action": "s3:GetObject",
          "Resource": [ "arn:aws:s3:::me-pprakash-source-bucket/*" ]
        },
        {
          "Sid": "ListAccess",
          "Effect": "Allow",
          "Action": "s3:ListBucket",
          "Resource": [ "arn:aws:s3:::me-pprakash-source-bucket" ]
        }
      ]
    }
    
  • The trust relationship of the Cross Region IAM role should include the Lambda execution IAM role as a trusted entity.

    Sample Trust Relationship for Cross Region IAM Role

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::987654321012:role/s3-sync-role"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
    
  • Lambda function with the Lambda execution IAM role attached, which assumes the Cross Region role in the destination AWS account and copies or deletes the object based on the event. S3 ObjectCreated and ObjectRemoved events of the source bucket should be configured as the event source of the Lambda function; a sketch of wiring these events programmatically follows the function below.

    Lambda Function

    import boto3
    from botocore.exceptions import ClientError
    import sys
    import traceback


    def lambda_handler(event, context):
        print 'Received the following event'
        print '--------------------------------------------------'
        print event
        print '--------------------------------------------------'
        try:
            # Assume the Cross Region role to obtain temporary credentials
            # for the destination account (the credentials are deliberately
            # not logged)
            sts = boto3.client('sts')
            sts_role_cred = sts.assume_role(RoleArn='arn:aws:iam::123456789012:role/CrossRegionS3ReplicationRole',
                                            RoleSessionName='SyncRole', DurationSeconds=900)
            aid = sts_role_cred['Credentials']['AccessKeyId']
            sak = sts_role_cred['Credentials']['SecretAccessKey']
            stok = sts_role_cred['Credentials']['SessionToken']

            # Destination buckets to keep in sync with the source bucket
            bucks = ['me-pprakash-destination-bucket']

            # Build the S3 client from the assumed-role credentials so the
            # copied objects are owned by the destination account
            s3 = boto3.client('s3', aws_access_key_id=aid, aws_secret_access_key=sak, aws_session_token=stok)
            for record in event['Records']:
                src_buck = record['s3']['bucket']['name']
                # Note: event keys are URL-encoded, so keys containing
                # spaces are not yet handled (see Further Improvements)
                src_key = record['s3']['object']['key']
                if record['eventName'].startswith('ObjectCreated'):
                    for des_buck in bucks:
                        s3.copy_object(ACL='bucket-owner-full-control', Bucket=des_buck, Key=src_key,
                                       CopySource={'Bucket': src_buck, 'Key': src_key}, StorageClass='STANDARD')
                        print 'Successfully copied %s to %s' % (src_key, des_buck)
                elif record['eventName'].startswith('ObjectRemoved'):
                    for des_buck in bucks:
                        s3.delete_object(Bucket=des_buck, Key=src_key)
                        print 'Successfully deleted %s from %s' % (src_key, des_buck)
        except ClientError as e:
            print 'Received client error'
            print e
        except:
            print 'Received the following error'
            traceback.print_exc(file=sys.stdout)


    if __name__ == '__main__':
        # Local smoke test with an empty event
        lambda_handler({'Records': []}, None)
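
    The event source wiring mentioned above can be done from the console, or programmatically. Below is a minimal sketch using boto3's put_bucket_notification_configuration; the function ARN is a placeholder, and S3 must separately be granted permission to invoke the function (for example with Lambda's add-permission call).

    import boto3

    s3 = boto3.client('s3')
    # Send ObjectCreated and ObjectRemoved events from the source bucket
    # to the sync function (the function ARN below is a placeholder)
    s3.put_bucket_notification_configuration(
        Bucket='me-pprakash-source-bucket',
        NotificationConfiguration={
            'LambdaFunctionConfigurations': [{
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:987654321012:function:s3-sync',
                'Events': ['s3:ObjectCreated:*', 's3:ObjectRemoved:*']
            }]
        }
    )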
    

Further Improvements:

  • Spaces in object names are not handled correctly, because S3 event notifications URL-encode object keys; this shall be fixed.
  • Support Amazon S3 managed SSE when copying the object if it's enabled at the source bucket (a rough sketch of both fixes follows).
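
A minimal, hypothetical sketch of both fixes, assuming Python 2 (urllib.unquote_plus) and S3-managed SSE (AES256) at the source; the helper name below is made up:

    import urllib

    def copy_decoded_with_sse(s3, record, dest_buckets):
        # S3 event notifications URL-encode object keys, so a key with
        # spaces arrives with '+' in place of each space; decode it first
        src_buck = record['s3']['bucket']['name']
        src_key = urllib.unquote_plus(str(record['s3']['object']['key']))
        for des_buck in dest_buckets:
            s3.copy_object(ACL='bucket-owner-full-control', Bucket=des_buck, Key=src_key,
                           CopySource={'Bucket': src_buck, 'Key': src_key},
                           # Request S3-managed SSE on the copied object
                           ServerSideEncryption='AES256')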
