AWS Redshift - Cross-Account Unload & Copy with S3

Abdul R. Wahab
3 min readMay 26

--

Use Case Requirements

You have an occasional need to copy and / or unload data from an S3 bucket from one account to a Redshift cluster in another account.

This can be achieved through Redshift’s COPY & UNLOAD commands in combination with IAM Roles and Role chaining.

Background

IAM Role chaining also works in Redshift for cross-account COPY and UNLOAD to S3 commands.

As shared in my previous blog, IAM Role chaining allows you to set up a chain of trusted entities between AWS accounts, enabling you to delegate access from one account to another.

This feature is also useful when you have a scenario where a Redshift cluster in one account needs to perform COPY or UNLOAD operations on S3 buckets owned by a different account.

High-Level Solution Visual

Source: Abdul R. Wahab

Implementation 👨‍💻

To set up permissions and IAM Role chaining for cross-account COPY and UNLOAD commands in Redshift, you can take the following steps:

  1. Set up an IAM role in the source AWS account (the account where the Redshift cluster resides) that will be allowed to assume a role in the destination AWS account (the account that owns the S3 buckets).
  • In the source account, create an IAM role with the necessary permissions to perform COPY or UNLOAD operations in Redshift.
  • Add a trust policy to the IAM role, specifying the destination account as the trusted entity. Below is an example of a trust policy for assuming a role in the destination account:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<destination-account-id>:root"
},
"Action": "sts:AssumeRole"
}
]
}

2. In the destination account, create an IAM role that grants the necessary permissions to access the S3 buckets. This role will be assumed by the IAM role in the source account during COPY or UNLOAD operations.

  • Create an IAM role with the required S3 permissions for the S3 buckets involved in the COPY or UNLOAD commands. For example, the role should have permissions to read from the source bucket during COPY or write to the target bucket during UNLOAD.

3. Attach an IAM policy to the role in the destination account, allowing the role from the source account to assume it. Below is an example of an IAM policy that grants permission for the source role to assume the destination role:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<source-account-id>:role/<source-role-name>"
},
"Action": "sts:AssumeRole"
}
]
}

4. When executing the COPY or UNLOAD command in Redshift, use the IAM role in the source account that has the trust relationship configured with the destination account. Specify the ARN of the role in the source account using the IAM_ROLE option.

  • Example COPY command:
COPY table_name FROM 's3://<source-bucket>/<source-prefix>' 
IAM_ROLE 'arn:aws:iam::<source-account-id>:role/<source-role-name>';
  • Example UNLOAD command:
UNLOAD ('SELECT * FROM table_name') TO 's3://<destination-bucket>/<destination-prefix>' 
IAM_ROLE 'arn:aws:iam::<source-account-id>:role/<source-role-name>';

Summary 👏

In these examples, <source-account-id> refers to the ID of the source AWS account, <source-role-name> refers to the name of the IAM role in the source account, <destination-account-id> refers to the ID of the destination AWS account, and <source-bucket> and <destination-bucket> are the names of the S3 buckets involved.

By setting up IAM Role chaining, you establish a trust relationship between the two accounts, allowing Redshift in the source account to access the S3 buckets owned by the destination account using the specified IAM roles.

Remember to adjust the permissions and trust policies according to your specific requirements and naming conventions for the AWS accounts, IAM roles, and S3 buckets involved.

--

--

AWS Redshift - INSERT & COPY Commands

2 min read

Jun 4

A Deep Dive into AWS Data Services

7 min read

May 29

AWS IAM Role Chaining: Walkthrough

4 min read

May 26

Implementing a Data Mesh Architecture with AWS Redshift Spectrum and Lake Formation

4 min read

May 7

Using Vault Agent Caching to authenticate to Amazon RDS

3 min read

Apr 21

Amazon Redshift Data Sharing - Underlying Technology

2 min read

May 8

AWS Data Integration across Multiple Data Stores

4 min read

May 7

Different ways to migrate Terraform State

4 min read

May 4

Authenticating to AWS Redshift using Ephemeral Credentials

3 min read

Apr 29

Lesser Known aspects of Amazon Redshift

4 min read

Apr 28

Abdul R. Wahab

Tech guy. I like building cool software, & also leading others in building cool things. All views shared are my own.

Recommended from Medium

Lists

See more recommendations