How to Duplicate a Table in Redshift

Duplicating a table in Amazon Redshift can be useful for various purposes, such as creating backups, testing queries without altering the original data, or restructuring your database schema. This tutorial will guide you through the process of duplicating a table using SQL commands in Redshift.

Step 1: Basic Syntax to Duplicate a Table

To create an exact copy of an existing table, you can use the following SQL command:


CREATE TABLE new_table_name AS 
SELECT * FROM original_table_name;
                                

This command creates a new table named new_table_name and copies all data from the original_table_name.

Step 2: Duplicating a Table with Specific Columns

If you only need specific columns from the original table, you can modify the SELECT statement to include the desired columns:


CREATE TABLE new_table_name AS 
SELECT column1, column2, column3 FROM original_table_name;
                                

Step 3: Duplicating with Constraints and Indexes

While the above method duplicates the data, it does not copy any constraints, indexes, or distribution styles from the original table. If you need to duplicate a table with its constraints, indexes, and other properties, you will need to manually define these after creating the new table. Here's an example of how to add constraints after duplication:


ALTER TABLE new_table_name ADD CONSTRAINT new_constraint_name PRIMARY KEY (column1);
                                

Step 4: Using Redshift COPY Command for Large Tables

For very large tables, the CREATE TABLE AS statement might be inefficient. In such cases, using the Redshift COPY command can be a more performant option. First, create an empty table with the same schema as the original:


CREATE TABLE new_table_name (column1 datatype, column2 datatype, ...);
                                

Then, use the COPY command to load the data into the new table:


COPY new_table_name FROM 's3://bucket_name/path_to_data_file' 
CREDENTIALS 'aws_access_key_id=xxx;aws_secret_access_key=xxx' 
DELIMITER ',' REGION 'us-west-2';
                                

Best Practices for Duplicating Tables

  • Always check the storage and performance implications when duplicating large tables.
  • Ensure proper indexing and constraints after the duplication process to optimize query performance.
  • If you're duplicating tables for backup purposes, consider using Redshift snapshots instead, which can save time and storage space.