How to Drop a Column in BigQuery

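Google BigQuery supports dropping a column directly with ALTER TABLE ... DROP COLUMN, but the statement has restrictions, so recreating the table without the unwanted column is still a common approach. In this tutorial, we will walk you through the recreate approach using SQL, the bq command-line tool, and the BigQuery web UI.

When You Can’t Directly Drop a Column

BigQuery’s schema update capabilities are more limited than those of traditional relational databases. ALTER TABLE supports ADD COLUMN, RENAME COLUMN, and DROP COLUMN, but according to the BigQuery documentation DROP COLUMN cannot remove a partitioning or clustering column, and the storage used by a dropped column is not released immediately. In those cases, or when you want to keep the original table untouched until you have checked the result, the typical workaround is to recreate the table without the column you want to remove.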

Step 1: Identify the Columns You Want to Keep

List the columns you want to keep. You can find the current table schema by running:

SELECT * 
FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS` 
WHERE table_name = 'your_table';
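
If you only need the column names and types, a narrower query is easier to read. The sketch below assumes the same placeholder names (project, dataset, your_table) as the query above. The is_partitioning_column and clustering_ordinal_position fields show whether a column is used for partitioning or clustering, which is worth knowing before you rebuild the table.

```sql
-- Placeholder names: replace project, dataset, and your_table with your own.
SELECT
  column_name,
  data_type,
  is_partitioning_column,
  clustering_ordinal_position
FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'your_table'
ORDER BY ordinal_position;  -- columns in their declared order
```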

Step 2: Create a New Table Without the Unwanted Column

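Use the CREATE OR REPLACE TABLE statement to create a new table that excludes the unwanted column. Keep in mind that CREATE OR REPLACE overwrites any existing table with the same name, and that a table built from a query does not inherit the original table's partitioning or clustering unless you specify them again: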

CREATE OR REPLACE TABLE `project.dataset.new_table` AS
SELECT col1, col2, col3  -- exclude the column you want to drop
FROM `project.dataset.old_table`;
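
For wide tables, listing every column you want to keep can be tedious. GoogleSQL's SELECT * EXCEPT keeps everything but the named columns. A minimal sketch, assuming the column to drop is called unwanted_col (a hypothetical name):

```sql
-- unwanted_col is a hypothetical column name; list several columns
-- inside EXCEPT (...) separated by commas if you need to drop more than one.
CREATE OR REPLACE TABLE `project.dataset.new_table` AS
SELECT * EXCEPT (unwanted_col)
FROM `project.dataset.old_table`;
```

If the original table is partitioned or clustered, you would likely need to add matching PARTITION BY and CLUSTER BY clauses before AS.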

Step 3: Verify the New Table

After creating the new table, check that the schema is correct and the data is intact:

SELECT * 
FROM `project.dataset.new_table`
LIMIT 10;
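
A sample of rows does not prove that nothing was lost. One way to check more thoroughly is to compare row counts and list the new table's columns. This is a sketch using the placeholder table names from the steps above:

```sql
-- Row counts should match if the SELECT in Step 2 had no filter.
SELECT
  (SELECT COUNT(*) FROM `project.dataset.old_table`) AS old_rows,
  (SELECT COUNT(*) FROM `project.dataset.new_table`) AS new_rows;

-- The dropped column should no longer appear in this list.
SELECT column_name, data_type
FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'new_table'
ORDER BY ordinal_position;
```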

Step 4 (Optional): Replace the Old Table

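If you want the new table to replace the old one, you can:

  1. Delete the old table:
    bq rm -t project:dataset.old_table
  2. Copy the new table to the old table's name (bq cp copies rather than renames, so the new table remains):
    bq cp project:dataset.new_table project:dataset.old_table
  3. Delete the interim table once the copy is verified:
    bq rm -t project:dataset.new_table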
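
The same swap can also be sketched in SQL with ALTER TABLE ... RENAME TO, which takes an unqualified table name as the new name. The example keeps the original under a hypothetical backup name (old_table_backup) instead of deleting it first, so there is something to fall back on. RENAME TO has restrictions of its own, so treat this as a starting point rather than a guaranteed path.

```sql
-- Move the original out of the way; old_table_backup is a hypothetical name.
ALTER TABLE `project.dataset.old_table` RENAME TO old_table_backup;

-- Give the new table the original name.
ALTER TABLE `project.dataset.new_table` RENAME TO old_table;

-- Once you are satisfied with the result, remove the backup:
-- DROP TABLE `project.dataset.old_table_backup`;
```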

Using the BigQuery Web UI

In the BigQuery web console:

  • Write a query that selects only the columns you want to keep.
  • Click Save Results > Save as Table.
  • Choose your dataset and table name, then click Save.

Best Practices

  • Back up your data before modifying tables.
  • Test your queries on a small dataset before applying to production.
  • Use descriptive table names when creating interim tables.
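
The first two practices can be sketched in SQL. The backup and sample table names and the unwanted_col column below are hypothetical. The sketch assumes the CREATE TABLE ... COPY statement is available in your project.

```sql
-- Full copy of the original table under a hypothetical backup name.
CREATE TABLE `project.dataset.old_table_backup`
COPY `project.dataset.old_table`;

-- Trial run that writes only a small sample before you rebuild the full table.
CREATE OR REPLACE TABLE `project.dataset.new_table_sample` AS
SELECT * EXCEPT (unwanted_col)
FROM `project.dataset.old_table`
LIMIT 1000;
```

Note that LIMIT caps the rows written, not necessarily the bytes scanned, so the trial run may still read the whole source table.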

By following these steps, you can safely remove unnecessary columns from your BigQuery tables and keep your data warehouse clean and efficient.