Manage BigLake tables for Apache Iceberg in BigQuery metadata snapshots

The following sections describe how to manage metadata snapshots for managed tables using BigLake tables for Apache Iceberg in BigQuery (hereafter BigLake Iceberg tables in BigQuery).

Before you begin

  • Understand the different types of BigLake tables and the implications of using them, in the BigLake table overview.

  • Before managing BigLake Iceberg tables in BigQuery metadata snapshots, ensure that you have set up a Cloud resource connection to a storage bucket. Your connection needs write permissions on the storage bucket, as specified in the following Required roles section. For more information about required roles and permissions for connections, see Manage connections.
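As a sketch, the Cloud resource connection described above can be created with the bq CLI; the connection name, location, and project ID below are placeholders, not values from this document:

```shell
# Create a Cloud resource connection (placeholder name, location, and project).
bq mk --connection \
    --location=US \
    --project_id=my-project \
    --connection_type=CLOUD_RESOURCE \
    my_connection

# Show the connection details, including the service account that
# needs write access to the storage bucket.
bq show --connection my-project.US.my_connection
```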

Required roles

To get the permissions that you need to let BigQuery manage tables in your project, ask your administrator to grant you the following IAM roles:

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to let BigQuery manage tables in your project. The exact permissions that are required are listed in the following section:

Required permissions

The following permissions are required to let BigQuery manage tables in your project:

  • bigquery.connections.delegate on your project
  • bigquery.jobs.create on your project
  • bigquery.readsessions.create on your project
  • bigquery.tables.get on your project
  • bigquery.tables.getData on your project
  • storage.buckets.get on your bucket
  • storage.objects.create on your bucket
  • storage.objects.delete on your bucket
  • storage.objects.get on your bucket
  • storage.objects.list on your bucket

You might also be able to get these permissions with custom roles or other predefined roles.
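For example, the bucket-level object permissions listed above can be granted to the connection's service account with gcloud. The bucket and service account names below are placeholders; roles/storage.objectAdmin covers the object create, delete, get, and list permissions:

```shell
# Grant the connection's service account object read/write access on the
# bucket (placeholder bucket and service account names).
gcloud storage buckets add-iam-policy-binding gs://my-bucket \
    --member=serviceAccount:connection-sa@my-project.iam.gserviceaccount.com \
    --role=roles/storage.objectAdmin
```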

Create BigLake Iceberg table in BigQuery metadata snapshots

To create a BigLake Iceberg table in BigQuery metadata snapshot, follow these steps:

  1. Export the metadata into the Iceberg V2 format with the EXPORT TABLE METADATA SQL statement.

  2. Optional: Schedule Iceberg metadata snapshot refresh. To refresh an Iceberg metadata snapshot based on a set time interval, use a scheduled query.

  3. Optional: Enable metadata auto-refresh for your project to automatically update your Iceberg table metadata snapshot on each table mutation. To enable metadata auto-refresh, contact bigquery-tables-for-apache-iceberg-help@google.com. EXPORT METADATA costs apply to each refresh operation.
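Step 1, a one-time metadata export, can be sketched as a standalone statement; mydataset.test is a placeholder for your dataset and table:

```sql
-- Export the table's metadata in the Iceberg V2 format
-- (mydataset.test is a placeholder).
EXPORT TABLE METADATA FROM mydataset.test;
```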

The following example creates a scheduled query named My Scheduled Snapshot Refresh Query using the DDL statement EXPORT TABLE METADATA FROM mydataset.test. The DDL statement runs every 24 hours.

bq query \
    --use_legacy_sql=false \
    --display_name='My Scheduled Snapshot Refresh Query' \
    --schedule='every 24 hours' \
    'EXPORT TABLE METADATA FROM mydataset.test'

View BigLake Iceberg table in BigQuery metadata snapshot

After you refresh the BigLake Iceberg table in BigQuery metadata snapshot, you can find the snapshot in the Cloud Storage URI that the BigLake Iceberg table in BigQuery was originally created in. The /data folder contains the Parquet data file shards, and the /metadata folder contains the BigLake Iceberg table in BigQuery metadata snapshot. To find a table's storage URI, query the table's DDL in INFORMATION_SCHEMA:

SELECT
  table_name,
  REGEXP_EXTRACT(ddl, r"storage_uri\s*=\s*\"([^\"]+)\"") AS storage_uri
FROM
  `mydataset`.INFORMATION_SCHEMA.TABLES;

In this query, mydataset is a placeholder for your dataset. The query returns each table's name and the storage URI extracted from its DDL.
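Once you have the storage URI, you can inspect the exported snapshot by listing the table's folders in Cloud Storage; the bucket and table paths below are placeholders:

```shell
# List the Parquet data file shards and the Iceberg metadata snapshot
# (placeholder bucket and table path).
gcloud storage ls gs://my-bucket/my-table/data/
gcloud storage ls gs://my-bucket/my-table/metadata/
```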

What's next