As of April 20th, 2026, BigLake is now called Lakehouse for Apache Iceberg. BigLake metastore is now called the Lakehouse runtime catalog. Lakehouse APIs, client libraries, CLI commands, and IAM names remain unchanged and still reference BigLake.

Supported storage formats and data types

This document details how data types and storage formats behave when integrating Spark and Hive with BigQuery through the Lakehouse runtime catalog.

Specifically, this document provides:

Supported storage formats: A compatibility breakdown of formats like Parquet, ORC, Avro, CSV, and JSON across Hive SerDe and Spark data sources.
Data type mappings: The precise conversion rules between Spark and BigQuery data types.

Use this page to verify that your table schemas and storage formats are supported by the Lakehouse runtime catalog before you run workloads or query tables across engines.

Supported storage formats between Hive and Spark

The following sections describe the storage format and data source compatibility between Hive, Spark, and BigQuery.

Detailed storage format mapping

BigQuery determines the storage format of a table based on the input_format, output_format, and SerDe library in the metadata. The following table maps these properties to the BigQuery storage format.

Input format, output format, and SerDe library	BigQuery storage format
`org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat` `org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat` `org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe`	Parquet
`org.apache.hadoop.hive.ql.io.orc.OrcInputFormat` `org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat` `org.apache.hadoop.hive.ql.io.orc.OrcSerde`	ORC
`org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat` `org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat` `org.apache.hadoop.hive.serde2.avro.AvroSerDe`	Avro
`org.apache.hadoop.mapred.TextInputFormat` `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` `org.apache.hadoop.hive.serde2.OpenCSVSerde`	CSV
`org.apache.hadoop.mapred.TextInputFormat` `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` `org.openx.data.jsonserde.JsonSerDe`	JSON
`org.apache.hadoop.mapred.TextInputFormat` `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` `org.apache.hive.hcatalog.data.JsonSerDe`	JSON
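
The mapping above can be read as a lookup keyed on the full (input format, output format, SerDe library) triple. The following Python sketch illustrates that reading only; it is not BigQuery's implementation. The names `STORAGE_FORMATS` and `bigquery_storage_format` are hypothetical, and returning `None` for an unlisted combination is an assumption, because this page does not say how unlisted combinations are handled.

```python
# Illustrative lookup built from the table above (hypothetical names).
TEXT_IN = "org.apache.hadoop.mapred.TextInputFormat"
TEXT_OUT = "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"

STORAGE_FORMATS = {
    (
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
        "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
    ): "Parquet",
    (
        "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
        "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat",
        "org.apache.hadoop.hive.ql.io.orc.OrcSerde",
    ): "ORC",
    (
        "org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat",
        "org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat",
        "org.apache.hadoop.hive.serde2.avro.AvroSerDe",
    ): "Avro",
    # The three text-based formats share input and output formats and
    # differ only in the SerDe library.
    (TEXT_IN, TEXT_OUT, "org.apache.hadoop.hive.serde2.OpenCSVSerde"): "CSV",
    (TEXT_IN, TEXT_OUT, "org.openx.data.jsonserde.JsonSerDe"): "JSON",
    (TEXT_IN, TEXT_OUT, "org.apache.hive.hcatalog.data.JsonSerDe"): "JSON",
}


def bigquery_storage_format(input_format, output_format, serde):
    """Return the format listed for this triple, or None if it is not listed.

    Returning None is an assumption of this sketch, not documented behavior.
    """
    return STORAGE_FORMATS.get((input_format, output_format, serde))
```

For example, `bigquery_storage_format(TEXT_IN, TEXT_OUT, "org.apache.hive.hcatalog.data.JsonSerDe")` returns `"JSON"`, which shows that two different JSON SerDe libraries resolve to the same storage format.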

Hive SerDe compatibility

The following table lists the compatibility of Hive SerDe table formats with BigQuery.

Format	Spark SQL DDL syntax	Queryable from BigQuery
Parquet	`CREATE TABLE ... STORED AS PARQUET`	Yes
ORC	`CREATE TABLE ... STORED AS ORC`	Yes
Avro	`CREATE TABLE ... STORED AS AVRO`	Yes
CSV	`CREATE TABLE ... ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'`	Yes
JSON	`CREATE TABLE ... ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'`	Yes

Spark data source compatibility

The following table lists the compatibility of Spark data source table formats with BigQuery.

CSV and JSON SerDe tables are queryable from BigQuery. However, CSV and JSON Spark data source tables are not.

Format	Spark SQL DDL syntax	Queryable from BigQuery
Parquet	`CREATE TABLE ... USING PARQUET`	Yes
ORC	`CREATE TABLE ... USING ORC`	Yes
Avro	`CREATE TABLE ... USING AVRO`	Yes
CSV	`CREATE TABLE ... USING CSV`	No
JSON	`CREATE TABLE ... USING JSON`	No
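
The two compatibility tables differ only for CSV and JSON, so a pre-flight check needs to know both the format and how the table was created. The following Python sketch encodes the two tables for that purpose. The names `HIVE_SERDE`, `SPARK_DATA_SOURCE`, `QUERYABLE`, and `queryable_from_bigquery` are hypothetical, and formats that do not appear in either table are treated as not queryable, which is an assumption.

```python
# Table kinds, named after the two sections above (hypothetical constants).
HIVE_SERDE = "hive_serde"                # CREATE TABLE ... STORED AS / ROW FORMAT SERDE
SPARK_DATA_SOURCE = "spark_data_source"  # CREATE TABLE ... USING <format>

# Formats marked "Yes" in each compatibility table.
QUERYABLE = {
    HIVE_SERDE: {"PARQUET", "ORC", "AVRO", "CSV", "JSON"},
    SPARK_DATA_SOURCE: {"PARQUET", "ORC", "AVRO"},
}


def queryable_from_bigquery(table_kind, storage_format):
    """Return True if the tables above list this combination as queryable.

    Formats missing from both tables return False (an assumption of this sketch).
    Raises KeyError for an unknown table kind.
    """
    return storage_format.upper() in QUERYABLE[table_kind]
```

With these definitions, `queryable_from_bigquery(HIVE_SERDE, "csv")` is `True`, while `queryable_from_bigquery(SPARK_DATA_SOURCE, "csv")` is `False`.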

Supported data types from Spark to BigQuery

The following table maps Spark data types to BigQuery data types.

Spark data type	BigQuery data type
`BYTE` or `TINYINT`	`INT64`
`SMALLINT` or `SHORT`	`INT64`
`INT` or `INTEGER`	`INT64`
`BIGINT` or `LONG`	`INT64`
`DECIMAL` or `NUMERIC`	`BIGNUMERIC`
`FLOAT`	`FLOAT64`
`DOUBLE`	`FLOAT64`
`REAL`	`FLOAT64`
`BOOLEAN`	`BOOL`
`STRING`	`STRING`
`VARCHAR`	`STRING`
`CHAR` or `CHARACTER`	`STRING`
`BINARY`	`BYTES`
`DATE`	`DATE`
`TIMESTAMP` or `TIMESTAMP_LTZ`	`TIMESTAMP`
`ARRAY`	`ARRAY`
`STRUCT<col_name: type1, ...>`	`STRUCT<col_name: type1, ...>`
`MAP<key_type, value_type>`	`ARRAY<STRUCT<key: key_type, value: value_type>>` To enable `MAP` support, send an email to biglake-help@google.com. This step is only necessary if your workloads use `MAP`.
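
Because `ARRAY`, `STRUCT`, and `MAP` can nest, the mapping above applies recursively. The following Python sketch applies the table to a Spark type string to preview the resulting BigQuery type. The names `SCALARS`, `split_top_level`, and `to_bigquery_type` are hypothetical. The sketch makes two assumptions that this page does not confirm: type parameters such as the precision and scale in `DECIMAL(10, 2)` are dropped, and no check is made that BigQuery accepts the resulting nesting. The `MAP` rule also assumes that `MAP` support has been enabled as described in the table.

```python
# Scalar rows of the table above.
SCALARS = {
    "BYTE": "INT64", "TINYINT": "INT64",
    "SMALLINT": "INT64", "SHORT": "INT64",
    "INT": "INT64", "INTEGER": "INT64",
    "BIGINT": "INT64", "LONG": "INT64",
    "DECIMAL": "BIGNUMERIC", "NUMERIC": "BIGNUMERIC",
    "FLOAT": "FLOAT64", "DOUBLE": "FLOAT64", "REAL": "FLOAT64",
    "BOOLEAN": "BOOL",
    "STRING": "STRING", "VARCHAR": "STRING",
    "CHAR": "STRING", "CHARACTER": "STRING",
    "BINARY": "BYTES",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP", "TIMESTAMP_LTZ": "TIMESTAMP",
}


def split_top_level(text):
    """Split on commas that are not nested inside <...> or (...)."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "<(":
            depth += 1
        elif ch in ">)":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
    parts.append(text[start:].strip())
    return parts


def to_bigquery_type(spark_type):
    """Translate a Spark type string using the mapping table above."""
    t = spark_type.strip()
    upper = t.upper()
    if upper.startswith("ARRAY<") and t.endswith(">"):
        return "ARRAY<" + to_bigquery_type(t[6:-1]) + ">"
    if upper.startswith("MAP<") and t.endswith(">"):
        # MAP becomes an array of key/value structs.
        key, value = split_top_level(t[4:-1])
        return ("ARRAY<STRUCT<key: " + to_bigquery_type(key)
                + ", value: " + to_bigquery_type(value) + ">>")
    if upper.startswith("STRUCT<") and t.endswith(">"):
        fields = []
        for field in split_top_level(t[7:-1]):
            name, field_type = field.split(":", 1)
            fields.append(name.strip() + ": " + to_bigquery_type(field_type))
        return "STRUCT<" + ", ".join(fields) + ">"
    # Assumption: parameters such as (10, 2) or (255) are dropped.
    base = upper.split("(", 1)[0].strip()
    if base not in SCALARS:
        raise ValueError("No mapping listed for Spark type: " + spark_type)
    return SCALARS[base]
```

For example, `to_bigquery_type("MAP<STRING, INT>")` returns `"ARRAY<STRUCT<key: STRING, value: INT64>>"`, and a type that is not in the table raises `ValueError` instead of guessing a mapping.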

What's next

Use Spark and Hive with the Lakehouse runtime catalog.