Specify nested and repeated columns in table schemas
This page describes how to define a table schema with nested and repeated columns in BigQuery. For an overview of table schemas, see Specifying a schema.
Define nested and repeated columns
To create a column with nested data, set the data type of the column to RECORD in the schema. A RECORD can be accessed as a STRUCT type in GoogleSQL. A STRUCT is a container of ordered fields.
To create a column with repeated data, set the mode of the column to REPEATED in the schema. A repeated field can be accessed as an ARRAY type in GoogleSQL.
A RECORD column can have REPEATED mode, which is represented as an array of
STRUCT types. Also, a field within a record can be repeated, which is
represented as a STRUCT that contains an ARRAY. An array cannot contain
another array directly. For more information, see
Declaring an ARRAY type.
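The mapping between schema fields and GoogleSQL types described above can be sketched in a few lines of Python. This is an illustrative sketch, not part of any BigQuery library: the helper name to_googlesql_type and the tags field are hypothetical, and the sketch assumes scalar type names such as STRING and DATE can be passed through unchanged, which does not hold for every schema type name.

```python
def to_googlesql_type(field):
    """Return a GoogleSQL type string for one schema field dict (hypothetical helper)."""
    if field["type"] == "RECORD":
        # A RECORD becomes a STRUCT whose members are the child fields.
        members = ", ".join(
            f'{child["name"]} {to_googlesql_type(child)}'
            for child in field.get("fields", [])
        )
        base = f"STRUCT<{members}>"
    else:
        # Assumption: the scalar type name is the same in the schema and in GoogleSQL.
        base = field["type"]
    # REPEATED mode wraps the element type in an ARRAY.
    if field.get("mode") == "REPEATED":
        return f"ARRAY<{base}>"
    return base


# A repeated RECORD that also contains a repeated field ("tags" is made up).
repeated_record = {
    "name": "addresses", "type": "RECORD", "mode": "REPEATED",
    "fields": [
        {"name": "status", "type": "STRING", "mode": "NULLABLE"},
        {"name": "tags", "type": "STRING", "mode": "REPEATED"},
    ],
}
print(to_googlesql_type(repeated_record))
```

The repeated RECORD prints as an ARRAY of STRUCT, and the repeated field inside it prints as an ARRAY nested within the STRUCT, so no array ever directly contains another array.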
Limitations
Nested and repeated schemas are subject to the following limitations:
- A schema cannot contain more than 15 levels of nested RECORD types. Columns of type RECORD can contain nested RECORD types, also called child records. This limit is independent of whether the RECORDs are scalar or array-based (repeated).
- The RECORD type is incompatible with UNION, INTERSECT, EXCEPT DISTINCT, and SELECT DISTINCT.
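As a rough illustration of the nesting limit, the following Python sketch measures how deeply RECORD types are nested in a JSON-style schema. The helper names record_depth and nest are hypothetical, and the sketch assumes that a top-level RECORD column counts as level 1; BigQuery itself remains the authority on how depth is counted.

```python
MAX_RECORD_DEPTH = 15  # nesting limit stated in the limitations above


def record_depth(fields):
    """Return the deepest level of RECORD nesting in a list of field dicts."""
    depth = 0
    for field in fields:
        if field["type"] == "RECORD":
            # One level for this RECORD plus the deepest level among its children.
            depth = max(depth, 1 + record_depth(field.get("fields", [])))
    return depth


def nest(levels):
    """Build a made-up schema with `levels` RECORDs nested inside each other."""
    field = {"name": "leaf", "type": "STRING", "mode": "NULLABLE"}
    for i in range(levels):
        field = {"name": f"level_{i}", "type": "RECORD",
                 "mode": "NULLABLE", "fields": [field]}
    return [field]


print(record_depth(nest(15)) <= MAX_RECORD_DEPTH)  # within the limit
print(record_depth(nest(16)) <= MAX_RECORD_DEPTH)  # one level too deep
```

The check ignores the mode of each RECORD, which mirrors the statement that the limit applies to scalar and repeated records alike.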
Example schema
The following example shows sample nested and repeated data. This table contains information about people. It consists of the following fields:
- id
- first_name
- last_name
- dob (date of birth)
- addresses (a nested and repeated field)
- addresses.status (current or previous)
- addresses.address
- addresses.city
- addresses.state
- addresses.zip
- addresses.numberOfYears (years at the address)
The JSON data file would look like the following. Notice that the addresses
column contains an array of values (indicated by [ ]). The multiple addresses
in the array are the repeated data. The multiple fields within each address are
the nested data.
{"id":"1","first_name":"John","last_name":"Doe","dob":"1968-01-22","addresses":[{"status":"current","address":"123 First Avenue","city":"Seattle","state":"WA","zip":"11111","numberOfYears":"1"},{"status":"previous","address":"456 Main Street","city":"Portland","state":"OR","zip":"22222","numberOfYears":"5"}]}
{"id":"2","first_name":"Jane","last_name":"Doe","dob":"1980-10-16","addresses":[{"status":"current","address":"789 Any Avenue","city":"New York","state":"NY","zip":"33333","numberOfYears":"2"},{"status":"previous","address":"321 Main Street","city":"Hoboken","state":"NJ","zip":"44444","numberOfYears":"3"}]}
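To make the nested and repeated structure concrete, the following sketch parses the first record with Python's standard json module. It only illustrates the shape of the data on the client side and does not involve BigQuery; the variable names are arbitrary.

```python
import json

# First record from the sample data, split across string literals for readability.
line = (
    '{"id":"1","first_name":"John","last_name":"Doe","dob":"1968-01-22",'
    '"addresses":['
    '{"status":"current","address":"123 First Avenue","city":"Seattle",'
    '"state":"WA","zip":"11111","numberOfYears":"1"},'
    '{"status":"previous","address":"456 Main Street","city":"Portland",'
    '"state":"OR","zip":"22222","numberOfYears":"5"}]}'
)

row = json.loads(line)

# "addresses" is the repeated part: a list with one entry per address.
address_count = len(row["addresses"])

# Each entry is the nested part: a mapping of field names to values.
current_cities = [a["city"] for a in row["addresses"] if a["status"] == "current"]

print(address_count)
print(current_cities)
```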
The schema for this table looks like the following:
[
  { "name": "id", "type": "STRING", "mode": "NULLABLE" },
  { "name": "first_name", "type": "STRING", "mode": "NULLABLE" },
  { "name": "last_name", "type": "STRING", "mode": "NULLABLE" },
  { "name": "dob", "type": "DATE", "mode": "NULLABLE" },
  {
    "name": "addresses",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": [
      { "name": "status", "type": "STRING", "mode": "NULLABLE" },
      { "name": "address", "type": "STRING", "mode": "NULLABLE" },
      { "name": "city", "type": "STRING", "mode": "NULLABLE" },
      { "name": "state", "type": "STRING", "mode": "NULLABLE" },
      { "name": "zip", "type": "STRING", "mode": "NULLABLE" },
      { "name": "numberOfYears", "type": "STRING", "mode": "NULLABLE" }
    ]
  }
]
Specifying the nested and repeated columns in the example
To create a new table with the previous nested and repeated columns, select one of the following options:
Console
Specify the nested and repeated addresses column:
In the Google Cloud console, open the BigQuery page.
In the left pane, click Explorer.
If you don't see the left pane, click Expand left pane to open the pane.
In the Explorer pane, expand your project, click Datasets, and then select a dataset.
In the details pane, click Create table.
On the Create table page, specify the following details:
- For Source, in the Create table from field, select Empty table.
In the Destination section, specify the following fields:
- For Dataset, select the dataset in which you want to create the table.
- For Table, enter the name of the table that you want to create.
For Schema, click Add field and enter the following table schema:
- For Field name, enter addresses.
- For Type, select RECORD.
- For Mode, choose REPEATED.
Specify the following fields for a nested field:
- In the Field name field, enter status.
- For Type, choose STRING.
- For Mode, leave the value set to NULLABLE.
Click Add field to add the following fields:
Field name      Type     Mode
address         STRING   NULLABLE
city            STRING   NULLABLE
state           STRING   NULLABLE
zip             STRING   NULLABLE
numberOfYears   STRING   NULLABLE
Alternatively, click Edit as text and specify the schema as a JSON array.
SQL
Use the CREATE TABLE statement. Specify the schema using the column option:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
CREATE TABLE IF NOT EXISTS mydataset.mytable (
  id STRING,
  first_name STRING,
  last_name STRING,
  dob DATE,
  addresses ARRAY<
    STRUCT<
      status STRING,
      address STRING,
      city STRING,
      state STRING,
      zip STRING,
      numberOfYears STRING>>
)
OPTIONS (
  description = 'Example name and addresses table');
Click Run.
For more information about how to run queries, see Run an interactive query.
bq
To specify the nested and repeated addresses column in a JSON schema file,
use a text editor to create a new file. Paste in the example schema
definition shown above.
After you create your JSON schema file, you can provide it through the bq command-line tool. For more information, see Using a JSON schema file.
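If you prefer to generate the schema file rather than paste it by hand, a minimal Python sketch might look like the following. The file name schema.json, the temporary directory, and the string_field helper are assumptions made for illustration; the field definitions match the example schema.

```python
import json
import os
import tempfile


def string_field(name):
    """Return a NULLABLE STRING field definition (hypothetical helper)."""
    return {"name": name, "type": "STRING", "mode": "NULLABLE"}


schema = [
    string_field("id"),
    string_field("first_name"),
    string_field("last_name"),
    {"name": "dob", "type": "DATE", "mode": "NULLABLE"},
    {
        # RECORD type plus REPEATED mode makes this column nested and repeated.
        "name": "addresses", "type": "RECORD", "mode": "REPEATED",
        "fields": [string_field(n) for n in (
            "status", "address", "city", "state", "zip", "numberOfYears")],
    },
]

# Write to a temporary directory so the sketch does not touch your working tree.
path = os.path.join(tempfile.mkdtemp(), "schema.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(schema, f, indent=2)

# Read the file back to confirm it holds the same JSON array.
with open(path, encoding="utf-8") as f:
    reloaded = json.load(f)
print(reloaded == schema)
```

The resulting file can then be supplied to the bq command-line tool as a schema file, for example with a command along the lines of bq mk --table mydataset.mytable ./schema.json; check Using a JSON schema file for the exact syntax.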
Go
Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Java
Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.