How to use unnest in BigQuery

UNNEST in BigQuery is a very useful function that allows you to flatten nested and repeated data structures for easier analysis. When dealing with complex hierarchical data like arrays, UNNEST comes to the rescue.

Unlike traditional tabular data, these arrays contain nested and repeated data, which can make aggregation and joins more difficult. When this happens, UNNEST allows you to expand nested or repeated data into separate rows so you can analyze the unnested elements much more easily.

In this article, we’ll explore the UNNEST function in BigQuery to see how it can simplify nested data analysis. 

Understanding nested and repeated data structures in BigQuery

When analyzing large and complex datasets in BigQuery, it’s common to encounter nested and repeated fields. These nested data structures allow you to store multiple values in a single field in BigQuery, making it easier to manipulate multiple values at once.

When you look at the table below, you will notice that the column with the “event parameters” contains subcolumns with multiple values, which are all part of the same row. 

Unnest BigQuery 1

If you examine the table schema, you’ll see this column’s data type isn’t an integer, string, float, or boolean. Rather, it’s defined as a record since it contains subcolumns.

Unnest BigQuery 2

As you can see, the “event_params” field has subcolumns (“key” and “value”) with multiple values inside. Notice that BigQuery supports nested data with subcolumns within subcolumns. In this table, “value” contains multiple subcolumns.

Unnest BigQuery 3

How does UNNEST work in BigQuery?

The UNNEST function in BigQuery is used to flatten nested or repeated data structures into separate rows. What it does is take as input a column with a nested data type like an ARRAY, and expand the nested or repeated elements into multiple rows. As a result, it will flatten the data into a tabular structure.

After UNNEST transforms the nested or repeated data into a flat table, you can query columns directly and analyze its individual elements.

How to use UNNEST in BigQuery

In order to flatten nested data structures in BigQuery, we’ll use the UNNEST function. Here’s the UNNEST syntax:

SELECT column1, column2, …

FROM table_name, UNNEST(column) AS alias 

We’ll use the UNNEST function to break the “event_params” column down into separated rows.

SELECT ga.event_name, params.key, params.value

FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131` AS ga,

UNNEST(event_params) AS params

LIMIT 1000;

Let’s take a closer look at the query.

SELECT ga.event_name, params.key, params.value

First, I’ve specified the columns I want BigQuery to return: event_name, params.key, and params.value. “Params” is the alias I’ll assign when I write the UNNEST function.

FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131` AS ga,

This is the table I want to query. I’ve used an AS statement to assign an alias to the table: “ga”.

UNNEST(event_params) AS params

The UNNEST function will expand the nested data into separate rows.

LIMIT 1000;

This clause limits the number of rows returned to 1000.

Unnest BigQuery 4

BigQuery has expanded the data into separate rows, flattening the nested data.

This is how you can unnest data in BigQuery. If you want to learn how to summarize and analyze data in BigQuery, check out this article on how to use aggregate functions.

Ready to streamline your spreadsheet data?

You may also like…

Best ETL tools in 2024

Best ETL tools in 2024

In the ever-evolving landscape of data analytics and business intelligence, ETL tools play a...