Warm tip: This article is reproduced from serverfault.com, please click

Select multiple keys and values from a NESTED column

发布于 2020-11-28 16:47:23

I'm linking Google Analytics 4 events into a BigQuery table. I can retrieve data just based on one key, but how do I get the value stored in another key in the same record as that key?

In specific, I want to rank the number of views by the name of the article, and then provide the author's name in a separate column as a supplementary data (the author's name is stored in the same record with a different key in a nested column).

Environment

Google Analytics 4 Set up events in Google Tag Manager

The table schema looks like this, with keys such as article_name author_name in event_params.key and the value you want to get in event_params.value.string_value.

enter image description here

And the table preview looks like this:

+-----+------------+-----------------+--------------+------------------+---------------------------------+
| Row | event_date | event_timestamp | event_name   | event_params.key | event_params.value.string_value |
+-----+------------+-----------------+--------------+------------------+---------------------------------+
| 1   | 20201127   | 160394324324231 | view_article | article_name     | My Article A                    |
|     |            |                 |              | author_name      | Author A                        |
|     |            |                 |              | random key1      | random value1                   |
|     |            |                 |              | random key2      | random value2                   |
| 2   | 20201127   | 160394324324112 | view_article | article_name     | My Article B                    |
|     |            |                 |              | author_name      | Author B                        |
|     |            |                 |              | random key1      | random value3                   |
|     |            |                 |              | random key2      | random value4                   |
 ...
+-----+------------+-----------------+--------------+------------------+---------------------------------+

What I tried

Was able to pull off the article ranking itself, without the author name.

#standardSQL

WITH _data AS (
    SELECT 
        value.string_value AS article_name 
    FROM 
        `my-new-project.analytics_000000000.events_*`, 
        UNNEST(event_params) 
    WHERE 
        event_name = 'article_view'
)

SELECT
    article_name,
    COUNT(*) AS cnt
FROM
    _data
GROUP BY
    1
ORDER BY
    2 DESC

Result:

+-----+--------------+-----+
| Row | article_name | cnt |
+-----+--------------+-----+
| 1   | My Article A | 20  |
| 2   | My Article D | 18  |
| 3   | My Article C | 11  |
| 4   | My Article B | 9   |
  ...
+-----+--------------+-----+

I wanted to add a column for author_name next to article_name here, so I thought it would be a good idea to use CASE WHEN. But as it turns out, the author_name will all be null, which probably means it is treated as a separate record.

#standardSQL

WITH _data AS (
    SELECT 
        CASE WHEN key = 'article_name' THEN value.string_value
        END AS article_name,
        CASE WHEN key = 'author_name' THEN value.string_value
        END AS author_name
    FROM 
        `my-new-project.analytics_000000000.events_*`, 
        UNNEST(event_params) 
    WHERE 
        key = 'article_name'
)

SELECT
    article_name,
    MAX(author_name),
    COUNT(*) AS cnt
FROM
    _data
GROUP BY
    1
ORDER BY
    3 DESC

Result:

+-----+--------------+-------------+-----+
| Row | article_name | author_name | cnt |
+-----+--------------+-------------+-----+
| 1   | My Article A | null        | 20  |
| 2   | My Article D | null        | 18  |
| 3   | My Article C | null        | 11  |
| 4   | My Article B | null        | 9   |
  ...
+-----+--------------+-------------+-----+

When I GROUP BY with author_name in a descending order, the author's name appears correctly, but this time the article_name is all null. Is it possible to have both article_name and author_name in the same record, and have the author name next to article name in the same ranking result?

Questioner
oki_oki
Viewed
0
Martin Weitzmann 2020-11-30 17:00:15

I think you're looking for a sub-select solution:

SELECT
   (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'article_name') AS article_name,
   (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'author_name') AS author_name,
count(1) as cnt
FROM 
  `my-new-project.analytics_000000000.events_*`
GROUP BY 1,2
ORDER BY 3 DESC