Warm tip: This article is reproduced from stackoverflow.com, please click
google-bigquery sql

How to refer to another field computed in the WHERE clause?

发布于 2020-04-05 00:21:41

Consider a simplified example:

SELECT
  lengthy_expression AS a,
  g(a) AS b,
  h(a) AS c,
  ...
FROM
  my_table

Here, lengthy_expression represents a complex expression that uses several fields from my_table and spans several lines. Its result is used to calculate two other fields b and c. However, the above is not allowed in standard SQL, because an expression in the WHERE clause is not allowed to refer to the result of another expression in the WHERE clause (why not is beyond me).

The naive option is to repeat the lengthy_expression, but that's what I explicitly want to avoid.

One option is to use a subquery:

SELECT
  a,
  g(a) AS b,
  h(a) AS c,
  ...
FROM (
  SELECT
    lengthy_expression AS a,
    ...
  FROM
    my_table
)

But as you can see, I now need to repeat the other fields ... that the outer query needs.

Maybe a subquery with a join then?

SELECT
  a,
  g(a) AS b,
  h(a) AS c,
  ...
FROM
  my_table
INNER JOIN (
  SELECT
    lengthy_expression AS a
  FROM
    my_table
) USING id

It works, but now there's a (possibly expensive) join that serves no purpose except to keep the query readable. And it doesn't even serve that purpose very well, because the lengthy_expression is tucked away below the point where it's used, and the human reader has to jump all over the place to find out what's going on.

An alternative is using a CTE:

WITH
my_table_with_a AS (
  SELECT
    *,
    lengthy_expression AS a
  FROM
    my_table
)
SELECT
  *,
  g(a) AS b,
  h(a) AS c,
  ...
FROM
  my_table_with_a

At least now the reading order more or less matches the logical order in which operations happen, but it's pretty verbose, and finding a good name for my_table_with_a is difficult. Especially because in practice, I'm repeating this pattern two or three times.

Is there a better way to do this?

Questioner
Thomas
Viewed
58
Mikhail Berlyant 2020-02-01 03:06

Below is for BigQuery Standard SQL

#standardSQL
SELECT 
  a,
  g(a) AS b,
  h(a) AS c,
FROM `project.dataset.my_table`,
UNNEST([lengthy_expression]) a   

Below is extremely simplified example of above approach

#standardSQL
WITH `project.dataset.my_table` AS (
  SELECT 1 x, 2 y, 3 z UNION ALL
  SELECT 4, 5, 6
)
SELECT 
  a,
  a / 2 AS b,
  2 * a AS c
FROM `project.dataset.my_table`,
UNNEST([x + y + z]) a   
WHERE a > 10    

with result

Row a   b   c    
1   15  7.5 30   

Note: in case if result of lengthy_expression is an ARRAY by itself - you need to enclose it into struct as array of array is not supported in BigQuery