Note: this post is translated from stackoverflow.com. Original title: sql - How to de-duplicate and modify a Big Query table?
google-bigquery sql

sql - How to de-duplicate and modify a Big Query table?

Posted on 2020-04-06 00:18:49

I have a table A that looks like this:

| origin |   food   |  category  |
----------------------------------
|  tree  |  apple   |    fruit   |
|  plant |  tomato  |    fruit   |
|  plant |  tomato  |  vegetable |
|........|..........|............|
|  plant |  tomato  |  vegetable |

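I want to go through the table and, for every combination of origin and food that appears more than once, concatenate its categories into "fruit + vegetable" and delete the standalone versions, so that the table above becomes: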

| origin |   food   |     category      |
-----------------------------------------
|  tree  |  apple   |    fruit          |
|  plant |  tomato  | fruit + vegetable |
|........|..........|...................|

I want to use standard SQL only. Any ideas?

Thanks

Asked by: Lev
Views: 114

rtenha 2020-02-01 01:02

Use STRING_AGG(DISTINCT), grouping by origin and food so that each combination collapses to a single row:

with data as (
  select 'tree' as origin, 'apple' as food, 'fruit' as category union all
  select 'plant', 'tomato', 'fruit' union all
  select 'plant', 'tomato', 'vegetable' union all
  select 'plant', 'tomato', 'vegetable'
)
select origin, food, string_agg(distinct category,' + ') as category
from data
group by 1,2
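
The query above only returns the de-duplicated rows; it does not change table A itself. If the goal is to modify the table, one possible approach is to overwrite it with the aggregated result. The following is a minimal sketch, not a definitive implementation. It assumes a dataset named `mydataset` (a hypothetical placeholder) and recreates the sample data first so that the script is self-contained. The `ORDER BY category` inside `STRING_AGG` is an addition that makes the concatenation order deterministic.

```sql
-- Hypothetical names: `mydataset` is a placeholder; substitute your own dataset.

-- Step 1 (demo only): build a sample table matching the question.
CREATE OR REPLACE TABLE mydataset.A AS
SELECT 'tree' AS origin, 'apple' AS food, 'fruit' AS category UNION ALL
SELECT 'plant', 'tomato', 'fruit' UNION ALL
SELECT 'plant', 'tomato', 'vegetable' UNION ALL
SELECT 'plant', 'tomato', 'vegetable';

-- Step 2: overwrite the table with one row per (origin, food) pair.
-- DISTINCT drops repeated categories before they are joined with ' + ';
-- with DISTINCT, the ORDER BY key must be the aggregated expression itself.
CREATE OR REPLACE TABLE mydataset.A AS
SELECT
  origin,
  food,
  STRING_AGG(DISTINCT category, ' + ' ORDER BY category) AS category
FROM mydataset.A
GROUP BY origin, food;
```

Because Step 2 replaces the table in place, it is probably worth running the SELECT on its own, or against a copy of the table, before overwriting real data.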