问题描述
伙计!
我需要一些解决以下问题的想法:
I need some ideas, with the follow problem:
我有两个表:
表1:
+-------+------------+---------+
| ID | field_name | value |
+-------+------------+---------+
| 1 | usd | 10.08 |
| 1 | gross_amt | 52.0 |
| 1 | jpy | 30.05 |
| 2 | usd | 50.0 |
| 2 | eur | 50.0 |
| 3 | real_amt | 210.43 |
| 3 | total | 320 |
| 4 | jpy | 23.45 |
| 4 | name | john |
| 4 | city | utah |
+-------+------------+---------+
表2:
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| ID | name | last_name | date1 | counrty | city | usd | eur | jpy | gross_amt | real_amt | total | ... | field200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| 1 | jane | doe | 19900108 | usa | LA | 9.08 | 0.00 | 29.05 | 50.0 | 52.0 | 900.0 | ... | value200 |
| 2 | lane | smith | 19900108 | usa | LA | 40.8 | 40.0 | 0.00 | 100.0 | 70.0 | 290.0 | ... | value200 |
| 3 | mike | hoffa | 19900108 | usa | SF | 5.05 | 0.00 | 0.00 | 10.0 | 25.0 | 100.0 | ... | value200 |
| 4 | paul | doe | 19900108 | usa | NY | 1.00 | 0.00 | 29.05 | 45.0 | 55.0 | 110.0 | ... | value200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
我需要用表1列value
的值更新表2中位于列field_name
的表1中的字段的值,这两个表中的两个ID相同,除此之外, ,表1中列value
的数据类型是字符串,但是表2中要更新的列的数据类型是不同的,尤其是数字(数字,int64,float64)
I need to update the values of the fields in the table 2, which are in table 1 in column field_name
, with the values of table 1 column value
, both IDs are the same in both tables, beside that, the datatype of column value
in table 1 are string, but the data type of the columns to update in the table 2 are diferent, especially the numbers(numeric, int64, float64)
上面的表是一个示例,实际问题的表2有200个字段,表1中的ID最多可以对每天要修改的数千条记录进行40个值修改
The tables above are an example, table 2 of the real problem has 200 fields and in table 1 for an ID there can be up to 40 value modifications for thousands of records to be modified daily
谢谢
我尝试了以下两种解决方案:
I have tried the following two solutions:
解决方案1(有效,但速度很慢,记录很多):
Solution 1 (it works, but very slow, it is a lot of records):
DECLARE SQLSCRIPT STRING DEFAULT '';
DECLARE col, val, id STRING;
DECLARE n INT64;
DECLARE i INT64 DEFAULT 1;
SET n= (SELECT COUNT(*) FROM `project.dataset.table1`);
WHILE i <= n DO
SET col = (SELECT col FROM `project.dataset.table1` LIMIT 1);
SET val = (SELECT val FROM `project.dataset.table1` LIMIT 1);
SET id = (SELECT id FROM `project.dataset.table1` LIMIT 1);
SET SQLScript = (SELECT CONCAT('UPDATE `project.dataset.table2`` SET ',col,' = ',val,' WHERE id = ','"',id,'"'));
SET i = i + 1;
END WHILE;
EXECUTE IMMEDIATE SQLSCRIPT;
解决方案2(我无法使用它,它给我以下错误):
Solution 2 (I can't get it to work, it gives me the following error):
[错误执行大查询][1]: https://i.stack.imgur.com/Pv44T.png
[error execution Big Query][1]: https://i.stack.imgur.com/Pv44T.png
EXECUTE IMMEDIATE (SELECT STRING_AGG('UPDATE `project.dataset.table2` SET '||x.col||'="'||x.val||'" WHERE id = "'||x.id||'"', ';')
FROM UNNEST((SELECT ARRAY_AGG(STRUCT(id, col, val))
FROM `project.dataset.table1`)) AS x);
推荐答案
下面是BigQuery标准SQL
Below is for BigQuery Standard SQL
EXECUTE IMMEDIATE '''
CREATE TEMP TABLE pivot1 AS
SELECT id, ''' || (
SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', CAST(value AS " || data_type || "), NULL)) AS " || field_name)
FROM `project.dataset.table1`
JOIN (
SELECT column_name, data_type
FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE tablename = 'table2'
) ON field_name = column_name
) || '''
FROM `project.dataset.table1`
GROUP BY id
''';
EXECUTE IMMEDIATE '''
MERGE `project.dataset.table2` AS t2
USING pivot1 AS t1
ON t2.id = t1.id
WHEN MATCHED THEN
UPDATE SET
''' || (
SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(t1.' || field_name || ', t2.' || field_name || ')')
FROM `project.dataset.table1`
);
SELECT * FROM `project.dataset.table2` ORDER BY id;
如果要应用于您的问题的样本数据(表1和表2)-输出为(突出显示更新)
If to apply to sample data (table1 and table2) from your question - output is (updates are highlighted)
这篇关于如何在Big Query中更新表,其中要更新的字段名称是另一个表中的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!