Problem description
I am doing a data migration between two tables (splitting out a related table). The existing table is reminder, and it has a start column and a newly added dateset_id column pointing to a new dateset table, which also has a start column. For every row in reminder, I want to INSERT a new row in dateset with the start value copied over, and UPDATE the corresponding row in reminder with the newly inserted dateset ID.
Here's the SQL I tried:
WITH inserted_datesets AS (
    INSERT INTO dateset (start)
    SELECT start FROM reminder
    RETURNING reminder.id AS reminder_id, id AS dateset_id
)
UPDATE reminder
SET dateset_id = ids.dateset_id
FROM inserted_datesets AS ids
WHERE reminder.id = ids.reminder_id
I get an error, missing FROM-clause entry for table "reminder", because I'm including the reminder.id column in the RETURNING clause but not actually selecting it for the insert. This makes sense, but I can't figure out how to modify the query to do what I need. Is there a totally different approach I'm missing?
Recommended answer
There are several ways to solve this.
1. Temporarily add a column
As others mentioned, the straightforward way is to temporarily add a reminder_id column to dateset. Populate it with the original IDs from the reminder table, use it to join reminder with dateset, and then drop the temporary column.
2. When start is unique
If the values of the start column are unique, it is possible to do this without an extra column by joining the reminder table with the dateset table on the start column.
INSERT INTO dateset (start)
SELECT start FROM reminder;

WITH
CTE_Joined
AS
(
    SELECT
        reminder.id AS reminder_id
        ,dateset.id AS new_dateset_id
    FROM
        reminder
        INNER JOIN dateset ON dateset.start = reminder.start
)
-- PostgreSQL cannot UPDATE a CTE directly, so update reminder and join to the CTE.
UPDATE reminder
SET dateset_id = CTE_Joined.new_dateset_id
FROM CTE_Joined
WHERE reminder.id = CTE_Joined.reminder_id
;
3. When start is not unique
It is possible to do this without a temporary column even in this case. The main idea is the following. Let's have a look at this example:
We have two rows in reminder with the same start value and IDs 3 and 7:
reminder
id start dateset_id
3 2015-01-01 NULL
7 2015-01-01 NULL
After we insert them into dateset, new IDs will be generated, for example 1 and 2:
dateset
id start
1 2015-01-01
2 2015-01-01
It doesn't really matter how we link these two rows. The end result could be
reminder
id start dateset_id
3 2015-01-01 1
7 2015-01-01 2
or
reminder
id start dateset_id
3 2015-01-01 2
7 2015-01-01 1
Both of these variants are correct, which brings us to the following solution.
Simply insert all rows first.
INSERT INTO dateset (start)
SELECT start FROM reminder;
Then match/join the two tables on the start column, knowing that it is not unique. "Make it" unique by adding ROW_NUMBER and joining on both columns. The query could be made shorter, but I spelled out each step explicitly:
WITH
CTE_reminder_rn
AS
(
    -- Number the reminder rows within each start value.
    SELECT
        id
        ,start
        ,ROW_NUMBER() OVER (PARTITION BY start ORDER BY id) AS rn
    FROM reminder
)
,CTE_dateset_rn
AS
(
    -- Number the dateset rows within each start value the same way.
    SELECT
        id
        ,start
        ,ROW_NUMBER() OVER (PARTITION BY start ORDER BY id) AS rn
    FROM dateset
)
,CTE_Joined
AS
(
    -- Pair the n-th reminder row with the n-th dateset row for each start.
    SELECT
        CTE_reminder_rn.id AS reminder_id
        ,CTE_dateset_rn.id AS new_dateset_id
    FROM
        CTE_reminder_rn
        INNER JOIN CTE_dateset_rn ON
            CTE_dateset_rn.start = CTE_reminder_rn.start AND
            CTE_dateset_rn.rn = CTE_reminder_rn.rn
)
-- PostgreSQL cannot UPDATE a CTE directly, so update reminder and join to the CTE.
UPDATE reminder
SET dateset_id = CTE_Joined.new_dateset_id
FROM CTE_Joined
WHERE reminder.id = CTE_Joined.reminder_id
;
I hope it is clear from the code what it does, especially when you compare it to the simpler version without ROW_NUMBER. The more complex solution also works when start is unique, but it is less efficient than the simple one.
This solution assumes that dateset is empty before this process.
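Since pre-existing dateset rows with the same start values would shift the ROW_NUMBER pairing, it may be worth checking the result after the migration. The queries below are only a sketch of such a check, using the table and column names from the question; both should return no rows if every reminder was linked to its own dateset row with a matching start.

```sql
-- Reminders that were not linked, or that point at a dateset row
-- whose start differs from their own.
SELECT r.id
FROM reminder AS r
LEFT JOIN dateset AS d ON d.id = r.dateset_id
WHERE d.id IS NULL
   OR d.start IS DISTINCT FROM r.start;

-- dateset rows that ended up shared by more than one reminder.
SELECT dateset_id, count(*) AS reminder_count
FROM reminder
WHERE dateset_id IS NOT NULL
GROUP BY dateset_id
HAVING count(*) > 1;
```

Note that reminder rows with a NULL start are not matched by the equality join in the solutions above, so they would show up in the first query.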