PostgreSQL: insert rows based on a select from another table, and update an FK in that table with the newly inserted rows

Problem description

I am doing a data migration between two tables (splitting out a related table). The existing table is reminder, and it has a start column and a newly added dateset_id column pointing to a new dateset table, which also has a start column. For every row in reminder, I want to INSERT a new row in dateset with the start value copied over, and UPDATE the corresponding row in reminder with the newly inserted dateset ID.

Here's the SQL I tried:

WITH inserted_datesets AS (
  INSERT INTO dateset (start)
  SELECT start FROM reminder
  RETURNING reminder.id AS reminder_id, id AS dateset_id
)
UPDATE reminder
SET dateset_id = ids.dateset_id
FROM inserted_datesets AS ids
WHERE reminder.id = ids.reminder_id

I get an error missing FROM-clause entry for table "reminder", because I'm including the reminder.id column in the RETURNING clause, but not actually selecting it for the insert. This makes sense, but I can't figure out how to modify the query to do what I need. Is there a totally different approach I'm missing?

Recommended answer

There are several ways to solve this.

1. Temporarily add a column

As others mentioned, the straightforward way is to temporarily add a reminder_id column to dateset. Populate it with the original IDs from the reminder table, use it to join reminder with dateset, and then drop the temporary column.
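
The steps above could look roughly like the following. This is only a sketch: it assumes that reminder.id is an integer and that dateset.id is generated automatically (serial or identity), neither of which is stated in the question.

```sql
BEGIN;

-- Temporary link column; the integer type is an assumption about reminder.id.
ALTER TABLE dateset ADD COLUMN reminder_id integer;

-- Copy start and remember which reminder row each new dateset row came from.
INSERT INTO dateset (start, reminder_id)
SELECT start, id
FROM reminder;

-- Point each reminder at the dateset row created for it.
UPDATE reminder
SET dateset_id = dateset.id
FROM dateset
WHERE dateset.reminder_id = reminder.id;

-- The link column is no longer needed.
ALTER TABLE dateset DROP COLUMN reminder_id;

COMMIT;
```

Because PostgreSQL runs ALTER TABLE inside transactions, wrapping the steps in BEGIN/COMMIT means a failure part-way through leaves no leftover column behind.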

2. When start is unique

If the values of the start column are unique, it is possible to do it without an extra column by joining the reminder table with the dateset table on the start column.

INSERT INTO dateset (start)
SELECT start FROM reminder;

WITH
CTE_Joined
AS
(
    SELECT
        reminder.id AS reminder_id
        ,dateset.id AS new_dateset_id
    FROM
        reminder
        INNER JOIN dateset ON dateset.start = reminder.start
)
-- PostgreSQL cannot UPDATE a CTE directly, so update reminder and join to the CTE.
UPDATE reminder
SET dateset_id = CTE_Joined.new_dateset_id
FROM CTE_Joined
WHERE reminder.id = CTE_Joined.reminder_id
;

3. When start is not unique

It is possible to do it without a temporary column even in this case. The main idea is the following. Let's have a look at this example:

We have two rows in reminder with the same start value and IDs 3 and 7:

reminder
id    start         dateset_id
3     2015-01-01    NULL
7     2015-01-01    NULL

After we insert them into the dateset, there will be new IDs generated, for example, 1 and 2:

dateset
id    start
1     2015-01-01
2     2015-01-01

It doesn't really matter how we link these two rows. The end result could be

reminder
id    start         dateset_id
3     2015-01-01    1
7     2015-01-01    2

or

reminder
id    start         dateset_id
3     2015-01-01    2
7     2015-01-01    1

Both of these variants are correct. Which brings us to the following solution.

Simply insert all rows first.

INSERT INTO dateset (start)
SELECT start FROM reminder;

Then match/join the two tables on the start column, knowing that it is not unique. "Make it" unique by adding ROW_NUMBER and joining on two columns. It is possible to make the query shorter, but I spelled out each step explicitly:

WITH
CTE_reminder_rn
AS
(
    SELECT
        id
        ,start
        ,dateset_id
        ,ROW_NUMBER() OVER (PARTITION BY start ORDER BY id) AS rn
    FROM reminder
)
,CTE_dateset_rn
AS
(
    SELECT
        id
        ,start
        ,ROW_NUMBER() OVER (PARTITION BY start ORDER BY id) AS rn
    FROM dateset
)
,CTE_Joined
AS
(
    SELECT
        CTE_reminder_rn.id AS reminder_id
        ,CTE_dateset_rn.id AS new_dateset_id
    FROM
        CTE_reminder_rn
        INNER JOIN CTE_dateset_rn ON
            CTE_dateset_rn.start = CTE_reminder_rn.start AND
            CTE_dateset_rn.rn = CTE_reminder_rn.rn
)
-- PostgreSQL cannot UPDATE a CTE directly, so update reminder and join to the CTE.
UPDATE reminder
SET dateset_id = CTE_Joined.new_dateset_id
FROM CTE_Joined
WHERE reminder.id = CTE_Joined.reminder_id
;

I hope it is clear from the code what it does, especially when you compare it to the simpler version without ROW_NUMBER. Obviously, the complex solution will also work when start is unique, but it is not as efficient as the simple solution.

This solution assumes that dateset is empty before this process.
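
After running either variant, a few sanity checks along these lines may help confirm that the rows were linked as intended. This is a sketch, not part of the original answer; the "expected" results assume that dateset was empty beforehand and that reminder.start contains no NULLs (rows with a NULL start would not match the join and would stay unlinked).

```sql
-- Reminders that were not linked (expected: 0).
SELECT count(*) AS unlinked
FROM reminder
WHERE dateset_id IS NULL;

-- dateset rows referenced by more than one reminder (expected: no rows).
SELECT dateset_id, count(*) AS uses
FROM reminder
WHERE dateset_id IS NOT NULL
GROUP BY dateset_id
HAVING count(*) > 1;

-- Linked pairs whose start values disagree (expected: no rows).
SELECT r.id AS reminder_id, r.start AS reminder_start, d.start AS dateset_start
FROM reminder AS r
INNER JOIN dateset AS d ON d.id = r.dateset_id
WHERE d.start IS DISTINCT FROM r.start;
```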
