Problem description

I'm trying to port a database from MySQL to PostgreSQL. I've rebuilt the schema in Postgres, so all I need to do is get the data across, without recreating the tables.

I could do this with code that iterates over all the records and inserts them one at a time, but I tried that and it's way too slow for our database size, so I'm trying to use mysqldump and a pipe into psql instead (once per table, which I may parallelize once I get it working).

I've had to jump through various hoops to get this far, turning on and off various flags to get a dump that is vaguely sane. Again, this only dumps the INSERT INTO, since I've already prepared the empty schema to get the data into:

      /usr/bin/env \
      PGPASSWORD=mypassword \
      mysqldump \
      -h mysql-server \
      -u mysql-username \
      --password=mysql-password \
      mysql-database-name \
      table-name \
      --compatible=postgresql \
      --compact \
      -e -c -t \
      --default-character-set=utf8 \
      | sed "s/\\\\\\'/\\'\\'/g" \
      | psql \
      -h postgresql-server \
      --username=postgresql-username \
      postgresql-database-name

Everything except that ugly sed command is manageable. I'm using sed to try to convert MySQL's way of quoting single quotes inside strings ('O\'Connor') to PostgreSQL's quoting requirements ('O''Connor'). It works until the dump contains a string like this: 'String ending with a backslash \\'. And yes, it seems there is some user input in our database in this format, which is perfectly valid but breaks my sed command. I could add a lookbehind to the sed command, but I feel like I'm crawling into a rabbit hole. Is there a way to either:

a) Tell mysqldump to quote single quotes by doubling them up
b) Tell psql to expect backslashes to be interpreted as quoting escapes?
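The failure described above can be reproduced without either database. The sketch below uses a hypothetical function, naive_requote, that applies the same kind of substitution as the sed stage in the pipeline (the escaping is simplified here, so it is not a byte-for-byte copy of the original command). It feeds in the two sample strings from the question:

```shell
# Hypothetical stand-in for the sed stage: replace every \' with ''.
# Inside double quotes the shell turns \\\\ into \\, so sed sees s/\\'/''/g.
naive_requote() {
  sed "s/\\\\'/''/g"
}

# Quoted here-doc delimiter, so the backslashes reach sed untouched.
naive_requote <<'EOF'
'O\'Connor'
'String ending with a backslash \\'
EOF
# Line 1 becomes 'O''Connor' (the intended result).
# Line 2 becomes 'String ending with a backslash \'' because the second
# backslash of the escaped pair is mistaken for a quote escape, which
# leaves the string literal unterminated.
```

A substitution that looks at one backslash at a time cannot tell an escaped backslash from a quote escape, which is why a purely textual fix keeps growing.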

I have another issue with BINARY and bytea differences, but I've worked around that with a base64 encoding/decoding phase.

EDIT | Looks like I can do (b) by running set backslash_quote = on; set standard_conforming_strings = off; first, though I'm not sure how to inject that into the start of the piped output.
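One possible way to get those two statements in front of the dump is to put a small filter between mysqldump and psql. This is a minimal sketch, not a tested migration recipe; with_pg_preamble is a hypothetical helper name:

```shell
# Hypothetical helper: print the session settings first, then copy stdin
# through unchanged, so psql runs the SET statements before any INSERT.
with_pg_preamble() {
  printf '%s\n' \
    "SET backslash_quote = on;" \
    "SET standard_conforming_strings = off;"
  cat
}

# Usage sketch, with the same mysqldump and psql arguments as in the
# question and the sed stage dropped:
#   mysqldump ... | with_pg_preamble | psql ...
```

Because the whole stream is read by a single psql session, the settings stay in effect for every statement that follows. Passing the same settings through the PGOPTIONS environment variable on the psql side (for example -c standard_conforming_strings=off) should have a similar effect, though that variant is not shown here.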

Recommended answer

Dump the tables to TSV using mysqldump's --tab option and then import using psql's COPY method.
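A rough sketch of what that could look like follows. The directory /tmp/dump and the helper copy_command are assumptions for illustration. Note that mysqldump --tab has the MySQL server itself write the data file, so the directory has to be writable on the server host and the MySQL user needs the FILE privilege.

```shell
# Hypothetical helper (not from the answer): build the psql \copy
# meta-command that loads one tab-separated file into one table.
copy_command() {
  # $1 = target table, $2 = path to the .txt file written by mysqldump --tab
  printf '%s\n' "\\copy $1 FROM '$2' WITH (FORMAT text)"
}

# Usage sketch, left commented out because it needs live servers.
# Placeholders are the ones used in the question.
#
#   mysqldump -h mysql-server -u mysql-username --password=mysql-password \
#     --tab=/tmp/dump mysql-database-name table-name
#   psql -h postgresql-server --username=postgresql-username \
#     -c "$(copy_command table-name /tmp/dump/table-name.txt)" \
#     postgresql-database-name
```

The appeal of this route is that no SQL string literals are involved, so the single-quote escaping problem does not arise. The default --tab output (tab-separated fields, backslash escapes, \N for NULL) appears to line up with PostgreSQL's text COPY format, but that is worth verifying on a sample table, especially for the BINARY columns mentioned in the question.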
