Problem description
I'm trying to port a database from MySQL to PostgreSQL. I've rebuilt the schema in Postgres, so all I need to do is get the data across, without recreating the tables.
I could do this with code that iterates over all the records and inserts them one at a time, but I tried that and it's way too slow for our database size, so I'm trying to use mysqldump and a pipe into psql instead (once per table, which I may parallelize once I get it working).
I've had to jump through various hoops to get this far, turning various flags on and off to get a dump that is vaguely sane. Again, this dumps only the INSERT INTO statements, since I've already prepared the empty schema to receive the data:
/usr/bin/env \
PGPASSWORD=mypassword \
mysqldump \
-h mysql-server \
-u mysql-username \
--password=mysql-password \
mysql-database-name \
table-name \
--compatible=postgresql \
--compact \
-e -c -t \
--default-character-set=utf8 \
| sed "s/\\\\\\'/\\'\\'/g" \
| psql \
-h postgresql-server \
--username=postgresql-username \
postgresql-database-name
Everything except that ugly sed command is manageable. I'm using that sed to try to convert MySQL's approach to quoting single quotes inside strings ('O\'Connor') to PostgreSQL's quoting requirements ('O''Connor'). It works until the dump contains a string like this: 'String ending with a backslash \\'. And yes, it seems some user input in our database has this format, which is perfectly valid but doesn't survive my sed command. I could add a lookbehind to the sed command, but I feel like I'm crawling into a rabbit hole. Is there a way to either:
a) Tell mysqldump to quote single quotes by doubling them up, or
b) Tell psql to expect backslashes to be interpreted as quoting escapes?
I have another issue with BINARY and bytea differences, but I've worked around that with a base64 encoding/decoding phase.
EDIT | Looks like I can do (b) with set backslash_quote = on; set standard_conforming_strings = off; though I'm not sure how to inject that into the start of the piped output.
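One way the injection mentioned in the EDIT might be done is to put a small shell function in the pipe that writes the two SET statements first and then passes the dump through unchanged. This is a minimal sketch, assuming a POSIX shell; the function name prepend_settings is made up for illustration, and the pipeline in the comment reuses the placeholder names from the command above. With standard_conforming_strings turned off, PostgreSQL interprets backslashes in ordinary string literals as escapes, so the sed step should no longer be needed.

```shell
#!/bin/sh
# prepend_settings: hypothetical helper, not part of mysqldump or psql.
# It writes the session settings first, then copies stdin through unchanged.
prepend_settings() {
  printf '%s\n' \
    'SET backslash_quote = on;' \
    'SET standard_conforming_strings = off;'
  cat
}

# Intended use (placeholders as in the command above; not run here):
#   mysqldump ... --compact -e -c -t mysql-database-name table-name \
#     | prepend_settings \
#     | psql -h postgresql-server --username=postgresql-username postgresql-database-name

# Small demonstration on a single MySQL-style statement
printf '%s\n' "INSERT INTO t VALUES ('O\\'Connor');" | prepend_settings
```

PostgreSQL may emit escape-string warnings for such literals; those are controlled by the escape_string_warning setting. Setting the same options through the PGOPTIONS environment variable for psql (for example -c standard_conforming_strings=off) is another possibility that avoids touching the stream at all.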
Recommended answer
Dump the tables to TSV using mysqldump's --tab option, then import them using PostgreSQL's COPY command (\copy from within psql).
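The two steps could be sketched as follows. This is only an illustration under some assumptions: --tab makes the MySQL server itself write one .txt file per table, so mysqldump is assumed to run on the same host as the server with an account that has the FILE privilege, and /tmp/dump is an assumed directory the server is allowed to write to. The helper copy_command is hypothetical; it only builds the \copy meta-command for a given table, and the real mysqldump and psql calls are shown as comments with the placeholder names from the question.

```shell
#!/bin/sh
# copy_command: hypothetical helper that prints the psql \copy meta-command
# for one table. $1 = table name, $2 = directory holding <table>.txt
copy_command() {
  printf "\\\\copy \"%s\" FROM '%s/%s.txt'\n" "$1" "$2" "$1"
}

# Intended use (not run here; /tmp/dump is an assumed directory):
#   mysqldump -h mysql-server -u mysql-username --password=mysql-password \
#     --tab=/tmp/dump --default-character-set=utf8 \
#     mysql-database-name table-name
#   psql -h postgresql-server --username=postgresql-username \
#     -c "$(copy_command table-name /tmp/dump)" postgresql-database-name

# Show the command that would be passed to psql
copy_command table-name /tmp/dump
```

The tab-separated file uses backslash escapes and \N for NULL, which is what COPY's default text format expects, so the single-quote problem does not come up at all. Binary columns would still need separate handling, as noted in the question.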