我正在使用MySQL 5.6,并且想修改一个表的默认编码(从latin1到utf8),而无需修改现有的列和行。
基于documentation我尝试了以下命令:
ALTER TABLE mytable DEFAULT CHARACTER SET utf8;
它修改了我的表的默认字符集编码,并且没有修改列的排序规则,但是正如我所料,我真的很惊讶地看到:
Query OK, 32141 rows affected (6.31 sec)
Records: 32141 Duplicates: 0 Warnings: 0
除了“受影响的32141行”之外,结果与预期的一样,如下所示:
MySQL> select count(*) from mytable;
+----------+
| count(*) |
+----------+
| 32141 |
+----------+
1 row in set (0.01 sec)
MySQL> show table status like 'mytable';
+-----------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------------+----------+----------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+-----------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------------+----------+----------------+---------+
| mytable | InnoDB | 10 | Compact | 16723 | 20798 | 347815936 | 0 | 21561344 | 15728640 | NULL | NULL | NULL | NULL | utf8_general_ci | NULL | partitioned | |
+-----------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------------+----------+----------------+---------+
MySQL> show create table mytable;
CREATE TABLE `mytable` (
`ID` varchar(255) NOT NULL,
`COL1` double DEFAULT NULL,
`COL2` longtext CHARACTER SET latin1,
`COL3` datetime DEFAULT NULL,
`COL4` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`COL5` int(11) DEFAULT NULL,
`COL6` datetime DEFAULT NULL,
`COL7` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`COL8` datetime(3) NOT NULL,
`COL9` int(11) NOT NULL DEFAULT '-1',
`COL10` int(11) DEFAULT '0',
`COL11` double DEFAULT '0',
PRIMARY KEY (`ID`,`COL9`),
KEY `idx1` (`COL7`,`COL3`,`COL6`),
KEY `idx2` (`COL1`,`COL4`,`COL3`,`COL6`),
KEY `idx3` (`ID`,`COL3`,`COL6`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`COL9`)
(PARTITION p0 VALUES LESS THAN (1) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (2) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (3) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (4) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN (5) ENGINE = InnoDB,
PARTITION p5 VALUES LESS THAN (6) ENGINE = InnoDB,
PARTITION p6 VALUES LESS THAN (7) ENGINE = InnoDB,
PARTITION p7 VALUES LESS THAN (8) ENGINE = InnoDB,
PARTITION p8 VALUES LESS THAN (9) ENGINE = InnoDB,
PARTITION p9 VALUES LESS THAN (10) ENGINE = InnoDB,
PARTITION p10 VALUES LESS THAN (11) ENGINE = InnoDB,
PARTITION p11 VALUES LESS THAN (100) ENGINE = InnoDB,
PARTITION p12 VALUES LESS THAN (101) ENGINE = InnoDB,
PARTITION p13 VALUES LESS THAN (102) ENGINE = InnoDB,
PARTITION p14 VALUES LESS THAN (103) ENGINE = InnoDB,
PARTITION p15 VALUES LESS THAN (104) ENGINE = InnoDB,
PARTITION p16 VALUES LESS THAN (105) ENGINE = InnoDB,
PARTITION p17 VALUES LESS THAN (106) ENGINE = InnoDB,
PARTITION p18 VALUES LESS THAN (107) ENGINE = InnoDB,
PARTITION p19 VALUES LESS THAN (108) ENGINE = InnoDB,
PARTITION p20 VALUES LESS THAN (109) ENGINE = InnoDB,
PARTITION p21 VALUES LESS THAN (110) ENGINE = InnoDB,
PARTITION p22 VALUES LESS THAN (111) ENGINE = InnoDB,
PARTITION p23 VALUES LESS THAN (200) ENGINE = InnoDB,
PARTITION p24 VALUES LESS THAN (201) ENGINE = InnoDB,
PARTITION p25 VALUES LESS THAN (202) ENGINE = InnoDB,
PARTITION p26 VALUES LESS THAN (203) ENGINE = InnoDB,
PARTITION p27 VALUES LESS THAN (204) ENGINE = InnoDB,
PARTITION p28 VALUES LESS THAN (205) ENGINE = InnoDB,
PARTITION p29 VALUES LESS THAN (206) ENGINE = InnoDB,
PARTITION p30 VALUES LESS THAN (207) ENGINE = InnoDB,
PARTITION p31 VALUES LESS THAN (208) ENGINE = InnoDB,
PARTITION p32 VALUES LESS THAN (209) ENGINE = InnoDB,
PARTITION p33 VALUES LESS THAN (210) ENGINE = InnoDB,
PARTITION p34 VALUES LESS THAN (211) ENGINE = InnoDB,
PARTITION p35 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
MySQL> show full columns from mytable;
+--------------------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+--------------------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| ID | varchar(255) | latin1_swedish_ci | NO | PRI | NULL | | select,insert,update,references | |
| COL1 | double | NULL | YES | MUL | NULL | | select,insert,update,references | |
| COL2 | longtext | latin1_swedish_ci | YES | | NULL | | select,insert,update,references | |
| COL3 | datetime | NULL | YES | | NULL | | select,insert,update,references | |
| COL4 | varchar(255) | latin1_swedish_ci | YES | | NULL | | select,insert,update,references | |
| COL5 | int(11) | NULL | YES | | NULL | | select,insert,update,references | |
| COL6 | datetime | NULL | YES | | NULL | | select,insert,update,references | |
| COL7 | varchar(255) | latin1_swedish_ci | YES | MUL | NULL | | select,insert,update,references | |
| COL8 | datetime(3) | NULL | NO | | NULL | | select,insert,update,references | |
| COL9 | int(11) | NULL | NO | PRI | -1 | | select,insert,update,references | |
| COL10 | int(11) | NULL | YES | | 0 | | select,insert,update,references | |
| COL11 | double | NULL | YES | | 0 | | select,insert,update,references | |
+--------------------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
我的连接参数如下:
MySQL> show variables where variable_name like '%char%' or variable_name like '%collation%';
+--------------------------+--------------------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_general_ci |
| collation_server | utf8mb4_general_ci |
+--------------------------+--------------------------------------------------+
注意:
数据是从Java应用程序创建的
在创建数据时,将连接参数设置为utf8
没有与该表关联的FK
当我尝试使用一些新创建的表进行复制时,似乎未修改行。参见下面的“受影响的0行”:
MySQL> select count(*) from mytesttable;
+----------+
| count(*) |
+----------+
| 3 |
+----------+
3 row in set (0.10 sec)
MySQL> alter table mytesttable character set utf8;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
我试图在数据创建过程中将连接参数改回latin1,但它没有改变结果:仍然“影响0行”。
所以我的问题是:
我对命令的理解正确吗? (它不应该修改行)
什么能解释在第一种情况下行受到影响?
编辑
我刚刚发现,如果删除分区,则不会发生此问题。
通过分区,我得到“ XXX个受影响的行”
没有分区,我得到“ 0个受影响的行”
是预期的吗?
编辑2和摘要
原来:
该表使用
latin1
作为默认编码(与列相同)该连接被声明为
utf8
什么有效:
在任何
ALTER TABLE
命令之前,像“é”之类的字符似乎已被latin1
编码(E9
)运行命令
ALTER TABLE mytable CHARACTER SET utf8mb4;
不会修改数据(十六进制命令仍显示E9
)该列仍声明为
latin1
。运行命令
ALTER TABLE mytable MODIFY COL2 LONGTEXT CHARACTER SET utf8mb4
将列更改为utf8mb4
(C3A9
)到目前为止,一切都很好。
剩余的问题:
如何确保表中存在的所有数据都是
latin1
?我尝试了SELECT COL2 FROM mytable WHERE LENGTH(COL2) != CHAR_LENGTH(COL2) LIMIT 1
,但结果为0。够了吗为什么显示
ALTER TABLE mytable CHARACTER SET utf8mb4;
命令似乎未修改数据时出现“ 32141行受影响”?
(当表在同一列上具有分区和索引时发生)
根据上一点,更改表的默认编码是否安全(需要?)?还是只坚持修改列?
非常感谢你的帮助
最佳答案
您一团糟,ALTER
使情况更糟。
首先,将表列声明为latin1
,并且连接声明客户端在使用latin1
(通过SET NAMES latin1
)。如果é
在客户端中实际上是十六进制E9
,那会很好。但是客户端中的数据是UTF-8。因此,é
是将两个字节C3A9
作为2个latin1字符发送到数据库。损坏不明显,因为当您SELECTed
时,损坏被恢复了。
后面的步骤将这些字节中的每一个视为latin1并将其转换为utf8,从而“加倍”编码,从而搞砸了事情。
请参见Trouble with UTF-8 characters; what I see is not what I stored中的“ Mojibake”和“双重编码”。如果要尝试恢复数据,请参见http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases中的相应情况
好吧,显然ALTER TABLE mytable DEFAULT CHARACTER SET utf8;
不仅在更改默认值,而且还在复制表,并在此过程中引入了双重编码。
十多年来,我一直在追求MySQL字符集问题。这是我尚未观察到的新皱纹。
我很确定character_set_system
与您的问题无关。 (但我可能错了!)
设定名称错误
测试用例:
CREATE TABLE mytest ( MYDATA longtext ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SET NAMES latin1;
INSERT INTO mytest VALUES ( "é" );
SELECT MYDATA, HEX(MYDATA) FROM mytest;
运行该测试用例:
mysql> SET NAMES latin1;
mysql> SHOW CREATE TABLE mytest\G
*************************** 1. row ***************************
Table: mytest
Create Table: CREATE TABLE `mytest` (
`MYDATA` longtext
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> INSERT INTO mytest VALUES ( "é" );
mysql> SELECT MYDATA, HEX(MYDATA), LENGTH(MYDATA),
CHAR_LENGTH(MYDATA) FROM mytest;
+--------+-------------+----------------+---------------------+
| MYDATA | HEX(MYDATA) | LENGTH(MYDATA) | CHAR_LENGTH(MYDATA) |
+--------+-------------+----------------+---------------------+
| é | C3A9 | 2 | 2 |
+--------+-------------+----------------+---------------------+
角色看起来不错。但是十六进制看起来像UTF-8,而不是latin1。并且
CHAR_LENGTH
是“错误的”。情况是:
CHARACTER SET latin1
,但是其中有utf8字节。固定字符集时不保留字节:
然后在不更改字节的情况下转换列:
ALTER TABLE tbl MODIFY COLUMN MYDATA LONGBLOB;
ALTER TABLE tbl MODIFY COLUMN MYDATA LONGTEXT CHARACTER SET utf8mb4;
(确保具有您最初拥有的所有属性,例如
NOT NULL
。)如http://mysql.rjweb.org/doc.php/charcoll中所述,这是“两步ALTER”。)(请确保其他规范保持不变-VARCHAR,NOT NULL等)。
分区测试用例:
DROP TABLE IF EXISTS ptest;
CREATE TABLE ptest (
nn INT NOT NULL,
ee LONGTEXT
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY RANGE (nn)
(PARTITION p0 VALUES LESS THAN (1),
PARTITION p1 VALUES LESS THAN MAXVALUE);
SET NAMES latin1;
INSERT INTO ptest (nn, ee) VALUES ( 0, "é" ), ( 1, "ü" );
SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
ALTER TABLE ptest
DEFAULT CHARSET utf8;
SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
SELECT @@version;
SHOW CREATE TABLE ptest\G
分区结果:
mysql> DROP TABLE IF EXISTS ptest;
Query OK, 0 rows affected (0.02 sec)
mysql> CREATE TABLE ptest (
-> nn INT NOT NULL,
-> ee LONGTEXT
-> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
-> PARTITION BY RANGE (nn)
-> (PARTITION p0 VALUES LESS THAN (1),
-> PARTITION p1 VALUES LESS THAN MAXVALUE);
Query OK, 0 rows affected (0.03 sec)
mysql> SET NAMES latin1;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO ptest (nn, ee) VALUES ( 0, "é" ), ( 1, "ü" );
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
+----+------+---------+------------+-----------------+
| nn | ee | HEX(ee) | LENGTH(ee) | CHAR_LENGTH(ee) |
+----+------+---------+------------+-----------------+
| 0 | é | C3A9 | 2 | 2 |
| 1 | ü | C3BC | 2 | 2 |
+----+------+---------+------------+-----------------+
2 rows in set (0.00 sec)
mysql> ALTER TABLE ptest
-> DEFAULT CHARSET utf8;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> SELECT nn, ee, HEX(ee), LENGTH(ee), CHAR_LENGTH(ee) FROM ptest;
+----+------+---------+------------+-----------------+
| nn | ee | HEX(ee) | LENGTH(ee) | CHAR_LENGTH(ee) |
+----+------+---------+------------+-----------------+
| 0 | é | C3A9 | 2 | 2 |
| 1 | ü | C3BC | 2 | 2 |
+----+------+---------+------------+-----------------+
2 rows in set (0.00 sec)
mysql> SELECT @@version;
+-----------------+
| @@version |
+-----------------+
| 5.6.22-71.0-log |
+-----------------+
1 row in set (0.00 sec)
mysql> SHOW CREATE TABLE ptest\G
*************************** 1. row ***************************
Table: ptest
Create Table: CREATE TABLE `ptest` (
`nn` int(11) NOT NULL,
`ee` longtext CHARACTER SET latin1
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (nn)
(PARTITION p0 VALUES LESS THAN (1) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)
嗯...我看不到ALTER问题。您使用的是哪个版本?您是否看到此测试用例存在问题?
关于mysql - Alter Table默认字符集修改MySQL 5.6中的行,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/56898325/