Problem description
Since migrating to SQL Server 2016, we have been trying to import Unicode characters into a table via bulk import, using non-XML format files and UTF-8 encoded data files (with Unix (LF) newlines). The format files specify the host file data length but not the terminator. The host file data type is SQLCHAR. My BULK INSERT statement looks like this:
SET @cmd = N'Bulk Insert myTable from ''D:\DATA\datafile'' with (DATAFILETYPE =''widechar'', KEEPNULLS, FORMATFILE = ''D:\DATA\fmt\formatfile.ftm'' ,MAXERRORS = 0, codepage=65001, FIRSTROW = 1)'
EXECUTE (@cmd)
Format file:
9.0
7
1 SQLCHAR 0 8 "" 1 column_1 ""
2 SQLCHAR 0 8 "" 2 column_2 ""
3 SQLCHAR 0 12 "" 3 column_3 ""
4 SQLCHAR 0 8 "" 4 column_4 ""
5 SQLCHAR 0 12 "" 5 column_5 ""
6 SQLCHAR 0 40 "" 6 column_6 ""
7 SQLCHAR 0 8 ";\n" 7 column_7 ""
Table definition:
CREATE TABLE #myTable
( [column_1] [nvarchar](8) NOT NULL,
[column_2] [nvarchar](8) NULL,
[column_3] [nvarchar](12) NULL,
[column_4] [nvarchar](8) NOT NULL,
[column_5] [nvarchar](12) NULL,
[column_6] [nvarchar](40) NULL,
[column_7] [datetime] NULL)
BULK INSERT imports ordinary characters (A, B, C, ...) without any trouble. But for characters like Ä or Ü, BULK INSERT seems to split the character in two, so the file data length grows by one (if the string contains just one such character). The last character of the string then ends up as the first character of the next table column. Any idea how to solve this? Since the data files come from a third party, I cannot influence how they are created or manipulated.
P.S.: BTW, the BCP tool does the same.
Thanks for your help!
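The column shift described above is consistent with fixed-width fields being measured in bytes rather than characters. The following Python sketch is only an illustration (the sample strings and the `widths` helper are made up for this example, not taken from the data files): it shows that an 8-character value containing one Ä takes 9 bytes in UTF-8, but a constant 16 bytes in UTF-16LE.

```python
# Illustration only: compare character count with encoded byte count.
# The sample values are hypothetical; "\u00c4" is the precomposed letter Ä.
samples = ["ABCDEFGH", "\u00c4BCDEFGH"]


def widths(text):
    """Return (characters, UTF-8 bytes, UTF-16LE bytes) for text."""
    return (
        len(text),
        len(text.encode("utf-8")),
        len(text.encode("utf-16-le")),
    )


for s in samples:
    chars, utf8_len, utf16_len = widths(s)
    print(f"{s!r}: {chars} chars, {utf8_len} UTF-8 bytes, {utf16_len} UTF-16LE bytes")
```

With a fixed field length of 8 and no terminator, the ninth UTF-8 byte would fall into the next field, which matches the symptom in the question.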
Recommended answer
Solved! My steps:
- converted the data files to Unicode
- in the BULK INSERT statement, set CodePage = 65001 and Datafiletype = 'widechar'
- in the non-XML format files, set the host file data type to SQLNCHAR (thanks to Panagiotis, see the comments on the question) and doubled the host file data length.
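The answer does not say how the files were converted. One possible way to do the first step is sketched below in Python, assuming that "Unicode" here means UTF-16LE (the encoding SQL Server uses for wide character data) and that no byte order mark should be written. The function name and the paths in the usage comment are hypothetical.

```python
from pathlib import Path


def convert_utf8_to_utf16le(src, dst):
    """Re-encode a UTF-8 text file as UTF-16LE, leaving line endings untouched."""
    # Work on raw bytes so that LF newlines are not translated.
    # "utf-8-sig" also drops a leading UTF-8 BOM if the source has one.
    text = Path(src).read_bytes().decode("utf-8-sig")
    # Assumption: write without a BOM. The whole file is held in memory,
    # so very large files would need a chunked variant.
    Path(dst).write_bytes(text.encode("utf-16-le"))


# Usage (hypothetical paths):
# convert_utf8_to_utf16le(r"D:\DATA\datafile", r"D:\DATA\datafile_utf16")
```

After the conversion, every character in the Basic Multilingual Plane (including Ä and Ü) takes exactly 2 bytes, which is why doubling the host file data length (for example 8 to 16) lines the fields up again. Characters outside that plane would take 4 bytes and would still break fixed-width fields.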