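I am using data.table::fread on a Windows 10 machine to read data from a .csv file. read.csv reads the data correctly; however, when I read it with fread, the last column in every row of the resulting data.table ends with \r, presumably a carriage return. As a result, numeric fields are given the character type. (Instead of the numeric literal 4.53, the cell at the end of the row contains the string literal 4.53\r.)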

Why does this error occur? Is there a way to fix it directly through the fread call?

Update

With the verbose = TRUE argument, I get the following output:

Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.000001 GB.
Memory mapping ... ok
Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... ','
Detected 7 columns. Longest stretch was from line 1 to line 13
Starting data input on line 1 (either column names or first row of data). First 10 characters: subjectNum
All the fields on line 1 are character fields. Treating as the column names.
Count of eol: 13 (including 1 at the end)
Count of sep: 72
nrow = MIN( nsep [72] / ncol [7] -1, neol [13] - nblank [1] ) = 12
Type codes (   first 5 rows): 1131414
Type codes: 1131414 (after applying colClasses and integer64)
Type codes: 1131414 (after applying drop or select (if supplied)
Allocating 7 column slots (7 - 0 dropped)
Read 12 rows. Exactly what was estimated and allocated up front
   0.000s (  0%) Memory map (rerun may be quicker)
   0.001s ( 33%) sep and header detection
   0.000s (  0%) Count rows (wc -l)
   0.002s ( 67%) Column type detection (first, middle and last 5 rows)
   0.000s (  0%) Allocation of 12x7 result (xMB) in RAM
   0.000s (  0%) Reading data
   0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
   0.000s (  0%) Coercing data already read in type bumps (if any)
   0.000s (  0%) Changing na.strings to NA
   0.003s        Total

Best answer

If your file looks like x="a\n1\r\n2\r\n", then fread(x) gives the result you describe:

     a
1: 1\r
2: 2\r

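This happens because the end-of-line marker is not consistent across lines. The verbose output in the question fits this picture: fread reports "Detected eol as \n only", so it appears to settle on \n from the header line, and the \r that precedes \n on the later lines is then kept as part of the last field.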
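One way to confirm that a file really mixes line endings is to count them at the byte level. The following is a minimal sketch in base R, not part of the original answer; the helper name count_eol and the temporary file are hypothetical and only serve the illustration.

```r
# Hypothetical helper: count "\r\n" (CRLF) and bare "\n" (LF-only) line endings.
count_eol <- function(path) {
  # Read the file as raw bytes so nothing translates the line endings.
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  lf <- which(bytes == as.raw(0x0a))   # positions of every "\n"
  prev <- lf[lf > 1] - 1               # position of the byte before each "\n"
  crlf <- sum(bytes[prev] == as.raw(0x0d))
  c(crlf = crlf, lf_only = length(lf) - crlf)
}

# Recreate the answer's example string as a file, byte for byte.
tmp <- tempfile(fileext = ".csv")
writeBin(charToRaw("a\n1\r\n2\r\n"), tmp)

counts <- count_eol(tmp)
print(counts)
```

If both counts are non-zero, the file mixes the two conventions, which is the situation described above.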

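I have heard of this happening to other people, but I am not sure where it comes from, or whether there is a better way to deal with it than "fixing" the file (perhaps with a command-line tool).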
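If a command-line tool is not convenient, the "fix" can also be done in memory from R before the data reaches fread. This is only a sketch under stated assumptions, not a built-in fread option: the helper name normalize_eol is hypothetical, the file is assumed small enough to hold as a single string, and every \r\n is assumed to be a line ending rather than part of a quoted field.

```r
# Hypothetical helper: return the file contents with every "\r\n" turned into "\n".
normalize_eol <- function(path) {
  size <- file.info(path)$size
  # readChar opens the path in binary mode, so "\r" bytes are preserved as-is.
  txt <- readChar(path, nchars = size, useBytes = TRUE)
  gsub("\r\n", "\n", txt, fixed = TRUE)
}

# Recreate the answer's example string as a file, byte for byte.
tmp <- tempfile(fileext = ".csv")
writeBin(charToRaw("a\n1\r\n2\r\n"), tmp)

clean <- normalize_eol(tmp)

# fread treats an input string that contains "\n" as data rather than a file name.
# Guarded so the sketch still runs where data.table is not installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::fread(clean)
  print(sapply(dt, class))
}
```

The trade-off is that the whole file is read into memory twice (once as text, once by fread), so for large files rewriting the file on disk may be the better choice.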

For more on "r - Why does fread insert carriage returns (\r) into a data.table?", see the original question on Stack Overflow: https://stackoverflow.com/questions/37870424/
