Problem description
I want to import csv files into R, with the first non-empty line supplying the names of the data frame columns. I know that you can supply the skip argument (e.g. skip = 0) to control where reading starts. However, the row number of the first non-empty line can change between files.
How do I work out how many lines are empty, and dynamically skip them for each file?
As pointed out in the comments, I need to clarify what "blank" means. My csv files look like:
,,,
w,x,y,z
a,b,5,c
a,b,5,c
a,b,5,c
a,b,4,c
a,b,4,c
a,b,4,c
In other words, there are rows containing only commas at the start.
Recommended answer
read.csv automatically skips blank lines (unless you set blank.lines.skip = FALSE). See ?read.csv
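As a quick check of that default, here is a minimal sketch using made-up inline data (the csv_text string is an assumption for illustration, not one of the poster's files). It only covers lines that are truly empty:

```r
# Inline sample with two truly empty lines before the header (made-up data).
csv_text <- "\n\nw,x,y,z\na,b,5,c\na,b,4,c\n"

# blank.lines.skip defaults to TRUE in read.csv, so the empty lines are
# ignored and the first non-empty line is used as the header.
df <- read.csv(text = csv_text)

names(df)  # column names taken from the first non-empty line
nrow(df)   # number of data rows read
```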
After I wrote the above, the poster explained that the blank lines are not actually blank: they contain commas with nothing between them. In that case, use fread from the data.table package, which handles this. Its skip= argument can be set to a character string that appears in the header, and fread starts reading at the first line containing that string:
library(data.table)
DT <- fread("myfile.csv", skip = "w") # assuming w is in the header
DF <- as.data.frame(DT)
The last line can be omitted if a data.table is acceptable as the returned value.
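If you would rather stay in base R, or no fixed header string is known in advance, one possible alternative is to count the leading comma-only lines yourself and pass that count to skip. This is a sketch that is not part of the original answer: the helper name read_csv_skip_leading is hypothetical, and it assumes that a line made up only of commas and whitespace counts as empty.

```r
# Hypothetical helper (not from the original answer): count the leading lines
# that contain only commas and/or whitespace, then skip them in read.csv.
read_csv_skip_leading <- function(path, ...) {
  lines <- readLines(path, warn = FALSE)
  # TRUE for lines with at least one character other than a comma or whitespace
  has_content <- grepl("[^,[:space:]]", lines)
  if (!any(has_content)) stop("no non-empty line found in ", path)
  # Number of lines before the first line that has real content
  n_skip <- which(has_content)[1] - 1L
  read.csv(path, skip = n_skip, ...)
}

# Usage on a temporary file that mimics the layout shown in the question.
tmp <- tempfile(fileext = ".csv")
writeLines(c(",,,", ",,,", "w,x,y,z", "a,b,5,c", "a,b,4,c"), tmp)
df <- read_csv_skip_leading(tmp)
names(df)
```

This reads the whole file once just to find the header. For large files, passing n = to readLines so that only the first few lines are scanned may be preferable.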