How to read every .csv file in R and export them into a single large file

Problem description

I have data in the following format:

101,20130826T155649
------------------------------------------------------------------------
3,1,round-0,10552,180,yellow
12002,1,round-1,19502,150,yellow
22452,1,round-2,28957,130,yellow,30457,160,brake,31457,170,red
38657,1,round-3,46662,160,yellow,47912,185,red

I have been reading the files and cleaning/formatting them with this code:

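# Read one file; fill = TRUE pads short rows out to 18 columns
b <- read.table("sid-101-20130826T155649.csv", sep = ",", fill = TRUE,
                col.names = paste("V", 1:18, sep = ""))
b$id <- b[1, 1]     # the id is the first field of the first line
b <- b[-(1:2), ]    # drop the id line and the dashed separator line
b$yellow <- b$V6    # V6 holds the first label ("yellow" in the sample)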

...and so on. There are about 300 files like this, and ideally they would all be compiled without the first two lines, since the first line is just the id and I have made a separate column to identify these data. Does anyone know how to read these tables quickly, clean and format them the way I want, then compile them into one large file and export it?

Recommended answer

You can use lapply to read all the files, clean and format them, and store the resulting data frames in a list. Then use do.call to combine all of the data frames into a single large data frame.

# Get vector of file names to read (only names ending in ".csv")
files.to.load = list.files(pattern="\\.csv$")

# Read the files
df.list = lapply(files.to.load, function(file) {
   df = read.table(file, sep = ',', fill=TRUE, col.names=paste("V", 1:18,sep=""))
   # ... cleaning and formatting code goes here
   df$file.name = file  # In case you need to know which file each row came from
   return(df)
})

# Combine into a single data frame
df.combined = do.call(rbind, df.list)
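
The answer stops at the combined data frame, while the question also asks for the export step. The following is a minimal sketch of how the whole pipeline might look, not a definitive implementation. Several things here are assumptions rather than part of the original answer: the helper name read_one_file, the file pattern "^sid-.*\\.csv$" (inferred from the single file name shown in the question), the output name combined.csv, and the second demo file, whose contents are made up purely for illustration. Every column is read as character so that all files have the same column types before rbind.

```r
# Read one file: line 1 holds the id, line 2 is a dashed separator,
# and the remaining lines are comma-separated records of varying length.
# (read_one_file is a hypothetical helper name, not from the original answer.)
read_one_file <- function(file, n_cols = 18) {
  # The id is the first comma-separated field of the first line
  id <- strsplit(readLines(file, n = 1), ",", fixed = TRUE)[[1]][1]

  # skip = 2 drops the id line and the separator line;
  # fill = TRUE pads short rows out to n_cols columns
  df <- read.table(file, sep = ",", skip = 2, fill = TRUE,
                   col.names = paste0("V", seq_len(n_cols)),
                   colClasses = "character")
  df$id <- id
  df$file.name <- basename(file)  # which file each row came from
  df
}

# --- Demo in a temporary directory ---
demo_dir <- tempfile("demo-")
dir.create(demo_dir)

# Sample data copied from the question
writeLines(c(
  "101,20130826T155649",
  "------------------------------------------------------------------------",
  "3,1,round-0,10552,180,yellow",
  "12002,1,round-1,19502,150,yellow",
  "22452,1,round-2,28957,130,yellow,30457,160,brake,31457,170,red",
  "38657,1,round-3,46662,160,yellow,47912,185,red"
), file.path(demo_dir, "sid-101-20130826T155649.csv"))

# Second file: made-up values, only here to show that rows get combined
writeLines(c(
  "102,20130826T160000",
  "------------------------------------------------------------------------",
  "5,1,round-0,9000,175,yellow"
), file.path(demo_dir, "sid-102-20130826T160000.csv"))

# Assumed naming pattern; it also keeps combined.csv out of later runs
files <- list.files(demo_dir, pattern = "^sid-.*\\.csv$", full.names = TRUE)

# Read every file, then stack the data frames into one
combined <- do.call(rbind, lapply(files, read_one_file))

# Export the result as a single file
out_file <- file.path(demo_dir, "combined.csv")
write.csv(combined, out_file, row.names = FALSE)
```

For the real data, point list.files at the directory holding the 300 or so files instead of demo_dir. Restricting the pattern to the input files' naming scheme matters if the combined file is written to the same directory, since a plain ".csv" pattern would pick it up as an input on the next run.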
