Problem description
In an R script, I'd like to use a pathname from a row in a data frame in a for loop. For example, I'd like to get the directory size for each of the paths in column one of the following data frame, by substituting "." in the code:
sum(file.info(list.files(".", all.files = TRUE, recursive = TRUE))$size)
with a value from the 'pathname' column.
  pathname                         size
1 F:/Asus/C_drive/WEPP/bitmap      0
2 F:/Asus/C_drive/WEPP/Data        0
3 F:/Asus/C_drive/WEPP/misc        0
4 F:/Asus/C_drive/WEPP/output      0
5 F:/Asus/C_drive/WEPP/runs        0
6 F:/Asus/C_drive/WEPP/tools       0
7 F:/Asus/C_drive/WEPP/watersheds  0
8 F:/Asus/C_drive/WEPP/wepp        0
9 F:/Asus/C_drive/WEPP/weppwin     0
Recommended answer
Here is a solution in base R that uses the directory structure under my R working directory. We'll build a data frame with the directory, file name, and file size of each entry, then aggregate to the directory level.
theDirectories <- list.dirs(getwd()) # start with the current working directory
# for this example answer, only use the first 2 directories in the vector
fileList <- lapply(theDirectories[1:2], function(x) {
  files <- list.files(path = x, full.names = TRUE)      # full paths, needed by file.size()
  fileName <- list.files(path = x, full.names = FALSE)  # bare names for the output
  size <- file.size(files)  # vectorized; also yields numeric(0) for an empty directory
  pathName <- rep(x, length(size))
  data.frame(pathName, fileName, size, stringsAsFactors = FALSE)
})
fileSizes <- do.call(rbind, fileList)              # stack the per-directory data frames
aggregate(size ~ pathName, data = fileSizes, sum)  # total size per directory
...and the output:
> aggregate(size ~ pathName, data = fileSizes, sum)
  pathName                                 size
1 /Users/lgreski/gitrepos/datascience      3461684
2 /Users/lgreski/gitrepos/datascience/.git 137811
>
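For the literal version of the question (a for loop over the 'pathname' column that fills in 'size'), a minimal sketch might look like the following. The helper name dir_size and the temporary demo directories are assumptions made for illustration; they stand in for the F:/Asus/... paths, which are not available here. Note the added full.names = TRUE: without it, list.files() returns names relative to the directory, which only resolve correctly when the path is ".".

```r
# Hypothetical helper (not from the original answer): total size in bytes
# of every file under `path`, including hidden files and subdirectories.
dir_size <- function(path) {
  files <- list.files(path, all.files = TRUE, recursive = TRUE,
                      full.names = TRUE)  # full paths so file.size() can find them
  sum(file.size(files))
}

# Demo data: a throwaway directory tree stands in for the real paths.
root <- file.path(tempdir(), "dir_size_demo")
dirs <- file.path(root, c("bitmap", "Data"))
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)
writeBin(raw(10), file.path(dirs[1], "a.bin"))  # 10-byte file
writeBin(raw(25), file.path(dirs[2], "b.bin"))  # 25-byte file

# Data frame shaped like the one in the question: pathname plus a size of 0.
df <- data.frame(pathname = dirs, size = 0, stringsAsFactors = FALSE)

# Substitute each row's pathname for "." and store the result in that row.
for (i in seq_len(nrow(df))) {
  df$size[i] <- dir_size(df$pathname[i])
}
print(df)
```

The loop could also be replaced by a single vapply(df$pathname, dir_size, numeric(1)) call; the explicit loop is kept here only because the question asks for one.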