R - Grab the first part of a file name

Problem description

I have lots of files in a directory dir, with names in the following format:

[xyz][sequence of numbers or letters]_[single number]_[more stuff that does not matter].csv

For example, xyz39289_3_8932jda.csv

I would like to write a function that returns the first portion of every file name in that directory. By first portion, I mean the [xyz][sequence of numbers or letters] part, that is, everything before the first underscore. In the example above, that is xyz39289. The function would ultimately return a list such as

[xyz39289, xyz9382, xyz03319927, etc]

How can I do this in R? In Java, I would do the following:

File[] files = new File(dir).listFiles();
ArrayList<String> output = new ArrayList<String>();
for(int i = 0; i < files.length; i++) {
   output.add(files[i].getName().substring(0, files[i].getName().indexOf("_")));
}

Recommended answer

After you get your list of files with list.files (and possibly keep only the files you want, those that begin with xyz), I'd use sub.

files <- list.files(dir)
files <- files[grep("^xyz",files, perl = TRUE)]
filepart <- sub("^(xyz[^_]*)_.*$","\\1",files, perl = TRUE)
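Since the question asks for a function, the sub approach above could be wrapped roughly as follows. This is only a sketch: the name first_parts, its prefix argument, and the throwaway demo directory are made up for illustration, and prefix is assumed to be a plain literal with no regex metacharacters. It relies on the pattern argument of list.files to do the filtering that the answer does with grep.

```r
# Hypothetical helper (not from the original answer): returns the part of
# each file name that comes before the first underscore.
first_parts <- function(dir, prefix = "xyz") {
  # list.files() can filter by regular expression through `pattern`;
  # `prefix` is assumed to contain no regex metacharacters
  files <- list.files(dir, pattern = paste0("^", prefix))
  # keep everything up to, but not including, the first underscore;
  # names without an underscore are returned unchanged by sub()
  sub("^([^_]*)_.*$", "\\1", files)
}

# Demo on a throwaway directory with made-up file names
demo_dir <- file.path(tempdir(), "first_parts_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir,
  c("xyz39289_3_8932jda.csv", "xyz9382_1_abc.csv", "other_1_x.csv")
)))

result <- first_parts(demo_dir)
print(result)  # "xyz39289" "xyz9382"
```

The result is a character vector rather than a list; wrap it in as.list() if a true list is needed.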

There's also a regexpr method that I'm less sure about. Something like:

files <- list.files(dir)
matchdat <- regexpr("^xyz.*?(?=_)",files, perl = TRUE)
filepart <- regmatches(files, matchdat)
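One thing to keep in mind with the regexpr route: regmatches drops elements that did not match, so the result can be shorter than the input when the directory contains files that do not start with xyz. The sketch below uses a made-up vector of names in place of list.files(dir), and shows one possible way to keep the output aligned with the input by filling non-matches with NA.

```r
# Made-up file names standing in for list.files(dir)
files <- c("xyz39289_3_8932jda.csv", "notes.txt", "xyz9382_1_abc.csv")

# Lazy match from "xyz" up to (not including) the first underscore
m <- regexpr("^xyz.*?(?=_)", files, perl = TRUE)

# Non-matching elements are silently dropped here
matched <- regmatches(files, m)
print(matched)  # "xyz39289" "xyz9382"

# To keep one result per input file, fill non-matches with NA
aligned <- rep(NA_character_, length(files))
aligned[which(m != -1)] <- matched
print(aligned)  # "xyz39289" NA "xyz9382"
```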
