Renaming data frame column names in R using the previous column name and a regex pattern

Problem description

I am working in R for the first time, and I am having difficulty renaming the columns of a data frame (Grade.Data). The dataset was imported from a CSV file and has column names like these:

Student.ID

Grade

Interactive.Exercises.1..Health

Interactive.Exercises.2..Fitness

Quizzes.1..Week.1.Quiz

Quizzes.2..Week.2.Quiz

Case.Studies.1..Case.Study1

Case.Studies.2..Case.Study2

I would like to simplify the variable names, e.g. change Interactive.Exercises.1..Health to Interactive.Exercises.1, or Quizzes.1..Week.1.Quiz to Quizzes.1.

So far, I have tried this:

grep(".*[0-9]", names(Grade.Data))

But I get this returned:

[1]  3  4  5  6  7  8  9 11 12 13 14 15 16 17 19 20 21 22 23 24 25

Can anyone help me figure out what is going on and write a better regular expression? Thank you so much.

Recommended answer

It seems you want to truncate the column names after the first chunk of digits.
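As for the output shown in the question: grep() only reports the positions of the elements that match a pattern, and it never modifies them. A minimal sketch of that behaviour follows. Here col_names is a hypothetical stand-in for names(Grade.Data), built from three of the names listed in the question.

```r
# Hypothetical stand-in for names(Grade.Data); not the real dataset
col_names <- c("Student.ID", "Grade", "Quizzes.1..Week.1.Quiz")

# grep() returns the integer indices of the elements that contain a match
idx <- grep(".*[0-9]", col_names)

# With value = TRUE it returns the matching elements themselves, unmodified
hits <- grep(".*[0-9]", col_names, value = TRUE)

print(idx)
print(hits)
```

Only the third name contains a digit, so only its index is reported. To change the names, a replacement function such as sub() is needed.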

You can use the following solution based on sub():

names(Grade.Data) <- sub("^(.*?\\d+).*$", "\\1", names(Grade.Data))
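The call above can be tried on a small data frame. This is only a sketch: toy and its cell values are invented for illustration, and only the column names are taken from the question.

```r
# Toy data frame; the values are made up, the column names come from the question
toy <- data.frame(
  Student.ID = 1:2,
  Grade = c(90, 85),
  Interactive.Exercises.1..Health = c(10, 9),
  Quizzes.1..Week.1.Quiz = c(8, 7),
  Case.Studies.2..Case.Study2 = c(5, 6)
)

# Keep everything up to and including the first run of digits
names(toy) <- sub("^(.*?\\d+).*$", "\\1", names(toy))

print(names(toy))
```

Names without any digits (Student.ID, Grade) do not match the pattern, so sub() returns them unchanged. If two columns happened to shorten to the same name, the result would contain duplicate names, so it may be worth checking with anyDuplicated(names(toy)) afterwards.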

See the regex demo.

Details

^ - start of the string
(.*?\\d+) - Group 1 (referred to as \1 in the replacement pattern): zero or more characters, as few as possible (.*?), followed by one or more digits (\d+); backslashes are doubled inside an R string literal
.* - zero or more characters, as many as possible
$ - end of the string
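To see what each part of the pattern captures, the match can be inspected with regexec() and regmatches() from base R. This is an illustrative sketch on one of the names from the question; the variable names pattern, x, and parts are introduced here and are not part of the original answer.

```r
pattern <- "^(.*?\\d+).*$"
x <- "Quizzes.1..Week.1.Quiz"

# regexec() locates the full match and each capture group;
# regmatches() extracts the corresponding substrings
parts <- regmatches(x, regexec(pattern, x))[[1]]

print(parts[1])  # the full match: the whole name, because of ^ and $
print(parts[2])  # Group 1: the text up to the first run of digits
```

Because the pattern matches the whole string and the replacement is just Group 1, sub() effectively swaps the full name for its leading part.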
