Problem description
I am trying to change the values in a number of columns at once using a lookup table. They all use the same lookup table. I know how to do this for just one column (I'd just use merge), but I am having trouble with multiple columns.
Below is an example data frame and an example lookup table. My actual data is much larger (~10K columns with 8 rows).
example <- data.frame(a = seq(1,5), b = seq(5,1), c=c(1,4,3,2,5))
lookup <- data.frame(number = seq(1,5), letter = LETTERS[seq(1,5)])
Ideally, I would end up with a data frame which looks like this:
example_of_ideal_output
Of course, in my actual data the data frame holds numbers, but the lookup table is a lot more complicated, so I can't just use something like LETTERS to solve things.
Thanks in advance!
Recommended answer
Here's a solution that works on each column successively using lapply():
as.data.frame(lapply(example, function(col) lookup$letter[match(col, lookup$number)]))
##   a b c
## 1 A E A
## 2 B D D
## 3 C C C
## 4 D B B
## 5 E A E
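To see why the match() call above does the translation, here is a minimal sketch that breaks the lookup down for a single column. It reuses the sample data from the question; the names idx, translated_c and missing_case are introduced here purely for illustration, and stringsAsFactors = FALSE is set explicitly as an assumption so the result is character on older R versions too.

```r
# Same sample data as in the question
example <- data.frame(a = seq(1, 5), b = seq(5, 1), c = c(1, 4, 3, 2, 5))
lookup <- data.frame(number = seq(1, 5), letter = LETTERS[seq(1, 5)],
                     stringsAsFactors = FALSE)

# match() returns, for each value in the column, its position in lookup$number
idx <- match(example$c, lookup$number)

# Indexing lookup$letter by those positions yields the translated column
translated_c <- lookup$letter[idx]

# A value with no entry in the lookup table comes back as NA, not an error
missing_case <- lookup$letter[match(c(2, 99), lookup$number)]
```

With a more complicated lookup table, it may be worth checking the result for NA values, since they indicate values that had no match.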
Alternatively, if you don't mind switching over to a matrix, you can achieve a "more vectorized" solution, as a matrix will allow you to call match() and index lookup$letter just once for the entire input:
matrix(lookup$letter[match(as.matrix(example), lookup$number)], nrow(example))
##      [,1] [,2] [,3]
## [1,] "A"  "E"  "A"
## [2,] "B"  "D"  "D"
## [3,] "C"  "C"  "C"
## [4,] "D"  "B"  "B"
## [5,] "E"  "A"  "E"
(And of course you can coerce back to data.frame via as.data.frame() afterward, although you'll have to restore the column names as well if you want them, which can be done with setNames(...,names(example)). But if you really want to stick with a data.frame, my first solution is probably preferable.)
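The round trip described in the parenthetical can be sketched roughly as follows, using the sample data from the question. The variable names m and result are illustrative only, and stringsAsFactors = FALSE is an explicit assumption to keep the columns as character vectors regardless of R version.

```r
# Same sample data as in the question
example <- data.frame(a = seq(1, 5), b = seq(5, 1), c = c(1, 4, 3, 2, 5))
lookup <- data.frame(number = seq(1, 5), letter = LETTERS[seq(1, 5)],
                     stringsAsFactors = FALSE)

# One match() call over the whole input, reshaped to the original row count
m <- matrix(lookup$letter[match(as.matrix(example), lookup$number)],
            nrow(example))

# Coerce back to a data.frame and restore the original column names
result <- setNames(as.data.frame(m, stringsAsFactors = FALSE), names(example))
```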