How to use sample and seq in a dplyr pipeline?
Question
I have a dataframe with two columns, low and high. I would like to create a new variable that is a randomly selected value between low and high (inclusive and equal probability) using dplyr. I have tried
library(tidyverse)
# tibble() replaces the deprecated data_frame()
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = base::sample(seq(low, high, by = 1), size = 1))
which gives me an error, since seq expects scalar arguments.
I then tried again using a vectorized version of seq:
seq2 <- Vectorize(seq.default, vectorize.args = c("from", "to"))
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = base::sample(seq2(low, high, by = 1), size = 1))
but this does not give me the desired result either.
Recommended answer
To avoid the rowwise() pattern, I usually prefer to use map() inside mutate(), like:
set.seed(123)
tibble(low = 1:10, high = 11) %>%
  # build one sequence per row, then draw a single value from each
  mutate(rand_btwn = map_int(map2(low, high, seq), sample, size = 1))
# # A tibble: 10 x 3
#      low  high rand_btwn
#    <int> <dbl>     <int>
#  1     1    11         4
#  2     2    11         9
#  3     3    11         6
#  4     4    11        11
#  5     5    11        11
#  6     6    11         6
#  7     7    11         9
#  8     8    11        11
#  9     9    11        10
# 10    10    11        10
Or:
set.seed(123)
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = map2_int(low, high, ~ sample(seq(.x, .y), 1)))
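One caveat that may be worth guarding against: when sample() is given a single number n >= 1, it samples from 1:n rather than returning n. In the example data low never equals high, so this does not bite, but a row with low == high would. A minimal sketch of a workaround, assuming a hypothetical helper named sample_between (not part of the original answer) that indexes into the sequence instead:

```r
library(dplyr)
library(purrr)

# Hypothetical helper: draw one integer uniformly from lo..hi (inclusive).
# Indexing with sample.int() sidesteps sample()'s special case in which a
# single number n >= 1 is treated as 1:n.
sample_between <- function(lo, hi) {
  candidates <- seq(lo, hi)
  candidates[sample.int(length(candidates), 1)]
}

set.seed(123)
result <- tibble(low = c(1L, 5L, 11L), high = 11L) %>%
  mutate(rand_btwn = map2_int(low, high, sample_between))
```

With this helper, the third row (low and high both 11) can only ever yield 11.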
Your Vectorize() approach also works:
sample_v <- Vectorize(function(x, y) sample(seq(x, y), 1))
set.seed(123)
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = sample_v(low, high))
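If the bounds are always integers, the per-row sequences could arguably be skipped altogether. The sketch below is an alternative that is not from the original answer: it assumes integer low and high with low <= high, and uses runif() plus floor() so the whole column is computed in one vectorized step.

```r
library(dplyr)

set.seed(123)
result <- tibble(low = 1:10, high = 11L) %>%
  mutate(
    # runif() returns values strictly between 0 and 1, so
    # floor(u * width) is one of 0, ..., width - 1, each equally likely;
    # adding low shifts that into low..high (inclusive).
    rand_btwn = low + as.integer(floor(runif(n()) * (high - low + 1L)))
  )
```

This avoids map() and Vectorize(), and a row with low == high simply returns that value.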