Question
I'm programming in R. I've got a vector containing, let's say, 1000 values. Now let's say I want to partition these 1000 values randomly into two new sets, one containing 400 values and the other containing 600. How could I do this? I've thought about doing something like this...
firstset <- sample(mydata, size=400)
...but this doesn't partition the data (in other words, I still don't know which 600 values to put in the other set). I also thought about looping from 1 to 400, randomly removing 1 value at a time and placing it in firstset. This would partition the data correctly, but how to implement this is not clear to me. Plus, I've been told to avoid for loops in R whenever possible.
Any ideas?
Recommended answer
Instead of sampling the values, you could sample their positions.
positions <- sample(length(mydata), size=400) # ucfagls' suggestion
firstset <- mydata[positions]
secondset <- mydata[-positions]
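To see that this really is a partition, here is a minimal self-contained sketch. The data (1000 random normal values) and the fixed seed are assumptions made only for illustration; they stand in for whatever mydata actually holds.

```r
# Hypothetical example data: 1000 random values standing in for `mydata`.
set.seed(42)  # fixed seed only so the split is reproducible
mydata <- rnorm(1000)

# Draw 400 distinct positions (sample() defaults to replace = FALSE).
positions <- sample(length(mydata), size = 400)

firstset  <- mydata[positions]    # the 400 sampled values
secondset <- mydata[-positions]   # the remaining 600 values

# Sanity checks: the sizes are right and together the sets rebuild the data.
stopifnot(length(firstset) == 400, length(secondset) == 600)
stopifnot(identical(sort(c(firstset, secondset)), sort(mydata)))
```

One caveat with negative indexing: if positions were ever empty (for example size = 0), mydata[-positions] would return an empty vector rather than the whole data, so that case needs a separate guard.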
Edit: ucfagls' suggestion (passing length(mydata) to sample(), as in the code above) will be more efficient, especially for larger vectors, since it avoids allocating a vector of positions in R.
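As a rough illustration, and assuming the comparison is against building an explicit index vector such as seq_along(mydata), the sketch below shows that both forms select the same positions under the same seed. The data and seeds are hypothetical; sample.int() is the base R function that sample() calls when it is given a single count.

```r
set.seed(1)
mydata <- runif(1000)  # hypothetical data, for illustration only

# Explicit form: builds the index vector 1:1000 first, then samples from it.
set.seed(123)
pos_explicit <- sample(seq_along(mydata), size = 400)

# Direct form: passes only the count, so no index vector is built in R code.
set.seed(123)
pos_direct <- sample(length(mydata), size = 400)

# Calling sample.int() directly is equivalent to the direct form.
set.seed(123)
pos_int <- sample.int(length(mydata), size = 400)

# All three select exactly the same positions.
stopifnot(identical(pos_explicit, pos_direct), identical(pos_direct, pos_int))
```

The results are identical, so the choice between the forms is only about avoiding the extra vector, which matters more as the data grows.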