How to partition a set of values (a vector) in R

Question

I'm programming in R. I've got a vector containing, let's say, 1000 values. Now let's say I want to partition these 1000 values randomly into two new sets, one containing 400 values and the other containing 600. How could I do this? I've thought about doing something like this...

firstset <- sample(mydata, size=400)

...but this doesn't partition the data (in other words, I still don't know which 600 values to put in the other set). I also thought about looping from 1 to 400, randomly removing 1 value at a time and placing it in firstset. This would partition the data correctly, but how to implement this is not clear to me. Plus I've been told to avoid for loops in R whenever possible.

Any ideas?

Recommended answer

Instead of sampling the values, you could sample their positions.

positions <- sample(length(mydata), size=400)  # ucfagls' suggestion
firstset <- mydata[positions]
secondset <- mydata[-positions]

Edit: ucfagls' suggestion of passing length(mydata) directly to sample() will be more efficient (especially for larger vectors), since it avoids first allocating an explicit vector of positions in R.
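
To check that this really is a partition, here is a small self-contained sketch. The seed and the rnorm() data are assumptions made only so the example runs on its own; substitute your own mydata.

```r
set.seed(42)                 # assumed seed, only so the run is reproducible
mydata <- rnorm(1000)        # hypothetical stand-in for the 1000 values

# sample() draws without replacement by default, so these are 400 distinct indices
positions <- sample(length(mydata), size = 400)

firstset  <- mydata[positions]    # the 400 selected values
secondset <- mydata[-positions]   # negative indexing keeps the other 600

# Sanity checks: the sizes add up and no value is lost or duplicated
stopifnot(length(firstset) == 400, length(secondset) == 600)
stopifnot(isTRUE(all.equal(sort(c(firstset, secondset)), sort(mydata))))
```

One caveat with negative indexing: if positions could ever be empty (size = 0), mydata[-positions] returns an empty vector rather than all of mydata. In that case mydata[setdiff(seq_along(mydata), positions)] is a safer way to get the complement.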
