R: Selecting a subset without copying

Problem description

Is there a way to select a subset from objects (data frames, matrices, vectors) without making a copy of selected data?

I work with quite large data sets but never change them. However, for convenience I often select subsets of the data to operate on. Making a copy of a large subset each time wastes a lot of memory, but both normal indexing and subset() (and thus the xapply() family of functions) create copies of the selected data. So I'm looking for functions or data structures that can overcome this issue.

Some possible approaches that may fit my needs, and that are hopefully implemented in some R package:

copy-on-write mechanism, i.e. data structures that are copied only when you add or rewrite existing elements;
immutable data structures that only require recreating the indexing information for the data structure, but not its content (like making a substring from a string by creating only a small object that holds a length and a pointer to the same char array);
xapply() analogues that do not create subsets.
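
To make the copy-on-write idea above concrete, here is a minimal sketch of how base R already defers copying for plain assignment. It assumes an R build with memory profiling enabled, since tracemem() depends on it; the variable names are purely illustrative. It does not solve the subsetting problem: as noted in the question, ordinary indexing still allocates a new vector.

```r
# Sketch: base R copies a vector only when one of its aliases is modified.
# Assumption: memory profiling is compiled in, which tracemem() requires.
stopifnot(capabilities("profmem"))

x <- runif(1e6)   # large vector that we treat as read-only
y <- x            # plain assignment: both names refer to the same object

# tracemem() returns a string identifying the underlying object
shared_before <- identical(tracemem(x), tracemem(y))

y[1] <- 0         # the first write through y forces the actual copy
shared_after <- identical(tracemem(x), tracemem(y))

untracemem(x)
untracemem(y)

z <- x[1:10]      # ordinary indexing: z is a new vector holding copied values

print(c(shared_before = shared_before, shared_after = shared_after))
```

The write through y makes R report the duplication via tracemem, which is the point at which the two names stop sharing storage.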

Recommended answer