This post covers how to subset multiple rows of a data frame by a condition in R.

Problem description

I have a .txt file read into a table called power with over 2 million observations of 9 variables. I am trying to subset power by two rows containing either "01/02/2007" or "02/02/2007". After creating the subset, the RStudio environment said I ended up with zero observations, but the same variables.

How can I get a subset of the data that contains only the rows for "01/02/2007" and "02/02/2007"?

I saw a similar post, but still got an error on my dataset. See link: Select multiple rows conditioning on ID in R

My data:

#load data
> power <- read.table("textfile.txt", stringsAsFactors = FALSE, head = TRUE)
#subsetted first column called Date
> head(power$Date)
#[1] 16/12/2006 16/12/2006 16/12/2006 16/12/2006 16/12/2006 16/12/2006

> str(power$Date)
 chr [1:2075259] "16/12/2006" "16/12/2006" "16/12/2006" "16/12/2006" ...

My code:

> subpower <- subset(power, Date %in% c("01/02/2007", "02/02/2007"))

The subsetted data:

> str(powersub$Date)
 chr(0) 

Recommended answer

Try:

> subpower = power[power$Date %in% c("01/02/2007", "02/02/2007") ,]
> subpower
        Date Val
1 01/02/2007  14
8 02/02/2007  28

(Using power data from @akrun's answer)
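
The data from that answer is not reproduced here, so the following is only a minimal sketch of the same row-indexing approach on a small made-up data frame. The `Date` and `Val` values, and the helper names `targets` and `keep`, are assumptions chosen for illustration.

```r
# Hypothetical stand-in for the real data: these values are made up for
# illustration and are not the data from the original post.
power <- data.frame(
  Date = c("01/02/2007", "16/12/2006", "17/12/2006", "02/02/2007"),
  Val  = c(14L, 3L, 7L, 28L),
  stringsAsFactors = FALSE
)

# Dates to keep. %in% is TRUE for every row whose Date equals any of them.
targets <- c("01/02/2007", "02/02/2007")
keep <- power$Date %in% targets

# Logical row indexing: matching rows, all columns (note the trailing comma).
subpower <- power[keep, ]

nrow(subpower)  # number of matching rows
```

Because `Date` is a character column, the comparison is an exact string match, so the target strings have to use the same dd/mm/yyyy layout as the file.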

Moreover, your own code will work if you use the correct name for the subset: "subpower" instead of "powersub"!

> subpower <- subset(power, Date %in% c("01/02/2007", "02/02/2007"))
> subpower
        Date Val
1 01/02/2007  14
8 02/02/2007  28
>
> str(subpower)
'data.frame':   2 obs. of  2 variables:
 $ Date: chr  "01/02/2007" "02/02/2007"
 $ Val : int  14 28
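
One way to catch this kind of name mix-up early is to count the matches before subsetting and then inspect the object under the name that was actually assigned. This is a rough sketch on made-up data; `targets` and `n_match` are hypothetical helper names, not part of the original code.

```r
# Made-up example data, for illustration only.
power <- data.frame(
  Date = c("01/02/2007", "16/12/2006", "02/02/2007"),
  Val  = c(14L, 5L, 28L),
  stringsAsFactors = FALSE
)
targets <- c("01/02/2007", "02/02/2007")

# How many rows should the subset contain?
n_match <- sum(power$Date %in% targets)

subpower <- subset(power, Date %in% targets)

# Confirm the object exists under the assigned name and has the expected size.
exists("subpower")
stopifnot(nrow(subpower) == n_match)
```

If `n_match` is greater than zero but the object you inspect shows zero rows, you are most likely looking at a different object than the one you just created.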
