Problem description
I'm trying to simplify this into a few statements. I'm not sure how to go about it (or whether I can), but I'd appreciate anything that gets me close, in as few steps as possible. I am using dplyr and lubridate. I have a data frame called OutOfRange (sample):
OutOfRange %>% select(OutRange, TouchVPOC)
Source: local data frame [341 x 2]
  OutRange TouchVPOC
     (lgl)     (lgl)
1    FALSE      TRUE
2    FALSE     FALSE
3    FALSE      TRUE
4    FALSE     FALSE
5    FALSE      TRUE
OutOfRange %>% select(OutRange, TouchVPOC) %>% filter(OutRange == T) %>% tally
Source: local data frame [1 x 1]
      n
  (int)
1    37
OutOfRange %>% select(OutRange, TouchVPOC) %>% filter(OutRange == T, TouchVPOC == T) %>% tally
Source: local data frame [1 x 1]
      n
  (int)
1    15
15/37
[1] 0.4054054
So, if possible, I'm looking for a final outcome something like this, where CountOfDataFrame is the count of all rows, OutRange and TouchVPOC are the counts of TRUE values, and Pct = TouchVPOC / OutRange.
CountOfDataFrame OutRange TouchVPOC Pct
341 37 15 .40
I do realize I may be asking a lot, and I'm new to this, so any suggestions are welcome. I'm just looking for a basis or a start in the right direction.
Recommended answer
I would suggest you first get the data into tidy format, then use group_by/summarize/mutate to do the aggregation and percentage calculation, like below.
a <- data.frame(OutRange  = c(TRUE, FALSE, FALSE, FALSE, FALSE),
                TouchVPOC = c(TRUE, TRUE, TRUE, FALSE, FALSE))
> a
OutRange TouchVPOC
1 TRUE TRUE
2 FALSE TRUE
3 FALSE TRUE
4 FALSE FALSE
5 FALSE FALSE
library(dplyr)
library(tidyr)
# Reshape to long format (one row per flag/value pair), then count TRUEs per flag
a %>%
  gather(type, value, OutRange:TouchVPOC) %>%
  group_by(type) %>%
  summarize(true_count = sum(value)) %>%
  mutate(total = sum(true_count), Pct = true_count / total)
Source: local data frame [2 x 4]
       type true_count total   Pct
      (chr)      (int) (int) (dbl)
1  OutRange          1     4  0.25
2 TouchVPOC          3     4  0.75
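Note that the Pct above is each flag's share of all TRUE values, which is not quite the ratio asked for in the question (rows where both flags are TRUE, divided by rows where OutRange is TRUE). If the single-row summary is what you want, one summarise() call may be enough. This is only a rough sketch: the sample data is made up, and the column names OutRangeTrue and TouchVPOCInRange are my own (I avoided reusing OutRange and TouchVPOC as output names so that later expressions in summarise() still see the original columns).

```r
library(dplyr)

# Hypothetical sample standing in for the asker's OutOfRange data frame
OutOfRange <- data.frame(
  OutRange  = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  TouchVPOC = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
)

result <- OutOfRange %>%
  summarise(
    CountOfDataFrame = n(),                        # count of all rows
    OutRangeTrue     = sum(OutRange),              # rows where OutRange is TRUE
    TouchVPOCInRange = sum(OutRange & TouchVPOC),  # rows where both flags are TRUE
    Pct              = TouchVPOCInRange / OutRangeTrue
  )

print(result)
```

Summing a logical vector counts its TRUE values, so this does the same work as the filter(...) %>% tally steps in the question, in one pass.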