R: finding the first value in a data frame that falls within a given threshold




I am a fairly new user and I need your help with a task that I am stuck on. If my question has been asked/answered before I would be grateful if you could kindly guide me to the relevant page.

I have the following data set (lbnp_br) which is optical density (OD) measured over time (in seconds):

 time   OD
1891    -244.6
1891.5  -244.4
1892    -242
1892.5  -242
1893    -241.1
1893.5  -242.4
1894    -245.2
1894.5  -249.6
1895    -253.9
1895.5  -254.5
1896    -251.9
1896.5  -246.7
1897    -242.4
1897.5  -234.6
1898    -225.5

I need to find out how responsive the study device is by measuring how long it takes to reach the threshold for optical density.

For this I have calculated the coefficient of variation (CV) of OD and I am using mean OD (-252.9098) +/- 2*CV to define a response threshold. For the above data the threshold is set as (mean OD + 2*CV = -252.9917), and (mean OD - 2*CV = -252.8278).
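
The post does not show how the thresholds in all_threshold were computed, so the following is only a minimal sketch of the "mean +/- 2*CV" rule. It assumes CV means sd / mean expressed as a ratio, and the helper name response_thresholds is hypothetical.

```r
# Hypothetical helper: response thresholds as mean +/- 2 * CV.
# Assumption: CV = sd / mean (a ratio, not a percentage).
response_thresholds <- function(od) {
  m  <- mean(od, na.rm = TRUE)
  cv <- sd(od, na.rm = TRUE) / m
  list(mean = m, cv = cv, plus = m + 2 * cv, minus = m - 2 * cv)
}

# Toy values, not the study data
th <- response_thresholds(c(-10, -12, -14))
th$plus   # mean + 2 * CV
th$minus  # mean - 2 * CV
```

Under this assumption a negative mean gives a negative CV, so the "plus" threshold ends up below the "minus" one. That would be consistent with the values quoted above (-252.9917 for plus, -252.8278 for minus).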

I now need to calculate the time in seconds from the start (1891 seconds) to the first OD value that exceeds the +/- threshold values. For example, for the above data frame this threshold is exceeded at 1895 seconds, corresponding to an OD of -253.9.

I have to repeat this 3 times for each study subject, with 17 subjects overall, so I am looking for a function where I can define the data frame and the threshold values, and which returns the first OD value that exceeds the defined thresholds (all_threshold$sup_2_minus) and (all_threshold$sup_2_plus), together with its corresponding time.
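
As a rough illustration of the kind of function described here, the sketch below uses base R only. The helper name first_crossing, its direction argument and the elapsed column are assumptions made for this example. The threshold is hard-coded from the figure quoted above rather than read from all_threshold.

```r
# Sample data from the question
lbnp_br <- data.frame(
  time = seq(1891, 1898, by = 0.5),
  OD   = c(-244.6, -244.4, -242, -242, -241.1, -242.4, -245.2, -249.6,
           -253.9, -254.5, -251.9, -246.7, -242.4, -234.6, -225.5)
)

# Hypothetical helper: first row (in time order) where OD crosses `threshold`
# in the given direction; returns a one-row data frame of NAs if it never does.
first_crossing <- function(df, threshold, direction = c("below", "above")) {
  direction <- match.arg(direction)
  df  <- df[order(df$time), ]
  hit <- if (direction == "below") df$OD < threshold else df$OD > threshold
  i   <- which(hit)[1]                 # NA when no value crosses
  if (is.na(i)) {
    return(data.frame(time = NA_real_, OD = NA_real_, elapsed = NA_real_))
  }
  # elapsed = seconds since the first (earliest) row
  data.frame(time = df$time[i], OD = df$OD[i], elapsed = df$time[i] - df$time[1])
}

res <- first_crossing(lbnp_br, threshold = -252.9917, direction = "below")
res
```

For the sample data this picks the row at 1895 s (OD -253.9), which is 4 s after the first measurement.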

I have tried subset as advised elsewhere:

subset(lbnp_br, lbnp_br$OD < all_threshold$sup_2_minus & lbnp_br$OD > all_threshold$sup_2_plus)

However, this doesn't return what I am looking for.

and also

ifelse(lbnp_br$OD > all_threshold$sup_2_plus & lbnp_br$OD < all_threshold$sup_2_minus, lbnp_br$OD, NA)

which returns NA and doesn't specify the exact value of OD and the time.
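
One possible reason both attempts come back empty, assuming the threshold values quoted earlier: the & keeps only OD values lying between the two thresholds, a band about 0.16 wide, and none of the sample values fall into it. The sketch below hard-codes the thresholds from the figures above. The names lower, upper, inside and outside are illustrative only.

```r
# Sample data from the question
lbnp_br <- data.frame(
  time = seq(1891, 1898, by = 0.5),
  OD   = c(-244.6, -244.4, -242, -242, -241.1, -242.4, -245.2, -249.6,
           -253.9, -254.5, -251.9, -246.7, -242.4, -234.6, -225.5)
)

lower <- -252.9917  # quoted as mean OD + 2*CV
upper <- -252.8278  # quoted as mean OD - 2*CV

inside  <- lbnp_br$OD > lower & lbnp_br$OD < upper   # between the thresholds
outside <- lbnp_br$OD < lower | lbnp_br$OD > upper   # beyond either threshold

sum(inside)                   # number of rows strictly inside the band
head(lbnp_br[outside, ], 1)   # first row beyond either threshold
```

Note that with | the very first sample row already lies above the upper value, so for this data it is the one-sided test OD < lower that yields the 1895 s row mentioned in the question.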


Using the above code, I added a few extra conditions to get exactly what I was looking for and here it is for anyone who may need something similar:

library(dplyr)

# For each of three data frames (hdt, lbnp, hut), find the first row in time
# order where OD rises above the upper threshold and the first row where it
# falls below the lower threshold. Missing crossings are reported as NA.
find_time <- function(df, df2, df3, threshold_1, threshold_2, threshold_3, threshold_4, threshold_5, threshold_6){
  return_value_1 = df %>%
    arrange(time) %>%
    filter(OD > threshold_1) %>%
    slice(1)  # keep only the earliest matching row
  colnames(return_value_1)[1] <- "time_hdt_upper"
  colnames(return_value_1)[2] <- "OD_hdt_upper"

  if (nrow(return_value_1) == 0) {
    return_value_1[1,1] <- NA
    return_value_1[1,2] <- NA
  }

  return_value_2 = df %>%
    arrange(time) %>%
    filter(OD < threshold_2) %>%
    slice(1)
  colnames(return_value_2)[1] <- "time_hdt_lower"
  colnames(return_value_2)[2] <- "OD_hdt_lower"

  if (nrow(return_value_2) == 0) {
    return_value_2[1,1] <- NA
    return_value_2[1,2] <- NA
  }

  return_value_3 = df2 %>%
    arrange(time) %>%
    filter(OD > threshold_3) %>%
    slice(1)
  colnames(return_value_3)[1] <- "time_lbnp_upper"
  colnames(return_value_3)[2] <- "OD_lbnp_upper"

  if (nrow(return_value_3) == 0) {
    return_value_3[1,1] <- NA
    return_value_3[1,2] <- NA
  }

  return_value_4 = df2 %>%
    arrange(time) %>%
    filter(OD < threshold_4) %>%
    slice(1)
  colnames(return_value_4)[1] <- "time_lbnp_lower"
  colnames(return_value_4)[2] <- "OD_lbnp_lower"

  if (nrow(return_value_4) == 0) {
    return_value_4[1,1] <- NA
    return_value_4[1,2] <- NA
  }

  return_value_5 = df3 %>%
    arrange(time) %>%
    filter(OD > threshold_5) %>%
    slice(1)
  colnames(return_value_5)[1] <- "time_hut_upper"
  colnames(return_value_5)[2] <- "OD_hut_upper"

  if (nrow(return_value_5) == 0) {
    return_value_5[1,1] <- NA
    return_value_5[1,2] <- NA
  }

  return_value_6 = df3 %>%
    arrange(time) %>%
    filter(OD < threshold_6) %>%
    slice(1)
  colnames(return_value_6)[1] <- "time_hut_lower"
  colnames(return_value_6)[2] <- "OD_hut_lower"

  if (nrow(return_value_6) == 0) {
    return_value_6[1,1] <- NA
    return_value_6[1,2] <- NA
  }

  # One wide row: time and OD of each first crossing
  return(data.frame(return_value_1, return_value_2, return_value_3, return_value_4, return_value_5, return_value_6))
}


which gives

find_time_threshold <- find_time(hdt_br, lbnp_br, hut_br, all_threshold$base_plus, all_threshold$base_minus, all_threshold$sup_2_plus, all_threshold$sup_2_minus, all_threshold$sup_3_plus, all_threshold$sup_3_minus)
> find_time_threshold

  time_hdt_upper OD_hdt_upper time_hdt_lower OD_hdt_lower time_lbnp_upper OD_lbnp_upper time_lbnp_lower
1          596.5        123.3            506         91.3              NA            NA            1706
  OD_lbnp_lower time_hut_upper OD_hut_upper time_hut_lower OD_hut_lower
1        -27.89         3186.5       -82.98           2909       -211.7
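
Since the same upper/lower check is repeated for each condition and has to be run for 17 subjects, a more compact variant might look like the sketch below. It is not the function from the answer above: the helper name first_hits, the toy data and the threshold vectors are all made up for illustration, and it uses base R only.

```r
# Hypothetical helper: first time/OD above `upper` and first time/OD below
# `lower` for one data frame; NA where a threshold is never crossed.
first_hits <- function(df, upper, lower) {
  df <- df[order(df$time), ]
  up <- which(df$OD > upper)[1]   # NA if never exceeded
  lo <- which(df$OD < lower)[1]   # NA if never undershot
  data.frame(time_upper = df$time[up], OD_upper = df$OD[up],
             time_lower = df$time[lo], OD_lower = df$OD[lo])
}

# Toy data standing in for hdt_br, lbnp_br and hut_br
conditions <- list(
  hdt  = data.frame(time = 1:4, OD = c(1, 5, 9, 2)),
  lbnp = data.frame(time = 1:4, OD = c(3, 3, 3, 3)),
  hut  = data.frame(time = 1:4, OD = c(0, -4, 6, 8))
)
upper <- c(hdt = 4, lbnp = 4, hut = 5)    # made-up upper thresholds
lower <- c(hdt = 3, lbnp = 2, hut = -1)   # made-up lower thresholds

# Apply the helper to each condition with its own pair of thresholds
rows   <- Map(first_hits, conditions, upper, lower)
result <- cbind(condition = names(conditions), do.call(rbind, rows))
result
```

This returns one row per condition instead of one wide row. Adding a subject column and stacking the results across subjects would then be a matter of another rbind.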

