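I have a data frame named bias_correc, with observed values in the .x columns and forecast values in the .y columns. The data frame shown below contains observations and forecasts for several regions, so the columns share similar names, and I want to compute the forecast bias for each location (e.g. Coast) at each time step.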

I know how to do this manually by creating a new column and doing a simple subtraction for each pair of location columns:

bias_correc$Coast <- bias_correc$Coast.y - bias_correc$Coast.x

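But, if possible, I would like to do this with an apply function or a loop, so that the difference is computed for every pair of location columns and written either into this data frame or into a new one.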

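I am familiar with the seq function and have used it before, but I am not sure how to wrap it in an apply function or a loop so that the difference is computed for every two columns, location by location.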

Any help is much appreciated.
bias_correc <-
structure(list(Forecast_day = c(8, 8, 8, 8, 8, 8), Forecast_date = structure(c(17555,
17556, 17557, 17558, 17559, 17560), class = "Date"), DeliveryDate = structure(c(17563,
17564, 17565, 17566, 17567, 17568), class = "Date"), HourEnding = c(1L,
1L, 1L, 1L, 1L, 1L), Coast.x = c(60.8, 62.6, 50.5, 56.8, 58.9,
59.4), Coast.y = c(58.5, 51, 46.7, 49.7, 49.3, 48.2), East.x = c(56,
52, 43, 47, 43.5, 52.5), East.y = c(56.5, 43.5, 41.5, 43.5, 43,
43), FarWest.x = c(50, 41, 45.5, 49.5, 35.5, 49.5), FarWest.y = c(46.5,
34.5, 36.5, 38, 41.5, 39), North.x = c(49, 34.5, 34.5, 39.5,
24.5, 34.5), North.y = c(49.5, 32, 33, 38, 38.5, 34.5), NorthCentral.x = c(57.5,
44.75, 45.5, 52.75, 35.75, 38.5), NorthCentral.y = c(54, 37.5,
39.75, 42, 42.5, 40), SouthCentral.x = c(56.5, 53.5, 51.5, 48.5,
53.5, 56), SouthCentral.y = c(56, 43.5, 43, 45, 45, 45), Southern.x = c(60.4,
63.6, 55, 61.8, 64, 65.6), Southern.y = c(58.4, 52.8, 50.4, 54,
54.4, 53.6), West.x = c(57.6, 42, 43.4, 51.8, 32.6, 45.2), West.y = c(49.6,
34.6, 36.8, 38.6, 40.4, 36.2)), class = "data.frame", row.names = c(NA,
-6L), .Names = c("Forecast_day", "Forecast_date", "DeliveryDate",
"HourEnding", "Coast.x", "Coast.y", "East.x", "East.y", "FarWest.x",
"FarWest.y", "North.x", "North.y", "NorthCentral.x", "NorthCentral.y",
"SouthCentral.x", "SouthCentral.y", "Southern.x", "Southern.y",
"West.x", "West.y"))

Best answer

With a little string manipulation on the column names, this is fairly straightforward.

# find column names ending in ".x" (anchored regex, so ".x" elsewhere in a name is ignored)
var_names <- grep(pattern = "\\.x$", x = names(bias_correc), value = TRUE)
# strip the trailing ".x" to get the bare location names
var_names <- sub(pattern = "\\.x$", replacement = "", x = var_names)
# subtract x (observed) from y (forecast), column by column
(diff_table <- bias_correc[paste0(var_names, ".y")] - bias_correc[paste0(var_names, ".x")])

  Coast.y East.y FarWest.y North.y NorthCentral.y SouthCentral.y Southern.y West.y
1    -2.3    0.5      -3.5     0.5          -3.50           -0.5       -2.0   -8.0
2   -11.6   -8.5      -6.5    -2.5          -7.25          -10.0      -10.8   -7.4
3    -3.8   -1.5      -9.0    -1.5          -5.75           -8.5       -4.6   -6.6
4    -7.1   -3.5     -11.5    -1.5         -10.75           -3.5       -7.8  -13.2
5    -9.6   -0.5       6.0    14.0           6.75           -8.5       -9.6    7.8
6   -11.2   -9.5     -10.5     0.0           1.50          -11.0      -12.0   -9.0
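
The columns of this result are still labelled with the .y suffix because, when one data frame is subtracted from another, the result takes its column names from the left-hand operand. A minimal sketch of that behaviour, assuming a hypothetical one-location toy data frame (the names `toy` and `d` are illustrative and not part of the original answer):

```r
# Hypothetical toy data: the first three Coast rows of bias_correc
toy <- data.frame(Coast.x = c(60.8, 62.6, 50.5),
                  Coast.y = c(58.5, 51, 46.7))

# Data frames of equal shape are subtracted element by element;
# the result inherits the column names of the left-hand operand
d <- toy["Coast.y"] - toy["Coast.x"]
names(d)   # "Coast.y"

# setNames() relabels the column with the bare location name
d <- setNames(d, "Coast")
names(d)   # "Coast"
```

This is why relabelling with setNames is useful before the differences are attached to the original table.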

cbind(bias_correc, setNames(diff_table, var_names)) # bind back to original table
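
Since the question asks specifically for an apply-style solution, the same pairing logic can also be written with sapply. The following is only a sketch and not part of the original answer: `toy` is a hypothetical three-row subset of bias_correc (Coast and East only) so that the block runs on its own, and `locations` and `bias` are illustrative names.

```r
# Hypothetical toy subset of bias_correc (first three rows, two locations),
# used only so that this example is self-contained
toy <- data.frame(
  Coast.x = c(60.8, 62.6, 50.5), Coast.y = c(58.5, 51, 46.7),
  East.x  = c(56, 52, 43),       East.y  = c(56.5, 43.5, 41.5)
)

# Bare location names, taken from the columns that end in ".x"
locations <- sub("\\.x$", "", grep("\\.x$", names(toy), value = TRUE))

# For each location, subtract the observed (.x) column from the forecast (.y) column.
# sapply() returns a numeric matrix whose columns are named after the locations.
bias <- sapply(locations, function(loc) {
  toy[[paste0(loc, ".y")]] - toy[[paste0(loc, ".x")]]
})

# Keep the result as its own data frame ...
bias <- as.data.frame(bias)
# ... or attach it to the original columns
toy_with_bias <- cbind(toy, bias)
```

Looking columns up by name, rather than stepping through positions with seq, means the result should not depend on the .x and .y columns being adjacent or in any particular order.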

Regarding r - computing the difference between every two columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48694263/
