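I have a data frame named bias_correc, with observed values in the .x columns and forecast values in the .y columns. The data frame shown below holds observations and forecasts for several regions, so the columns do have similar names, and I want to compute the forecast bias for each location (e.g. Coast) at each time step.
I know how to do this by hand, by creating a new column with a simple subtraction for each pair of location columns: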
bias_correc$Coast <- bias_correc$Coast.y - bias_correc$Coast.x
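However, if possible, I would like to do this with an apply function or a loop, so that the difference is computed for each pair of location columns and the results are written to this data frame or to a new one.
I am familiar with the seq function and have used it before, but I am not sure how to wrap it in an apply function or a loop so that the difference between each pair of columns is computed per location.
Any help is greatly appreciated.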
bias_correc <-
structure(list(Forecast_day = c(8, 8, 8, 8, 8, 8), Forecast_date = structure(c(17555,
17556, 17557, 17558, 17559, 17560), class = "Date"), DeliveryDate = structure(c(17563,
17564, 17565, 17566, 17567, 17568), class = "Date"), HourEnding = c(1L,
1L, 1L, 1L, 1L, 1L), Coast.x = c(60.8, 62.6, 50.5, 56.8, 58.9,
59.4), Coast.y = c(58.5, 51, 46.7, 49.7, 49.3, 48.2), East.x = c(56,
52, 43, 47, 43.5, 52.5), East.y = c(56.5, 43.5, 41.5, 43.5, 43,
43), FarWest.x = c(50, 41, 45.5, 49.5, 35.5, 49.5), FarWest.y = c(46.5,
34.5, 36.5, 38, 41.5, 39), North.x = c(49, 34.5, 34.5, 39.5,
24.5, 34.5), North.y = c(49.5, 32, 33, 38, 38.5, 34.5), NorthCentral.x = c(57.5,
44.75, 45.5, 52.75, 35.75, 38.5), NorthCentral.y = c(54, 37.5,
39.75, 42, 42.5, 40), SouthCentral.x = c(56.5, 53.5, 51.5, 48.5,
53.5, 56), SouthCentral.y = c(56, 43.5, 43, 45, 45, 45), Southern.x = c(60.4,
63.6, 55, 61.8, 64, 65.6), Southern.y = c(58.4, 52.8, 50.4, 54,
54.4, 53.6), West.x = c(57.6, 42, 43.4, 51.8, 32.6, 45.2), West.y = c(49.6,
34.6, 36.8, 38.6, 40.4, 36.2)), class = "data.frame", row.names = c(NA,
-6L), .Names = c("Forecast_day", "Forecast_date", "DeliveryDate",
"HourEnding", "Coast.x", "Coast.y", "East.x", "East.y", "FarWest.x",
"FarWest.y", "North.x", "North.y", "NorthCentral.x", "NorthCentral.y",
"SouthCentral.x", "SouthCentral.y", "Southern.x", "Southern.y",
"West.x", "West.y"))
Best answer
With a little string manipulation on the column names, this should be fairly straightforward.
# find column names ending in ".x" (anchored regex, so ".x" elsewhere in a name is ignored)
var_names <- grep(pattern = "\\.x$", x = names(bias_correc), value = TRUE)
# strip the trailing ".x" to get the bare location names
var_names <- sub(pattern = "\\.x$", replacement = "", x = var_names)
# subtract y and x
(diff_table <- bias_correc[paste0(var_names, ".y")] - bias_correc[paste0(var_names, ".x")])
  Coast.y East.y FarWest.y North.y NorthCentral.y SouthCentral.y Southern.y West.y
1    -2.3    0.5      -3.5     0.5          -3.50           -0.5       -2.0   -8.0
2   -11.6   -8.5      -6.5    -2.5          -7.25          -10.0      -10.8   -7.4
3    -3.8   -1.5      -9.0    -1.5          -5.75           -8.5       -4.6   -6.6
4    -7.1   -3.5     -11.5    -1.5         -10.75           -3.5       -7.8  -13.2
5    -9.6   -0.5       6.0    14.0           6.75           -8.5       -9.6    7.8
6   -11.2   -9.5     -10.5     0.0           1.50          -11.0      -12.0   -9.0
cbind(bias_correc, setNames(diff_table, var_names)) # bind back to original table
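Since the question asked specifically about an apply function or a loop, here is a minimal sketch of the same calculation with lapply. It is an alternative for illustration, not part of the original answer. The small data frame is a stand-in that reuses the first three rows of the Coast and East columns from the question, and the names locations and bias_list are made up for this example.

```r
# Stand-in for bias_correc: first three rows of two locations from the question
bias_correc <- data.frame(
  HourEnding = c(1L, 1L, 1L),
  Coast.x = c(60.8, 62.6, 50.5),
  Coast.y = c(58.5, 51, 46.7),
  East.x  = c(56, 52, 43),
  East.y  = c(56.5, 43.5, 41.5)
)

# Location names: columns ending in ".x", with the suffix removed
# (locations is a hypothetical name used only in this sketch)
locations <- sub("\\.x$", "", grep("\\.x$", names(bias_correc), value = TRUE))

# One difference vector (forecast minus observed) per location,
# collected into a list named after the locations
bias_list <- lapply(
  setNames(locations, locations),
  function(loc) bias_correc[[paste0(loc, ".y")]] - bias_correc[[paste0(loc, ".x")]]
)

# Add the differences to the data frame as new columns named after the locations
bias_correc[locations] <- bias_list
bias_correc
```

The vectorised subtraction in the answer above is shorter and likely faster. The lapply form may be easier to adapt if each location needs more than a plain difference, for example a relative bias.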
For "r - calculating the difference between every two columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48694263/