r - R data.table: apply a function to a group, excluding the current row

Suppose I have:

library(data.table)
x = data.table( id=c(1,1,1,2,2,2), price=c(100,110,120,200,200,220) )
> x
   id price
1:  1   100
2:  1   110
3:  1   120
4:  2   200
5:  2   200
6:  2   220

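and I want to find, for each row within its group (by = id), the cheapest price once the current row is left out.
So the result should look like this: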

> x
   id price   cheapest_in_this_id_omitting_current_row
1:  1   100   110       # if I take this row out the cheapest is the next row
2:  1   110   100       # row 1
3:  1   120   100       # row 1
4:  2   200   200       # row 5
5:  2   200   200       # row 4
6:  2   220   200       # row 4 (or 5)

So it is like using:

x[, cheapest_by_id := min(price), id]

but with the current row removed from each calculation.

If there were a variable such as .row_nb that referred to the current row within the group, I could use:

x[, min(price[-.row_nb]), id]

But this .row_nb does not seem to exist...?

Best answer

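We group by 'id' and call combn on the row sequence, setting 'm', the number of elements to choose, to one less than the number of rows (.N-1). Each combination therefore leaves out exactly one row. We use it as a numeric index to subset 'price', take the min, and assign (:=) the result as a new column. The sequence is written as .N:1 rather than 1:.N so that the first combination omits row 1, the second omits row 2, and so on, which matches the row order.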

 x[,  cheapest_in_this_id_omitting_current_row:=
             combn(.N:1, .N-1, FUN=function(i) min(price[i])), by = id]
x
#   id price cheapest_in_this_id_omitting_current_row
#1:  1   100                                      110
#2:  1   110                                      100
#3:  1   120                                      100
#4:  2   200                                      200
#5:  2   200                                      200
#6:  2   220                                      200
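
To see why the sequence is reversed, here is a small sketch using base R only, for a group of three rows. The names idx and omitted are introduced purely for illustration. It lists which row each combination leaves out.

```r
# Each column of idx is one combination of 2 row indices out of 3.
idx <- combn(3:1, 2)

# For every column, find the single row index that is missing from it.
omitted <- apply(idx, 2, function(col) setdiff(1:3, col))

print(idx)
print(omitted)
```

With 3:1 the omitted rows come out in the order 1, 2, 3, so the i-th result lines up with the i-th row of the group. With 1:3 the order would be 3, 2, 1 and the results would be attached to the wrong rows.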

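Alternatively, instead of combn, we can loop over the row sequence, drop each index in turn from 'price' with a negative index, and take the min. I would guess this is fast.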

 x[,cheapest_in_this_id_omitting_current_row:=
          unlist(lapply(1:.N, function(i) min(price[-i]))) , id]
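
For larger groups, a vectorized variant may be worth considering. The sketch below is not part of the answer above. It relies on the observation that the leave-one-out minimum is the group minimum for every row except a row holding that minimum, where it is the second-smallest value (equal to the minimum when there are ties). The helper loo_min and the column name cheapest_loo are hypothetical names, and the code assumes 'price' contains no NA values.

```r
library(data.table)

x <- data.table(id = c(1, 1, 1, 2, 2, 2),
                price = c(100, 110, 120, 200, 200, 220))

# Hypothetical helper: minimum of v with the current element left out.
# Assumes v has no NA values.
loo_min <- function(v) {
  v <- as.numeric(v)
  # A single-row group has nothing left once its only row is omitted.
  if (length(v) < 2L) return(rep(NA_real_, length(v)))
  s <- sort(v)  # s[1] is the smallest value, s[2] the second smallest
  # Rows holding the minimum fall back to s[2]; all other rows get s[1].
  ifelse(v == s[1], s[2], s[1])
}

x[, cheapest_loo := loo_min(price), by = id]
print(x)
```

One design choice to note: in a single-row group, min(price[-i]) evaluates min(numeric(0)), which returns Inf with a warning, whereas this sketch returns NA instead.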