Problem description
Here is an example where foverlaps(...) seems to be finding matches that do not overlap. Can anyone help me understand what I'm doing wrong?
The problem in this post seems like an excellent opportunity to use foverlaps(...) in the data.table package. The datasets below are from that post.
dinosaurs <- structure(list(GENUS = structure(1:3, .Label = c("Abydosaurus", "Achelousaurus", "Acheroraptor"), class = "factor"), ma_max = c(109, 84.9, 70.6), ma_min = c(94.3, 70.6, 66.043), ma_mid = c(101.65, 77.75, 68.3215)), .Names = c("GENUS", "ma_max", "ma_min", "ma_mid"), class = "data.frame", row.names = c(NA, -3L))
stages <- structure(list(Stage = structure(c(13L, 19L, 17L, 21L, 1L, 4L, 6L, 8L, 16L, 14L, 20L, 7L, 23L, 12L, 5L, 3L, 2L, 10L, 22L, 11L, 18L, 9L, 15L), .Label = c("Aalenian", "Albian", "Aptian", "Bajocian", "Barremian", "Bathonian", "Berriasian", "Callovian", "Campanian", "Cenomanian", "Coniacian", "Hauterivian", "Hettangian", "Kimmeridgian", "Maastrichtian", "Oxfordian", "Pliensbachian", "Santonian", "Sinemurian", "Tithonian", "Toarcian", "Turonian", "Valanginian"), class = "factor"),ma_max = c(201.6, 197, 190, 183, 176, 172, 168, 165, 161, 156, 151, 145.5, 140, 136, 130, 125, 112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6), ma_min = c(197, 190, 183, 176, 172, 168, 165, 161, 156, 151, 145.5, 140, 136, 130, 125, 112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6, 66.5), ma_mid = c(199.3, 193.5, 186.5, 179.5, 174, 170, 166.5, 163, 158.5, 153.5, 148.25, 142.75, 138, 133, 127.5, 118.5, 105.8, 96.55, 91.4, 87.55, 84.65, 77.05, 68.05)), .Names = c("Stage", "ma_max", "ma_min", "ma_mid"), class = "data.frame", row.names = c(NA, -23L))
dinosaurs
#           GENUS ma_max ma_min   ma_mid
# 1   Abydosaurus  109.0 94.300 101.6500
# 2 Achelousaurus   84.9 70.600  77.7500
# 3  Acheroraptor   70.6 66.043  68.3215
head(stages)
#           Stage ma_max ma_min ma_mid
# 1    Hettangian  201.6    197  199.3
# 2    Sinemurian  197.0    190  193.5
# 3 Pliensbachian  190.0    183  186.5
# 4      Toarcian  183.0    176  179.5
# 5      Aalenian  176.0    172  174.0
# 6      Bajocian  172.0    168  170.0
The goal is to find the number of dinosaur genera that were present in each geological stage.
library(data.table) # 1.9.4
setDT(dinosaurs)[,ma_mid:=NULL]
setDT(stages)[,ma_mid:=NULL]
setkey(dinosaurs,ma_min,ma_max)
foverlaps(stages,dinosaurs,type="any",nomatch=0)
#            GENUS ma_max ma_min         Stage i.ma_max i.ma_min
# 1:   Abydosaurus  109.0 94.300        Albian    112.0     99.6
# 2:   Abydosaurus  109.0 94.300    Cenomanian     99.6     93.5
# 3: Achelousaurus   84.9 70.600     Coniacian     89.3     85.8
# 4: Achelousaurus   84.9 70.600     Santonian     85.8     83.5
# 5:  Acheroraptor   70.6 66.043     Campanian     83.5     70.6
# 6: Achelousaurus   84.9 70.600     Campanian     83.5     70.6
# 7:  Acheroraptor   70.6 66.043 Maastrichtian     70.6     66.5
# 8: Achelousaurus   84.9 70.600 Maastrichtian     70.6     66.5
This is mostly correct, but look at row 3. It seems to assert that the Coniacian stage, from 85.8 to 89.3 million years ago, overlaps with Achelousaurus, which lived from 70.6 to 84.9 million years ago. What am I missing?
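To make the complaint about row 3 concrete, the overlap test can be written out by hand. The following is a minimal base-R sketch, assuming closed intervals (shared endpoints count as overlapping, which is consistent with the Campanian and Maastrichtian rows in the output above). The helper name intervals_overlap is hypothetical and not part of data.table.

```r
# Hypothetical helper (not a data.table function): two closed intervals
# [a_min, a_max] and [b_min, b_max] overlap exactly when each one
# starts no later than the other one ends.
intervals_overlap <- function(a_min, a_max, b_min, b_max) {
  a_min <= b_max & b_min <= a_max
}

# Achelousaurus (70.6 to 84.9) vs. the stage in row 3 (85.8 to 89.3)
intervals_overlap(70.6, 84.9, 85.8, 89.3)
# [1] FALSE

# Achelousaurus (70.6 to 84.9) vs. Santonian (83.5 to 85.8)
intervals_overlap(70.6, 84.9, 83.5, 85.8)
# [1] TRUE
```

By this definition row 3 should not be in the result, while the Santonian match in row 4 is legitimate.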
Recommended answer
On 1.9.5, I get this:
#            GENUS ma_max ma_min         Stage i.ma_max i.ma_min
# 1:   Abydosaurus  109.0 94.300        Albian    112.0     99.6
# 2:   Abydosaurus  109.0 94.300    Cenomanian     99.6     93.5
# 3: Achelousaurus   84.9 70.600     Santonian     85.8     83.5
# 4:  Acheroraptor   70.6 66.043     Campanian     83.5     70.6
# 5: Achelousaurus   84.9 70.600     Campanian     83.5     70.6
# 6:  Acheroraptor   70.6 66.043 Maastrichtian     70.6     66.5
# 7: Achelousaurus   84.9 70.600 Maastrichtian     70.6     66.5
This is most likely a floating-point bug that was fixed in 1.9.5 in this commit. It would be great if you could verify this as well.
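One possible way to verify the result independently of foverlaps(...) is a brute-force comparison of every genus against every stage in base R. This is only a sketch under stated assumptions: it uses a subset of the question's data (the three genera and the seven youngest stages, since older stages cannot match), treats intervals as closed, and the names dinos, stgs, expected, and genera_per_stage are made up for illustration.

```r
# Subset of the question's data (assumption: older stages are omitted
# because none of the three genera can reach them).
dinos <- data.frame(
  GENUS  = c("Abydosaurus", "Achelousaurus", "Acheroraptor"),
  ma_max = c(109, 84.9, 70.6),
  ma_min = c(94.3, 70.6, 66.043),
  stringsAsFactors = FALSE
)
stgs <- data.frame(
  Stage  = c("Albian", "Cenomanian", "Turonian", "Coniacian",
             "Santonian", "Campanian", "Maastrichtian"),
  ma_max = c(112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6),
  ma_min = c(99.6, 93.5, 89.3, 85.8, 83.5, 70.6, 66.5),
  stringsAsFactors = FALSE
)

# All (genus, stage) index pairs.
idx <- expand.grid(d = seq_len(nrow(dinos)), s = seq_len(nrow(stgs)))

# Closed-interval overlap test applied to every pair.
hit <- dinos$ma_min[idx$d] <= stgs$ma_max[idx$s] &
       stgs$ma_min[idx$s] <= dinos$ma_max[idx$d]

# Pairs that genuinely overlap.
expected <- data.frame(
  GENUS = dinos$GENUS[idx$d][hit],
  Stage = stgs$Stage[idx$s][hit],
  stringsAsFactors = FALSE
)

# Number of genera present in each stage (the original goal).
genera_per_stage <- table(factor(expected$Stage, levels = stgs$Stage))
genera_per_stage
```

On this subset the brute-force check finds seven overlapping pairs and none for the Coniacian, which agrees with the 1.9.5 output above rather than the 1.9.4 output in the question.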