Problem description
Here is an example where foverlaps(...) seems to be finding matches that do not overlap. Can anyone help me understand what I'm doing wrong?
The problem in this post seems like an excellent opportunity to use foverlaps(...) in the data.table package. The datasets below are from that post.
dinosaurs <- structure(list(GENUS = structure(1:3, .Label = c("Abydosaurus", "Achelousaurus", "Acheroraptor"), class = "factor"), ma_max = c(109, 84.9, 70.6), ma_min = c(94.3, 70.6, 66.043), ma_mid = c(101.65, 77.75, 68.3215)), .Names = c("GENUS", "ma_max", "ma_min", "ma_mid"), class = "data.frame", row.names = c(NA, -3L))
stages <- structure(list(Stage = structure(c(13L, 19L, 17L, 21L, 1L, 4L, 6L, 8L, 16L, 14L, 20L, 7L, 23L, 12L, 5L, 3L, 2L, 10L, 22L, 11L, 18L, 9L, 15L), .Label = c("Aalenian", "Albian", "Aptian", "Bajocian", "Barremian", "Bathonian", "Berriasian", "Callovian", "Campanian", "Cenomanian", "Coniacian", "Hauterivian", "Hettangian", "Kimmeridgian", "Maastrichtian", "Oxfordian", "Pliensbachian", "Santonian", "Sinemurian", "Tithonian", "Toarcian", "Turonian", "Valanginian"), class = "factor"),ma_max = c(201.6, 197, 190, 183, 176, 172, 168, 165, 161, 156, 151, 145.5, 140, 136, 130, 125, 112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6), ma_min = c(197, 190, 183, 176, 172, 168, 165, 161, 156, 151, 145.5, 140, 136, 130, 125, 112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6, 66.5), ma_mid = c(199.3, 193.5, 186.5, 179.5, 174, 170, 166.5, 163, 158.5, 153.5, 148.25, 142.75, 138, 133, 127.5, 118.5, 105.8, 96.55, 91.4, 87.55, 84.65, 77.05, 68.05)), .Names = c("Stage", "ma_max", "ma_min", "ma_mid"), class = "data.frame", row.names = c(NA, -23L))
dinosaurs
# GENUS ma_max ma_min ma_mid
# 1 Abydosaurus 109.0 94.300 101.6500
# 2 Achelousaurus 84.9 70.600 77.7500
# 3 Acheroraptor 70.6 66.043 68.3215
head(stages)
# Stage ma_max ma_min ma_mid
# 1 Hettangian 201.6 197 199.3
# 2 Sinemurian 197.0 190 193.5
# 3 Pliensbachian 190.0 183 186.5
# 4 Toarcian 183.0 176 179.5
# 5 Aalenian 176.0 172 174.0
# 6 Bajocian 172.0 168 170.0
The goal is to find the number of dinosaur genera which were present in each geological stage.
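library(data.table) # 1.9.4
# Convert both data frames to data.tables by reference and drop the unused midpoint column
setDT(dinosaurs)[,ma_mid:=NULL]
setDT(stages)[,ma_mid:=NULL]
# foverlaps() needs y (dinosaurs) keyed on its interval columns: start first, then end
setkey(dinosaurs,ma_min,ma_max)
foverlaps(stages,dinosaurs,type="any",nomatch=0)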
# GENUS ma_max ma_min Stage i.ma_max i.ma_min
# 1: Abydosaurus 109.0 94.300 Albian 112.0 99.6
# 2: Abydosaurus 109.0 94.300 Cenomanian 99.6 93.5
# 3: Achelousaurus 84.9 70.600 Coniacian 89.3 85.8
# 4: Achelousaurus 84.9 70.600 Santonian 85.8 83.5
# 5: Acheroraptor 70.6 66.043 Campanian 83.5 70.6
# 6: Achelousaurus 84.9 70.600 Campanian 83.5 70.6
# 7: Acheroraptor 70.6 66.043 Maastrichtian 70.6 66.5
# 8: Achelousaurus 84.9 70.600 Maastrichtian 70.6 66.5
This is mostly correct, but look at row 3. It seems to assert that the Coniacian stage, from 85.8 to 89.3 million years ago, overlaps with Achelousaurus, which lived from 70.6 to 84.9 million years ago. What am I missing?
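One way to cross-check the suspicious row is a brute-force comparison in base R that does not rely on foverlaps(...) at all. The sketch below is only an illustration: the helper name overlaps, the trimmed data frames dino and stg (just the seven youngest stages from the data above), and the closed-interval rule are assumptions, not part of the original post.

```r
# Closed intervals [a_min, a_max] and [b_min, b_max] share at least one point
# exactly when each one starts no later than the other one ends.
overlaps <- function(a_min, a_max, b_min, b_max) {
  a_min <= b_max & b_min <= a_max
}

# Trimmed copies of the question's data (values copied from the post).
dino <- data.frame(
  GENUS  = c("Abydosaurus", "Achelousaurus", "Acheroraptor"),
  ma_max = c(109, 84.9, 70.6),
  ma_min = c(94.3, 70.6, 66.043),
  stringsAsFactors = FALSE
)
stg <- data.frame(
  Stage  = c("Albian", "Cenomanian", "Turonian", "Coniacian",
             "Santonian", "Campanian", "Maastrichtian"),
  ma_max = c(112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6),
  ma_min = c(99.6, 93.5, 89.3, 85.8, 83.5, 70.6, 66.5),
  stringsAsFactors = FALSE
)

# Every (genus, stage) index pair, then keep only the overlapping ones.
idx  <- expand.grid(d = seq_len(nrow(dino)), s = seq_len(nrow(stg)))
keep <- overlaps(dino$ma_min[idx$d], dino$ma_max[idx$d],
                 stg$ma_min[idx$s],  stg$ma_max[idx$s])
hits <- data.frame(GENUS = dino$GENUS[idx$d[keep]],
                   Stage = stg$Stage[idx$s[keep]],
                   stringsAsFactors = FALSE)

nrow(hits)                    # 7 overlapping pairs
"Coniacian" %in% hits$Stage   # FALSE: no genus overlaps the Coniacian

# Number of genera present in each stage that has at least one match.
genera_per_stage <- table(hits$Stage)
```

The comparison is quadratic in the number of rows, so it is only suitable as a sanity check on small inputs, not as a replacement for an interval join.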
Recommended answer
On 1.9.5, I get this:
# GENUS ma_max ma_min Stage i.ma_max i.ma_min
# 1: Abydosaurus 109.0 94.300 Albian 112.0 99.6
# 2: Abydosaurus 109.0 94.300 Cenomanian 99.6 93.5
# 3: Achelousaurus 84.9 70.600 Santonian 85.8 83.5
# 4: Acheroraptor 70.6 66.043 Campanian 83.5 70.6
# 5: Achelousaurus 84.9 70.600 Campanian 83.5 70.6
# 6: Acheroraptor 70.6 66.043 Maastrichtian 70.6 66.5
# 7: Achelousaurus 84.9 70.600 Maastrichtian 70.6 66.5
This is most likely a floating-point bug that was fixed in 1.9.5 in this commit. It would be great if you could verify this as well.
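For readers unfamiliar with why floating-point endpoints can be fragile, here is a small base R illustration. It does not reproduce the data.table bug itself. The helper to_ka (million years to integer thousands of years) is hypothetical, and rescaling to integers is only an assumed way to make the endpoint comparisons exact, not something the answer proposes.

```r
# Most decimal fractions have no exact binary representation, so arithmetic
# on doubles can miss an "obvious" equality.
x <- 0.1 + 0.2
x == 0.3                     # FALSE
isTRUE(all.equal(x, 0.3))    # TRUE: equal within a small tolerance

# Hypothetical helper: convert ages in Ma to integer thousands of years, so
# that interval endpoints compare exactly.
to_ka <- function(ma) as.integer(round(ma * 1000))

coniacian_min     <- to_ka(85.8)   # 85800L
achelousaurus_max <- to_ka(84.9)   # 84900L

# The Coniacian starts after Achelousaurus ends, so there is no overlap.
coniacian_min <= achelousaurus_max # FALSE
```

The rescaling only works when the data has a known, fixed precision (three decimal places here); otherwise rounding would change the intervals.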