Problem description
Fisher's exact test is related to the hypergeometric distribution, so I would expect these two commands to return identical p-values. Can anyone explain what I'm doing wrong that makes them not match?
#data (variable names chosen to match dhyper() argument names)
x = 14
m = 20
n = 41047
k = 40
#Fisher test, alternative = 'greater'
(fisher.test(matrix(c(x, m-x, k-x, n-(k-x)),2,2), alternative='greater'))$p.value
#returns 2.01804e-39
#hypergeometric distribution, lower.tail = F, i.e. P[X > x]
phyper(x, m, n, k, lower.tail = F, log.p = F)
#returns 5.115862e-43
Recommended answer
In this case, the relevant call to phyper is phyper(x - 1, m, n, k, lower.tail = FALSE). Look at the source code for fisher.test that is relevant to your call of fisher.test(matrix(c(x, m-x, k-x, n-(k-x)),2,2), alternative='greater'). At line 138, PVAL is set to:
switch(alternative, less = pnhyper(x, or),
greater = pnhyper(x, or, upper.tail = TRUE),
two.sided = {
if (or == 0) as.numeric(x == lo) else if (or ==
Inf) as.numeric(x == hi) else {
relErr <- 1 + 10^(-7)
d <- dnhyper(or)
sum(d[d <= d[x - lo + 1] * relErr])
}
})
Since alternative = 'greater', PVAL is set to pnhyper(x, or, upper.tail = TRUE). You can see pnhyper defined on line 122. Here, or = 1, which is passed to ncp, so the call is phyper(x - 1, m, n, k, lower.tail = FALSE).
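The shift by one follows from the count being integer-valued. As a short derivation, writing X for the hypergeometric count that phyper describes (this notation is introduced here for illustration and is not part of the original post): the one-sided 'greater' p-value is the probability of a count at least as large as the observed x, while phyper(x, ..., lower.tail = FALSE) returns the strict tail P(X > x).

```latex
\begin{align*}
P(X \ge x) &= P(X > x - 1) = 1 - P(X \le x - 1), \\
P(X > x)   &= P(X \ge x) - P(X = x).
\end{align*}
```

So the call in the question leaves out the single term P(X = x), which for these data makes up almost all of the tail probability.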
With your values:
x = 14
m = 20
n = 41047
k = 40
phyper(x - 1, m, n, k, lower.tail = FALSE)
# [1] 2.01804e-39
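As a rough cross-check of the off-by-one explanation, the sketch below recomputes the upper tail three ways with the values from the question: through phyper with x - 1, by summing dhyper over x, ..., min(m, k), and through fisher.test. The helper rel_diff is a hypothetical name introduced only for this illustration. Relative differences are computed by hand because, as far as I know, all.equal() compares on an absolute scale when the target is this close to zero, which would hide the mismatch.

```r
# Values from the question (names match the phyper()/dhyper() arguments).
x <- 14; m <- 20; n <- 41047; k <- 40

p_gt <- phyper(x, m, n, k, lower.tail = FALSE)      # P[X > x], the question's call
p_ge <- phyper(x - 1, m, n, k, lower.tail = FALSE)  # P[X >= x], the answer's call

# The same upper tail, summed term by term over the support x, ..., min(m, k).
p_sum <- sum(dhyper(x:min(m, k), m, n, k))

# One-sided Fisher test on the 2x2 table from the question.
tab <- matrix(c(x, m - x, k - x, n - (k - x)), 2, 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Hypothetical helper: relative difference, safe for very small probabilities.
rel_diff <- function(a, b) abs(a - b) / abs(b)

print(c(p_gt = p_gt, p_ge = p_ge, p_sum = p_sum, p_fisher = p_fisher))
print(rel_diff(p_sum, p_ge))                       # tail sum vs. phyper(x - 1, ...)
print(rel_diff(p_fisher, p_ge))                    # fisher.test vs. phyper(x - 1, ...)
print(rel_diff(p_ge - p_gt, dhyper(x, m, n, k)))   # the gap is the single term P[X = x]
```

If the three relative differences print as values near machine precision, the two original results differ only by the point mass at x = 14.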