Problem description
I have the following code to compare two distributions:
sns.kdeplot(df['term'][df['outcome'] == 0], shade=1, color='red')
sns.kdeplot(df['term'][df['outcome'] == 1], shade=1, color='green');
It looks like this:
How do I plot just the difference of both distributions (disA - disB)? Of course, it could contain negative values.
Recommended answer
Since the difference between two kde curves is not a kde curve itself, you cannot use kdeplot to plot that difference.
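To see why the difference is not itself a kde, here is a minimal sketch using scipy.stats.gaussian_kde. The two normal samples and the grid limits are made up for illustration only. Each kde integrates to about 1, so their difference integrates to about 0 and dips below zero, which a density cannot do.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Made-up samples, purely for illustration (not the question's data)
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 500)
b = rng.normal(1.0, 1.5, 500)

# Evaluate both KDEs on one grid wide enough to cover both samples
grid = np.linspace(-10.0, 12.0, 2001)
dens_a = gaussian_kde(a)(grid)
dens_b = gaussian_kde(b)(grid)
diff = dens_a - dens_b


def area(y, x):
    """Area under y(x) by the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


area_a = area(dens_a, grid)          # close to 1
area_b = area(dens_b, grid)          # close to 1
area_diff = area(diff, grid)         # close to 0
has_negative = bool((diff < 0).any())

print(f"area A = {area_a:.3f}, area B = {area_b:.3f}, area of difference = {area_diff:.3f}")
print("difference takes negative values:", has_negative)
```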
A kde is easily calculated using scipy.stats.gaussian_kde. The result is easily plotted with pyplot.
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import scipy.stats

# Two example samples standing in for the two distributions
a = np.random.gumbel(80, 25, 1000)
b = np.random.gumbel(90, 46, 4000)

# Fit a Gaussian kde to each sample
kdea = scipy.stats.gaussian_kde(a)
kdeb = scipy.stats.gaussian_kde(b)

# Evaluate both kdes on a common grid so they can be subtracted pointwise
grid = np.linspace(0, 500, 501)
plt.plot(grid, kdea(grid), label="kde A")
plt.plot(grid, kdeb(grid), label="kde B")
plt.plot(grid, kdea(grid) - kdeb(grid), label="difference")
plt.legend()
plt.show()
Mind that the result is really just the difference between the curves (as asked for); it has no statistical relevance at all.
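Applied to the DataFrame from the question, the same idea might look like the sketch below. The df built here is a hypothetical stand-in generated from random numbers; only the column names 'term' and 'outcome' come from the question. The shading of positive and negative regions is an optional extra, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Hypothetical stand-in for the question's DataFrame; replace with your own df
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "term": np.concatenate([rng.gumbel(80, 25, 1000), rng.gumbel(90, 46, 4000)]),
    "outcome": np.concatenate([np.zeros(1000, dtype=int), np.ones(4000, dtype=int)]),
})

# Split 'term' by outcome, dropping missing values before fitting the KDEs
term0 = df.loc[df["outcome"] == 0, "term"].dropna().to_numpy()
term1 = df.loc[df["outcome"] == 1, "term"].dropna().to_numpy()

# Evaluate both KDEs on one shared grid spanning all observed values
grid = np.linspace(df["term"].min(), df["term"].max(), 500)
diff = gaussian_kde(term0)(grid) - gaussian_kde(term1)(grid)

fig, ax = plt.subplots()
ax.plot(grid, diff, color="black", label="kde(outcome=0) - kde(outcome=1)")
ax.axhline(0, color="grey", linewidth=0.8)
# Shade by sign, reusing the question's colours (red: outcome 0, green: outcome 1)
ax.fill_between(grid, diff, 0, where=diff > 0, color="red", alpha=0.3)
ax.fill_between(grid, diff, 0, where=diff < 0, color="green", alpha=0.3)
ax.set_xlabel("term")
ax.legend()
```

Call plt.show() afterwards to display the figure, or fig.savefig with a file name of your choice to write it to disk.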