I'm new to Python.
I have 2 arrays and a nice bar chart:
import matplotlib.pyplot as plt

# Buyers in %
h = [1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0]
# Clothes size
x = [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]
# P(X = 40) = 16 %  // The probability that a buyer takes size 40 is 16 %
# P(37 <= X <= 40) = 5+9+13+16 = 43 %  // The probability that a buyer takes a size between 37 and 40 is 43 %
plt.ylabel('Buyers %')
plt.xlabel('Clothes Size')
plt.bar(x, height=h)
plt.grid(True)
plt.show()
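How can I use seaborn or scipy.stats.norm to turn this into a density line and a normal distribution, and plot it over the bar chart?
After that, how can I use the normal distribution to compute P(X
Thanks.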
Best answer
Using seaborn:
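import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm

# Buyers in %
h = [1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0]
# Clothes size
x = [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]

# Expand the percentages into observations: size x[i] appears h[i] times
data = []
for size, count in zip(x, h):
    data += [size] * count

sns.set()
plt.figure(figsize=(10, 5), dpi=300)
# Histogram with a fitted normal curve (distplot is deprecated since seaborn 0.11)
sns.distplot(data, fit=norm, kde=False)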
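Because `distplot` is deprecated in newer seaborn releases, a similar picture can be sketched with matplotlib and scipy alone. This is only an illustrative alternative, not the method used in the answer: it assumes the mean and standard deviation are estimated with the percentages `h` as weights, and it scales the normal density to percent (each bar is one size wide) so that it overlays the original bars. The names `mu`, `sigma`, `grid`, and `curve` are introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Buyers in % and clothes sizes, copied from the question
h = np.array([1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0])
x = np.array([34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48])

# Weighted mean and (population) standard deviation of the sizes
mu = np.average(x, weights=h)
sigma = np.sqrt(np.average((x - mu) ** 2, weights=h))

# Normal density on a fine grid, rescaled from density to percent
# (bars have width 1, and the percentages sum to h.sum())
grid = np.linspace(x.min() - 1, x.max() + 1, 300)
curve = norm.pdf(grid, loc=mu, scale=sigma) * h.sum()

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(x, h, width=1.0, alpha=0.5, label='Buyers %')
ax.plot(grid, curve, color='k', label='Fitted normal')
ax.set_xlabel('Clothes Size')
ax.set_ylabel('Buyers %')
ax.legend()
ax.grid(True)

if __name__ == "__main__":
    plt.show()
```

Working from `x` and `h` directly avoids building the expanded `data` list, at the cost of computing the weighted statistics by hand.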
Getting the probability:
from scipy.stats import norm
import numpy as np

sample_mean = np.array(data).mean()
sample_std = np.array(data).std()
min_value = int(sample_mean - 4 * sample_std)
max_value = int(sample_mean + 4 * sample_std)
# Normal distribution fitted to the sample mean and standard deviation
dist = norm(sample_mean, sample_std)
values = list(range(min_value, max_value))
probabilities = [dist.pdf(value) for value in values]
# plt.plot(values, probabilities)

def prob(min_lim, max_lim):
    # Sum the density at the integer sizes strictly between the two limits
    vals = np.array(values)
    mask = (vals > min_lim) & (vals < max_lim)
    return np.array(probabilities)[mask].sum()

prob(0, 40)
Out[2]: 0.3230891372830226
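Note: this differs from the value computed from the raw data, because it uses a continuous normal distribution estimated from the mean and standard deviation of the data.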
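As a cross-check on the summed-density approach, the fitted distribution's cumulative distribution function can be used directly. The sketch below is an assumption-laden illustration rather than part of the original answer: it rebuilds `data` with `np.repeat`, and the half-unit continuity correction (39.5, 36.5, 40.5) is one common convention for treating integer sizes under a continuous fit. The variable names are hypothetical.

```python
import numpy as np
from scipy.stats import norm

# Buyers in % and clothes sizes, copied from the question
h = [1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0]
x = [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]

# Each size repeated as many times as its percentage
data = np.repeat(x, h)

# Normal distribution fitted to the sample mean and standard deviation
dist = norm(data.mean(), data.std())

# P(X < 40) under the continuous fit, without and with continuity correction
p_below_40 = dist.cdf(40)
p_below_40_cc = dist.cdf(39.5)

# P(37 <= X <= 40) for integer sizes, with continuity correction
p_37_to_40 = dist.cdf(40.5) - dist.cdf(36.5)

print(p_below_40, p_below_40_cc, p_37_to_40)
```

The corrected value `p_below_40_cc` should land close to the summed-density result, since summing the density at integer points approximates the same area.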
If you don't want to use the continuous estimate, the code is:
len(np.array(data)[np.array(data)<40])/len(data)
Out[2]: 0.32
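The same empirical probabilities can also be read straight from the two arrays, without expanding them into `data`. The helper below is a minimal sketch; `empirical_prob` is a hypothetical name introduced here, and it assumes both limits are inclusive, matching the question's P(37 <= X <= 40).

```python
import numpy as np

# Buyers in % and clothes sizes, copied from the question
h = np.array([1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0])
x = np.array([34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48])

def empirical_prob(lo, hi):
    """Share of buyers with lo <= size <= hi, taken from the table itself."""
    mask = (x >= lo) & (x <= hi)
    return h[mask].sum() / h.sum()

p_40 = empirical_prob(40, 40)             # P(X = 40)
p_37_to_40 = empirical_prob(37, 40)       # P(37 <= X <= 40)
p_below_40 = empirical_prob(x.min(), 39)  # P(X < 40)

print(p_40, p_37_to_40, p_below_40)
```

These reproduce the 16 % and 43 % from the question's comments and the 0.32 shown above.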
Regarding "python - turning bars into a normal distribution", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59582648/