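Problem description
How do I remove data below a certain frequency count from a histogram in Python?
Say I have 10 bins: the first has a count of 4, the second 2, the third 1, the fourth 5, and so on. Now I want to get rid of the data with a count of 2 or less, so the second bin would go to zero, as would the third.
Example: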
import numpy as np
import matplotlib.pyplot as plt
gaussian_numbers = np.random.randn(1000)
plt.hist(gaussian_numbers, bins=12)
plt.title("Gaussian Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
fig = plt.gcf()
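Gives: (histogram plot, image not included)
I want to get rid of the bins whose frequency is below some threshold 'X' (for example, X = 100).
Want: (the same histogram with the low-frequency bins removed, image not included)
Thank you.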
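Use np.histogram to create the histogram.
Then use np.where to pick out the low-count bins. The condition hist <= freq yields a boolean array; np.where converts it into the indices of the matching bins, which you can use to index the histogram.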
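To make the indexing step concrete, here is a minimal sketch using the four example counts from the question (4, 2, 1, 5) and a threshold of 2. The variable names are illustrative only, and the sketch also shows that the boolean array can index the counts directly, without np.where.

```python
import numpy as np

# Bin counts taken from the question's example; threshold of "2 or less"
counts = np.array([4, 2, 1, 5])
threshold = 2

# The comparison produces a boolean array, one entry per bin
mask = counts <= threshold

# np.where with a single argument returns a tuple of index arrays
indices = np.where(mask)

# Either the mask or the indices can be used to zero out the low bins
filtered = counts.copy()
filtered[mask] = 0

print(mask)
print(indices)
print(filtered)  # [4 0 0 5]
```

Working on a copy keeps the original counts available for comparison.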
import numpy as np
import matplotlib.pyplot as plt
gaussian_numbers = np.random.randn(1000)
# Get histogram
hist, bins = np.histogram(gaussian_numbers, bins=12)
# Threshold frequency
freq = 100
# Zero out low values
hist[np.where(hist <= freq)] = 0
# Plot
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width)
plt.title("Gaussian Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
(The plotting part is inspired by a linked source.)
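The answer above zeroes the bin counts before plotting. If the goal is instead to drop the underlying samples that fall into low-count bins, one possible variation is sketched below. The helper name drop_sparse_bins, its parameters, and the fixed random seed are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np

def drop_sparse_bins(data, bins=12, min_count=100):
    """Return the samples lying in bins that hold more than min_count samples.

    Hypothetical helper: a sketch, not part of the original answer.
    """
    data = np.asarray(data)
    counts, edges = np.histogram(data, bins=bins)
    # Map each sample to its bin index. np.histogram closes the last bin on
    # the right, so the maximum value is clipped into the final bin.
    idx = np.clip(np.digitize(data, edges) - 1, 0, len(counts) - 1)
    keep = counts[idx] > min_count
    return data[keep], edges

# Seeded generator so the sketch is reproducible (an assumption)
rng = np.random.default_rng(0)
samples = rng.standard_normal(1000)

kept, edges = drop_sparse_bins(samples, bins=12, min_count=100)

# Re-binning the kept samples on the same edges leaves the sparse bins empty
new_counts, _ = np.histogram(kept, bins=edges)
print(new_counts)
```

Passing kept together with bins=edges to plt.hist would then draw the filtered histogram on the original bin edges.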