Problem description
I get a strange error when running the Tukey test. I hope somebody can help me with this, as I have already tried a lot. This is my dataframe:
  Name  Score
1    A   2.29
2    B   2.19
This is my Tukey test code:
# Tukey HSD test
tukey = pairwise_tukeyhsd(endog=df['Score'].astype('float'),
                          groups=df['Name'],
                          alpha=0.05)

tukey.plot_simultaneous()
plt.vlines(x=49.57, ymin=-0.5, ymax=4.5, color="red")
tukey.summary()
This is the error:
<ipython-input-12-3e12e78a002f> in <module>()
2 tukey = pairwise_tukeyhsd(endog=df['Score'].astype('float'),
3 groups=df['Name'],
----> 4 alpha=0.05)
5
6 tukey.plot_simultaneous()
/usr/local/lib/python3.6/dist-packages/statsmodels/stats/multicomp.py in pairwise_tukeyhsd(endog, groups, alpha)
36 '''
37
---> 38 return MultiComparison(endog, groups).tukeyhsd(alpha=alpha)
/usr/local/lib/python3.6/dist-packages/statsmodels/sandbox/stats/multicomp.py in __init__(self, data, groups, group_order)
794 if group_order is None:
795 self.groupsunique, self.groupintlab = np.unique(groups,
--> 796 return_inverse=True)
797 else:
798 #check if group_order has any names not in groups
/usr/local/lib/python3.6/dist-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
221 ar = np.asanyarray(ar)
222 if axis is None:
--> 223 return _unique1d(ar, return_index, return_inverse, return_counts)
224 if not (-ar.ndim <= axis < ar.ndim):
225 raise ValueError('Invalid axis kwarg specified for unique')
/usr/local/lib/python3.6/dist-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
278
279 if optional_indices:
--> 280 perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
281 aux = ar[perm]
282 else:
TypeError: '<' not supported between instances of 'float' and 'str'
How can this error be resolved? Thanks in advance!
Recommended answer
You get this error because df['Name'] contains both floats and strings, and df['Name'] is a pandas.core.series.Series. As the traceback shows, this combination makes numpy.unique() fail, because it has to sort the group labels and cannot compare a float with a string. You can fix the problem in two ways.
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Pass the group labels as a plain list instead of a Series
tukey = pairwise_tukeyhsd(endog=df['Score'].astype('float'),
                          groups=list(df['Name']),
                          alpha=0.05)
OR
Make sure df['Name'] contains only numbers or only strings.
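To see where the float comes from, it may help to inspect the column before changing anything. The sketch below uses made-up data and illustrative variable names. It assumes the float is a missing name, which pandas stores as NaN (a float); your data may differ. It calls numpy.unique() directly, which is the call that fails in the traceback, and then applies the second fix.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the third row has no name, so pandas stores NaN (a float) there.
df = pd.DataFrame({
    "Name": pd.Series(["A", "B", np.nan, "A", "B"], dtype=object),
    "Score": [2.29, 2.19, 2.05, 2.31, 2.12],
})

# Show which Python types actually occur in the column.
print(df["Name"].map(type).value_counts())

# Reproduce the failure: sorting an object array that mixes float and str.
try:
    np.unique(df["Name"], return_inverse=True)
    failed = False
except TypeError as exc:
    failed = True
    print("np.unique failed:", exc)

# Why list() avoids the error: NumPy builds a string array from the list,
# so the missing value silently becomes the text label 'nan'.
as_list = np.asanyarray(list(df["Name"]))
print(as_list.dtype, as_list)

# Second fix: drop rows without a group label, then make every label a string.
clean = df.dropna(subset=["Name"]).copy()
clean["Name"] = clean["Name"].astype(str)

labels, codes = np.unique(clean["Name"], return_inverse=True)
print(labels, codes)
```

After this cleanup, clean['Score'] and clean['Name'] can be passed to pairwise_tukeyhsd in the same way as in the question. Note that with the list() workaround a missing name ends up as its own group called 'nan', so dropping or filling those rows first is probably the safer choice if that is not what you want.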