
Problem description


I would like to compare a set of distributions of scores (score), grouped by some categories (centrality) and colored by some other (model). I've tried the following with seaborn:

import matplotlib.pyplot as plt
import seaborn

# data and models are defined elsewhere in the asker's script
plt.figure(figsize=(14, 6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                palette=seaborn.color_palette("husl", len(models) + 1))
seaborn.despine(offset=10, trim=True)
plt.savefig("/home/i11/staudt/Eval/properties-replication-test.pdf", bbox_inches="tight")

There are some issues with this plot:



  • There is a large number of outliers and I don't like how they are drawn here. Can I remove them? Can I change their appearance to reduce clutter? Can I at least color them so that they match the box color?
  • The model value original is special because all other distributions should be compared to the distribution of original. This should be visually reflected in the plot. Can I make original the first box of every group? Can I offset or mark it differently somehow? Would it be possible to draw a horizontal line through the median of each original distribution, spanning the group of boxes?
  • Some of the values of score are very small. How can I scale the y-axis properly to show them?

Edit:


Here is an example with a log-scaled y-axis, which is also not yet ideal. Why do some boxes seem cut off at the low end?

Recommended answer

Outlier display


You should be able to pass any arguments to seaborn.boxplot that you can pass to plt.boxplot (see documentation), so you could adjust the display of the outliers by setting flierprops. Here are some examples of what you can do with your outliers.


If you don't want to display them, you could do

seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                showfliers=False)


or you could make them light gray like so:

flierprops = dict(markerfacecolor='0.75', markersize=5,
                  linestyle='none')
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                flierprops=flierprops)
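To compare the two options side by side, here is a minimal sketch on synthetic data. The DataFrame layout, the category names and the model names are assumptions made for illustration, not the asker's actual data; only showfliers and flierprops come from the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

# Synthetic stand-in for the question's data (assumed: one row per score).
# Log-normal scores produce many high outliers, similar to the problem described.
rng = np.random.default_rng(0)
data = pd.DataFrame([
    {"centrality": c, "model": m, "score": s}
    for c in ["degree", "betweenness"]           # hypothetical categories
    for m in ["original", "model_a", "model_b"]  # hypothetical model names
    for s in rng.lognormal(mean=-3.0, sigma=1.0, size=200)
])

fig, (ax_hidden, ax_gray) = plt.subplots(1, 2, figsize=(14, 6), sharey=True)

# Left panel: outliers are not drawn at all.
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                showfliers=False, ax=ax_hidden)

# Right panel: outliers drawn as small, light-gray markers.
flierprops = dict(marker="o", markerfacecolor="0.75", markeredgecolor="0.75",
                  markersize=3, linestyle="none")
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                flierprops=flierprops, ax=ax_gray)
```

Calling fig.savefig(...) afterwards writes both panels to a file. Note that this uses one gray for all outliers; matching each outlier's color to its box is not covered here.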

Order of groups


You can set the order of the groups manually with hue_order, e.g.

seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                hue_order=["original", "Havel..","etc"])
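The question also asks about a horizontal line through the median of each original distribution, which the answer does not address. The following is only a sketch of one way to approach it, on synthetic data with hypothetical category and model names. It assumes seaborn's default layout, where category i is centered at x = i and a group of boxes spans roughly 0.4 on either side.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

rng = np.random.default_rng(1)
centralities = ["degree", "betweenness"]          # hypothetical categories
model_names = ["model_b", "original", "model_a"]  # hypothetical model names
data = pd.DataFrame([
    {"centrality": c, "model": m, "score": s}
    for c in centralities
    for m in model_names
    for s in rng.normal(loc=1.0, scale=0.2, size=100)
])

# Build hue_order from the data: "original" first, the rest sorted by name.
others = sorted(m for m in data["model"].unique() if m != "original")
hue_order = ["original"] + others

fig, ax = plt.subplots(figsize=(14, 6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                order=centralities, hue_order=hue_order, ax=ax)

# Median of the "original" scores within each centrality group.
medians = (data[data["model"] == "original"]
           .groupby("centrality")["score"].median())

# Assumption: category i is centered at x = i and the group is about 0.8 wide.
for i, c in enumerate(centralities):
    ax.hlines(medians[c], i - 0.4, i + 0.4, colors="0.3", linestyles="dashed")
```

Passing order explicitly fixes which category sits at which x position, so the reference lines land on the intended groups.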

Scaling of the y-axis


You could just get the minimum and maximum of all y-values and set the y-axis limits accordingly. seaborn.boxplot has no y_lim argument, so set the limits on the Axes it returns. Something like this:

import numpy as np

y_values = data["score"].values
ax = seaborn.boxplot(x="centrality", y="score", hue="model", data=data)
ax.set_ylim(np.min(y_values), np.max(y_values))

Edit: This last point doesn't really make sense, since the automatic y-axis range already includes all the values, but I'm leaving it as an example of how to adjust these settings. As mentioned in the comments, log-scaling probably makes more sense.
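A minimal sketch of log-scaling follows, again on synthetic data with hypothetical names. It also counts non-positive scores, because a log axis cannot show zero or negative values. That is one possible reason a box would appear cut off at the low end, though the question does not confirm this is the cause.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

# Synthetic, strictly positive scores spanning several orders of magnitude.
rng = np.random.default_rng(2)
data = pd.DataFrame([
    {"centrality": c, "model": m, "score": s}
    for c in ["degree", "betweenness"]           # hypothetical categories
    for m in ["original", "model_a", "model_b"]  # hypothetical model names
    for s in rng.lognormal(mean=-4.0, sigma=2.0, size=200)
])

# Zero or negative scores cannot be drawn on a log axis, so check for them first.
n_nonpositive = int((data["score"] <= 0).sum())

fig, ax = plt.subplots(figsize=(14, 6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data, ax=ax)

# Switch the y-axis to a logarithmic scale so small scores remain visible.
ax.set_yscale("log")
```

If n_nonpositive is greater than zero for real data, matplotlib's "symlog" scale is one alternative worth trying, since it is linear around zero.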
