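I have a pandas DataFrame containing data on Facebook posts, categorized by "post type". The DataFrame is called "Posts_by_type". It contains the number of likes, the number of shares, and the post type. There are 3 types of posts: racing, entertainment, and promo.
I want to create a box plot in matplotlib that shows the number of likes for each type of post.
My code works: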
Posts_by_type.boxplot(column='Likes', by='Type', grid=True)
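This produces the following box plot:
However, I would also like to label the median and the whiskers on the box plot with their corresponding numeric values.
Is this possible in matplotlib? If so, can anyone give me some pointers on how to do it?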
Best answer
Here is one solution, which also adds the values for the edges of the box.
import random
import string
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def get_x_tick_labels(df, grouped_by):
    tmp = df.groupby([grouped_by]).size()
    return ["{0}: {1}".format(k, v) for k, v in tmp.to_dict().items()]

def series_values_as_dict(series_object):
    # The Series holds one dict of plot artists per plotted column;
    # return the first (and here only) one
    tmp = series_object.to_dict().values()
    return list(tmp)[0]

def generate_dataframe():
    # Create a pandas dataframe...
    _likes = [random.randint(0, 300) for _ in range(100)]
    _type = [random.choice(string.ascii_uppercase[:5]) for _ in range(100)]
    _shares = [random.randint(0, 100) for _ in range(100)]
    return pd.DataFrame(
        {'Likes': _likes,
         'Type': _type,
         'shares': _shares
         })

def add_values(bp, ax):
    """Add the numeric values to the various elements of the box plots."""
    for element in ['whiskers', 'medians', 'caps']:
        for line in bp[element]:
            # Get the position of the element. y is the label you want.
            # Whiskers are vertical lines whose first point is the box edge,
            # so for them y is the quartile value; caps give the whisker ends.
            (x_l, y), (x_r, _) = line.get_xydata()
            # Make sure datapoints exist
            # (I've been working with intervals, should not be a problem for this case)
            if not np.isnan(y):
                x_line_center = x_l + (x_r - x_l) / 2
                y_line_center = y  # Medians and caps are horizontal lines
                # Overlay the value: on the line, from center to right
                ax.text(x_line_center, y_line_center,  # Position
                        '%.3f' % y,  # Value (3f = 3 decimal float)
                        verticalalignment='center',  # Centered vertically with line
                        fontsize=16, backgroundcolor="white")

posts_by_type = generate_dataframe()
fig, axes = plt.subplots(1, figsize=(20, 10))
bp_series = posts_by_type.boxplot(column='Likes', by='Type',
                                  grid=True, figsize=(25, 10),
                                  ax=axes, return_type='dict')
# With by=..., pandas returns a Series of dicts rather than a plain dict, so...
bp_dict = series_values_as_dict(bp_series)
# Now add the values
add_values(bp_dict, axes)
# Set a label on the X-axis for each boxplot
labels = get_x_tick_labels(posts_by_type, 'Type')
plt.xticks(range(1, len(labels) + 1), labels)
# Change some other texts on the graph
plt.title('Likes per type of post', fontsize=22)
plt.xlabel('Type', fontsize=18)
plt.ylabel('Likes', fontsize=18)
plt.suptitle('This is a pretty graph')
plt.show()
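As a cross-check on the labels read back from the plotted lines, the same numbers can be computed directly from the data. The following is only a minimal sketch of that alternative, not the approach used in the answer above: it assumes `matplotlib.cbook.boxplot_stats` with its default whisker setting (1.5 × IQR, the same default `Axes.boxplot` uses), and the sample data, the helper name `label_boxplots`, and the label offset of 0.2 are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import cbook

# Hypothetical sample data standing in for Posts_by_type
posts = pd.DataFrame({
    "Type": ["Racing"] * 5 + ["Entertainment"] * 5 + ["Promo"] * 5,
    "Likes": [10, 20, 30, 40, 50, 5, 15, 25, 35, 45, 100, 110, 120, 130, 140],
})


def label_boxplots(df, column, by, ax):
    """Draw one box per group and write whisker, quartile and median values next to it."""
    # groupby sorts the group names, so boxes appear in alphabetical order
    groups = {name: g[column].to_numpy() for name, g in df.groupby(by)}
    names = list(groups)
    data = list(groups.values())

    ax.boxplot(data)  # boxes are drawn at x = 1, 2, ..., len(names)
    ax.set_xticks(range(1, len(names) + 1))
    ax.set_xticklabels(names)

    # Compute the five summary values per group from the raw data
    stats = cbook.boxplot_stats(data)
    for pos, s in enumerate(stats, start=1):
        for key in ("whislo", "q1", "med", "q3", "whishi"):
            # Place the text slightly to the right of the box
            ax.text(pos + 0.2, s[key], f"{s[key]:.1f}", va="center")
    return dict(zip(names, stats))


fig, ax = plt.subplots(figsize=(10, 6))
stats = label_boxplots(posts, "Likes", "Type", ax)
fig.savefig("likes_by_type.png")
```

Because the values come from the data rather than from the drawn lines, this variant does not depend on the order of points inside each `Line2D`, but it does rely on the plot and `boxplot_stats` using the same whisker setting.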
Regarding "python - Have a box plot, want to label the median and whiskers with values", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40813813/