Problem description
I'm trying to make an array of line charts from a data frame like this:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({ 'CITY' : np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'], 10000),
'TIME_BIN': np.random.randint(1, 86400, size=10000),
'COUNT': np.random.randint(1, 700, size=10000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min').dt.strftime('%H:%M:%S')
print(df)
CITY COUNT DAY TIME_BIN
0 ATLANTA 270 Wednesday 10:50:00
1 CHICAGO 375 Wednesday 12:20:00
2 MIAMI 490 Thursday 11:30:00
3 MIAMI 571 Sunday 23:30:00
4 DENVER 379 Saturday 07:30:00
... ... ... ... ...
9995 ATLANTA 107 Saturday 21:10:00
9996 DENVER 127 Tuesday 15:00:00
9997 DENVER 330 Friday 06:20:00
9998 PHOENIX 379 Saturday 19:50:00
9999 CHICAGO 628 Saturday 01:30:00
This is what I have right now:
piv = df.pivot(columns="DAY").plot(x='TIME_BIN', kind="Line", subplots=True)
plt.show()
But the x-axis formatting is messed up, and I need each city to be its own line. How do I fix that? I think I need to loop through each day of the week instead of trying to build the array of charts in a single line. I've tried seaborn with no luck. To summarize, this is what I'm trying to achieve:
- TIME_BIN on the x-axis
- COUNT on the y-axis
- A differently colored line for each city
- One chart per day
Recommended answer
I don't see how pivoting helps here, since in the end you need to divide your data twice: once by day of the week, with each day going into its own subplot, and again by city, with each city getting its own colored line. At this point we're at the limit of what pandas can do with its plotting wrapper.
Using matplotlib, one can loop through the two categories, days and cities, and just plot the data.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates
df = pd.DataFrame({
'CITY' : np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday',
'Friday', 'Saturday', 'Sunday'], 10000),
'TIME_BIN': np.random.randint(1, 86400, size=10000),
'COUNT': np.random.randint(1, 700, size=10000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')
days = ['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cities = np.unique(df["CITY"])
fig, axes = plt.subplots(nrows=len(days), figsize=(13,8), sharex=True)
# loop over days (one could use groupby here, but that would lead to days unsorted)
for i, day in enumerate(days):
    ddf = df[df["DAY"] == day].sort_values("TIME_BIN")
    # loop over cities
    for city in cities:
        dddf = ddf[ddf["CITY"] == city]
        axes[i].plot(dddf["TIME_BIN"], dddf["COUNT"], label=city)
    axes[i].margins(x=0)
    axes[i].set_title(day)
fmt = matplotlib.dates.DateFormatter("%H:%M")
axes[-1].xaxis.set_major_formatter(fmt)
axes[0].legend(bbox_to_anchor=(1.02,1))
fig.subplots_adjust(left=0.05,bottom=0.05, top=0.95,right=0.85, hspace=0.8)
plt.show()
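The comment in the loop above notes that groupby would leave the days unsorted. Here is a minimal sketch of one way around that, assuming the DAY column is first converted to an ordered pandas Categorical. That conversion, the fixed random seed, the smaller sample size, and the helper list `group_order` are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Smaller, seeded sample (assumption) so the sketch is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'CITY': rng.choice(['PHOENIX', 'ATLANTA', 'CHICAGO', 'MIAMI', 'DENVER'], 1000),
    'DAY': rng.choice(days, 1000),
    'TIME_BIN': rng.integers(1, 86400, size=1000),
    'COUNT': rng.integers(1, 700, size=1000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')

# An ordered categorical makes groupby iterate in weekday order
# instead of alphabetical order.
df['DAY'] = pd.Categorical(df['DAY'], categories=days, ordered=True)

fig, axes = plt.subplots(nrows=len(days), figsize=(13, 8), sharex=True)
group_order = []  # hypothetical helper: records the order groupby yields the days

# Sorting first keeps each group's rows ordered by time.
grouped = df.sort_values('TIME_BIN').groupby('DAY', observed=False)
for ax, (day, ddf) in zip(axes, grouped):
    group_order.append(day)
    # inner groupby: one line per city
    for city, dddf in ddf.groupby('CITY'):
        ax.plot(dddf['TIME_BIN'], dddf['COUNT'], label=city)
    ax.margins(x=0)
    ax.set_title(day)
```

The formatter, legend, and `subplots_adjust` calls from the answer's code can be applied unchanged afterwards, followed by `plt.show()` to display the figure.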
Roughly the same can be achieved with a seaborn FacetGrid.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates
import seaborn as sns
df = pd.DataFrame({
'CITY' : np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday',
'Friday', 'Saturday', 'Sunday'], 10000),
'TIME_BIN': np.random.randint(1, 86400, size=10000),
'COUNT': np.random.randint(1, 700, size=10000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')
days = ['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cities = np.unique(df["CITY"])
g = sns.FacetGrid(data=df.sort_values('TIME_BIN'),
row="DAY", row_order=days,
hue="CITY", hue_order=cities, sharex=True, aspect=5)
g.map(plt.plot, "TIME_BIN", "COUNT")
g.add_legend()
g.fig.subplots_adjust(left=0.05,bottom=0.05, top=0.95,hspace=0.8)
fmt = matplotlib.dates.DateFormatter("%H:%M")
g.axes[-1,-1].xaxis.set_major_formatter(fmt)
plt.show()