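I was wondering if anyone could help me with plotting parallel coordinates.
First, this is what the data looks like:
It is derived from the following dataset: https://data.cityofnewyork.us/Transportation/2016-Yellow-Taxi-Trip-Data/k67s-dv2t
I am trying to normalize some features and then use them to compute the mean trip distance, passenger count, and payment amount for each day of the week.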
import pandas as pd
# pandas.tools.plotting was removed in newer pandas; the function now lives in pandas.plotting
from pandas.plotting import parallel_coordinates

features = ['trip_distance', 'passenger_count', 'payment_amount']
# normalize each feature to the range [0, 1] (min-max scaling)
for feature in features:
    df[feature] = (df[feature] - df[feature].min()) / (df[feature].max() - df[feature].min())
# change format to datetime
pickup_time = pd.to_datetime(df['pickup_datetime'], format='%d/%m/%y %H:%M')
# fill dayofweek column with 0~6, 0: Monday and 6: Sunday
df['dayofweek'] = pickup_time.dt.weekday
# one Series of per-weekday means for each feature
mean_trip = df.groupby('dayofweek').trip_distance.mean()
mean_passanger = df.groupby('dayofweek').passenger_count.mean()
mean_payment = df.groupby('dayofweek').payment_amount.mean()
#parallel_coordinates('notsurewattoput')
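So, if I print mean_trip:
It shows the mean for each day of the week, but I am not sure how to use it to draw a parallel coordinates plot with all three means on the same figure.
Does anyone know how to do this?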
Best answer
I think you can change the three aggregated means so that the output is a single DataFrame
instead of 3 Series:
mean_trip = df.groupby('dayofweek').trip_distance.mean()
mean_passanger = df.groupby('dayofweek').passenger_count.mean()
mean_payment = df.groupby('dayofweek').payment_amount.mean()
to:
from pandas.plotting import parallel_coordinates
cols = ['trip_distance','passenger_count','payment_amount']
df1 = df.groupby('dayofweek', as_index=False)[cols].mean()
#https://stackoverflow.com/a/45082022
parallel_coordinates(df1, class_column='dayofweek', cols=cols)
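
To try the approach above without downloading the taxi data, here is a minimal self-contained sketch. The DataFrame is synthetic (random values generated purely for illustration), the column names are taken from the question, and the `colormap` choice and the `Agg` backend are my own assumptions so that it runs without a display. It is one way this could be wired together, not the only one.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates

# Synthetic stand-in for the taxi data; all values are made up for illustration.
rng = np.random.default_rng(0)
n = 1000
start = pd.Timestamp("2016-01-04")  # a Monday
offsets = pd.to_timedelta(rng.integers(0, 28 * 24 * 60, n), unit="min")
df = pd.DataFrame({
    "pickup_datetime": start + offsets,
    "trip_distance": rng.gamma(2.0, 1.5, n),
    "passenger_count": rng.integers(1, 7, n),
    "payment_amount": rng.gamma(3.0, 5.0, n),
})

cols = ["trip_distance", "passenger_count", "payment_amount"]

# Min-max normalization of all three features at once.
df[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

# 0 = Monday ... 6 = Sunday
df["dayofweek"] = df["pickup_datetime"].dt.weekday

# One row per weekday, one column per feature mean.
df1 = df.groupby("dayofweek", as_index=False)[cols].mean()

# Each weekday becomes one line across the three feature axes.
ax = parallel_coordinates(df1, class_column="dayofweek", cols=cols, colormap="viridis")
ax.set_title("Mean normalized features by day of week")
```

With an interactive backend you would typically finish with `plt.show()` (after `import matplotlib.pyplot as plt`), or call `ax.figure.savefig(...)` to write the figure to a file. Because `as_index=False` keeps `dayofweek` as an ordinary column, it can be passed directly as `class_column`.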