Plotting an SVM with Matplotlib?

Problem description

I have some interesting user data. It gives some information on the timeliness of certain tasks the users were asked to perform. I am trying to find out whether late, which tells me if users are on time (0), a little late (1), or quite late (2), is predictable/explainable. I generate late from a column giving traffic light information (green = not late, red = super late).

Here is what I do:

  # imports
  import pandas as pd
  import numpy as np
  import matplotlib.pyplot as plt
  from sklearn import preprocessing
  from sklearn import svm
  import sklearn.metrics as sm

  # load user data, skipping malformed rows
  # (on_bad_lines='skip' replaces the removed error_bad_lines=False, pandas >= 1.3)
  df = pd.read_csv('April.csv', on_bad_lines='skip', encoding='iso8859_15', delimiter=';')

  # convert objects to datetime data types
  cols = ['Planned Start', 'Actual Start', 'Planned End', 'Actual End']
  df = df[cols].apply(
      pd.to_datetime, dayfirst=True, errors='ignore'
  ).join(df.drop(columns=cols))

  # convert datetime to numeric data types
  df = df[cols].apply(
      pd.to_numeric, errors='ignore'
  ).join(df.drop(columns=cols))

  # add likert scale for green, yellow and red traffic lights
  # (.loc replaces the removed .ix indexer)
  df['late'] = 0
  df.loc[df['End Time Traffic Light'].isin(['Yellow']), 'late'] = 1
  df.loc[df['End Time Traffic Light'].isin(['Red']), 'late'] = 2

  # Supervised Learning

  # X and y arrays (.to_numpy() replaces the removed .as_matrix())
  # X = np.array(df.drop(['late'], axis=1))
  X = df[['Planned Start', 'Actual Start', 'Planned End', 'Actual End',
          'Measure Package', 'Measure', 'Responsible User']].to_numpy()

  y = np.array(df['late'])

  # preprocessing the data
  X = preprocessing.scale(X)

  # Support Vector Machine
  clf = svm.SVC(decision_function_shape='ovo')
  clf.fit(X, y)
  print(clf.score(X, y))

I am now trying to understand how to plot the decision boundaries. My goal is to plot a 2-way scatter with Actual End and Planned End. Naturally, I checked the documentation (see e.g. here), but I can't wrap my head around it. How does this work?

Solution

As a heads up for the future, you'll generally get faster (and better) responses if you provide a publicly available dataset with your attempted plotting code, since we don't have 'April.csv'. You can also leave out your data-wrangling code for 'April.csv'. With that said...

Sebastian Raschka created the mlxtend package, which has a pretty awesome plotting function for doing this. It uses matplotlib under the hood.

import numpy as np
import pandas as pd
from sklearn import svm
from mlxtend.plotting import plot_decision_regions
import matplotlib.pyplot as plt


# Create arbitrary dataset for example
df = pd.DataFrame({'Planned_End': np.random.uniform(low=-5, high=5, size=50),
                   'Actual_End':  np.random.uniform(low=-1, high=1, size=50),
                   'Late':        np.random.randint(low=0, high=3, size=50)}  # classes 0, 1, 2 (high is exclusive)
)

# Fit Support Vector Machine Classifier
X = df[['Planned_End', 'Actual_End']]
y = df['Late']

clf = svm.SVC(decision_function_shape='ovo')
clf.fit(X.values, y.values)

# Plot Decision Region using mlxtend's awesome plotting function
plot_decision_regions(X=X.values,
                      y=y.values,
                      clf=clf,
                      legend=2)

# Update plot object with X/Y axis labels and Figure Title
plt.xlabel(X.columns[0], size=14)
plt.ylabel(X.columns[1], size=14)
plt.title('SVM Decision Region Boundary', size=16)

# Display the figure
plt.show()
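
If you'd rather not add a dependency, here is a rough sketch of the same idea in plain matplotlib. It is not mlxtend's actual implementation, just the usual approach: build a grid over the two features, ask the classifier for a prediction at every grid point, and draw the predictions as filled contours. The data is random, and the helper names (`make_grid`, `plot_regions`) and the default grid size and padding are made up for this example. Note that the classifier has to be fit on exactly the two columns being plotted, as in the code above.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn import svm

# Arbitrary example data (assumption: two features, three classes)
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-5, 5, 50),    # stands in for Planned_End
                     rng.uniform(-1, 1, 50)])   # stands in for Actual_End
y = rng.integers(0, 3, 50)                      # stands in for Late (0, 1, 2)

clf = svm.SVC(decision_function_shape='ovo')
clf.fit(X, y)


def make_grid(X, steps=200, pad=0.5):
    """Return a steps x steps mesh covering both features, padded a little."""
    x_min, x_max = X[:, 0].min() - pad, X[:, 0].max() + pad
    y_min, y_max = X[:, 1].min() - pad, X[:, 1].max() + pad
    return np.meshgrid(np.linspace(x_min, x_max, steps),
                       np.linspace(y_min, y_max, steps))


def plot_regions(clf, X, y):
    """Shade the predicted class of every grid point, then overlay the data."""
    xx, yy = make_grid(X)
    # Predict a class for each grid point and reshape back to the grid
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    fig, ax = plt.subplots()
    ax.contourf(xx, yy, Z, alpha=0.3)                    # decision regions
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')    # training points
    ax.set_xlabel('Planned_End')
    ax.set_ylabel('Actual_End')
    ax.set_title('SVM Decision Regions (manual grid)')
    return fig, Z


fig, Z = plot_regions(clf, X, y)
```

Call `plt.show()` afterwards to display the figure. The borders between the shaded areas are the decision boundaries; a finer grid (larger `steps`) gives smoother borders at the cost of more predictions.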

