Question
Apologies if this has been asked already. Instead of the raw predictions (-r), I would like to get predictions in the [0, 1] interval from an SVM trained in Vowpal Wabbit by setting --loss_function hinge. Currently I'm trying the following, but it's not giving me what I want. Any thoughts?
vw -d vw_train_rand.vw -c -f svm_rand.vw --passes 10 --loss_function hinge -q cn;
vw -d vw_test_rand.vw -t -i svm_rand.vw -p preds_rand_svm.txt
Cheers,
Aaron
1) Sample data:
-1 |c Loan.TypeConventional:1 Loan.TypeFHA:0 Loan.TypeUnknown:0 Loan.TypeVA:0 |n Loan.Size:124500 LenderRank0612.0614:1939 ZipSquareMiles:53.1 MailDateMonth:5 ZipPerForeignBorn:11.4 ZipPerHighSchoolPlusDegree:57.2 ZipPerCollegePlusDegree:15.2 ZipPerVeterans:13.4 ZipPopPerSquareMile:798.1 ZipPerUnemployement:8.5 ZipSexRatio:96.7 ZipHousingUnitsPerSquareMile:315.1 ZipMedianHouseholdIncome:36238 ZipPerCapitaIncome:19085 MonthsDeedDatetoMailDate:2
-1 |c Loan.TypeConventional:1 Loan.TypeFHA:0 Loan.TypeUnknown:0 Loan.TypeVA:0 |n Loan.Size:232000 LenderRank0612.0614:391 ZipSquareMiles:99.1 MailDateMonth:5 ZipPerForeignBorn:11.8 ZipPerHighSchoolPlusDegree:73.3 ZipPerCollegePlusDegree:39.3 ZipPerVeterans:9.1 ZipPopPerSquareMile:485.5 ZipPerUnemployement:5.9 ZipSexRatio:98.5 ZipHousingUnitsPerSquareMile:169.6 ZipMedianHouseholdIncome:78465 ZipPerCapitaIncome:31908 MonthsDeedDatetoMailDate:3
-1 |c Loan.TypeConventional:1 Loan.TypeFHA:0 Loan.TypeUnknown:0 Loan.TypeVA:0 |n Loan.Size:90000 LenderRank0612.0614:130 ZipSquareMiles:32.6 MailDateMonth:5 ZipPerForeignBorn:51.5 ZipPerHighSchoolPlusDegree:60.7 ZipPerCollegePlusDegree:17.3 ZipPerVeterans:9.3 ZipPopPerSquareMile:783.2 ZipPerUnemployement:4.8 ZipSexRatio:97.2 ZipHousingUnitsPerSquareMile:274.2 ZipMedianHouseholdIncome:64668 ZipPerCapitaIncome:25632 MonthsDeedDatetoMailDate:3
-1 |c Loan.TypeConventional:0 Loan.TypeFHA:0 Loan.TypeUnknown:0 Loan.TypeVA:1 |n Loan.Size:121301 LenderRank0612.0614:23 ZipSquareMiles:6.8 MailDateMonth:5 ZipPerForeignBorn:14.9 ZipPerHighSchoolPlusDegree:63.9 ZipPerCollegePlusDegree:24.2 ZipPerVeterans:10 ZipPopPerSquareMile:5245.1 ZipPerUnemployement:7.1 ZipSexRatio:93.3 ZipHousingUnitsPerSquareMile:2001.6 ZipMedianHouseholdIncome:56398 ZipPerCapitaIncome:25815 MonthsDeedDatetoMailDate:2
2) What I get currently:
-1.001968
-1.000737
-1.000441
-1.001823
3) What I'd like to see: predictions in a continuous [0, 1] interval, such that each entry can be interpreted as a forecast probability of the event, e.g.:
0.012
0.009
0.010
0.0085
Recommended answer
If you want to predict probabilities, you should train with --loss_function=logistic and test with --link=logistic. The hinge loss (used in SVMs) results in a max-margin classifier, which is not suitable for predicting probabilities.
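To make the effect of --link=logistic concrete, here is a minimal shell sketch of the logistic (sigmoid) function that the link applies to a raw score, mapping it into (0, 1). It assumes awk is available. The `sigmoid` helper and the sample scores are hypothetical illustrations, not part of VW, and the model and prediction file names in the comments are made up for the example. Squashing scores from a hinge-trained model this way would put them in (0, 1) but would not make them calibrated probabilities, which is why the model should be trained with the logistic loss.

```shell
# With VW installed, the suggested workflow would look roughly like this
# (input file names come from the question; logistic_rand.vw and
# preds_rand_logistic.txt are hypothetical names):
#
#   vw -d vw_train_rand.vw -c -f logistic_rand.vw --passes 10 --loss_function logistic -q cn
#   vw -d vw_test_rand.vw -t -i logistic_rand.vw --link logistic -p preds_rand_logistic.txt

# Hypothetical helper: reads one raw score per line and prints
# 1 / (1 + exp(-score)) for each of them.
sigmoid() {
  awk '{ printf "%.6f\n", 1 / (1 + exp(-$1)) }'
}

# Hypothetical raw scores: strongly negative, neutral, and positive.
printf '%s\n' -4.5 0 2.0 | sigmoid
```

A raw score of 0 maps to 0.5, large negative scores approach 0, and large positive scores approach 1.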
Note that just using --loss_function=hinge does not turn VW into a kernel SVM (there is no kernel). If you want a support vector machine with a radial basis function kernel trained in an online fashion, use --ksvm --kernel=rbf (see vw --ksvm -h | grep -A9 KSVM for more parameters).