Problem description
I am developing an Android app that uses sensor data from the phone to classify activities. I also much prefer scikit-learn to any of the Java machine learning libraries, so I created a very minimal REST API using Django and scikit-learn that trains support vector machines on the sensor data and returns model information.
My question is this: how can I use the model that scikit-learn produces to make predictions on my phone? So far I've considered extending the API so that whenever the phone wants a prediction, it sends the data to the API and gets one back. But I'd much rather write some Java code or use a Java library to do the predicting. Sending training data to the API isn't a problem, because that isn't done in real time; it only happens once the data has already been collected. Sending data for real-time predictions doesn't seem workable, however.
Doing this with logistic regression is a lot easier, as the prediction formula and model parameters are pretty simple. I could abandon SVMs and use that instead, but I'd also like to have SVMs available.
Is anyone aware of someone having done this before? Is there a way to do this that is doable in a relatively short time by someone without a PhD in numerical computing or machine learning? Detailed steps aren't necessary, just an outline of how to use the components of the SVM that scikit-learn produces.
Recommended answer
Most packages that provide SVMs (scikit-learn included) rely on the libsvm implementation. But you don't need 99% of the libsvm code, and you don't have to have a PhD, because once the model has been trained in scikit-learn you already have all the parameters. All you need is a simple linear algebra library in Java (only for vector-by-vector operations) to implement the decision function.
If you are using a linear kernel in SVC, it's relatively easy, because scikit-learn automatically converts all those complicated dual coefficients and support vectors into simple hyperplane coefficients. The decision function then has the same linear form as in logistic regression, and all you need is a dot product. See "Exporting SVM classifiers from sklearn to Java codebase".
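As a minimal sketch of the linear case, assuming a binary classifier trained with `SVC(kernel="linear")` whose weights and bias were copied out of `clf.coef_[0]` and `clf.intercept_[0]`: the class name `LinearSvmPredictor` and its methods are hypothetical, and how the numbers reach the phone (for example as JSON from the REST API) is left open.

```java
/**
 * Hypothetical helper: evaluates a linear SVM decision function on-device.
 * Assumes a binary classifier; multi-class models need one score per class pair.
 */
final class LinearSvmPredictor {
    private final double[] coef;     // hyperplane weights, assumed copied from clf.coef_[0]
    private final double intercept;  // bias term, assumed copied from clf.intercept_[0]

    LinearSvmPredictor(double[] coef, double intercept) {
        this.coef = coef.clone();
        this.intercept = intercept;
    }

    /** Signed score w . x + b; its sign decides the class. */
    double decisionFunction(double[] x) {
        if (x.length != coef.length) {
            throw new IllegalArgumentException("expected " + coef.length + " features");
        }
        double sum = intercept;
        for (int i = 0; i < coef.length; i++) {
            sum += coef[i] * x[i];
        }
        return sum;
    }

    /** Returns 1 for the class stored in clf.classes_[1], otherwise 0. */
    int predict(double[] x) {
        return decisionFunction(x) > 0 ? 1 : 0;
    }
}
```

Any feature scaling applied before training has to be repeated on the phone before calling `predict`, and it is worth comparing a few scores against `clf.decision_function` on the server to confirm the parameters were exported correctly.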
With a non-linear kernel, again only the decision function is needed, but now you have to understand what support vectors, dual coefficients, and kernels are, and you have to implement your non-linear kernel in Java. I don't think it's an easy task to implement the decision function for a non-linear SVC without understanding how the SVC optimization process works. I'll give you some links:
Alternatively, you can find any SVM library for Java and train a model there with the same parameters you chose in SVC (C, eps, etc.). I think this is the easiest solution for non-linear kernels. SVM is a well-known method, and I think that training with the same parameters and dataset will give the same results on any good implementation (besides, most implementations and bindings, as I said, rely on libsvm, in which case equality is guaranteed).
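For anyone who does implement the non-linear decision function by hand rather than retraining, here is a rough sketch for one specific case: a binary SVC with an RBF kernel. It assumes the arrays were exported from `clf.support_vectors_`, `clf.dual_coef_[0]`, and `clf.intercept_[0]`, and that `gamma` was set to an explicit number at training time so the same value can be reused. The class name `RbfSvmPredictor` is hypothetical, and other kernels or multi-class models need a different formula.

```java
/**
 * Hypothetical helper: evaluates a binary RBF-kernel SVM decision function,
 *   f(x) = sum_i dualCoef[i] * exp(-gamma * ||sv_i - x||^2) + intercept
 */
final class RbfSvmPredictor {
    private final double[][] supportVectors; // assumed copied from clf.support_vectors_
    private final double[] dualCoef;         // assumed copied from clf.dual_coef_[0]
    private final double intercept;          // assumed copied from clf.intercept_[0]
    private final double gamma;              // numeric kernel width used in training

    RbfSvmPredictor(double[][] supportVectors, double[] dualCoef,
                    double intercept, double gamma) {
        if (supportVectors.length != dualCoef.length) {
            throw new IllegalArgumentException("one dual coefficient per support vector");
        }
        this.supportVectors = supportVectors;
        this.dualCoef = dualCoef;
        this.intercept = intercept;
        this.gamma = gamma;
    }

    /** RBF kernel: exp(-gamma * squared Euclidean distance). */
    static double rbfKernel(double[] a, double[] b, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            squaredDistance += d * d;
        }
        return Math.exp(-gamma * squaredDistance);
    }

    /** Weighted sum of kernel values against every support vector, plus the bias. */
    double decisionFunction(double[] x) {
        double sum = intercept;
        for (int i = 0; i < supportVectors.length; i++) {
            sum += dualCoef[i] * rbfKernel(supportVectors[i], x, gamma);
        }
        return sum;
    }

    /** Returns 1 for the class stored in clf.classes_[1], otherwise 0. */
    int predict(double[] x) {
        return decisionFunction(x) > 0 ? 1 : 0;
    }
}
```

Prediction cost grows with the number of support vectors, so a model with thousands of them may be noticeably slower on a phone than the linear case. Checking a handful of inputs against `clf.decision_function` is a cheap way to catch export or sign mistakes.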