How loess predicts at new x values

Question

I am trying to understand how the predict.loess function computes new predicted values (y_hat) at points x that do not exist in the original data. For example (this is a deliberately simple example, and I realize loess is not needed for data like this, but it illustrates the point):

x <- 1:10
y <- x^2
mdl <- loess(y ~ x)
predict(mdl, 1.5)
[1] 2.25

loess regression works by fitting a local polynomial around each x, and so it produces a predicted y_hat at each x. However, because no coefficients are stored, the "model" in this case is simply the settings that were used to predict each y_hat, for example the span or degree. When I call predict(mdl, 1.5), how is predict able to produce a value at this new x? Is it interpolating between the two nearest existing x values and their associated y_hat? If so, what are the details of how it does this?

I have read the cloess documentation online but cannot find where it discusses this.

Recommended answer

You may have used print(mdl), or simply typed mdl, to see what the model contains, but that output is only a short summary. The fitted object is considerably more complicated and stores a large number of parameters.

To get an idea of what is inside, you can run unlist(mdl) and look through the long list of parameters it returns.
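A shorter way to look inside the object, sketched here on the data from the question, is to list its components and read the stored fitting settings. The component names used below (pars, and kd for the default interpolated surface) are what a fitted loess object normally carries, but treat them as something to confirm with names(mdl) on your own R version rather than as a guaranteed interface.

```r
x <- 1:10
y <- x^2
mdl <- loess(y ~ x)

# Top-level components of the fitted object
names(mdl)

# Settings used for the local fits (defaults: span 0.75, degree 2)
mdl$pars$span
mdl$pars$degree

# One-level overview, usually easier to read than unlist(mdl)
str(mdl, max.level = 1)
```

With the default settings the object should also contain a kd component, which holds the information that predict uses to evaluate the fitted surface at new points.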

This is the part of the manual for the command that describes how the fitting works:

For the default family, fitting is by (weighted) least squares. For family="symmetric" a few iterations of an M-estimation procedure with Tukey's biweight are used. Be aware that as the initial value is the least-squares fit, this need not be a very resistant fit.

What I believe is that it fits a polynomial model in the neighborhood of every point (not a single polynomial for the whole data set). The neighborhood does not mean only one point before and one point after. If I were implementing such a function, I would put a large weight on the points nearest to x and smaller weights on more distant points, and then fit the polynomial by weighted least squares so that the heavily weighted points dominate the fit.
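The weighting idea described above can be sketched by hand. This is only an illustration and not the code loess runs internally: local_quadratic is a hypothetical helper, the tricube weight function and the rule for choosing the neighborhood size are assumptions made for the sketch, and lm with weights stands in for the local fitting step.

```r
# Hypothetical helper, not part of the stats package.
# Fits a weighted quadratic around x0 and evaluates it at x0.
local_quadratic <- function(x, y, x0, span = 0.75) {
  n <- length(x)
  q <- max(3, floor(n * span))     # assumed neighborhood size
  d <- abs(x - x0)                 # distance of each data point from x0
  h <- sort(d)[q]                  # distance to the q-th nearest point
  # Tricube weights: close to 1 near x0, falling to 0 at distance h
  w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)
  fit <- lm(y ~ x + I(x^2), weights = w)
  unname(predict(fit, newdata = data.frame(x = x0)))
}

x <- 1:10
y <- x^2
local_quadratic(x, y, 1.5)
```

Because x0 can be any value, nothing in this scheme requires it to be one of the original data points. On the question's data the local quadratic reproduces x^2, so the result at 1.5 matches the 2.25 shown in the question up to floating-point error.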

Then, if the given x' whose value should be predicted is closest to a data point x, I would take the polynomial fitted on the neighborhood of x, call it P, and evaluate it at x'. That value, P(x'), would be the prediction.
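For comparison with that guess, the documentation for loess.control describes a surface argument: "interpolate" (the default) evaluates the fit by interpolation from a k-d tree, while "direct" computes the local fit exactly at each requested point. The sketch below fits the question's data both ways and predicts at a few arbitrary new x values chosen for illustration.

```r
x <- 1:10
y <- x^2

# Default: surface = "interpolate"
fit_interp <- loess(y ~ x)

# Exact local fit at every requested point
fit_direct <- loess(y ~ x, control = loess.control(surface = "direct"))

# New x values that are not in the original data
new_x <- data.frame(x = c(1.5, 4.25, 7.8))

predict(fit_interp, new_x)
predict(fit_direct, new_x)
```

On this exactly quadratic data both variants should return x^2 at the new points. On noisy data the two can differ slightly, since the interpolated surface is an approximation to the direct local fit.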

Let me know if you are looking for something more specific.

