




I'm plotting some log-scaled data with an overlain linear fit line, like so:

library(ggplot2)
d <- data.frame(x=1:10, y=10^(1:10 + rnorm(10)))
ggplot(d, aes(x=x, y=y)) + geom_point() +
  geom_smooth(method="lm", se=FALSE) +
  scale_y_log10()  # log-scale the y axis

It looks like the linear regression line is being calculated on the transformed data, or else it would go directly through the last point. Is that true?

I seem to remember that this is addressed in the ggplot2 text, but I can't find it now.


When ggplot renders a plot, it does so in the following order:

  1. Map variables to aesthetics (i.e., for each layer, work out which variable is associated with which aesthetic).
  2. Facet the datasets (make panels).
  3. Transform the scales (typically through any scale_ functions).
  4. Compute the statistics (in this case, the lm fit; this is where stat_ functions come in, which are typically called through geom_ functions).
  5. Train scales (work out what the overall plot dimensions should be).
  6. Map scales (work out where each layer should fit in the overall plot).
  7. Render geoms.


So the scale transformation (step 3) happens before the model is fit (step 4), and hence yes: the fit is being calculated on the transformed data.
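
One way to check this yourself is to pull the smoothed values back out of the built plot and compare them with a model fitted by hand on log10(y). The following is only a sketch: it assumes the plot uses scale_y_log10(), the seed is arbitrary and there only for reproducibility, and the names p, smooth, fit_log and by_hand are made up for illustration. layer_data() is ggplot2's helper for inspecting the data a layer ends up drawing.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed (an assumption), only to make the sketch reproducible
d <- data.frame(x = 1:10, y = 10^(1:10 + rnorm(10)))

p <- ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_y_log10()

# Layer 2 is the smooth; its y values are stored in transformed (log10) units.
smooth <- layer_data(p, 2)

# Fit the same model by hand on the transformed response and predict
# at the x positions ggplot used for the line.
fit_log <- lm(log10(y) ~ x, data = d)
by_hand <- unname(predict(fit_log, newdata = data.frame(x = smooth$x)))

# Compare the plotted line with the hand-fitted model on log10(y).
isTRUE(all.equal(smooth$y, by_hand))
```

If the comparison agrees, the line drawn by geom_smooth() is the regression of log10(y) on x, not of y on x. If you want the fit computed on the untransformed data instead, the ggplot2 documentation describes coord_trans() as applying its transformation after the statistics are computed, so swapping scale_y_log10() for coord_trans(y = "log10") may be worth trying.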

