Generating samples from a two-Gaussian mixture in R (code given in MATLAB)

Problem description

I'm trying to write the R equivalent of the following MATLAB function, which generates n samples from a mixture of N(m1, s1^2) and N(m2, s2^2), with a fraction alpha of the samples coming from the first Gaussian.

I have a start, but the results differ notably between MATLAB and R: the MATLAB results occasionally include values near ±8, while the R version never even reaches ±5. Please help me sort out what is wrong here. Thanks :-)

For example: plot 1000 samples from a mixture of N(0,1) and N(0,36), with 95% of the samples from the first Gaussian. Normalize the samples to mean zero and standard deviation one.

MATLAB

Function

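function y = gaussmix(n,m1,m2,s1,s2,alpha)
% Draw n samples from a mixture of N(m1,s1^2) and N(m2,s2^2);
% each sample comes from the first Gaussian with probability alpha.
U = rand(n,1);
I = (U < alpha);   % 1 where the first Gaussian is selected, 0 otherwise
y = I.*(randn(n,1)*s1+m1) + (1-I).*(randn(n,1)*s2 + m2);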

Usage

P = gaussmix(1000,0,0,1,6,.95)
P = (P-mean(P))/std(P)
plot(P)
axis([0 1000 -15 15])
hist(P)
axis([-15 15 0 1000])

Resulting plot

Resulting histogram

R

yn <- rbinom(1000, 1, .95)
s <- rnorm(1000, 0 + 0*yn, 1 + 36*yn)
sn <- (s-mean(s))/sd(s)
plot(sn, xlim=range(0,1000), ylim=range(-15,15))
hist(sn, xlim=range(-15,15), ylim=range(0,1000))

Resulting plot

Resulting histogram

As always, thanks!

Solution

gaussmix <- function(nsim,mean_1,mean_2,std_1,std_2,alpha){
   U <- runif(nsim)
   I <- as.numeric(U<alpha)
   y <- I*rnorm(nsim,mean=mean_1,sd=std_1)+
       (1-I)*rnorm(nsim,mean=mean_2,sd=std_2)
   return(y)
}

z1 <- gaussmix(1000,0,0,1,6,0.95)
z1_standardized <- (z1-mean(z1))/sqrt(var(z1))
z2 <- gaussmix(1000,0,3,1,1,0.80)
z2_standardized <- (z2-mean(z2))/sqrt(var(z2))
z3 <- rlnorm(1000)
z3_standardized <- (z3-mean(z3))/sqrt(var(z3))

par(mfrow=c(2,3))
hist(z1_standardized,xlim=c(-10,10),ylim=c(0,500),
   main="Histogram of 95% of N(0,1) and 5% of N(0,36)",
   col="blue",xlab=" ")
hist(z2_standardized,xlim=c(-10,10),ylim=c(0,500),
   main="Histogram of 80% of N(0,1) and 20% of N(3,1)",
   col="blue",xlab=" ")
hist(z3_standardized,xlim=c(-10,10),ylim=c(0,500),
   main="Histogram of samples of LN(0,1)",col="blue",xlab=" ")
##
plot(z1_standardized,type='l',
   main="1000 samples from a mixture N(0,1) and N(0,36)",
   col="blue",xlab="Samples",ylab="Standardized value",ylim=c(-10,10))
plot(z2_standardized,type='l',
   main="1000 samples from a mixture N(0,1) and N(3,1)",
   col="blue",xlab="Samples",ylab="Standardized value",ylim=c(-10,10))
plot(z3_standardized,type='l',
   main="1000 samples from LN(0,1)",
   col="blue",xlab="Samples",ylab="Standardized value",ylim=c(-10,10))

Recommended answer

There are two problems, I think. (1) Your R code creates a mixture of normal distributions with standard deviations of 1 and 37, because 1 + 36*yn evaluates to 1 or 37 and rnorm() takes a standard deviation, not a variance. (2) By setting prob equal to alpha in your rbinom() call, you get a fraction alpha in the second mode rather than the first. So what you are getting is a distribution that is mostly a Gaussian with sd 37, contaminated by a 5% mixture of a Gaussian with sd 1, rather than a Gaussian with sd 1 contaminated by a 5% mixture of a Gaussian with sd 6. Scaling by the standard deviation of the mixture (which is about 36.1) basically reduces it to a standard Gaussian with a slight bump near the origin.

(The other answers posted here do solve your problem perfectly well, but I thought you might be interested in a diagnosis.)
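The diagnosis can be checked numerically. The following is a rough sketch rather than part of the original answer: it reuses the question's rbinom() call with a larger sample, and the variable names (sd_used, frac_wide, sd_theory, sd_sample) and the seed are chosen here purely for illustration.

```r
# Illustrative check of the diagnosis; names and seed are assumptions of this sketch.
set.seed(42)                        # arbitrary seed, only for reproducibility
n  <- 100000
yn <- rbinom(n, 1, 0.95)            # 1 with probability 0.95, as in the question
sd_used <- 1 + 36 * yn              # the sd actually passed to rnorm(): 1 or 37

sd_values <- sort(unique(sd_used))  # the two standard deviations in play
frac_wide <- mean(sd_used == 37)    # share of draws taken from the wide component

# Theoretical sd of this zero-mean mixture: sqrt(0.95 * 37^2 + 0.05 * 1^2)
sd_theory <- sqrt(0.95 * 37^2 + 0.05 * 1^2)

s <- rnorm(n, 0, sd_used)
sd_sample <- sd(s)                  # should land close to sd_theory

print(sd_values)
print(frac_wide)
print(c(theory = sd_theory, sample = sd_sample))
```

If the diagnosis holds, about 95% of the draws should use sd 37, and the sample standard deviation should sit near the theoretical value of roughly 36.1.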

A more compact (and perhaps more idiomatic) version of your MATLAB gaussmix function (I think runif(n)<alpha is slightly more efficient than rbinom(n,size=1,prob=alpha)):

gaussmix <- function(n,m1,m2,s1,s2,alpha) {
    I <- runif(n)<alpha
    rnorm(n,mean=ifelse(I,m1,m2),sd=ifelse(I,s1,s2))
}
set.seed(1001)
s <- gaussmix(1000,0,0,1,6,0.95)
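For comparison, here is a minimal sketch of how the question's original rbinom() approach might look with both issues corrected. The function name gaussmix_rbinom and the helper variables are hypothetical and do not come from the thread. A larger sample is used so that the tail fraction is stable.

```r
# Sketch only: the question's rbinom() approach with both issues corrected.
# gaussmix_rbinom is a hypothetical name introduced for this illustration.
gaussmix_rbinom <- function(n, m1, m2, s1, s2, alpha) {
  first <- rbinom(n, size = 1, prob = alpha) == 1   # TRUE -> first Gaussian
  # Pass standard deviations (s1, s2), not variances, to rnorm()
  rnorm(n, mean = ifelse(first, m1, m2), sd = ifelse(first, s1, s2))
}

set.seed(1001)
x <- gaussmix_rbinom(100000, 0, 0, 1, 6, 0.95)

# Theoretical sd of the zero-mean mixture: sqrt(0.95 * 1^2 + 0.05 * 6^2) = sqrt(2.75)
sd_theory <- sqrt(0.95 * 1^2 + 0.05 * 6^2)
sd_sample <- sd(x)

# Standardize, then measure how often values fall beyond +-5
xs <- (x - mean(x)) / sd(x)
frac_beyond_5 <- mean(abs(xs) > 5)

print(c(theory = sd_theory, sample = sd_sample))
print(frac_beyond_5)
```

With the corrected mixture, the standard deviation is only about 1.66, so after standardizing, a small but non-negligible share of samples should fall beyond ±5. That would be consistent with the occasional large values seen in the MATLAB output.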
