EM Algorithm for Gaussian Mixture Models

This post walks through a Matlab implementation of the EM algorithm for Gaussian Mixture Models, the numerical problems (underflow and near-singular covariances) that come up along the way, and a suggested fix.

Problem description

Using the EM algorithm, I want to train a Gaussian Mixture model with four components on a given dataset. The set is three-dimensional and contains 300 samples.

The problem is that after about 6 rounds of the EM algorithm, the covariance matrices sigma become close to singular according to Matlab (rank(sigma) = 2 instead of 3). This in turn leads to undesired results like complex values when evaluating the gaussian distribution gm(k,i).

Furthermore, I used the log of the gaussian to account for underflow troubles - see the E-step. I am not sure if this is correct, and whether I have to take the exp of the responsibilities p(w_k | x^(i), theta) somewhere else?

Can you tell me if my implementation of the EM algorithm is correct so far? And how to account for the problem with the close-to-singular covariance sigma?

Here is my implementation of the EM algorithm:

First I initialized the means and the covariance of the components using kmeans:

load('data1.mat');

X = Data'; % 300x3 data set
D = size(X,2); % dimension
N = size(X,1); % number of samples
K = 4; % number of Gaussian Mixture components

% Initialization
p = [0.2, 0.3, 0.2, 0.3]; % arbitrary pi
[idx,mu] = kmeans(X,K); % initial means of the components

% compute the covariance of the components
sigma = zeros(D,D,K);
for k = 1:K
    sigma(:,:,k) = cov(X(idx==k,:));
end

For the E-step I am using the following formula to calculate the responsibilities:

res(k,i) = p(w_k | x^(i), theta) = p(k) * N(x^(i) | mu(k,:), sigma(:,:,k)) / sum_{j=1}^{K} p(j) * N(x^(i) | mu(j,:), sigma(:,:,j))

w_k is the k-th gaussian component.

x^(i) is a single data point (sample).

theta stands for the parameters of the gaussian mixture model: mu, Sigma, pi.

Here is the corresponding code:

% variables for convergence
converged = 0;
prevLoglikelihood = Inf;
prevMu = mu;
prevSigma = sigma;
prevPi = p;
round = 0;
while (converged ~= 1)
    round = round +1
    gm = zeros(K,N); % gaussian component in the numerator
    sumGM = zeros(N,1); % denominator of responsibilities
    % E-step:  Evaluate the responsibilities using the current parameters
    % compute the numerator and denominator of the responsibilities
    for k = 1:K
        for i = 1:N
             Xmu = X(i,:)-mu(k,:);
             % I am using log to prevent underflow of the gaussian distribution (exp("small value"))
             logPdf = log(1/sqrt(det(sigma(:,:,k))*(2*pi)^D)) + (-0.5*Xmu*(sigma(:,:,k)\Xmu'));
             gm(k,i) = log(p(k)) + logPdf;
             sumGM(i) = sumGM(i) + gm(k,i);
         end
    end

    % calculate responsibilities
    res = zeros(K,N); % responsibilities
    Nk = zeros(4,1);
    for k = 1:K
        for i = 1:N
            % I tried to use the exp(gm(k,i)/sumGM(i)) to compute res but this leads to sum(pi) > 1.
            res(k,i) = gm(k,i)/sumGM(i);
        end
        Nk(k) = sum(res(k,:));
    end

Nk(k) = sum_{i=1}^{N} res(k,i), as computed above; it is used in the M-step to calculate the new mixing probabilities p(k) = Nk(k)/N.

M-step

    % M-step: Re-estimate the parameters using the current responsibilities
    for k = 1:K
        for i = 1:N
            mu(k,:) = mu(k,:) + res(k,i).*X(k,:);
            sigma(:,:,k) = sigma(:,:,k) + res(k,i).*(X(k,:)-mu(k,:))*(X(k,:)-mu(k,:))';
        end
        mu(k,:) = mu(k,:)./Nk(k);
        sigma(:,:,k) = sigma(:,:,k)./Nk(k);
        p(k) = Nk(k)/N;
    end
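
Two details in this M-step are worth flagging: mu(k,:) and sigma(:,:,k) are never cleared between rounds, so each iteration accumulates on top of the previous values, and the data is indexed as X(k,:) where X(i,:) looks intended. A corrected, vectorized sketch under the same variable names (a reconstruction, not code from the original post) might read:

    % Sketch of a corrected M-step (assumes res is K-by-N and X is N-by-D):
    for k = 1:K
        Nk(k) = sum(res(k,:));
        mu(k,:) = (res(k,:) * X) ./ Nk(k);    % weighted mean, recomputed from scratch
        Xc = X - repmat(mu(k,:), N, 1);       % center the data at the new mean
        W  = Xc .* repmat(res(k,:)', 1, D);   % weight each centered sample
        sigma(:,:,k) = (Xc' * W) ./ Nk(k);    % weighted covariance
        p(k) = Nk(k) / N;
    end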

Now, in order to check for convergence, the log-likelihood is computed as loglikelihood = sum_{i=1}^{N} log(sum_{k=1}^{K} gm(k,i)):

    % Evaluate the log-likelihood and check for convergence of either
    % the parameters or the log-likelihood. If not converged, go to E-step.
    loglikelihood = 0;
    for i = 1:N
        loglikelihood = loglikelihood + log(sum(gm(:,i)));
    end
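    % Caution: gm(k,i) above holds log-domain values, so sum(gm(:,i)) is not a
    % sum of densities and this expression is not the log-likelihood. A stable
    % evaluation uses the log-sum-exp identity
    %   log(sum_k exp(g_k)) = m + log(sum_k exp(g_k - m)),  with m = max_k g_k,
    % as sketched in code after the solution at the end of this post.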


    % Check for convergence of parameters
    errorLoglikelihood = abs(loglikelihood-prevLoglikelihood);
    if (errorLoglikelihood <= eps)
        converged = 1;
    end

    errorMu = abs(mu(:)-prevMu(:));
    errorSigma = abs(sigma(:)-prevSigma(:));
    errorPi = abs(p(:)-prevPi(:));

    if (all(errorMu <= eps) && all(errorSigma <= eps) && all(errorPi <= eps))
        converged = 1;
    end

    prevLoglikelihood = loglikelihood;
    prevMu = mu;
    prevSigma = sigma;
    prevPi = p;

end % while
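
One small remark on the convergence test: eps in the checks above is Matlab's built-in machine epsilon (about 2.2e-16), an extremely strict tolerance that the loop may never satisfy. An explicit, looser tolerance is more conventional; a sketch with a hypothetical value:

tol = 1e-6;                                % hypothetical tolerance, tune as needed
if (errorLoglikelihood <= tol)
    converged = 1;
end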

Is there something wrong with my Matlab implementation of the EM algorithm for Gaussian Mixture Models?


Previous troubles:

The problem is that I cannot check for convergence using the log-likelihood, because it is -Inf. This results from values that underflow to zero when evaluating the gaussian in the responsibility formula (see the E-step).

Can you tell me if my implementation of the EM algorithm is correct so far? And how to account for the problem with the values that underflow to zero?

Here is my implementation of the EM algorithm:

First I initialized the means and the covariance of the components using kmeans:

load('data1.mat');

X = Data'; % 300x3 data set
D = size(X,2); % dimension
N = size(X,1); % number of samples
K = 4; % number of Gaussian Mixture components

% Initialization
p = [0.2, 0.3, 0.2, 0.3]; % arbitrary pi
[idx,mu] = kmeans(X,K); % initial means of the components

% compute the covariance of the components
sigma = zeros(D,D,K);
for k = 1:K
    sigma(:,:,k) = cov(X(idx==k,:));
end

For the E-step I am using the same formula as above to calculate the responsibilities.

Here is the corresponding code:

% variables for convergence
converged = 0;
prevLoglikelihood = Inf;
prevMu = mu;
prevSigma = sigma;
prevPi = p;
round = 0;
while (converged ~= 1)
    round = round +1
    gm = zeros(K,N); % gaussian component in the numerator -
                     % some values evaluate to zero
    sumGM = zeros(N,1); % denominator of responsibilities
    % E-step:  Evaluate the responsibilities using the current parameters
    % compute the numerator and denominator of the responsibilities
    for k = 1:K
        for i = 1:N
             % HERE values evaluate to zero, e.g. exp(-746.6228) underflows to 0
             gm(k,i) = p(k)/sqrt(det(sigma(:,:,k))*(2*pi)^D)*exp(-0.5*(X(i,:)-mu(k,:))*inv(sigma(:,:,k))*(X(i,:)-mu(k,:))');
             sumGM(i) = sumGM(i) + gm(k,i);
         end
    end

    % calculate responsibilities
    res = zeros(K,N); % responsibilities
    Nk = zeros(4,1);
    for k = 1:K
        for i = 1:N
            res(k,i) = gm(k,i)/sumGM(i);
        end
        Nk(k) = sum(res(k,:));
    end

Nk(k) = sum_{i=1}^{N} res(k,i), as computed above, and is used in the M-step.

M-step

    % M-step: Re-estimate the parameters using the current responsibilities
    mu = zeros(K,3);
    for k = 1:K
        for i = 1:N
            mu(k,:) = mu(k,:) + res(k,i).*X(k,:);
            sigma(:,:,k) = sigma(:,:,k) + res(k,i).*(X(k,:)-mu(k,:))*(X(k,:)-mu(k,:))';
        end
        mu(k,:) = mu(k,:)./Nk(k);
        sigma(:,:,k) = sigma(:,:,k)./Nk(k);
        p(k) = Nk(k)/N;
    end

Now, in order to check for convergence, the log-likelihood is computed using this formula:

    % Evaluate the log-likelihood and check for convergence of either
    % the parameters or the log-likelihood. If not converged, go to E-step.
    loglikelihood = 0;
    for i = 1:N
        loglikelihood = loglikelihood + log(sum(gm(:,i)));
    end


    % Check for convergence of parameters
    errorLoglikelihood = abs(loglikelihood-prevLoglikelihood);
    if (errorLoglikelihood <= eps)
        converged = 1;
    end

    errorMu = abs(mu(:)-prevMu(:));
    errorSigma = abs(sigma(:)-prevSigma(:));
    errorPi = abs(p(:)-prevPi(:));

    if (all(errorMu <= eps) && all(errorSigma <= eps) && all(errorPi <= eps))
        converged = 1;
    end

    prevLoglikelihood = loglikelihood;
    prevMu = mu;
    prevSigma = sigma;
    prevPi = p;

end % while

After the first round the loglikelihood is around 700. In the second round it is -Inf because some gm(k,i) values in the E-step are zero. Therefore the log is obviously negative infinity.

The zero values also lead to sumGM equals to zero and therefore leading to all NaN entries inside the mu and sigma matrices.

How can I solve this problem? Can you tell me if there is something wrong with my implementation? Could it be solved by increasing Matlab's precision somehow?


EDIT:

I added a scaling for the exp() term in gm(k,i). Unfortunately this doesn't help much. After some more rounds I still get the underflow problem.

scale = zeros(N,D);
for i = 1:N
    max = 0;
    for k = 1:K
        Xmu = X(i,:)-mu(k,:);
        if (norm(scale(i,:) - Xmu) > max)
            max = norm(scale(i,:) - Xmu);
            scale(i,:) = Xmu;
        end
    end
end


for k = 1:K
    for i = 1:N
        Xmu = X(i,:)-mu(k,:);
        % scale gm to prevent underflow
        Xmu = Xmu - scale(i,:);
        gm(k,i) = p(k)/sqrt(det(sigma(:,:,k))*(2*pi)^D)*exp(-0.5*Xmu*inv(sigma(:,:,k))*Xmu');
        sumGM(i) = sumGM(i) + gm(k,i);
    end
end
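
(A note on why this form of scaling falls short: subtracting scale(i,:) from Xmu inside the quadratic form does not multiply gm(k,i) by a common per-sample constant, because (Xmu-s)*inv(sigma(:,:,k))*(Xmu-s)' does not split into Xmu*inv(sigma(:,:,k))*Xmu' plus a k-independent term. A common constant has to be added to the exponent itself, which is what the solution at the end of this post does.)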

Further I noticed that kmeans initializes the means completely differently compared to the following rounds, where the means are computed in the M-step.

kmeans:

mu =   13.500000000000000   0.026602138870044   0.062415945993735
       88.500000000000000  -0.009869960132085  -0.075177888210981
       39.000000000000000  -0.042569305020309   0.043402772876513
       64.000000000000000  -0.024519281362918  -0.012586980924762

after M-step:

round = 2

mu = 1.000000000000000   0.077230046948357   0.024498886414254
     2.000000000000000   0.074260118474053   0.026484346404660
     3.000000000000002   0.070944016105476   0.029043085983168
     4.000000000000000   0.067613431480832   0.031641849205021

In the next rounds mu doesn't change at all. It stays the same as in round 2.

I guess this is caused by the underflow in gm(k,i)? Either my implementation of the scaling is incorrect, or the whole implementation of the algorithm is wrong somewhere :(


EDIT 2

After four rounds I got NaN values and looked into gm in more detail. Looking at only one sample (and without the 0.5 factor), gm becomes zero in all components. In Matlab: gm(:,1) = [0 0 0 0]. This in turn leads to sumGM equal to zero -> NaN, because I divided by zero. I have given more details below:

round = 1

mu = 62.0000   -0.0298   -0.0078
     37.0000   -0.0396    0.0481
     87.5000   -0.0083   -0.0728
     12.5000    0.0303    0.0614

gm(:,1) = [11.7488, 0.0000, 0.0000, 0.0000]


round = 2

mu = 1.0000    0.0772    0.0245
     2.0000    0.0743    0.0265
     3.0000    0.0709    0.0290
     4.0000    0.0676    0.0316


gm(:,1) = [0.0000, 0.0000, 0.0000, 0.3128]

round = 3

mu = 1.0000    0.0772    0.0245
     2.0000    0.0743    0.0265
     3.0000    0.0709    0.0290
     4.0000    0.0676    0.0316


gm(:,1) = [0, 0, 0.0000, 0.2867]


round = 4


mu = 1.0000    0.0772    0.0245
        NaN       NaN       NaN
     3.0000    0.0709    0.0290
     4.0000    0.0676    0.0316

gm(:,1) = 1.0e-105 * [0, NaN, 0, 0.5375]

First of all, the means don't seem to change, and they are completely different compared to the initialization of kmeans.

And every sample (not just the first one shown here) corresponds to only one gaussian component, according to the output of gm(:,1). Shouldn't each sample be "partially distributed" among every gaussian component?


EDIT 3:

So I guess the problem with mu not changing was the first line in the M-step: mu = zeros(K,3);.

To account for the underflow problem, I am currently trying to use the log of the gaussian:

function logPdf = logmvnpdf(X, mu, sigma, D)
    Xmu = X-mu;
    logPdf = log(1/sqrt(det(sigma)*(2*pi)^D)) + (-0.5*Xmu*inv(sigma)*Xmu');
end
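
As an aside, the same log-density can be computed without inv() and det() by working with the Cholesky factor; this is cheaper, more accurate, and fails loudly (chol raises an error) instead of silently producing complex values once sigma stops being positive definite. A sketch (logmvnpdf2 is a hypothetical helper, not from the original post):

function logPdf = logmvnpdf2(X, mu, sigma, D)
    % Cholesky-based evaluation: sigma = R'*R with R upper triangular.
    Xmu = X - mu;                            % 1-by-D row vector
    R = chol(sigma);                         % errors if sigma is not positive definite
    z = Xmu / R;                             % z*z' equals Xmu*inv(sigma)*Xmu'
    logPdf = -0.5*(z*z') - sum(log(diag(R))) - 0.5*D*log(2*pi);
end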

The new problem is the covariance matrix sigma. Matlab claims: "Warning: Matrix is close to singular or badly scaled. Results may be inaccurate."

After 6 rounds I get imaginary values for gm (gaussian distribution).
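
The near-singular sigma is a separate issue from the underflow. A common safeguard (a suggestion, not something proposed in the original thread) is to add a small ridge to each covariance after the M-step so that it stays positive definite; the 1e-6 below is an arbitrary placeholder:

for k = 1:K
    sigma(:,:,k) = sigma(:,:,k) + 1e-6*eye(D);   % hypothetical ridge value
end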

The updated E-Step looks like this now:

gm = zeros(K,N); % gaussian component in the numerator
sumGM = zeros(N,1); % denominator of responsibilities


for k = 1:K
    for i = 1:N
        %gm(k,i) = p(k)/sqrt(det(sigma(:,:,k))*(2*pi)^D)*exp(-0.5*Xmu*inv(sigma(:,:,k))*Xmu');
        %gm(k,i) = p(k)*mvnpdf(X(i,:),mu(k,:),sigma(:,:,k));
        gm(k,i) = log(p(k)) + logmvnpdf(X(i,:), mu(k,:), sigma(:,:,k), D);
        sumGM(i) = sumGM(i) + gm(k,i);
    end
end
Solution

It looks like you should be able to use a scale factor scale(i) to bring gm(k, i) into a representable range, because if you multiply gm(k, i) by scale(i) this will end up multiplying sumGM(i) as well, and be cancelled away when you work out res(k, i) = gm(k, i) / sumGM(i).

I would make scale(i) = 1 / max_k(exp(-0.5*(X(i,:)-mu(k,:))*inv(sigma(:,:,k))*(X(i,:)-mu(k,:))')) in theory, and actually calculate it without doing the exponentiation, so you end up dealing with its log, max_k(-0.5*(X(i,:)-mu(k,:))*inv(sigma(:,:,k))*(X(i,:)-mu(k,:))'). This gives you a common term you can subtract from each exponent before using exp(), and it will keep at least the maximum within a representable range. Anything that still underflows to zero after this correction you don't care about anyway, because it is vanishingly small compared to the other contributions.
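
In code, this suggestion is the standard log-sum-exp trick. A minimal sketch reusing the question's logmvnpdf and variable names (a reconstruction of the idea above, not code from the thread):

logGm = zeros(K,N);                       % log of p(k) * N(x^(i) | mu_k, sigma_k)
for k = 1:K
    for i = 1:N
        logGm(k,i) = log(p(k)) + logmvnpdf(X(i,:), mu(k,:), sigma(:,:,k), D);
    end
end
m = max(logGm, [], 1);                    % per-sample maximum, 1-by-N
g = exp(logGm - repmat(m, K, 1));         % largest term becomes exp(0) = 1
sumGM = sum(g, 1);                        % 1-by-N, bounded below by 1
res = g ./ repmat(sumGM, K, 1);           % responsibilities; each column sums to 1
loglikelihood = sum(m + log(sumGM));      % stable log-likelihood via log-sum-exp

The per-sample constant m(i) cancels in res(k,i), exactly as described above, and the same quantities give a log-likelihood that never evaluates to -Inf as long as at least one component stays representable for each sample.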
