Problem description
I am trying to learn about the "kohonen" package in R. In particular, I am trying to apply its "supersom()" function (https://www.rdocumentation.org/packages/kohonen/versions/3.0.10/topics/supersom) to some data. This function corresponds to the SOM (Self-Organizing Map) algorithm used in unsupervised machine learning.
Below (following a previous question, R error: "Error in check.data : Argument Should be Numeric"), I learned how to apply the "supersom()" function to some artificially created data containing both "factor" and "numeric" variables.
#the following code works
#load libraries
library(kohonen)
library(dplyr)
#create and format data (1000 rows: four numeric and three factor columns)
a <- rnorm(1000, 10, 10)
b <- rnorm(1000, 10, 5)
c <- rnorm(1000, 5, 5)
d <- rnorm(1000, 5, 10)
e <- sample(LETTERS[1:4], 1000, replace = TRUE, prob = c(0.25, 0.25, 0.25, 0.25))
f <- sample(LETTERS[1:5], 1000, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.2))
g <- sample(LETTERS[1:2], 1000, replace = TRUE, prob = c(0.5, 0.5))
data <- data.frame(a, b, c, d, e, f, g)
data$e = as.factor(data$e)
data$f = as.factor(data$f)
data$g = as.factor(data$g)
cols <- 1:4
data[cols] <- scale(data[cols])
#som model
som <- supersom(data = as.list(data), grid = somgrid(10, 10, "hexagonal"),
                dist.fct = "euclidean", keep.data = TRUE)
Everything works well. The problem is that when I apply the "supersom()" function to more realistic and bigger data, I get the following error:
"Error: Non-informative layers present : mean distances between objects zero"
When I look at the source code for this function (https://rdrr.io/cran/kohonen/src/R/supersom.R), I notice a reference to the same error:
if (any(sapply(meanDistances, mean) < .Machine$double.eps))
  stop("Non-informative layers present: mean distance between objects zero")
Can someone please show me how I might resolve this error, i.e. make the "supersom()" function work with both factor and numeric data?
I thought that removing duplicate rows and NAs might fix the problem:
data <- na.omit(data)
data <- unique(data)
However, the same error ("Non-informative layers present : mean distances between objects zero") is still there.
Can someone please help me figure out what might be causing this error? Note: when I remove the "factor" variables, everything works fine.
Sources:
https://cran.r-project.org/web/packages/kohonen/kohonen.pdf
https://www.rdocumentation.org/packages/kohonen/versions/2.0.5/topics/supersom
https://rdrr.io/cran/kohonen/src/R/supersom.R
Recommended answer
The error occurs when a layer carries no information, for example a numeric column whose values are all identical (such as all 0), because every pairwise distance between objects in that layer is then zero. You can reproduce the error by setting any one column to 0.
data$a <- 0
som <- supersom(data = as.list(data), grid = somgrid(10, 10, "hexagonal"),
                dist.fct = "euclidean", keep.data = TRUE)
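To see why a single-valued column trips the check quoted in the question, here is a minimal sketch in base R. It uses `dist()` as a stand-in for the distance computation inside `supersom()`, which is an assumption made for illustration; the variable names are hypothetical.

```r
# Base R only; dist() stands in for the distances supersom() computes internally (assumption).
constant_layer <- matrix(0, nrow = 10, ncol = 1)                  # every object identical
varying_layer  <- matrix(as.numeric(1:10), nrow = 10, ncol = 1)   # objects differ

# Mean pairwise distance between objects in each layer
mean_dist_constant <- mean(dist(constant_layer))
mean_dist_varying  <- mean(dist(varying_layer))

# Same comparison as the check quoted from supersom.R
constant_is_flagged <- mean_dist_constant < .Machine$double.eps
varying_is_flagged  <- mean_dist_varying < .Machine$double.eps
print(c(constant = constant_is_flagged, varying = varying_is_flagged))
```

Only the constant layer falls below `.Machine$double.eps`, which is the condition that leads to the `stop()` call.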
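You can investigate why those columns are constant, or remove them from the data before fitting. Testing the mean alone is not reliable here, because columns standardized with scale() have a mean of approximately 0 while still being informative; the code below instead keeps only columns with more than one distinct value.
library(kohonen)
library(dplyr)
#keep only columns (numeric or factor) that have more than one distinct value
data <- data %>% select(where(~ n_distinct(.x) > 1))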
#som model
som <- supersom(data = as.list(data), grid = somgrid(10, 10, "hexagonal"),
                dist.fct = "euclidean", keep.data = TRUE)
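On larger data it may help to list the suspect columns before dropping anything. The sketch below uses only base R; `find_constant_columns()` is a hypothetical helper written for this answer, not a function of the kohonen package, and it assumes that a column with at most one distinct non-missing value is what makes a layer non-informative.

```r
# Hypothetical helper (not part of kohonen): returns the names of columns
# that have at most one distinct non-missing value.
find_constant_columns <- function(df) {
  is_constant <- vapply(
    df,
    function(col) length(unique(col[!is.na(col)])) <= 1,
    logical(1)
  )
  names(df)[is_constant]
}

# Toy data: y is an all-zero numeric column, z is a single-level factor
toy <- data.frame(
  x = c(1.5, 2.0, 3.5),
  y = c(0, 0, 0),
  z = factor(c("A", "A", "A")),
  w = factor(c("A", "B", "A"))
)
constant_cols <- find_constant_columns(toy)
print(constant_cols)
```

Calling `find_constant_columns(data)` on the real data just before `supersom()` would give a list of candidate layers to inspect. Since the question notes that the error disappears when the factor variables are removed, a factor left with a single level after filtering is one plausible candidate, though the source does not confirm this.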