我有2个尺寸为6的向量,并且我想要一个0到1之间的数字。

a=c("HDa","2Pb","2","BxU","BuQ","Bve")

b=c("HCK","2Pb","2","09","F","G")


谁能解释我该怎么办?

最佳答案

使用lsa软件包和该软件包的手册

# create some files
library('lsa')
td = tempfile()
dir.create(td)
write( c("HDa","2Pb","2","BxU","BuQ","Bve"), file=paste(td, "D1", sep="/"))
write( c("HCK","2Pb","2","09","F","G"), file=paste(td, "D2", sep="/"))

# read files into a document-term matrix
myMatrix = textmatrix(td, minWordLength=1)


编辑:显示mymatrix对象如何

myMatrix
#myMatrix
#       docs
#  terms D1 D2
#    2    1  1
#    2pb  1  1
#    buq  1  0
#    bve  1  0
#    bxu  1  0
#    hda  1  0
#    09   0  1
#    f    0  1
#    g    0  1
#    hck  0  1

# Calculate cosine similarity
res <- lsa::cosine(myMatrix[,1], myMatrix[,2])
res
#0.3333

关于r - 如何计算两个字符串向量之间的余弦相似度,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/34045738/

10-10 02:19