I have two vectors of length 6, and I would like to get a number between 0 and 1 that measures how similar they are.
a=c("HDa","2Pb","2","BxU","BuQ","Bve")
b=c("HCK","2Pb","2","09","F","G")
Can anyone explain how I should do this?
Best answer
Use the lsa package; see the package manual for details.
# create some files
library('lsa')
td = tempfile()
dir.create(td)
write(c("HDa","2Pb","2","BxU","BuQ","Bve"), file=file.path(td, "D1"))
write(c("HCK","2Pb","2","09","F","G"), file=file.path(td, "D2"))
# read files into a document-term matrix
myMatrix = textmatrix(td, minWordLength=1)
Edit: this is what the myMatrix object looks like:
# myMatrix
#        docs
# terms   D1 D2
#   2      1  1
#   2pb    1  1
#   buq    1  0
#   bve    1  0
#   bxu    1  0
#   hda    1  0
#   09     0  1
#   f      0  1
#   g      0  1
#   hck    0  1
# Calculate cosine similarity
res <- lsa::cosine(myMatrix[,1], myMatrix[,2])
res
#0.3333
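For comparison, the same figure can be reproduced in base R without writing any files. This is only a minimal sketch, not part of the lsa package: `cosine_terms` is a hypothetical helper name, and it assumes each vector element counts as one term and that terms are lower-cased, as in the document-term matrix shown above.

```r
a <- c("HDa","2Pb","2","BxU","BuQ","Bve")
b <- c("HCK","2Pb","2","09","F","G")

# Hypothetical helper (not from lsa): cosine similarity of two character
# vectors, each treated as a bag of terms.
cosine_terms <- function(x, y) {
  # Lower-case the terms (assumption: mirrors the lower-cased terms above)
  x <- tolower(x)
  y <- tolower(y)
  # Shared vocabulary of both vectors
  terms <- union(x, y)
  # Term counts for each vector over the shared vocabulary
  vx <- as.numeric(table(factor(x, levels = terms)))
  vy <- as.numeric(table(factor(y, levels = terms)))
  # Dot product divided by the product of the Euclidean norms
  sum(vx * vy) / (sqrt(sum(vx^2)) * sqrt(sum(vy^2)))
}

sim <- cosine_terms(a, b)
sim
```

Here the two vectors share 2 of their 6 terms each, so the result is 2 / (sqrt(6) * sqrt(6)) = 1/3, which matches the value returned by lsa::cosine above.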
Source: a similar question on Stack Overflow, "r - How to calculate the cosine similarity between two string vectors": https://stackoverflow.com/questions/34045738/