Can you suggest a good minhash implementation?

Question

I am looking for an open source minhash implementation that I can leverage for my work.

The functionality I need is very simple: given a set as input, the implementation should return its minhash.

A Python or C implementation would be preferred, in case I need to hack it to work for me.

Any pointers would be of great help.

Thanks.

Recommended answer

You should have a look at the following open source libraries, in order. All of them are in Python and show how to calculate document similarity using LSH/MinHash:

lsh
LSHHDC: Locality-Sensitive Hashing based High Dimensional Clustering
MinHash
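
For orientation, here is a minimal sketch of what such a library computes. It is not taken from any of the libraries listed above; the function names (`minhash_signature`, `estimate_jaccard`), the `num_perm` parameter, and the use of seeded BLAKE2b as the hash family are all assumptions made for illustration. Each of the `num_perm` seeded hash functions stands in for a random permutation of the universe, and the signature keeps the smallest hash value seen under each one.

```python
import hashlib


def _seeded_hash(item: str, seed: int) -> int:
    """Return a 64-bit hash of `item` salted with `seed`.

    Illustrative choice: prefixing the seed and hashing with BLAKE2b
    approximates a family of independent hash functions.
    """
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_perm=128):
    """Return the minhash signature of a set as a list of `num_perm` integers."""
    items = set(items)
    if not items:
        raise ValueError("minhash is undefined for an empty set")
    # For each seeded hash function, keep the minimum value over the set.
    return [min(_seeded_hash(str(x), seed) for x in items)
            for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    a = {"minhash", "lsh", "python", "set"}
    b = {"minhash", "lsh", "c", "set"}
    # The exact Jaccard similarity of a and b is 3/5; the estimate is approximate.
    print(estimate_jaccard(minhash_signature(a), minhash_signature(b)))
```

A larger `num_perm` lowers the variance of the estimate at the cost of more hashing per set. A production library would typically use cheaper hash functions and add LSH banding on top of the signatures to find candidate pairs without comparing every pair of sets.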

