Edit distance similarity in SAS?

Question

I have a list of domains in a table V_tablas.arreglo (column domainsBad): @hotmai.es, @ghotmail.es, @hotmaol.com, @hotmai.com, @otmail.com, etc. (more than 10k rows), and I need to correct these domains to "@hotmail.com". My question is about Oracle's EDIT_DISTANCE_SIMILARITY (fuzzy matching), which "returns an integer between 0 and 100, where 0 indicates no similarity at all and 100 indicates a perfect match". Is something like this possible in SAS?

Recommended answer

SAS has at least a couple of functions for calculating the edit distance between two strings:

COMPGED, for generalized edit distance: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002206133.htm

COMPLEV, for Levenshtein distance: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002206137.htm
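Both functions return a distance rather than a 0 to 100 score, so a similarity value has to be derived from the distance. The sketch below is written in Python rather than SAS, purely to illustrate the arithmetic. It computes a plain Levenshtein distance and rescales it by the length of the longer string. The function names `levenshtein` and `similarity` are hypothetical, and the normalization is an assumption: it may not match the exact formula Oracle uses for EDIT_DISTANCE_SIMILARITY. In SAS, the distance itself would come from COMPLEV instead of the hand-written function.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> int:
    """Assumed 0-100 score: 100 * (1 - distance / longer length), rounded."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty strings are treated as identical
    return round(100 * (1 - levenshtein(a, b) / longest))


if __name__ == "__main__":
    target = "@hotmail.com"
    for bad in ["@hotmai.es", "@ghotmail.es", "@hotmaol.com",
                "@hotmai.com", "@otmail.com"]:
        print(bad, levenshtein(bad, target), similarity(bad, target))
```

With a score like this, one possible approach is to rewrite a domain to "@hotmail.com" only when the score exceeds a threshold chosen by inspecting a sample of the data, since a low threshold could also catch legitimate domains that merely resemble the target.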

