I'm trying to compute the Levenshtein distance between two Pandas columns, but I'm stuck. I'm using the textdistance library. Here is a minimal, reproducible example:
import pandas as pd
from textdistance import levenshtein
attempts = [['passw0rd', 'pasw0rd'],
            ['passwrd', 'psword'],
            ['psw0rd', 'passwor']]
df = pd.DataFrame(attempts, columns=['password', 'attempt'])
   password  attempt
0  passw0rd  pasw0rd
1   passwrd   psword
2    psw0rd  passwor
My poor attempt:
df.apply(lambda x: levenshtein.distance(*zip(x['password'] + x['attempt'])), axis=1)
This is how the function works. It takes two strings as arguments:
levenshtein.distance('helloworld', 'heloworl')
Out[1]: 2
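For readers who want to see what the number means, here is a minimal pure-Python sketch of the standard dynamic-programming edit distance. It is not textdistance's implementation, and the name `levenshtein_distance` is made up for illustration. It only shows that the function compares exactly two strings and counts insertions, deletions, and substitutions.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance between two strings (illustrative sketch, not textdistance's code)."""
    # previous[j] holds the distance between the prefix of `a` handled so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if the characters match
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance('helloworld', 'heloworl'))  # 2
```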
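Best answer
Maybe I'm missing something, but is there a reason you don't want to pass the two columns straight to the function inside the lambda? This works for me: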
import pandas as pd
from textdistance import levenshtein
attempts = [['passw0rd', 'pasw0rd'],
            ['passwrd', 'psword'],
            ['psw0rd', 'passwor'],
            ['helloworld', 'heloworl']]
df = pd.DataFrame(attempts, columns=['password', 'attempt'])
df.apply(lambda x: levenshtein.distance(x['password'], x['attempt']), axis=1)
Out:
0 1
1 3
2 4
3 2
dtype: int64
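As a side note on why the attempt in the question likely fails: `x['password'] + x['attempt']` concatenates the two strings, and `zip` over a single string yields one 1-tuple per character, so the `*` unpacking hands the function many tiny tuples instead of two strings. The sketch below uses plain Python only, and the variable names are illustrative.

```python
password, attempt = 'passw0rd', 'pasw0rd'

# What the question's attempt builds: the strings are concatenated first,
# then zip() wraps every single character in its own 1-tuple.
broken_args = list(zip(password + attempt))
print(len(broken_args))  # 15 arguments instead of 2
print(broken_args[:3])   # [('p',), ('a',), ('s',)]

# What a two-string distance function needs: the two values, untouched.
fixed_args = (password, attempt)
print(fixed_args)        # ('passw0rd', 'pasw0rd')

# zip() is still useful for pairing two columns element by element.
passwords = ['passw0rd', 'passwrd', 'psw0rd']
attempts = ['pasw0rd', 'psword', 'passwor']
pairs = list(zip(passwords, attempts))
print(pairs[0])          # ('passw0rd', 'pasw0rd')
```

Assuming the DataFrame from the answer, the same pairing should also work without apply, for example [levenshtein.distance(a, b) for a, b in zip(df['password'], df['attempt'])].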
Source: the Stack Overflow question "python - How to compute the Levenshtein distance between two Pandas DataFrame columns?": https://stackoverflow.com/questions/60007062/