我当前的实现是这样的:

def find_longest_matching_option(option, options):
    options = sorted(options, key=len)
    longest_matching_option = None
    for valid_option in options:
        # Don't want to treat "oreo" as matching "o",
        # match only if it's "o reo"
        if re.match(ur"^{}\s+".format(valid_option), option.strip()):
            longest_matching_option = valid_option
    return longest_matching_option


我正在尝试做的一些例子:

"foo bar baz something", ["foo", "foo bar", "foo bar baz"]
# -> "foo bar baz"
"foo bar bazsomething", (same as above)
# -> "foo bar"
"hello world", ["hello", "something_else"]
# -> "hello"
"a b", ["a", "a b"]
# -> "a b" # Doesn't work in current impl.


通常,我在这里寻找效率。当前的实现可行,但有人告诉我它是O(m^2 * n),这很糟糕。

提前致谢!

最佳答案

让我们从foo开始。

def foo(x, y):
    x, y = x.strip(), y.strip()
    return x == y or x.startswith(y + " ")


如果两个字符串相等,或者一个(加一个空格)是另一个的子字符串,则foo返回true。

接下来,给定案例字符串和选项列表,可以使用filter查找给定案例字符串的所有有效子字符串,然后应用max查找最长的子字符串(请参见下面的测试)。

这是foo的一些测试案例。为了演示的目的,我将使用partialfoo咖喱化为更高阶的函数。

from functools import partial

cases = ["foo bar baz something", "foo bar bazsomething", "hello world", "a b", "a b"]
options = [
      ["foo", "foo bar", "foo bar baz"],
      ["foo", "foo bar", "foo bar baz"],
      ["hello", "something_else"],
      ["a", "a b"],
      ["a", "a b\t"]
]
p_list = [partial(foo, c) for c in cases]

for p, o in zip(p_list, options):
    print(max(filter(p, o), key=len))




foo bar baz
foo bar
hello
a b
a b

10-04 11:00
查看更多