我正在尝试寻找不包含重复字符的字符串的最长子字符串的古老问题(周围有很多版本)。我无法弄清楚为什么我的尝试无法正常工作:

def findLongest(inputStr):
    resultSet = []
    substr = []

    for c in inputStr:
        print ("c: ", c)
        if substr == []:
            substr.append([c])
            continue

        print(substr)
        for str in substr:
            print ("c: ",c," - str: ",str,"\n")
            if c in str:
                resultSet.append(str)
                substr.remove(str)
            else:
                str.append(c)
        substr.append([c])



    print("Result set:")
    print(resultSet)
    return max(resultSet, key=len)

print (findLongest("pwwkewambb"))

当我的输出到达第二个'w'时,它不会遍历所有substr元素。我认为我做了一些愚蠢的事情,但是我看不到它是什么,所以请您多多指教!我觉得我要踢自己的答案...

我的输出的开始:
c:  p
c:  w
[['p']]
c:  w  - str:  ['p']

c:  w
[['p', 'w'], ['w']]
c:  w  - str:  ['p', 'w'] # I expect the next line to say c: w - str: ['w']

c:  k
[['w'], ['w']] # it is like the w was ignored as it is here
c:  k  - str:  ['w']

c:  k  - str:  ['w']
...

编辑:

我将for循环替换为
for idx, str in enumerate(substr):
    print ("c: ",c," - str: ",str,"\n")
    if c in str:
        resultSet.append(str)
        substr[idx] = []
    else:
        str.append(c)

它会产生正确的结果。唯一的事情是,将空元素数组设置为下一个字符。似乎没有意义。一定会有更好的办法。

我的预期输出是 kewamb

例如
c:  p
c:  w
[['p']]
c:  w  - str:  ['p']

c:  w
[['p', 'w'], ['w']]
c:  w  - str:  ['p', 'w']

c:  w  - str:  ['w']

c:  k
[[], [], ['w']]
c:  k  - str:  []

c:  k  - str:  []

c:  k  - str:  ['w']

c:  e
[['k'], ['k'], ['w', 'k'], ['k']]
c:  e  - str:  ['k']

c:  e  - str:  ['k']

c:  e  - str:  ['w', 'k']

c:  e  - str:  ['k']
...

最佳答案

根据@seymour对错误回复的评论进行编辑:

def find_longest(s):
    _longest = set()
    def longest(x):
         if x in _longest:
             _longest.clear()
             return False
         _longest.add(x)
         return True
    return ''.join(max((list(g) for _, g in groupby(s, key=longest)), key=len))

并测试:
In [101]: assert find_longest('pwwkewambb') == 'kewamb'

In [102]: assert find_longest('abcabcbb') == 'abc'

In [103]: assert find_longest('abczxyabczxya') == 'abczxy'

旧答案:
from itertools import groupby

s = set() ## for mutable access

''.join(max((list(g) for _, g in groupby('pwwkewambb', key=lambda x: not ((s and x == s.pop()) or s.add(x)))), key=len))
'kewamb'

groupby返回基于key参数提供的函数分组的迭代器,默认情况下为lambda x: x。代替默认值,我们通过使用可变结构来利用某些状态(如果使用常规函数,可以通过更直观的方式完成)
lambda x: not ((s and x == s.pop()) or s.add(x))

这里发生的事情是因为我无法在lambda中重新分配全局分配(再次,我可以使用适当的函数来做到这一点),所以我刚刚创建了可以添加/删除的全局可变结构。关键(没有双关语)是,我仅通过使用短路来根据需要添加/删除项来保留所需的元素。
maxlen很容易解释,以获得groupby产生的最长列表

没有可变的全局结构业务的另一个版本:
def longest(x):
     if hasattr(longest, 'last'):
         result = not (longest.last == x)
         longest.last = x
         return result
     longest.last = x
     return True


''.join(max((list(g) for _, g in groupby('pwwkewambb', key=longest)), key=len))
'kewamb'

关于python - 在字符串python中找到最长的唯一子字符串,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/46613491/

10-12 22:09
查看更多