How to account for accented characters in a regex in Python?


I currently use re.findall to find and isolate the words that follow the '#' character, i.e. the hashtags in a string:

hashtags = re.findall(r'#([A-Za-z0-9_]+)', str1)

It searches str1 and finds all the hashtags. This works; however, it doesn't account for accented characters such as áéíóúñü¿.

If one of these characters is in str1, the hashtag is only saved up to the letter before it. For example, #yogenfrüz would be saved as #yogenfr.
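The truncation described above can be reproduced with a short sketch. The sample string is hypothetical and only serves to illustrate the behaviour; the pattern is the one from the question.

```python
import re

# Hypothetical sample text, not taken from the original question.
str1 = "Dessert at #yogenfrüz with #friends_2024"

# The ASCII-only character class stops matching at the first accented letter.
hashtags = re.findall(r'#([A-Za-z0-9_]+)', str1)
print(hashtags)  # ['yogenfr', 'friends_2024']
```

The match ends at "ü" because that letter falls outside the ranges A-Z, a-z, 0-9 and the underscore.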

I need to account for all accented letters used in German, Dutch, French and Spanish, so that I can save hashtags like #yogenfrüz.

How can I do this?

Try the following:

hashtags = re.findall(r'#(\w+)', str1, re.UNICODE)
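A minimal sketch of how this behaves, assuming Python 3 and a hypothetical sample string. The NFC normalization step is an addition that is not part of the original answer: it is there because `\w` does not match a combining accent that is stored as a separate code point.

```python
import re
import unicodedata

# Hypothetical sample text; NFC normalization (an added assumption, not in the
# original answer) composes "u" + combining diaeresis into a single "ü".
str1 = unicodedata.normalize("NFC", "Dessert at #yogenfrüz, then #café and #niño")

# With Unicode matching, \w covers accented letters as well as ASCII ones.
hashtags = re.findall(r'#(\w+)', str1, re.UNICODE)
print(hashtags)  # ['yogenfrüz', 'café', 'niño']

# In Python 3, str patterns are Unicode-aware by default, so the flag is redundant.
assert re.findall(r'#(\w+)', str1) == hashtags

# re.ASCII restores the ASCII-only behaviour described in the question.
ascii_only = re.findall(r'#(\w+)', str1, re.ASCII)
print(ascii_only)  # ['yogenfr', 'caf', 'ni']
```

Note that `\w` also matches digits and the underscore, just as the original character class did, so existing hashtags keep matching.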

EDIT: Check the useful comment below from Martijn Pieters.
