python UnicodeEncodeError > How can I simply remove troubling unicode characters?
Question
Here is what I tried.
>>> soup = BeautifulSoup (html)
>>> soup
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 96953: ordinal not in range(128)
>>>
>>> soup.find('div')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 11035: ordinal not in range(128)
>>>
>>> soup.find('span')
<span id="navLogoPrimary" class="navSprite"><span>amazon.com</span></span>
>>>
How can I simply remove the troubling unicode characters from html?
Or is there a cleaner solution?
Recommended answer
Try it this way: soup = BeautifulSoup (html.decode('utf-8', 'ignore'))
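To make the effect of the 'ignore' error handler concrete, here is a minimal sketch using only the standard library. The sample bytes and the helper names to_text and strip_non_ascii are hypothetical and not part of the original answer. The sketch assumes html is a byte string that is mostly UTF-8. Decoding with 'ignore' only drops bytes that are invalid UTF-8, so a valid character such as u'\xae' survives; removing every non-ASCII character needs the second, lossy step.

```python
# Hypothetical sample: raw bytes as they might arrive from an HTTP response.
# b'\xc2\xae' is the valid UTF-8 encoding of u'\xae'; b'\xff' is not valid UTF-8.
raw_html = b'<div>Kindle\xc2\xae <b>\xff</b></div>'


def to_text(raw, encoding='utf-8'):
    """Decode bytes to text, silently dropping bytes that are invalid in `encoding`."""
    return raw.decode(encoding, 'ignore')


def strip_non_ascii(text):
    """Drop every character outside the ASCII range (lossy, use with care)."""
    return text.encode('ascii', 'ignore').decode('ascii')


text = to_text(raw_html)            # invalid byte removed, u'\xae' kept
ascii_only = strip_non_ascii(text)  # u'\xae' removed as well
```

Either result could then be passed to BeautifulSoup in place of the raw bytes. Stripping to ASCII discards data, so it is probably only appropriate when those characters are genuinely not needed.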