Problem description
I have a nautilus script to copy tunes that I like into a special folder which I sync to my phone and my car. It fails on paths with funny characters like á in them. I'm fixing it incrementally with stuff like:
temp = temp.replace('%20', ' ')
temp = temp.replace('%5B', '[')
temp = temp.replace('%5D', ']')
but I'm getting tired of these bandaid solutions, and I'm sure there is a better way to do this with str.encode or str.decode.
Does anyone recognise this strange encoding and how I can handle it properly? The problem is, for example, I have a folder such as
/media/music/kálmán balogh and the gipsy cimbalom band/aven shavale
on my disk, but when I get it using os.getenv('NAUTILUS_SCRIPT_CURRENT_URI'), i.e. the currently selected folder in nautilus, it appears in python as
/media/music/k%C3%A1lm%C3%A1n balogh and the gipsy cimbalom band/aven shavale
and then other actions such as renaming or copying the file don't work because it doesn't find the file on disk.
You are looking at URL encoding (percent-encoding): each %XX escape stands for one byte. Use urllib.unquote() to turn the escapes back into UTF-8 encoded bytes, then decode those bytes to unicode:
>>> import urllib
>>> urllib.unquote('/media/music/k%C3%A1lm%C3%A1n balogh and the gipsy cimbalom band/aven shavale').decode('utf8')
u'/media/music/k\xe1lm\xe1n balogh and the gipsy cimbalom band/aven shavale'
>>> print urllib.unquote('/media/music/k%C3%A1lm%C3%A1n balogh and the gipsy cimbalom band/aven shavale').decode('utf8')
/media/music/kálmán balogh and the gipsy cimbalom band/aven shavale
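In Python 3, you need to use urllib.parse.unquote(); the function was moved. It decodes the bytes as UTF-8 by default and returns a str, so the separate .decode('utf8') step is not needed.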
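For reference, here is a minimal Python 3 sketch of the same idea. The helper name uri_to_path is made up for illustration, and the sketch assumes that NAUTILUS_SCRIPT_CURRENT_URI holds either a file:// URI or a bare percent-encoded path; check what your Nautilus version actually passes before relying on it.

```python
import os
from urllib.parse import unquote, urlparse


def uri_to_path(uri):
    """Hypothetical helper: percent-decode a file:// URI or bare path."""
    # urlparse() splits off a scheme such as "file://" if there is one;
    # a bare path without a scheme comes back unchanged in .path.
    parsed = urlparse(uri)
    # unquote() replaces every %XX escape and decodes the result as UTF-8.
    return unquote(parsed.path)


encoded = ('/media/music/k%C3%A1lm%C3%A1n balogh and the gipsy '
           'cimbalom band/aven shavale')
print(uri_to_path(encoded))
print(uri_to_path('file://' + encoded))

# Inside a Nautilus script the value would come from the environment;
# the variable is not set when the script is run from a plain shell.
current = os.getenv('NAUTILUS_SCRIPT_CURRENT_URI')
if current is not None:
    print(uri_to_path(current))
```

This also covers %20, %5B and %5D in one call, so the chain of temp.replace() lines is no longer needed.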