
Problem description

I have a list of URLs that contain escaped characters. Those characters were set by urllib2.urlopen when it retrieved the HTML page:

http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=edit
http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=history
http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&variant=zh

Is there a way to transform them back to their unescaped form in Python?

P.S.: The URLs are encoded in UTF-8.

Recommended answer

Replace %xx escapes by their single-character equivalent.

Example: unquote('/%7Econnolly/') yields '/~connolly/'.

Then decode the result as UTF-8.
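The unescape-then-decode steps can be sketched as follows. This is a minimal sketch that assumes Python 3, where the function lives in urllib.parse and decodes as UTF-8 by default. The question uses urllib2, which suggests Python 2, where the equivalent would presumably be urllib.unquote(url).decode('utf8'). The variable names are illustrative only.

```python
from urllib.parse import unquote, unquote_to_bytes

# The escaped URLs from the question.
urls = [
    "http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=edit",
    "http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&action=history",
    "http://www.sample1webpage.com/index.php?title=%E9%A6%96%E9%A1%B5&variant=zh",
]

# One step: unquote() replaces %xx escapes and decodes the
# resulting bytes as UTF-8 (its default encoding).
decoded = [unquote(u) for u in urls]

# Two explicit steps, mirroring the answer: unescape to raw bytes,
# then decode those bytes as UTF-8.
two_step = [unquote_to_bytes(u).decode("utf-8") for u in urls]

for url in decoded:
    print(url)
```

Both forms should give the same strings. The two-step version only makes the decoding explicit, which is useful if the URLs turn out to use an encoding other than UTF-8.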


07-14 00:21