Question
I'm having trouble encoding a URL to a URI:
mUrl = "A string url that needs to be encoded for use in a new HttpGet()";
URL url = new URL(mUrl);
URI uri = new URI(url.getProtocol(), url.getAuthority(), url.getPath(),
                  url.getQuery(), null);
This does not do what I expect for the following URL:
String passed in:
Result:
Which is broken. For example, the %3D is turned into %253D. It seems to be doing something mysterious to the %'s already in the string.
What's going on and what am I doing wrong here?
Answer
You are first putting the (already-escaped) string into the URL class. That doesn't escape anything. Then you are pulling out sections of the URL, which returns them without any further processing (so they are still escaped, since they were escaped when you put them in). Finally, you are putting the sections into the URI class, using the multi-argument constructor. This constructor is specified to percent-encode the components it is given, and it always quotes a literal % as %25.
Therefore, it is in this final step that, for example, a character that is not legal in a URI, such as a space, becomes "%20" (good) and an existing "%3A" becomes "%253A" (bad). Since you are putting in URLs which are already-encoded*, you don't want to encode them again.
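Therefore, the single-argument constructor of URI is your friend. It doesn't escape anything, and requires that you pass a pre-escaped string. Hence, you don't need URL at all: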
mUrl = "A string url is already percent-encoded for use in a new HttpGet()";
URI uri = new URI(mUrl);
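As a quick check of that claim, the sketch below parses an already-encoded string with the single-argument constructor and shows that the escapes survive. The class and method names and the example.com address are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SingleArgUriDemo {
    // Parses a string that is assumed to be percent-encoded already.
    static URI parseEncoded(String alreadyEncoded) {
        try {
            return new URI(alreadyEncoded); // parses only; adds no escapes
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = parseEncoded("http://example.com/search?q=a%3Db");
        System.out.println(uri);               // http://example.com/search?q=a%3Db
        System.out.println(uri.getRawQuery()); // q=a%3Db (as written)
        System.out.println(uri.getQuery());    // q=a=b   (decoded)
    }
}
```

The getRaw... accessors return each component as written, while the plain getters return it decoded, which matters when the parts are fed back into another constructor.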
*The only problem is if your URLs are sometimes not percent-encoded, and sometimes they are. Then you have a bigger problem. You need to decide whether your program is starting out with a URL which is always encoded, or one which needs to be encoded.
Note that there is no such thing as a full URL which is not percent-encoded. For example, you can't take the full URL "http://example.com/bob&co" and somehow turn it into the properly-encoded URL "http://example.com/bob%26co": how can you tell the difference between the syntax (which shouldn't be escaped) and the characters (which should)? This is why the single-argument form of URI requires that strings are already-escaped. If you have unescaped strings, you need to percent-encode them before inserting them into the full URL syntax, and that is what the multi-argument constructor of URI helps you do.
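A minimal sketch of that second case, assuming the path and query arrive as plain, unescaped strings. The class name, helper, and sample values are made up for illustration. Note that java.net.URI quotes only characters that are illegal in the given component, so in this sketch the space and the % are escaped while the & in the path is left alone.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EncodeComponentsDemo {
    // Builds a URI from unescaped components; the constructor does the quoting.
    static String build(String authority, String path, String query) {
        try {
            return new URI("http", authority, path, query, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://example.com/bob%20&%20co/100%25.txt?q=two%20words
        System.out.println(build("example.com", "/bob & co/100%.txt", "q=two words"));
    }
}
```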
Edit: I missed the fact that the original code discards the fragment. If you want to remove the fragment (or any other part) of the URL, you can construct the URI as above, then pull all the parts out as required (they will be decoded into regular strings), then pass them back into the URI multi-argument constructor (where they will be re-encoded as URI components):
uri = new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), uri.getPort(),
              uri.getPath(), uri.getQuery(), null); // Remove fragment
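Put together as a runnable sketch, the round trip might look like this. The class name, the stripFragment helper, and the sample address are assumptions; the constructor and getters are the ones used above.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FragmentRemovalDemo {
    // Parses an already-encoded string, then rebuilds it without the fragment.
    static String stripFragment(String alreadyEncoded) {
        try {
            URI uri = new URI(alreadyEncoded);
            // The getters return decoded strings; the seven-argument
            // constructor re-encodes them. A port of -1 means "no port".
            URI stripped = new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                                   uri.getPort(), uri.getPath(), uri.getQuery(), null);
            return stripped.toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://example.com/a%20b?x=1
        System.out.println(stripFragment("http://example.com/a%20b?x=1#frag"));
    }
}
```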