Problem description
I am trying to read a REST API that is gzip encoded. To be exact, I am trying to read the StackExchange API.
I already found the question "Automatically Decode GZIP In TRESTResponse?", but for some reason its answer doesn't solve my issue.
Test setup
In XE5, I added a TRESTClient, a TRESTRequest and a TRESTResponse with the following relevant properties. I set the BaseURL of the client, the resource and parameters of the request, and I set AcceptEncoding of the request to 'gzip, deflate', which should make it automatically decode gzipped responses.
object RESTClient1: TRESTClient
  BaseURL = 'https://api.stackexchange.com/2.2'
end
object RESTRequest1: TRESTRequest
  AcceptEncoding = 'gzip, deflate'
  Client = RESTClient1
  Params = <
    item
      Kind = pkURLSEGMENT
      name = 'id'
      Options = [poAutoCreated]
      Value = '511529'
    end
    item
      name = 'site'
      Value = 'stackoverflow'
    end>
  Resource = 'users/{id}'
  Response = RESTResponse1
end
object RESTResponse1: TRESTResponse
end
This results in the URL:
I invoke the request like this, with two message boxes to show the URL and the outcome of the request:
ShowMessage(RESTRequest1.GetFullRequestURL());
RESTRequest1.Execute; // Actual call
ShowMessage(RESTResponse1.Content);
If I call that URL in a browser, I get a proper result, which is a JSON object with some of my user information in it.
The problem
However, in Delphi I don't get the JSON response. Instead, I get a bunch of bytes which seem to be a mangled gzip response. I tried to decompress it with TIdCompressorZlib.DecompressGZipStream(), but it fails with a ZLib Error (-3). When I inspect the bytes of the response myself, I see that it starts with #1F#3F#08. This is especially weird, since the gzip header should be #1F#8B#08, so #8B has been transformed into #3F, which is a question mark.
So it seems to me that the RESTClient has attempted to decode the gzip stream as if it were a UTF-8 response, and has replaced invalid sequences (#8B on its own is not a valid UTF-8 character) with a question mark.
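The header inspection described above can also be done in code by comparing the first bytes with the gzip magic number. This is only a minimal sketch, assuming a console application; LooksLikeGzip is a hypothetical helper and not part of the REST components.

```pascal
program CheckGzipHeader;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: True when Data starts with the gzip magic bytes
// $1F $8B followed by $08 (the deflate compression method).
function LooksLikeGzip(const Data: TBytes): Boolean;
begin
  Result := (Length(Data) >= 3) and
            (Data[0] = $1F) and (Data[1] = $8B) and (Data[2] = $08);
end;

begin
  // A proper gzip header.
  Writeln(LooksLikeGzip(TBytes.Create($1F, $8B, $08)));
  // The mangled header seen in the response: $8B has become $3F ('?').
  Writeln(LooksLikeGzip(TBytes.Create($1F, $3F, $08)));
end.
```

Passing RESTResponse1.RawBytes to a helper like this would be one way to confirm that the bytes were damaged before any manual decompression was attempted.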
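Attempts (superficial)
I've done quite some experimenting, like:
- Using RESTResponse.RawBytes and trying to decode that. I noticed that the bytes in this byte array are already invalid. A comment in the source of TRESTResponse told me that 'RawBytes' is already decoded, so that makes sense.
- Saving RESTResponse.RawBytes to a file and trying to decompress it with 7zip and several online gzip decompressors. They all fail, of course, since even the gzip header is incorrect.
- Assigning the value 'gzip, deflate' to TRESTClient.AcceptEncoding, TRESTResponse.AcceptEncoding and combinations of them. I also tried appending it to the pre-filled Accept property of each of the components.
- Switching from an authenticated request to an unauthenticated one. I had the whole OAuth part working, but thought it would overcomplicate the question. The anonymous API I use in this question shows the same problem, though.
Unfortunately it still doesn't work and I still get a mangled response.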
Attempts (digging into the VCL)
Eventually, I dug a little deeper and dove into TRestRequest.Execute. I won't paste all the code here, but eventually it performs the request by calling
FClient.HTTPClient.Get(LURL, LResponseStream);
FClient is the TRESTClient that is linked to the request and LResponseStream is a TMemoryStream. I added LResponseStream.SaveToFile('...') to the watches, so it would save this unprocessed result, et voilà, it gave me a valid gz file, which I could decompress to get my JSON.
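Decompressing such a saved file can be sketched with the same Indy compressor mentioned earlier. This assumes a console application with Indy available and that the raw stream was saved under the hypothetical name response.gz; it illustrates the check and is not the code actually used.

```pascal
program DecompressSavedResponse;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, IdCompressorZLib;

var
  Compressor: TIdCompressorZLib;
  Input: TFileStream;
  Output: TStringStream;
begin
  Compressor := TIdCompressorZLib.Create(nil);
  try
    // 'response.gz' is a hypothetical file name for the saved raw stream.
    Input := TFileStream.Create('response.gz', fmOpenRead or fmShareDenyWrite);
    try
      // The API reports UTF-8, so interpret the decompressed bytes as UTF-8.
      Output := TStringStream.Create('', TEncoding.UTF8);
      try
        Compressor.DecompressGZipStream(Input, Output);
        Writeln(Output.DataString);
      finally
        Output.Free;
      end;
    finally
      Input.Free;
    end;
  finally
    Compressor.Free;
  end;
end.
```

The order matters here: the bytes are decompressed first and only then interpreted as UTF-8 text.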
A bug in a workaround?
But then, a couple of lines down, I see this piece of code:
if FClient.HTTPClient.Response.CharSet > '' then
begin
  LResponseStream.Position := 0;
  S := FClient.HTTPClient.ReadStringAsCharset(LResponseStream, FClient.HTTPClient.Response.CharSet);
  LResponseStream.Free;
  LResponseStream := TStringStream.Create(S);
end;
According to the comment above this block, this is done because the contents of the memory stream are "NOT encoded accordingly to a possibly present Encoding or Content-Type Charset parameter", which the writer of this VCL code considers a bug in Indy.
So basically, this is what happens: the raw response is treated as a string and converted to the 'right' encoding. FClient.HTTPClient.Response.CharSet is 'UTF-8', which is indeed the encoding of the JSON, but this conversion should only be done after decompressing the stream, and at this point the stream has not been decompressed yet. So I consider this a bug. ;)
I tried to dig deeper, but I couldn't find the place where this decompression should have taken place. The actual request is performed by an IIPHTTP instance, which lives in IPPeerAPI.dcu, for which I don't have the source.
So...
So my question is twofold:
- Why does this happen? TRestClient should automatically decode the gzip stream when you set AcceptEncoding to 'gzip, deflate'. What setting did I miss? Or is this not yet supported in XE5?
- How can I prevent this incorrect translation of the gzip stream? I don't mind decoding the response myself, as long as it works, although ideally the REST components should do it automatically.
My setup: VCL Forms application, Windows 8.1, Delphi XE5 Professional Update 2.
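Update
- A workaround has been found (see my answer).
- A bug report has been filed in Quality Central as RSP-9855.
- It is said to be fixed in Delphi 10.1 (Berlin), but I have not tested that yet.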
Recommended answer
Remy Lebeau's input in his answer to this question, as well as his comment on the answer to the question "Automatically Decode GZIP In TRESTResponse?", put me on the right track.
Like he said, setting AcceptEncoding doesn't suffice, because the TIdHTTP that performs the actual request doesn't have a decompressor attached, so it can't decompress the gzip response. Based on the sparse resources available, I got the idea that setting AcceptEncoding would automatically decompress the response too, but that idea was wrong.
Still, leaving AcceptEncoding empty doesn't work either in this case, since the API in question, the StackExchange API, always compresses its responses, regardless of whether you specify that you accept gzip or not.
So this situation is caused by the combination of a) an always-compressed response, b) an HTTP client that cannot decompress and c) a TRESTRequest object that incorrectly assumes the response has already been decompressed.
I see only two solutions, the first being to discard TRESTClient altogether and just perform the request with a plain TIdHTTP. That would be a pity, since my goal was to explore the possibilities of the new REST components to see how they can make life easier.
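For comparison, that first option could look roughly like the sketch below. It assumes a console application, that the OpenSSL libraries Indy needs for HTTPS are available, and that the URL is the one assembled from the BaseURL, Resource and parameters in the test setup above. It is an illustration under those assumptions, not code from the original answer.

```pascal
program PlainIndyRequest;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, IdHTTP, IdCompressorZLib, IdSSLOpenSSL;

var
  Http: TIdHTTP;
begin
  Http := TIdHTTP.Create(nil);
  try
    // HTTPS needs an SSL IOHandler (assumes the OpenSSL DLLs can be loaded).
    Http.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(Http);
    // With a compressor attached, TIdHTTP can decompress gzip responses.
    Http.Compressor := TIdCompressorZLib.Create(Http);
    try
      // URL assembled from the BaseURL, Resource and Params shown earlier.
      Writeln(Http.Get(
        'https://api.stackexchange.com/2.2/users/511529?site=stackoverflow'));
    except
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    // The IOHandler and compressor are owned by Http and freed with it.
    Http.Free;
  end;
end.
```

This bypasses the REST components entirely, which is exactly the trade-off described above.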
So the other solution is to assign a compressor to the TIdHTTP that is used internally.
I managed to get this working, although unfortunately it undoes a lot of the abstraction that the TREST components are trying to introduce. This is the code that solves it:
// Requires IdHTTP and IdCompressorZLib in the uses clause.
var
  Http: TIdCustomHTTP;
begin
  // Get the TIdHTTP that performs the request.
  Http := (RESTRequest1   // The TRESTRequest object
            .Client       // The TRESTClient
            .HTTPClient   // A TRESTHTTP object that wraps HTTP communication
            .Peer         // An IIPHTTP interface which is obtained through PeerFactory.CreatePeer
            .GetObject    // A method to get the object instance of the interface
            as TIdCustomHTTP  // The object instance, which is a TIdCustomHTTP.
          );
  // Attach a gzip decompressor to it.
  Http.Compressor := TIdCompressorZLib.Create(Http);
end;
After this, I can use the RESTRequest1 component to successfully fetch the JSON response (at least as text).