Using the following code:

import requests

payload = '''
 工作报告
 总体情况:良好
'''
r = requests.post("http://httpbin.org/post", data=payload)

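What is the default encoding when Requests posts data that is a str? UTF-8 or unicode-escape?

If I want to specify the encoding, do I have to encode the string myself and pass a bytes object to the `data` parameter?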

Best answer

If you actually try your example, you will find:

$ python
Python 3.7.2 (default, Jan 29 2019, 13:41:02)
[Clang 10.0.0 (clang-1000.10.44.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> payload = '''
...  工作报告
...  总体情况:良好
... '''
>>> r = requests.post("http://127.0.0.1:8888/post", data=payload)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/venv/lib/python3.7/site-packages/requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/tmp/venv/lib/python3.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/tmp/venv/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/tmp/venv/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/tmp/venv/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/tmp/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/tmp/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/tmp/venv/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/tmp/venv/lib/python3.7/http/client.py", line 1274, in _send_request
    body = _encode(body, 'body')
  File "/tmp/venv/lib/python3.7/http/client.py", line 160, in _encode
    (name.title(), data[err.start:err.end], name)) from None
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 2-5: Body ('工作报告') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.

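As described in "Detecting the character encoding of an HTTP POST request", the default encoding for HTTP POST is ISO-8859-1, also known as Latin-1. That is why the str body fails inside http.client: it is encoded as Latin-1, which has no representation for Chinese characters. As the error message at the end of the traceback tells you, you can force the encoding by encoding the string to a UTF-8 bytes object yourself; but then of course your server needs to expect UTF-8 as well, otherwise it will decode your bytes as Latin-1 and end up with useless mojibake.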
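A minimal sketch of the client side, assuming the server accepts a `text/plain` body: encode the string yourself and declare the charset in the `Content-Type` header. The helper names `encode_text_body` and `post_text` are hypothetical, chosen only for this illustration.

```python
payload = '''
 工作报告
 总体情况:良好
'''

def encode_text_body(text, charset="utf-8"):
    """Hypothetical helper: return (body_bytes, headers) for a text payload."""
    body = text.encode(charset)
    # Declaring the charset tells the server how to decode the bytes.
    headers = {"Content-Type": f"text/plain; charset={charset}"}
    return body, headers

def post_text(url, text, charset="utf-8"):
    """Hypothetical helper: POST text as explicitly encoded bytes."""
    import requests  # third-party; imported here so the rest stays stdlib-only
    body, headers = encode_text_body(text, charset)
    # A bytes body is sent as-is, so http.client never tries Latin-1.
    return requests.post(url, data=body, headers=headers)

body, headers = encode_text_body(payload)

# What a server sees if it wrongly decodes the UTF-8 bytes as Latin-1.
mojibake = body.decode("latin-1")
```

Calling `post_text("http://httpbin.org/post", payload)` would then send the UTF-8 bytes together with the matching header.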

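The POST interface itself has no way to enforce this, but your server can in fact require clients to state their content encoding explicitly with the charset parameter of the Content-Type header. If it is missing, the server could return a specific error code with an explicit error message (the suggestion here was a 5xx code, although a 4xx code such as 400 or 415 is the conventional class for a client-side omission).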
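The server-side check could look roughly like this, independent of any web framework. The function names and the choice of status 415 are assumptions made for this sketch; only the standard library's `email.message.Message` is used to parse the header.

```python
from email.message import Message

def declared_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

def require_charset(content_type):
    """Hypothetical check: return (status, detail) for an incoming POST."""
    charset = declared_charset(content_type)
    if charset is None:
        # Status code is an assumption; pick whatever your API documents.
        return 415, ("Content-Type must include an explicit charset, "
                     "e.g. 'text/plain; charset=utf-8'")
    return 200, charset
```

A real handler would call `require_charset(request_headers.get("Content-Type"))` before touching the body and decode the body with the returned charset.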

Somewhat less strictly, you could have your server attempt to decode incoming POST bodies as UTF-8 and reject the POST if decoding fails.
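This more lenient variant can be sketched as a strict UTF-8 decode with a rejection on failure. The function name and the 400 status are assumptions for illustration, not part of any framework.

```python
def decode_utf8_or_reject(body):
    """Hypothetical check: return (status, text_or_error) for raw POST bytes."""
    try:
        # bytes.decode is strict by default, so invalid sequences raise.
        return 200, body.decode("utf-8")
    except UnicodeDecodeError as exc:
        return 400, f"request body is not valid UTF-8: {exc.reason}"
```

Note that this only detects byte sequences that are invalid in UTF-8; a pure ASCII body sent under any ASCII-compatible encoding passes the check unchanged.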

Source: a similar question on Stack Overflow, "python - What is the default encoding when Python Requests posts data as a string?": https://stackoverflow.com/questions/55887958/
