Problem description
After a few days of poring over Stack Overflow and the Python 2.7 docs, I have come to no conclusion about this.
Basically I'm running a Python script on a Windows server that must take a block of text as input. This block of text (unfortunately) has to be passed through a pipe. Something like:
PS > [something_that_outputs_text] | python .\my_script.py
So the problem is:
The server uses cp1252 encoding and I really cannot change it due to administrative regulations and whatnot. When I pipe the text to my Python script and read it, it already arrives with ? where characters like \xe1 should be.
What I have done so far:
Tested with UTF-8. Yep, chcp 65001 and $OutputEncoding = [Console]::OutputEncoding "solve it", as in Python gets the text perfectly and then I can decode it to unicode etc. But apparently they don't let me do it on the server /sadface.
A little script to test what the hell is happening:
import codecs
import sys

def main(argv=None):
    if argv is None:
        argv = sys.argv
    # Echo any command-line arguments, decoded from cp1252.
    if len(argv) > 1:
        for arg in argv[1:]:
            print arg.decode('cp1252')
    # Wrap stdin so the piped bytes are decoded from cp1252.
    sys.stdin = codecs.getreader('cp1252')(sys.stdin)
    text = sys.stdin.read().strip()
    print text
    return 0

if __name__ == "__main__":
    sys.exit(main())
Tried it both with and without the codecs wrapping.
My input & output:
PS > echo "Blá" | python .\testinput.py blé
blé
Bl?
--> So there's no problem with the argument (blé) but the piped text (Blá) is no good :(
I even converted the text string to hex and, yes, it gets flooded with 3f (AKA mr ?), so it's not a problem with the print.
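The 3f observation can be reproduced without a pipe. The sketch below assumes (this is not confirmed anywhere in the thread) that the text is re-encoded to ASCII before it reaches Python, which would be consistent with the $OutputEncoding change fixing it. The helper names simulate_ascii_pipe and to_hex are made up for illustration.

```python
import binascii

def simulate_ascii_pipe(text):
    """Encode text as an ASCII-only pipe might: unmappable chars become '?'."""
    return text.encode("ascii", "replace")

def to_hex(data):
    """Return the hex representation of a byte string as plain text."""
    return binascii.hexlify(data).decode("ascii")

original = u"Bl\xe1"                       # "Bla" with an acute accent on the a
piped = simulate_ascii_pipe(original)      # assumption: what the script receives

print(to_hex(original.encode("cp1252")))   # 426ce1 (0xe1 is the accented a)
print(to_hex(piped))                       # 426c3f (0x3f is '?')
```

If that assumption holds, the accented character is already gone before Python reads a single byte, so no amount of decoding on the Python side can bring it back.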
[Also: it's my first question here... feel free to ask any more info about what I did]
EDIT
I don't know if this is relevant or not, but when I check sys.stdin.encoding it yields None.
Update: So... I have no problems with cmd. Checked sys.stdin.encoding while running the program on cmd and everything went fine. I think my head just exploded.
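Since sys.stdin.encoding can be None when input is piped, a script can fall back to a known encoding instead of relying on it. This is only a rough sketch: decode_piped_bytes and read_stdin_text are hypothetical helpers, and cp1252 as the fallback is an assumption taken from the server setup described above.

```python
import sys

def decode_piped_bytes(raw, declared_encoding=None, fallback="cp1252"):
    """Decode raw bytes, using the fallback when no encoding is declared."""
    encoding = declared_encoding or fallback
    return raw.decode(encoding)

def read_stdin_text(fallback="cp1252"):
    """Read all of stdin as bytes and decode it to text.

    Python 3 exposes the byte stream as sys.stdin.buffer; in Python 2.7
    sys.stdin.read() already returns a byte string.
    """
    stream = getattr(sys.stdin, "buffer", sys.stdin)
    return decode_piped_bytes(stream.read(), sys.stdin.encoding, fallback)

# With no declared encoding (None), the cp1252 fallback is used.
text = decode_piped_bytes(b"Bl\xe1", None)
print(len(text))  # 3
```

This only helps when the bytes reaching the script are still intact. It cannot restore characters that were replaced with ? before Python received them.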
How about saving the data into a file and piping it to Python in a CMD session? Invoke PowerShell and Python from CMD, like so:
c:\>powershell -command "c:\genrateDataForPython.ps1 -output c:\data.txt"
c:\>type c:\data.txt | python .\myscript.py
Edit
Another idea: convert the data into base64 format in PowerShell and decode it in Python. Base64 is simple in PowerShell, and I guess it isn't hard in Python either. Like so:
# Convert some accent chars to base64
$s = [Text.Encoding]::UTF8.GetBytes("éêèë")
[System.Convert]::ToBase64String($s)
# Output:
w6nDqsOow6s=
# Decode:
$d = [System.Convert]::FromBase64String("w6nDqsOow6s=")
[Text.Encoding]::UTF8.GetString($d)
# Output
éêèë
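For the Python side, something along these lines should work. It is a minimal sketch, assuming the PowerShell side produced the base64 string from UTF-8 bytes as shown above; decode_base64_text is a name made up for this example. Base64 output is pure ASCII, so it should survive the pipe regardless of the console code page.

```python
import base64

def decode_base64_text(encoded, encoding="utf-8"):
    """Decode a base64 string (bytes or text) into unicode text."""
    if isinstance(encoded, bytes):
        raw = encoded                    # Python 2 str, or Python 3 bytes
    else:
        raw = encoded.encode("ascii")    # unicode text: base64 is ASCII-only
    # strip() removes the trailing newline that a pipe usually appends.
    return base64.b64decode(raw.strip()).decode(encoding)

decoded = decode_base64_text("w6nDqsOow6s=")
print(len(decoded))  # 4 (the four accented characters from the example)
```

In the real script the argument would be whatever was read from stdin, for example sys.stdin.read().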