Piping cp1252 strings from PowerShell to a Python (2.7) script


Problem description


After a few days of poring over Stack Overflow and the Python 2.7 docs, I have come to no conclusion about this.

Basically, I'm running a Python script on a Windows server that must take a block of text as input. This block of text (unfortunately) has to be passed through a pipe. Something like:

PS > [something_that_outputs_text] | python .\my_script.py

So the problem is:

The server uses cp1252 encoding and I really cannot change it due to administrative regulations and whatnot. When I pipe the text to my Python script and read it, it already arrives with ? where characters like \xe1 should be.

What I have done so far:

Tested with UTF-8. Yep, chcp 65001 and $OutputEncoding = [Console]::OutputEncoding "solve it", in that Python gets the text intact and then I can decode it to unicode etc. But apparently they don't let me do that on the server /sadface.

A little script to test what the hell is happening:

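import codecs
import sys

def main(argv=None):
    if argv is None:
        argv = sys.argv
    # Arguments are byte strings in Python 2; decode them as cp1252 and echo them.
    if len(argv) > 1:
        for arg in argv[1:]:
            print arg.decode('cp1252')

    # Wrap stdin so the piped bytes are decoded as cp1252 when read.
    sys.stdin = codecs.getreader('cp1252')(sys.stdin)
    text = sys.stdin.read().strip()
    print text
    return 0

if __name__ == "__main__":
    sys.exit(main())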

Tried it with both the codecs wrapping and without it.
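
For reference, the codecs wrapping can be checked in isolation against an in-memory byte stream. This is only a minimal sketch, written for Python 3 even though the question targets 2.7, and read_cp1252 is a hypothetical helper name that is not part of the original script. If it returns the accented character correctly, the wrapper is probably not the culprit and the bytes are likely damaged before Python ever reads them.

```python
import codecs
import io


def read_cp1252(byte_stream):
    """Decode everything in a binary stream as cp1252 and strip whitespace."""
    reader = codecs.getreader("cp1252")(byte_stream)
    return reader.read().strip()


# Simulate a pipe that delivers intact cp1252 bytes: "Blá" plus a Windows newline.
intact_pipe = io.BytesIO(b"Bl\xe1\r\n")
print(read_cp1252(intact_pipe) == u"Bl\xe1")  # True
```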

My input & output:

PS > echo "Blá" | python .\testinput.py blé
blé
Bl?

--> So there's no problem with the argument (blé) but the piped text (Blá) is no good :(

I even converted the text string to hex and, yes, it gets flooded with 3f (a.k.a. Mr. ?), so it's not a problem with the print.
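
The 3f bytes are consistent with the text being encoded as ASCII somewhere before it reaches Python. In Windows PowerShell, $OutputEncoding (the encoding used when piping to native programs) defaults to ASCII, which would explain the symptom, though that is an assumption about this server's configuration. Below is a small Python 3 sketch of the effect; to_hex is a hypothetical helper name introduced only for illustration.

```python
import binascii


def to_hex(data):
    """Return the hex representation of a bytes object as text."""
    return binascii.hexlify(data).decode("ascii")


text = u"Bl\xe1"  # "Blá"

as_cp1252 = text.encode("cp1252")           # what the script hopes to receive
as_ascii = text.encode("ascii", "replace")  # unencodable characters become "?"

print(to_hex(as_cp1252))  # 426ce1
print(to_hex(as_ascii))   # 426c3f

# Decoding cannot recover the original character once "?" has replaced it.
print(as_ascii.decode("cp1252"))  # Bl?
```

This is why no choice of codec on the Python side helps: the information is already gone by the time the bytes arrive.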

[Also: it's my first question here... feel free to ask any more info about what I did]

EDIT

I don't know if this is relevant or not, but when I do sys.stdin.encoding it yields None

Update: So... I have no problems with cmd. Checked sys.stdin.encoding while running the program on cmd and everything went fine. I think my head just exploded.

Solution

How about saving the data to a file and piping it to Python in a CMD session? Invoke both PowerShell and Python from CMD, like so:

c:\>powershell -command "c:\genrateDataForPython.ps1 -output c:\data.txt"
c:\>type c:\data.txt | python .\myscript.py

Edit

Another idea: convert the data to base64 in PowerShell and decode it in Python. Base64 is simple in PowerShell, and I guess it isn't hard in Python either. Like so:

# Convert some accent chars to base64
$s  = [Text.Encoding]::UTF8.GetBytes("éêèë")
[System.Convert]::ToBase64String($s)
# Output:
w6nDqsOow6s=

# Decode:
$d  = [System.Convert]::FromBase64String("w6nDqsOow6s=")
[Text.Encoding]::UTF8.GetString($d)
# Output
éêèë
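
The Python side of the base64 idea might look roughly like the sketch below. It is written for Python 3 (the same standard library calls exist in 2.7), and it assumes the PowerShell side encoded the text as UTF-8 before calling ToBase64String, as in the example above. The names decode_base64_text and main are hypothetical, introduced only for illustration.

```python
import base64


def decode_base64_text(b64_text, encoding="utf-8"):
    """Decode a base64 string back into text.

    Assumes the sender encoded the text with `encoding` (UTF-8 in the
    PowerShell example) before converting it to base64.
    """
    raw_bytes = base64.b64decode(b64_text.strip())
    return raw_bytes.decode(encoding)


def main(stream):
    """Read base64 text from `stream` (for example sys.stdin) and decode it."""
    return decode_base64_text(stream.read())


# The base64 string from the PowerShell example, with a trailing newline
# like the one a pipe would typically add.
decoded = decode_base64_text("w6nDqsOow6s=\r\n")
print(decoded == u"\xe9\xea\xe8\xeb")  # True
```

In a real script, main(sys.stdin) would be called after importing sys. Because base64 output is pure ASCII, it should survive the pipe even when PowerShell replaces non-ASCII characters with ?.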


07-30 10:10