Why does the PowerShell redirection operator >> change the format of text content?

Problem description

I want to use the redirect append operator >> or the write operator > to write to a txt file, but when I do, I get a weird format: "\x00a\x00p...".

I can use Set-Content and Add-Content successfully. Why do they work as expected when the >> and > redirect operators do not?

Here is the output, shown using PowerShell cat as well as a simple Python print.

rocket_brain> new-item test.txt
rocket_brain> "appended using add-content" | add-content test.txt
rocket_brain> cat test.txt

 appended using add-content

But then if I use the redirect append operator >>:

rocket_brain> "appended using redirect" >> test.txt
rocket_brain> cat test.txt

 appended using add-content
 a p p e n d e d   u s i n g   r e d i r e c t

Simple Python script: read_test.py

with open("test.txt", "r") as file:   # open test.txt in read mode
    data = file.readlines()           # read all lines into the list data
    print(data)                       # output the list, one item per input line

Using read_test.py, I see a difference in formatting:

rocket_brain> python read_test.py
 ['appended using add-content\n', 'a\x00p\x00p\x00e\x00n\x00d\x00e\x00d\x00 \x00u\x00s\x00i\x00n\x00g\x00 \x00r\x00e\x00d\x00i\x00r\x00e\x00c\x00t\x00\r\x00\n', '\x00']

NOTE: If I use only the redirect append >> (or write >) without first using Add-Content, the cat output looks normal (instead of spaced out), but the Python script then shows the \x00p format for every line (including lines added by any Add-Content command after starting with the > operators). When I open the file in Notepad (or VS etc.), the text always looks as expected. Using >> or > in cmd (instead of PS) also stores the text in the expected ASCII format.

Related links: cmd redirection operators, PowerShell redirection operators

Recommended answer

Note: The problem is ultimately that in Windows PowerShell different cmdlets and operators use different default encodings. This has been resolved in PowerShell Core (v6+), where BOM-less UTF-8 is used consistently.

  • When appending to an existing file, >> blindly applies Out-File's default encoding (in effect, > behaves like Out-File and >> like Out-File -Append). In Windows PowerShell that is the encoding named Unicode, i.e., UTF-16LE, in which most characters are encoded as 2-byte sequences, even those in the ASCII range; the latter have 0x0 (NUL) as the high byte.

  • As a result, unless the target file's existing content uses the same encoding, you end up with a mix of different encodings, which is what happened in your case.
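
The byte-level effect of that mix can be reproduced without PowerShell. The following is a minimal Python sketch, assuming the active ANSI code page is Windows-1252 (cp1252); the variable names are illustrative only and are not part of the original question.

```python
# Simulate what ends up on disk: one ANSI line followed by one UTF-16LE line.
# Assumption: the system's active ANSI code page is Windows-1252 (cp1252).
ansi_part = "appended using add-content\r\n".encode("cp1252")    # like Add-Content on an empty file
utf16_part = "appended using redirect\r\n".encode("utf-16-le")   # like >> in Windows PowerShell

mixed = ansi_part + utf16_part

# Each ASCII-range character in the UTF-16LE part is followed by a NUL high byte.
print(utf16_part[:8])   # b'a\x00p\x00p\x00e\x00'

# Decoding the whole file with the ANSI code page exposes the NULs as \x00 characters.
as_ansi = mixed.decode("cp1252")
print(repr(as_ansi))

# Decoding each part with its own encoding recovers the intended text.
recovered = ansi_part.decode("cp1252") + utf16_part.decode("utf-16-le")
print(recovered)
```

Editors such as Notepad tend to hide the problem because they guess at the encoding or render NUL characters invisibly, whereas Python's repr shows every character.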

By contrast, Add-Content does try to detect a file's existing encoding, but you used it on an empty file, in which case Set-Content's default encoding is applied. In Windows PowerShell that is the encoding named Default, which refers to your system's active ANSI code page.
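
To illustrate why an empty file gives a detector nothing to work with, here is a rough Python sketch of BOM-based encoding detection with a fallback. It is an assumption-laden illustration and not Add-Content's actual implementation: the function name sniff_encoding, the BOM table, and the cp1252 fallback are all hypothetical choices.

```python
import codecs

# Hypothetical BOM table; UTF-32 is checked first because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_encoding(data: bytes, default: str = "cp1252") -> str:
    """Return the encoding implied by a leading BOM, else the given default.

    The default stands in for the system's active ANSI code page (assumed cp1252).
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default

print(sniff_encoding(b""))                                   # empty file: falls back to the default
print(sniff_encoding(codecs.BOM_UTF16_LE + b"a\x00"))        # BOM present: utf-16-le
print(sniff_encoding(b"appended using add-content\r\n"))     # no BOM: falls back to the default
```

With this kind of logic, an empty file and a BOM-less ANSI file are indistinguishable, so both end up with the default encoding.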

Therefore, to match the single-byte ANSI encoding initially created by your Add-Content call when appending further content, use Out-File -Append -Encoding Default instead of >>, or simply keep using Add-Content.

  • Alternatively, pick a different encoding with Add-Content -Encoding ... and match it in the Out-File -Append call. UTF-8 is generally the best choice, though note that a UTF-8 file created in Windows PowerShell starts with a BOM (a pseudo byte-order mark identifying the file as UTF-8), which Unix-like platforms typically do not expect.
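
On the reading side, the BOM caveat looks roughly like this in Python. This is a small sketch using only the standard codecs; the sample text is taken from the question, and the variable names are illustrative.

```python
import codecs

text = "appended using add-content\r\n"

with_bom = text.encode("utf-8-sig")   # UTF-8 preceded by a BOM
without_bom = text.encode("utf-8")    # BOM-less UTF-8

# The BOM is the three bytes EF BB BF at the start of the file.
print(with_bom[:3] == codecs.BOM_UTF8)   # True

# A plain utf-8 decode keeps the BOM as a leading U+FEFF character ...
decoded_plain = with_bom.decode("utf-8")
print(repr(decoded_plain[0]))            # '\ufeff'

# ... whereas utf-8-sig strips it, and also accepts input that has no BOM.
decoded_sig = with_bom.decode("utf-8-sig")
print(decoded_sig == text)               # True
print(without_bom.decode("utf-8-sig") == text)   # True
```

Reading with utf-8-sig is therefore a tolerant choice when a file may or may not have been written by Windows PowerShell.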

In PowerShell v5.1+ you can also change the default encoding globally, including for > and >> (which isn't possible in earlier versions). For instance, to change to UTF-8, use:
$PSDefaultParameterValues['*:Encoding']='UTF8'

Aside from the different default encodings (in Windows PowerShell), it is important to note that Set-Content / Add-Content on the one hand and > / >> / Out-File [-Append] on the other behave fundamentally differently with non-string input:

In short: the former apply simple .ToString() formatting to the input objects, whereas the latter perform the same output formatting you would see in the console; see this answer for details.

09-05 18:20