Why does the PowerShell redirection operator >> change the format of the text content?

Question

I want to use the redirection operators >> (append) or > (write) to write to a txt file, but when I do, I get a weird format: "\x00a\x00p...".

Set-Content and Add-Content work as expected for me, so why don't the >> and > redirection operators?

Here is the output as shown by PowerShell's cat and by a simple Python print.

rocket_brain> new-item test.txt
rocket_brain> "appended using add-content" | add-content test.txt
rocket_brain> cat test.txt

 appended using add-content

But then if I use redirect append >>:

rocket_brain> "appended using redirect" >> test.txt
rocket_brain> cat test.txt

 appended using add-content
 a p p e n d e d   u s i n g   r e d i r e c t

Simple Python script: read_test.py

with open("test.txt", "r") as file:   # open test.txt in read mode
    data = file.readlines()           # read all lines into the list data
    print(data)                       # output the list, one item per input line

Using read_test.py, I see a difference in formatting:

rocket_brain> python read_test.py
 ['appended using add-content\n', 'a\x00p\x00p\x00e\x00n\x00d\x00e\x00d\x00 \x00u\x00s\x00i\x00n\x00g\x00 \x00r\x00e\x00d\x00i\x00r\x00e\x00c\x00t\x00\r\x00\n', '\x00']

NOTE: If I use only the redirect append >> (or write >) without first using Add-Content, the cat output looks normal (instead of spaced out), but the Python script then shows the \x00p format for every line (including anything added with Add-Content after the file was started with the > operators). When I open the file in Notepad (or VS etc.), the text always looks as expected. Using >> or > in cmd (instead of PS) also stores the text in the expected ASCII format.

Related links: cmd redirection operators, PS redirection operators

Recommended answer

Note: The problem is ultimately that in Windows PowerShell different cmdlets / operators use different default encodings. This problem has been resolved in PowerShell Core (v6+), where BOM-less UTF-8 is consistently used.

>> blindly applies Out-File's default encoding when appending to an existing file (in effect, > behaves like Out-File and >> like Out-File -Append). In Windows PowerShell that default is the encoding named Unicode, i.e., UTF-16LE, in which most characters are encoded as 2-byte sequences, even those in the ASCII range; the latter have 0x0 (NUL) as the high byte.
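To make the NUL high bytes concrete, here is a small Python sketch (Python because the question already uses it) that encodes the same string both ways. It assumes cp1252 as a stand-in for the system's active ANSI code page; the real code page depends on your locale.

```python
text = "appended using redirect"

# Assumption: cp1252 stands in for the active "ANSI" code page.
ansi_bytes = text.encode("cp1252")

# "utf-16-le" corresponds to what Windows PowerShell calls "Unicode" (BOM not included here).
utf16_bytes = text.encode("utf-16-le")

# One byte per character versus two bytes per character.
print(len(ansi_bytes), len(utf16_bytes))

# For ASCII-range characters, every second byte is NUL.
print(utf16_bytes[:8])
```

The second print shows the same a\x00p\x00... pattern that appears in the question's Python output.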

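Therefore, unless the target file's existing content uses the same encoding, you end up with a mix of different encodings, which is what happened in your case.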
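The mixed-encoding file from the question can be reproduced in memory. This is only a sketch: cp1252 is again assumed for the ANSI part, and the two byte strings are built by hand rather than written by PowerShell.

```python
# Assumption: cp1252 stands in for the active ANSI code page.
ansi_part = "appended using add-content\r\n".encode("cp1252")    # as written by Add-Content
utf16_part = "appended using redirect\r\n".encode("utf-16-le")   # as appended by >>

# The file on disk is simply both byte runs back to back.
mixed = ansi_part + utf16_part

# Decoding the whole thing as single-byte text exposes the NUL bytes,
# much like read_test.py does.
as_ansi = mixed.decode("cp1252")
print(repr(as_ansi))

# Decoding each part with the encoding it was written in recovers the text.
recovered = ansi_part.decode("cp1252") + utf16_part.decode("utf-16-le")
print(repr(recovered))
```

No single encoding decodes the whole file correctly, which is why the fix is to keep every write in the same encoding.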

Add-Content, by contrast, does try to detect a file's existing encoding, but you used it on an empty file. In that case Set-Content's default encoding is applied, which in Windows PowerShell is the encoding named Default, referring to your system's active ANSI code page.
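As an illustration of why an empty file leaves nothing to detect, here is a minimal BOM-sniffing sketch in Python. The helper name sniff_bom is hypothetical, and this is not PowerShell's actual detection logic; it only shows the general idea of looking at the first bytes of existing content.

```python
import codecs

def sniff_bom(data: bytes):
    """Hypothetical helper: return a Python codec name if data starts with a known BOM, else None."""
    # UTF-32 must be checked first, because its little-endian BOM begins with the UTF-16 LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: the encoding cannot be determined this way, so a default must be used.
    return None

print(sniff_bom(b""))                                               # empty file
print(sniff_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))    # file started with >
print(sniff_bom(b"appended using add-content\r\n"))                 # BOM-less ANSI text
```

An empty file and a BOM-less ANSI file both yield None here, so any tool taking this approach has to fall back to some default encoding for them.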

Therefore, to match the single-byte ANSI encoding initially created by your Add-Content call when appending further content, use Out-File -Append -Encoding Default instead of >>, or simply keep using Add-Content.

Alternatively, pick a different encoding with Add-Content -Encoding ... and match it in the Out-File -Append call. UTF-8 is generally the best choice, though note that when you create a UTF-8 file in Windows PowerShell, it will start with a BOM (a pseudo byte-order mark identifying the file as UTF-8, which Unix-like platforms typically do not expect).
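If you then read such a file from Python, the BOM matters. The sketch below builds the bytes in memory (it does not call PowerShell) and uses Python's standard utf-8-sig codec, which writes the BOM on encoding and strips it on decoding.

```python
import codecs

text = "appended using add-content\n"

with_bom = text.encode("utf-8-sig")   # UTF-8 preceded by the 3-byte BOM
without_bom = text.encode("utf-8")    # BOM-less UTF-8

# The only difference is the 3 leading BOM bytes.
print(with_bom[:3] == codecs.BOM_UTF8, len(with_bom) - len(without_bom))

# Decoding as plain utf-8 keeps the BOM as a leading U+FEFF character...
print(repr(with_bom.decode("utf-8")[:1]))

# ...whereas utf-8-sig strips it and also accepts BOM-less input.
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig"))
```

Passing encoding="utf-8-sig" to open() is therefore a reasonable choice when reading UTF-8 files that may or may not start with a BOM.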

In PowerShell v5.1+ you may also change the default encoding globally, including for > and >> (which isn't possible in earlier versions). To change to UTF-8, for instance, use:
$PSDefaultParameterValues['*:Encoding']='UTF8'

Aside from the different default encodings (in Windows PowerShell), it is important to note that Set-Content / Add-Content on the one hand and > / >> / Out-File [-Append] on the other behave fundamentally differently with non-string input:

In short: the former apply simple .ToString() formatting to the input objects, whereas the latter perform the same output formatting you would see in the console; see this answer for details.
