What could cause "No mapping for the Unicode character exists in the target multi-byte code page"?

Question

I have a bug report showing an EEncodingError. The log points to TFile.AppendAllText. I call TFile.AppendAllText in this procedure of mine:

procedure WriteToFile(CONST FileName: string; CONST uString: string; CONST WriteOp: WriteOpperation; ForceFolder: Boolean= FALSE);     // Works with UNC paths
begin
 if NOT ForceFolder
 OR (ForceFolder AND ForceDirectoriesMsg(ExtractFilePath(FileName))) then
   if WriteOp= (woOverwrite)
   then IOUtils.TFile.WriteAllText (FileName, uString)
   else IOUtils.TFile.AppendAllText(FileName, uString);
end;

This is the information from EurekaLog.

What can cause this to happen?

Answer

This program reproduces the error that you report:

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

var
  FileName: string;

begin
  try
    FileName := TPath.GetTempFileName;
    TFile.WriteAllText(FileName, 'é', TEncoding.ANSI);
    TFile.AppendAllText(FileName, 'é');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.

Here I have written the original file as ANSI and then called AppendAllText, which will try to write as UTF-8. What happens is that we end up in this function:

class procedure TFile.AppendAllText(const Path, Contents: string);
var
  LFileStream: TFileStream;
  LFileEncoding: TEncoding; // encoding of the file
  Buff: TBytes;
  Preamble: TBytes;
  UTFStr: TBytes;
  UTF8Str: TBytes;
begin
  CheckAppendAllTextParameters(Path, nil, False);

  LFileStream := nil;
  try
    try
      LFileStream := DoCreateOpenFile(Path);
      // detect the file encoding
      LFileEncoding := GetEncoding(LFileStream);

      // file is written is ASCII (default ANSI code page)
      if LFileEncoding = TEncoding.ANSI then
      begin
        // Contents can be represented as ASCII;
        // append the contents in ASCII

        UTFStr := TEncoding.ANSI.GetBytes(Contents);
        UTF8Str := TEncoding.UTF8.GetBytes(Contents);

        if TEncoding.UTF8.GetString(UTFStr) = TEncoding.UTF8.GetString(UTF8Str) then
        begin
          LFileStream.Seek(0, TSeekOrigin.soEnd);
          Buff := TEncoding.ANSI.GetBytes(Contents);
        end
        // Contents can be represented only in UTF-8;
        // convert file and Contents encodings to UTF-8
        else
        begin
          // convert file contents to UTF-8
          LFileStream.Seek(0, TSeekOrigin.soBeginning);
          SetLength(Buff, LFileStream.Size);
          LFileStream.ReadBuffer(Buff, Length(Buff));
          Buff := TEncoding.Convert(LFileEncoding, TEncoding.UTF8, Buff);

          // prepare the stream to rewrite the converted file contents
          LFileStream.Size := Length(Buff);
          LFileStream.Seek(0, TSeekOrigin.soBeginning);
          Preamble := TEncoding.UTF8.GetPreamble;
          LFileStream.WriteBuffer(Preamble, Length(Preamble));
          LFileStream.WriteBuffer(Buff, Length(Buff));

          // convert Contents in UTF-8
          Buff := TEncoding.UTF8.GetBytes(Contents);
        end;
      end
      // file is written either in UTF-8 or Unicode (BE or LE);
      // append Contents encoded in UTF-8 to the file
      else
      begin
        LFileStream.Seek(0, TSeekOrigin.soEnd);
        Buff := TEncoding.UTF8.GetBytes(Contents);
      end;

      // write Contents to the stream
      LFileStream.WriteBuffer(Buff, Length(Buff));
    except
      on E: EFileStreamError do
        raise EInOutError.Create(E.Message);
    end;
  finally
    LFileStream.Free;
  end;
end;

The error stems from this line:

if TEncoding.UTF8.GetString(UTFStr) = TEncoding.UTF8.GetString(UTF8Str) then

The problem is that UTFStr is not in fact valid UTF-8: it holds the ANSI-encoded bytes of Contents. Hence TEncoding.UTF8.GetString(UTFStr) raises an exception.
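To make the byte-level difference concrete, here is a minimal sketch. It assumes a Windows system whose ANSI code page is Windows-1252, where 'é' is the single byte E9 in ANSI but the two bytes C3 A9 in UTF-8. The helper DumpBytes is a hypothetical name introduced only for this illustration, and whether the decode step raises may depend on the Delphi version.

```delphi
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper for this illustration: prints a byte array as hex.
procedure DumpBytes(const Caption: string; const Bytes: TBytes);
var
  B: Byte;
begin
  Write(Caption, ':');
  for B in Bytes do
    Write(' ', IntToHex(B, 2));
  Writeln;
end;

var
  Sample, Decoded: string;
  AnsiBytes: TBytes;

begin
  Sample := #$00E9; // 'é', written as a code point to avoid source-file encoding issues
  AnsiBytes := TEncoding.ANSI.GetBytes(Sample);

  DumpBytes('ANSI ', AnsiBytes);
  DumpBytes('UTF-8', TEncoding.UTF8.GetBytes(Sample));

  try
    // Same mistake as in AppendAllText: ANSI bytes decoded as if they were UTF-8.
    Decoded := TEncoding.UTF8.GetString(AnsiBytes);
    Writeln('Decoded without error, length ', Length(Decoded));
  except
    on E: EEncodingError do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

A lone E9 byte is a UTF-8 lead byte with no continuation bytes after it, so it is not a valid UTF-8 sequence, which is why the decode attempt fails.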

This is a defect in TFile.AppendAllText. Given that it knows perfectly well that UTFStr is ANSI encoded, it makes no sense at all for it to call TEncoding.UTF8.GetString on it.

You should submit a bug report to Embarcadero for this defect, which still exists in Delphi 10 Seattle. In the meantime you should not use TFile.AppendAllText.
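One possible way to avoid TFile.AppendAllText in the meantime is to append the bytes yourself with an explicit encoding. The following is only a sketch, not a drop-in replacement: AppendTextUtf8 is a hypothetical name, and it assumes that the existing file (if any) is already UTF-8, because it does no encoding detection or conversion.

```delphi
uses
  System.SysUtils, System.Classes;

// Hypothetical helper: appends Text to FileName as UTF-8 bytes.
// Assumption: any existing content of the file is already UTF-8.
// No BOM is written and the existing content is never re-encoded.
procedure AppendTextUtf8(const FileName, Text: string);
var
  Stream: TFileStream;
  Bytes: TBytes;
begin
  if FileExists(FileName) then
    Stream := TFileStream.Create(FileName, fmOpenReadWrite or fmShareDenyWrite)
  else
    Stream := TFileStream.Create(FileName, fmCreate);
  try
    // Move to the end of the file so that the new bytes are appended.
    Stream.Position := Stream.Size;
    Bytes := TEncoding.UTF8.GetBytes(Text);
    if Length(Bytes) > 0 then
      Stream.WriteBuffer(Bytes, Length(Bytes));
  finally
    Stream.Free;
  end;
end;
```

In the WriteToFile procedure from the question, the else branch could then call AppendTextUtf8(FileName, uString) instead of IOUtils.TFile.AppendAllText. Because the encoding is fixed rather than guessed from the file's contents, the faulty comparison is never reached, but appending to a file that is actually ANSI would produce mixed encodings.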
