NSLog() vs printf() when printing C strings (UTF-8)


Question

I have noticed that if I print a byte array containing the UTF-8 representation of a string using the format specifier "%s", printf() gets it right but NSLog() garbles it (i.e., each byte is printed as-is, so for example "¥" is printed as the two characters "¬•"). This is curious, because I always thought that NSLog() was just printf(), plus:

The first parameter (the "format") is an Objective-C string, not a C string (hence the "@").
A timestamp and the app name are prepended.
A newline is automatically added at the end.
The ability to print Objective-C objects (using the format "%@").

My code:

NSString* string;

// (...fill string with unicode string...)

const char* stringBytes = [string cStringUsingEncoding:NSUTF8Encoding];

NSUInteger stringByteLength = [string lengthOfBytesUsingEncoding:NSUTF8Encoding];
stringByteLength += 1; // add room for '\0' terminator

char* buffer = calloc(stringByteLength, sizeof(char)); // calloc takes (count, size)

memcpy(buffer, stringBytes, stringByteLength);

NSLog(@"Buffer after copy: %s", buffer);
// (renders ascii, no matter what)

printf("Buffer after copy: %s\n", buffer);
// (renders correctly, e.g. japanese text)

Somehow, it looks as if printf() is "smarter" than NSLog(). Does anyone know the underlying cause, and whether this behavior is documented anywhere? (I couldn't find anything.)

Recommended answer

NSLog() and stringWithFormat: seem to expect the string for %s to be in the "system encoding" (for example, "Mac Roman" on my computer):

NSString *string = @"¥";
NSStringEncoding enc = CFStringConvertEncodingToNSStringEncoding(CFStringGetSystemEncoding());
const char* stringBytes = [string cStringUsingEncoding:enc];
NSString *log = [NSString stringWithFormat:@"%s", stringBytes];
NSLog(@"%@", log);

// Output: ¥

Of course, this will fail if some characters are not representable in the system encoding. I could not find any official documentation for this behavior, but one can see that using %s in stringWithFormat: or NSLog() does not work reliably with arbitrary UTF-8 strings.
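To see why "¥" can come out as "¬•", it may help to look at the bytes. In UTF-8, "¥" (U+00A5) is the two bytes 0xC2 0xA5. In Mac OS Roman, 0xC2 is "¬" and 0xA5 is "•", so decoding the UTF-8 bytes one at a time in that encoding gives exactly the garbled output from the question. The plain C sketch below (which also compiles as Objective-C) illustrates this. The helper names mac_roman_to_utf8 and decode_as_mac_roman are hypothetical, and the table covers only the two high bytes needed here. It is an illustration of the suspected mechanism, not a description of how NSLog() is actually implemented.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the UTF-8 text for one Mac OS Roman byte.
   Only the two non-ASCII bytes needed for this example are mapped;
   any other non-ASCII byte falls back to "?". */
static const char *mac_roman_to_utf8(unsigned char byte, char ascii[2])
{
    if (byte < 0x80) {
        /* The ASCII range is identical in Mac OS Roman and UTF-8. */
        ascii[0] = (char)byte;
        ascii[1] = '\0';
        return ascii;
    }
    switch (byte) {
    case 0xA5: return "\xE2\x80\xA2"; /* U+2022 BULLET, "•" */
    case 0xC2: return "\xC2\xAC";     /* U+00AC NOT SIGN, "¬" */
    default:   return "?";            /* not covered by this sketch */
    }
}

/* Hypothetical helper: treats every byte of `bytes` as one Mac OS Roman
   character and writes the result to `out` as NUL-terminated UTF-8.
   Returns 0 on success, or -1 if `out` is too small. */
int decode_as_mac_roman(const char *bytes, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (const unsigned char *p = (const unsigned char *)bytes; *p != '\0'; ++p) {
        char ascii[2];
        const char *piece = mac_roman_to_utf8(*p, ascii);
        size_t len = strlen(piece);

        if (used + len + 1 > out_size) {
            return -1;
        }
        memcpy(out + used, piece, len + 1); /* copies the terminator too */
        used += len;
    }
    return 0;
}
```

Calling decode_as_mac_roman("\xC2\xA5", out, sizeof out) with a 16-byte buffer fills out with the UTF-8 bytes for "¬•". printf() does no such decoding: it writes the two bytes unchanged, and a terminal that expects UTF-8 then shows "¥".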

If you want to check the contents of a char buffer containing a UTF-8 string, then this works with arbitrary characters (using the boxed expression syntax to create an NSString from a UTF-8 string):

NSLog(@"%@", @(utf8Buffer));
