Addendum 2The manpage for the Perl Curses shim says: 如果wget_wch()不可用(即Curses库不可用) 了解宽字符),这会调用wgetch() [获取1个字节的字符 从curses窗口],但返回 尽管如此,上述值.这可能是一个问题,因为使用 多字节字符编码(如UTF-8),您将收到两个 两字节字符的一个字符字符串(例如,Ã"和¤"表示 一种"). If wget_wch() is not available (i.e. The Curses library does not understand wide characters), this calls wgetch() [get a 1-byte char from a curses window], but returns the values described above nonetheless. This can be a problem because with a multibyte character encoding like UTF-8, you will receive two one-character strings for a two-byte-character (e.g. "Ã" and "¤" for "ä").在这里可能是这种情况,但是wget_wch()在此系统上确实存在.This may be the case here, but wget_wch() does exist on this system.试图查看C代码的作用,并直接将fprintf添加到curses/Curses-1.36/CursesFunWide.c的多字节处理代码中,重新编译后,没有设法通过LD_LIBRARY_PATH用我自己的系统覆盖Curses.so(为什么为什么呢?为什么一切都只能在一半时间内工作?),因此直接替换了系统库(使用THAT!).Tried to see what the C code does and added an fprintf directly into the multibyte handling code of curses/Curses-1.36/CursesFunWide.c, recompiled, didn't manage to override the system Curses.so with my own via LD_LIBRARY_PATH (why not? why is everything only working half of the time?), so replaced the system library directly in place (take THAT!).#ifdef C_GET_WCH wint_t wch; int ret = wget_wch(win, &wch); if (ret == OK) { ST(0) = sv_newmortal(); fprintf(stderr,"Obtained win_t 0x%04lx\n", wch); c_wchar2sv(ST(0), wch); XSRETURN(1); } else if (ret == KEY_CODE_YES) { XST_mUNDEF(0); ST(1) = sv_newmortal(); sv_setiv(ST(1), (IV)wch); XSRETURN(2); } else { XSRETURN_UNDEF; }#else那只是一个胖的NOPE,当按下ü时,会看到:That's just a fat NOPE, when pressing ü one sees:Obtained win_t 0x00fc因此,可以运行正确的代码,但是数据是 ISO-8859-1 ,而不是UTF-8.因此,wget_wch的行为不佳.因此,这是一个curses配置问题.嗯.So the correct code is run, but the data is ISO-8859-1, not UTF-8. So it's wget_wch which behaves badly. So it's a curses config problem. Huh.令我惊讶的是,也许ncurses假定使用默认语言环境,即C.为了使ncurses使用宽字符,必须初始化语言环境",这可能意味着将状态从未设置"(从而使ncurses退回到C)移动到设置为系统上的内容指示"(应该是LANG环境变量中的内容). ncurses的手册页说:It struck me that maybe ncurses was assuming default locale, i.e. C. To make it ncurses work with wide characters, one has to "initialize the locale", which probably means moving state from "unset" (and thus making ncurses fall back to C) to "set to what the system indicates" (which should be what is in the LANG environment variable). The man page for ncurses says: 该库使用调用程序已初始化的语言环境. 通常是通过setlocale完成的: The library uses the locale which the calling program has initialized. That is normally done with setlocale: setlocale(LC_ALL,");setlocale(LC_ALL, "");如果未初始化语言环境,则库将假定字符 可以按照ISO-8859-1进行打印,以与某些旧版程序配合使用. 您应该初始化语言环境,而不要依赖于 尚未设置语言环境的库.If the locale is not initialized, the library assumes that characters are printable as in ISO-8859-1, to work with certain legacy programs. You should initialize the locale and not rely on specific details of the library when the locale has not been setup.这也不起作用,但我认为解决方案就在这条路上.This didn't work either, but I feel that the solution is down that road.来自win_t(显然与wchar_t相同)转换代码> CursesWide.c ,将从wget_wch()接收到的wint_t(在此处显示为wchar_t)转换为Perl字符串. SV是标量值"类型.The win_t (apparently the same as wchar_t) conversion code from CursesWide.c, converts the wint_t (here seen as wchar_t) received from wget_wch() into a Perl string. SV is the "scalar value" type.另请参见: https://perldoc.perl.org/perlguts.html在这里插入了两个fprintf,以查看发生了什么情况:Here with two fprintf inserted to see what is going on:static voidc_wchar2sv(SV * const sv, wchar_t const wc) {/*---------------------------------------------------------------------------- Set SV to a one-character (not -byte!) Perl string holding a given wide character-----------------------------------------------------------------------------*/ if (wc <= 0xff) { char s[] = { wc, 0 }; fprintf(stderr,"Not UTF-8 string: %02x %02x\n", ((int)s[0])&0xFF, ((int)s[1])&0xFF); sv_setpv(sv, s); SvPOK_on(sv); SvUTF8_off(sv); } else { char s[UTF8_MAXBYTES + 1] = { 0 }; char *s_end = (char *)UVCHR_TO_UTF8((U8 *)s, wc); *s_end = 0; fprintf(stderr,"UTF-8 string: %02x %02x %02x\n", ((int)s[0])&0xFF, ((int)s[1])&0xFF, ((int)s[2])&0xFF); sv_setpv(sv, s); SvPOK_on(sv); SvUTF8_on(sv); }}使用perl-Curses测试代码尝试过perl-Curses-1.36-9.fc30.x86_64尝试过perl-Curses-1.36-11.fc31.x86_64Test code using perl-CursesTried with perl-Curses-1.36-9.fc30.x86_64Tried with perl-Curses-1.36-11.fc31.x86_64 如果尝试这样做,请按BACKSPACE退出循环,因为不再解释CTRL-C.下面有很多代码,但关键区域标有----- Testing:A lot of code below, but the critical area is marked with ----- Testing:#!/usr/bin/perl# pmap -p PID# shows the per process using# /usr/lib64/libncursesw.so.6.1# /usr/lib64/perl5/vendor_perl/auto/Curses/Curses.so# Trying https://metacpan.org/release/Cursesuse warnings;use strict;use utf8; # Meaning "This lexical scope (i.e. file) contains utf8"use Curses; # On Fedora: dnf install perl-Curses# This didn't fix it# https://perldoc.perl.org/open.htmluse open qw(:std :encoding(UTF-8));# https://perldoc.perl.org/perllocale.html#The-setlocale-functionuse POSIX ();my $loc = POSIX::setlocale(&POSIX::LC_ALL, "");# ---# Surrounds the actual program# ---sub setup() { initscr(); raw(); keypad(1); noecho();}sub teardown { endwin();}# ---# Mainly for prettyprinting# ---my $special_keys = setup_special_keys();# ---# Error printing# ---sub mt { return sprintf("%i: ",time());}sub ae { my ($x,$fname) = @_; if ($x == ERR) { printw mt(); printw "Got error code from '$fname': $x\n" }}# ---# Where the action is# ---sub announce { my $res = printw "Type any character to see it in bold! (or backspace to exit)\n"; ae($res, "printw"); return { refresh => 1 }}sub read1ch { # Read a next character, waiting until it is there. # Use the wide-character aware functions unless you want to deal with # collating individual bytes yourself! # Readings: # https://metacpan.org/pod/Curses#Wide-Character-Aware-Functions # https://perldoc.perl.org/perlunicode.html#Unicode-Character-Properties # https://www.ahinea.com/en/tech/perl-unicode-struggle.html # https://hexdump.wordpress.com/2009/06/19/character-encoding-issues-part-ii-perl/ my ($ch, $key) = getchar(); if (defined $key) { # it's a function key printw "Function key pressed: $key"; printw " with known alias '" . $$special_keys{$key} . "'" if (exists $$special_keys{$key}); printw "\n"; # done if backspace was hit return { done => ($key == KEY_BACKSPACE()) } } elsif (defined $ch) { # "$ch" should be a String of 1 character # ----- Testing printw "Locale: $loc\n"; printw "Multibyte output test: öüäéèà периоду\n"; printw sprintf("Received string '%s' of length %i with ordinal 0x%x\n", $ch, length($ch), ord($ch)); { # https://perldoc.perl.org/bytes.html use bytes; printw sprintf("... length is %i\n" , length($ch)); printw sprintf("... contents are %vd\n" , $ch); } # ----- Testing return { ch => $ch } } else { # it's an error printw "getchar() failed\n"; return {} }}sub feedback { my ($ch) = @_; printw "The pressed key is: "; attron(A_BOLD); printw("%s\n","$ch"); # do not print $txt directly to make sure escape sequences are not interpreted! attroff(A_BOLD); return { refresh => 1 } # should refresh}sub do_curses_run { setup; my $done = 0; while (!$done) { my $bubl; $bubl = announce(); refresh() if $$bubl{refresh}; $bubl = read1ch(); $done = $$bubl{done}; if (defined $$bubl{ch}) { $bubl = feedback($$bubl{ch}); refresh() if $$bubl{refresh}; } } teardown;}# ---# main# ---do_curses_run();sub setup_special_keys { # the key codes on the left must be called once to resolve to a numeric constant! my $res = { KEY_BREAK() => "Break key", KEY_DOWN() => "Arrow down", KEY_UP() => "Arrow up", KEY_LEFT() => "Arrow left", KEY_RIGHT() => "Arrow right", KEY_HOME() => "Home key", KEY_BACKSPACE() => "Backspace", KEY_DL() => "Delete line", KEY_IL() => "Insert line", KEY_DC() => "Delete character", KEY_IC() => "Insert char or enter insert mode", KEY_EIC() => "Exit insert char mode", KEY_CLEAR() => "Clear screen", KEY_EOS() => "Clear to end of screen", KEY_EOL() => "Clear to end of line", KEY_SF() => "Scroll 1 line forward", KEY_SR() => "Scroll 1 line backward (reverse)", KEY_NPAGE() => "Next page", KEY_PPAGE() => "Previous page", KEY_STAB() => "Set tab", KEY_CTAB() => "Clear tab", KEY_CATAB() => "Clear all tabs", KEY_ENTER() => "Enter or send", KEY_SRESET() => "Soft (partial) reset", KEY_RESET() => "Reset or hard reset", KEY_PRINT() => "Print or copy", KEY_LL() => "Home down or bottom (lower left)", KEY_A1() => "Upper left of keypad", KEY_A3() => "Upper right of keypad", KEY_B2() => "Center of keypad", KEY_C1() => "Lower left of keypad", KEY_C3 () => "Lower right of keypad", KEY_BTAB() => "Back tab key", KEY_BEG() => "Beg(inning) key", KEY_CANCEL() => "Cancel key", KEY_CLOSE() => "Close key", KEY_COMMAND() => "Cmd (command) key", KEY_COPY() => "Copy key", KEY_CREATE() => "Create key", KEY_END() => "End key", KEY_EXIT() => "Exit key", KEY_FIND() => "Find key", KEY_HELP() => "Help key", KEY_MARK() => "Mark key", KEY_MESSAGE() => "Message key", KEY_MOUSE() => "Mouse event read", KEY_MOVE() => "Move key", KEY_NEXT() => "Next object key", KEY_OPEN() => "Open key", KEY_OPTIONS() => "Options key", KEY_PREVIOUS() => "Previous object key", KEY_REDO() => "Redo key", KEY_REFERENCE() => "Ref(erence) key", KEY_REFRESH() => "Refresh key", KEY_REPLACE() => "Replace key", KEY_RESIZE() => "Screen resized", KEY_RESTART() => "Restart key", KEY_RESUME() => "Resume key", KEY_SAVE() => "Save key", KEY_SBEG() => "Shifted beginning key", KEY_SCANCEL() => "Shifted cancel key", KEY_SCOMMAND() => "Shifted command key", KEY_SCOPY() => "Shifted copy key", KEY_SCREATE() => "Shifted create key", KEY_SDC() => "Shifted delete char key", KEY_SDL() => "Shifted delete line key", KEY_SELECT() => "Select key", KEY_SEND() => "Shifted end key", KEY_SEOL() => "Shifted clear line key", KEY_SEXIT() => "Shifted exit key", KEY_SFIND() => "Shifted find key", KEY_SHELP() => "Shifted help key", KEY_SHOME() => "Shifted home key", KEY_SIC() => "Shifted input key", KEY_SLEFT() => "Shifted left arrow key", KEY_SMESSAGE() => "Shifted message key", KEY_SMOVE() => "Shifted move key", KEY_SNEXT() => "Shifted next key", KEY_SOPTIONS() => "Shifted options key", KEY_SPREVIOUS() => "Shifted prev key", KEY_SPRINT() => "Shifted print key", KEY_SREDO() => "Shifted redo key", KEY_SREPLACE() => "Shifted replace key", KEY_SRIGHT() => "Shifted right arrow", KEY_SRSUME() => "Shifted resume key", KEY_SSAVE() => "Shifted save key", KEY_SSUSPEND() => "Shifted suspend key", KEY_SUNDO() => "Shifted undo key", KEY_SUSPEND() => "Suspend key", KEY_UNDO() => "Undo key" }; for (my $f = 1; $f <= 64; $f++) { $$res{KEY_F($f)} = "KEY_F($f)" } return $res}推荐答案实际上看起来是正确的.Actually it looks correct.使用 strace 运行脚本可以帮助...我这样做是为了查看系统调用:Running your script with strace can help... I did this to see the system calls:strace -fo strace.out -s 1024 ./foo,并且可以看到读取内容,消息等.使用打包库可以对ncurses进行类似的跟踪,尽管打包程序在提供跟踪功能方面并不一致.and could see the reads, messages, etc. Getting a similar trace for ncurses could be done using a debug-library, though packagers haven't been consistent about providing one with tracing enabled. ü 是 \303\274 (八进制),其Unicode值是 252 (十进制)或 0xfc (十六进制).问题的这一部分似乎错过了这一点:ü in UTF-8 is \303\274 (octal), and its Unicode value is 252 (decimal), or 0xfc (hexadecimal). This part of the question seems to have missed that point: 那只是一个肥腻的NOPE,按ü会看到: That's just a fat NOPE, when pressing ü one sees: Obtained win_t 0x00fc因此将运行正确的代码,但数据为ISO-8859-1,而不是UTF-8.所以是wget_wch表现不佳.因此,这是一个curses配置问题.嗯.So the correct code is run, but the data is ISO-8859-1, not UTF-8. So it's wget_wch which behaves badly. So it's a curses config problem. Huh. wget_wch 返回(出于实际目的)Unicode值(不是UTF-8字节序列). ISO-8859-1代码160-255碰巧(并非偶然)匹配Unicode代码点,尽管后者在UTF-8中肯定会以不同的方式编码. wgetch 将返回UTF-8字节,但Perl脚本仅将其用作后备(因为这将导致Perl脚本将UTF-8字符串转换为Unicode值).wgetch would return the UTF-8 bytes, but the Perl script would only use that as a fallback (since that would lead to having the Perl script convert UTF-8 strings to Unicode values). 这篇关于Perl + Curses:期望从getchar()获得UTF-8编码的多字节字符,但没有得到任何的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!
09-13 15:00