In Perl, there's the perluniprops index of Unicode 7 properties (http://perldoc.perl.org/perluniprops.html), which lets me do the following to pad opening and closing punctuation with spaces:
s/(\p{Open_Punctuation})/ $1 /g;
s/(\p{Close_Punctuation})/ $1 /g;
What is the full list of opening/closing punctuation characters that get padded when using Perl? And what is the equivalent in Python?
Related question: Padding multiple character with space - python; this question was asked separately because the answerer there suggested it should be its own question.
Are you asking how to determine what's the corresponding closing punctuation for a given open punctuation? Unicode does not define this. In fact, there's not even a 1:1 relationship.
$ unichars '\p{Open_Punctuation}' | wc -l
75
$ unichars '\p{Close_Punctuation}' | wc -l
73
However, it should be relatively easy for you to build your own mapping.
$ unichars '\p{Open_Punctuation}' | cat
( U+0028 LEFT PARENTHESIS
[ U+005B LEFT SQUARE BRACKET
{ U+007B LEFT CURLY BRACKET
༺ U+0F3A TIBETAN MARK GUG RTAGS GYON
༼ U+0F3C TIBETAN MARK ANG KHANG GYON
᚛ U+169B OGHAM FEATHER MARK
‚ U+201A SINGLE LOW-9 QUOTATION MARK
„ U+201E DOUBLE LOW-9 QUOTATION MARK
⁅ U+2045 LEFT SQUARE BRACKET WITH QUILL
⁽ U+207D SUPERSCRIPT LEFT PARENTHESIS
₍ U+208D SUBSCRIPT LEFT PARENTHESIS
⌈ U+2308 LEFT CEILING
⌊ U+230A LEFT FLOOR
〈 U+2329 LEFT-POINTING ANGLE BRACKET
❨ U+2768 MEDIUM LEFT PARENTHESIS ORNAMENT
❪ U+276A MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
❬ U+276C MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
❮ U+276E HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
❰ U+2770 HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
❲ U+2772 LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
❴ U+2774 MEDIUM LEFT CURLY BRACKET ORNAMENT
⟅ U+27C5 LEFT S-SHAPED BAG DELIMITER
⟦ U+27E6 MATHEMATICAL LEFT WHITE SQUARE BRACKET
⟨ U+27E8 MATHEMATICAL LEFT ANGLE BRACKET
⟪ U+27EA MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
⟬ U+27EC MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
⟮ U+27EE MATHEMATICAL LEFT FLATTENED PARENTHESIS
⦃ U+2983 LEFT WHITE CURLY BRACKET
⦅ U+2985 LEFT WHITE PARENTHESIS
⦇ U+2987 Z NOTATION LEFT IMAGE BRACKET
⦉ U+2989 Z NOTATION LEFT BINDING BRACKET
⦋ U+298B LEFT SQUARE BRACKET WITH UNDERBAR
⦍ U+298D LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
⦏ U+298F LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
⦑ U+2991 LEFT ANGLE BRACKET WITH DOT
⦓ U+2993 LEFT ARC LESS-THAN BRACKET
⦕ U+2995 DOUBLE LEFT ARC GREATER-THAN BRACKET
⦗ U+2997 LEFT BLACK TORTOISE SHELL BRACKET
⧘ U+29D8 LEFT WIGGLY FENCE
⧚ U+29DA LEFT DOUBLE WIGGLY FENCE
⧼ U+29FC LEFT-POINTING CURVED ANGLE BRACKET
⸢ U+2E22 TOP LEFT HALF BRACKET
⸤ U+2E24 BOTTOM LEFT HALF BRACKET
⸦ U+2E26 LEFT SIDEWAYS U BRACKET
⸨ U+2E28 LEFT DOUBLE PARENTHESIS
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
〈 U+3008 LEFT ANGLE BRACKET
《 U+300A LEFT DOUBLE ANGLE BRACKET
「 U+300C LEFT CORNER BRACKET
『 U+300E LEFT WHITE CORNER BRACKET
【 U+3010 LEFT BLACK LENTICULAR BRACKET
〔 U+3014 LEFT TORTOISE SHELL BRACKET
〖 U+3016 LEFT WHITE LENTICULAR BRACKET
〘 U+3018 LEFT WHITE TORTOISE SHELL BRACKET
〚 U+301A LEFT WHITE SQUARE BRACKET
〝 U+301D REVERSED DOUBLE PRIME QUOTATION MARK
﴿ U+FD3F ORNATE RIGHT PARENTHESIS
︗ U+FE17 PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET
︵ U+FE35 PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
︷ U+FE37 PRESENTATION FORM FOR VERTICAL LEFT CURLY BRACKET
︹ U+FE39 PRESENTATION FORM FOR VERTICAL LEFT TORTOISE SHELL BRACKET
︻ U+FE3B PRESENTATION FORM FOR VERTICAL LEFT BLACK LENTICULAR BRACKET
︽ U+FE3D PRESENTATION FORM FOR VERTICAL LEFT DOUBLE ANGLE BRACKET
︿ U+FE3F PRESENTATION FORM FOR VERTICAL LEFT ANGLE BRACKET
﹁ U+FE41 PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
﹃ U+FE43 PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
﹇ U+FE47 PRESENTATION FORM FOR VERTICAL LEFT SQUARE BRACKET
﹙ U+FE59 SMALL LEFT PARENTHESIS
﹛ U+FE5B SMALL LEFT CURLY BRACKET
﹝ U+FE5D SMALL LEFT TORTOISE SHELL BRACKET
（ U+FF08 FULLWIDTH LEFT PARENTHESIS
［ U+FF3B FULLWIDTH LEFT SQUARE BRACKET
｛ U+FF5B FULLWIDTH LEFT CURLY BRACKET
｟ U+FF5F FULLWIDTH LEFT WHITE PARENTHESIS
｢ U+FF62 HALFWIDTH LEFT CORNER BRACKET
$ unichars '\p{Close_Punctuation}' | cat
) U+0029 RIGHT PARENTHESIS
] U+005D RIGHT SQUARE BRACKET
} U+007D RIGHT CURLY BRACKET
༻ U+0F3B TIBETAN MARK GUG RTAGS GYAS
༽ U+0F3D TIBETAN MARK ANG KHANG GYAS
᚜ U+169C OGHAM REVERSED FEATHER MARK
⁆ U+2046 RIGHT SQUARE BRACKET WITH QUILL
⁾ U+207E SUPERSCRIPT RIGHT PARENTHESIS
₎ U+208E SUBSCRIPT RIGHT PARENTHESIS
⌉ U+2309 RIGHT CEILING
⌋ U+230B RIGHT FLOOR
〉 U+232A RIGHT-POINTING ANGLE BRACKET
❩ U+2769 MEDIUM RIGHT PARENTHESIS ORNAMENT
❫ U+276B MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
❭ U+276D MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
❯ U+276F HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
❱ U+2771 HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
❳ U+2773 LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
❵ U+2775 MEDIUM RIGHT CURLY BRACKET ORNAMENT
⟆ U+27C6 RIGHT S-SHAPED BAG DELIMITER
⟧ U+27E7 MATHEMATICAL RIGHT WHITE SQUARE BRACKET
⟩ U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET
⟫ U+27EB MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
⟭ U+27ED MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
⟯ U+27EF MATHEMATICAL RIGHT FLATTENED PARENTHESIS
⦄ U+2984 RIGHT WHITE CURLY BRACKET
⦆ U+2986 RIGHT WHITE PARENTHESIS
⦈ U+2988 Z NOTATION RIGHT IMAGE BRACKET
⦊ U+298A Z NOTATION RIGHT BINDING BRACKET
⦌ U+298C RIGHT SQUARE BRACKET WITH UNDERBAR
⦎ U+298E RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
⦐ U+2990 RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
⦒ U+2992 RIGHT ANGLE BRACKET WITH DOT
⦔ U+2994 RIGHT ARC GREATER-THAN BRACKET
⦖ U+2996 DOUBLE RIGHT ARC LESS-THAN BRACKET
⦘ U+2998 RIGHT BLACK TORTOISE SHELL BRACKET
⧙ U+29D9 RIGHT WIGGLY FENCE
⧛ U+29DB RIGHT DOUBLE WIGGLY FENCE
⧽ U+29FD RIGHT-POINTING CURVED ANGLE BRACKET
⸣ U+2E23 TOP RIGHT HALF BRACKET
⸥ U+2E25 BOTTOM RIGHT HALF BRACKET
⸧ U+2E27 RIGHT SIDEWAYS U BRACKET
⸩ U+2E29 RIGHT DOUBLE PARENTHESIS
〉 U+3009 RIGHT ANGLE BRACKET
》 U+300B RIGHT DOUBLE ANGLE BRACKET
」 U+300D RIGHT CORNER BRACKET
』 U+300F RIGHT WHITE CORNER BRACKET
】 U+3011 RIGHT BLACK LENTICULAR BRACKET
〕 U+3015 RIGHT TORTOISE SHELL BRACKET
〗 U+3017 RIGHT WHITE LENTICULAR BRACKET
〙 U+3019 RIGHT WHITE TORTOISE SHELL BRACKET
〛 U+301B RIGHT WHITE SQUARE BRACKET
〞 U+301E DOUBLE PRIME QUOTATION MARK
〟 U+301F LOW DOUBLE PRIME QUOTATION MARK
﴾ U+FD3E ORNATE LEFT PARENTHESIS
︘ U+FE18 PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET
︶ U+FE36 PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS
︸ U+FE38 PRESENTATION FORM FOR VERTICAL RIGHT CURLY BRACKET
︺ U+FE3A PRESENTATION FORM FOR VERTICAL RIGHT TORTOISE SHELL BRACKET
︼ U+FE3C PRESENTATION FORM FOR VERTICAL RIGHT BLACK LENTICULAR BRACKET
︾ U+FE3E PRESENTATION FORM FOR VERTICAL RIGHT DOUBLE ANGLE BRACKET
﹀ U+FE40 PRESENTATION FORM FOR VERTICAL RIGHT ANGLE BRACKET
﹂ U+FE42 PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
﹄ U+FE44 PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
﹈ U+FE48 PRESENTATION FORM FOR VERTICAL RIGHT SQUARE BRACKET
﹚ U+FE5A SMALL RIGHT PARENTHESIS
﹜ U+FE5C SMALL RIGHT CURLY BRACKET
﹞ U+FE5E SMALL RIGHT TORTOISE SHELL BRACKET
） U+FF09 FULLWIDTH RIGHT PARENTHESIS
］ U+FF3D FULLWIDTH RIGHT SQUARE BRACKET
｝ U+FF5D FULLWIDTH RIGHT CURLY BRACKET
｠ U+FF60 FULLWIDTH RIGHT WHITE PARENTHESIS
｣ U+FF63 HALFWIDTH RIGHT CORNER BRACKET
After installing unichars with cpan Unicode::Tussle, in Python:
>>> import subprocess
>>> cmd = r"unichars '\p{Open_Punctuation}' | cut -f2 -d' ' | tr -d '\n'"
>>> open_punct = subprocess.check_output(cmd, shell=True).decode('utf8')
Smartmatch is experimental at /usr/local/bin/unichars line 546.
>>> print (open_punct)
([{༺༼᚛‚„⁅⁽₍〈❨❪❬❮❰❲❴⟅⟦⟨⟪⟬⟮⦃⦅⦇⦉⦋⦍⦏⦑⦓⦕⦗⧘⧚⧼⸢⸤⸦⸨〈《「『【〔〖〘〚〝﴾︗︵︷︹︻︽︿﹁﹃﹇﹙﹛﹝([{⦅「
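If shelling out to Perl is not an option, something similar can be sketched with the standard library alone. `\p{Open_Punctuation}` and `\p{Close_Punctuation}` correspond to the General_Category values Ps and Pe, which `unicodedata.category` reports. The helper names `chars_in_category` and `pad_brackets` are made up for this example, and the character counts may differ from the Perl output above because they depend on the Unicode version your Python ships with.

```python
import re
import sys
import unicodedata


def chars_in_category(category):
    """Return every character whose General_Category equals `category`."""
    return "".join(
        ch for ch in map(chr, range(sys.maxunicode + 1))
        if unicodedata.category(ch) == category
    )


open_punct = chars_in_category("Ps")   # Open_Punctuation
close_punct = chars_in_category("Pe")  # Close_Punctuation

# One character class covering both sets; re.escape protects [, ], and so on.
bracket_re = re.compile("([" + re.escape(open_punct + close_punct) + "])")


def pad_brackets(text):
    """Surround each opening or closing punctuation mark with spaces."""
    return bracket_re.sub(r" \1 ", text)


print(len(open_punct), len(close_punct))
print(pad_brackets("f(x)[0]"))
```

The two Perl substitutions are folded into a single pass here; since each replacement only adds spaces around the matched character, the result should be the same. The third-party `regex` module is another route if you want to keep the `\p{...}` syntax, though that is outside what this sketch covers.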