Regex: remove all punctuation except

Problem description

I have the following regex that splits on any space or punctuation. How can I exclude 1 or more punctuation characters from :punct:? Let's say I'd like to exclude apostrophes and commas. I know I could explicitly use [all punctuation marks in here] instead of [[:punct:]] but I'm hoping for an exclusion method.

X <- "I'm not that good at regex yet, but am getting better!"
strsplit(X, "[[:space:]]|(?=[[:punct:]])", perl=TRUE)

 [1] "I"       "'"       "m"       "not"     "that"    "good"    "at"      "regex"   "yet"
[10] ","       ""        "but"     "am"      "getting" "better"  "!"

Recommended answer

It's not clear to me what you want the result to be, but you might be able to use a negated character class like this answer.

strsplit(X, "[[:space:]]|(?=[^,'[:^punct:]])", perl=TRUE)[[1]]
 [1] "I'm"     "not"     "that"    "good"    "at"      "regex"   "yet,"
 [8] "but"     "am"      "getting" "better"  "!"
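
The class [^,'[:^punct:]] reads as a double negative: it matches any character that is not a comma, not an apostrophe, and not a non-punctuation character, which leaves every punctuation mark except those two. The sketch below is one way to check that by probing individual characters with grepl. The variable names, and the negative-lookahead alternative, are illustrative assumptions rather than part of the original answer.

```r
# Sample sentence from the question.
X <- "I'm not that good at regex yet, but am getting better!"

# Probe characters: the two marks to exclude, two other punctuation
# marks, a letter, and a space.
probe <- c("'", ",", "!", ".", "a", " ")

# Double-negative class: NOT (comma, apostrophe, or non-punctuation).
double_neg <- "[^,'[:^punct:]]"

# Assumed alternative: a negative lookahead placed before [[:punct:]].
lookahead <- "(?![,'])[[:punct:]]"

hits_double_neg <- grepl(double_neg, probe, perl = TRUE)
hits_lookahead  <- grepl(lookahead, probe, perl = TRUE)

print(hits_double_neg)
# [1] FALSE FALSE  TRUE  TRUE FALSE FALSE
print(identical(hits_double_neg, hits_lookahead))
# [1] TRUE

# The same class can delete the remaining punctuation instead of
# splitting on it.
cleaned <- gsub(double_neg, "", X, perl = TRUE)
print(cleaned)
# [1] "I'm not that good at regex yet, but am getting better"
```

Both patterns need perl=TRUE, since the [:^punct:] negation and the lookahead are PCRE features. To exclude other marks, add them next to the comma and apostrophe inside the class.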


06-26 22:20