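I want to split text into an array while keeping the punctuation separate from the words, so that a string like this:
Hello, I am Albert Einstein.
becomes an array like this:
["Hello", ",", "I", "am", "Albert", "Einstein", "."]
I tried
string.components(separatedBy: CharacterSet.init(charactersIn: " ,;;:"))
but this method removes all the punctuation and returns an array like this: ["Hello", "I", "am", "Albert", "Einstein"]
So how can I get an array like the one in the first example?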
Best answer
It is not a very pretty solution, but you can try:
import Foundation

let str = "Hello, I am Albert Einstein."
var list = [String]()
var currentSubString = ""

// Enumerate every character, including ".", ",", ";" and " ".
str.enumerateSubstrings(in: str.startIndex..<str.endIndex, options: .byComposedCharacterSequences) { substring, _, _, _ in
    guard let subString = substring else { return }
    let isTerminator = [" ", ",", ".", ";"].contains(subString)
    if !currentSubString.isEmpty && isTerminator {
        // A terminator ends the token under construction, so store it.
        list.append(currentSubString)
        // A space is trimmed to "", whereas punctuation is kept as the start of the next token.
        currentSubString = subString.trimmingCharacters(in: .whitespaces)
    } else if subString != " " {
        // Any character other than a space extends the current token.
        currentSubString += subString
    }
}

// Append the last token.
if !currentSubString.isEmpty {
    list.append(currentSubString)
}
The listing above uses Swift 3 syntax.
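The idea is to loop over all the characters and build the words at the same time. A word is a run of consecutive characters that are not a space, ",", "." or ";". So while building a word in the loop, if we see one of those characters and the word under construction is not empty, we finish the current word. Breaking the steps down for your input:
H (neither a space nor another terminating character) -> currentSubString = "H"
e (neither a space nor another terminating character) -> currentSubString = "He"
l (neither a space nor another terminating character) -> currentSubString = "Hel"
l (neither a space nor another terminating character) -> currentSubString = "Hell"
o (neither a space nor another terminating character) -> currentSubString = "Hello"
, (a terminating character) -> append currentSubString to list and start building the next word, so list = ["Hello"]. Unlike a space, though, a punctuation terminator has to be kept for the next word -> currentSubString = ","
" " (a space character) -> append currentSubString to list and start building again -> list = ["Hello", ","] ... and so on.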
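The walkthrough above can also be sketched as a small function. This is a variant of the same character-by-character loop, not the answer's exact code: punctuation is appended to the result as soon as it is seen instead of being held in currentSubString. The function name `tokenize`, the `punctuation` parameter, and its default set are assumptions made for illustration.

```swift
// Hypothetical helper: the name `tokenize` and the default punctuation set
// are assumptions, chosen to match the terminators used in the answer.
func tokenize(_ text: String, punctuation: Set<Character> = [",", ".", ";"]) -> [String] {
    var tokens = [String]()
    var currentWord = ""

    // Store the word under construction, if any, and start a new one.
    func flushWord() {
        if !currentWord.isEmpty {
            tokens.append(currentWord)
            currentWord = ""
        }
    }

    for character in text {
        if character == " " {
            // A space only ends the current word; it is never stored.
            flushWord()
        } else if punctuation.contains(character) {
            // Punctuation ends the current word and becomes a token of its own.
            flushWord()
            tokens.append(String(character))
        } else {
            // Any other character extends the current word.
            currentWord.append(character)
        }
    }
    flushWord()
    return tokens
}

print(tokenize("Hello, I am Albert Einstein."))
// ["Hello", ",", "I", "am", "Albert", "Einstein", "."]
```

Because punctuation is emitted immediately, this sketch also separates input such as "Hello,world" into ["Hello", ",", "world"], whereas the answer's listing would yield ["Hello", ",world"] for that input.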
Regarding "ios - Splitting text into an array while keeping punctuation in Swift", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39834953/