I want to split text into an array while keeping punctuation separate from the words, so a string like this:

Hello, I am Albert Einstein.

should become an array like this:
["Hello", ",", "I", "am", "Albert", "Einstein", "."]

I tried string.components(separatedBy: CharacterSet.init(charactersIn: " ,;;:")), but this method removes all the punctuation and returns an array like this:
["Hello", "I", "am", "Albert", "Einstein"]

So, how can I get an array like the one in the first example?

Best answer

As a solution it is not very pretty, but you can try this (Swift 3):

import Foundation

let str = "Hello, I am Albert Einstein."
var list = [String]()
var currentSubString = ""
// Enumerate every character, including ".", ",", ";" and " "
str.enumerateSubstrings(in: str.startIndex..<str.endIndex, options: .byComposedCharacterSequences) { (substring, _, _, _) in
    if let subString = substring {
        let isTerminator = subString == " " || subString == "," || subString == "." || subString == ";"
        if !currentSubString.isEmpty && isTerminator {
            // A terminator finishes the word under construction
            list.append(currentSubString)
            // A space is trimmed away; a punctuation mark is kept as the start of the next token
            currentSubString = subString.trimmingCharacters(in: .whitespaces)
        } else if subString != " " {
            // Otherwise append the character to the current word, unless it is a space.
            // Note: punctuation directly followed by a letter (as in "a,b") stays glued to that letter.
            currentSubString += subString
        }
    }
}

// Last word
if !currentSubString.isEmpty {
    list.append(currentSubString)
}

print(list) // ["Hello", ",", "I", "am", "Albert", "Einstein", "."]


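The idea is to loop over all the characters and build words as you go. A word is a run of consecutive characters that are not a space, ",", "." or ";". So while building a word in the loop, if you see one of those characters and the word under construction is not empty, the current word is finished.
To walk through the steps with your input:
  • Get H (not a space, and not another terminal character)
    -> currentSubString = "H"
  • Get e (not a space, and not another terminal character)
    -> currentSubString = "He"
  • Get l (not a space, and not another terminal character)
    -> currentSubString = "Hel"
  • Get l (not a space, and not another terminal character)
    -> currentSubString = "Hell"
  • Get o (not a space, and not another terminal character)
    -> currentSubString = "Hello"
  • Get , (a terminal character)
    -> Since currentSubString is not empty, add it to list and start building the next word -> list = ["Hello"]
    -> currentSubString = "," (The reason I use trimming is that if this character is a space, it is simply dropped, but any other terminal character has to be kept as the start of the next word.)
  • Get " " (a space character)
    -> Since currentSubString is not empty, add it to list and start building again -> list = ["Hello", ","]
    -> currentSubString = "" (trimmed)
    ... and so on.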
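
The same idea can also be packaged as a reusable function. The following is only a minimal sketch, not the original answer's code: the names `tokenize`, `punctuation` and `flush` are made up for illustration, and the default terminator set simply mirrors the ",", "." and ";" checked above. It iterates over `Character` values from the standard library, so it does not need Foundation, and it emits each punctuation mark as its own token right away instead of carrying it over into the next word.

```swift
/// Splits `text` into words and punctuation tokens.
/// `tokenize` and `punctuation` are illustrative names, not part of the original answer.
func tokenize(_ text: String, punctuation: Set<Character> = [",", ".", ";"]) -> [String] {
    var tokens = [String]()
    var current = ""

    // Moves the word under construction (if any) into the result.
    func flush() {
        if !current.isEmpty {
            tokens.append(current)
            current = ""
        }
    }

    for character in text {
        if character == " " {
            flush()                           // a space only ends the current word
        } else if punctuation.contains(character) {
            flush()                           // punctuation ends the current word...
            tokens.append(String(character))  // ...and becomes a token of its own
        } else {
            current.append(character)         // any other character extends the word
        }
    }
    flush()                                   // last word, if the text does not end with a terminator
    return tokens
}

print(tokenize("Hello, I am Albert Einstein."))
// ["Hello", ",", "I", "am", "Albert", "Einstein", "."]
```

Because punctuation is appended immediately, an input such as "a,b" comes out as ["a", ",", "b"], whereas a version that keeps the punctuation mark in the word under construction would glue it to the following letter.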
Regarding iOS - splitting text into an array while keeping punctuation in Swift, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39834953/
