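I have a Chinese news feed, and I want to split its sentences into smaller chunks to pass to an API.

How can I do this on iOS? For English I set a chunk length of 50 characters.

Currently I use rangeOfString: to find punctuation (question marks, exclamation marks, commas, periods) and split the text at those points.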

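// Take the first MAX_CHAR_Private characters, then extend the chunk to the
// next delimiter found in the remainder.
// Note: substringToIndex: raises NSRangeException if final is shorter than
// MAX_CHAR_Private.
NSString *str = nil, *rem = nil;

str = [final substringToIndex:MAX_CHAR_Private];
rem = [final substringFromIndex:MAX_CHAR_Private];

// Search for a delimiter in priority order: ? ! , . and finally a space.
// Only ASCII delimiters are searched.
NSRange rng = [rem rangeOfString:@"?"];
if (rng.location == NSNotFound) {
    rng = [rem rangeOfString:@"!"];
    if (rng.location == NSNotFound) {
        rng = [rem rangeOfString:@","];
        if (rng.location == NSNotFound) {
            rng = [rem rangeOfString:@"."];
            if (rng.location == NSNotFound) {
                rng = [rem rangeOfString:@" "];
            }
        }
    }
}
// If the delimiter lies too far ahead (or none was found), fall back to the
// first space so the chunk stays under the hard limit.
if (rng.location+1 + MAX_CHAR_Private > MAXIMUM_LIMIT_Private) {
    rng = [rem rangeOfString:@" "];
}

// remaining is declared outside this snippet; retain implies manual
// reference counting (no ARC).
if (rng.location == NSNotFound) {
    // No break point at all: cut exactly at MAX_CHAR_Private.
    remaining = [[final substringFromIndex:MAX_CHAR_Private] retain];
}
else {
    // Extend the chunk up to the delimiter and skip the delimiter itself.
    str = [str stringByAppendingString:[rem substringToIndex:rng.location]];
    remaining = [[final substringFromIndex:MAX_CHAR_Private + rng.location+1] retain];
}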


This does not work correctly for Chinese and Japanese text.

Best answer

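Take a look at NSLinguisticTagger; it should work with Chinese:

From Apple: "The NSLinguisticTagger class is used to automatically segment natural-language text and tag it with information, such as parts of speech. It can also tag languages, scripts, stem forms of words, etc."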

Apple documentation: NSLinguisticTagger Class Reference

See also NSHipster: NSLinguisticTagger

See also objc.io issue 7
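
As a rough illustration of the suggestion above, here is a minimal sketch that lets NSLinguisticTagger find the token boundaries and then greedily packs whole tokens into chunks. The helper name ChunksFromText and its maxLength parameter are hypothetical and not part of the original answer. The sketch assumes ARC and a reasonably recent SDK, counts length in UTF-16 units (as NSString does), and lets a single token that is longer than maxLength become an oversized chunk of its own rather than splitting it. On newer SDKs the class is marked deprecated in favor of the NaturalLanguage framework, so expect a deprecation warning there.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): splits text into
// chunks of at most maxLength UTF-16 units, breaking only at token
// boundaries reported by NSLinguisticTagger.
static NSArray<NSString *> *ChunksFromText(NSString *text, NSUInteger maxLength) {
    NSMutableArray<NSString *> *chunks = [NSMutableArray array];
    if (text.length == 0 || maxLength == 0) {
        return chunks;
    }

    NSLinguisticTagger *tagger =
        [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeTokenType]
                                               options:0];
    tagger.string = text;

    __block NSUInteger chunkStart = 0; // where the chunk being built starts
    __block NSUInteger chunkEnd = 0;   // end (exclusive) of the last accepted token

    // options:0 keeps words, punctuation and whitespace, so nothing is dropped.
    [tagger enumerateTagsInRange:NSMakeRange(0, text.length)
                          scheme:NSLinguisticTagSchemeTokenType
                         options:0
                      usingBlock:^(NSLinguisticTag _Nullable tag, NSRange tokenRange,
                                   NSRange sentenceRange, BOOL *stop) {
        NSUInteger tokenEnd = NSMaxRange(tokenRange);
        // Flush the current chunk if this token would push it over the limit.
        if (tokenEnd - chunkStart > maxLength && chunkEnd > chunkStart) {
            NSRange r = NSMakeRange(chunkStart, chunkEnd - chunkStart);
            [chunks addObject:[text substringWithRange:r]];
            chunkStart = chunkEnd;
        }
        chunkEnd = tokenEnd;
    }];

    // Emit whatever is left after the last flush.
    if (chunkStart < text.length) {
        [chunks addObject:[text substringFromIndex:chunkStart]];
    }
    return chunks;
}
```

A call such as ChunksFromText(feedText, 50) would then return an array of strings that concatenate back to the original text. Where exactly the tagger places word boundaries in Chinese or Japanese depends on the system's language data, so the chunk boundaries are worth checking against real feed content before relying on them.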

08-15 23:43