I need to split a phrase into words, numbers, punctuation marks, and spaces/tabs. I also want to keep the pieces in their original order.

NSString *text = [NSString stringWithFormat:@"The 3 quick:\"brown fox, jump's\" over."];

This is the kind of list I need to produce:
['The', ' ', '3', ' ', 'quick', ':', '"', 'brown', ' ', 'fox', ',', ' ', 'jump's', '"', ' ', 'over', '.']

Thanks!!

Best answer

Try this category, which I wrote using NSScanner and NSCharacterSet:

#import <Foundation/Foundation.h>

@interface NSString(Splitting)

-(NSArray *) arrayBySeparatingComponentsInCharacterSet:(NSCharacterSet *) charSet;

@end

@implementation NSString(Splitting)

// Scans exactly one character from charSet at the scanner's current location.
// Returns NO, leaving the scanner untouched, at the end of the string or when
// the next character is not in charSet.
static BOOL scanOneCharacterFromSetIntoString(NSScanner *scanner, NSCharacterSet *charSet, NSString **outStr)
{
    NSString *inStr = scanner.string;

    // check for index out of bounds
    if (scanner.scanLocation >= inStr.length)
    {
        return NO;
    }

    unichar ch = [inStr characterAtIndex:scanner.scanLocation];

    if (![charSet characterIsMember:ch])
    {
        return NO;
    }

    scanner.scanLocation++;
    if (outStr)
    {
        *outStr = [NSString stringWithCharacters:&ch length:1];
    }

    return YES;
}

-(NSArray *) arrayBySeparatingComponentsInCharacterSet:(NSCharacterSet *)charSet
{
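    NSScanner *scanner = [NSScanner scannerWithString:self];
    // NSScanner skips whitespace and newlines by default; disable that so
    // spaces and tabs are returned as tokens instead of being dropped.
    scanner.charactersToBeSkipped = nil;
    NSMutableArray *result = [NSMutableArray array];

    NSString *temp = nil;
    while ([scanner scanUpToCharactersFromSet:charSet intoString:&temp] || scanOneCharacterFromSetIntoString(scanner, charSet, &temp)) {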
        [result addObject:temp];

        if ([scanner scanLocation] >= [self length])
        {
            break;
        }

        unichar temp2 = [self characterAtIndex:[scanner scanLocation]];

        if ([charSet characterIsMember:temp2])
        {
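            // unichar is 16 bits wide, so build the string from the character itself rather than formatting it with %c
            [result addObject:[NSString stringWithCharacters:&temp2 length:1]];
            // only update the scan location if the scan was successful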
            [scanner setScanLocation:[scanner scanLocation] + 1];
        }
    }

    return result;
}

@end

int main (int argc, const char * argv[])
{
    @autoreleasepool {

        NSString *str = @"The 3 quick:\"brown fox, jump's\" over.";
        NSArray *array = [str arrayBySeparatingComponentsInCharacterSet:[NSCharacterSet characterSetWithCharactersInString:@" :\",'."]];
        NSLog(@"%@", array);
    }
}

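That should be what you need; just change the character set to whatever separators you want. Also note that this was compiled with ARC enabled, so it may or may not behave correctly with respect to memory management in a manual reference-counting environment.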
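
As a rough sketch of what "changing the character set" could look like: the question's expected list keeps jump's as a single token, which suggests leaving the apostrophe out of the separator set. The two helpers below, SeparatorsKeepingApostrophes and TokensFromString, are hypothetical names introduced only for this illustration. They are not part of the category above, and the splitter uses a plain character loop instead of NSScanner so that the example stands on its own.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: whitespace, newlines, and punctuation act as
// separators, except the apostrophe, so that "jump's" is not split.
static NSCharacterSet *SeparatorsKeepingApostrophes(void)
{
    NSMutableCharacterSet *set = [NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
    [set formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];
    [set removeCharactersInString:@"'"];
    return set;
}

// Hypothetical helper: splits text into runs of non-separator characters and
// single-character separator tokens, preserving the original order.
static NSArray<NSString *> *TokensFromString(NSString *text, NSCharacterSet *separators)
{
    NSMutableArray<NSString *> *tokens = [NSMutableArray array];
    NSUInteger runStart = 0; // start of the current non-separator run

    for (NSUInteger i = 0; i < text.length; i++) {
        unichar ch = [text characterAtIndex:i];
        if (![separators characterIsMember:ch]) {
            continue;
        }
        // Flush the run that ended just before this separator, if any.
        if (i > runStart) {
            [tokens addObject:[text substringWithRange:NSMakeRange(runStart, i - runStart)]];
        }
        // Each separator character becomes its own token.
        [tokens addObject:[NSString stringWithCharacters:&ch length:1]];
        runStart = i + 1;
    }

    // Flush a trailing run that is not followed by a separator.
    if (runStart < text.length) {
        [tokens addObject:[text substringFromIndex:runStart]];
    }
    return tokens;
}
```

Calling TokensFromString(str, SeparatorsKeepingApostrophes()) on the string from the question should keep jump's together while still emitting each space, colon, quote, comma, and period as its own element. Joining the tokens with an empty string gives back the original text, which is a quick way to check that nothing was dropped.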

Regarding objective-c - splitting text into words, numbers, and punctuation, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/8870032/
