Question

I have an NSScanner object that scans through HTML documents for paragraph tags. It seems like the scanner stops at the first result it finds, but I need all the results in an array.

How can my code be improved to go through an entire document?

- (NSArray *)getParagraphs:(NSString *) html
{
    NSScanner *theScanner;
    NSString *text = nil;

    theScanner = [NSScanner scannerWithString: html];

    NSMutableArray*paragraphs = [[NSMutableArray alloc] init];

    // find start of tag
    [theScanner scanUpToString: @"<p>" intoString: NULL];
    if ([theScanner isAtEnd] == NO) {
        NSInteger newLoc = [theScanner scanLocation] + 10;
        [theScanner setScanLocation: newLoc];

        // find end of tag
        [theScanner scanUpToString: @"</p>" intoString: &text];

        [paragraphs addObject:text];
    }

    return text;
}

Recommended answer

Disclaimer: To parse HTML, it is better to use an HTML parser such as libxml's HTML 4 parser, especially when dealing with arbitrary, possibly malformed HTML. That said, since the question asks how to improve existing code that uses NSScanner, I provide the following example. It works in most cases, but there are corner cases where it will not. For serious HTML parsing, use an HTML parser.

Iterate until the scanner has exhausted all characters:

NSScanner *scanner = [NSScanner scannerWithString:html];
NSMutableArray *paragraphs = [[NSMutableArray alloc] init];
// skip ahead to the first opening tag; "<p" also matches tags with attributes
[scanner scanUpToString:@"<p" intoString:nil];
while (![scanner isAtEnd]) {
    // consume the rest of the opening tag, including its closing ">"
    [scanner scanUpToString:@">" intoString:nil];
    [scanner scanString:@">" intoString:nil];
    NSString *text = nil;
    // capture everything up to the closing tag
    [scanner scanUpToString:@"</p>" intoString:&text];
    if (text) { // if html contains empty paragraphs <p></p>, text could be nil
        [paragraphs addObject:text];
    }
    // advance to the next opening tag, or to the end of the string
    [scanner scanUpToString:@"<p" intoString:nil];
}
...
[paragraphs release];
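
For reference, the loop above can be wrapped in a function that returns the whole array, which is what the question's method was meant to do (it returned `text` instead of `paragraphs`). This is only a sketch: the name `ParagraphsInHTML` is hypothetical and not from the original post, and it uses an autoreleased array so no explicit `release` is needed. The same corner cases mentioned in the disclaimer apply.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name not from the original post): collects the text
// of every <p>...</p> element, using the same scanning loop as the answer.
static NSArray *ParagraphsInHTML(NSString *html)
{
    NSScanner *scanner = [NSScanner scannerWithString:html];
    NSMutableArray *paragraphs = [NSMutableArray array];

    // Skip ahead to the first opening tag.
    [scanner scanUpToString:@"<p" intoString:NULL];
    while (![scanner isAtEnd]) {
        // Consume the rest of the opening tag, including its closing ">".
        [scanner scanUpToString:@">" intoString:NULL];
        [scanner scanString:@">" intoString:NULL];

        // Capture everything up to the closing tag; text stays nil for <p></p>.
        NSString *text = nil;
        [scanner scanUpToString:@"</p>" intoString:&text];
        if (text != nil) {
            [paragraphs addObject:text];
        }

        // Advance to the next opening tag, or to the end of the string.
        [scanner scanUpToString:@"<p" intoString:NULL];
    }
    return paragraphs;
}
```

Called with `@"<p>One</p><div>x</div><p class=\"a\">Two</p><p></p>"`, this should return an array containing `One` and `Two`; the empty paragraph is skipped. Note that because it searches for `<p`, it would also match other tags that begin with those characters, such as `<pre>`.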


09-02 07:41