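I am trying to learn Simple HTML Dom Parser and, as an introduction, I am parsing ESPN's college football score webpage and converting it into a plain HTML table. So far I have been able to do everything except import the date. The problem is that the data is structured like this: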

<div class=gameDay-Container>
    <h4 class="games-date">Thursday, August 28 2014</h4>
    <div class=mod-ncf-scorebox> Containing games that I have scraped, which I want to add the date (Aug 28) to  </div>
    <h4 class="games-date">Friday, August 29 2014</h4>
    <div class=mod-ncf-scorebox> containing the next games that I will scrape, which I want the date (Aug 29) appended to </div>
</div>


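The top part scrapes the number of games under each date and puts it into an array, and the bottom part compiles the data for the table. (It is messy because I wrote it by trial and error; is there a simpler way to do this?)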

foreach($html_base->find('.gameDay-Container') as $dates) {
    $rows = $dates->find('.mod-ncf-scorebox');
    $count = count($rows);
    $supercount=$count.',';

    $megacount=explode(', ',$supercount);

    print_r($megacount);
}

foreach($html_base->find('.mod-ncf-scorebox') as $event) {
    $item['Date'] = '';
    $item['Away Team'] = $event->find('.team-name', 0)->plaintext;
    $item['Away Team'] = substr($item['Away Team'], 0, -1);
    $item['Away Score'] = $event->find('li.finalScore', 1);
    $item['Home Team'] = $event->find('.team-name', 1)->plaintext;
    $item['Home Team'] = substr($item['Home Team'], 0, -1);
    $item['Home Score'] = $event->find('li.finalScore', 2)->plaintext;
    $item['Game Status'] = $event->find('div.game-status', 0)->plaintext;
    $item['Game ID'] = $event->find('p', 0)->id;
    $item['Game ID'] = substr($item['Game ID'], 0, strpos( $item['Game ID'], '-'));
    $item['Week'] = $week;
    $item['League'] = 'NCAA Football';

    $NCAAFscores[] = $item;
}


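Question: Can I grab the game date and add it to each row of the table? If I have to stick with the array setup, can I take the values from it and do some kind of count? Is there an easier way that I am completely overlooking?

Answer:

Following Enissay's suggestion below, I nested the foreach statements, which worked perfectly. Here is the final code snippet in case anyone has a similar problem.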

foreach($html_base->find('.gameDay-Container') as $event2) {

    // The date heading is the element immediately before each container.
    $date1 = $event2->prev_sibling()->plaintext;
    $date2 = new DateTime($date1);
    $ymd = $date2->format('Y-m-d');

    // Every game inside this container shares the date parsed above.
    foreach($event2->find('.mod-ncf-scorebox') as $event) {
        $item['Date'] = $ymd;
        $item['Away Team'] = $event->find('.team-name', 0)->plaintext;
        $item['Away Team'] = substr($item['Away Team'], 0, -1);
        $item['Away Score'] = $event->find('li.finalScore', 1);
        $item['Home Team'] = $event->find('.team-name', 1)->plaintext;
        $item['Home Team'] = substr($item['Home Team'], 0, -1);
        $item['Home Score'] = $event->find('li.finalScore', 2)->plaintext;
        $item['Game Status'] = $event->find('div.game-status', 0)->plaintext;
        $item['Game ID'] = $event->find('p', 0)->id;
        $item['Game ID'] = substr($item['Game ID'], 0, strpos( $item['Game ID'], '-'));
        $item['Week'] = $week;
        $item['League'] = 'NCAA Football';

        $NCAAFscores[] = $item;
    }
}
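
The snippet above hands the heading text straight to the DateTime constructor. As a minimal sketch, assuming every heading looks like "Thursday, August 28 2014", the conversion can also be written with an explicit format so that unexpected text is rejected rather than guessed at. The helper name gameDateToYmd is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: converts a heading such as "Thursday, August 28 2014"
// into "2014-08-28". Returns null when the text does not match the assumed format.
function gameDateToYmd(string $heading): ?string
{
    // "!" resets all fields first; l = full day name, F = full month name,
    // j = day of month, Y = four-digit year.
    $date = DateTime::createFromFormat('!l, F j Y', trim($heading));
    return $date === false ? null : $date->format('Y-m-d');
}

echo gameDateToYmd('Thursday, August 28 2014'), PHP_EOL; // prints 2014-08-28
```

Inside the outer loop this would replace the two DateTime lines, for example $ymd = gameDateToYmd($date1);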

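Best answer

Just change the way you approach it...

Follow these steps:

1. The games are grouped into div blocks/containers, and each block is preceded by its corresponding date. Extract these blocks: $html_base->find('.gameDay-Container')
2. For each block, you can get the date: $block->prev_sibling()->plaintext;
3. Then find every game in that block: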

foreach( $block->find('.mod-ncf-scorebox') as $game ) { ... }


Untested, but it should work :)
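
The same three steps can be tried without fetching the live page. The sketch below uses PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM, so it runs without extra files. The inline markup is an assumption: it places each date heading directly before its container, which is what prev_sibling() relies on, and the game names are placeholders.

```php
<?php
// Assumed markup: each date heading sits directly before its container.
$html = <<<HTML
<div id="scores">
  <h4 class="games-date">Thursday, August 28 2014</h4>
  <div class="gameDay-Container">
    <div class="mod-ncf-scorebox">Game A</div>
    <div class="mod-ncf-scorebox">Game B</div>
  </div>
  <h4 class="games-date">Friday, August 29 2014</h4>
  <div class="gameDay-Container">
    <div class="mod-ncf-scorebox">Game C</div>
  </div>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$rows = [];
// Step 1: extract the blocks.
foreach ($xpath->query('//div[@class="gameDay-Container"]') as $block) {
    // Step 2: the nearest preceding h4 sibling holds the date for this block.
    $heading = $xpath->query('preceding-sibling::h4[1]', $block)->item(0);
    $date = $heading ? trim($heading->textContent) : '';

    // Step 3: every game inside the block gets that date.
    foreach ($xpath->query('.//div[@class="mod-ncf-scorebox"]', $block) as $game) {
        $rows[] = ['Date' => $date, 'Game' => trim($game->textContent)];
    }
}

print_r($rows);
```

The point of nesting the loops is that the date is read once per block and is still in scope for every game in that block, so no separate count array is needed.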

Regarding php - Parsing elements outside of a div to apply to an array inside the div (Simple HTML Dom Parser), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25110285/
