Why is this ListIterator stuck?

Problem description

I'm thoroughly puzzled. I have a block of HTML that I scraped out of a larger table. It looks about like this:

<td align="left" class="page">Number:\xc2\xa0<a class="topmenu" href="http://www.example.com/whatever.asp?search=724461">724461</a> Date:\xc2\xa01/1/1999 Amount:\xc2\xa0$2.50 <br/>Person:<br/><a class="topmenu" href="http://www.example.com/whatever.asp?search=LAST&amp;searchfn=FIRST">LAST,\xc2\xa0FIRST </a> </td>

(Actually, it looked worse, but I regexed out a lot of line breaks)

I need to get the lines out, and break up the Date/Amount line. It seemed like the place to start was to find the children of that block of HTML. The block is a string because that's how regex gave it back to me. So I did:

text_soup = BeautifulSoup(text)
text_children = text_soup.find('td').childGenerator()

I can loop through it with

for i,each in enumerate(text_soup.find('td').childGenerator()):
    print type(each)
    print i, ":", each

but not with

for i, each in enumerate(text_children):
    ...etc

These ought to be the same. So I'm confused.

Recommended answer
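
One likely explanation, offered as a sketch and not as a confirmed diagnosis: the object stored in text_children is a single-pass iterator, and iterators remember their position. If it was already walked once (for example by an earlier loop in the same session), a second loop over the same variable finds nothing left, while calling childGenerator() again in the loop header creates a fresh iterator each time. The example below does not use BeautifulSoup; the children list and the child_iterator helper are hypothetical stand-ins for the real parse tree.

```python
# Hypothetical stand-in for the children of the <td>; not real parser output.
children = ["Number:", "<a>724461</a>", "Date: 1/1/1999 Amount: $2.50"]


def child_iterator():
    """Stand-in for text_soup.find('td').childGenerator().

    Returns a fresh single-pass iterator on every call.
    """
    return iter(children)


# Creating a new iterator in the loop header always starts from the top.
fresh_pass = [(i, each) for i, each in enumerate(child_iterator())]

# A stored iterator keeps its position, so it can be walked only once.
text_children = child_iterator()
first_pass = list(enumerate(text_children))   # sees all three children
second_pass = list(enumerate(text_children))  # already exhausted, sees nothing

# Materializing the children in a list makes them reusable.
text_children_list = list(child_iterator())
reusable_a = list(enumerate(text_children_list))
reusable_b = list(enumerate(text_children_list))
```

If this is what happened, either call childGenerator() again wherever a new pass is needed, or wrap the result in list(...) once and loop over the list as often as required.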