python - BeautifulSoup 4: parsing a two-column table into a dictionary

I am using BeautifulSoup 4.

The page I want to parse is large, but I only need to find this section:

soup.findAll('h2', text='Case details')

I want to create the following object:

details = {'Court:': 'nysb'}

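How do I find that section, then iterate over the table that follows it (a two-column table), using the first column as the key in the hash and the second column as the value?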

<body>
  <h2>
   Case details
  </h2>
  <table>
   <tr>
    <td>
     <b>
      Court:
     </b>
    </td>
    <td>
     nysb
    </td>
   </tr>
   </table>
</body>

table = h2_details.find_next_sibling('table')
AttributeError: 'ResultSet' object has no attribute 'find_next_sibling'

Best answer

Use .find_next_sibling() to find the table that follows the H2 tag, then work from there:

h2_details = soup.find('h2', text='Case details')

table = h2_details.find_next_sibling('table')

details = {}
for row in table.find_all('tr'):
    cells = row.find_all('td', limit=2)
    details[cells[0].string] = cells[1].string
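
For context, the AttributeError in the question arises because findAll() returns a ResultSet (a list of matches), while find() returns a single Tag, and only a Tag has .find_next_sibling(). A minimal sketch of the difference, assuming compact markup held in a hypothetical `html` variable; it uses `string=`, the name that replaced the `text=` argument from Beautiful Soup 4.4.0 onward:

```python
from bs4 import BeautifulSoup

# Hypothetical compact version of the markup from the question.
html = (
    "<body><h2>Case details</h2>"
    "<table><tr><td><b>Court:</b></td><td>nysb</td></tr></table></body>"
)
soup = BeautifulSoup(html, "html.parser")

# find_all() (alias findAll) returns a list-like ResultSet of every match.
matches = soup.find_all("h2", string="Case details")

# find() returns the first matching Tag, or None when nothing matches.
first = soup.find("h2", string="Case details")

# Navigation methods live on Tag, so index into the ResultSet first.
table_from_list = matches[0].find_next_sibling("table")
table_from_tag = first.find_next_sibling("table")
```

Either route reaches the same table; find() is simply the shorter path when only one heading is expected.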

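I am using .string here on the assumption that each table cell contains only text (no markup). If the cells do contain markup, you may want to use ''.join(cells[0].stripped_strings) and ''.join(cells[1].stripped_strings) instead.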
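
As a hedged illustration of the stripped_strings variant, the sketch below runs against the indented markup exactly as it is shown in the question. The helper name `parse_details` and its `heading` parameter are assumptions made for this example, not part of the original answer. Because the heading text in that markup is surrounded by whitespace, the sketch matches on the stripped text rather than on an exact string.

```python
from bs4 import BeautifulSoup

# Markup as shown in the question, including its indentation and newlines.
html = """
<body>
  <h2>
   Case details
  </h2>
  <table>
   <tr>
    <td>
     <b>
      Court:
     </b>
    </td>
    <td>
     nysb
    </td>
   </tr>
  </table>
</body>
"""


def parse_details(html, heading="Case details"):
    """Hypothetical helper: map column 1 to column 2 of the table after the heading."""
    soup = BeautifulSoup(html, "html.parser")
    # Compare stripped text so whitespace around the heading does not matter.
    h2 = soup.find(
        lambda tag: tag.name == "h2" and tag.get_text(strip=True) == heading
    )
    if h2 is None:
        return {}
    table = h2.find_next_sibling("table")
    if table is None:
        return {}
    details = {}
    for row in table.find_all("tr"):
        cells = row.find_all("td", limit=2)
        if len(cells) < 2:
            continue  # skip rows that do not have two cells
        # stripped_strings yields each text fragment with whitespace trimmed.
        key = "".join(cells[0].stripped_strings)
        value = "".join(cells[1].stripped_strings)
        details[key] = value
    return details


details = parse_details(html)
print(details)
```

Returning an empty dictionary when the heading or table is missing is a design choice for this sketch; raising an exception may suit stricter pipelines better.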

Source: a similar question about "python - BeautifulSoup 4: parsing a two-column table into a dictionary" can be found on Stack Overflow: https://stackoverflow.com/questions/17131085/