I am trying to extract "HORROR" from the following HTML:

<div class="synopsis-section">
    <div class="movie-add-info left">
        <ul>
            <li>DIRECTOR : Matthew Vaughn</li>
            <li>ACTORS : </li>
            <li>DURATIONS : 107 Minutes</li>
            <li>CENSOR RATING : 17+</li>
            <li>GENRE : HORROR</li>
            <li>LANGUAGE : BAHASA INDONESIA</li>
        </ul>
    </div>
</div>


This is what I tried:

    >>> from requests import get
    >>> from bs4 import BeautifulSoup
    >>> response = get(url)
    >>> html_soup = BeautifulSoup(response.text, 'html.parser')
    >>> containers = html_soup.find('div', class_='movie-add-info left')
    >>> containers.li


Output:
<li>DIRECTOR : Matthew Vaughn</li>

Nothing specific identifies the <li> that contains "HORROR".
Can anyone help me solve this?

Best answer

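import re
from bs4 import BeautifulSoup

# my_html is assumed to hold the HTML from the question (for example, response.text).
# The 'lxml' parser requires the lxml package; 'html.parser' also works here.
soup = BeautifulSoup(my_html, 'lxml')

# Narrow the search to the <ul> inside the target <div>, then keep only
# the text nodes that match 'GENRE'.
info = soup.find('div', {'class': 'movie-add-info left'})
result = info.find('ul').find_all(string=re.compile(r'GENRE'))

print(result[0])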


Output:

GENRE : HORROR
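
The attempt in the question returned only the director because `containers.li` is shorthand for `containers.find('li')`, which stops at the first match. As an alternative to the regex search, here is a minimal sketch that walks every `<li>` with `find_all` and returns the one whose text starts with a given label. It is not part of the original answer: `SAMPLE_HTML` and `find_item` are hypothetical names introduced for illustration, the markup is assumed to look like the snippet in the question, and it uses the built-in `html.parser` so that `lxml` is not needed.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (assumed to be representative).
SAMPLE_HTML = """
<div class="synopsis-section">
    <div class="movie-add-info left">
        <ul>
            <li>DIRECTOR : Matthew Vaughn</li>
            <li>ACTORS : </li>
            <li>DURATIONS : 107 Minutes</li>
            <li>CENSOR RATING : 17+</li>
            <li>GENRE : HORROR</li>
            <li>LANGUAGE : BAHASA INDONESIA</li>
        </ul>
    </div>
</div>
"""


def find_item(html, label):
    """Return the text of the first <li> that starts with `label`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="movie-add-info left")
    if container is None:
        return None
    # find_all returns every <li>; container.li would stop at the first one.
    for li in container.find_all("li"):
        text = li.get_text(strip=True)
        if text.startswith(label):
            return text
    return None


genre_line = find_item(SAMPLE_HTML, "GENRE")
print(genre_line)  # GENRE : HORROR
```

Returning `None` when the container or label is missing avoids the `AttributeError` or `IndexError` that a chained lookup would raise on a page with a different layout.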


If you only need 'HORROR', split it:

print(result[0].split()[2])
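
Note that `split()[2]` picks the third whitespace-separated token, so it works for a one-word genre but would return only `BAHASA` for a line such as `LANGUAGE : BAHASA INDONESIA`. A small sketch of a more tolerant variant, assuming every line has the `LABEL : value` shape shown in the question (`value_after_colon` is a hypothetical helper name, not something from the original answer):

```python
def value_after_colon(line):
    """Return the text after the first ':' with surrounding whitespace removed."""
    # partition splits on the first colon only, so multi-word values stay intact.
    _, _, value = line.partition(":")
    return value.strip()


print(value_after_colon("GENRE : HORROR"))               # HORROR
print(value_after_colon("LANGUAGE : BAHASA INDONESIA"))  # BAHASA INDONESIA
```

With the answer's code, this would be called as `value_after_colon(result[0])`.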

Regarding python - Python BeautifulSoup: finding a specific <li> in nested <div> and <ul>, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46726560/
