I want to extract the value I have pointed to in the image below.
Here is what I have tried:

import requests
from lxml import html

url = 'http://site.ir'
content = requests.get(url).content
tree = html.fromstring(content)
print([e.text_content() for e in tree.xpath('//div[@class="grouptext"]/????')])

The text is not inside the span tag, nor inside the br tag.
Image:
Update
Suppose I have:
out=""" <div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT
        <br></br>
    </div>
</div> <div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT(1)
        <br></br>
    </div>
</div>
<div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT(2)
        <br></br>
    </div>
</div> <div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT(3)
        <br></br>
    </div>
</div> """

Best answer

Another option is to select the text nodes that are following siblings of the span:

//div[@class="grouptext"]/span[1]/following-sibling::text()

Demo:
from lxml import html

data = """
<div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT
        <br></br>
    </div>
</div>
"""

tree = html.fromstring(data)
print(tree.xpath('//div[@class="grouptext"]/span[1]/following-sibling::text()')[0].strip())

Prints:
WHAT I WANT
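
As a cross-check on the XPath above, the same text can also be read from the span's `tail` attribute, since lxml stores the text that follows an element's closing tag there. This is a minimal sketch that assumes the same markup as the demo; the names `spans` and `tails` are illustrative only.

```python
from lxml import html

data = """
<div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT
        <br></br>
    </div>
</div>
"""

tree = html.fromstring(data)
# Select the first span inside each grouptext div.
spans = tree.xpath('//div[@class="grouptext"]/span[1]')
# .tail is the text between </span> and the next sibling tag; it may be None.
tails = [(span.tail or "").strip() for span in spans]
print(tails[0])
```

This avoids indexing into a list of text nodes, at the cost of only seeing the text directly after the span.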

For the updated example, the following worked for me:
for result in tree.xpath('//div[@class="grouptext"]/span/following-sibling::text()'):
    print(result.strip())

Prints:
WHAT I WANT

WHAT I WANT(1)

WHAT I WANT(2)

WHAT I WANT(3)
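
The blank lines in the output above most likely come from the whitespace-only text node that follows each br tag, which `following-sibling::text()` also matches. One way to drop them is to add a `[normalize-space()]` predicate so that only text nodes with visible content are kept. The sketch below assumes markup shaped like the updated example, shortened to two blocks; the `results` name is illustrative.

```python
from lxml import html

out = """
<div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT
        <br></br>
    </div>
</div>
<div class="groupinfo">
    <div class="grouptext">
        <span style="color:#5f0101">
            span tag contents
        </span>
        WHAT I WANT(1)
        <br></br>
    </div>
</div>
"""

tree = html.fromstring(out)
# The predicate keeps only text nodes that are non-empty after whitespace
# normalization, so the node after each <br> is skipped.
xpath = '//div[@class="grouptext"]/span/following-sibling::text()[normalize-space()]'
results = [text.strip() for text in tree.xpath(xpath)]
for result in results:
    print(result)
```

With the filter in place, each printed line should correspond to one wanted value, with no empty lines in between.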
