XPath does not work on this site


Problem description

I am using Python with Selenium (PhantomJS webdriver) to parse websites, and I have a problem with it.

I want to get the current song from this radio site: http://www.eskago.pl/radio/eska-warszawa

The XPath I use:

/html/body/div[3]/div[1]/section[2]/div/div/div[2]/ul/li[2]/a[2]

This XPath does not work with Python Selenium.

Error:

Does anyone have an idea what is wrong with this?

Thanks for the answers, guys. I finally found a solution to my problem. The XPath was correct (but in fact fragile).

I switched to the Firefox driver and saw the problem: an ad.

I would have had to skip the ads, so I decided to use another page that does not show this ad: http://www.eskago.pl/radio

And finally, thanks to alecxe, I use this:

# Click the radio tab button, then read the text of the on-air element for stream 999.
driver.find_element_by_xpath('//a[@class="radio-tab-button"]/span/strong').click()
element = driver.find_element_by_xpath('//p[@class="onAirStreamId_999"]/strong')
print(element.text)

It works perfectly.

Recommended answer

The XPath you provided is very fragile; no wonder you get a NoSuchElementException.

Instead, rely on the class name of the a tag that contains the currently playing song:

<a class="playlist_small" href="http://www.eskago.pl/radio/eska-warszawa?noreload=yes">
    <img style="width:41px;" src="http://t-eska.cdn.smcloud.net/common/l/Q/s/lQ2009158Xvbl.jpg/ru-0-ra-45,45-n-lQ2009158Xvbl_jessie_j_bang_bang.jpg" alt="">
    <strong>Jessie J, Ariana Grande, Nicki Minaj</strong>
    <span>Bang Bang</span>
</a>

Here is the sample code:

# The <strong> inside the playlist_small link holds the artist names.
element = driver.find_element_by_xpath('//a[@class="playlist_small"]/strong')
print(element.text)
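
To illustrate why the positional XPath breaks while the class-based one survives, here is a minimal sketch that needs no browser. It uses the limited XPath support in Python's standard xml.etree.ElementTree rather than Selenium, and the markup is a made-up, heavily simplified stand-in for the real page. The names PAGE, PAGE_WITH_AD, POSITIONAL, BY_CLASS and artist are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup; the real eskago.pl page is far more complex.
PAGE = """<html><body>
<div><ul>
<li><a class="other"><strong>Previous artist</strong></a></li>
<li><a class="playlist_small"><strong>Jessie J</strong><span>Bang Bang</span></a></li>
</ul></div>
</body></html>"""

# The same page after an ad block is injected at the top of <body>.
PAGE_WITH_AD = PAGE.replace("<body>", "<body><div>ad</div>", 1)

# Positional path: depends on the exact order of sibling elements.
POSITIONAL = "body/div[1]/ul/li[2]/a/strong"
# Class-based path: depends only on the class attribute of the link.
BY_CLASS = ".//a[@class='playlist_small']/strong"


def artist(markup, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


for label, markup in (("original", PAGE), ("with ad", PAGE_WITH_AD)):
    print(label, "| positional:", artist(markup, POSITIONAL),
          "| by class:", artist(markup, BY_CLASS))
```

On the page with the injected ad, the positional path lands on the ad's div and finds nothing, while the class-based path still returns the artist. In Selenium the same miss surfaces as a NoSuchElementException instead of None.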


Another way to retrieve the currently playing song is to mimic the JSONP request that the website makes for the playlist:

>>> import requests
>>> import json
>>> import re
>>> response = requests.get('http://static.eska.pl/m/playlist/channel-999.jsonp')
>>> # Strip the jsonp(...); wrapper so that only the JSON payload remains
>>> json_data = re.match(r'jsonp\((.*?)\);', response.text).group(1)
>>> songs = json.loads(json_data)
>>> current_song = songs[0]
>>> [artist['name'] for artist in current_song['artists']]
[u'David Guetta', u'Showtek', u'Vassy']
>>> current_song['name']
u'Bad'
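
The unwrapping step can also be tried offline. The sketch below is an assumption-laden example, not the site's actual payload: SAMPLE is a hand-written string shaped after the output shown above (a list of songs, each with a name and a list of artists), and parse_jsonp is a hypothetical helper. It accepts any callback name and an optional trailing semicolon, and it raises an error instead of failing with an AttributeError when the wrapper is missing.

```python
import json
import re

# callback( ... ) with an optional trailing semicolon; DOTALL lets the
# payload span several lines.
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(text):
    """Strip a JSONP wrapper such as jsonp(...); and decode the JSON inside."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("text is not wrapped in a JSONP callback")
    return json.loads(match.group(1))


# Hand-written sample shaped like the playlist data above; not real site output.
SAMPLE = ('jsonp([{"name": "Bad", "artists": ['
          '{"name": "David Guetta"}, {"name": "Showtek"}, {"name": "Vassy"}]}]);')

songs = parse_jsonp(SAMPLE)
current_song = songs[0]
artists = [a["name"] for a in current_song["artists"]]
print(", ".join(artists), "-", current_song["name"])
```

With a live response, the same helper would be called on response.text, assuming the endpoint still answers in this format.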

