Question
A similar case was covered in this post: Python web scraping involving HTML tags with attributes
But I haven't been able to do something similar for this web page: http://www.expatistan.com/cost-of-living/comparison/melbourne/auckland?
I want to scrape these values:
<td class="price city-2">
NZ$15.62
<span style="white-space:nowrap;">(AU$12.10)</span>
</td>
<td class="price city-1">
AU$15.82
</td>
Basically price city-2 and price city-1 (NZ$15.62 and AU$15.82).
Currently I have:
import urllib2
from BeautifulSoup import BeautifulSoup
url = "http://www.expatistan.com/cost-of-living/comparison/melbourne/auckland?"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
price2 = soup.findAll('td', attrs = {'class':'price city-2'})
price1 = soup.findAll('td', attrs = {'class':'price city-1'})
for price in price2:
    print price
for price in price1:
    print price
Ideally, I'd also like to have comma-separated values for:
<th colspan="3" class="clickable">Food</th>,
Extracting 'Food',
<td class="item-name">Daily menu in the business district</td>
Extracting 'Daily menu in the business district',
and then the values for price city-2 and price city-1.
So the printout would be:
Food, Daily menu in the business district, NZ$15.62, AU$15.82
Thanks!
Recommended answer
I find BeautifulSoup awkward to use. Here is a version based on the webscraping module:
from webscraping import common, download, xpath
# download html
D = download.Download()
html = D.get('http://www.expatistan.com/cost-of-living/comparison/melbourne/auckland')
# extract data
items = xpath.search(html, '//td[@class="item-name"]')
city1_prices = xpath.search(html, '//td[@class="price city-1"]')
city2_prices = xpath.search(html, '//td[@class="price city-2"]')
# display and format
for item, city1_price, city2_price in zip(items, city1_prices, city2_prices):
    print item.strip(), city1_price.strip(), common.remove_tags(city2_price, False).strip()
Output:
Daily menu in the business district AU$15.82 NZ$15.62
Combo meal in fast food restaurant (Big Mac Meal or similar) AU$7.40 NZ$8.16
1/2 Kg (1 lb.) of chicken breast AU$6.07 NZ$10.25
...
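The output above lists item, city-1 price, city-2 price, whereas the question asked for comma-separated lines that also carry the category (Food, ...). As a rough sketch of how that could be done, here is a Python 3 version that uses only the standard library's html.parser instead of BeautifulSoup or webscraping. SAMPLE_HTML, PriceTableParser and extract_rows are hypothetical names, and the table layout is an assumption pieced together from the fragments quoted in the question, not the real page.

```python
from html.parser import HTMLParser

# Hypothetical sample built from the fragments quoted in the question;
# the surrounding <table>/<tr> layout is an assumption.
SAMPLE_HTML = """
<table>
<tr><th colspan="3" class="clickable">Food</th></tr>
<tr>
<td class="item-name">Daily menu in the business district</td>
<td class="price city-1">
AU$15.82
</td>
<td class="price city-2">
NZ$15.62
<span style="white-space:nowrap;">(AU$12.10)</span>
</td>
</tr>
</table>
"""


class PriceTableParser(HTMLParser):
    """Collect [category, item, city-2 price, city-1 price] rows."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows
        self._category = ""   # text of the most recent th.clickable
        self._current = {}    # cells of the table row being read
        self._field = None    # cell currently receiving text
        self._skip_depth = 0  # > 0 while inside a <span> in a price cell

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "tr":
            self._current = {}
        elif tag == "th" and "clickable" in classes:
            self._field = "category"
            self._category = ""
        elif tag == "td" and "item-name" in classes:
            self._field = "item"
        elif tag == "td" and "price" in classes:
            if "city-1" in classes:
                self._field = "city-1"
            elif "city-2" in classes:
                self._field = "city-2"
        elif tag == "span" and self._field in ("city-1", "city-2"):
            # Skip the converted price, e.g. "(AU$12.10)".
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self._skip_depth:
            self._skip_depth -= 1
        elif tag in ("td", "th"):
            self._field = None
        elif tag == "tr" and "item" in self._current:
            self.rows.append([
                self._category.strip(),
                self._current["item"].strip(),
                self._current.get("city-2", "").strip(),
                self._current.get("city-1", "").strip(),
            ])

    def handle_data(self, data):
        if self._field is None or self._skip_depth:
            return
        if self._field == "category":
            self._category += data
        else:
            self._current[self._field] = self._current.get(self._field, "") + data


def extract_rows(html):
    """Return one list of four strings per item row found in html."""
    parser = PriceTableParser()
    parser.feed(html)
    return parser.rows


for row in extract_rows(SAMPLE_HTML):
    print(", ".join(row))
```

To run this against the live page, the HTML would have to be downloaded first (for example with urllib.request) and passed to extract_rows. If item names can themselves contain commas, writing the rows with the csv module instead of ", ".join would keep the columns unambiguous.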