python - Scraping from a web page

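I am very new to web programming with Python. At the moment, I am working on something to "scrape" a small piece of information from a website.
Website: http://www.airport-data.com/airport/HJO/#location
Information to extract/scrape: "Elevation" (look under "Location & QuickFacts")

My code so far: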

import urllib2
from BeautifulSoup import BeautifulSoup
url2 = urllib2.urlopen('http://www.airport-data.com/airport/HJO/#location').read()
soup = BeautifulSoup(url2)
print soup  # I did this just to see the content.

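I tried reading online and looked through some earlier posts, but made no headway. Any suggestions on how to proceed with extracting/scraping the "Elevation" from the web link?
Thanks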

Best answer

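First, according to the BeautifulSoup project documentation:

  Beautiful Soup 3 has been replaced by Beautiful Soup 4.

  Beautiful Soup 3 only works on Python 2.x, but Beautiful Soup 4 also
  works on Python 3.x. Beautiful Soup 4 is faster, has more features,
  and works with third-party parsers like lxml and html5lib. You should
  use Beautiful Soup 4 for all new projects.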

Install Beautiful Soup 4:

pip install beautifulsoup4

Then, the idea is to find the tag containing the Elevation: text and get the next sibling:

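import urllib2
from bs4 import BeautifulSoup

# Python 2 code: urlopen returns a file-like response that BeautifulSoup can read directly
url2 = urllib2.urlopen('http://www.airport-data.com/airport/HJO/#location')
soup = BeautifulSoup(url2)

# Find the label cell, then take the cell right after it
print soup.find('td', class_='tc1', text='Elevation:').next_sibling.text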

Prints:

240 ft / 73.15 m (Estimated)
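
For anyone on Python 3, the same "find the label, then take its sibling" idea can be sketched roughly as below. This is only an illustration under some assumptions: SAMPLE_HTML is a hypothetical stand-in for the relevant part of the page (the real markup may differ), and extract_elevation and fetch_elevation are helper names made up for this example. It uses find_next_sibling instead of next_sibling, so a whitespace text node between the two cells would not get in the way, and it names the parser explicitly.

```python
import re
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Hypothetical stand-in for the relevant part of the page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr>
    <td class="tc1">Elevation:</td>
    <td class="tc0">240 ft / 73.15 m (Estimated)</td>
  </tr>
</table>
"""


def extract_elevation(html):
    """Return the text of the cell following the 'Elevation:' label, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Match the label with a regex so surrounding whitespace does not matter.
    label = soup.find("td", string=re.compile(r"^\s*Elevation:\s*$"))
    if label is None:
        return None
    # find_next_sibling skips whitespace-only text nodes between the cells.
    value = label.find_next_sibling("td")
    return value.get_text(strip=True) if value is not None else None


def fetch_elevation(url):
    """Download the page and extract the elevation (needs network access)."""
    with urlopen(url) as response:
        return extract_elevation(response.read())


if __name__ == "__main__":
    print(extract_elevation(SAMPLE_HTML))
```

Run on the sample snippet, this prints 240 ft / 73.15 m (Estimated). Against the live site you would call fetch_elevation with the URL from the question. Whether that still works depends on the page keeping the same table layout, so the None return is worth checking.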

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/25692261/