Question
I'm scraping this page for my Android app. I'd like to extract the data in the table of cities and area codes.
Here is my code:
from bs4 import BeautifulSoup
import urllib2
import re
base_url = "http://www.howtocallabroad.com/taiwan/"
html_page = urllib2.urlopen(base_url)
soup = BeautifulSoup(html_page)
codes = soup.select("#codes tbody > tr > td")
for area_code in codes:
    # print td city and area code
    pass
I'd like to know which function in Python or BeautifulSoup gets the value out of <td>value</td>.
Sorry, I'm just an Android dev learning to write Python.
Recommended answer
You can use findAll(), along with a function that breaks a list into chunks:
>>> areatable = soup.find('table',{'id':'codes'})
>>> d = {}
>>> def chunks(l, n):
...     return [l[i:i+n] for i in range(0, len(l), n)]
...
>>> dict(chunks([i.text for i in areatable.findAll('td')], 2))
{u'Chunan': u'36', u'Penghu': u'69', u'Wufeng': u'4', u'Fengyuan': u'4', u'Kaohsiung': u'7', u'Changhua': u'47', u'Pingtung': u'8', u'Keelung': u'2', u'Hsinying': u'66', u'Chungli': u'34', u'Suao': u'39', u'Yuanlin': u'48', u'Yungching': u'48', u'Panchiao': u'2', u'Taipei': u'2', u'Tainan': u'62', u'Peikang': u'5', u'Taichung': u'4', u'Yungho': u'2', u'Hsinchu': u'35', u'Tsoying': u'7', u'Hualien': u'38', u'Lukang': u'47', u'Talin': u'5', u'Chiaochi': u'39', u'Fengshan': u'7', u'Sanchung': u'2', u'Tungkang': u'88', u'Taoyuan': u'33', u'Hukou': u'36'}
Explanation:
.find() finds the table with an id of codes. The chunks function splits a list into evenly sized chunks.
As findAll returns a list, we use chunks on that list to create something like:
[[u'Changhua', u'47'], [u'Keelung', u'2'], etc]
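The chunking step can be tried on its own, without the page. The following is a minimal sketch in Python 3 syntax (so the strings print without the u prefix shown above); the cells list is made-up sample data standing in for the text of the td tags, and the parameter name items is a renaming of l for readability.

```python
def chunks(items, n):
    """Split items into consecutive slices of length n (the last may be shorter)."""
    return [items[i:i + n] for i in range(0, len(items), n)]

# Hypothetical flat list, in the order the cells would appear in the table.
cells = ["Taipei", "2", "Kaohsiung", "7", "Tainan", "62"]

pairs = chunks(cells, 2)
print(pairs)  # [['Taipei', '2'], ['Kaohsiung', '7'], ['Tainan', '62']]

# An odd number of cells leaves a short final chunk, which dict() would reject.
print(chunks(cells + ["Hualien"], 2)[-1])  # ['Hualien']
```

This relies on the table having exactly two cells per row; a row with a missing or extra cell would shift every pair after it.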
i.text for i in... is used to get the text of each td tag; otherwise the <td> and </td> would remain.
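To see the difference between a tag and its text, here is a small sketch in Python 3 that parses an inline HTML string instead of fetching the page. The markup is invented for illustration and only assumes a table with an id of codes; the parser name "html.parser" is passed explicitly, which the original code does not do.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page's table.
html = """
<table id="codes">
  <tr><td>Taipei</td><td>2</td></tr>
  <tr><td>Kaohsiung</td><td>7</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"id": "codes"})
cells = table.find_all("td")  # findAll is the older spelling of find_all

first_markup = str(cells[0])       # the whole tag, markup included
texts = [td.text for td in cells]  # only the text inside each tag

print(first_markup)  # <td>Taipei</td>
print(texts)         # ['Taipei', '2', 'Kaohsiung', '7']
```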
Finally, dict() is called to convert the list of lists into a dictionary, which you can use to look up a city's area code.
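As a rough illustration of that last step, the sketch below (Python 3, with made-up sample pairs) builds the dictionary and looks up one city. It also shows an alternative that pairs the cells with zip() and slicing, which is a substitute for chunks and not part of the answer above.

```python
# Hypothetical pairs, shaped like the output of chunks(..., 2).
pairs = [["Taipei", "2"], ["Kaohsiung", "7"], ["Tainan", "62"]]

# dict() accepts any iterable of two-item sequences.
area_codes = dict(pairs)
print(area_codes["Tainan"])  # 62

# Alternative: pair even-indexed cells (cities) with odd-indexed cells (codes).
cells = ["Taipei", "2", "Kaohsiung", "7", "Tainan", "62"]
same_codes = dict(zip(cells[::2], cells[1::2]))
print(same_codes == area_codes)  # True
```

If a city name appeared twice in the table, the later row would overwrite the earlier one in the dictionary.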