python - Scraping MTA subway data?

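I have an interesting problem on my hands, and I'm wondering whether someone smarter and more experienced here could shed some light on it.
Basically, I need a complete list of the stations for several subway lines. Here is how the data is displayed on the mta.info website, using the 3 line as an example:
http://web.mta.info/nyct/service/threelin.htm
Is there any way I can scrape this data and write it to a text or CSV file? If so, how would I go about it? I have a feeling this could be done in Python, but I'm not sure, since I only started building things in Python two days ago (I'm a Java guy).
I'm trying to avoid typing all of this into a document by hand, but if there is no other option I suppose I'll have to. I only need the data.
If any scripting pros could point me in the right direction, I'd really appreciate it :)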

Best answer

There is a free API available, as @rjbman pointed out. See also:
is there an api for the new york mta subway/bus/train etc?
MTA-API python wrapper
However, here is an alternative solution that parses the HTML with BeautifulSoup:

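from bs4 import BeautifulSoup
import requests

url = "http://web.mta.info/nyct/service/threelin.htm"
response = requests.get(url)

# Name the parser explicitly; otherwise bs4 picks whichever one is installed and warns.
soup = BeautifulSoup(response.content, 'html.parser')
# Locate the stops table by its summary attribute.
table = soup.find('table', summary='Table of 3 Subway Line Stops')
# Take the third cell of each row with height=25 and split its text on ' /'.
stops = [tr('td')[2].text.strip().replace('\n', '').split(' /')
         for tr in table('tr', height=25)]
print(stops)  # parenthesized form runs on both Python 2 and Python 3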

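This prints all the stops as a list of lists (the u'' prefixes are how Python 2 displays unicode strings; Python 3 omits them):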

[
    [u'Harlem-148 Street', u'7 Avenue'],
    [u'145 Street', u'Lenox Avenue'],
    ...
    [u'Van Siclen Avenue', u'Livonia Avenue'],
    [u'New Lots Avenue', u'Livonia Avenue']
]

I used the requests module to fetch the page content.
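
Since the question also asks about getting the result into a CSV file, here is a minimal sketch of that last step using the standard csv module. It is not part of the original answer: the helper name write_stops_csv, the column names "station" and "street", and the output filename are all my own assumptions, and the sample rows are just copied from the output above so the snippet runs without hitting the website.

```python
import csv


def write_stops_csv(stops, path):
    """Write a list of row lists (like the scraped `stops`) to a CSV file."""
    # newline="" lets the csv module manage line endings itself (Python 3).
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Hypothetical header; rename the columns to whatever fits your data.
        writer.writerow(["station", "street"])
        # Each inner list becomes one CSV row; quoting is handled automatically.
        writer.writerows(stops)


# Sample rows taken from the printed output; in practice pass the scraped list.
sample_stops = [
    ["Harlem-148 Street", "7 Avenue"],
    ["145 Street", "Lenox Avenue"],
]
write_stops_csv(sample_stops, "line3_stops.csv")
```

If a row splits into more or fewer than two parts, csv.writer still writes it as-is, so it is worth eyeballing the file once before relying on the two-column layout.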

For "python - Scraping MTA subway data?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25634764/