This was my first time working with Python. I downloaded Python and PyCharm from their official websites and picked up the basics of web crawling online.
Requirements:
1. Crawl the latest epidemic data from the web and store it in a MySQL database
2. Display the detailed data in a visualization
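Project approach:
Crawler:
1. Import the packages
2. Send the request and print the response status code
Add headers to imitate a browser
3. Extract the required data
4. Store it in the database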
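Step 3 can be tried on its own, without touching the network. The following is a minimal sketch: the helper name `extract_provinces` and the sample payload are made up for illustration, and it assumes, as the project code does, that the first entry of `areaTree` is the one whose `children` are the provinces.

```python
import json


def extract_provinces(response_text: str) -> list:
    """Return the list of province records from the raw JSON response text.

    Assumes the key layout used in this project: data -> areaTree ->
    first entry -> children.
    """
    data = json.loads(response_text)['data']
    return data['areaTree'][0]['children']


# Hypothetical, heavily trimmed payload that only mimics the key layout.
# The numbers are placeholders, not real figures.
sample_text = json.dumps({
    'data': {
        'lastUpdateTime': '2020-03-10 12:00:00',
        'areaTree': [
            {'name': 'China', 'children': [
                {'name': 'Hubei', 'total': {'confirm': 10}, 'today': {'confirm': 1}},
                {'name': 'Beijing', 'total': {'confirm': 5}, 'today': {'confirm': 0}},
            ]},
        ],
    },
})

provinces = extract_provinces(sample_text)
print([p['name'] for p in provinces])
```

Keeping the extraction in a function of the response text makes it possible to check the parsing against a saved response before involving the database.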
Project source code:
import json
import sys

import pymysql
import requests


def get_wangyi_request():
    """Request the epidemic data from the NetEase (wangyi) API."""
    url = 'https://c.m.163.com/ug/api/wuhan/app/data/list-total'
    # Headers that make the request look like it comes from a browser.
    # 'br' is left out of accept-encoding because requests can only decode
    # Brotli responses when an extra package is installed.
    headers = {
        'accept': '*/*',
        'accept-encoding': 'gzip, deflate',
        'accept-language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
        'origin': 'https://wp.m.163.com',
        'referer': 'https://wp.m.163.com/',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-site',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4056.0 Safari/537.36 Edg/82.0.432.3'
    }
    result = requests.get(url, headers=headers)
    print(result.status_code)  # print the response status code
    return result


def print_mess1(string: str, dict1total: dict):
    """Write the four counts in dict1total to stdout, each prefixed by string."""
    # A count of None is printed as 0.
    sys.stdout.write(string + 'confirmed: ' + str(dict1total['confirm'] if dict1total['confirm'] is not None else 0))
    sys.stdout.write(' ')
    sys.stdout.write(string + 'suspected: ' + str(dict1total['suspect'] if dict1total['suspect'] is not None else 0))
    sys.stdout.write(' ')
    sys.stdout.write(string + 'healed: ' + str(dict1total['heal'] if dict1total['heal'] is not None else 0))
    sys.stdout.write(' ')
    sys.stdout.write(string + 'dead: ' + str(dict1total['dead'] if dict1total['dead'] is not None else 0))


if __name__ == '__main__':
    result = get_wangyi_request()
    data = json.loads(result.text)['data']
    # print(data.keys())
    # dict_keys(['chinaTotal', 'chinaDayList', 'lastUpdateTime', 'areaTree'])
    print(data['lastUpdateTime'])
    province_list = data['areaTree'][0]['children']
    # Each province contains the following keys:
    # dict_keys(['today', 'total', 'extData', 'name', 'id', 'lastUpdateTime', 'children'])
    conn = pymysql.connect(
        host='localhost',  # database host
        port=3306,  # an integer, not a string, so no quotes
        user='root',
        password='20000604',
        db='database',
        charset='utf8'
    )
    cursor = conn.cursor()  # get a cursor
    sql = 'insert into pachong1 (province,total_confirm,total_suspect,total_heal,total_dead,today_confirm,today_suspect,today_heal,today_dead,today_lastUpdate,id) values (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);'
    row_id = 0  # renamed from id, which shadows a built-in
    for province_data in province_list:  # renamed from dict, which shadows a built-in
        province = province_data['name']
        total_confirm = province_data['total']['confirm']
        total_suspect = province_data['total']['suspect']
        total_heal = province_data['total']['heal']
        total_dead = province_data['total']['dead']
        today_confirm = province_data['today']['confirm']
        today_suspect = province_data['today']['suspect']
        today_heal = province_data['today']['heal']
        today_dead = province_data['today']['dead']
        today_lastUpdate = province_data['lastUpdateTime']
        row_id += 1
        sys.stdout.write(province + ' ')
        cursor.execute(sql, [province, total_confirm, total_suspect, total_heal, total_dead, today_confirm, today_suspect, today_heal, today_dead, today_lastUpdate, row_id])
    print()
    conn.commit()  # commit all inserted rows in one transaction
    cursor.close()
    conn.close()
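The body of the insert loop can also be written as a small pure function that turns one province record into a parameter tuple. This is only a sketch, not the method used above: `or_zero` and `province_to_row` are hypothetical names, the sample record is made up, and replacing `None` with 0 is an assumption borrowed from `print_mess1` rather than something the insert loop itself does.

```python
def or_zero(value):
    """Replace a missing (None) count with 0, as print_mess1 does."""
    return 0 if value is None else value


def province_to_row(province: dict, row_id: int) -> tuple:
    """Build the parameter tuple for one INSERT, in the table's column order."""
    total = province['total']
    today = province['today']
    return (
        province['name'],
        or_zero(total.get('confirm')),
        or_zero(total.get('suspect')),
        or_zero(total.get('heal')),
        or_zero(total.get('dead')),
        or_zero(today.get('confirm')),
        or_zero(today.get('suspect')),
        or_zero(today.get('heal')),
        or_zero(today.get('dead')),
        province['lastUpdateTime'],
        row_id,
    )


# Hypothetical record with placeholder numbers; 'suspect' is None on purpose.
sample = {
    'name': 'Hubei',
    'total': {'confirm': 10, 'suspect': None, 'heal': 3, 'dead': 1},
    'today': {'confirm': 2, 'suspect': None, 'heal': 1, 'dead': 0},
    'lastUpdateTime': '2020-03-10 12:00:00',
}

row = province_to_row(sample, 1)
print(row)
```

With a list of such tuples, `cursor.executemany(sql, rows)` could replace the per-row `cursor.execute` call, and the function can be checked without a database connection.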
Screenshot of the data stored in the database:
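Problems encountered:
1. Importing the MySQL package failed
Fix: install the package from the terminal (cmd), for example pip install pymysql (the package the script imports)
Time spent: 8 minutes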
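A quick way to see whether the import failure comes from a missing package is to ask the interpreter itself. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `is_installed` is made up for illustration. One possible cause, which is an assumption here since the post does not say, is that pip installed the package into a different interpreter than the one PyCharm runs.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can find the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The two third-party packages this project imports.
for name in ('requests', 'pymysql'):
    if is_installed(name):
        print(name, '-> ok')
    else:
        print(name, '-> missing, try: python -m pip install ' + name)
```

Running this inside PyCharm checks the interpreter that the project actually uses, and `python -m pip install` installs into whichever `python` is invoked.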
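2. Writing the data to the database failed
Cause: the column names in the database did not match the names used in the Python code
Time spent: 2 minutes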
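One way to make this kind of mismatch easier to spot is to keep the column names in a single list and build the statement from it. This is a sketch under assumptions: `COLUMNS` and `build_insert_sql` are hypothetical names, and the list simply repeats the columns of the `pachong1` insert statement in the source code.

```python
# Column names of the pachong1 table, in insert order (copied from the
# insert statement in the project source).
COLUMNS = [
    'province',
    'total_confirm', 'total_suspect', 'total_heal', 'total_dead',
    'today_confirm', 'today_suspect', 'today_heal', 'today_dead',
    'today_lastUpdate',
    'id',
]


def build_insert_sql(table: str, columns: list) -> str:
    """Build a parameterized INSERT with one %s placeholder per column."""
    column_list = ','.join(columns)
    placeholders = ','.join(['%s'] * len(columns))
    return f'insert into {table} ({column_list}) values ({placeholders});'


sql = build_insert_sql('pachong1', COLUMNS)
print(sql)
```

The table and column names are interpolated directly into the statement, so they should only ever come from the code itself; the row values still go through the `%s` placeholders.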