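Hello everyone,
① I am currently learning web scraping and would like to implement the following:
Scrape the [number of related results] that Baidu shows on the results page after searching for [two keywords], restricted to the years 2019 and 2020.
② Problems encountered:
Once a time range is set, the number of related results is no longer displayed (see the attached screenshot). The code also behaves inconsistently: some runs succeed, others fail with 'NoneType' object has no attribute 'group'.
There are two keywords. How should they be combined so that the search is for "keyword 1" & "keyword 2"? (So far I have only tried a single keyword.)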
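On the two-keyword question, one possible approach is to wrap each keyword in double quotes and join them with a space, on the assumption that Baidu treats quoted terms as exact phrases and space-separated terms as a combined (AND) query. The sketch below only builds the URL; `build_baidu_url` is a hypothetical helper name, and the `gpc`/`tfflag` parameters are copied from the URL in this post rather than from any documented API.

```python
from urllib.parse import urlencode


def build_baidu_url(keywords, t1_stamp=None, t2_stamp=None):
    """Build a Baidu search URL for several exact-phrase keywords.

    Assumption: quoting each keyword and separating them with spaces
    makes Baidu require all of them. Hypothetical helper, not an official API.
    """
    # '"kw1" "kw2"' -> each keyword quoted, joined by a space
    wd = ' '.join(f'"{k}"' for k in keywords)
    params = {'ie': 'UTF-8', 'wd': wd}
    if t1_stamp is not None and t2_stamp is not None:
        # Time filter in the same form as the URL used in this post
        params['gpc'] = f'stf={int(t1_stamp)},{int(t2_stamp)}|stftype=2'
        params['tfflag'] = 1
    # urlencode percent-encodes quotes, '|' and non-ASCII keywords
    return 'https://www.baidu.com/s?' + urlencode(params)


if __name__ == '__main__':
    print(build_baidu_url(['python', 'data'], 1546272000, 1577721600))
```

Encoding the query with `urlencode` also avoids problems when a keyword contains Chinese characters, spaces, or `&`.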
③ My current code:
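import re
from datetime import datetime

import requests


def bd_searout(key, t1, t2):
    '''key is the search keyword; t1 is the start time and t2 the end time.
    t1 and t2 must be datetime objects. Returns the matched text, or None.'''
    # datetime.timestamp() returns a float; truncate to whole seconds
    # (assumed to be what the gpc time filter expects)
    t1_stamp = int(t1.timestamp())
    t2_stamp = int(t2.timestamp())
    header = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36',
    }
    # Pass the query as params so requests URL-encodes the keyword and the '|'
    params = {
        'ie': 'UTF-8',
        'wd': key,
        'gpc': f'stf={t1_stamp},{t2_stamp}|stftype=2',
        'tfflag': 1,
    }
    result = requests.get('https://www.baidu.com/s', params=params, headers=header, timeout=5)
    print(result.url)
    print(result)
    text_out = result.text
    # Non-greedy match so the pattern stops at the first '个'
    p = re.compile('百度为您找到相关结果约.+?个')
    m = p.search(text_out)
    # search() returns None when the page has no result-count line,
    # which is what raised the 'NoneType' error
    return m.group() if m else None


t1 = datetime(2019, 1, 1)
t2 = datetime(2019, 12, 31)
out = bd_searout('python', t1, t2)
print(out)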
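The 'NoneType' object has no attribute 'group' error means the regular expression found no match, so `search()` returned None. That is expected whenever the returned page does not contain the result-count line, for example when the count is hidden after a time range is set (as described above) or, possibly, when Baidu serves a different page to the script (an assumption, not verified here). A minimal sketch of a parser that tolerates a missing count and returns an integer; `parse_result_count` is a hypothetical helper name, and the pattern assumes the count appears as digits with optional commas:

```python
import re

# Matches e.g. '百度为您找到相关结果约1,230,000个'; '约' is treated as optional
COUNT_RE = re.compile(r'百度为您找到相关结果约?([\d,]+)个')


def parse_result_count(html):
    """Return the result count as an int, or None if the page has no count."""
    m = COUNT_RE.search(html)
    if m is None:
        # No count on this page: let the caller decide whether to retry or skip
        return None
    return int(m.group(1).replace(',', ''))


if __name__ == '__main__':
    sample = '<span>百度为您找到相关结果约1,230,000个</span>'
    print(parse_result_count(sample))
    print(parse_result_count('<html>no count here</html>'))
```

Returning None instead of raising lets the calling code log the URL and save the raw HTML for inspection, which helps tell a hidden count apart from a blocked request.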
Any help would be greatly appreciated. Thank you!

