One more question.
When I scrape the page “https://www.nejm.org/search?date=custom&toYear=2017&q=2017+AND+%222017%22+AND+2017&fromYear=2017&toMonth=12&fromMonth=1&isAdvancedSearch=true#qs=%3Fdate%3Dcustom%26toYear%3D2017%26requestType%3Dajax%26toMonth%3D12%26isAdvancedSearch%3Dtrue%26q%3D2017%2BAND%2B%25222017%2522%2BAND%2B2017%26fromYear%3D2017%26fromMonth%3D1%26viewClass%3D%26page%3D5%26manualFilterParam%3DcontentAge_delimiter_contentAge_firstDelimiter” (this is page 5), I only ever get the content of the first page.
(Each page lists 20 papers, but whichever page's URL I use, I only get the content of page 1. The number 5 after page%3D in the URL marks page 5.)
For example:
library(rvest)
# URL whose fragment says page=4
try <- read_html("https://www.nejm.org/search?date=custom&toYear=2017&q=2017+AND+%222017%22+AND+2017&fromYear=2017&toMonth=12&fromMonth=1&isAdvancedSearch=true#qs=%3Fdate%3Dcustom%26toYear%3D2017%26requestType%3Dajax%26toMonth%3D12%26isAdvancedSearch%3Dtrue%26q%3D2017%2BAND%2B%25222017%2522%2BAND%2B2017%26fromYear%3D2017%26fromMonth%3D1%26viewClass%3D%26page%3D4%26manualFilterParam%3DcontentAge_delimiter_contentAge_delimiter_contentAge_delimiter_contentAge_firstDelimiter")
trys <- html_nodes(try, 'a.js__sliLearn.m-result__link') %>% html_attrs()
and
# URL whose fragment says page=5
try <- read_html("https://www.nejm.org/search?date=custom&toYear=2017&q=2017+AND+%222017%22+AND+2017&fromYear=2017&toMonth=12&fromMonth=1&isAdvancedSearch=true#qs=%3Fdate%3Dcustom%26toYear%3D2017%26requestType%3Dajax%26toMonth%3D12%26isAdvancedSearch%3Dtrue%26q%3D2017%2BAND%2B%25222017%2522%2BAND%2B2017%26fromYear%3D2017%26fromMonth%3D1%26viewClass%3D%26page%3D5%26manualFilterParam%3DcontentAge_delimiter_contentAge_delimiter_contentAge_firstDelimiter")
trys <- html_nodes(try, 'a.js__sliLearn.m-result__link') %>% html_attrs()
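These should return data from page 4 and page 5 respectively, but what actually comes back is the data from page 1 in both cases.
Why does this happen, and what can I do about it?
Thanks!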
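
One likely cause worth checking: everything after the # in a URL is a fragment, which the client keeps to itself and never sends to the server. Both calls above therefore request the same address, and it is normally the page's own JavaScript that reads page%3D5 from the fragment and loads that page via AJAX. Below is a minimal sketch in base R using abbreviated versions of the URLs above; the helper split_fragment and the shortened URLs are illustrative only, and the final candidate URL is a guess that has not been verified against the site.

```r
# Abbreviated versions of the two URLs from the post (most parameters dropped
# for readability); only the page number inside the fragment differs.
base_url <- "https://www.nejm.org/search?date=custom&fromYear=2017&toYear=2017"
url_page4 <- paste0(base_url, "#qs=%3Fdate%3Dcustom%26requestType%3Dajax%26page%3D4")
url_page5 <- paste0(base_url, "#qs=%3Fdate%3Dcustom%26requestType%3Dajax%26page%3D5")

# Hypothetical helper: split a URL into the part sent to the server and the
# fragment that stays on the client.
split_fragment <- function(u) {
  pos <- as.integer(regexpr("#", u, fixed = TRUE))
  if (pos == -1L) {
    return(list(request = u, fragment = ""))
  }
  list(request = substr(u, 1L, pos - 1L),
       fragment = substr(u, pos + 1L, nchar(u)))
}

page4 <- split_fragment(url_page4)
page5 <- split_fragment(url_page5)

# Both URLs reduce to the same HTTP request, hence the same (first) page.
same_request <- identical(page4$request, page5$request)
print(same_request)

# Decoding the fragment shows the parameters the page's JavaScript would use.
decoded <- utils::URLdecode(sub("^qs=", "", page5$fragment))
print(decoded)

# Assumption, not verified against the site: the same parameters may be
# accepted as a regular query string.
candidate_url <- paste0("https://www.nejm.org/search", decoded)
print(candidate_url)
```

If the site accepts it, read_html(candidate_url) might return page 5 directly. If not, the usual fallbacks are to look up the actual AJAX request in the browser's network tab and call that, or to drive a real browser (for example with RSelenium) so the JavaScript runs before the HTML is read.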