You can try the other links in the code; none of them produce an error. Only the page you want to crawl fails.
import urllib2
import urllib
import re
import socket
import httplib
import sys

# Build the search URL from its query parameters
url='https://asos.tmall.com/search.htm'
values={
    'spm':'a1z10.5-b.w4011-5044691060.102.QqiXRs',
    'search':'y',
    'orderType':'defaultSort',
    'pageNo':'2',
    'tsearch':'y'
}
data=urllib.urlencode(values)
url=url+'?'+data
# Other links tried in turn; only the last uncommented assignment is requested
url='https://detail.tmall.com/item.htm?spm=a1z10.5-b.w4011-5044691060.108.tFTsqy&id=521245233587&rn=36a17f1a4df6092dbc5daa6cf2ca7f99&abbucket=1'
url='http://wuyouhuwai.taobao.com/search.htm?search=y&v=1'
url='https://asos.tmall.com/search.htm?spm=a1z10.5-b.w4011-5044691060.102.QqiXRs&search=y&orderType=defaultSort&pageNo=2&tsearch=y#anchor'
#url='https://asos.tmall.com/category.htm?spm=a1z10.5-b.w4010-5044691058.2.IHxn4N&search=y'
#url='https://www.tmall.com/?spm=a1z10.5-b.0.0.gAVjBC'
#request=urllib2.Request(url)
#url='https://asos.tmall.com/search.htm?spm=a1z10.5-b.w4011-5044691060.102.QqiXRs&search=y&orderType=defaultSort&pageNo=2&tsearch=y'
#url=urllib.quote_plus(url,safe=':\'/?&=()')
headers={'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.137 Safari/537.36 LBBROWSER' }
# The second positional argument of Request is data, so headers go in by keyword
request = urllib2.Request(url,headers=headers)
try:
    # Open the Request object (not the bare URL) so the User-Agent is actually sent
    response = urllib2.urlopen(request)
    html=response.read()
    print html
except urllib2.URLError,e:
    print e.reason
The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Found
Python 2.7.5, Windows 7
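
A redirect loop like the one reported above often shows up with urllib2 when the server sets a cookie on its first response and then redirects, expecting that cookie to come back; a client that drops cookies just gets redirected again. Whether that is what this Tmall page does is an assumption, not something confirmed in this thread. The sketch below builds an opener that keeps cookies across redirects. `make_opener` and `fetch` are hypothetical helper names, and the import fallback is only there so the same code also runs on Python 3.

```python
try:  # Python 2, as used in this thread
    import urllib2 as request_mod
    import cookielib
except ImportError:  # Python 3 equivalents
    import urllib.request as request_mod
    import http.cookiejar as cookielib


def make_opener(user_agent):
    """Return (opener, jar); the opener stores and resends cookies."""
    jar = cookielib.CookieJar()
    # HTTPCookieProcessor saves Set-Cookie values and sends them back
    # on every later request, including the ones made while following redirects.
    opener = request_mod.build_opener(request_mod.HTTPCookieProcessor(jar))
    # Headers in addheaders are attached to every request made by this opener.
    opener.addheaders = [('User-Agent', user_agent)]
    return opener, jar


def fetch(url, user_agent='Mozilla/5.0', timeout=10):
    """Download url with cookie support and return the raw response body."""
    opener, _ = make_opener(user_agent)
    return opener.open(url, timeout=timeout).read()
```

Calling `fetch(url)` with the failing search URL would show whether cookies are the cause. If the loop persists, the site may require a login or run an anti-crawler check, which this sketch does not handle.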