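requests percent-encodes URLs automatically; Scrapy does not encode them the same way, for example when the URL contains "|".
Scrapy sends some default headers of its own: Accept and Accept-Language.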
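If a URL containing "|" behaves differently in Scrapy, one option is to percent-encode it yourself before building the request. The following is a minimal sketch using only the standard library; the helper name percent_encode_url and the SAFE_CHARS set are assumptions for illustration, not part of requests or Scrapy.

```python
from urllib.parse import quote

# Characters left untouched: common URL delimiters plus "%", so that
# escapes already present in the URL are not encoded a second time.
# This set is an assumption chosen for illustration.
SAFE_CHARS = "!#$%&'()*+,/:;=?@[]~"


def percent_encode_url(url):
    """Percent-encode characters such as "|" that are not in SAFE_CHARS."""
    return quote(url, safe=SAFE_CHARS)


encoded = percent_encode_url("http://example.com/search?q=a|b")
print(encoded)  # http://example.com/search?q=a%7Cb
```

The encoded string can then be passed to scrapy.Request(encoded, ...) in place of the raw URL.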
DEFAULT_REQUEST_HEADERS = {
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language': 'en',
}
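When a site answers Scrapy differently than requests, these default headers are one place to look. A minimal sketch of overriding them in a project's settings.py follows; the values are illustrative assumptions ("*/*" is the Accept value requests sends by default, and the Accept-Language value is only an example).

```python
# settings.py (sketch): replace Scrapy's default request headers.
# The values below are assumptions for illustration only.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "*/*",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
}
```

Headers passed to an individual scrapy.Request via headers= still take precedence over these defaults.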
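requests does not carry cookies from one request to the next by default (a requests.Session is needed for that).
Scrapy passes cookies along by default (COOKIES_ENABLED defaults to True, which behaves much like a requests Session).
In Scrapy, cookies cannot be written into headers; pass them separately as cookies=cookies_dict (the most common beginner mistake).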
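A common case is a raw Cookie header copied from the browser, which has to become a dict before Scrapy will accept it. The following is a minimal sketch; the helper name cookie_header_to_dict and the sample cookie values are hypothetical.

```python
def cookie_header_to_dict(raw_header):
    """Turn a raw "name=value; name2=value2" Cookie header into a dict."""
    cookies = {}
    for pair in raw_header.split(";"):
        # partition splits on the first "=" only, so values that
        # themselves contain "=" are kept intact.
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


cookies_dict = cookie_header_to_dict("sessionid=abc123; theme=dark")
print(cookies_dict)  # {'sessionid': 'abc123', 'theme': 'dark'}

# Usage inside a spider (not executed here):
#   yield scrapy.Request(url, cookies=cookies_dict, callback=self.parse)
```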
Appendix:
Adding a proxy in requests:
proxies = {'http': 'http://localhost:8888', 'https': 'http://localhost:8888'}
Adding a proxy in Scrapy:
meta={'proxy': 'http://localhost:8888'}
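The two proxy settings go in different places: requests takes a proxies mapping keyed by scheme, while Scrapy reads a single proxy key from the request's meta. The following is a small sketch that builds both from one address; the helper names are hypothetical, and the proxy address is the one used above.

```python
PROXY_URL = "http://localhost:8888"


def requests_proxies(proxy_url=PROXY_URL):
    """Mapping for the proxies= argument of requests, one entry per scheme."""
    return {"http": proxy_url, "https": proxy_url}


def scrapy_proxy_meta(proxy_url=PROXY_URL):
    """Dict for the meta= argument of scrapy.Request."""
    return {"proxy": proxy_url}


# Usage (not executed here):
#   requests.get("http://example.com", proxies=requests_proxies())
#   scrapy.Request("http://example.com", meta=scrapy_proxy_meta())
```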