• Understanding Scrapy in Depth



    What is Scrapy

    An open source and collaborative framework for extracting the data you need from websites. In a fast, simple, yet extensible way.

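    Scrapy is a fast, simple, and powerful web crawling framework for Python. It is typically used to crawl websites and extract structured data from their pages, and it can also be used for monitoring and automated testing. Its architecture is shown in the following diagram:

    [Figure: Scrapy architecture diagram]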

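    How Scrapy works

    Understanding how Scrapy works makes the rest of the material easier to follow (you can also read the quick start further down first and then come back here). The Scrapy run flow is shown in the following diagram:

    [Figure: Scrapy run flow diagram]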

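    A run proceeds as follows:

    1. When the program starts, it creates one or more Spiders. Each Spider passes its Requests through the SpiderMiddlewares and hands them to the Engine.

    2. The Engine forwards the requests it receives from the Spiders to the Scheduler, which decides the order in which they run.

    3. The Scheduler returns the requests that should run next to the Engine.

    4. The Engine passes each request through the DownloaderMiddlewares and sends it to the Downloader.

    5. The Downloader uses the Request to download the page or API response and builds a Response, which travels back through the DownloaderMiddlewares to the Engine.

    6. The Engine passes the Response through the SpiderMiddlewares to the Spider for processing.

    7. The Spider processes the Response and returns the results to the Engine.

    8. Any Request objects returned in step 7 go back to step 2; any Items (structured data objects) or dicts are handed to the ItemPipelines.

    9. Customize the ItemPipelines to control how the data is processed and persisted.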
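    The loop described in the steps above can be sketched in a few lines of plain Python. This is a toy model for illustration only, not Scrapy's implementation: the Request and Response classes, fake_download, and run_engine are hypothetical names, middlewares are left out, and nothing touches the network.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Request:
    url: str
    callback: Callable  # spider method that handles the response


@dataclass
class Response:
    url: str
    body: str


def fake_download(request):
    # Stand-in for the Downloader: builds a response without any network I/O.
    return Response(url=request.url, body=f"<html>{request.url}</html>")


def run_engine(start_requests, pipeline):
    scheduler = deque(start_requests)      # step 2: requests are queued in the scheduler
    while scheduler:
        request = scheduler.popleft()      # step 3: scheduler hands back the next request
        response = fake_download(request)  # steps 4-5: download and build a response
        for result in request.callback(response):  # steps 6-7: spider callback runs
            if isinstance(result, Request):
                scheduler.append(result)   # step 8: new requests return to the scheduler
            else:
                pipeline(result)           # steps 8-9: items go to the pipeline


def parse_list(response):
    # A callback may yield both follow-up requests and items.
    yield Request(url=response.url + "/detail", callback=parse_detail)
    yield {"page": response.url}


def parse_detail(response):
    yield {"page": response.url}


collected = []
run_engine([Request("http://example.com", parse_list)], collected.append)
print(collected)
```

    The point of the sketch is the routing rule in step 8: a callback's output is inspected, requests re-enter the queue, and everything else is treated as data.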

    Getting started with Scrapy

    1. Install Scrapy

    Install Scrapy with the following command:

    pip install scrapy
    

    Once installed, Scrapy provides a scrapy command-line tool. If scrapy --help prints output like the following, the installation succeeded:

    > scrapy --help
    Scrapy 2.6.2 - no active project

    Usage:
      scrapy <command> [options] [args]

    Available commands:
      bench         Run quick benchmark test
      commands
      fetch         Fetch a URL using the Scrapy downloader
      genspider     Generate new spider using pre-defined templates
      runspider     Run a self-contained spider (without creating a project)
      settings      Get settings values
      shell         Interactive scraping console
      startproject  Create new project
      version       Print Scrapy version
      view          Open URL in browser, as seen by Scrapy

      [ more ]      More commands available when run from project directory

    Use "scrapy <command> -h" to see more info about a command

    2. Create a Scrapy project

    Create a Scrapy project with scrapy startproject xxx, where xxx is the project name:

    scrapy startproject MySpider
    

    The command creates a MySpider directory in the current directory, with the following structure:

    MySpider/
    ├─scrapy.cfg
    └─MySpider/
        ├─items.py
        ├─middlewares.py
        ├─pipelines.py
        ├─settings.py
        ├─__init__.py
        └─spiders/
            └─__init__.py

    • items.py holds custom Items

    • middlewares.py holds SpiderMiddlewares and DownloaderMiddlewares

    • pipelines.py holds custom ItemPipelines

    • settings.py holds the project-wide settings

    • spiders/ holds all Spiders

    From here on, the outer MySpider/ directory is treated as the project root.

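    3. Create a Scrapy spider

    The command for creating a spider is scrapy genspider [spidername] [allow_domain]. Taking the Manmanbuy (慢慢买) price-history API as the example, create a manmanbuy spider: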

    scrapy genspider manmanbuy manmanbuy.com
    

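    The crawl flow for Manmanbuy price history is:

    1. Request the http://tool.manmanbuy.com/HistoryLowest.aspx page and read the value attribute of the hidden tag on it (the #ticket element selected in the spider code).

    2. Transform the value from step 1 into the Authorization request header.

    3. Generate the token request parameter.

    4. Call the http://tool.manmanbuy.com/api.ashx API to fetch the product's price history (this API requires a valid cookie; obtaining one is outside the scope of this article).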
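    Steps 2 and 3 can be tried in isolation, without Scrapy. The following is a minimal sketch that mirrors the helper logic used by the spider in this article; the function names build_authorization and sign_params are hypothetical, the timestamp is passed in so that the result is reproducible, and whether the site still accepts this scheme is not verified here.

```python
import hashlib
from urllib.parse import quote

# Constant taken from the spider code in this article.
SECRET = "c5c3f201a8e8fc634d37a766a0299218"


def build_authorization(ticket: str) -> str:
    # Take the first 160 characters, then move the last 4 of them to the front.
    return "BasicAuth " + ticket[:160][-4:] + ticket[:160 - 4]


def sign_params(params: dict, timestamp_ms: int) -> dict:
    # Add the millisecond timestamp, then sign all parameters in key order.
    signed = dict(params, t=str(timestamp_ms))
    payload = SECRET
    for key in sorted(signed):
        payload += key + quote(str(signed[key])).replace("/", "%2F")
    payload += SECRET
    digest = hashlib.md5(payload.upper().encode("utf-8")).hexdigest()
    signed["token"] = digest.upper()
    return signed


# Synthetic 160-character ticket, used only to make the rotation visible.
ticket = "A" * 156 + "WXYZ"
auth = build_authorization(ticket)

signed = sign_params(
    {"key": "https://item.jd.com/100011493273.html", "method": "getHistoryTrend"},
    timestamp_ms=1658906625000,
)
print(auth[:20], signed["t"], len(signed["token"]))
```

    Injecting the timestamp instead of calling time.time() inside the function keeps the signature deterministic, which makes the helper easy to unit test.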

    After running genspider, the generated manmanbuy.py file can be found in the spiders/ directory, with the following content:

    import scrapy


    class ManmanbuySpider(scrapy.Spider):
        name = 'manmanbuy'
        allowed_domains = ['manmanbuy.com']
        start_urls = ['http://manmanbuy.com/']

        def parse(self, response):
            pass

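    Here name is the spider's name, allowed_domains lists the domains the spider is allowed to visit, and start_urls lists the URLs the crawl starts from.

    A Spider can start crawling in two ways. The convenient way is to set start_urls, in which case the configured URLs are crawled directly on startup. The other way is to override the start_requests method and return custom initial Requests.

    4. Developing the Scrapy spider

    4.1. Write the Spider (MySpider/spiders/manmanbuy.py)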

    import copy
    import hashlib
    import time
    from urllib.parse import quote

    import scrapy


    class ManmanbuySpider(scrapy.Spider):
        name = 'manmanbuy'
        allowed_domains = ['manmanbuy.com']

        def start_requests(self):
            # Example: query the price history of a single JD.com product, ID 100011493273
            item_urls = ['https://item.jd.com/100011493273.html']
            # Request headers
            headers = {
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
            }
            # Step 1: fetch the ticket parameter from the H5 page
            for item_url in item_urls:
                yield scrapy.Request(url='http://tool.manmanbuy.com/HistoryLowest.aspx?url=' + item_url,
                                     headers=headers,
                                     # Pass the product URL through to the callback
                                     meta={'key': item_url})

        def parse(self, response: scrapy.http.Response):
            # Read the ticket value from the page
            ticket = response.css('#ticket')[0].attrib['value']
            # Build the parameters for the follow-up API request
            req = parse_req({'key': response.meta['key'], 'method': 'getHistoryTrend'})
            # Request headers
            headers = {
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
                # Compute the Authorization value
                'Authorization': parse_basic_auth(ticket),
            }
            return scrapy.FormRequest(url='http://tool.manmanbuy.com/api.ashx',
                                      method='POST',
                                      formdata=req,
                                      headers=headers,
                                      cookies=self.get_cookies(),
                                      # Custom callback
                                      callback=self.parse_history_price)

        def get_cookies(self):
            # The logic for obtaining a valid cookie is omitted
            cks = '_ga=GA1.2.604426644.1596510819; ASP.NET_SessionId=bbyuxdftfkcf5mrijdgkmnc5; Hm_lvt_01a310dc95b71311522403c3237671ae=1658906329; Hm_lvt_85f48cee3e51cd48eaba80781b243db3=1658748396,1658906330; _gid=GA1.2.472137414.1658906330; _gat_gtag_UA_145348783_1=1; 60014_mmbuser=VQYJA1IFBTBSVwNdClFVUgdRUQcLUgdXDg1RBgNTAVVUVAZRBQFeAw%3d%3d; Hm_lpvt_85f48cee3e51cd48eaba80781b243db3=1658906625; Hm_lpvt_01a310dc95b71311522403c3237671ae=1658906625'
            cookies = {}
            for one in cks.split(';'):
                # Split on the first '=' only, so values containing '=' stay intact
                k, v = one.strip().split('=', 1)
                cookies[k] = v
            return cookies

        def parse_history_price(self, response: scrapy.http.Response):
            # Log the response body
            self.logger.info(response.text)


    def parse_basic_auth(ticket):
        """
        Build the Authorization header value from the ticket (step 2 of the flow described above).
        """
        return 'BasicAuth ' + ticket[:160][-4:] + ticket[:160 - 4]


    def parse_req(d):
        """
        Add the t (millisecond timestamp) and token (signature) parameters to the request.
        """
        d['t'] = str(int(time.time() * 1000))
        n = copy.deepcopy(d)
        ks = list(n.keys())
        ks.sort()
        ask = 'c5c3f201a8e8fc634d37a766a0299218'
        mask = ask
        for k in ks:
            mask += f'{k}{quote(str(n[k])).replace("/", "%2F")}'
        mask += ask
        mask = mask.upper()
        md5 = hashlib.md5()
        md5.update(mask.encode('utf-8'))
        d['token'] = md5.hexdigest().upper()
        return d
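    When a raw Cookie header string is turned into a dict, as get_cookies does above, each pair should be split at the first "=" only, because a cookie value may itself contain "=" (base64 padding, for example). A minimal standalone sketch using str.partition follows; the function name parse_cookie_header and the sample string are made up for illustration.

```python
def parse_cookie_header(raw: str) -> dict:
    """Parse 'k1=v1; k2=v2' into a dict, keeping '=' characters inside values."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty segments such as a trailing ';'
        name, _, value = pair.partition("=")  # splits at the first '=' only
        cookies[name] = value
    return cookies


# Hypothetical sample; the 'token' value deliberately ends in '=='.
sample = "_ga=GA1.2.604426644.1596510819; token=abc==; ASP.NET_SessionId=xyz"
print(parse_cookie_header(sample))
```

    The resulting dict can be passed as the cookies argument of a Scrapy request.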

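    4.2. Modify the settings (MySpider/settings.py)

    # robots.txt check. The settings.py generated by startproject sets this to True; change it to False.
    ROBOTSTXT_OBEY = False

    Run scrapy crawl manmanbuy to start the spider. The log shows that the data is fetched successfully: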

    {"msg":"","code":0,"data":{"haveTrend":1,"changPriceRemark":"降幅5%","runtime":41,"zouShi_test":2,"changePriceCount":14,"spbh":"1|100011493273","spUrl":"https://item.jd.com/100011493273.html","spPic":"http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg","currentPrice":1049.00,"spName":"荣耀Play5T 22.5W超级快充 5000mAh大电池 6.5英寸护眼屏 全网通8GB+128GB极光蓝","lowerDate":"2022-03-08T00:00:00+08:00","lowerPrice":899.00,"bjid":551120462,"zouShi":2,"siteId":1,"siteName":"京东商城","datePrice":"[1621353600000,1199.00,\"\"],[1621440000000,1199.00,\"\"],[1621526400000,1199.00,\"\"],[1621612800000,1199.00,\"\"],[1621699200000,1199.00,\"\"],[1621785600000,1199.00,\"\"],[1621872000000,1199.00,\"\"],[1621958400000,1199.00,\"\"],[1622044800000,1199.00,\"\"],[1622131200000,1199.00,\"\"],[1622217600000,1199.00,\"\"],[1622304000000,1199.00,\"\"],[1622390400000,1199.00,\"\"],[1622476800000,1199.00,\"\"],[1622563200000,1199.00,\"\"],[1622649600000,1199.00,\"\"],[1622736000000,1199.00,\"\"],[1622822400000,1199.00,\"\"],[1622908800000,1199.00,\"\"],[1622995200000,1199.00,\"\"],[1623081600000,1199.00,\"\"],[1623168000000,1199.00,\"\"],[1623254400000,1199.00,\"\"],[1623340800000,1199.00,\"1199元\"],[1623427200000,1199.00,\"\"],[1623513600000,1199.00,\"\"],[1623600000000,1199.00,\"\"],[1623686400000,1199.00,\"\"],[1623772800000,1199.00,\"\"],[1623859200000,1199.00,\"\"],[1623945600000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1624032000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624118400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624204800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624291200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624377600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624464000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624550400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624636800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624723200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60
\"],[1624809600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624896000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624982400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625068800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625155200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625241600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625328000000,1199.00,\"\"],[1625414400000,1199.00,\"\"],[1625500800000,1189.0,\"京东秒杀价:1189\"],[1625587200000,1199.00,\"\"],[1625673600000,1189.0,\"\"],[1625760000000,1199.0,\"\"],[1625846400000,1189.0,\"\"],[1625932800000,1199.0000,\"\"],[1626019200000,1199.0000,\"\"],[1626105600000,1189.0,\"\"],[1626192000000,1199.0,\"\"],[1626278400000,1189.0,\"\"],[1626364800000,1199.0000,\"\"],[1626451200000,1199.0000,\"\"],[1626537600000,1199.0000,\"\"],[1626624000000,1199.0000,\"\"],[1626710400000,1199.0000,\"\"],[1626796800000,1199.0000,\"\"],[1626883200000,1189.00,\"\"],[1626969600000,1199.0000,\"\"],[1627056000000,1199.0000,\"\"],[1627142400000,1199.0000,\"\"],[1627228800000,1199.0000,\"\"],[1627315200000,1189.00,\"\"],[1627401600000,1199.0000,\"\"],[1627488000000,1189.00,\"1189元\"],[1627574400000,1189.00,\"\"],[1627660800000,1199.00,\"\"],[1627747200000,1179.00,\"1179元\"],[1627833600000,1189.0000,\"\"],[1627920000000,1199.00,\"\"],[1628006400000,1199.00,\"\"],[1628092800000,1189.00,\"\"],[1628179200000,1189.0000,\"\"],[1628265600000,1189.0000,\"\"],[1628352000000,1189.0000,\"\"],[1628438400000,1199.00,\"\"],[1628524800000,1189.00,\"\"],[1628611200000,1199.0,\"\"],[1628697600000,1189.0000,\"\"],[1628784000000,1189.0000,\"\"],[1628870400000,1189.0000,\"\"],[1628956800000,1199.00,\"1199元\"],[1629043200000,1189.0000,\"\"],[1629129600000,1199.00,\"\"],[1629216000000,1189.00,\"\"],[1629302400000,1199.0000,\"\"],[1629388800000,1169.0,\"京东秒杀价:1169\"],[1629475200000,1199.00,\"\"],[1629561600000,1199.00,\"\"],[1629648000000,1199.00,\"\"],[1629734400000,1169.00,\"\"],[1629820800000,1199.0,\"\"],[1629907200000,1189.0,\"
京东秒杀价:1189\"],[1629993600000,1199.00,\"\"],[1630080000000,1199.00,\"\"],[1630166400000,1199.00,\"\"],[1630252800000,1199.00,\"\"],[1630339200000,1189.00,\"1189元包邮\"],[1630425600000,1189.00,\"\"],[1630512000000,1175.00,\"购买1件,plus价格1175\"],[1630598400000,1189.00,\"\"],[1630684800000,1199.0,\"\"],[1630771200000,1189.0000,\"\"],[1630857600000,1199.0,\"\"],[1630944000000,1189.0000,\"\"],[1631030400000,1199.0,\"\"],[1631116800000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1631203200000,1189.0000,\"\"],[1631289600000,1189.00,\"\"],[1631376000000,1199.00,\"\"],[1631462400000,1189.0000,\"\"],[1631548800000,1189.0,\"\"],[1631635200000,1199.0,\"\"],[1631721600000,1189.00,\"\"],[1631808000000,1199.00,\"\"],[1631894400000,1189.0,\"\"],[1631980800000,1199.0,\"\"],[1632067200000,1169.00,\"\"],[1632153600000,1169.00,\"\"],[1632240000000,1169.00,\"\"],[1632326400000,1189.0,\"\"],[1632412800000,1169.0,\"\"],[1632499200000,1199.0,\"\"],[1632585600000,1169.00,\"\"],[1632672000000,1199.00,\"\"],[1632758400000,1169.0,\"\"],[1632844800000,1199.0000,\"\"],[1632931200000,1169.0,\"\"],[1633017600000,1169.0,\"\"],[1633104000000,1169.0,\"\"],[1633190400000,1169.00,\"\"],[1633276800000,1169.0000,\"\"],[1633363200000,1199.00,\"\"],[1633449600000,1199.00,\"\"],[1633536000000,1169.0,\"\"],[1633622400000,1169.0,\"\"],[1633708800000,1169.0,\"京东秒杀价:1169\"],[1633795200000,1169.0,\"\"],[1633881600000,1199.00,\"\"],[1633968000000,1169.00,\"\"],[1634054400000,1199.0,\"\"],[1634140800000,1169.00,\"\"],[1634227200000,1199.0,\"\"],[1634313600000,1199.0,\"\"],[1634400000000,1169.0,\"\"],[1634486400000,1199.0,\"\"],[1634572800000,1169.00,\"\"],[1634659200000,1199.0000,\"\"],[1634745600000,1169.00,\"\"],[1634832000000,1199.0,\"\"],[1634918400000,1199.0,\"\"],[1635004800000,1199.0,\"\"],[1635091200000,1199.0,\"\"],[1635177600000,1199.0,\"\"],[1635264000000,1199.0,\"\"],[1635350400000,1199.0,\"\"],[1635436800000,1199.0,\"\"],[1635523200000,1099.00,\"1099元 
\"],[1635609600000,1099.0,\"\"],[1635696000000,1099.00,\"\"],[1635782400000,1099.00,\"\"],[1635868800000,1099.00,\"\"],[1635955200000,1099.00,\"购买1件,plus价格1099\"],[1636041600000,949.00,\"购买1件,当前价:1099.00,可叠加优惠券2:满880减150\"],[1636128000000,1099.00,\"\"],[1636214400000,1199.0,\"\"],[1636300800000,1099.0,\"\"],[1636387200000,1099.0,\"\"],[1636473600000,1099.0,\"\"],[1636560000000,979.00,\"购买1件,当前价:1099.00,满减:每满1080减120\"],[1636646400000,1099.0000,\"\"],[1636732800000,1199.00,\"\"],[1636819200000,1199.00,\"\"],[1636905600000,1199.00,\"\"],[1636992000000,1199.00,\"\"],[1637078400000,1099.00,\"\"],[1637164800000,1099.00,\"\"],[1637251200000,1099.00,\"\"],[1637337600000,1099.00,\"\"],[1637424000000,1099.00,\"1099元\"],[1637510400000,1099.00,\"\"],[1637596800000,1099.0,\"\"],[1637683200000,1099.00,\"\"],[1637769600000,1099.00,\"\"],[1637856000000,1099.00,\"\"],[1637942400000,1099.00,\"\"],[1638028800000,1199.0000,\"\"],[1638115200000,1199.00,\"\"],[1638201600000,1199.00,\"\"],[1638288000000,1099.0,\"\"],[1638374400000,1099.00,\"\"],[1638460800000,1099.00,\"\"],[1638547200000,1199.0,\"\"],[1638633600000,1199.00,\"\"],[1638720000000,1099.0,\"\"],[1638806400000,1099.00,\"\"],[1638892800000,1099.00,\"\"],[1638979200000,1099.00,\"1099元\"],[1639065600000,1099.00,\"\"],[1639152000000,1089.0,\"\"],[1639238400000,1089.0,\"\"],[1639324800000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639411200000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639497600000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639584000000,1099.00,\"\"],[1639670400000,1099.0,\"\"],[1639756800000,1099.0,\"\"],[1639843200000,1099.0,\"\"],[1639929600000,1099.00,\"1099元\"],[1640016000000,1099.0,\"\"],[1640102400000,1099.0,\"\"],[1640188800000,1099.0,\"\"],[1640275200000,1099.00,\"1099元\"],[1640361600000,1099.0,\"\"],[1640448000000,1069.0,\"\"],[1640534400000,1099.0,\"\"],[1640620800000,1099.0,\"\"],[1640707200000,1099.0,\"\"],[1640793600000,1099.0,\"\"],[1640880000000,1099.0,\"\"],[1640966400000,1099.00,\"1099元\"],[
1641052800000,1099.0,\"\"],[1641139200000,1099.0,\"\"],[1641225600000,1099.0,\"\"],[1641312000000,1099.0,\"\"],[1641398400000,1099.0,\"\"],[1641484800000,1099.0,\"\"],[1641571200000,1099.0,\"\"],[1641657600000,1099.00,\"1099元\"],[1641744000000,1099.0,\"\"],[1641830400000,1099.0,\"\"],[1641916800000,1099.0,\"\"],[1642003200000,1099.0,\"\"],[1642089600000,1099.0,\"\"],[1642176000000,1099.0,\"\"],[1642262400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642348800000,1099.00,\"\"],[1642435200000,1099.00,\"\"],[1642521600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642608000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642694400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642780800000,1099.0,\"\"],[1642867200000,1099.0,\"\"],[1642953600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643040000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643126400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643212800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643299200000,1099.0,\"\"],[1643385600000,1099.0,\"\"],[1643472000000,1099.0,\"\"],[1643558400000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643644800000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643731200000,1099.0,\"\"],[1643817600000,1099.0,\"\"],[1643904000000,1099.0,\"\"],[1643990400000,1099.0,\"\"],[1644076800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644163200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644249600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644336000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644422400000,1099.00,\"\"],[1644508800000,1099.00,\"1099元\"],[1644595200000,1199.00,\"1199元\"],[1644681600000,1199.00,\"1089元\"],[1644768000000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644854400000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644940800000,1099.00,\"1049元\"],[1645027200000,1099.00,\"\"],[1645113600000,1099.00,\"\"],[1645200000000,1099.00,\"\"],[1645286400000,1099.00,\"\"],[1645372800000,1099.00,\"\"],[1645459200000,1099.00,\"\"],[164554560
0000,1099.00,\"\"],[1645632000000,1099.00,\"\"],[1645718400000,1099.00,\"1099元\"],[1645804800000,1099.00,\"1069元\"],[1645891200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1645977600000,1099.00,\"\"],[1646064000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646150400000,1099.00,\"\"],[1646236800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646323200000,1099.00,\"\"],[1646409600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646496000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646582400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646668800000,899.00,\"购买1件,当前价:1099.00,满减:每满1080减200\"],[1646755200000,1099.00,\"\"],[1646841600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646928000000,1099.0,\"\"],[1647014400000,1099.0,\"\"],[1647100800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647187200000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647273600000,1099.00,\"\"],[1647360000000,1099.00,\"1099元\"],[1647446400000,1049.0,\"购买1件,当前价:1099.0,满减:满1050减50\"],[1647532800000,1099.0,\"\"],[1647619200000,1099.00,\"\"],[1647705600000,1099.00,\"\"],[1647792000000,1099.00,\"\"],[1647878400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647964800000,1099.00,\"\"],[1648051200000,1099.00,\"\"],[1648137600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648224000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648310400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648396800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648483200000,1039.00,\"购买1件,当前价:1089.00,满减:满1050减50\"],[1648569600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648656000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648742400000,1049.00,\"\"],[1648828800000,1049.00,\"\"],[1648915200000,1049.00,\"1049元\"],[1649001600000,1099.00,\"\"],[1649088000000,1049.00,\"\"],[1649174400000,1049.00,\"\"],[1649260800000,1049.00,\"\"],[1649347200000,1049.00,\"\"],[1649433600000,1099.00,\"\"],[1649520000000,1099.00,\"\"],[1649606400000,1099.00,\"\"],[1649692800000,1049.0,\"购买1件,当前价格1049\"],[1649779200000,1099.00,\"\"],
[1649865600000,1099.00,\"\"],[1649952000000,1049.00,\"1049元\"],[1650038400000,1099.00,\"\"],[1650124800000,1049.00,\"\"],[1650211200000,1049.00,\"\"],[1650297600000,1049.00,\"\"],[1650384000000,1049.00,\"\"],[1650470400000,1099.00,\"1099元\"],[1650556800000,1099.00,\"\"],[1650643200000,1099.00,\"\"],[1650729600000,1099.00,\"\"],[1650816000000,1099.00,\"\"],[1650902400000,1099.00,\"\"],[1650988800000,1099.00,\"\"],[1651075200000,1099.00,\"1099元\"],[1651161600000,1099.00,\"\"],[1651248000000,1099.00,\"\"],[1651334400000,1049.00,\"\"],[1651420800000,1099.00,\"\"],[1651507200000,1099.0000,\"\"],[1651593600000,1099.0000,\"\"],[1651680000000,1099.00,\"1099元\"],[1651766400000,1089.0,\"购买1件,当前价格1089\"],[1651852800000,1049.00,\"\"],[1651939200000,1089.00,\"\"],[1652025600000,1099.00,\"1099元\"],[1652112000000,1089.00,\"\"],[1652198400000,1049.00,\"\"],[1652284800000,1089.00,\"\"],[1652371200000,1049.00,\"\"],[1652457600000,1099.00,\"\"],[1652544000000,1099.00,\"\"],[1652630400000,1099.00,\"\"],[1652716800000,1099.00,\"1099元\"],[1652803200000,1099.00,\"\"],[1652889600000,1099.00,\"\"],[1652976000000,1049.00,\"\"],[1653062400000,1099.00,\"\"],[1653148800000,1099.00,\"\"],[1653235200000,1099.00,\"1099元\"],[1653321600000,1099.00,\"\"],[1653408000000,1099.00,\"\"],[1653494400000,1099.00,\"\"],[1653580800000,1099.00,\"\"],[1653667200000,1099.00,\"\"],[1653753600000,1099.00,\"\"],[1653840000000,1099.00,\"\"],[1653926400000,1049.0,\"\"],[1654012800000,1049.00,\"\"],[1654099200000,1049.00,\"\"],[1654185600000,1049.00,\"\"],[1654272000000,1049.00,\"\"],[1654358400000,1049.00,\"\"],[1654444800000,1049.00,\"\"],[1654531200000,1049.00,\"\"],[1654617600000,1049.00,\"\"],[1654704000000,1049.00,\"\"],[1654790400000,1049.00,\"\"],[1654876800000,1049.00,\"\"],[1654963200000,1049.00,\"\"],[1655049600000,1049.00,\"\"],[1655136000000,1049.00,\"\"],[1655222400000,1049.00,\"\"],[1655308800000,1049.00,\"1049元\"],[1655395200000,1049.00,\"\"],[1655481600000,1049.00,\"\"],[1655568000000,1049.00,\"\"],[1
655654400000,1049.00,\"\"],[1655740800000,1049.00,\"\"],[1655827200000,1049.00,\"1049元\"],[1655913600000,1049.00,\"\"],[1656000000000,1049.00,\"\"],[1656086400000,1049.00,\"\"],[1656172800000,1049.00,\"1049元\"],[1656259200000,1049.00,\"\"],[1656345600000,1049.00,\"\"],[1656432000000,1049.00,\"\"],[1656518400000,1049.00,\"\"],[1656604800000,1049.00,\"1049元\"],[1656691200000,1049.00,\"\"],[1656777600000,1049.00,\"\"],[1656864000000,1049.00,\"1049元\"],[1656950400000,1049.00,\"\"],[1657036800000,1049.00,\"\"],[1657123200000,1049.00,\"\"],[1657209600000,1049.00,\"\"],[1657296000000,1049.00,\"\"],[1657382400000,1049.00,\"\"],[1657468800000,1049.00,\"\"],[1657555200000,1049.00,\"\"],[1657641600000,1049.00,\"\"],[1657728000000,1049.00,\"\"],[1657814400000,1049.00,\"\"],[1657900800000,1049.00,\"1049元\"],[1657987200000,1049.00,\"\"],[1658073600000,1049.00,\"\"],[1658160000000,1049.00,\"\"],[1658246400000,1049.00,\"\"],[1658332800000,1049.00,\"1049元\"],[1658419200000,1049.00,\"\"],[1658505600000,1049.00,\"\"],[1658592000000,1049.00,\"\"],[1658678400000,1049.00,\"\"],[1658764800000,1049.00,\"\"],[1658851200000,1049.00,\"\"]","ZheKouCount":95},"count":0}
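    In the response above, the datePrice field is a string of comma-separated [timestamp, price, note] triples rather than a JSON array, so it needs one more decoding step before it can be analyzed. The following is a minimal sketch: the function name parse_date_price is hypothetical, and the UTC+08:00 offset is an assumption based on the lowerDate field in the same response.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: timestamps mark midnight in UTC+08:00, matching the lowerDate field.
CST = timezone(timedelta(hours=8))


def parse_date_price(date_price: str):
    """Turn the raw datePrice string into (date, price, note) tuples."""
    # Wrapping the string in brackets makes it a valid JSON array of triples.
    rows = json.loads("[" + date_price + "]")
    return [
        (datetime.fromtimestamp(ts / 1000, CST).date().isoformat(), float(price), note)
        for ts, price, note in rows
    ]


# Two entries copied from the response shown above.
sample = (
    '[1621353600000,1199.00,""],'
    '[1646668800000,899.00,"购买1件,当前价:1099.00,满减:每满1080减200"]'
)
history = parse_date_price(sample)
lowest = min(history, key=lambda row: row[1])
print(lowest[0], lowest[1])
```

    For the two sample rows, the lowest price falls on the same date that the response reports in lowerDate, which is a quick sanity check on the time zone assumption.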
    

    5. Persisting Scrapy data

    5.1. Write the Items (MySpider/items.py)

    import scrapy


    class HistoryPriceItem(scrapy.Item):
        """
        Custom Item for storing price history
        """
        # Product URL
        itemUrl = scrapy.Field()
        # Image URL
        picUrl = scrapy.Field()
        # Price history details
        detailPrice = scrapy.Field()

    5.2. Write the ItemPipelines (MySpider/pipelines.py), using file storage as the example:

    import scrapy.crawler
    from itemadapter import ItemAdapter
    from scrapy import signals


    class FilePipeline:
        def __init__(self, filename='store.txt'):
            self.filename = filename

        def process_item(self, item, spider):
            # Wrap the item in ItemAdapter for a uniform interface across item types (dict, Item, ...)
            adapter = ItemAdapter(item)
            # Write one line to the file
            self.fp.write(adapter.get('itemUrl') + "    " + adapter.get('picUrl') + "    " + adapter.get('detailPrice') + '\n')
            return item

        @classmethod
        def from_crawler(cls, crawler: scrapy.crawler.Crawler):
            s = cls()
            # Bind behavior through signals:
            # open the file handle fp when the spider starts
            crawler.signals.connect(s.opened, signal=signals.spider_opened)
            # close the file handle fp when the spider stops
            crawler.signals.connect(s.closed, signal=signals.spider_closed)
            return s

        def closed(self, spider):
            self.fp.close()

        def opened(self, spider):
            self.fp = open(self.filename, 'w', encoding='utf-8')
            # Header row: product URL, main image URL, price history
            self.fp.write('商品URL    主图URL    历史价格信息\n')
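    As an alternative to connecting signals in from_crawler, Scrapy 2.x also calls the optional open_spider and close_spider methods on a pipeline if they are defined. The sketch below is a hypothetical variant of the pipeline above, not the article's implementation: the class name TsvFilePipeline, the store.tsv file name, and the sample item are made up, and the csv module is swapped in so that tabs and quotes inside values are escaped. It uses only the standard library, so it can be exercised without running a crawl.

```python
import csv
import os
import tempfile


class TsvFilePipeline:
    """Hypothetical pipeline that writes items to a tab-separated file."""

    def __init__(self, filename="store.tsv"):
        self.filename = filename
        self.fp = None
        self.writer = None

    def open_spider(self, spider):
        # Called once when the spider opens: create the file and write the header.
        self.fp = open(self.filename, "w", encoding="utf-8", newline="")
        self.writer = csv.writer(self.fp, delimiter="\t", lineterminator="\n")
        self.writer.writerow(["itemUrl", "picUrl", "detailPrice"])

    def close_spider(self, spider):
        # Called once when the spider closes: release the file handle.
        self.fp.close()

    def process_item(self, item, spider):
        # Key access works for plain dicts and for scrapy.Item objects alike.
        self.writer.writerow([item["itemUrl"], item["picUrl"], item["detailPrice"]])
        return item


# Drive the pipeline by hand with a made-up item; no crawl is needed.
path = os.path.join(tempfile.mkdtemp(), "store.tsv")
pipeline = TsvFilePipeline(path)
pipeline.open_spider(spider=None)
pipeline.process_item(
    {
        "itemUrl": "https://item.jd.com/100011493273.html",
        "picUrl": "http://example.com/p.jpg",
        "detailPrice": '[1621353600000,1199.00,""]',
    },
    spider=None,
)
pipeline.close_spider(spider=None)
```

    Because the pipeline methods are ordinary Python methods, calling them directly like this is a convenient way to test persistence logic before wiring it into ITEM_PIPELINES.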

    5.3. Modify the Spider (MySpider/spiders/manmanbuy.py)

    import json

    import scrapy

    from MySpider.items import HistoryPriceItem


    class ManmanbuySpider(scrapy.Spider):
        # Unchanged parts omitted
        custom_settings = {
            # Item pipelines used by this spider
            'ITEM_PIPELINES': {
                'MySpider.pipelines.FilePipeline': 300,
            }
        }

        def parse_history_price(self, response: scrapy.http.Response):
            # Parse the price response
            self.logger.info(response.text)
            data = json.loads(response.text)
            # Return the Item
            return HistoryPriceItem(itemUrl=data['data']['spUrl'],
                                    picUrl=data['data']['spPic'],
                                    detailPrice=data['data']['datePrice'])

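    A note on settings: Scrapy has five ways to supply settings, three of which are commonly used. A higher-priority setting overrides a lower-priority setting with the same key, while settings with different keys are merged. From highest to lowest priority, the three common ones are:

    1. Command-line settings

    2. Per-spider settings

    3. Project-wide settings

    The custom_settings attribute on a Spider is the per-spider setting.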
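    The override-and-merge rule can be illustrated with a toy model built on collections.ChainMap. This is only an analogy, not Scrapy's Settings class; the setting names are real Scrapy settings, but the values are made up for the example.

```python
from collections import ChainMap

# Highest priority: values passed on the command line,
# e.g. scrapy crawl manmanbuy -s LOG_LEVEL=DEBUG
cmdline = {"LOG_LEVEL": "DEBUG"}

# Middle priority: the spider's custom_settings
spider = {
    "LOG_LEVEL": "INFO",
    "ITEM_PIPELINES": {"MySpider.pipelines.FilePipeline": 300},
}

# Lowest of the three: the project's settings.py
project = {"LOG_LEVEL": "WARNING", "ROBOTSTXT_OBEY": False, "DOWNLOAD_DELAY": 1}

# ChainMap looks keys up left to right, so earlier mappings win on conflicts
# while keys that exist in only one mapping are still visible.
effective = ChainMap(cmdline, spider, project)

print(effective["LOG_LEVEL"])       # the command-line value wins
print(effective["ITEM_PIPELINES"])  # only set by the spider
print(effective["DOWNLOAD_DELAY"])  # only set by the project
```

    LOG_LEVEL is defined at all three levels and resolves to the command-line value, while ITEM_PIPELINES and DOWNLOAD_DELAY each come from the only level that defines them.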

    5.4. Run scrapy crawl manmanbuy to start the spider. A store.txt file appears in the current directory, with the following content:

    1. 商品URL 主图URL 历史价格信息
    2. https://item.jd.com/100011493273.html http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg [1621353600000,1199.00,""],[1621440000000,1199.00,""],[1621526400000,1199.00,""],[1621612800000,1199.00,""],[1621699200000,1199.00,""],[1621785600000,1199.00,""],[1621872000000,1199.00,""],[1621958400000,1199.00,""],[1622044800000,1199.00,""],[1622131200000,1199.00,""],[1622217600000,1199.00,""],[1622304000000,1199.00,""],[1622390400000,1199.00,""],[1622476800000,1199.00,""],[1622563200000,1199.00,""],[1622649600000,1199.00,""],[1622736000000,1199.00,""],[1622822400000,1199.00,""],[1622908800000,1199.00,""],[1622995200000,1199.00,""],[1623081600000,1199.00,""],[1623168000000,1199.00,""],[1623254400000,1199.00,""],[1623340800000,1199.00,"1199元"],[1623427200000,1199.00,""],[1623513600000,1199.00,""],[1623600000000,1199.00,""],[1623686400000,1199.00,""],[1623772800000,1199.00,""],[1623859200000,1199.00,""],[1623945600000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1624032000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624118400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624204800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624291200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624377600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624464000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624550400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624636800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624723200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624809600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624896000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624982400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625068800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625155200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625241600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625328000000,1199.00,""],[1625414400000,1199.00,""],[1625500800000,1189.0,"京东秒杀价:1189"],[162558720
0000,1199.00,""],[1625673600000,1189.0,""],[1625760000000,1199.0,""],[1625846400000,1189.0,""],[1625932800000,1199.0000,""],[1626019200000,1199.0000,""],[1626105600000,1189.0,""],[1626192000000,1199.0,""],[1626278400000,1189.0,""],[1626364800000,1199.0000,""],[1626451200000,1199.0000,""],[1626537600000,1199.0000,""],[1626624000000,1199.0000,""],[1626710400000,1199.0000,""],[1626796800000,1199.0000,""],[1626883200000,1189.00,""],[1626969600000,1199.0000,""],[1627056000000,1199.0000,""],[1627142400000,1199.0000,""],[1627228800000,1199.0000,""],[1627315200000,1189.00,""],[1627401600000,1199.0000,""],[1627488000000,1189.00,"1189元"],[1627574400000,1189.00,""],[1627660800000,1199.00,""],[1627747200000,1179.00,"1179元"],[1627833600000,1189.0000,""],[1627920000000,1199.00,""],[1628006400000,1199.00,""],[1628092800000,1189.00,""],[1628179200000,1189.0000,""],[1628265600000,1189.0000,""],[1628352000000,1189.0000,""],[1628438400000,1199.00,""],[1628524800000,1189.00,""],[1628611200000,1199.0,""],[1628697600000,1189.0000,""],[1628784000000,1189.0000,""],[1628870400000,1189.0000,""],[1628956800000,1199.00,"1199元"],[1629043200000,1189.0000,""],[1629129600000,1199.00,""],[1629216000000,1189.00,""],[1629302400000,1199.0000,""],[1629388800000,1169.0,"京东秒杀价:1169"],[1629475200000,1199.00,""],[1629561600000,1199.00,""],[1629648000000,1199.00,""],[1629734400000,1169.00,""],[1629820800000,1199.0,""],[1629907200000,1189.0,"京东秒杀价:1189"],[1629993600000,1199.00,""],[1630080000000,1199.00,""],[1630166400000,1199.00,""],[1630252800000,1199.00,""],[1630339200000,1189.00,"1189元包邮"],[1630425600000,1189.00,""],[1630512000000,1175.00,"购买1件,plus价格1175"],[1630598400000,1189.00,""],[1630684800000,1199.0,""],[1630771200000,1189.0000,""],[1630857600000,1199.0,""],[1630944000000,1189.0000,""],[1631030400000,1199.0,""],[1631116800000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1631203200000,1189.0000,""],[1631289600000,1189.00,""],[1631376000000,1199.00,""],[1631462400000,1189.0000,""],[1631548800000,1189.0
,""],[1631635200000,1199.0,""],[1631721600000,1189.00,""],[1631808000000,1199.00,""],[1631894400000,1189.0,""],[1631980800000,1199.0,""],[1632067200000,1169.00,""],[1632153600000,1169.00,""],[1632240000000,1169.00,""],[1632326400000,1189.0,""],[1632412800000,1169.0,""],[1632499200000,1199.0,""],[1632585600000,1169.00,""],[1632672000000,1199.00,""],[1632758400000,1169.0,""],[1632844800000,1199.0000,""],[1632931200000,1169.0,""],[1633017600000,1169.0,""],[1633104000000,1169.0,""],[1633190400000,1169.00,""],[1633276800000,1169.0000,""],[1633363200000,1199.00,""],[1633449600000,1199.00,""],[1633536000000,1169.0,""],[1633622400000,1169.0,""],[1633708800000,1169.0,"京东秒杀价:1169"],[1633795200000,1169.0,""],[1633881600000,1199.00,""],[1633968000000,1169.00,""],[1634054400000,1199.0,""],[1634140800000,1169.00,""],[1634227200000,1199.0,""],[1634313600000,1199.0,""],[1634400000000,1169.0,""],[1634486400000,1199.0,""],[1634572800000,1169.00,""],[1634659200000,1199.0000,""],[1634745600000,1169.00,""],[1634832000000,1199.0,""],[1634918400000,1199.0,""],[1635004800000,1199.0,""],[1635091200000,1199.0,""],[1635177600000,1199.0,""],[1635264000000,1199.0,""],[1635350400000,1199.0,""],[1635436800000,1199.0,""],[1635523200000,1099.00,"1099元 
"],[1635609600000,1099.0,""],[1635696000000,1099.00,""],[1635782400000,1099.00,""],[1635868800000,1099.00,""],[1635955200000,1099.00,"购买1件,plus价格1099"],[1636041600000,949.00,"购买1件,当前价:1099.00,可叠加优惠券2:满880减150"],[1636128000000,1099.00,""],[1636214400000,1199.0,""],[1636300800000,1099.0,""],[1636387200000,1099.0,""],[1636473600000,1099.0,""],[1636560000000,979.00,"购买1件,当前价:1099.00,满减:每满1080减120"],[1636646400000,1099.0000,""],[1636732800000,1199.00,""],[1636819200000,1199.00,""],[1636905600000,1199.00,""],[1636992000000,1199.00,""],[1637078400000,1099.00,""],[1637164800000,1099.00,""],[1637251200000,1099.00,""],[1637337600000,1099.00,""],[1637424000000,1099.00,"1099元"],[1637510400000,1099.00,""],[1637596800000,1099.0,""],[1637683200000,1099.00,""],[1637769600000,1099.00,""],[1637856000000,1099.00,""],[1637942400000,1099.00,""],[1638028800000,1199.0000,""],[1638115200000,1199.00,""],[1638201600000,1199.00,""],[1638288000000,1099.0,""],[1638374400000,1099.00,""],[1638460800000,1099.00,""],[1638547200000,1199.0,""],[1638633600000,1199.00,""],[1638720000000,1099.0,""],[1638806400000,1099.00,""],[1638892800000,1099.00,""],[1638979200000,1099.00,"1099元"],[1639065600000,1099.00,""],[1639152000000,1089.0,""],[1639238400000,1089.0,""],[1639324800000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639411200000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639497600000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639584000000,1099.00,""],[1639670400000,1099.0,""],[1639756800000,1099.0,""],[1639843200000,1099.0,""],[1639929600000,1099.00,"1099元"],[1640016000000,1099.0,""],[1640102400000,1099.0,""],[1640188800000,1099.0,""],[1640275200000,1099.00,"1099元"],[1640361600000,1099.0,""],[1640448000000,1069.0,""],[1640534400000,1099.0,""],[1640620800000,1099.0,""],[1640707200000,1099.0,""],[1640793600000,1099.0,""],[1640880000000,1099.0,""],[1640966400000,1099.00,"1099元"],[1641052800000,1099.0,""],[1641139200000,1099.0,""],[1641225600000,1099.0,""],[1641312000000,1099.0,""],[1641398400000,1099.0,""
],[1641484800000,1099.0,""],[1641571200000,1099.0,""],[1641657600000,1099.00,"1099元"],[1641744000000,1099.0,""],[1641830400000,1099.0,""],[1641916800000,1099.0,""],[1642003200000,1099.0,""],[1642089600000,1099.0,""],[1642176000000,1099.0,""],[1642262400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642348800000,1099.00,""],[1642435200000,1099.00,""],[1642521600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642608000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642694400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642780800000,1099.0,""],[1642867200000,1099.0,""],[1642953600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643040000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643126400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643212800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643299200000,1099.0,""],[1643385600000,1099.0,""],[1643472000000,1099.0,""],[1643558400000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643644800000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643731200000,1099.0,""],[1643817600000,1099.0,""],[1643904000000,1099.0,""],[1643990400000,1099.0,""],[1644076800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644163200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644249600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644336000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644422400000,1099.00,""],[1644508800000,1099.00,"1099元"],[1644595200000,1199.00,"1199元"],[1644681600000,1199.00,"1089元"],[1644768000000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644854400000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644940800000,1099.00,"1049元"],[1645027200000,1099.00,""],[1645113600000,1099.00,""],[1645200000000,1099.00,""],[1645286400000,1099.00,""],[1645372800000,1099.00,""],[1645459200000,1099.00,""],[1645545600000,1099.00,""],[1645632000000,1099.00,""],[1645718400000,1099.00,"1099元"],[1645804800000,1099.00,"1069元"],[1645891200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1645977600000,1099.00,""],[1646064000000,1049.00,"购买1件,当前价:1099.00,
满减:满1050减50"],[1646150400000,1099.00,""],[1646236800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646323200000,1099.00,""],[1646409600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646496000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646582400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646668800000,899.00,"购买1件,当前价:1099.00,满减:每满1080减200"],[1646755200000,1099.00,""],[1646841600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646928000000,1099.0,""],[1647014400000,1099.0,""],[1647100800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647187200000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647273600000,1099.00,""],[1647360000000,1099.00,"1099元"],[1647446400000,1049.0,"购买1件,当前价:1099.0,满减:满1050减50"],[1647532800000,1099.0,""],[1647619200000,1099.00,""],[1647705600000,1099.00,""],[1647792000000,1099.00,""],[1647878400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647964800000,1099.00,""],[1648051200000,1099.00,""],[1648137600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648224000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648310400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648396800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648483200000,1039.00,"购买1件,当前价:1089.00,满减:满1050减50"],[1648569600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648656000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648742400000,1049.00,""],[1648828800000,1049.00,""],[1648915200000,1049.00,"1049元"],[1649001600000,1099.00,""],[1649088000000,1049.00,""],[1649174400000,1049.00,""],[1649260800000,1049.00,""],[1649347200000,1049.00,""],[1649433600000,1099.00,""],[1649520000000,1099.00,""],[1649606400000,1099.00,""],[1649692800000,1049.0,"购买1件,当前价格1049"],[1649779200000,1099.00,""],[1649865600000,1099.00,""],[1649952000000,1049.00,"1049元"],[1650038400000,1099.00,""],[1650124800000,1049.00,""],[1650211200000,1049.00,""],[1650297600000,1049.00,""],[1650384000000,1049.00,""],[1650470400000,1099.00,"1099元"],[1650556800000,1099.00,""],[1650643200000,1099.00,""],[1650729600000,1099.00,""],[1650816000000,1099.00,"
"],[1650902400000,1099.00,""],[1650988800000,1099.00,""],[1651075200000,1099.00,"1099元"],[1651161600000,1099.00,""],[1651248000000,1099.00,""],[1651334400000,1049.00,""],[1651420800000,1099.00,""],[1651507200000,1099.0000,""],[1651593600000,1099.0000,""],[1651680000000,1099.00,"1099元"],[1651766400000,1089.0,"购买1件,当前价格1089"],[1651852800000,1049.00,""],[1651939200000,1089.00,""],[1652025600000,1099.00,"1099元"],[1652112000000,1089.00,""],[1652198400000,1049.00,""],[1652284800000,1089.00,""],[1652371200000,1049.00,""],[1652457600000,1099.00,""],[1652544000000,1099.00,""],[1652630400000,1099.00,""],[1652716800000,1099.00,"1099元"],[1652803200000,1099.00,""],[1652889600000,1099.00,""],[1652976000000,1049.00,""],[1653062400000,1099.00,""],[1653148800000,1099.00,""],[1653235200000,1099.00,"1099元"],[1653321600000,1099.00,""],[1653408000000,1099.00,""],[1653494400000,1099.00,""],[1653580800000,1099.00,""],[1653667200000,1099.00,""],[1653753600000,1099.00,""],[1653840000000,1099.00,""],[1653926400000,1049.0,""],[1654012800000,1049.00,""],[1654099200000,1049.00,""],[1654185600000,1049.00,""],[1654272000000,1049.00,""],[1654358400000,1049.00,""],[1654444800000,1049.00,""],[1654531200000,1049.00,""],[1654617600000,1049.00,""],[1654704000000,1049.00,""],[1654790400000,1049.00,""],[1654876800000,1049.00,""],[1654963200000,1049.00,""],[1655049600000,1049.00,""],[1655136000000,1049.00,""],[1655222400000,1049.00,""],[1655308800000,1049.00,"1049元"],[1655395200000,1049.00,""],[1655481600000,1049.00,""],[1655568000000,1049.00,""],[1655654400000,1049.00,""],[1655740800000,1049.00,""],[1655827200000,1049.00,"1049元"],[1655913600000,1049.00,""],[1656000000000,1049.00,""],[1656086400000,1049.00,""],[1656172800000,1049.00,"1049元"],[1656259200000,1049.00,""],[1656345600000,1049.00,""],[1656432000000,1049.00,""],[1656518400000,1049.00,""],[1656604800000,1049.00,"1049元"],[1656691200000,1049.00,""],[1656777600000,1049.00,""],[1656864000000,1049.00,"1049元"],[1656950400000,1049.00,""],[1657036800000,
1049.00,""],[1657123200000,1049.00,""],[1657209600000,1049.00,""],[1657296000000,1049.00,""],[1657382400000,1049.00,""],[1657468800000,1049.00,""],[1657555200000,1049.00,""],[1657641600000,1049.00,""],[1657728000000,1049.00,""],[1657814400000,1049.00,""],[1657900800000,1049.00,"1049元"],[1657987200000,1049.00,""],[1658073600000,1049.00,""],[1658160000000,1049.00,""],[1658246400000,1049.00,""],[1658332800000,1049.00,"1049元"],[1658419200000,1049.00,""],[1658505600000,1049.00,""],[1658592000000,1049.00,""],[1658678400000,1049.00,""],[1658764800000,1049.00,""],[1658851200000,1049.00,""]

    Output like the above indicates that the program ran correctly.

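    PS: To persist the data to MySQL, MongoDB, or Elasticsearch, you only need to write the corresponding item pipeline and update the ITEM_PIPELINES setting that the spider loads. This keeps data persistence decoupled from the crawling logic.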
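
    As a rough illustration of that decoupling, here is a minimal sketch of an item pipeline. It makes several assumptions that are not in the original project: it uses the standard-library sqlite3 module as a stand-in for MySQL, the class name SQLitePricePipeline, the table name price_history, the item fields ts/price/note, and the module path myproject.pipelines are all hypothetical. Only the process_item / open_spider / close_spider hooks and the ITEM_PIPELINES setting are Scrapy's documented pipeline interface.

```python
import sqlite3


class SQLitePricePipeline:
    """Hypothetical pipeline: stores each scraped item as one row in SQLite."""

    def __init__(self, db_path=":memory:"):
        # db_path is an illustrative parameter; ":memory:" keeps the sketch self-contained.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Scrapy calls this once when the spider is opened.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS price_history "
            "(ts INTEGER, price REAL, note TEXT)"
        )

    def process_item(self, item, spider):
        # Scrapy calls this for every item the spider yields.
        # The field names ts/price/note are assumptions for this sketch.
        self.conn.execute(
            "INSERT INTO price_history (ts, price, note) VALUES (?, ?, ?)",
            (item["ts"], item["price"], item.get("note", "")),
        )
        self.conn.commit()
        # Return the item so that later pipelines can still process it.
        return item

    def close_spider(self, spider):
        # Scrapy calls this once when the spider is closed.
        self.conn.close()


# In settings.py: enable the pipeline. The module path is hypothetical;
# the integer (0-1000) controls the order in which pipelines run.
ITEM_PIPELINES = {
    "myproject.pipelines.SQLitePricePipeline": 300,
}
```

    Switching to MySQL, MongoDB, or Elasticsearch would mean replacing the sqlite3 calls with the corresponding client library while the spider code stays unchanged.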

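    Conclusion

    This article briefly explained why you might choose Scrapy and walked through an example of how to develop a Scrapy crawler project. Future articles will cover more of Scrapy.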


  • Original article: https://blog.csdn.net/Rocky006/article/details/133861208