① Install the pymysql and peewee libraries
- pip install peewee
- pip install pymysql
② Create db.py in the same directory as pipelines.py, and define the database connection and the model in it:
- from peewee import DateTimeField, IntegerField, Model, MySQLDatabase, TextField
-
- db_mysql = MySQLDatabase('fast_generator', user='root', password='123456', host='localhost', port=3306)
-
- class Announcement(Model):
-     # Title
-     title = TextField()
-     # Content
-     content = TextField()
-     # Release date
-     release_date = DateTimeField()
-     # Number of views
-     read_times = IntegerField()
-     # Page the announcement was listed on
-     belong_page = TextField()
-     # Attachments; multiple entries are separated by semicolons
-     enclosure = TextField()
-     url = TextField()
-
-     class Meta:
-         database = db_mysql
③ Save items to the database in the pipeline
- from announcement.db import Announcement
-
-
- class AnnouncementPipeline:
-
-     def process_item(self, item, spider):
-         # Save the item to the database
-         announcement = Announcement(
-             title=item['title'],
-             content=item['content'],
-             release_date=item['release_date'],
-             belong_page=item['belong_page'],
-             read_times=item['read_times'],
-             enclosure=item['enclosure'],
-             url=item['url'],
-         )
-         announcement.save()
-         # Always return the item; otherwise the next pipeline will not receive it.
-         return item
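Scrapy only calls a pipeline that is registered in the project's settings.py. A minimal sketch of that registration follows. The dotted path assumes the project package is named `announcement` (inferred from the `from announcement.db import Announcement` import above), and the priority `300` is an arbitrary illustrative value; lower numbers run earlier.

```python
# settings.py (sketch): enable the pipeline so Scrapy passes items to it.
# Assumption: the class lives in announcement/pipelines.py.
ITEM_PIPELINES = {
    "announcement.pipelines.AnnouncementPipeline": 300,
}
```

If several pipelines are registered, each one receives the item returned by the previous one, which is why `process_item` has to return it.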