• Collecting videos of pretty girls with Python: learn how to download all the content you like



    Code for this article provided by: 青灯教育-巳月老师


    Key points:

    • Capturing dynamic data (inspecting network traffic)
    • Sending requests with requests
    • Parsing JSON data

    Development environment:

    To run the code

    • python 3.8

    To help write the code

    • pycharm 2021.2

    Third-party modules

    • requests

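    How to install third-party Python modules:

    1. Press Win + R, type cmd and click OK, then enter the install command pip install <module name> (for example, pip install requests) and press Enter
    2. Or click Terminal in PyCharm and enter the same install command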
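
After installing, it can help to confirm that the module is visible to the interpreter you are actually running. A minimal sketch using only the standard library; the helper name is_installed is made up for this example:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Prints False if pip installed the module into a different interpreter
    print("requests installed:", is_installed("requests"))
```

If this prints False right after a successful pip install, the usual cause is that pip and PyCharm point at different Python installations.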

    How do you configure the Python interpreter in PyCharm?

    1. Select File >>> Settings >>> Project >>> Python Interpreter

    2. Click the gear icon and select Add

    3. Add the path of your Python installation


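    How do you install plugins in PyCharm?

    1. Select File >>> Settings >>> Plugins

    2. Click Marketplace and type the name of the plugin you want, e.g. translation for a translation plugin, or Chinese for the Chinese language pack

    3. Select the plugin and click Install

    4. After a successful installation, PyCharm offers to restart; click OK, and the plugin takes effect after the restart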


    Basic workflow (applies to most scrapers)

    1. Send the request
    2. Get the data
    3. Parse the data
    4. Save the data
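
The four steps above can be sketched as four small functions. This is only an illustration of the flow, not the article's actual code: the function names, the timeout value, and saving to a JSON file are assumptions, while the key path data / visionSearchPhoto / feeds is the one used later in this article.

```python
import json


def send_request(url: str, headers: dict, payload: dict):
    """Step 1: send a POST request and return the response object."""
    # Imported here so the helpers below can be used without requests installed
    import requests
    return requests.post(url, headers=headers, json=payload, timeout=10)


def get_data(response) -> dict:
    """Step 2: fail early on HTTP errors, then decode the JSON body."""
    response.raise_for_status()
    return response.json()


def parse_data(data: dict) -> list:
    """Step 3: pull (caption, video URL) pairs out of the decoded JSON."""
    search = (data.get('data') or {}).get('visionSearchPhoto') or {}
    feeds = search.get('feeds') or []
    return [(feed['photo']['caption'], feed['photo']['photoUrl']) for feed in feeds]


def save_data(rows: list, path: str) -> None:
    """Step 4: write the parsed rows to a JSON file (an assumed output format)."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(rows, f, ensure_ascii=False, indent=2)
```

Keeping parsing separate from the network call means parse_data can be tried on a saved response without sending any request.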

    Code

    Import modules

    import requests     # send requests / access the website
    import re           # regular expressions, used to clean up file names
    

    Add a disguise (request headers)

    url = 'https://www.\.com/graphql'
    # Disguise: headers that make the request look like it comes from a browser
    headers = {
        'content-type': 'application/json',
        'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_ea128125517a46bd491ae9ccb255e242; client_key=65890b29; didv=1646739254078; _bl_uid=pCldq3L00L61qCzj6fytnk2wmhz5; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABJBZGcI4Czt3EqaG90aWFm2EPPYcAQlkuV3ZOkHUBcqEtWV--udx6stXFOhEkGgx4tNCBS9Vhl-GstWLkvn-r_eV1072IPsO8d5sUcUuTJv3nicPWVBcfHW813ST2a4uN7HyHsnpnRkjx2BFXoMRmdSO4tbAgy3-3QRTaw05tiEp79qGRfQgVmK4kVJLOTF9X7o9vSjrLrTaRxEf0rwsXJhoSsmhEcimAl3NtJGybSc8y6sdlIiCkLEL3ZiZwp85TGjXIHaw7KGda_VNpfdZ1qigsOkLmESgFMAE; kuaishou.server.web_ph=ad8a744eb59aab3bf509625671ad16837e66',
        'Host': 'www.\.com',
        'Origin': 'https://www.\.com',
        'Referer': 'https://www.\.com/search/video?searchKey=%E5%8F%8C%E9%A9%AC%E5%B0%BE',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36',
    }
    # Request payload: one GraphQL search query per page
    for page in range(1, 10):
        json = {
            'operationName': "visionSearchPhoto",
            'query': "fragment photoContent on PhotoEntity {\n  id\n  duration\n  caption\n  likeCount\n  viewCount\n  realLikeCount\n  coverUrl\n  photoUrl\n  photoH265Url\n  manifest\n  manifestH265\n  videoResource\n  coverUrls {\n    url\n    __typename\n  }\n  timestamp\n  expTag\n  animatedCoverUrl\n  distance\n  videoRatio\n  liked\n  stereoType\n  profileUserTopPhoto\n  __typename\n}\n\nfragment feedContent on Feed {\n  type\n  author {\n    id\n    name\n    headerUrl\n    following\n    headerUrls {\n      url\n      __typename\n    }\n    __typename\n  }\n  photo {\n    ...photoContent\n    __typename\n  }\n  canAddComment\n  llsid\n  status\n  currentPcursor\n  __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n  visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n    result\n    llsid\n    webPageArea\n    feeds {\n      ...feedContent\n      __typename\n    }\n    searchSessionId\n    pcursor\n    aladdinBanner {\n      imgUrl\n      link\n      __typename\n    }\n    __typename\n  }\n}\n",
            'variables': {'keyword': "双马尾", 'pcursor': str(page), 'page': "search", 'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTU3MjcyNTU3NjFf5Y-M6ams5bC-XzUwMDQ"}
        }
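
The payload differs between pages only in the pcursor value, so it can also be built by a small helper. A minimal sketch: build_payload and its parameters are hypothetical names for this example, and the '...' query string is a placeholder for the full GraphQL query shown above.

```python
def build_payload(keyword: str, page: int, query: str, session_id: str = '') -> dict:
    """Build the JSON body for one page of search results."""
    return {
        'operationName': 'visionSearchPhoto',
        'query': query,
        'variables': {
            'keyword': keyword,
            'pcursor': str(page),  # the cursor is sent as a string, as in the loop above
            'page': 'search',
            'searchSessionId': session_id,
        },
    }


# One payload per page, mirroring range(1, 10) in the loop above.
# '...' is a placeholder, not a valid query.
payloads = [build_payload('双马尾', page, query='...') for page in range(1, 10)]
```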
    

    1. Send the request

        response = requests.post(url=url, headers=headers, json=json)
    

    2. Get the data

    <Response [200]>: the request succeeded

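        # Decode the JSON response and pull out the list of search results
        feeds = response.json()['data']['visionSearchPhoto']['feeds']
    
        for feed in feeds:
            # IDs needed for the "like" request below
            author_id = feed['author']['id']
            photo_id = feed['photo']['id']
            print(author_id, photo_id)
            # Video title and direct video URL
            caption = feed['photo']['caption']
            photoUrl = feed['photo']['photoUrl']
            print(caption, photoUrl)
            # Strip characters that are not allowed in Windows file names
            caption = re.sub(r'[\\/:*?"<>|\n]', '', caption)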
            json_like = {
                'operationName': "visionVideoLike",
                'query': "mutation visionVideoLike($photoId: String, $photoAuthorId: String, $cancel: Int, $expTag: String) {\n  visionVideoLike(photoId: $photoId, photoAuthorId: $photoAuthorId, cancel: $cancel, expTag: $expTag) {\n    result\n    __typename\n  }\n}\n",
                'variables': {
                    'cancel': 0,
                    'expTag': "1_i/2005246154318902369_xpcwebsearchxxnull0",
                    'photoAuthorId': author_id,
                    'photoId': photo_id
                }
            }
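            # Send the "like" mutation for this video
            resp_ = requests.post(url=url, headers=headers, json=json_like)
            # Uncomment to download the video (the video/ folder must already exist)
            # video_data = requests.get(photoUrl).content
            # with open(f'video/{caption}.mp4', mode='wb') as f:
            #     f.write(video_data)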
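
The commented-out download step fails if the video folder is missing or if a caption becomes empty after cleaning. A minimal sketch of a more defensive save step: safe_filename, save_video, the 100-character limit, and the 'untitled' fallback are assumptions for this example, not part of the original code.

```python
import re
from pathlib import Path


def safe_filename(caption: str, max_length: int = 100) -> str:
    """Remove characters Windows forbids in file names and cap the length."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n]', '', caption).strip()
    return cleaned[:max_length] or 'untitled'


def save_video(video_bytes: bytes, caption: str, folder: str = 'video') -> Path:
    """Write the video bytes to folder/<caption>.mp4, creating the folder if needed."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f'{safe_filename(caption)}.mp4'
    target.write_bytes(video_bytes)
    return target
```

With these helpers, the download inside the loop could be written as save_video(requests.get(photoUrl).content, caption).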
    

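    Closing

    That's all for this article!

    If you have suggestions or questions, leave a comment or send me a private message. Let's keep at it together (ง •_•)ง

    If you liked it, follow the blogger, or like, bookmark, and comment on this article!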

  • Original article: https://blog.csdn.net/weixin_62853513/article/details/125410830