• Library picks | Two Python libraries for getting past JA3 detection, highly recommended


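    Everything in this article is for learning and discussion only. It must not be used for any commercial or illegal purpose; otherwise you bear the consequences yourself!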

    Someone sent me a website, and requesting it with the requests library raised an error:

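    First, my environment: Win10 + Python 3.10 + requests 2.27.1. A direct request fails with an error, and my guess is that the site checks the TLS fingerprint. I have heard that downgrading Python and requests lets the request go through, perhaps because the check is not that strict.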
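    For context, a JA3 fingerprint is built from the TLS ClientHello: the TLS version, cipher suites, extensions, elliptic curves, and point formats are joined into one string and hashed with MD5, with GREASE values left out. Here is a minimal sketch of that calculation. The function names and the sample numbers are made up for illustration and are not taken from the site or from either library below.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE placeholders look like 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    high, low = value >> 8, value & 0xFF
    return high == low and (low & 0x0F) == 0x0A


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 text: five comma-separated fields, values joined by '-'."""

    def join(values):
        return "-".join(str(v) for v in values if not is_grease(v))

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(ja3: str) -> str:
    """The fingerprint a server compares is the MD5 hex digest of the JA3 text."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Hypothetical ClientHello values, for illustration only.
    text = ja3_string(771, [0x0A0A, 4865, 4866], [0, 23, 65281], [29, 23], [0])
    print(text)
    print(ja3_hash(text))
```

    Because the order of the values is part of the string, two clients that offer the same cipher suites in a different order end up with different fingerprints, which is one way a server can tell a Python client apart from a browser.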

    Below are two libraries that get past its detection.

    1. Pyhttpx

    Project URL:

    https://github.com/zero3301/pyhttpx

    You can download the whole project, or install the library directly:

    pip install pyhttpx

    I chose to download the whole project for testing. Following its demo, I wrote this request code:

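    import pyhttpx

    sess = pyhttpx.HttpSession()

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
    }

    # The real URL is redacted in this post.
    url = "<redacted>"

    # Send the request with the browser-like headers defined above.
    response = sess.get(url, headers=headers)

    print(response.text)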

    It now returns a normal response:

    Problem solved!

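    What makes this project stand out is that you can change the JA3 cipher suites to ones you specify, and it also runs on Windows. Very nice! That said, it threw errors when I tested it against certain sites, so the project is not perfect yet. I look forward to fixes.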
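    To make "changing the JA3 cipher suites" concrete: the cipher suites are the second of the five comma-separated fields in a JA3 string, so specifying your own list amounts to swapping that field. The sketch below does only this string manipulation. The helper names and the sample JA3 string are hypothetical, and how such a string is handed to pyhttpx is not shown here; check the project's README for that.

```python
def parse_ja3(ja3: str) -> dict:
    """Split a JA3 string into its five fields as lists of integers."""
    parts = ja3.split(",")
    if len(parts) != 5:
        raise ValueError("a JA3 string has exactly five comma-separated fields")
    names = ["version", "ciphers", "extensions", "curves", "point_formats"]
    return {
        name: [int(x) for x in field.split("-")] if field else []
        for name, field in zip(names, parts)
    }


def with_ciphers(ja3: str, ciphers) -> str:
    """Return a copy of the JA3 string with the cipher-suite field replaced."""
    parts = ja3.split(",")
    if len(parts) != 5:
        raise ValueError("a JA3 string has exactly five comma-separated fields")
    parts[1] = "-".join(str(c) for c in ciphers)
    return ",".join(parts)


if __name__ == "__main__":
    # Hypothetical JA3 string, for illustration only.
    original = "771,4865-4866,0-23-65281,29-23,0"
    print(parse_ja3(original)["ciphers"])
    print(with_ciphers(original, [4866, 4865, 4867]))
```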

    2. Pycurl

    For now this library cannot solve the JA3 problem on Windows, so I went with Linux.

    Environment:

    Distributor ID:  Ubuntu
    Description:  Ubuntu 18.04.4 LTS
    Release:  18.04
    Codename:  bionic

    For how to install it and deal with JA3, see Hua Ge's article "python完美突破tls/ja3" (roughly, "Getting past TLS/JA3 cleanly with Python").

    You can also refer to what I wrote in my Knowledge Planet (zsxq) group:

    https://t.zsxq.com/052z3rzN3

    If the command below returns a normal response on your server, the installation succeeded.

    curl_chrome100 https://www.baidu.com
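    If you would rather run this check from a script than by hand, here is a small sketch that runs a command and reports whether it was found and exited with status 0. The helper name command_works is my own, and the sketch only assumes that curl_chrome100 is on the PATH after installation.

```python
import shutil
import subprocess


def command_works(argv, timeout=30):
    """Return True if the executable exists and the command exits with status 0."""
    if shutil.which(argv[0]) is None:
        return False  # not installed, or not on PATH
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Same check as the shell command above.
    print(command_works(["curl_chrome100", "https://www.baidu.com"]))
```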

    Upload the following request demo code to the server:

    import copy
    import re
    from io import BytesIO

    import pycurl


    class Response:
        def __init__(self, status_code, body, headers):
            self.status_code = status_code
            self.body = body
            self.headers = headers

        @property
        def text(self):
            # A property cannot receive arguments, so the encoding is fixed to UTF-8.
            return self.body.decode("utf-8")


    class CurlClient:
        def __init__(self):
            c = pycurl.Curl()
            # Let libcurl maintain cookies automatically
            c.setopt(pycurl.COOKIEFILE, "")
            c.setopt(pycurl.TIMEOUT, 30)
            # Enable ALPN and disable NPN
            c.setopt(pycurl.SSL_ENABLE_ALPN, 1)
            c.setopt(pycurl.SSL_ENABLE_NPN, 0)
            # Follow redirects
            c.setopt(pycurl.FOLLOWLOCATION, 1)
            # Handle gzip-compressed responses
            c.setopt(pycurl.ENCODING, "gzip,deflate")
            # Whether to verify the SSL certificate
            c.setopt(pycurl.SSL_VERIFYPEER, 1)
            c.setopt(pycurl.SSLVERSION, pycurl.SSLVERSION_TLSv1_2)
            try:
                # These two options may be missing from a stock pycurl build; skip them if so
                c.setopt(pycurl.SSL_CERT_COMPRESSION, "brotli")
                c.setopt(pycurl.SSL_ENABLE_ALPS, 1)
            except Exception:
                pass
            # Proxy settings
            # c.setopt(pycurl.PROXY, "http://127.0.0.1:9091")
            # c.setopt(pycurl.PROXY, "http://127.0.0.1:7890")
            # Cipher suites
            c.setopt(
                pycurl.SSL_CIPHER_LIST,
                "TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,ECDHE-ECDSA-AES128-GCM-SHA256,ECDHE-RSA-AES128-GCM-SHA256,ECDHE-ECDSA-AES256-GCM-SHA384,ECDHE-RSA-AES256-GCM-SHA384,ECDHE-ECDSA-CHACHA20-POLY1305,ECDHE-RSA-CHACHA20-POLY1305,ECDHE-RSA-AES128-SHA,ECDHE-RSA-AES256-SHA,AES128-GCM-SHA256,AES256-GCM-SHA384,AES128-SHA,AES256-SHA",
            )
            self.c = c
            self._cookies = {}
    
    
        def get(self, url, headers=None):
            self.c.setopt(pycurl.POST, 0)
            return self.send(url, headers or [])


        def post(self, url, data, isjson=False, headers=None):
            self.c.setopt(pycurl.POST, 1)
            self.c.setopt(pycurl.POSTFIELDS, data)
            # Work on a copy so the caller's list is never modified
            h = copy.deepcopy(headers) if headers else []
            if isjson:
                h.append("Content-Type: application/json")
            return self.send(url, h)
    
    
        def send(self, url, headers):
            body = BytesIO()
            resp_header = BytesIO()
            h = self.default_headers
            h.extend(headers)
            self.c.setopt(pycurl.HTTPHEADER, h)
            self.c.setopt(pycurl.URL, url)
            self.c.setopt(pycurl.WRITEDATA, body)
            self.c.setopt(pycurl.WRITEHEADER, resp_header)
            self.c.perform()
            r = Response(
                self.c.getinfo(pycurl.HTTP_CODE),
                body.getvalue(),
                resp_header.getvalue().decode(),
            )
            self.save_cookies(resp_header.getvalue().decode())
            return r
    
    
        def save_cookies(self, resp_header):
            cookies = re.findall("Set-Cookie: (.*?);", resp_header, re.IGNORECASE)
            for cookie in cookies:
                k, v = cookie.split("=", 1)
                self._cookies[k] = v
    
    
        def set_proxy(self, proxy):
            self.c.setopt(pycurl.PROXY, proxy)
    
    
        @property
        def cookies(self):
            return self._cookies
    
    
        @property
        def default_headers(self):
            h = [
                'sec-ch-ua: ".Not/A)Brand";v="99", "Microsoft Edge";v="103", "Chromium";v="103"',
                "sec-ch-ua-mobile: ?0",
                'sec-ch-ua-platform: "Windows"',
                "upgrade-insecure-requests: 1",
                "user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44",
                "accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
                "sec-fetch-site: none",
                "sec-fetch-mode: navigate",
                "sec-fetch-user: ?1",
                "sec-fetch-dest: document",
                "accept-encoding: gzip, deflate, br",
                "accept-language: en-US,en;q=0.9",
            ]
            return h
    
    
    
    
    def test_curl():
        c = CurlClient()
        # The real URL is redacted in this post.
        resp = c.get("<redacted>")
        print(resp.text)
    
    
    
    
    if __name__ == "__main__":
        test_curl()

    After running it, normal data came back here as well.
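    One detail in the demo is worth a closer look. save_cookies pulls cookies out of the raw response headers with a regex that expects a ";" after each value, so a Set-Cookie line with no attributes after the value would not match. As an alternative sketch, not the demo author's code, the standard library's http.cookies.SimpleCookie can parse those lines. The function name parse_set_cookies and the sample headers are made up for illustration.

```python
from http.cookies import SimpleCookie


def parse_set_cookies(raw_headers: str) -> dict:
    """Collect name -> value pairs from every Set-Cookie line in raw headers."""
    cookies = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "set-cookie":
            continue  # status line, other headers, or blank line
        jar = SimpleCookie()
        jar.load(value.strip())
        for key, morsel in jar.items():
            cookies[key] = morsel.value
    return cookies


if __name__ == "__main__":
    # Hypothetical response headers, for illustration only.
    raw = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "Set-Cookie: sid=abc123; Path=/; HttpOnly\r\n"
        "set-cookie: theme=dark\r\n"
        "\r\n"
    )
    print(parse_set_cookies(raw))
```

    The header name is compared case-insensitively, since HTTP/2 responses usually arrive with lowercase header names.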

    That is all for today's article. I will share more tips later, so stay tuned.

  • Original article: https://blog.csdn.net/m0_72557783/article/details/126587456