Python Coroutines and Async Calls, Squeezing Out a Program's Idle Time: Writing Looped Tasks Asynchronously and Getting Their Return Values (2)

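For the basics of synchronous vs. asynchronous code and how to write it, see the previous post:
Python Coroutines and Async Calls, Squeezing Out a Program's Idle Time: Rewriting an Ordinary Program as Async (1)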

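Because requests itself can only send synchronous requests, we use httpx, which supports async, to access the sites. Below we compare the synchronous and asynchronous styles by looping over the same list of websites both ways.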

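That's right: an asynchronous operation needs the function or third-party library itself to support async. This is why the previous post used await asyncio.sleep(1) rather than await time.sleep(1).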
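To make the difference concrete, here is a minimal sketch that runs three coroutines together, once with time.sleep and once with asyncio.sleep. The helper names (blocking_job, cooperative_job, timed, compare) and the 0.2 second delay are made up for illustration. Note that await time.sleep(1) itself would fail with a TypeError, because time.sleep returns None and None cannot be awaited; the sketch therefore calls time.sleep without await to show the blocking effect.

```python
import asyncio
import time


async def blocking_job(name):
    # time.sleep is an ordinary function: it blocks the whole event loop,
    # so no other coroutine can make progress while it sleeps.
    time.sleep(0.2)
    return name


async def cooperative_job(name):
    # asyncio.sleep is a coroutine: awaiting it hands control back to the loop.
    await asyncio.sleep(0.2)
    return name


async def timed(job):
    # Run three copies of `job` together and measure the total wall time.
    start = time.perf_counter()
    results = await asyncio.gather(job("a"), job("b"), job("c"))
    return results, time.perf_counter() - start


def compare():
    blocking = asyncio.run(timed(blocking_job))
    cooperative = asyncio.run(timed(cooperative_job))
    return blocking, cooperative


if __name__ == "__main__":
    (_, t_block), (_, t_coop) = compare()
    print("time.sleep inside coroutines:    {:.2f}s".format(t_block))
    print("asyncio.sleep inside coroutines: {:.2f}s".format(t_coop))
```

With time.sleep the three waits add up, while with asyncio.sleep they overlap, so the second figure should come out close to a single delay.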

httpx official documentation: https://www.python-httpx.org/
httpx async documentation: https://www.python-httpx.org/async/

Synchronous loop & getting return values

Take a program that needs to run in a loop, such as the code below:

from datetime import datetime
import httpx


def get_url(url):
    # The header value must not repeat the "User-Agent:" prefix.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"}
    req_result = httpx.get(url, headers=headers)
    print('GET URL {} , time {}'.format(url, datetime.now()))
    if req_result.status_code == 200:
        return req_result.text[:20]  # return only the first 20 characters
    else:
        return None


def main():
    start_time = datetime.now()
    url_list = [
        "https://www.baidu.com",
        "https://www.github.com",
        "https://www.bilibili.com",
        "https://zhuanlan.zhihu.com/"
    ]

    # Collect the returned data, one request after another
    result_dict = {}
    for _url in url_list:
        result_dict[_url] = get_url(_url)
    end_time = datetime.now()
    print("Elapsed:", end_time - start_time)
    print(result_dict)


if __name__ == '__main__':
    main()

The output is as follows:

GET URL https://www.baidu.com , time 2022-07-28 08:44:17.898791
GET URL https://www.github.com , time 2022-07-28 08:44:19.001649
GET URL https://www.bilibili.com , time 2022-07-28 08:44:19.925008
GET URL https://zhuanlan.zhihu.com/ , time 2022-07-28 08:44:20.156802
Elapsed: 0:00:03.159291
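{'https://www.baidu.com': '... [output truncated in the source]

Asynchronous loop & getting return values

The same loop written asynchronously with httpx:

import asyncio
from datetime import datetime
import httpx


async def get_by_request(url):
    # NOTE: the top of this listing was lost in the source. The imports, the
    # function signature and the AsyncClient setup are reconstructed from the
    # surviving lines and are an assumption.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"}
    async with httpx.AsyncClient() as client:
        req = await client.get(url, headers=headers)
    print('GET URL {} , time {}'.format(url, datetime.now()))
    if req.status_code == 200:
        return req.text[:20]  # return only the first 20 characters
    else:
        return None


def main():
    start_time = datetime.now()
    url_list = [
        "https://www.baidu.com",
        "https://www.github.com",
        "https://www.bilibili.com",
        "https://zhuanlan.zhihu.com/"
    ]

    # Asynchronous loop: create one task per URL, then run them all together.
    # new_event_loop() is used because calling get_event_loop() without a
    # running loop is deprecated in recent Python versions.
    tasks = []
    loop = asyncio.new_event_loop()
    for _url in url_list:
        tasks.append(loop.create_task(get_by_request(_url)))
    loop.run_until_complete(asyncio.wait(tasks))
    # Collect the returned data; tasks are in the same order as url_list
    result_dict = {}
    for index, _tk in enumerate(tasks):
        result_dict[url_list[index]] = _tk.result()
    loop.close()
    end_time = datetime.now()
    print("Elapsed:", end_time - start_time)
    print(result_dict)


if __name__ == '__main__':
    main()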
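The listing above reads the return values from the tasks list rather than from what asyncio.wait returns. A small sketch of why, assuming a made-up fake_fetch coroutine that stands in for a real request: asyncio.wait hands back unordered sets of done and pending tasks, while the list we built ourselves keeps the input order, so position i in tasks still belongs to item i of the input.

```python
import asyncio


async def fake_fetch(name, delay, finished):
    # Stand-in for a network request: wait, record the finish order, return a value.
    await asyncio.sleep(delay)
    finished.append(name)
    return name.upper()


async def collect():
    names = ["slow", "fast", "medium"]
    delays = [0.3, 0.1, 0.2]
    finished = []
    tasks = [
        asyncio.create_task(fake_fetch(n, d, finished))
        for n, d in zip(names, delays)
    ]

    # `done` and `pending` are sets, so they carry no ordering information.
    done, pending = await asyncio.wait(tasks)

    # The list we built keeps the input order, so zip() pairs each name
    # with the task that was created for it.
    by_position = {n: t.result() for n, t in zip(names, tasks)}
    return finished, by_position, len(done), len(pending)


if __name__ == "__main__":
    finish_order, results, n_done, n_pending = asyncio.run(collect())
    print("finish order:", finish_order)
    print("results by input order:", results)
```

The tasks finish in a different order from the input list, yet the dictionary built from the list still maps each input to its own result.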

The result:

GET URL https://zhuanlan.zhihu.com/ , time 2022-07-28 08:47:18.548526
GET URL https://www.baidu.com , time 2022-07-28 08:47:18.738401
GET URL https://www.github.com , time 2022-07-28 08:47:19.293389
GET URL https://www.bilibili.com , time 2022-07-28 08:47:19.369106
Elapsed: 0:00:01.224891
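{'https://www.baidu.com': '... [output truncated in the source]

The asynchronous run took about 1.2 seconds, against about 3.2 seconds for the synchronous one, and the requests completed in a different order from the list. The return values are read from each task with result() once the loop has finished:

    result_dict = {}
    for index, _tk in enumerate(tasks):
        result_dict[url_list[index]] = _tk.result()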
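As an alternative to reading result() from each task, asyncio.gather returns the results directly, in the order the awaitables were passed in. The sketch below is not the approach used in this post; fake_get and fetch_all are hypothetical names, and fake_get only simulates a request with a short sleep so the example runs without network access.

```python
import asyncio


async def fake_get(url):
    # Hypothetical stand-in for `await client.get(url)`; no network involved.
    await asyncio.sleep(0.05)
    return "body of " + url


async def fetch_all(url_list):
    # gather runs the coroutines concurrently and returns their results
    # as a list in the same order as the arguments.
    results = await asyncio.gather(*(fake_get(u) for u in url_list))
    return dict(zip(url_list, results))


if __name__ == "__main__":
    urls = ["https://www.baidu.com", "https://www.github.com"]
    # asyncio.run creates the event loop, runs the coroutine, and closes the loop.
    print(asyncio.run(fetch_all(urls)))
```

In a real program fake_get would be replaced by a coroutine that awaits an httpx.AsyncClient request, such as get_by_request above.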


Original article: https://blog.csdn.net/weixin_35757704/article/details/126022826