Requesting an HTTPS URL with urllib can fail certificate verification:

```
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>
```

Adding these two lines at the top of the script resolves it:
```python
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
```

A complete POST example (note the explicit `import urllib.parse`, which `urlencode` needs; `import urllib` alone is not enough):

```python
import ssl
import urllib.parse
import urllib.request

# Disable certificate verification for all HTTPS requests in this process
ssl._create_default_https_context = ssl._create_unverified_context

# POST data must be bytes, so urlencode the dict and encode it
data1 = bytes(urllib.parse.urlencode({'name': 'geometry'}), encoding='utf-8')

response = urllib.request.urlopen('https://www.httpbin.org/post', data=data1)
print(response.read().decode('utf-8'))
```

Output:
```json
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "name": "geometry"
  },
  "headers": {
    "Accept-Encoding": "identity",
    "Content-Length": "13",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "www.httpbin.org",
    "User-Agent": "Python-urllib/3.10",
    "X-Amzn-Trace-Id": "Root=1-630329c8-79bee34a3feaa8dc06de8d21"
  },
  "json": null,
  "origin": "117.30.119.96",
  "url": "https://www.httpbin.org/post"
}
```
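If you would rather not patch the default context globally, `urlopen` also accepts an SSL context per request. A minimal sketch of the same POST:

```python
import ssl
import urllib.parse
import urllib.request

# Scope the unverified context to this one request only
ctx = ssl._create_unverified_context()
data1 = bytes(urllib.parse.urlencode({'name': 'geometry'}), encoding='utf-8')
response = urllib.request.urlopen('https://www.httpbin.org/post',
                                  data=data1, context=ctx)
print(response.status)
```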
Next, install requests (note the package name is `requests`, not `request`):
```
pip3 install requests
Collecting requests
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ca/91/6d9b8ccacd0412c08820f72cebaa4f0c0441b5cda699c90f618b6f8a1b42/requests-2.28.1-py3-none-any.whl (62 kB)
     |████████████████████████████████| 62 kB 502 kB/s
Collecting charset-normalizer<3,>=2
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/db/51/a507c856293ab05cdc1db77ff4bc1268ddd39f29e7dc4919aa497f0adbec/charset_normalizer-2.1.1-py3-none-any.whl (39 kB)
Requirement already satisfied: certifi>=2017.4.17 in /Users/apple/PycharmProjects/spydemo1/venv/lib/python3.10/site-packages (from requests) (2022.6.15)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /Users/apple/PycharmProjects/spydemo1/venv/lib/python3.10/site-packages (from requests) (1.26.9)
Requirement already satisfied: idna<4,>=2.5 in /Users/apple/PycharmProjects/spydemo1/venv/lib/python3.10/site-packages (from requests) (3.3)
Installing collected packages: charset-normalizer, requests
Successfully installed charset-normalizer-2.1.1 requests-2.28.1
```
A first request with requests:

```python
import requests

urlbase = 'https://www.baidu.com'
r = requests.get(urlbase)

print(type(r))          # the Response object
print(r.status_code)    # 200 on success
print(type(r.text))     # body decoded to str
print(r.text[:100])     # first 100 characters of the page
print(r.cookies)        # cookies the server set
```
Output:
```
<class 'requests.models.Response'>
200
<class 'str'>
<!DOCTYPE html>
<html> <head><meta http-equiv=content-type content=text/html;charse
<RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]>
```
requests covers all the basic HTTP verbs (GET, POST, DELETE, and so on). Here is a GET request carrying query parameters:
```python
urlparam = 'https://www.httpbin.org/get?name=germey&age=25'

r1 = requests.get(urlparam)
print(r1.text)
```
Output: the server echoes back name=germey and age=25:
```json
{
  "args": {
    "age": "25",
    "name": "germey"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "www.httpbin.org",
    "User-Agent": "python-requests/2.28.1",
    "X-Amzn-Trace-Id": "Root=1-63032c31-5d23881d3de2fc404c11c5e5"
  },
  "origin": "117.30.119.129",
  "url": "https://www.httpbin.org/get?name=germey&age=25"
}
```
Since the body is JSON, `json()` parses it straight into a dict. Note it must be called on `r1`, the httpbin response; the Baidu response `r` holds HTML, so `r.json()` would raise a decode error.

```python
print(r1.json())
```

```
{'args': {'age': '25', 'name': 'germey'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'www.httpbin.org', 'User-Agent': 'python-requests/2.28.1', 'X-Amzn-Trace-Id': 'Root=1-63032cb8-3f780dc507c9344509ebe52b'}, 'origin': '117.30.118.141', 'url': 'https://www.httpbin.org/get?name=germey&age=25'}
```
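Instead of hand-building the query string, requests can encode it for you through the `params` argument. An equivalent sketch (the `payload` name is just illustrative):

```python
import requests

# requests assembles the query string from the dict
payload = {'name': 'germey', 'age': 25}
r1 = requests.get('https://www.httpbin.org/get', params=payload)
print(r1.url)   # https://www.httpbin.org/get?name=germey&age=25
```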
When the URL points at an image, what comes back is binary data. Printing `r.text` only shows gibberish, because the bytes are not text; writing `r.content` to a file in binary mode saves the actual image.
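A minimal sketch (the favicon URL here is just a stand-in; any direct image link works the same way):

```python
import requests

# r.content holds the raw bytes of the image
r = requests.get('https://www.baidu.com/favicon.ico')  # stand-in image URL
with open('favicon.ico', 'wb') as f:   # 'wb': write binary
    f.write(r.content)
```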
requests also often needs request headers added before a site will hand over the real content. Without them, a request can fail like this:
```
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='ssr1.csrape.centor', port=443): Max retries exceeded with url: / (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
```
(Strictly speaking, this particular traceback is a DNS failure from a mistyped hostname, presumably ssr1.scrape.center; a site that merely rejects the default Python User-Agent usually answers with a 403 instead.) Either way, adding a browser-like request header is often required. Code:
```python
urlbase4 = 'https://www.sina.com.cn/'
myheaders = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
    # 'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)'
}

r3 = requests.get(urlbase4, headers=myheaders)
print(r3.text[:400])
```
Getting a usable header is easy: capture a request in the browser's developer tools (or any packet sniffer) and copy the User-Agent. Here we print the first 400 characters of the Sina homepage; the Chinese comes out as mojibake because the response encoding has not been set yet:
```
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <title>æ°æµªé¦é¡µ</title>
    <meta name="keywords" content="æ°æµª,æ°æµªç½,SINA,sina,sina.com.cn,æ°æµªé¦é¡µ,é¨æ·,èµè®¯" />
    <meta name="description" content="æ°æµªç½ä¸ºå¨çç¨
```
requests and urllib only speak HTTP/1.1. If a site requires HTTP/2, use httpx instead:
```
Installing collected packages: rfc3986, h11, anyio, httpcore, httpx
  Attempting uninstall: h11
    Found existing installation: h11 0.13.0
    Uninstalling h11-0.13.0:
      Successfully uninstalled h11-0.13.0
Successfully installed anyio-3.6.1 h11-0.12.0 httpcore-0.15.0 httpx-0.23.0 rfc3986-1.5.0
```
Usage is basically the same as with requests:
```python
import httpx

urlbase = 'https://www.httpbin.org/get'
with httpx.Client() as client:   # the client closes itself on exit
    response = client.get(urlbase)
    print(response)
```
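Note that a plain `httpx.Client()` still negotiates HTTP/1.1. Actual HTTP/2 requires the optional extra (`pip3 install 'httpx[http2]'`) plus an explicit flag. A minimal sketch:

```python
import httpx

# http2=True enables HTTP/2 (needs the httpx[http2] extra installed)
with httpx.Client(http2=True) as client:
    response = client.get('https://www.httpbin.org/get')
    print(response.http_version)   # 'HTTP/2' if the server negotiated it
```

httpx also ships an async client built on asyncio, so it helps to see how asyncio coroutines behave on their own first: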
```python
import asyncio


async def execute(x):
    print('Number:', x)
    return x


# Calling a coroutine function does NOT run it; it returns a coroutine object
coroutine = execute(1)
print('Coroutine:', coroutine)
print('After calling execute')

# Wrap the coroutine in a Task so the event loop can schedule it
task = asyncio.ensure_future(coroutine)
print('Task:', task)

# Drive the task to completion on the event loop
loop = asyncio.get_event_loop()
loop.run_until_complete(task)

print('Task:', task)
print('After calling loop')
```
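The payoff comes when several coroutines run concurrently. A sketch using `asyncio.gather`, where the 1-second sleep stands in for a slow network call (the `fetch` and `main` names are illustrative):

```python
import asyncio


async def fetch(x):
    await asyncio.sleep(1)   # stand-in for slow I/O such as an HTTP request
    return x


async def main():
    # Three 1-second coroutines finish in about 1 second total, not 3
    results = await asyncio.gather(fetch(1), fetch(2), fetch(3))
    print('Results:', results)

asyncio.run(main())
```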