Using Python's requests library

get method: sends a request and returns the response data
response.content.decode(): decodes the returned bytes into text

import requests

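# Send the request and get the response
response = requests.get("http://www.baidu.com")
print(response)  # prints the response object, e.g. <Response [200]>
# Read the returned data
print(response.encoding)  # encoding inferred from the headers; falls back to ISO-8859-1 when no charset is declared
response.encoding = 'utf8'  # override the encoding used by response.text
print(response.text)

print(response.content)  # raw bytes
print(response.content.decode())  # decode the bytes; decode() defaults to utf-8
# print(response.content.decode(encoding='gbk'))  # an explicit encoding can also be passed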
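
The effect of the encoding setting can be checked offline. The sketch below is only an illustration: the byte string `raw` is a made-up stand-in for `response.content`, and no request is sent. It shows why text read with ISO-8859-1 comes out garbled while `decode()` with UTF-8 recovers it.

```python
# Stand-in for response.content: the UTF-8 bytes a server might send.
# The sample text is an assumption made for this example.
raw = "百度一下".encode("utf-8")

# Decoding with the wrong codec does not raise an error for ISO-8859-1,
# because every byte maps to some character; the result is just garbled.
wrong = raw.decode("iso-8859-1")

# decode() with no argument uses UTF-8 and recovers the original text.
right = raw.decode()

print(repr(wrong))
print(right)
```

Setting `response.encoding = 'utf8'` before reading `response.text` has the same effect as the second decode: it tells requests which codec to apply to the raw bytes.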

Using Python's Beautiful Soup 4

BeautifulSoup('data', 'lxml'): creates an object that parses HTML text
find method: returns the first matching element; find_all works the same way but returns every match

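1. Search by tag: name='input'
2. Search by id: id='xxx'
3. Search by attributes: attrs={'type': 'submit', 'name': 'smbtn'}
4. Search by text: text='用户号' returns the matching text itself as a NavigableString; the other lookups return Tag objects. Newer Beautiful Soup releases call this argument string=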

Tag object attributes .name, .attrs and .text return the tag name, the attribute dictionary and the text content

# Install Beautiful Soup 4 with: pip install bs4
# This installs both bs4 and beautifulsoup4
# It is mainly used to parse the document tree and pairs with lxml: pip install lxml

import requests
from bs4 import BeautifulSoup

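# Create a BeautifulSoup object from a string and the lxml parser; omitting the parser triggers a warning.
# Incomplete markup is repaired automatically.
soup = BeautifulSoup('data', 'lxml')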
print(soup)

# Fetch the response from the URL
response = requests.get("https://jksb.v.zzu.edu.cn/vls6sss/zzujksb.dll/first0")

# Get the HTML text
# print(response.content.decode())
html_text = response.content.decode()

# Create a BeautifulSoup object to parse the HTML text
soup = BeautifulSoup(html_text, 'lxml')

# 1. Search by tag: name='input'
# 2. Search by id: id='xxx'
# 3. Search by attributes: attrs={'type': 'submit', 'name': 'smbtn'}
# 4. Search by text: text='用户号' returns a NavigableString; the other lookups return Tag objects

# Find the element whose tag name is title
title = soup.find('title')
print(title)

# Find all input tags
input_all = soup.find_all(name='input')
print(input_all)

# attrs lookup: match on attributes, e.g. {'id': 'uid', 'class': 'xxx'}; returns None when nothing matches
input_uid = soup.find(attrs={'name': 'uid', 'tabindex': '0'})
print(input_uid)
input_uid = soup.find(attrs={'type': 'submit', 'name': 'uid'})
print(input_uid)

# text lookup (newer Beautiful Soup releases call this argument string=)
text = soup.find(text='用户号')
print(text)
print(type(text))

# Using a Tag object
input_uid = soup.find(attrs={'type': 'submit'})
if input_uid is not None:  # find returns None when nothing matches
    print('Tag name:', input_uid.name)
    print('Attributes:', input_uid.attrs)
    print('Text:', input_uid.text)
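
The four lookups and the Tag attributes can also be tried without the live page. The sketch below is a rough illustration under a few assumptions: the HTML form is invented for this example, and it uses Python's built-in `html.parser` instead of `lxml` so that nothing beyond bs4 is needed. It writes `string=` for the text lookup, which is the newer name for `text=`.

```python
from bs4 import BeautifulSoup

# Hypothetical login form, invented for this example.
html = """
<html><head><title>Login</title></head>
<body>
<form>
<label>用户号</label>
<input type="text" name="uid" id="uid" tabindex="0">
<input type="password" name="upw">
<input type="submit" name="smbtn" value="OK">
</form>
</body></html>
"""

# html.parser ships with Python; the article itself uses 'lxml'.
soup = BeautifulSoup(html, "html.parser")

title = soup.find("title")                                     # 1. by tag name
by_id = soup.find(id="uid")                                    # 2. by id
submit = soup.find(attrs={"type": "submit", "name": "smbtn"})  # 3. by attributes
label = soup.find(string="用户号")                              # 4. by text
inputs = soup.find_all(name="input")                           # every input tag
missing = soup.find(attrs={"type": "submit", "name": "uid"})   # no such element

print(title.name, title.text)   # tag name and text content
print(by_id.attrs)              # attribute dictionary
print(submit.attrs["value"])    # a single attribute
print(type(label).__name__)     # text lookups give a NavigableString
print(len(inputs))              # number of input tags found
print(missing)                  # None, because nothing matched
```

Because `find` returns None when nothing matches, it is worth checking the result before reading `.name`, `.attrs` or `.text` from it.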

Original article: https://blog.csdn.net/weixin_43960044/article/details/119964344