Basic Usage of Playwright

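Introduction

Playwright is an open-source browser automation tool developed and maintained by Microsoft. It drives several browsers (Chromium, Firefox, and WebKit) and has bindings for several programming languages (including Python, JavaScript, and C#). Typical uses are testing, web scraping, and automating routine browser tasks.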

Installation

Environment setup

Official documentation for the Python version of Playwright:
https://playwright.dev/python/docs/intro

System requirements:
- Python 3.8 or later.
- Windows 10+, Windows Server 2016+, or Windows Subsystem for Linux (WSL).
- macOS 12 Monterey or macOS 13 Ventura.
- Debian 11, Debian 12, Ubuntu 20.04, or Ubuntu 22.04.
Install the Python version of Playwright:
- pip install playwright
Install the browsers Playwright supports, along with the tools they need:
- playwright install
- This step takes a while.
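
Before running the install commands, it can help to confirm that the interpreter meets the Python 3.8 requirement listed above and to see whether the package is already importable. The following is a minimal sketch using only the standard library; the helper names meets_python_requirement and playwright_installed are made up for this example.

```python
import importlib.util
import sys


def meets_python_requirement(version=None, minimum=(3, 8)):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


def playwright_installed():
    """Return True if the import system can find the `playwright` package."""
    return importlib.util.find_spec("playwright") is not None


print("Python version OK:", meets_python_requirement())
print("playwright importable:", playwright_installed())
```

This only checks the Python package. It does not tell you whether the browsers from playwright install have been downloaded.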

Recording actions (codegen)

Create a .py file, for example main.py.
In a terminal, run:

playwright codegen -o main.py

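To open a given URL and set the browser window (viewport) size:

playwright codegen --viewport-size=800,600 www.baidu.com -o main.py

To emulate a mobile device while recording (this is emulation only, so nothing extra needs to be installed):

playwright codegen --device="iPhone 13" -o main.py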
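
If you start the recorder from a script, the options above can be assembled into an argument list. This is a small sketch, not part of Playwright itself: build_codegen_command is a hypothetical helper, and it only uses the flags shown above (-o, --viewport-size, --device).

```python
import shlex


def build_codegen_command(url=None, output=None, viewport=None, device=None):
    """Assemble a `playwright codegen` argument list from the options shown above."""
    args = ["playwright", "codegen"]
    if viewport is not None:
        width, height = viewport
        args.append(f"--viewport-size={width},{height}")
    if device is not None:
        args.append(f"--device={device}")
    if output is not None:
        args += ["-o", output]
    if url is not None:
        args.append(url)
    return args


cmd = build_codegen_command(url="www.baidu.com", output="main.py", viewport=(800, 600))
# Print the command in a shell-safe form for inspection.
print(shlex.join(cmd))
# To actually start the recorder: subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids quoting problems with device names that contain spaces, such as "iPhone 13".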

Saving cookies

Log in while recording. After the login, the cookie information is saved to the file auth.json:

playwright codegen --save-storage=auth.json http://download.java1234.com/

Recording with auth.json loaded starts directly on the page you see after a successful login:

playwright codegen --load-storage=auth.json http://download.java1234.com/ -o main.py
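
The saved file can also be reused in a hand-written script. The sketch below assumes auth.json has the usual storage-state layout (a top-level "cookies" list whose entries have a "name" key) and passes the file to browser.new_context(storage_state=...). The helper names cookie_names and open_logged_in are illustrative only.

```python
import json


def cookie_names(path="auth.json"):
    """List the cookie names stored in a saved storage-state file."""
    with open(path, encoding="utf-8") as f:
        state = json.load(f)
    return [cookie["name"] for cookie in state.get("cookies", [])]


def open_logged_in(url, state_path="auth.json"):
    """Open `url` in a context that starts with the saved cookies; return the page title."""
    # Imported here so cookie_names() also works where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        try:
            context = browser.new_context(storage_state=state_path)
            page = context.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()
```

Calling cookie_names() first is a quick way to check that the login was actually captured before relying on the file.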

Basic usage

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # headless=False opens a visible browser window
    bro = p.chromium.launch(headless=False)
    page = bro.new_page()
    # Open the target site
    page.goto("https://www.baidu.com")
    # Wait for 1000 ms
    page.wait_for_timeout(1000)
    # Get the page title
    title = page.title()
    # Get the page's HTML source
    content = page.content()
    print(title, content)
    page.close()
    bro.close()
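
The same steps can be wrapped in a function that takes the page as a parameter, which makes the logic reusable and testable without a browser. This is a sketch under a few assumptions: the names fetch_title_and_html and run are invented here, and the fixed 1000 ms wait is replaced by the wait_until option of goto().

```python
def fetch_title_and_html(page, url):
    """Navigate `page` to `url` and return (title, html)."""
    # Return once the DOM has been parsed instead of sleeping for a fixed time.
    page.goto(url, wait_until="domcontentloaded")
    return page.title(), page.content()


def run(url="https://www.baidu.com"):
    """Launch headless Chromium, fetch one page, and print its title and HTML size."""
    # Imported lazily so the helper above can be exercised without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            title, html = fetch_title_and_html(page, url)
            print(title, len(html))
        finally:
            # Close the browser even if navigation raises.
            browser.close()
```

Call run() to try it against a live site. The try/finally ensures the browser is closed when goto() fails, which the straight-line version does not guarantee.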

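Locating elements

CSS selector locators

Syntax: page.locator()
- Argument: a tag, id, hierarchy (parent > child), or class selector
Interactions:
- Click an element: click()
- Type text into an element: fill()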

import random

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # headless=False opens a visible browser window
    bro = p.chromium.launch(headless=False)
    page = bro.new_page()
    # Open the target site
    page.goto("https://www.baidu.com")
    # Wait for 1000 ms
    page.wait_for_timeout(1000)

    # Locate the search box by id and enter "python"
    page.locator("#kw").fill("python")

    # Locate the search button and run the search (#id selects by id, .class by class)
    page.locator("#su").click()
    page.wait_for_timeout(1000)
    # Go back to the previous page
    page.go_back()
    page.wait_for_timeout(1000)
    # Tag + id selector
    page.locator("input#kw").fill("artificial intelligence")
    page.locator("#su").click()
    page.go_back()
    page.wait_for_timeout(1000)
    # Hierarchy (parent > child) selector
    page.locator('#form > span > input#kw').fill('data analysis')
    page.locator('#su').click()
    page.wait_for_timeout(1000)
    page.go_back()

    page.wait_for_timeout(1000)

    # Give keyboard focus to the search box
    page.locator('#form > span > input#kw').focus()

    # Type one character at a time with a random delay (in ms) to mimic a person typing
    input_text = 'Hello, World!'
    for char in input_text:
        page.keyboard.type(char, delay=random.randint(300, 600))

    # Locate the search button and click it
    page.locator('#su').click()

    page.close()
    bro.close()

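
The focus-then-type pattern at the end of the script can be pulled into a helper. This is only a sketch: type_like_human is a hypothetical name, and the optional rng parameter is an addition of this example so that the delays can be made reproducible.

```python
import random


def type_like_human(page, selector, text, low_ms=300, high_ms=600, rng=None):
    """Focus `selector`, then type `text` one character at a time with random delays."""
    rng = rng or random.Random()
    page.locator(selector).focus()
    delays = []
    for char in text:
        delay = rng.randint(low_ms, high_ms)
        # The delay is given in milliseconds.
        page.keyboard.type(char, delay=delay)
        delays.append(delay)
    return delays
```

With a real page this would be called as type_like_human(page, "#kw", "Hello, World!"), followed by page.locator("#su").click(). Unlike fill(), which sets the value in one step, keyboard.type() sends individual key events, which is why it is the one to use when the pacing matters.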

XPath locators

Syntax: page.locator(XPath expression)

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # slow_mo=2000 slows every operation down by 2000 ms so the steps are visible
    bro = p.chromium.launch(headless=False, slow_mo=2000)
    page = bro.new_page()
    page.goto('https://www.bilibili.com/')

    # Locate the search box and the search button by XPath
    page.locator('//*[@id="nav-searchform"]/div[1]/input').fill('Python tutorial')
    page.locator('//*[@id="nav-searchform"]/div[2]').click()

    page.close()
    bro.close()
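
The script above relies on Playwright recognising a selector that starts with // as XPath. The engine can also be named explicitly with the xpath= prefix. A short sketch, where as_xpath and search_bilibili are hypothetical helpers and the XPath expressions are the ones from the example above:

```python
def as_xpath(expr):
    """Return `expr` with an explicit `xpath=` engine prefix."""
    return expr if expr.startswith("xpath=") else "xpath=" + expr


def search_bilibili(page, keyword):
    """Fill the search box and click the search button using explicit XPath selectors."""
    page.locator(as_xpath('//*[@id="nav-searchform"]/div[1]/input')).fill(keyword)
    page.locator(as_xpath('//*[@id="nav-searchform"]/div[2]')).click()
```

Position-based expressions such as div[1] and div[2] depend on the page layout, so they may need updating when the site changes.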

count

count = page.locator('div.my-class').count()
print(count)  # number of elements matching 'div.my-class'

nth(index)

element = page.locator('button').nth(1)  # the second button (the index is zero-based)
element.click()

inner_text()

element = page.locator('h1')
text = element.inner_text()
print(text)  # the inner text of the h1 element
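
count(), nth() and inner_text() combine naturally into a loop over every match. A minimal sketch, with texts_of_all as a made-up helper name; it works with any page object that exposes locator().

```python
def texts_of_all(page, selector):
    """Return the inner text of every element matching `selector`."""
    matches = page.locator(selector)
    # nth(i) narrows the locator to a single element, so inner_text() is unambiguous.
    return [matches.nth(i).inner_text() for i in range(matches.count())]
```

For this particular job the locator also offers all_inner_texts(), which returns the same kind of list in one call. The explicit loop is useful when each element needs further handling.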

get_by_text(xxx)

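# page.get_by_text("Click me") locates by text alone, whatever the tag;
# the CSS form below additionally requires the element to be a button.
element = page.locator('button:text("Click me")')  # a button containing the text "Click me"
element.click()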
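
A short sketch of the get_by_text() form itself, assuming a Playwright release recent enough to provide the get_by_* locators. The helper names are invented for this example; exact=True asks for a full-string match rather than a substring match.

```python
def click_by_text(page, text, exact=True):
    """Click the element located by its visible text."""
    page.get_by_text(text, exact=exact).click()


def click_button_named(page, name):
    """Click a button located by its role and accessible name."""
    page.get_by_role("button", name=name).click()
```

Locating by role and name restricts the match to buttons, which is roughly what the button:text(...) selector does in CSS form.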

get_attribute(attrName)

# .first picks the first match; without it the call fails if the page has several images
element = page.locator('img').first
src = element.get_attribute('src')
print(src)  # the image's src attribute value
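
A common follow-up when scraping is to collect the src of every image and turn relative paths into absolute URLs. The sketch below does that with urllib.parse.urljoin and the page.url property; image_urls is a hypothetical helper name.

```python
from urllib.parse import urljoin


def image_urls(page, selector="img"):
    """Return absolute URLs for the src attribute of every element matching `selector`."""
    matches = page.locator(selector)
    urls = []
    for i in range(matches.count()):
        src = matches.nth(i).get_attribute("src")
        if src:  # skip elements with a missing or empty src attribute
            # Resolve relative paths against the address of the current page.
            urls.append(urljoin(page.url, src))
    return urls
```

Images that are loaded lazily may keep their real address in another attribute, so the attribute name is worth checking on the target site.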


Original article: https://blog.csdn.net/qq_48082548/article/details/138201628