Advanced Java Regular Expressions

1. Capturing groups
2. Non-capturing groups
3. Non-greedy matching
4. Regular expression examples
5. The Pattern class
6. The Matcher class
7. Backreferences

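1. Capturing groups

Common grouping constructs:

(pattern): a numbered capturing group. Groups are numbered from 1 by the position of their opening parenthesis, left to right; group 0 is the entire match.
(?<name>pattern): a named capturing group. The name must start with a letter and may contain only letters and digits; the group can still be read by its number.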

Example: retrieving groups by number:

Pattern pattern = Pattern.compile("(\\d\\d)(\\d\\d)");
Matcher matcher = pattern.matcher("7788 abc7765");
while (matcher.find()) {
    // group(0) is the whole match; group(1) and group(2) are the two captured pairs of digits
    System.out.println("Found: " + matcher.group(0));
    System.out.println("Group 1: " + matcher.group(1));
    System.out.println("Group 2: " + matcher.group(2));
}
-------------------------------
Output:
Found: 7788
Group 1: 77
Group 2: 88
Found: 7765
Group 1: 77
Group 2: 65

Example: retrieving groups by name:

// Named groups: (?<name>...) gives each group a name that group(String) can look up
Pattern pattern2 = Pattern.compile("(?<g1>\\d\\d)(?<g2>\\d\\d)");
Matcher matcher2 = pattern2.matcher("7788 abc7765");
while (matcher2.find()) {
    System.out.println("Found: " + matcher2.group(0));
    System.out.println("Group 1: " + matcher2.group("g1"));
    System.out.println("Group 2: " + matcher2.group("g2"));
}
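
Named groups can also be referenced from a replacement string with the ${name} syntax. The following is a minimal sketch of that idea; the class NamedGroupDemo, its method names, and the date format are illustrative assumptions and do not come from the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class used only for illustration.
class NamedGroupDemo {
    // Named groups y, m, and d capture the year, month, and day of a yyyy-MM-dd date.
    private static final Pattern ISO_DATE =
            Pattern.compile("(?<y>\\d{4})-(?<m>\\d{2})-(?<d>\\d{2})");

    // Rewrites every yyyy-MM-dd date in the text as dd/MM/yyyy.
    static String toDayFirst(String text) {
        return ISO_DATE.matcher(text).replaceAll("${d}/${m}/${y}");
    }

    // Returns the year of the first date found, or null if there is none.
    static String firstYear(String text) {
        Matcher matcher = ISO_DATE.matcher(text);
        return matcher.find() ? matcher.group("y") : null;
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("Released on 2022-08-24.")); // Released on 24/08/2022.
        System.out.println(firstYear("Released on 2022-08-24."));  // 2022
    }
}
```

Names make the replacement string readable without counting parentheses, which matters more as the number of groups grows.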

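2. Non-capturing groups

Common non-capturing constructs:

(?:pattern): groups the subexpression without capturing it, so it gets no group number and cannot be retrieved later. Useful for combining alternatives with |.
(?=pattern): positive lookahead. The match succeeds only if pattern follows at this position; the text matched by pattern is not consumed and not captured.
(?!pattern): negative lookahead. The match succeeds only if pattern does not follow at this position; nothing is consumed or captured.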
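
To make the difference between (?:...), (?=...), and (?!...) concrete, here is a small sketch. The class NonCapturingDemo, its helper methods, and the sample input are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class used only for illustration.
class NonCapturingDemo {
    // Collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    // Number of capturing groups that the regex defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String input = "Windows95 Windows98 WindowsNT";
        // (?:...) groups the alternatives without creating a capturing group.
        System.out.println(findAll("Windows(?:95|98)", input)); // [Windows95, Windows98]
        System.out.println(groupCount("Windows(?:95|98)"));     // 0
        // (?=...) requires the suffix to follow but leaves it out of the match.
        System.out.println(findAll("Windows(?=95|98)", input)); // [Windows, Windows]
        // (?!...) requires that the suffix does not follow.
        System.out.println(findAll("Windows(?!95|98)", input)); // [Windows]
    }
}
```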

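3. Non-greedy matching

?

When this character immediately follows another quantifier (*, +, ?, {n}, {n,}, {n,m}), that quantifier becomes non-greedy (also called lazy). A non-greedy quantifier matches as few characters as possible, whereas the default greedy quantifier matches as many characters as possible.

For example, in the string "oooo", "o+?" matches a single "o" at a time, while "o+" matches all four "o" characters at once.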
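
The contrast between greedy and non-greedy quantifiers can be checked with a short sketch such as the one below. The class LazyDemo, the helper firstMatch, and the HTML-like sample string are assumptions added for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class used only for illustration.
class LazyDemo {
    // Returns the first match of the regex in the input, or null if nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Greedy: + takes as many characters as it can.
        System.out.println(firstMatch("o+", "oooo"));  // oooo
        // Non-greedy: +? stops as soon as the overall match can succeed.
        System.out.println(firstMatch("o+?", "oooo")); // o

        String html = "<b>bold</b> and <i>italic</i>";
        // Greedy .+ runs to the last '>' in the input.
        System.out.println(firstMatch("<.+>", html));  // <b>bold</b> and <i>italic</i>
        // Non-greedy .+? stops at the first '>'.
        System.out.println(firstMatch("<.+?>", html)); // <b>
    }
}
```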

4. Regular expression examples

Four simple examples:

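import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test; // JUnit 4 is assumed; the original does not show its imports

/**
 * Regular expression examples
 */
public class RegularExpressionExample {
    /**
     * Check whether a string consists only of Chinese characters
     */
    @Test
    public void isChinese() {
        String str = "大河之犬";
        // \u4e00-\u9fa5 covers the commonly used CJK Unified Ideographs.
        // The wider range \u0391-\uffe5 would also accept Greek, Cyrillic, and full-width symbols.
        String regex = "^[\u4e00-\u9fa5]+$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            System.out.println("Format is valid!");
        } else {
            System.out.println("Format is invalid!");
        }
    }

    /**
     * Validate a postal code:
     * six digits, the first of which is 1-9
     */
    @Test
    public void isZipCode() {
        String str = "123450";
        String regex = "^[1-9]\\d{5}$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            System.out.println("Format is valid!");
        } else {
            System.out.println("Format is invalid!");
        }
    }

    /**
     * Validate a QQ number:
     * 5 to 10 digits, the first of which is 1-9
     */
    @Test
    public void isQQ() {
        String str = "256789087";
        String regex = "^[1-9]\\d{4,9}$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            System.out.println("Format is valid!");
        } else {
            System.out.println("Format is invalid!");
        }
    }

    /**
     * Validate a mobile phone number:
     * 11 digits starting with 13, 14, 15, or 18
     */
    @Test
    public void isPhoneNumber() {
        String str = "13245678901";
        // Inside a character class | is a literal character, so [3|4|5|8] would also accept "1|...".
        String regex = "^1[3458]\\d{9}$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            System.out.println("Format is valid!");
        } else {
            System.out.println("Format is invalid!");
        }
    }
}
------------------------
Output: every test prints "Format is valid!"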
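
The tests above only exercise inputs that pass. As a possible variation, the sketch below precompiles each pattern once and uses Matcher.matches(), which requires the whole input to match so the ^ and $ anchors can be dropped. The class name Validators and its method names are assumptions, and the phone rule uses the character class [3458].

```java
import java.util.regex.Pattern;

// Hypothetical utility class used only for illustration.
class Validators {
    // Compiling once avoids re-parsing the regex on every call.
    private static final Pattern ZIP_CODE = Pattern.compile("[1-9]\\d{5}");
    private static final Pattern QQ = Pattern.compile("[1-9]\\d{4,9}");
    private static final Pattern PHONE = Pattern.compile("1[3458]\\d{9}");

    // matches() succeeds only if the entire input matches the pattern.
    static boolean isZipCode(String s) {
        return ZIP_CODE.matcher(s).matches();
    }

    static boolean isQQ(String s) {
        return QQ.matcher(s).matches();
    }

    static boolean isPhoneNumber(String s) {
        return PHONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isZipCode("123450"));          // true
        System.out.println(isZipCode("012345"));          // false: leading zero
        System.out.println(isQQ("1234"));                 // false: too short
        System.out.println(isPhoneNumber("13245678901")); // true
        System.out.println(isPhoneNumber("12345678901")); // false: second digit is 2
    }
}
```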

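Validating a complex URL with a regular expression:

/**
 * Validate whether a string is a well-formed URL
 * Approach:
 * 1. The URL starts with https:// or http://
 * 2. ([\w-]+\.)+[\w-]+ matches the host part, e.g. blog.csdn.net
 * 3. An optional group matches the rest (path, query string, fragment)
 */
@Test
public void isUrl() {
    String str = "https://blog.csdn.net/Gherbirthday0916?spm=1010.2135.3001.5343";
    // The hyphen is escaped inside the last character class so it cannot be read as a range.
    String regex = "^https?://([\\w-]+\\.)+[\\w-]+(/[\\w\\-?=&/%.#]+)?$";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    if (matcher.find()) {
        System.out.println("Format is valid!");
    } else {
        System.out.println("Format is invalid!");
    }
}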

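5. The Pattern class

The matches method

Use Pattern.matches to validate input: it checks whether the entire string conforms to the pattern and returns only true or false, with no match details.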

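/**
 * Demonstrates the matches method, which matches against the whole input.
 * Use it to validate that an input string satisfies a rule.
 */
@Test
public void matchesTest() {
    String str = "大河之犬";
    // matches() already requires the whole input to match, so ^ and $ are not needed.
    // \u4e00-\u9fa5 covers the commonly used CJK Unified Ideographs.
    String regex = "[\u4e00-\u9fa5]+";
    boolean matches = Pattern.matches(regex, str);
    System.out.println(matches);  // true
}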
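
Because Pattern.matches tests the whole input, it behaves differently from Matcher.find, which accepts a match anywhere in the input. The sketch below illustrates the difference; the class MatchesVsFind and its method names are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class used only for illustration.
class MatchesVsFind {
    // True only if the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if any substring of the input matches the regex.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d+", "123"));      // true
        System.out.println(wholeMatch("\\d+", "123abc"));   // false: "abc" is left over
        System.out.println(partialMatch("\\d+", "123abc")); // true: "123" is found
    }
}
```

Pattern.matches compiles the regex on every call, so for repeated validation it is usually better to compile a Pattern once and reuse it.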

6. The Matcher class

Examples of commonly used methods:

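import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Matcher class demo
 */
public class MatcherTest {
    public static void main(String[] args) {
        String content = "Go's syntax is close to C, but variable declarations differ. Go supports " +
                "garbage collection. Go's concurrency model is based on Tony Hoare's communicating " +
                "sequential processes (CSP); other languages with a similar model include Occam and " +
                "Limbo, but it also has features of the pi calculus, such as channel passing. " +
                "Version 1.8 added plugin support, which means some functions can now be loaded " +
                "dynamically from Go.\n" +
                "Compared with C++, Go does not include features such as enumerations, exception " +
                "handling, inheritance, generics, assertions, or virtual functions, but adds " +
                "language-level support for slices, concurrency, channels, garbage collection, and " +
                "interfaces. Go 2.0 will support generics; it takes a negative view of assertions " +
                "and defends its lack of type inheritance.\n" +
                "Unlike Java, Go has built-in associative arrays (also called hashes or " +
                "dictionaries), just like the string type. aaaaaa11c8abcABCaBc";

        // Match "Go"
        Pattern pattern = Pattern.compile("Go");
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            // Start index of the current match (inclusive)
            System.out.println(matcher.start());
            // End index of the current match (exclusive)
            System.out.println(matcher.end());
            System.out.println("============================");
        }

        // Whole-input match: checks whether the entire string satisfies the pattern.
        // Prints false here, because content is much more than just "Go".
        System.out.println(matcher.matches());

        // Replace every "Go" with "Golang"; replaceAll resets the matcher first
        String res = matcher.replaceAll("Golang");
        System.out.println(res);
    }
}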
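
Besides find() and matches(), Matcher also offers lookingAt(), which matches from the start of the input without requiring the match to reach the end. The following sketch compares the three on a short string; the class MatcherModes, its helper countMatches, and the sample text are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class used only for illustration.
class MatcherModes {
    // Counts how many times the regex matches in the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Go is fun; Go is fast";
        Matcher matcher = Pattern.compile("Go").matcher(text);

        // matches(): the whole input must match.
        System.out.println(matcher.matches());   // false
        // lookingAt(): the input must start with a match.
        System.out.println(matcher.lookingAt()); // true
        // find(): a match anywhere is enough; here it is used to count occurrences.
        System.out.println(countMatches("Go", text)); // 2
        // replaceFirst() replaces only the first occurrence.
        System.out.println(matcher.replaceFirst("Golang")); // Golang is fun; Go is fast
    }
}
```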

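7. Backreferences

Backreferences are convenient because they let you match again the text that an earlier group captured, without writing that part of the pattern a second time. Use \N, where N is the group number, to refer to a previously defined group. For example, a text that begins with abc, continues with xyz, and is followed by abc again can be matched with "abcxyzabc", or rewritten with a backreference as "(abc)xyz\1", where \1 refers to the first group (abc). Likewise, \2 refers to the second group, \3 to the third, and so on. Note that a backreference matches the exact text the group captured, not the group's pattern, and that inside a Java string literal the backslash must be doubled ("\\1").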

Small examples:

Match two consecutive identical digits:

(\\d)\\1

Match five consecutive identical digits:

(\\d)\\1{4}

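Match a four-digit number whose ones and thousands digits are the same and whose tens and hundreds digits are the same (for example 1221):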

(\\d)(\\d)\\2\\1
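
The three patterns above can be tried out with a sketch like the following. The class BackreferenceDemo, the helper findAll, and the sample inputs are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class used only for illustration.
class BackreferenceDemo {
    // Collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // \1 must repeat whatever digit group 1 captured.
        System.out.println(findAll("(\\d)\\1", "1123 4556 789"));         // [11, 55]
        // One captured digit followed by four repeats of it: five in total.
        System.out.println(findAll("(\\d)\\1{4}", "12222234 555556"));    // [22222, 55555]
        // Digits a, b, then b again, then a again.
        System.out.println(findAll("(\\d)(\\d)\\2\\1", "1221 5775 1234")); // [1221, 5775]
    }
}
```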

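Backreference example: removing a stutter:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Stutter removal example:
 * turns a stuttered sentence back into a normal one
 */
public class Stuttering {
    public static void main(String[] args) {
        // A stuttered version of "我要学java" ("I want to learn java")
        String content = "我...要要...学学学学......java";
        // Match every literal dot
        Pattern pattern = Pattern.compile("\\.");
        Matcher matcher = pattern.matcher(content);
        // Replace all dots with the empty string
        content = matcher.replaceAll("");

        // Collapse repeated characters: (.) captures one character, \\1+ matches its repeats
        pattern = Pattern.compile("(.)\\1+");
        matcher = pattern.matcher(content);
        // $1 in the replacement keeps a single copy of the captured character
        content = matcher.replaceAll("$1");
        System.out.println(content);  // 我要学java
    }
}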


Original article: https://blog.csdn.net/Gherbirthday0916/article/details/126500279