• Using regular expressions on Linux


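    Special symbols

    Symbol       Meaning
    [:alnum:]    Upper- and lowercase English letters and digits, i.e. 0-9, A-Z, a-z
    [:alpha:]    Any upper- or lowercase English letter, i.e. A-Z, a-z
    [:blank:]    The space character and the [Tab] character
    [:cntrl:]    Control characters, including CR, LF, Tab, Del, and so on
    [:digit:]    Digits only, i.e. 0-9
    [:graph:]    Every visible character: printable characters excluding whitespace (space, [Tab])
    [:lower:]    Lowercase letters, i.e. a-z
    [:print:]    Any printable character
    [:punct:]    Punctuation symbols, e.g. " ' ? ! ; : # $ …
    [:upper:]    Uppercase letters, i.e. A-Z
    [:space:]    Any character that produces whitespace, including space, [Tab], CR, and so on
    [:xdigit:]   Hexadecimal digits, i.e. 0-9, A-F, a-f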
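
A class name is only valid inside a bracket expression, so in a pattern it is written with two pairs of brackets, for example [[:digit:]]. The following is a minimal sketch of that usage. The sample() helper and its input lines are made up for illustration, and LC_ALL=C is pinned on the assumption that plain ASCII behaviour is wanted.

```shell
# Pin the locale so class membership is limited to ASCII (an assumption of this sketch).
export LC_ALL=C

# Hypothetical sample input, one item per line.
sample() {
    printf '%s\n' 'abc' 'ABC' '123' 'a1B2' '!?#' 'dead beef'
}

# Lines containing at least one digit.
digits=$(sample | grep -n '[[:digit:]]')

# Lines containing at least one uppercase letter.
uppers=$(sample | grep -n '[[:upper:]]')

# Lines containing a punctuation symbol.
puncts=$(sample | grep -n '[[:punct:]]')

# Lines containing a space or a tab.
blanks=$(sample | grep -n '[[:blank:]]')

printf '%s\n' "$digits" "$uppers" "$puncts" "$blanks"
```

Writing '[:digit:]' with a single pair of brackets would instead be read as an ordinary set containing the characters :, d, i, g and t.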
    # Create a file and copy the contents below into it
    touch regular_express.txt
    

    File contents (^M marks a carriage return, i.e. a DOS-style line ending)

    "Open Source" is a good mechanism to develop programs.
    apple is my favorite food.
    Football game is not use feet only.
    this dress doesn't fit me.
    However, this dress is about $ 3183 dollars.^M
    GNU is free air not free beer.^M
    Her hair is very beauty.^M
    I can't finish the test.^M
    Oh! The soup taste good.^M
    motorcycle is cheap than car.
    This window is clear.
    the symbol '*' is represented as start.
    Oh!     My god!
    The gd software is a library for drafting programs.^M
    You are the best is mean you are the no. 1.
    The world <Happy> is the same with "glad".
    I like dog.
    google is the best tools for search keyword.
    goooooogle yes!
    go! go! Let's go.
    # I am VBird
    

    Advanced grep usage

    Searching for a specific string

    [root@localhost ~]# grep -n 'the' regular_express.txt
    8:I can't finish the test.^M
    12:the symbol '*' is represented as start.
    15:You are the best is mean you are the no. 1.
    16:The world <Happy> is the same with "glad".
    18:google is the best tools for search keyword.
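
The search above matches 'the' as a plain substring and is case-sensitive, which is why "The" at the start of a line is not reported. A small sketch of that behaviour follows. The sample() helper and its lines are hypothetical; -i is the standard grep option for ignoring case.

```shell
# Hypothetical sample input.
sample() {
    printf '%s\n' 'the cat' 'The dog' 'other' 'nothing'
}

# -n prefixes each matching line with its line number.
# 'the' matches as a substring, so "other" is reported as well.
plain=$(sample | grep -n 'the')

# -i ignores case, so "The dog" now matches too.
nocase=$(sample | grep -in 'the')

printf '%s\n' "$plain" '---' "$nocase"
```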
    

    Inverse matching (print the lines that do not match)

    [root@localhost ~]# grep -vn 'the' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    4:this dress doesn't fit me.
    5:However, this dress is about $ 3183 dollars.^M
    6:GNU is free air not free beer.^M
    7:Her hair is very beauty.^M
    9:Oh! The soup taste good.^M
    10:motorcycle is cheap than car.
    11:This window is clear.
    13:Oh!     My god!
    14:The gd software is a library for drafting programs.^M
    17:I like dog.
    19:goooooogle yes!
    20:go! go! Let's go.
    21:# I am VBird
    22:
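
With -v, grep reports exactly the lines that the plain search leaves out. The sketch below checks this on a small made-up input (the sample() helper is hypothetical) by counting both sides with -c.

```shell
# Hypothetical sample input.
sample() {
    printf '%s\n' 'the cat' 'The dog' 'other' 'nothing'
}

# -v prints the lines that do NOT contain 'the'.
inverse=$(sample | grep -vn 'the')

# -c counts lines instead of printing them.
matched=$(sample | grep -c 'the')
unmatched=$(sample | grep -vc 'the')

printf '%s\n' "$inverse"
printf 'matched=%s unmatched=%s\n' "$matched" "$unmatched"
```

The two counts always add up to the number of input lines, since every line either matches or does not.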
    

    Using brackets [] to match a set of characters

    # Search for the two words test and taste
    [root@localhost ~]# grep -n 't[ae]st' regular_express.txt
    8:I can't finish the test.^M
    9:Oh! The soup taste good.^M
    
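    # Require that oo is not preceded by g. Lines 18 and 19 still match,
    # because "tools" and "goooooogle" contain an oo preceded by t and by o.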
    [root@localhost ~]# grep -n '[^g]oo' regular_express.txt
    2:apple is my favorite food.
    3:Football game is not use feet only.
    18:google is the best tools for search keyword.
    19:goooooogle yes!
    
    # Require that oo is not preceded by a lowercase letter
    [root@localhost ~]# grep -n '[^a-z]oo' regular_express.txt
    3:Football game is not use feet only.
    

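    When the characters in a set are contiguous, such as uppercase letters, lowercase letters, or digits, the set can be written as a range: [a-z], [A-Z], [0-9].

    The same search can also be written with a character class: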

    [root@localhost ~]# grep -n '[^[:lower:]]oo' regular_express.txt
    3:Football game is not use feet only.
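
A negated set such as [^g] still has to match one character, so '[^g]oo' means "some character other than g, followed by oo". The sketch below illustrates this on made-up input (the sample() helper is hypothetical). LC_ALL=C is set on the assumption that [a-z] should cover only the ASCII lowercase letters.

```shell
# Pin the locale so [a-z] means ASCII lowercase only (an assumption of this sketch).
export LC_ALL=C

# Hypothetical sample input.
sample() {
    printf '%s\n' 'good' 'food' 'goooood' 'Foot' 'oops'
}

# "good" is skipped, but "goooood" matches: an o followed by oo satisfies [^g]oo.
# "oops" is skipped too, because nothing precedes its oo.
not_g=$(sample | grep -n '[^g]oo')

# A range and the equivalent character class give the same result.
not_lower=$(sample | grep -n '[^a-z]oo')
not_lower_class=$(sample | grep -n '[^[:lower:]]oo')

printf '%s\n' "$not_g" '---' "$not_lower" '---' "$not_lower_class"
```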
    

    Line-start and line-end anchors ^ $

    # List only the lines where 'the' appears at the start of the line
    [root@localhost ~]# grep -n '^the' regular_express.txt
    12:the symbol '*' is represented as start.
    
    # List the lines that begin with a lowercase letter
    [root@localhost ~]# grep -n '^[a-z]' regular_express.txt
    2:apple is my favorite food.
    4:this dress doesn't fit me.
    10:motorcycle is cheap than car.
    12:the symbol '*' is represented as start.
    18:google is the best tools for search keyword.
    19:goooooogle yes!
    20:go! go! Let's go.
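    # The command above can also be written with a character class
    grep -n '^[[:lower:]]' regular_express.txt
    # By contrast, '^[^a-zA-Z]' lists the lines that do NOT begin with an English letter
    grep -n '^[^a-zA-Z]' regular_express.txt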
    

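    The ^ symbol means different things inside and outside a bracket expression []: as the first character inside [] it negates the set, while outside [] it anchors the match to the start of the line.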
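
The two roles of ^ can be compared side by side. This is a minimal sketch on made-up input: the sample() helper is hypothetical, and LC_ALL=C is an assumption so that the ranges cover ASCII letters only.

```shell
# Pin the locale so the ranges mean ASCII letters only (an assumption of this sketch).
export LC_ALL=C

# Hypothetical sample input.
sample() {
    printf '%s\n' 'the end' 'The end' '1st place' 'on the way'
}

# ^ outside brackets: anchor to the start of the line.
at_start=$(sample | grep -n '^the')

# Anchor plus a set: the first character must be a lowercase letter.
lower_start=$(sample | grep -n '^[a-z]')

# Anchor plus a negated set: the first character must not be a letter.
non_letter_start=$(sample | grep -n '^[^a-zA-Z]')

printf '%s\n' "$at_start" '---' "$lower_start" '---' "$non_letter_start"
```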

    Lines that end with a period (.); the period is escaped as \. because a bare . matches any character

    [root@localhost ~]# grep -n '\.$' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    4:this dress doesn't fit me.
    10:motorcycle is cheap than car.
    11:This window is clear.
    12:the symbol '*' is represented as start.
    15:You are the best is mean you are the no. 1.
    16:The world <Happy> is the same with "glad".
    17:I like dog.
    18:google is the best tools for search keyword.
    20:go! go! Let's go.
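
Lines 5, 6, 7, 9 and 14 also end with a period on screen, yet they are missing from the output: in the file their last character is the carriage return shown as ^M, not the period. The sketch below reproduces this with made-up input (the sample() helper is hypothetical) and shows one possible workaround, deleting the carriage returns with tr before searching.

```shell
# Hypothetical sample input: the second line has a DOS-style CR LF ending.
sample() {
    printf 'unix line.\n'
    printf 'dos line.\r\n'
    printf 'no period\n'
}

# The CR sits between the period and the end of the line, so '\.$' misses line 2.
with_cr=$(sample | grep -n '\.$')

# Deleting every CR first lets both period-terminated lines match.
without_cr=$(sample | tr -d '\r' | grep -n '\.$')

printf '%s\n' "$with_cr" '---' "$without_cr"
```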
    

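    Any single character . and the repetition character *

    • . (period): matches exactly one arbitrary character;
    • * (asterisk): repeats the preceding character zero or more times; it only has meaning in combination with the character before it
    # Find strings of the form g??d (g, any two characters, then d)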
    [root@localhost ~]# grep -n 'g..d' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    9:Oh! The soup taste good.^M
    16:The world <Happy> is the same with "glad".
    
    # Find strings with at least two consecutive o characters (ooo* is two o followed by zero or more o)
    [root@localhost ~]# grep -n 'ooo*' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    9:Oh! The soup taste good.^M
    18:google is the best tools for search keyword.
    19:goooooogle yes!
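
Because * also allows zero repetitions, a pattern like 'o*' on its own matches every line, which is why "at least one o" has to be written as 'oo*'. A short sketch on made-up input (the sample() helper is hypothetical):

```shell
# Hypothetical sample input.
sample() {
    printf '%s\n' 'gd' 'god' 'good' 'glad' 'gooood'
}

# g, exactly two arbitrary characters, then d.
two_any=$(sample | grep -n 'g..d')

# g, one o, zero or more further o, then d: at least one o in between.
one_or_more_o=$(sample | grep -n 'goo*d')

# g, zero or more o, then d: "gd" qualifies as well.
zero_or_more_o=$(sample | grep -n 'go*d')

# 'o*' can match the empty string, so every line counts as a match.
every_line=$(sample | grep -c 'o*')

printf '%s\n' "$two_any" '---' "$one_or_more_o" '---' "$zero_or_more_o" '---' "$every_line"
```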
    

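    Limiting the repetition count with {}
    To restrict how many times the preceding character repeats, use an interval {}.
    Note: grep uses basic regular expressions (BRE) by default, where the interval is written \{ and \}; an unescaped { or } is treated as a literal character. With grep -E (extended regular expressions) the interval is written {n,m} without backslashes.

    # Find strings with two consecutive o characters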
    [root@localhost ~]# grep -n 'o\{2\}' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    9:Oh! The soup taste good.^M
    18:google is the best tools for search keyword.
    19:goooooogle yes!
    
    # Find g followed by 2 to 5 o characters and then another g
    [root@localhost ~]# grep -n 'go\{2,5\}g' regular_express.txt
    18:google is the best tools for search keyword.
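
The interval forms can be compared on a small input. This is a sketch with made-up data (the sample() helper is hypothetical); it also shows the same search in extended syntax with grep -E, where the braces are written without backslashes.

```shell
# Hypothetical sample input: 1, 2, 3 and 6 o characters between two g.
sample() {
    printf '%s\n' 'gog' 'goog' 'gooog' 'goooooog'
}

# Basic regular expression: the interval is written \{2,5\}.
bre=$(sample | grep -n 'go\{2,5\}g')

# Extended regular expression (-E): the same interval without backslashes.
ere=$(sample | grep -En 'go{2,5}g')

# Exactly two o between the two g.
exact=$(sample | grep -n 'go\{2\}g')

# Two or more o between the two g.
at_least=$(sample | grep -n 'go\{2,\}g')

printf '%s\n' "$bre" '---' "$ere" '---' "$exact" '---' "$at_least"
```

Anchoring the interval between the two g characters matters: an unanchored 'o\{2\}' matches any line containing two consecutive o, including lines with longer runs.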
    

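    Summary of basic regular expression characters

    RE character   Meaning
    ^word          The search string (word) is at the start of the line
    word$          The search string (word) is at the end of the line
    .              Exactly one arbitrary character
    \              Escape character: removes the special meaning of the special symbol that follows
    *              Zero or more repetitions of the preceding RE character
    [list]         Character set: lists the characters to match
    [n1-n2]        Character set: gives the range of characters to match
    [^list]        Character set: lists the characters or ranges to exclude
    \{n,m\}        n to m consecutive occurrences of the preceding RE character; \{n\} means exactly n, and \{n,\} means n or more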
  • Original article: https://blog.csdn.net/qq_32702685/article/details/127916532