• jq命令用法总结


    原创:扣钉日记(微信公众号ID:codelogs),欢迎分享,转载请保留出处。

    简介#

    如果说要给Linux文本三剑客(grep、sed、awk)添加一员的话,我觉得应该是jq命令,因为jq命令是用来处理json数据的工具,而现如今json几乎无所不在!

    网上的jq命令分享文章也不少,但大多介绍得非常浅,jq的强大之处完全没有介绍出来,所以就有了这篇文章,安利一下jq这个命令。

    基本用法#

    格式化#

    # jq默认的格式化输出
    $ echo -n '{"id":1, "name":"zhangsan", "score":[75, 85, 90]}'|jq .
    {
      "id": 1,
      "name": "zhangsan",
      "score": [
        75,
        85,
        90
      ]
    }
    
    # -c选项则是压缩到1行输出
    $ jq -c . <<eof
    {
      "id": 1,
      "name": "zhangsan",
      "score": [
        75,
        85,
        90
      ]
    }
    eof
    {"id":1,"name":"zhangsan","score":[75,85,90]}
    

    属性提取#

    # 获取id字段
    $ echo -n '{"id":1, "name":"zhangsan", "score":[75, 85, 90]}'|jq '.id'
    1
    # 获取name字段
    $ echo -n '{"id":1, "name":"zhangsan", "score":[75, 85, 90]}'|jq '.name'
    "zhangsan"
    
    # 获取name字段,-r 解开字符串引号
    $ echo -n '{"id":1, "name":"zhangsan", "score":[75, 85, 90]}'|jq -r '.name'
    zhangsan
    
    # 多层属性值获取
    $ echo -n '{"id":1, "name":"zhangsan", "attr":{"height":1.78,"weight":"60kg"}}'|jq '.attr.height'
    1.78
    
    # 获取数组中的值
    $ echo -n '{"id":1, "name":"zhangsan", "score":[75, 85, 90]}'|jq -r '.score[0]'
    75
    
    $ echo -n '[75, 85, 90]'|jq -r '.[0]'
    75
    
    # 数组截取
    $ echo -n '[75, 85, 90]'|jq -r '.[1:3]'
    [
      85,
      90
    ]
    
    # []展开数组
    $ echo -n '[75, 85, 90]'|jq '.[]'
    75
    85
    90
    
    # ..展开所有结构
    $ echo -n '{"id":1, "name":"zhangsan", "score":[75, 85, 90]}'|jq -c '..'
    {"id":1,"name":"zhangsan","score":[75,85,90]}
    1
    "zhangsan"
    [75,85,90]
    75
    85
    90
    
    # 从非对象类型中提取字段,会报错
    $ echo -n '{"id":1, "name":"zhangsan", "attr":{"height":1.78,"weight":"60kg"}}'|jq '.name.alias'
    jq: error (at <stdin>:0): Cannot index string with string "alias"
    
    # 使用?号可以避免这种报错
    $ echo -n '{"id":1, "name":"zhangsan", "attr":{"height":1.78,"weight":"60kg"}}'|jq '.name.alias?'
    
    # //符号用于,当前面的表达式取不到值时,执行后面的表达式
    $ echo -n '{"id":1, "name":"zhangsan", "attr":{"height":1.78,"weight":"60kg"}}'|jq '.alias//.name'
    "zhangsan"
    

    管道、逗号与括号#

    # 管道可以将值从前一个命令传送到后一个命令
    $ echo -n '{"id":1, "name":"zhangsan", "attr":{"height":1.78,"weight":"60kg"}}'|jq '.attr|.height'
    1.78
    
    # jq中做一些基础运算也是可以的
    $ echo -n '{"id":1, "name":"zhangsan", "attr":{"height":1.78,"weight":"60kg"}}'|jq '.attr|.height*100|tostring + "cm"'
    "178cm"
    
    # 逗号使得可以执行多个jq表达式,使得一个输入可计算出多个输出结果
    $ echo 1 | jq '., ., .'
    1
    1
    1
    
    # 括号用于提升表达式的优先级,如下:逗号优先级低于算术运算
    $ $ echo '1'|jq '.+1, .*2'
    2
    2
    
    $ echo '1'|jq '(.+1, .)*2'
    4
    2
    
    # 管道优先级低于逗号
    $ echo '1'|jq '., .|tostring'
    "1"
    "1"
    
    $ echo '1'|jq '., (.|tostring)'
    1
    "1"
    

    理解jq执行过程#

    表面上jq是用来处理json数据的,但实际上jq能处理的是任何json基础元素所形成的流,如integer、string、bool、null、object、array等,jq执行过程大致如下:

    1. jq从流中获取一个json元素
    2. jq执行表达式,表达式生成新的json元素
    3. jq将新的json元素打印输出

    可以看看这些示例,如下:

    # 这里jq实际上将1 2 3 4当作4个integer元素,每找到一个元素就执行+1操作
    # jq实际上是流式处理的,1 2 3 4可以看成流中的4个元素
    $ echo '1 2 3 4'|jq '. + 1'
    2
    3
    4
    5
    
    # 流中的元素不需要是同种类型,只要是完整的json元素即可
    $ jq '"<" + tostring + ">"' <<eof
    1
    "zhangsan"
    true
    {"id":1}
    [75, 80, 85]
    eof
    
    "<1>"
    "<zhangsan>"
    "<true>"
    "<{\"id\":1}>"
    "<[75,80,85]>"
    
    # -R选项可用于将读取到的json元素,都当作字符串对待
    $ seq 4|jq -R '.'
    "1"
    "2"
    "3"
    "4"
    
    # -s选项将从流中读取到的所有json元素,变成一个json数组元素  
    # 这里理解为jq从流中只取到了1个json元素,这个json元素的类型是数组
    $ seq 4|jq -s .
    [
      1,
      2,
      3,
      4
    ]
    

    基础运算#

    jq支持 + - * / % 运算,对于+号,如果是字符串类型,则是做字符串拼接,如下:

    # 做加减乘除运算
    $ echo 1|jq '.+1, .-1, .*2, ./2, .%2'
    2
    0
    2
    0.5
    1
    
    # 赋值运算
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq '.id=2' -c
    {"id":2,"name":"zhangsan","age":"17","score":"75"}
    

    数据构造#

    jq可以很方便的将其它数据,转化为json对象或数组,如下:

    # 使用[]构造数组元素,-n告诉jq没有输入数据,直接执行表达式并生成输出数据
    $ jq -n '[1,2,3,4]' -c
    [1,2,3,4]
    
    $ cat data.txt
    id  name      age  score
    1   zhangsan  17   75
    2   lisi      16   80
    3   wangwu    18   85
    4   zhaoliu   18   90
    
    # 每行分割成数组,[]构造新的数组输出
    $ tail -n+2 data.txt|jq -R '[splits("\\s+")]' -c
    ["1","zhangsan","17","75"]
    ["2","lisi","16","80"]
    ["3","wangwu","18","85"]
    ["4","zhaoliu","18","90"]
    
    $ jq -n '{id:1, name:"zhangsan"}' -c
    {"id":1,"name":"zhangsan"}
    
    # 每行转换为对象,{}构造新的对象格式输出
    $ tail -n+2 data.txt|jq -R '[splits("\\s+")] | {id:.[0]|tonumber, name:.[1], age:.[2], score:.[3]}' -c
    {"id":1,"name":"zhangsan","age":"17","score":"75"}
    {"id":2,"name":"lisi","age":"16","score":"80"}
    {"id":3,"name":"wangwu","age":"18","score":"85"}
    {"id":4,"name":"zhaoliu","age":"18","score":"90"}
    
    # \()字符串占位变量替换
    $ cat data.json
    {"id":1,"name":"zhangsan","age":"17","score":"75"}
    {"id":2,"name":"lisi","age":"16","score":"80"}
    {"id":3,"name":"wangwu","age":"18","score":"85"}
    {"id":4,"name":"zhaoliu","age":"18","score":"90"}
    
    $ cat data.json |jq '"id:\(.id),name:\(.name),age:\(.age),score:\(.score)"' -r
    id:1,name:zhangsan,age:17,score:75
    id:2,name:lisi,age:16,score:80
    id:3,name:wangwu,age:18,score:85
    id:4,name:zhaoliu,age:18,score:90
    

    基础函数#

    # has函数,检测对象是否包含key
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'has("id")'
    true
    
    # del函数,删除某个属性
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'del(.id)' -c
    {"name":"zhangsan","age":"17","score":"75"}
    
    # map函数,对数组中每个元素执行表达式计算,计算结果组织成新数组
    $ seq 4|jq -s 'map(. * 2)' -c
    [2,4,6,8]
    
    # 上面map函数写法,其实等价于这个写法
    $ seq 4|jq -s '[.[]|.*2]' -c
    [2,4,6,8]
    
    # keys函数,列出对象属性
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'keys' -c
    ["age","id","name","score"]
    
    # to_entries函数,列出对象键值对
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'to_entries' -c
    [{"key":"id","value":1},{"key":"name","value":"zhangsan"},{"key":"age","value":"17"},{"key":"score","value":"75"}]
    
    # length函数,计算数组或字符串长度
    $ jq -n '[1,2,3,4]|length'
    4
    
    # add函数,计算数组中数值之和
    $ seq 4|jq -s 'add'
    10
    
    # tostring与tonumber,类型转换
    $ seq 4|jq 'tostring|tonumber'
    1
    2
    3
    4
    
    # type函数,获取元素类型
    $ jq 'type' <<eof
    1
    "zhangsan"
    true
    null
    {"id":1}
    [75, 80, 85]
    eof
    
    "number"
    "string"
    "boolean"
    "null"
    "object"
    "array"
    

    过滤、排序、分组函数#

    $ cat data.json
    {"id":1,"name":"zhangsan","sex": 0, "age":"17","score":"75"}
    {"id":2,"name":"lisi","sex": 1, "age":"16","score":"80"}
    {"id":3,"name":"wangwu","sex": 0, "age":"18","score":"85"}
    {"id":4,"name":"zhaoliu","sex": 0, "age":"18","score":"90"}
    
    # select函数用于过滤,类似SQL中的where
    $ cat data.json |jq 'select( (.id>1) and (.age|IN("16","17","18")) and (.name != "lisi") or (has("attr")|not) and (.score|tonumber >= 90) )' -c
    {"id":3,"name":"wangwu","sex":0,"age":"18","score":"85"}
    {"id":4,"name":"zhaoliu","sex":0,"age":"18","score":"90"}
    
    # 有一些简化的过滤函数,如arrays, objects, iterables, booleans, numbers, normals, finites, strings, nulls, values, scalars
    # 它们根据类型过滤,如objects过滤出对象,values过滤出非null值等
    $ jq -c 'objects' <<eof
    1
    "zhangsan"
    true
    null
    {"id":1}
    [75, 80, 85]
    eof
    
    {"id":1}
    
    $ jq -c 'values' <<eof
    1
    "zhangsan"
    true
    null
    {"id":1}
    [75, 80, 85]
    eof
    
    1
    "zhangsan"
    true
    {"id":1}
    [75,80,85]
    
    # 选择出id与name字段,类似SQL中的select id,name
    $ cat data.json|jq -s 'map({id,name})[]' -c
    {"id":1,"name":"zhangsan"}
    {"id":2,"name":"lisi"}
    {"id":3,"name":"wangwu"}
    {"id":4,"name":"zhaoliu"}
    
    # 提取前2行,类似SQL中的limit 2
    $ cat data.json|jq -s 'limit(2; map({id,name})[])' -c
    {"id":1,"name":"zhangsan"}
    {"id":2,"name":"lisi"}
    
    # 按照age、id排序,类似SQL中的order by age,id
    $ cat data.json|jq -s 'sort_by((.age|tonumber), .id)[]' -c
    {"id":2,"name":"lisi","sex":1,"age":"16","score":"80"}
    {"id":1,"name":"zhangsan","sex":0,"age":"17","score":"75"}
    {"id":3,"name":"wangwu","sex":0,"age":"18","score":"85"}
    {"id":4,"name":"zhaoliu","sex":0,"age":"18","score":"90"}
    
    
    # 根据sex与age分组,并每组聚合计算count(*)、avg(score)、max(id)
    $ cat data.json |jq -s 'group_by(.sex, .age)[]' -c
    [{"id":1,"name":"zhangsan","sex":0,"age":"17","score":"75"}]
    [{"id":3,"name":"wangwu","sex":0,"age":"18","score":"85"},{"id":4,"name":"zhaoliu","sex":0,"age":"18","score":"90"}]
    [{"id":2,"name":"lisi","sex":1,"age":"16","score":"80"}]
    
    $ cat data.json |jq -s 'group_by(.sex, .age)[]|{sex:.[0].sex, age:.[0].age, count:length, avg_score:map(.score|tonumber)|(add/length), scores:map(.score)|join(","), max_id:map(.id)|max }' -c                 
    {"sex":0,"age":"17","count":1,"avg_score":75,"scores":"75","max_id":1}
    {"sex":0,"age":"18","count":2,"avg_score":87.5,"scores":"85,90","max_id":4}
    {"sex":1,"age":"16","count":1,"avg_score":80,"scores":"80","max_id":2}
    
    

    字符串操作函数#

    # contains函数,判断是否包含,实际也可用于判断数组是否包含某个元素
    $ echo hello | jq -R 'contains("he")'
    true
    
    # 判断是否以he开头
    $ echo hello | jq -R 'startswith("he")'
    true
    
    # 判断是否以llo结尾
    $ echo hello | jq -R 'endswith("llo")'
    true
    
    # 去掉起始空格
    $ echo ' hello '|jq -R 'ltrimstr(" ")|rtrimstr(" ")'
    "hello"
    
    # 大小写转换
    $ echo hello|jq -R 'ascii_upcase'
    "HELLO"
    
    $ echo HELLO|jq -R 'ascii_downcase'
    "hello"
    
    # 字符串数组,通过逗号拼接成一个字符串
    $ seq 4|jq -s 'map(tostring)|join(",")'
    "1,2,3,4"
    
    # json字符串转换为json对象
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","attr":"{\"weight\":56,\"height\":178}"}'|jq '.attr = (.attr|fromjson)' -c
    {"id":1,"name":"zhangsan","age":"17","attr":{"weight":56,"height":178}}
    
    # json对象转换为json字符串
    $ echo -n '{"id":1,"name":"zhangsan","age":"17","attr":{"weight":56,"height":178}}'|jq '.attr = (.attr|tojson)'
    {
      "id": 1,
      "name": "zhangsan",
      "age": "17",
      "attr": "{\"weight\":56,\"height\":178}"
    }
    
    $ cat data.txt
    id:1,name:zhangsan,age:17,score:75
    id:2,name:lisi,age:16,score:80
    id:3,name:wangwu,age:18,score:85
    id:4,name:zhaoliu,age:18,score:90
    
    # 正则表达式过滤,jq使用的是PCRE
    $ cat data.txt|jq -R 'select(test("id:\\d+,name:\\w+,age:\\d+,score:8\\d+"))' -r
    id:2,name:lisi,age:16,score:80
    id:3,name:wangwu,age:18,score:85
    
    # 正则拆分字符串
    $ cat data.txt|jq -R '[splits(",")]' -cr
    ["id:1","name:zhangsan","age:17","score:75"]
    ["id:2","name:lisi","age:16","score:80"]
    ["id:3","name:wangwu","age:18","score:85"]
    ["id:4","name:zhaoliu","age:18","score:90"]
    
    # 正则替换字符串
    $ cat data.txt |jq -R 'gsub("name"; "nick")' -r
    id:1,nick:zhangsan,age:17,score:75
    id:2,nick:lisi,age:16,score:80
    id:3,nick:wangwu,age:18,score:85
    id:4,nick:zhaoliu,age:18,score:90
    
    # 正则表达式捕获数据
    $ cat data.txt|jq -R 'match("id:(?<id>\\d+),name:(?<name>\\w+),age:\\d+,score:8\\d+")' -cr
    {"offset":0,"length":30,"string":"id:2,name:lisi,age:16,score:80","captures":[{"offset":3,"length":1,"string":"2","name":"id"},{"offset":10,"length":4,"string":"lisi","name":"name"}]}
    {"offset":0,"length":32,"string":"id:3,name:wangwu,age:18,score:85","captures":[{"offset":3,"length":1,"string":"3","name":"id"},{"offset":10,"length":6,"string":"wangwu","name":"name"}]}
    
    # capture命名捕获,生成key是捕获组名称,value是捕获值的对象
    $ cat data.txt|jq -R 'capture("id:(?<id>\\d+),name:(?<name>\\w+),age:\\d+,score:8\\d+")' -rc
    {"id":"2","name":"lisi"}
    {"id":"3","name":"wangwu"}
    
    # 正则扫描输入字符串
    $ cat data.txt|jq -R '[scan("\\w+:\\w+")]' -rc
    ["id:1","name:zhangsan","age:17","score:75"]
    ["id:2","name:lisi","age:16","score:80"]
    ["id:3","name:wangwu","age:18","score:85"]
    ["id:4","name:zhaoliu","age:18","score:90"]
    

    日期函数#

    # 当前时间缀
    $ jq -n 'now'
    1653820640.939947
    
    # 将时间缀转换为0时区的分解时间(broken down time),形式为 年 月 日 时 分 秒 dayOfWeek dayOfYear
    $ jq -n 'now|gmtime' -c
    [2022,4,29,10,45,5.466768980026245,0,148]
    
    # 将时间缀转换为本地时区的分解时间(broken down time)
    $ jq -n 'now|localtime' -c
    [2022,4,29,18,46,5.386353015899658,0,148]
    
    # 分解时间转换为时间串
    $ jq -n 'now|localtime|strftime("%Y-%m-%dT%H:%M:%S")' -c
    "2022-05-29T18:50:33"
    
    # 与上面等效
    $ jq -n 'now|strflocaltime("%Y-%m-%dT%H:%M:%SZ")'
    "2022-05-29T19:00:40Z"
    
    # 时间串解析为分解时间
    $ date +%FT%T|jq -R 'strptime("%Y-%m-%dT%H:%M:%S")' -c
    [2022,4,29,18,51,27,0,148]
    
    # 分解时间转换为时间缀
    $ date +%FT%T|jq -R 'strptime("%Y-%m-%dT%H:%M:%S")|mktime'
    1653850310
    

    高级用法#

    实际上jq是一门脚本语言,它也支持变量、分支结构、循环结构与自定义函数,如下:

    $ cat data.json
    {"id":1,"name":"zhangsan","sex": 0, "age":"17","score":"75"}
    {"id":2,"name":"lisi","sex": 1, "age":"16","score":"80"}
    {"id":3,"name":"wangwu","sex": 0, "age":"18","score":"85"}
    {"id":4,"name":"zhaoliu","sex": 0, "age":"18","score":"90"}
    
    # 单变量定义
    $ cat data.json| jq '.id as $id|$id'
    1
    2
    3
    4
    
    # 对象展开式变量定义
    $ cat data.json |jq '. as {id:$id,name:$name}|"id:\($id),name:\($name)"'
    "id:1,name:zhangsan"
    "id:2,name:lisi"
    "id:3,name:wangwu"
    "id:4,name:zhaoliu"
    
    $ cat data.json
    ["1","zhangsan","17","75"]
    ["2","lisi","16","80"]
    ["3","wangwu","18","85"]
    ["4","zhaoliu","18","90"]
    
    # 数组展开式变量定义
    $ cat data.json|jq '. as [$id,$name]|"id:\($id),name:\($name)"'
    "id:1,name:zhangsan"
    "id:2,name:lisi"
    "id:3,name:wangwu"
    "id:4,name:zhaoliu"
    
    # 分支结构
    $ cat data.json|jq '. as [$id,$name]|if ($id>"1") then "id:\($id),name:\($name)" else empty end'
    "id:2,name:lisi"
    "id:3,name:wangwu"
    "id:4,name:zhaoliu"
    
    # 循环结构,第一个表达式条件满足时,执行只每二个表达式
    # 循环结构除了while,还有until、recurse等
    $ echo 1|jq 'while(.<100; .*2)'
    1
    2
    4
    8
    16
    32
    64
    
    # 自定义计算3次方的函数
    $ echo 2|jq 'def cube: .*.*. ; cube'
    8
    

    由于这些高级特性并不常用,这里仅给出了一些简单示例,详细使用可以man jq查看。

    辅助shell编程#

    熟悉shell脚本编程的同学都知道,shell本身是没有提供Map、List这种数据结构的,这导致使用shell实现某些功能时,变得很棘手。

    但jq本身是处理json的,而json中的对象就可等同于Map,json中的数组就可等同于List,如下:

    list='[]';
    #List添加元素
    list=$(echo "$list"|jq '. + [ $val ]' --arg val java);
    list=$(echo "$list"|jq '. + [ $val ]' --arg val shell);
    #获取List大小
    echo "$list"|jq '.|length'
    #获取List第1个元素
    echo "$list"|jq '.[0]' -r
    # List是否包含java字符串
    echo "$list"|jq 'any(.=="java")'
    #删除List第1个元素
    list=$(echo "$list"|jq 'del(.[0])');
    # List合并
    list=$(echo "$list"|jq '. + $val' --argjson val '["shell","python"]');
    # List截取
    echo "$list"|jq '.[1:3]'
    # List遍历
    for o in $(echo "$list" | jq -r '.[]');do 
        echo "$o"; 
    done
    
    map='{}';
    #Map添加元素
    map=$(echo "$map"|jq '.id=$val' --argjson val 1)
    map=$(echo "$map"|jq '.courses=$val' --argjson val "$list")
    #获取Map大小
    echo "$map"|jq '.|length'
    #获取Map指定key的值
    echo "$map"|jq '.id' -r
    #判断Map指定key是否存在
    echo "$map" | jq 'has("id")'
    #删除Map指定key
    map=$(echo "$map"|jq 'del(.id)')
    # Map合并
    map=$(echo "$map"|jq '. + $val' --argjson val '{"code":"ID001","name":"hello"}')
    # Map的KeySet遍历
    for key in $(echo "$map" | jq -r 'keys[]'); do 
        value=$(jq '.[$a]' --arg a "$key" -r <<<"$map"); 
        printf "%s:%s\n" "$key" "$value"; 
    done
    # Map的entrySet遍历
    while read -r line; do 
        key=$(jq '.key' -r <<<"$line"); 
        value=$(jq '.value' -r <<<"$line"); 
        printf "%s:%s\n" "$key" "$value"; 
    done <<<$(echo "$map" | jq 'to_entries[]' -c)
    

    总结#

    可以发现,jq已经实现了json数据处理与分析的方方面面,我个人最近在工作中,也多次使用jq来分析调用日志等,用起来确实非常方便。

    如果你现在还没完全学会jq的用法,没关系,建议先收藏起来,后面一定会用得到的!

    往期内容#

    密码学入门
    q命令-用SQL分析文本文件
    神秘的backlog参数与TCP连接队列
    mysql的timestamp会存在时区问题?
    真正理解可重复读事务隔离级别
    字符编码解惑

  • 相关阅读:
    TypeScript 基础教程,适合新手
    SpringCloud 10 Hystrix 服务降级
    Java面试ZooKeeper面试题汇总及答案
    C#中的委托、事件与接口
    Intellij Debugger slow: Method breakpoints may dramatically slow down debugging
    Learning Open-World Object Proposals without Learning to Classify(论文解析)
    如何使用Java创建数据透视表并导出为PDF
    vue2通信方法(最全)
    回归(Regression)
    HDI的盲孔设计,你注意到这个细节了吗?
  • 原文地址:https://www.cnblogs.com/codelogs/p/16324928.html