• python对dataframe中series的json格式解析


    方法1:如果df里只有一列json格式,可以保存为txt,然后再删掉列名,在进行处理。

    import pandas as pd
    
    result = []
    with open(r"C:\Users\Administrator\Desktop\json处理.txt") as f:
        for line in f:
            print(line) # txt样子
            line = line.replace('""','"')
            line = line.replace('"{','{')
            line = line.replace('}"','}')
    #         print(line) # 处理后的line
            dict_o = eval(line.strip()) #eval:将字符串字典转换为字典,strip:将某行字的两端空格去掉
            result.append(dict_o)
    df = pd.DataFrame(result)
    df
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14

    txt 内容


    {""imei"":""5gy5"",""deviceid"":""c5ty7890h"",""accid"":""null""}
    
    {"imei":"5gy5","deviceid":"c5ty7890h","accid":"null"}
    
    {""imei"":""5rrr5"",""createtime"":""c5ty7890h"",""accid"":""null""}
    
    {"imei":"5rrr5","createtime":"c5ty7890h","accid":"null"}
    
    {""appid"":""5eeey5"",""deviceid"":""c5ty7890h"",""createtime"":""c5ty7890h"",""accid"":""null""}
    
    {"appid":"5eeey5","deviceid":"c5ty7890h","createtime":"c5ty7890h","accid":"null"}
    
    {""imei"":""null"",""deviceid"":""c544ty7890h"",""accid"":""null""}
    
    {"imei":"null","deviceid":"c544ty7890h","accid":"null"}
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15

    结果输出:

    imeideviceidaccidcreatetimeappid
    05gy5c5ty7890hnullNaNNaN
    15gy5c5ty7890hnullNaNNaN
    25rrr5NaNnullc5ty7890hNaN
    35rrr5NaNnullc5ty7890hNaN
    4NaNc5ty7890hnullc5ty7890h5eeey5
    5NaNc5ty7890hnullc5ty7890h5eeey5
    6nullc544ty7890hnullNaNNaN
    7nullc544ty7890hnullNaNNaN

    方法2

    import json
    import pandas as pd
    df = pd.read_excel(r"C:\Users\Administrator\Desktop\json处理.xlsx")
    print(df)
    data = df.to_dict(orient='records')
    for _ in data:
        _.update(json.loads(_['json']))
        del _['json']
    
    df1 = pd.DataFrame(data)
    df1
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11

    原表

        城市                                               json
    0   重庆  {"imei":"5gy5","deviceid":"c5ty7890h","accid":...
    1   北京  {"imei":"5gy5","deviceid":"c5ty7890h","accid":...
    2   上海  {"imei":"5rrr5","createtime":"c5ty7890h","acci...
    3   广州  {"imei":"5rrr5","createtime":"c5ty7890h","acci...
    4   深圳  {"appid":"5eeey5","deviceid":"c5ty7890h","crea...
    5   地球  {"appid":"5eeey5","deviceid":"c5ty7890h","crea...
    6   火星  {"imei":"null","deviceid":"c544ty7890h","accid...
    7  太阳星  {"imei":"null","deviceid":"c544ty7890h","accid...
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9

    结果输出:

    城市imeideviceidaccidcreatetimeappid
    0重庆5gy5c5ty7890hnullNaNNaN
    1北京5gy5c5ty7890hnullNaNNaN
    2上海5rrr5NaNnullc5ty7890hNaN
    3广州5rrr5NaNnullc5ty7890hNaN
    4深圳NaNc5ty7890hnullc5ty7890h5eeey5
    5地球NaNc5ty7890hnullc5ty7890h5eeey5
    6火星nullc544ty7890hnullNaNNaN
    7太阳星nullc544ty7890hnullNaNNaN

    方法3,取json的固定字段

    import json
    import pandas as pd
    df = pd.read_excel(r"C:\Users\Administrator\Desktop\json处理.xlsx")
    print(df)
    tmp=[]
    for line in df.json.values:
    #     d=eval(line)[0] 
        d=eval(line)
        tmp.append(d.get('imei') or d.get('deviceid'))
        # tmp.append(d.get('xx')['yy']) #如果是嵌套字典,就用此方法
    df['结果列']=tmp
    df
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12

    原表

        城市                                               json
    0   重庆  {"imei":"5gy5","deviceid":"c5ty7890h","accid":...
    1   北京  {"imei":"5gy5","deviceid":"c5ty7890h","accid":...
    2   上海  {"imei":"5rrr5","createtime":"c5ty7890h","acci...
    3   广州  {"imei":"5rrr5","createtime":"c5ty7890h","acci...
    4   深圳  {"appid":"5eeey5","deviceid":"c5ty7890h","crea...
    5   地球  {"appid":"5eeey5","deviceid":"c5ty7890h","crea...
    6   火星  {"imei":"null","deviceid":"c544ty7890h","accid...
    7  太阳星  {"imei":"null","deviceid":"c544ty7890h","accid...
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9

    结果输出:

    城市json结果列
    0重庆{"imei":"5gy5","deviceid":"c5ty7890h","accid":...5gy5
    1北京{"imei":"5gy5","deviceid":"c5ty7890h","accid":...5gy5
    2上海{"imei":"5rrr5","createtime":"c5ty7890h","acci...5rrr5
    3广州{"imei":"5rrr5","createtime":"c5ty7890h","acci...5rrr5
    4深圳{"appid":"5eeey5","deviceid":"c5ty7890h","crea...c5ty7890h
    5地球{"appid":"5eeey5","deviceid":"c5ty7890h","crea...c5ty7890h
    6火星{"imei":"null","deviceid":"c544ty7890h","accid...null
    7太阳星{"imei":"null","deviceid":"c544ty7890h","accid...null
  • 相关阅读:
    作为一名软件测试开发岗的面试官,我都是怎么选人的?
    L17.linux命令每日一练 -- 第三章 文件过滤及内容编辑处理命令 -- more和less命令
    JAVA毕业设计购物网站设计计算机源码+lw文档+系统+调试部署+数据库
    Toaster - Android 吐司框架,专治 Toast 各种疑难杂症
    【springboot】1、快速入门
    荧光EEM平滑教程(去除散射)
    Spring Security中文文档
    2020年上半年软件设计师真题
    带你吃透Servlet核心编程下篇(完整图文教程)
    测试/开发程序员的成长路线,全局思考问题的问题......
  • 原文地址:https://blog.csdn.net/qq_45396577/article/details/125540216