• Reading a CSV file in segments and visualizing it


    1. Data

    The data is a CSV file with roughly 890,000 records. The field names are as follows:

    Time (sec),Delta Time (sec),Segment ID,Latitude (deg),Longitude (deg),Easting (m),Northing (m),Cross-Track (m),Along-Track (m),Height (m HAE),Height (m MSL),Classification,Signal Confidence,Dem_height,luccID
    32.56000081,78976682.22,161266,29.08135819,113.4059348,734190.7155,3219390.276,-22.7432411,20201.67926,77.74754333,93.76404762,0,0,0,0
    32.56000081,78976682.22,161266,29.08135858,113.4059346,734190.7011,3219390.318,-22.7526257,20201.72316,91.41132355,107.4278278,0,0,0,0
    32.56000081,78976682.22,161266,29.08135802,113.4059348,734190.722,3219390.257,-22.7389931,20201.65946,71.5628891,87.57939339,0,0,0,0
    32.56000081,78976682.22,161266,29.08135773,113.4059349,734190.733,3219390.224,-22.7318915,20201.62621,61.22312164,77.23962593,0,0,0,0
    32.56000081,78976682.22,161266,29.08135811,113.4059348,734190.7185,3219390.267,-22.7412967,20201.67021,74.91683197,90.93333626,0,0,0,0
    32.56000081,78976682.22,161266,29.08135616,113.4059355,734190.7915,3219390.052,-22.6938061,20201.44813,5.772859573,21.78936386,0,0,0,0
    32.56000081,78976682.22,161266,29.08135614,113.4059355,734190.7922,3219390.05,-22.6933262,20201.44587,5.074115753,21.09062004,0,0,0,0
    32.56000081,78976682.22,161266,29.08135651,113.4059354,734190.7784,3219390.091,-22.7023598,20201.48811,18.22643089,34.24293518,0,0,0,0
    32.56000081,78976682.22,161266,29.08135656,113.4059354,734190.7765,3219390.096,-22.7035772,20201.49374,19.99853134,36.01503563,0,0,0,0
    32.56000081,78976682.22,161266,29.08135584,113.4059356,734190.8036,3219390.016,-22.6859499,20201.41132,-5.66560459,10.3508997,0,0,0,0
    32.56010081,78976682.22,161266,29.08136468,113.405934,734190.6294,3219390.994,-22.7452324,20202.4024,80.65776825,96.67429392,0,0,0,0

    2. Reading the data

    2.1 Reading the full dataset and displaying it

    import pandas as pd
    import matplotlib.pyplot as plt

    inputpath = r"E:\csv_dbscan\ATL03_20200703015729_01180802_005_01_gt1r.csv"
    # Load the whole CSV file into memory at once
    df = pd.read_csv(inputpath)
    X = df['Time (sec)']
    Y = df['Height (m HAE)']
    # Plot height against time; the marker size is kept very small because of the large point count
    plt.figure()
    plt.scatter(X, Y, marker='o', s=0.000003, label='Point Cloud')
    plt.legend()
    plt.show()

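    Zoomed-in view:

    2.2 Reading the data in segments and displaying it

    (1) Compute the number of records per segment from the total record count and the number of segments, then read only one segment at a time.

    Method 1: use the chunksize parameter to read and display the data in segments

    chunksize splits the file by number of rows (records).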

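    segment_count = 10000  # number of records per segment
    # With chunksize set, read_csv returns an iterator of DataFrames instead of one DataFrame
    df_chunker = pd.read_csv(inputpath, chunksize=segment_count)
    for df_item in df_chunker:
        X_seg = df_item['Time (sec)']
        Y_seg = df_item['Height (m HAE)']
        # One figure per segment
        plt.figure()
        plt.scatter(X_seg, Y_seg, marker='o', s=1, label='Point Cloud')
        plt.legend()
        plt.show()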
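To see how chunksize divides the rows without needing the original file, here is a minimal sketch on a small in-memory CSV. The 25 synthetic records and the chunk size of 10 are assumptions made purely for illustration; plotting is left out so that only the chunking behaviour is shown.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for the real file (25 synthetic records).
csv_text = "Time (sec),Height (m HAE)\n" + "\n".join(
    f"{i * 0.1:.1f},{100 + i}" for i in range(25)
)

segment_count = 10  # rows per chunk (illustrative value)
chunk_sizes = []
df_chunker = pd.read_csv(io.StringIO(csv_text), chunksize=segment_count)
for df_item in df_chunker:
    # Each df_item is an ordinary DataFrame holding at most segment_count rows.
    chunk_sizes.append(len(df_item))

print(chunk_sizes)  # [10, 10, 5]
```

The last chunk simply holds whatever rows remain, so it can be shorter than the others.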

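    For example, if each segment is set to segment_count=10000 records, there are segment_n=N/segment_count segments, rounded up when N is not an exact multiple of segment_count (segment_count: records per segment; N: total number of records; segment_n: number of segments).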
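The segment count can be checked with a few lines of arithmetic. This is a sketch only: the total of 890,000 is the approximate figure quoted for this file, and the second total (895,500) is a made-up value chosen to show the rounding.

```python
import math

N = 890_000             # total records (approximate figure; exact count assumed)
segment_count = 10_000  # records per segment

segment_n = math.ceil(N / segment_count)
print(segment_n)  # 89

# With a total that is not an exact multiple, the last segment is shorter.
N2 = 895_500            # hypothetical total, for illustration only
segment_n2 = math.ceil(N2 / segment_count)
last_size = N2 - (segment_n2 - 1) * segment_count
print(segment_n2, last_size)  # 90 5500
```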

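    We read each segment in order and display it:

    Method 2: segment the display by along-track distance

    For example, we can use the Along-Track (m) attribute to display the data in 1000 m segments: the first segment covers 20201.67926 to 21201.67926, and each following segment is shifted by a further 1000.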

    segment_count = 1000  # segment length along the track, in metres
    along = df['Along-Track (m)']  # column name as given in the field list
    start = along.iloc[0]   # along-track distance of the first record
    end = along.iloc[-1]    # along-track distance of the last record
    # Number of 1000 m segments needed to cover the range from the first to the last record
    Along_track_n = int((end - start) / segment_count) + 1
    for len_seg in range(Along_track_n):
        lower = start + len_seg * segment_count
        upper = lower + segment_count
        # Half-open interval [lower, upper) so that a boundary point is not drawn twice
        df_seg = df.loc[(along >= lower) & (along < upper), :]
        X_seg = df_seg['Along-Track (m)']
        Y_seg = df_seg['Height (m HAE)']
        if len(X_seg) == 0:
            print("No data in this segment!")
            continue  # skip the empty figure
        plt.figure()
        plt.scatter(X_seg, Y_seg, marker='o', s=1, label='Point Cloud')
        plt.legend()
        plt.show()
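An alternative to filtering the DataFrame once per segment is to compute a segment index for every record and group on it. The sketch below is not the method used above, and it assumes a tiny synthetic table whose values are invented for illustration; the name seg_index is likewise hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical along-track distances (m) and heights; not taken from the real file.
df = pd.DataFrame({
    "Along-Track (m)": [20000.0, 20750.0, 21200.0, 21300.0, 23500.0],
    "Height (m HAE)": [77.7, 91.4, 71.6, 61.2, 74.9],
})

segment_length = 1000  # metres per segment
start = df["Along-Track (m)"].iloc[0]

# Index of the 1000 m window each record falls into: 0 for [start, start + 1000), and so on.
seg_index = (
    np.floor((df["Along-Track (m)"] - start) / segment_length)
    .astype(int)
    .rename("segment")
)

# groupby only yields windows that contain records, so empty windows never appear.
counts = {int(k): len(g) for k, g in df.groupby(seg_index)}
print(counts)  # {0: 2, 1: 2, 3: 1}
```

Each group g is an ordinary DataFrame, so it could be passed to plt.scatter in the same way as df_seg. Window 2 is absent from the result because no record falls inside it.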

  • Original article: https://blog.csdn.net/soderayer/article/details/126695776