Problem description:
The code runs on ModelArts and uses mindspore.dataset.NumpySlicesDataset.
[Steps & symptoms]
data contains four arrays:
data1(25000000,5,5,3),data2(25000000,5,5,3),data3(25000000,5,5,3),label(25000000)
The following call fails:
dataset = ds.NumpySlicesDataset(data, num_parallel_workers=4,column_names=['data1', 'data2','data3', 'label'], shuffle=False)
Error output:
create_dataset
ds.NumpySlicesDataset
[Modelarts Service Log]2022-05-27 14:09:06,923 - ERROR - proc-rank-0-device-0 (pid: 298) has exited with non-zero code: -9
[Modelarts Service Log]2022-05-27 14:09:06,926 - INFO - Begin destroy training processes
[Modelarts Service Log]2022-05-27 14:09:06,926 - INFO - proc-rank-0-device-0 (pid: 298) has exited
[Modelarts Service Log]2022-05-27 14:09:06,928 - INFO - End destroy training processes
[ModelArts Service Log]modelarts-pipe: total length: 4142
[Modelarts Service Log]Training end with return code: 247
[Modelarts Service Log]Training end at 2022-05-27-14:09:07
[Modelarts Service Log]Training completed.
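What causes this kind of ERROR? Is it because the process ran out of memory?
Answer:
Most likely it ran out of memory. The exit code -9 means the process was killed by signal 9 (SIGKILL), which is consistent with the system killing it for exceeding available memory. Why does this much data need to be loaded at once?
Assuming int32 (4 bytes per element), one array alone takes (25000000*5*5*3) * 4 = 7500000000 bytes ≈ 7 GiB (7.5 GB), and data holds three arrays of that size plus the labels.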
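
The estimate above can be reproduced with a few lines of plain Python. This is only a rough sketch: the shapes are taken from the question, the 4-byte element size is the int32 assumption made in the answer (the real dtype is not stated), and the names SHAPES, ITEMSIZE and column_bytes are made up for this illustration.

```python
from math import prod

# Shapes as given in the question.
SHAPES = {
    "data1": (25_000_000, 5, 5, 3),
    "data2": (25_000_000, 5, 5, 3),
    "data3": (25_000_000, 5, 5, 3),
    "label": (25_000_000,),
}

# Assumption: 4 bytes per element (int32; float32 has the same size).
ITEMSIZE = 4


def column_bytes(shape, itemsize=ITEMSIZE):
    """Bytes needed to hold one dense array of the given shape."""
    return prod(shape) * itemsize


sizes = {name: column_bytes(shape) for name, shape in SHAPES.items()}
total_bytes = sum(sizes.values())

for name, size in sizes.items():
    print(f"{name}: {size:,} bytes = {size / 2**30:.2f} GiB")
print(f"total: {total_bytes:,} bytes = {total_bytes / 2**30:.2f} GiB")
```

This counts a single in-memory copy of each array. If the arrays are actually 8-byte types such as float64 or int64, every figure doubles, and any extra copy made while the dataset is being built would come on top of that.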