
Example of exporting a PyTorch model to ONNX:
torch.onnx.export(
    model,                      # model being run
    onnx_inputs,                # model input (or a tuple for multiple inputs)
    "trt/dense_tnt.onnx",       # where to save the model (can be a file or file-like object)
    export_params=True,         # store the trained parameter weights inside the model file
    verbose=True,               # if True, prints a human-readable representation of the exported graph
    opset_version=11,           # the ONNX opset version to export the model to
    do_constant_folding=False,  # whether to execute constant folding for optimization
    input_names=['vector_data', 'all_polyline_idx', 'vector_idx', 'invalid_idx',
                 'map_polyline_idx', 'traj_polyline_idx', 'cent_polyline_idx',
                 'topk_points_idx'],  # the model's input names
    output_names=['trajectories'],    # the model's output names
    dynamic_axes={
        'all_polyline_idx':  {1: 'all_polyline_num'},
        'vector_idx':        {1: 'vector_num'},
        'invalid_idx':       {1: 'invalid_num'},
        'map_polyline_idx':  {1: 'map_polyline_num'},
        'traj_polyline_idx': {1: 'traj_polyline_num'},
        'cent_polyline_idx': {1: 'cent_polyline_num'},
        'topk_points_idx':   {1: 'topk_points_num'},
    })
verbose=True prints the converted model's graph and parameters, mapped back to the corresponding source code.
dynamic_axes declares which input dimensions are dynamic. For example, 'all_polyline_idx': {1: 'all_polyline_num'} marks dimension 1 (the second axis) of all_polyline_idx as dynamic and names it all_polyline_num.
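Every dynamic input above follows the same pattern: axis 1 is dynamic, and the axis name is the input name with its `_idx` suffix swapped for `_num`. A hypothetical helper (not part of the original script) can build the whole dynamic_axes dict from the input-name list:

```python
def make_dynamic_axes(input_names, axis=1):
    """Build a dynamic_axes dict for torch.onnx.export: mark `axis` of each
    listed input as dynamic, named by replacing its '_idx' suffix with '_num'."""
    axes = {}
    for name in input_names:
        dim_name = name[:-len("_idx")] + "_num" if name.endswith("_idx") else name + "_num"
        axes[name] = {axis: dim_name}
    return axes

dynamic_axes = make_dynamic_axes([
    "all_polyline_idx", "vector_idx", "invalid_idx", "map_polyline_idx",
    "traj_polyline_idx", "cent_polyline_idx", "topk_points_idx",
])
# dynamic_axes["all_polyline_idx"] == {1: "all_polyline_num"}
```

The resulting dict can be passed directly as the dynamic_axes argument of torch.onnx.export.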
ONNX model simplification:
onnxsim input_onnx_model_name output_onnx_model_name
Output after simplification:

TensorRT installation (TensorRT 8.4.3.1 in this example):
tar xzvf TensorRT-8.4.3.1.Linux.x86_64-gnu.cuda-11.6.cudnn8.4.tar.gz
export LD_LIBRARY_PATH=${TENSORRT_PATH}/TensorRT-8.4.3.1/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=${TENSORRT_PATH}/TensorRT-8.4.3.1/lib:$LIBRARY_PATH
cd TensorRT-8.4.3.1/python/
pip install tensorrt-8.4.3.1-cp38-none-linux_x86_64.whl
Conversion command:
${TENSORRT_PATH}/TensorRT-8.4.3.1/bin/trtexec \
    --onnx=dense_tnt_sim.onnx \
    --minShapes=all_polyline_idx:1x7,vector_idx:1x6,invalid_idx:1x1,map_polyline_idx:1x6,traj_polyline_idx:1x1,cent_polyline_idx:1x3,topk_points_idx:1x50 \
    --optShapes=all_polyline_idx:1x1200,vector_idx:1x24000,invalid_idx:1x24000,map_polyline_idx:1x1200,traj_polyline_idx:1x1200,cent_polyline_idx:1x1200,topk_points_idx:1x24000 \
    --maxShapes=all_polyline_idx:1x1200,vector_idx:1x24000,invalid_idx:1x24000,map_polyline_idx:1x1200,traj_polyline_idx:1x1200,cent_polyline_idx:1x1200,topk_points_idx:1x24000 \
    --saveEngine=dense_tnt_fp32.engine \
    --device=0 \
    --workspace=48000 \
    --noTF32 \
    --verbose
--minShapes sets the minimum input dimensions;
--optShapes sets the input dimensions the engine is optimized for; here I pass the maximum dimensions;
--maxShapes sets the maximum input dimensions;
--device selects the GPU used for the conversion;
--noTF32 disables the TF32 data type and uses FP32 instead;
--verbose prints detailed logs.
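The long --minShapes/--optShapes/--maxShapes strings are easy to get wrong by hand. A small sketch (a hypothetical helper, assuming the name:d0xd1,... syntax shown in the command above) that renders them from a dict:

```python
def format_shapes(shapes):
    """Render {input_name: (d0, d1, ...)} into trtexec's
    'name:d0xd1,name:d0xd1,...' shape-flag syntax."""
    return ",".join(
        f"{name}:{'x'.join(str(d) for d in dims)}"
        for name, dims in shapes.items()
    )

min_shapes = {
    "all_polyline_idx": (1, 7), "vector_idx": (1, 6), "invalid_idx": (1, 1),
    "map_polyline_idx": (1, 6), "traj_polyline_idx": (1, 1),
    "cent_polyline_idx": (1, 3), "topk_points_idx": (1, 50),
}
print("--minShapes=" + format_shapes(min_shapes))
```

The same function covers --optShapes and --maxShapes, so all three flags stay consistent with one source of truth.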
Overall workflow:
if __name__ == '__main__':
    # The TensorRT engine runs on the GPU regardless of whether device is cpu or cuda;
    # using cpu here for the conversion conveniently avoids device conflicts.
    device = torch.device("cpu")
    trt_path = "/home/chenxin/peanut/DenseTNT/trt/densetnt_vehicle_trt_model_tf32.engine"
    input_names = ["vector_data", "all_polyline_idx", "vector_idx", "invalid_idx", "map_polyline_idx",
                   "traj_polyline_idx", "cent_polyline_idx", "topk_points_idx"]
    output_names = ["1195", "1475", "1785"]