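The example below assumes 1920x1080 input images (W = 1920, H = 1080) and batch_size=8. In training mode the model takes the images plus one target dict per image, with the following fields:
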
key | type | dtype | size | remark |
---|---|---|---|---|
boxes | Tensor | float32 | (n,4) | the ground-truth boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H |
labels | Tensor | int64 | (n,) | the class label for each ground-truth box |
masks | Tensor | uint8 | (n,1080,1920) [N,H,W] | the binary segmentation mask for each instance: pixels that belong to the object are 1, all others are 0; there is one mask per object in the image |
area* | Tensor | float32 | (n,) | area of each object |
iscrowd* | Tensor | int64 | (n,) | whether the instance is a crowd of objects (annotated in the COCO dataset) |
image_id* | int | - | - | image identifier |

\* marks an optional field; some dataset loaders add these during preprocessing.
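
As a minimal sketch of the target format described above, the snippet below builds one target dict for a single 1920x1080 image. The two rectangular instances, the class ids, and the helper `boxes_from_masks` are made up for illustration; the boxes use the convention that x2 and y2 are one past the last mask pixel, which is only one possible choice.

```python
import torch

H, W = 1080, 1920  # a 1920x1080 image: height 1080, width 1920

# Two hypothetical instances, drawn as filled rectangles in binary masks.
masks = torch.zeros((2, H, W), dtype=torch.uint8)
masks[0, 100:300, 200:600] = 1
masks[1, 500:900, 1000:1500] = 1


def boxes_from_masks(masks):
    """Tight [x1, y1, x2, y2] box around each non-empty binary mask."""
    boxes = []
    for m in masks:
        ys, xs = torch.where(m > 0)
        boxes.append([
            xs.min().item(),
            ys.min().item(),
            xs.max().item() + 1,  # exclusive right edge (assumed convention)
            ys.max().item() + 1,  # exclusive bottom edge (assumed convention)
        ])
    return torch.tensor(boxes, dtype=torch.float32)


boxes = boxes_from_masks(masks)

target = {
    "boxes": boxes,                                         # (n, 4) float32
    "labels": torch.tensor([1, 2], dtype=torch.int64),      # (n,) int64, made-up class ids
    "masks": masks,                                         # (n, H, W) uint8
    "area": (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]),  # optional
    "iscrowd": torch.zeros((2,), dtype=torch.int64),        # optional
    "image_id": 0,                                          # optional
}
```

For a batch of 8 images, a list of 8 such dicts would be passed alongside the list of 8 image tensors.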

In training mode the model returns a dict of scalar losses:

key | type | dtype | size | loss function | remark |
---|---|---|---|---|---|
loss_classifier | Tensor | float32 | () | Cross-Entropy Loss | classification loss for the detected objects |
loss_box_reg | Tensor | float32 | () | Smooth L1 Loss | bounding box regression loss |
loss_mask | Tensor | float32 | () | Binary Cross-Entropy Loss | mask loss |
loss_objectness | Tensor | float32 | () | Binary Cross-Entropy Loss | RPN classification loss: binary foreground/background loss |
loss_rpn_box_reg | Tensor | float32 | () | Smooth L1 Loss | RPN bounding box regression loss |
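
Because every entry in the table above is a 0-dim tensor, the usual pattern is to sum them into one scalar before calling `backward()`. The sketch below uses a stand-in dict rather than a real model call; the parameter `w` and the loss values are hypothetical and only serve to make the snippet runnable.

```python
import torch

# Hypothetical trainable parameter so that the stand-in losses carry gradients.
w = torch.tensor(1.0, requires_grad=True)

# Stand-in for `loss_dict = model(images, targets)` in training mode:
# five 0-dim float32 tensors with made-up values.
loss_dict = {
    "loss_classifier": 0.5 * w,
    "loss_box_reg": 0.25 * w,
    "loss_mask": 1.0 * w,
    "loss_objectness": 0.125 * w,
    "loss_rpn_box_reg": 0.125 * w,
}

# Sum the individual terms into a single scalar and backpropagate once.
total_loss = sum(loss_dict.values())
total_loss.backward()
```

Summing with equal weights is the simplest choice; weighting individual terms differently is possible but is not something the table above prescribes.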

In evaluation mode the model returns one dict per image with the following fields:

key | type | dtype | size | remark |
---|---|---|---|---|
boxes | Tensor | float32 | (m,4) | the predicted boxes in [x1, y1, x2, y2] format, i.e. all predicted bounding boxes |
labels | Tensor | int64 | (m,) | the predicted labels for each instance |
scores | Tensor | float32 | (m,) | the scores of each instance |
masks | Tensor | float32 | (m,1,1080,1920) [M,1,H,W] | the predicted masks for each instance, in 0-1 range. In order to obtain the final segmentation masks, the soft masks can be thresholded, generally with a value of 0.5 (mask >= 0.5). What is stored is a soft mask: values below 0.5 also occur, and the transition at object edges is fairly smooth |
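
The thresholding step mentioned in the masks row can be sketched as follows. The prediction dict here is a tiny stand-in with made-up values (3 instances on a 4x6 image) instead of a real model output, and `score_thresh` is a hypothetical cutoff, not a value taken from the documentation.

```python
import torch

# Stand-in for one element of `model(images)` in evaluation mode.
m, H, W = 3, 4, 6
pred = {
    "boxes": torch.zeros((m, 4), dtype=torch.float32),
    "labels": torch.tensor([1, 1, 2], dtype=torch.int64),
    "scores": torch.tensor([0.9, 0.3, 0.75], dtype=torch.float32),
    # Soft masks in the 0-1 range, shape (m, 1, H, W).
    "masks": torch.linspace(0, 1, m * H * W).reshape(m, 1, H, W),
}

score_thresh = 0.5  # hypothetical confidence cutoff
keep = pred["scores"] >= score_thresh

# Threshold the soft masks at 0.5 and drop the singleton channel dimension,
# giving binary masks of shape (kept, H, W) like the ground-truth format.
binary_masks = (pred["masks"][keep] >= 0.5).squeeze(1).to(torch.uint8)
kept_boxes = pred["boxes"][keep]
kept_labels = pred["labels"][keep]
```

After this step the masks have the same layout as the ground-truth `masks` field, which makes it straightforward to compare predictions against targets.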

Reference: the official documentation for
maskrcnn_resnet50_fpn