医疗图像其中两种图像格式:MRI(Magnetic Resonance Imaging,磁共振成像)、CT(Computed Tomography,计算机断层),常存成 .nii.gz 格式。都是 3D 的
H
×
W
×
L
H \times W \times L
H×W×L,但不是 RGB,而是类似 grey-scale 图只有一个 channel,如 CT 的值可能是 Hu 值,取值范围可从负几千到正几千,参考 [1]。segmentation label 也相应是 3D 的。
这里记录几种分割指标,参考 [2,3],可调 medpy 的包,其文档也有指标列表。记有 C + 1 个类,0 固定是 background,(某个数据的)prediction 和 label
Y
^
,
Y
∈
{
0
,
…
,
C
}
H
×
W
×
L
\hat{Y},Y\in \{0, \dots, C\}^{H \times W \times L}
Y^,Y∈{0,…,C}H×W×L,只留下第 c 类的 binary prediction 和 label 为
B
^
c
,
B
c
∈
{
0
,
1
}
H
×
W
×
L
\hat{B}^c,B^c \in \{0, 1\}^{H \times W \times L}
B^c,Bc∈{0,1}H×W×L,
∣
⋅
∣
|\cdot|
∣⋅∣ 表示非零元素个数,
B
^
c
,
B
c
\hat{B}^c,B^c
B^c,Bc 的 surface / boundray 记为
S
^
c
,
S
c
\hat{S}^c, S^c
S^c,Sc,是其表面 voxels 的三维索引向量(下标)的集合。
即 F1-scoure[2]。第 c 类的 dice 系数: D C c = 2 ∣ B ^ c ∩ B c ∣ ∣ B ^ c ∣ + ∣ B c ∣ DC_c=\frac{2|\hat{B}^c \cap B^c|}{|\hat{B}^c| + |B^c|} DCc=∣B^c∣+∣Bc∣2∣B^c∩Bc∣ 调包:medpy.metric.binary.dc。
即 Jaccard 系数[2]。第 c 类的 IoU: I o U c = ∣ B ^ c ∩ B c ∣ ∣ B ^ c ∪ B c ∣ IoU_c = \frac{|\hat{B}^c \cap B^c|}{|\hat{B}^c \cup B^c|} IoUc=∣B^c∪Bc∣∣B^c∩Bc∣ 调包:medpy.metric.binary.jc。由 [3],IoU 与 dice 系数有确定的转换关系,即等价,所以两者应该用其中一种就行。C 类的 IoU 取平均就是 mIoU: m I o U = ∑ c = 0 C I o U c C + 1 mIoU=\frac{\sum_{c=0}^{C}IoU_c}{C + 1} mIoU=C+1∑c=0CIoUc
[5] 用到,但其加权系数的好像写错了,应参照 [6,7] 的公式。而 [5] 本身的实现 与 [6-8] 一致,是对的。公式应为:
f
w
I
o
U
c
=
∣
B
c
∣
H
W
L
I
o
U
c
f
w
I
o
U
=
∑
c
=
0
C
f
w
I
o
U
c
即 recall,也叫 TPR(True Positive Rate),非对称。第 c 类的 sensitivity: R c = ∣ B ^ c ∩ B c ∣ ∣ B c ∣ R_c=\frac{|\hat{B}^c \cap B^c|}{|B^c|} Rc=∣Bc∣∣B^c∩Bc∣ 调包:medpy.metric.binary.sensitivity 或 medpy.metric.binary.recall。
也叫 selectivity、TNR(True Negative Rate),是 sensitivity 的补。第 c 类的 specifity: S P c = ∣ ( 1 − B ^ c ) ∩ ( 1 − B c ) ∣ ∣ 1 − B c ∣ SP_c = \frac{|(1-\hat{B}^c) \cap (1-B^c)|}{|1-B^c|} SPc=∣1−Bc∣∣(1−B^c)∩(1−Bc)∣ 调包:medpy.metric.binary.specificity。
[5] 用到 a c c = 1 ( Y ^ = Y ) H W L acc=\frac{1(\hat{Y} = Y)}{HWL} acc=HWL1(Y^=Y)
由 [4],第 c 类的 average surface distance: A S D ( S ^ c , S c ) = ∑ p ∈ S ^ c d s ( p , S c ) / ∣ S ^ c ∣ ≠ A S D ( S c , S ^ c ) ASD(\hat{S}^c, S^c) = \sum_{p \in \hat{S}^c} d_s(p,S^c) \big/ |\hat{S}^c| \ne ASD(S^c, \hat{S}^c) ASD(S^c,Sc)=p∈S^c∑ds(p,Sc)/∣S^c∣=ASD(Sc,S^c) 其中 d s ( ⋅ , ⋅ ) d_s(\cdot,\cdot) ds(⋅,⋅) 是点到点集(此处是表面 S)距离: d s ( p , S ) = min q ∈ S d ( p , q ) d_s(p,S)=\min_{q \in S} d(p,q) ds(p,S)=q∈Smind(p,q) p , q ∈ { 1 , … , W } × { 1 , … , H } × { 1 , … , L } p,q \in \{1,\dots,W\} \times \{1,\dots,H\} \times \{1,\dots,L\} p,q∈{1,…,W}×{1,…,H}×{1,…,L} 是三维索引向量(下标), d ( ⋅ , ⋅ ) d(\cdot,\cdot) d(⋅,⋅) 是欧氏距离。就是对 S ^ \hat{S} S^ 中属于此 object 表面的每一个 voxel,都算一下它到 S 的距离,然后取平均。调包:medpy.metric.binary.asd。
ASD 不对称,一般会用 average symmetric surface distance: A S S D ( S ^ c , S c ) = A S D ( S ^ c , S c ) + A S D ( S c , S ^ c ) 2 = A S S D ( S c , S ^ c ) ASSD(\hat{S}^c, S^c) = \frac{ASD(\hat{S}^c, S^c) + ASD(S^c, \hat{S}^c)}{2} = ASSD(S^c, \hat{S}^c) ASSD(S^c,Sc)=2ASD(S^c,Sc)+ASD(Sc,S^c)=ASSD(Sc,S^c) 调包:medpy.metric.binary.assd。有些文章也会将 ASSD 简叫成 ASD,感觉可以无脑只用 ASSD。
由 [2,4],第 c 类的 Hausdorff 距离: H D ( S ^ c , S c ) = max { sup p ∈ S ^ c d s ( p , S c ) , sup p ∈ S c d s ( p , S ^ c ) } HD(\hat{S}^c,S^c)=\max\left\{ \sup_{p \in \hat{S}^c} d_s(p, S^c), \sup_{p \in S^c} d_s(p, \hat{S}^c) \right\} HD(S^c,Sc)=max{p∈S^csupds(p,Sc),p∈Scsupds(p,S^c)} 调包:medpy.metric.binary.hd。
从公式看,有些指标在遇见特殊情况(全零 prediction、全零 label、全一 label)可能会有除零问题,一例见下文 example: error from distance metrics 小节。现测试 medpy.metric.binary 各 API 对这些突殊情况作何处理,以便校正不合理之处。
import numpy as np
import medpy.metric.binary as mmb
np.bool = np.bool_ # 新的 numpy 已无 np.bool
p0 = np.array([0, 0, 0, 0]) # all 0
p1 = np.array([1, 1, 1, 1]) # all 1
p = np.array([0, 1, 1, 0]) # both 0 & 1
y0 = p0
y1 = p1
y = 1 - p
print("\tdice")
print(mmb.dc(p0, y0), mmb.dc(p0, y), mmb.dc(p, y0))
print("\tiou")
try:
print(mmb.jc(p0, y0))
except ZeroDivisionError as e:
print("iou(p0, y0):", e)
print(mmb.jc(p0, y), mmb.jc(p, y0))
print("\tsensitivity")
print(mmb.sensitivity(p0, y0), mmb.sensitivity(p0, y), mmb.sensitivity(p, y0))
print("\tspecificity")
print(mmb.specificity(p1, y1), mmb.specificity(p1, y), mmb.specificity(p, y1))
# ASD 不对称!此处只依上文公式调包,即 pred 在前、label 在后
print("\tasd")
try:
print(mmb.asd(p0, y0))
except RuntimeError as e:
print("asd(p0, y0):", e)
try:
print(mmb.asd(p0, y))
except RuntimeError as e:
print("asd(p0, y):", e)
try:
print(mmb.asd(p, y0))
except RuntimeError as e:
print("asd(p, y0):", e)
print("\tassd")
try:
print(mmb.assd(p0, y0))
except RuntimeError as e:
print("assd(p0, y0):", e)
try:
print(mmb.assd(p0, y))
except RuntimeError as e:
print("assd(p0, y):", e)
try:
print(mmb.assd(p, y0))
except RuntimeError as e:
print("assd(p, y0):", e)
print("\thd")
try:
print(mmb.hd(p0, y0))
except RuntimeError as e:
print("hd(p0, y0):", e)
try:
print(mmb.hd(p0, y))
except RuntimeError as e:
print("hd(p0, y):", e)
try:
print(mmb.hd(p, y0))
except RuntimeError as e:
print("hd(p, y0):", e)
print("\thd95")
try:
print(mmb.hd95(p0, y0))
except RuntimeError as e:
print("hd95(p0, y0):", e)
try:
print(mmb.hd95(p0, y))
except RuntimeError as e:
print("hd95(p0, y):", e)
try:
print(mmb.hd95(p, y0))
except RuntimeError as e:
print("hd95(p, y0):", e)
输出:
dice
0.0 0.0 0.0
iou
iou(p0, y0): float division by zero
0.0 0.0
sensitivity
0.0 0.0 0.0
specificity
0.0 0.0 0.0
asd
asd(p0, y0): The first supplied array does not contain any binary object.
asd(p0, y): The first supplied array does not contain any binary object.
asd(p, y0): The second supplied array does not contain any binary object.
assd
assd(p0, y0): The first supplied array does not contain any binary object.
assd(p0, y): The first supplied array does not contain any binary object.
assd(p, y0): The second supplied array does not contain any binary object.
hd
hd(p0, y0): The first supplied array does not contain any binary object.
hd(p0, y): The first supplied array does not contain any binary object.
hd(p, y0): The second supplied array does not contain any binary object.
hd95
hd95(p0, y0): The first supplied array does not contain any binary object.
hd95(p0, y): The first supplied array does not contain any binary object.
hd95(p, y0): The second supplied array does not contain any binary object.
可见 medpy.metric.binary 的处理分两类:
分类讨论:
算 assd 时,如果 y_true 为空,会报错:
Traceback (most recent call last):
File "/home/tom/codes/monai-tutorials/2d_segmentation/torch/mmwhs_train.py", line 268, in
main(tempdir)
File "/home/tom/codes/monai-tutorials/2d_segmentation/torch/mmwhs_train.py", line 246, in main
_res = evaluate(val_outputs, val_labels, 4+1)
File "/home/tom/codes/monai-tutorials/2d_segmentation/torch/mmwhs_train.py", line 66, in my_evaluate
res[metr].append(fn(B_pred_c, B_c))
File "/share/tom/miniconda3/envs/cu116_pt1131/lib/python3.9/site-packages/medpy/metric/binary.py", line 453, in assd
assd = numpy.mean( (asd(result, reference, voxelspacing, connectivity), asd(reference, result, voxelspacing, connectivity)) )
File "/share/tom/miniconda3/envs/cu116_pt1131/lib/python3.9/site-packages/medpy/metric/binary.py", line 561, in asd
sds = __surface_distances(result, reference, voxelspacing, connectivity)
File "/share/tom/miniconda3/envs/cu116_pt1131/lib/python3.9/site-packages/medpy/metric/binary.py", line 1215, in __surface_distances
raise RuntimeError('The second supplied array does not contain any binary object.')
RuntimeError: The second supplied array does not contain any binary object.
其它 distance metrics(asd、hd、hd95)只要 prediction 和 label 其中一方为空也都会遇到。好像没有标准解法:
Y_pred
、Y_true
皆空时,学 [13] 置零;而只其中一方为空时,[13] 置 128
、[14] 有置 0
、373.128664
、NaN 的,不知道什么鬼,此处选置 NaN,即忽略。import packaging.version
import numpy as np
import medpy.metric.binary as mmb
# https://stackoverflow.com/questions/74893742/how-to-solve-attributeerror-module-numpy-has-no-attribute-bool
if packaging.version.parse(np.__version__) >= packaging.version.parse('1.24'):
if hasattr(np, 'bool_'):
np.bool = np.bool_
else:
np.bool = bool
class SegEvaluator:
"""segmentation evaluation based on medpy.metric.binary
It will calibrate some results from medpy.metric.binary (see __call__).
It will also record the number of NaN in distance-based metrics that are
caused by empty prediction or label.
Usage:
```python
# 2D image evaluation
evaluator = SegEvaluator(N_CLASSES, IGNORE_BACKGROUND)
for images, labels in loader:
predictions = model(images) # [bs, c, h, w]
predictions = predictions.argmax(1) # -> [bs, h, w]
for pred, lab in zip(predictions, labels):
evaluator(pred, lab) # eval each image
print(evaluator.reduce())
evaluator.reset()
```
"""
METRICS = {
"dice": mmb.dc,
"iou": mmb.jc,
"accuracy": lambda _B1, _B2: (_B1 == _B2).sum() / _B1.size,
"sensitivity": mmb.sensitivity,
"specificity": mmb.specificity,
"hd": mmb.hd,
"assd": mmb.assd,
"hd95": mmb.hd95,
"asd": mmb.asd
}
def __init__(self, n_classes, ignore_bg=False, select=[]):
"""
Input:
n_classes: int, assuming 0 is background
ignore_bg: bool, ignore background or not in reduction
select: List[str], list of name of metrics of interest,
i.e. will only evaluate on these selected metrics if provided.
Should be a subset of supported metrics (see METRICS).
"""
self.n_classes = n_classes
self.ignore_bg = ignore_bg
self.metrics = {}
_no_select = len(select) == 0
for m, f in SegEvaluator.METRICS.items():
if _no_select or m in select:
self.metrics[m] = f
self.reset()
def reset(self):
# records:
# - records[metr][c][i] = score of i-th datum on c-th class, or
# - records[metr][c] = # of NaN caused by empty pred/label
self.records = {}
for metr in self.metrics:
# self.records[metr] = [[]] * self.n_classes # wrong
self.records[metr] = [[] for _ in range(self.n_classes)]
for metr in ("hd", "assd", "hd95", "asd"):
if metr in self.metrics:
self.records[f"empty_gt_{metr}"] = [0] * self.n_classes
self.records[f"empty_pred_{metr}"] = [0] * self.n_classes
def __call__(self, Y_pred, Y_true):
"""evaluates 1 prediction
Input:
Y_pred: numpy.ndarray, typically [H, W] or [H, W, L], predicted class ID
Y_true: same as `Y_pred`, label, ground-truth class ID
"""
for c in range(self.n_classes):
B_pred_c = (Y_pred == c).astype(np.int64)
B_c = (Y_true == c).astype(np.int64)
pred_l0, pred_inv_l0, gt_l0, gt_inv_l0 = B_pred_c.sum(), (1 - B_pred_c).sum(), B_c.sum(), (1 - B_c).sum()
for metr, fn in self.metrics.items():
is_distance_metr = metr in ("hd", "assd", "hd95", "asd")
if 0 == c and (self.ignore_bg or is_distance_metr):
# always ignore bg for distance metrics
a = np.nan
elif 0 == gt_l0 and 0 == pred_l0 and metr in ("dice", "iou", "sensitivity"):
a = 1
elif 0 == gt_inv_l0 and 0 == pred_inv_l0 and "specificity" == metr:
a = 1
elif is_distance_metr and pred_l0 * gt_l0 == 0: # at least one party is all 0
if 0 == pred_l0 and 0 == gt_l0: # both are all 0
# nips23a&d, xmed-lab/GenericSSL
a = 0
else: # only one party is all 0
a = np.nan
if 0 == pred_l0:
self.records[f"empty_pred_{metr}"][c] += 1
else: # 0 == gt_l0
self.records[f"empty_gt_{metr}"][c] += 1
else: # normal cases or that medpy can solve well
# try:
a = fn(B_pred_c, B_c)
# except:
# a = np.nan
self.records[metr][c].append(a)
def reduce(self, prec=4):
"""calculate class-wise & overall average
Input:
prec: int, decimal precision
Output:
res: dict
- res[]: float, overall average
- res[_clswise]: List[float], class-wise average of each class
- res[empty_pred|gt_]: int, overall #NaN caused by empty pred/label
- res[empty_pred|gt__clswise]: List[int], class-wise #NaN
"""
res = {}
for metr in self.records:
if metr.startswith("empty_"):
res[metr+"_clswise"] = self.records[metr]
res[metr] = int(np.sum(self.records[metr]))
else:
CxN = np.asarray(self.records[metr], dtype=np.float32)
nans = np.isnan(CxN)
CxN[nans] = 0
not_nans = ~nans
# class-wise average
cls_n = not_nans.sum(1) # [c]
# cls_avg = np.where(cls_n > 0, CxN.sum(1) / cls_n, 0)
_cls_n_denom = cls_n.copy()
_cls_n_denom[0 == _cls_n_denom] = 1 # to avoid warning though not necessary
cls_avg = np.where(cls_n > 0, CxN.sum(1) / _cls_n_denom, 0)
res[f"{metr}_clswise"] = np.round(cls_avg, prec).tolist()
# overall average
ins_cls_n = not_nans.sum(0) # [n]
# ins_avg = np.where(ins_cls_n > 0, CxN.sum(0) / ins_cls_n, 0)
_ins_cls_n_denom = ins_cls_n.copy()
_ins_cls_n_denom[0 == _ins_cls_n_denom] = 1 # to avoid warning though not necessary
ins_avg = np.where(ins_cls_n > 0, CxN.sum(0) / _ins_cls_n_denom, 0)
ins_n = (ins_cls_n > 0).sum()
avg = ins_avg.sum() / ins_n if ins_n > 0 else 0
res[metr] = float(np.round(avg, prec))
return res
import os, os.path as osp
import numpy as np
import medpy.metric.binary as mmb
from collections import defaultdict
import torch
from monai.networks.utils import one_hot
from monai.metrics import DiceMetric, MeanIoU, GeneralizedDiceScore, ConfusionMatrixMetric, HausdorffDistanceMetric, SurfaceDistanceMetric
class SegEvaluator:
"""同上文"""
pass
print("自制 pred、label,保证非空以对拍 asd、assd、hd、hd95")
nc = 4+1
gen_data_size = [100, 40*(nc-1)]
gen_data_list = []
for i in range(30):
pred = np.zeros(gen_data_size, dtype=np.uint8)
label = pred.copy()
for ic in range(1, nc):
fill_v = np.concatenate([np.zeros([70*40], dtype=np.uint8), ic * np.ones([30*40], dtype=np.uint8)])
np.random.shuffle(fill_v)
pred[:, (ic-1)*40: ic*40] = fill_v.reshape(100, 40)
if i > 2: # let the 1st 3 be perfect predictions
np.random.shuffle(fill_v)
label[:, (ic-1)*40: ic*40] = fill_v.reshape(100, 40)
gen_data_list.append((pred, label))
print(len(gen_data_list)) # 30
print("MONAI overall")
monai_metrics = {
"dice": DiceMetric(include_background=True, reduction="mean", get_not_nans=False, ignore_empty=True, num_classes=4+1),
"iou": MeanIoU(include_background=True, reduction="mean", get_not_nans=False, ignore_empty=True),
"sensitivity": ConfusionMatrixMetric(include_background=True, metric_name='sensitivity', reduction="mean", compute_sample=True, get_not_nans=False),
"specificity": ConfusionMatrixMetric(include_background=True, metric_name='specificity', reduction="mean", compute_sample=True, get_not_nans=False),
"accuracy": ConfusionMatrixMetric(include_background=True, metric_name='accuracy', reduction="mean", get_not_nans=False),
# 下面 distance metrics 忽略 background
"hd": HausdorffDistanceMetric(include_background=False, reduction="mean", get_not_nans=False),
"hd95": HausdorffDistanceMetric(include_background=False, percentile=95, reduction="mean", get_not_nans=False),
"assd": SurfaceDistanceMetric(include_background=False, symmetric=True, reduction="mean", get_not_nans=False),
"asd": SurfaceDistanceMetric(include_background=False, symmetric=False, reduction="mean", get_not_nans=False),
}
print("MONAI class-wise")
monai_metrics_cls = {
"dice": DiceMetric(include_background=True, reduction="mean_batch", get_not_nans=False, ignore_empty=True, num_classes=4+1),
"iou": MeanIoU(include_background=True, reduction="mean_batch", get_not_nans=False, ignore_empty=True),
"sensitivity": ConfusionMatrixMetric(include_background=True, metric_name='sensitivity', reduction="mean_batch", compute_sample=True, get_not_nans=False),
"specificity": ConfusionMatrixMetric(include_background=True, metric_name='specificity', reduction="mean_batch", compute_sample=True, get_not_nans=False),
"accuracy": ConfusionMatrixMetric(include_background=True, metric_name='accuracy', reduction="mean_batch", get_not_nans=False),
# 下面 distance metrics 忽略 background
"hd": HausdorffDistanceMetric(include_background=False, reduction="mean_batch", get_not_nans=False),
"hd95": HausdorffDistanceMetric(include_background=False, percentile=95, reduction="mean_batch", get_not_nans=False),
"assd": SurfaceDistanceMetric(include_background=False, symmetric=True, reduction="mean_batch", get_not_nans=False),
"asd": SurfaceDistanceMetric(include_background=False, symmetric=False, reduction="mean_batch", get_not_nans=False),
}
print("medpy")
seg_eval = SegEvaluator(4+1, False)
print("2D seg,一张张测")
for pred, label in gen_data_list:
pred = pred.astype(np.int32)
label = label.astype(np.int32)
# print(pred.shape, pred.dtype, np.unique(pred), label.shape, label.dtype, np.unique(label))
seg_eval(pred, label)
# MONAI 要求 [B, C, H, W],且有些指标要求 one-hot
pred = one_hot(torch.from_numpy(pred).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
label = one_hot(torch.from_numpy(label).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
for _m in monai_metrics:
monai_metrics[_m](y_pred=pred, y=label)
monai_metrics_cls[_m](y_pred=pred, y=label)
print("monai")
for _m in monai_metrics:
print('\t', _m)
_agg = monai_metrics[_m].aggregate()#.item()
_agg_cls = monai_metrics_cls[_m].aggregate()
# print(type(_agg_cls))
if isinstance(_agg, list):
assert len(_agg) == 1
_agg = _agg[0]
if isinstance(_agg_cls, list):
assert len(_agg_cls) == 1
_agg_cls = _agg_cls[0]
_agg = round(_agg.item(), 6)
_agg_cls = [round(_x, 6) for _x in _agg_cls.tolist()]
# monai_metrics[_m].reset()
print("class-wise:", _agg_cls)
print("overall:", _agg)
print("medpy")
medpy_res = {k: v for k, v in seg_eval.reduce(6).items()
if not k.startswith("empty_")} # 虑掉那些 NaN 记数
for metr in medpy_res:
if not metr.endswith("_clswise"):
print('\t', metr)
print("class-wise:", medpy_res[metr+"_clswise"])
print("overall:", medpy_res[metr])
输出
monai
dice
class-wise: [0.819524, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
iou
class-wise: [0.699342, 0.197282, 0.200425, 0.199778, 0.197985]
overall: 0.298962
sensitivity
class-wise: [0.819524, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
accuracy
class-wise: [0.711238, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
hd
class-wise: [3.283388, 3.363868, 3.308639, 3.316445]
overall: 3.318085
hd95
class-wise: [2.012461, 2.004592, 2.012461, 2.012461]
overall: 2.010494
assd
class-wise: [0.945859, 0.939, 0.945332, 0.943413]
overall: 0.943401
asd
class-wise: [0.943933, 0.938989, 0.944227, 0.941334]
overall: 0.942121
medpy
dice
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
iou
class-wise: [0.699342, 0.197282, 0.200425, 0.199778, 0.197985]
overall: 0.298962
sensitivity
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
accuracy
class-wise: [0.711238, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
hd
class-wise: [0.0, 3.283388, 3.363868, 3.308639, 3.316445]
overall: 3.318085
hd95
class-wise: [0.0, 2.004592, 2.004592, 2.005379, 2.004592]
overall: 2.004789
assd
class-wise: [0.0, 0.945857, 0.939, 0.945332, 0.94341]
overall: 0.9434
asd
class-wise: [0.0, 0.943933, 0.938989, 0.944227, 0.941334]
overall: 0.942121
结论:hd95 不同,其它一致。对比 medpy 的实现和 MONAI 的实现,是因为它们取 95% 分位数的方式不同:
print("2D 辑成 3D")
gen_data_3d_list = []
for i in range(0, len(gen_data_list), 10):
pred3d = np.concatenate([x[0][:, :, np.newaxis] for x in gen_data_list[i: i+10]], axis=2)
label3d = np.concatenate([x[1][:, :, np.newaxis] for x in gen_data_list[i: i+10]], axis=2)
print(pred3d.shape, label3d.shape) # (100, 160, 10) (100, 160, 10)
gen_data_3d_list.append((pred3d, label3d))
print(len(gen_data_3d_list)) # 3
print("重置 metrics 记录")
seg_eval.reset()
for _m in monai_metrics:
monai_metrics[_m].reset()
monai_metrics_cls[_m].reset()
print("3D,逐 volume 测")
for pred, label in gen_data_3d_list:
pred = pred.astype(np.int32)
label = label.astype(np.int32)
# print(pred.shape, pred.dtype, np.unique(pred), label.shape, label.dtype, np.unique(label))
seg_eval(pred, label)
pred = one_hot(torch.from_numpy(pred).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
label = one_hot(torch.from_numpy(label).unsqueeze(0).unsqueeze(0), num_classes=4+1, dim=1)
for _m in monai_metrics:
monai_metrics[_m](y_pred=pred, y=label)
monai_metrics_cls[_m](y_pred=pred, y=label)
print("monai") # 同前面 2D
for _m in monai_metrics:
print('\t', _m)
_agg = monai_metrics[_m].aggregate()#.item()
_agg_cls = monai_metrics_cls[_m].aggregate()
# print(type(_agg_cls))
if isinstance(_agg, list):
assert len(_agg) == 1
_agg = _agg[0]
if isinstance(_agg_cls, list):
assert len(_agg_cls) == 1
_agg_cls = _agg_cls[0]
_agg = round(_agg.item(), 6)
_agg_cls = [round(_x, 6) for _x in _agg_cls.tolist()]
# monai_metrics[_m].reset()
print("class-wise:", _agg_cls)
print("overall:", _agg)
print("medpy")
medpy_res = {k: v for k, v in seg_eval.reduce(6).items()
if not k.startswith("empty_")} # 虑掉那些 NaN 记数
for metr in medpy_res:
if not metr.endswith("_clswise"):
print('\t', metr)
print("class-wise:", medpy_res[metr+"_clswise"])
print("overall:", medpy_res[metr])
输出
monai
dice
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
iou
class-wise: [0.695224, 0.165041, 0.168134, 0.168165, 0.165894]
overall: 0.272492
sensitivity
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
accuracy
class-wise: [0.711237, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
hd
class-wise: [2.307209, 2.44949, 2.575802, 2.44949]
overall: 2.445498
hd95
class-wise: [1.414214, 1.414214, 1.414214, 1.414214]
overall: 1.414214
assd
class-wise: [0.815341, 0.80984, 0.813648, 0.815224]
overall: 0.813514
asd
class-wise: [0.812973, 0.809779, 0.813435, 0.814375]
overall: 0.812641
medpy
dice
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
iou
class-wise: [0.695224, 0.165041, 0.168134, 0.168165, 0.165894]
overall: 0.272492
sensitivity
class-wise: [0.819523, 0.275542, 0.280583, 0.279583, 0.276667]
overall: 0.38638
specificity
class-wise: [0.278094, 0.961871, 0.962136, 0.962083, 0.96193]
overall: 0.825223
accuracy
class-wise: [0.711238, 0.927554, 0.928058, 0.927958, 0.927667]
overall: 0.884495
hd
class-wise: [0.0, 2.307209, 2.44949, 2.575802, 2.44949]
overall: 2.445498
hd95
class-wise: [0.0, 1.414213, 1.414213, 1.414213, 1.414213]
overall: 1.414214
assd
class-wise: [0.0, 0.815341, 0.80984, 0.813648, 0.815224]
overall: 0.813514
asd
class-wise: [0.0, 0.812973, 0.809779, 0.813435, 0.814375]
overall: 0.812641
结论: