Occluded Pedestrian Detection Algorithm Based on Improved YOLOv3

Fig. 1. Difficulties in occluded object detection. (a) Loose prediction boxes for heavily overlapped objects; (b) center points of the prediction boxes of heavily overlapped objects fall in the same feature grid cell; (c) most of the occluded object's box is occupied by the foreground object

Fig. 2. Convergence results before and after introducing the Tight Loss function. (a) Without the Tight Loss function, the converged prediction boxes have relatively large variance; (b) with the Tight Loss function, prediction boxes regressed from different anchor boxes tend to converge to a consistent result

Fig. 3. Schematic diagrams of the high-resolution feature pyramid and the insertion position of the spatial attention prediction head in the network. (a) YOLOv3 network; (b) high-resolution feature pyramid; (c) center points of heavily overlapped objects fall in the same grid cell in the original feature pyramid; (d) center points of heavily overlapped objects fall in different grid cells in the high-resolution feature pyramid
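
The effect described in panels (c) and (d) can be illustrated with a small sketch. It assumes the usual convention that a box is assigned to the feature-map cell containing its center, i.e. the center coordinates divided by the stride and floored. The two box centers and the helper `grid_cell` are hypothetical; stride 32 is YOLOv3's coarsest scale, while stride 8 merely stands in for a finer level and is not claimed to be the stride used by the paper's high-resolution pyramid.

```python
def grid_cell(cx, cy, stride):
    """Return the (column, row) of the feature-map cell containing a box center."""
    return int(cx // stride), int(cy // stride)


# Two hypothetical, heavily overlapped pedestrians whose centers are 20 px apart.
center_a = (200.0, 300.0)
center_b = (220.0, 300.0)

coarse_stride = 32  # coarsest YOLOv3 scale
fine_stride = 8     # assumed stride of a finer level, for illustration only

# At the coarse stride both centers land in the same cell ...
same_coarse = grid_cell(*center_a, coarse_stride) == grid_cell(*center_b, coarse_stride)
# ... while at the finer stride they are assigned to different cells.
same_fine = grid_cell(*center_a, fine_stride) == grid_cell(*center_b, fine_stride)

print(grid_cell(*center_a, coarse_stride), grid_cell(*center_b, coarse_stride), same_coarse)
print(grid_cell(*center_a, fine_stride), grid_cell(*center_b, fine_stride), same_fine)
```

With these illustrative numbers, both centers share one cell at stride 32 but occupy separate cells at stride 8, which is the situation the two panels contrast.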

Fig. 4. Schematic diagram of redundant high-confidence bounding boxes in the high-resolution feature pyramid. (a) Target box in the original feature pyramid and its confidence prediction; (b) confidence prediction and redundant prediction boxes generated by the upsampling mechanism; (c) confidence and prediction boxes after filtering by the spatial attention mechanism

Fig. 5. Spatial attention module

Fig. 6. Schematic diagrams of spatial attention prediction head and spatial attention residual block. (a) Spatial attention prediction head; (b) spatial attention residual block

Fig. 7. Heat map of target confidence

Fig. 8. Influence of the Tight Loss function on model performance. (a)(c) Prediction results after Tight Loss fine-tuning; (b)(d) prediction results without Tight Loss fine-tuning

Fig. 9. Comparison of overall model performance. (a)(c) Prediction results generated by the improved YOLOv3; (b)(d) prediction results generated by the original YOLOv3

Table 1. Results of ablation experiments based on YOLOv3


Tight Loss	HRFP	SAPH	M_AP /%	M_MR /%	M_Recall /%	Inference speed /(frame·s$^{-1}$)
			86.84	49.52	90.68	38
√			86.86	49.38	90.65	38
	√		89.53	48.30	93.80	26
√	√		89.57	48.14	93.72	26
	√	√	89.72	48.34	93.81	32
√	√	√	89.75	48.28	93.88	32

Table 2. Influence of spatial attention mechanism on number of predicted boxes
Mode	0	0.1	0.2	0.3	0.4	0.5	0.6	0.7	0.8	0.9
Variation	-16199	-344	+1946	+2313	+2043	+1768	+1593	+1184	+883	+429
H	374911	141925	107315	90477	77812	65193	51491	36305	21742	8725
H+S	358712	141581	109261	92790	79855	66961	53084	37489	22575	9154
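
The Variation row can be checked against the other two rows with a short calculation. The sketch below assumes that Variation is the H+S count minus the H count in each column, and that H and H+S denote the model with the high-resolution feature pyramid alone and with the spatial attention prediction head added (a reading inferred from the abbreviations in Table 1, not stated in this table). Under that reading the differences match the printed row in every column except 0.8, where the subtraction gives +833 rather than the printed +883.

```python
# Column headers and box counts copied from Table 2.
columns = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
h = [374911, 141925, 107315, 90477, 77812, 65193, 51491, 36305, 21742, 8725]
hs = [358712, 141581, 109261, 92790, 79855, 66961, 53084, 37489, 22575, 9154]
printed = [-16199, -344, 1946, 2313, 2043, 1768, 1593, 1184, 883, 429]

# Assumed definition: Variation = (H+S) - H, column by column.
variation = [with_s - without_s for without_s, with_s in zip(h, hs)]

# Report every column where the recomputed value differs from the printed one.
mismatches = [(col, calc, shown)
              for col, calc, shown in zip(columns, variation, printed)
              if calc != shown]

print(variation)
print(mismatches)
```

Read this way, the attention head lowers the box count in the first two columns and raises it in all the others.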

Table 3. Performance comparison of models under different NMS methods
Index	Original NMS (YOLOv3)	Original NMS (H+T+S)	Soft NMS (YOLOv3)	Soft NMS (H+T+S)	Adaptive NMS (YOLOv3)	Adaptive NMS (H+T+S)
M_AP /%	84.97	88.38	89.04	91.41	86.84	89.75
M_MR /%	50.39	49.14	50.26	49.07	49.52	48.28
M_Recall /%	88.85	92.66	94.97	97.32	90.68	93.88
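
To make the difference between the first two NMS variants concrete, the sketch below contrasts greedy ("original") NMS with Gaussian Soft-NMS on three hypothetical boxes. These are generic textbook versions written in plain Python, not the implementation evaluated in the paper. The boxes, scores, IoU threshold, `sigma`, and score threshold are illustrative assumptions. Adaptive NMS, which varies the suppression threshold with predicted crowd density, is not covered here.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def hard_nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep the best box, drop every box overlapping it above iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thr]
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of dropping them."""
    scores = list(scores)  # work on a copy so the caller's scores are untouched
    remaining = [i for i in range(len(boxes)) if scores[i] >= score_thr]
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append(best)
        for i in remaining:
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
        # Discard boxes whose decayed score fell below the score threshold.
        remaining = [i for i in remaining if scores[i] >= score_thr]
    return keep, scores


# Hypothetical detections: boxes 0 and 1 overlap heavily, box 2 is isolated.
boxes = [(0, 0, 100, 200), (30, 0, 130, 200), (300, 0, 400, 200)]
scores = [0.9, 0.8, 0.7]

hard_keep = hard_nms(boxes, scores)
soft_keep, soft_scores = soft_nms(boxes, scores)
print(hard_keep)
print(soft_keep, [round(s, 3) for s in soft_scores])
```

In this toy case greedy NMS discards box 1, whereas Soft-NMS keeps it with a reduced score. That behavior is one plausible reason recall is higher in the Soft NMS columns, although the table itself does not state the cause.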

Table 4. Comparison between proposed algorithm and current advanced occluded pedestrian detection algorithms


Algorithm	NMS method	M_AP /%	M_MR /%	M_Recall /%
RetinaNet [15]	Original NMS	78.33	65.22	94.13
IterDet (RetinaNet) [26]	Original NMS	84.77	56.21	91.49
Faster RCNN [15]	Original NMS	83.07	52.35	90.57
PS-RCNN [27]	Original NMS	86.05		93.77
IterDet (Faster RCNN) [26]	Original NMS	88.08	49.44	95.80
YOLOv3+H+S+T	Original NMS	88.38	49.14	92.66
RetinaNet [15]	Soft NMS	78.10	66.34	95.37
Faster RCNN [15]	Soft NMS	83.92	51.97	91.73
YOLOv3+H+S+T	Soft NMS	91.41	49.07	97.32
RetinaNet [15]	Adaptive NMS	79.67	63.03	94.77
Faster RCNN [15]	Adaptive NMS	84.71	49.73	91.27
YOLOv3+H+S+T	Adaptive NMS	89.75	48.28	93.88