Abstract
Object detection based on deep learning has proven effective in surveillance systems, automated driving, and facial recognition, and it has given computer vision an entirely new perspective. However, locating a particular object within a complex image or video footage remains a major challenge. Rapid developments in computer vision have greatly improved detectors. This study presents a comprehensive literature review of object detection algorithms, covering both one-stage and two-stage detectors, and the challenges they face. Finally, based on the current state of object detection, future research prospects are outlined.
