A Novel Approach for Surveillance Compression using Neural Network Technique

Nikita Mohod; Prateek Agrawal; Vishu Madaan

doi:10.54392/irjmt2436

Authors

Nikita Mohod, School of Computer Science and Engineering, Lovely Professional University, Punjab-144411, India. ORCID: https://orcid.org/0000-0002-6782-1714
Prateek Agrawal, School of Computer Science and Engineering, Lovely Professional University, Punjab-144411, India. ORCID: https://orcid.org/0000-0001-6861-0698
Vishu Madaan, School of Computer Science and Engineering, Lovely Professional University, Punjab-144411, India. ORCID: https://orcid.org/0000-0002-9127-4490

DOI:

https://doi.org/10.54392/irjmt2436

Keywords:

Significant Frames, Non-Significant Frames, Object Detection, YOLOv5, YOLOv7, YOLOv8, Surveillance Compression

Abstract

Closed-circuit television (CCTV) monitoring plays a central role in video processing and offers an efficient means of comprehensive surveillance. A key challenge, however, is its substantial demand for storage space. Surveillance footage is typically stored on hard disk drives and, because storage is limited, is deleted after some time. To address this issue, we introduce a method for compressing CCTV video named object detection-based surveillance compression (ODSC). The ODSC model works in two steps: (i) based on the objects present in the video, it classifies the frames of the surveillance video as significant or non-significant using the neural network detectors YOLOv5s, YOLOv7-tiny, and YOLOv8s; (ii) it constructs a video from the significant frames only. A comprehensive analysis of the experimental results shows that YOLOv8s stands out, with a detection accuracy of 99.7% on the COCO dataset. The ODSC approach greatly reduces the required storage space, achieving an average compression ratio of up to 96.31% with YOLOv8s, which surpasses existing state-of-the-art methods.
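The two ODSC steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `detect` callable is a hypothetical stand-in for a YOLO detector that returns the class names found in a frame, `classes_of_interest` is an assumed parameter, and the storage metric is assumed here to be the percentage of space saved, since the abstract does not define how the compression ratio is computed.

```python
from typing import Callable, Iterable, List, Sequence, Set, TypeVar

Frame = TypeVar("Frame")


def select_significant_frames(
    frames: Iterable[Frame],
    detect: Callable[[Frame], Sequence[str]],
    classes_of_interest: Set[str],
) -> List[Frame]:
    """Step (i): keep frames in which the detector reports an object of interest.

    `detect` is a hypothetical wrapper around an object detector; it is
    expected to return the class names detected in a single frame.
    """
    significant: List[Frame] = []
    for frame in frames:
        labels = detect(frame)
        if any(label in classes_of_interest for label in labels):
            significant.append(frame)
    return significant


def space_saving_percent(original_size: float, compressed_size: float) -> float:
    """Assumed metric: percentage of storage saved relative to the original."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size


# Toy data: each "frame" is simply the list of labels a detector would return,
# so the identity function can play the role of the detector.
toy_frames = [[], ["person"], [], ["car", "person"], ["bench"]]

# Step (ii): the retained frames, in their original order, form the output video.
kept_frames = select_significant_frames(
    toy_frames,
    detect=lambda frame: frame,
    classes_of_interest={"person", "car"},
)

# Frame counts serve as a rough proxy for storage size in this toy example.
saving = space_saving_percent(len(toy_frames), len(kept_frames))
```

In a real pipeline the frames would be decoded from the surveillance video, the detector would be one of the YOLO models named above, and the retained frames would be re-encoded into a new video file; frame counts are only a rough proxy for file size.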

