To reduce false detections of vehicle targets caused by occlusion, an improved vehicle detection method based on an improved YOLO network is proposed. The method uses the Flip-Mosaic algorithm to enhance the network's perception of small targets. A multi-type vehicle target dataset was built from data collected in a variety of scenarios and used to train the detection model. Experiments showed that the Flip-Mosaic data augmentation algorithm reduced the false detection rate and improved vehicle detection accuracy.
I. INTRODUCTION
The smart freeway facilitates vehicle–road collaboration by establishing efficient communication among the cloud platform, roadside infrastructure, road users, and big data centers. Although expressway network construction is becoming smarter and comprehensive traffic management technology is improving quickly, several issues remain unresolved. The expressway has implemented the "one network" operation mode of "one pass, one deduction, one notification" together with segmented billing across the entire network. The charging mode was changed from weight-based charging to per-vehicle charging, and the billing mileage is determined by the system and the toll booths based on the driving path. Under the new toll collection system, it is harder for the expressway to recover from accidents and to avoid congestion. Compared with urban arterial roads, the expressway also has higher speeds, larger traffic volumes, and many commercial trucks transporting dangerous goods. As a result, although the accident rate is relatively low, traffic accidents on the expressway cause more damage and have longer-lasting effects, such as accident-induced congestion.
A. Object Detection
Object detection is a computer technology that locates instances of semantic objects of a particular class (such as people, buildings, or automobiles) in digital images and videos. It is related to image processing and computer vision. Face detection and pedestrian detection are two well-studied subfields. Object detection is widely used in computer vision applications such as image retrieval, video surveillance, face recognition, vehicle counting, video object co-segmentation, and image annotation. It can also be used to track moving objects, for example a ball during a football match, a cricket bat, or a person in a video.
B. YOLO V5
YOLO v5 is the fifth generation of YOLO and is well known for its fast inference and detection accuracy. Its network structure is straightforward and consists of four parts: input, backbone, neck, and prediction. At the input stage, YOLO v5, like YOLO v4, applies the Mosaic data augmentation technique to the training images: four distinct images are combined into one through random scaling, random cropping, and random arrangement. This enriches the background information in the training images, which significantly benefits small target detection.
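The label bookkeeping behind a mosaic can be sketched in a few lines. The following is a simplified illustration, not the paper's Flip-Mosaic implementation, whose details are not given here: it assumes a fixed 2x2 grid instead of random scaling and cropping, assumes that "flip" means an independent horizontal flip per tile, and handles only the box labels (the pixels would need the same transformations). All function names are hypothetical.

```python
import random


def flip_box_horizontal(box):
    """Mirror a normalized YOLO box (cls, xc, yc, w, h) left-to-right."""
    cls, xc, yc, w, h = box
    return (cls, 1.0 - xc, yc, w, h)


def place_in_quadrant(box, col, row):
    """Map a box from its own tile into a 2x2 mosaic canvas.

    col and row are 0 or 1. Each tile is shrunk to half the canvas
    width and height, so centers and sizes are halved and then shifted.
    """
    cls, xc, yc, w, h = box
    return (cls, (xc + col) / 2.0, (yc + row) / 2.0, w / 2.0, h / 2.0)


def flip_mosaic_labels(tiles, flip_prob=0.5, rng=None):
    """Merge the labels of four images into one mosaic label list.

    tiles: four label lists, ordered top-left, top-right,
           bottom-left, bottom-right.
    flip_prob: assumed probability of flipping each tile horizontally.
    """
    if len(tiles) != 4:
        raise ValueError("a mosaic needs exactly four images")
    rng = rng or random.Random()
    merged = []
    for index, boxes in enumerate(tiles):
        col, row = index % 2, index // 2
        flipped = rng.random() < flip_prob  # one decision per tile
        for box in boxes:
            if flipped:
                box = flip_box_horizontal(box)
            merged.append(place_in_quadrant(box, col, row))
    return merged
```

Because each tile occupies a quarter of the canvas, every object appears at half its original scale, which is one reason mosaics expose the network to more small targets.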
C. Digital Image Processing
Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subfield of digital signal processing, it has numerous advantages over analog image processing: it allows a much wider range of algorithms to be applied to the input data and avoids problems such as the build-up of noise and distortion during processing. Because images are defined over two or more dimensions, digital image processing can be modeled in the form of multidimensional systems. Its creation and development were driven primarily by three factors: first, the growth of computer technology; second, the development of mathematics, specifically discrete mathematics theory; and third, increasing demand for a wide range of applications in agriculture, environmental science, the military, and industry.
II. OBJECTIVES
To recognize vehicles in surveillance data in order to anticipate traffic conditions.
This project lays the foundation for YOLO-based vehicle detection and traffic prediction. It allows methods for identifying congested areas to be examined. In the future, it could be extended with various server configurations and IoT modules.
III. RELATED WORKS
Nidhi Soni et al. report that, in order to reduce the number of accidents, it is urgent to investigate the influence of accident-causing factors and to implement effective strategies. Recent research on traffic accidents has focused on people, vehicles, roads, the environment, and the effect of these influencing variables.
Taihang Dong et al. state that trackers based on discriminative correlation filters (DCFs) have recently achieved excellent performance with high computational efficiency. Chaonan Fan et al. report that the conventional GMM is sensitive to changes in illumination and readily misidentifies the background as a moving target.
In their paper, a moving target detection algorithm was created by combining an improved GMM with the five-frame interframe difference method. Yangquan Yu et al. report that there are two common algorithms for moving target detection, each with its own benefits and drawbacks: the background subtraction method and the frame difference method, which their paper contrasts and examines.
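To make the contrast between the two classical methods concrete, here is a minimal sketch in plain Python. It assumes grayscale frames stored as lists of rows with 0-255 intensities, and the threshold of 25 is an arbitrary illustrative value. It shows the basic two-frame difference only, not the five-frame variant or the improved GMM of the cited work.

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary motion mask from two grayscale frames.

    A pixel is marked 1 (moving) when its absolute intensity change
    between the two frames exceeds the threshold, otherwise 0.
    """
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same height")
    mask = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        if len(prev_row) != len(curr_row):
            raise ValueError("frames must have the same width")
        mask.append([1 if abs(c - p) > threshold else 0
                     for p, c in zip(prev_row, curr_row)])
    return mask


def background_subtraction(background, frame, threshold=25):
    """Apply the same thresholding against a fixed background image.

    The only difference from frame differencing is the reference:
    a background model instead of the previous frame.
    """
    return frame_difference(background, frame, threshold)
```

Frame differencing adapts quickly to lighting changes but tends to miss slow or stopped vehicles, whereas background subtraction captures whole objects but depends on how well the background model is maintained.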
Badri Narayan Subudhi et al. present a novel background subtraction (BGS) technique for detecting local changes in video scenes that correspond to the movement of objects. Six local features are suggested as an effective combination: three that have been proposed previously and three that are new. A statistical parametric biunique model is proposed for background modeling and subtraction.
IV. METHODOLOGY
In recent years, the development of GPU hardware and of deep learning theory has contributed significantly to the advancement of computer vision technology. Using computer vision to save labor has considerable practical significance.
Object detection, a fundamental branch of digital image processing and computer vision, is a core component of intelligent monitoring systems in a variety of application scenarios. In this work, CCTV footage is used as the input and the YOLO algorithm is used to detect vehicles.
The footage is analyzed in the x and y coordinates of the image plane, and detection is highly accurate. With annotated training images, objects can be identified even in low-light settings. The system requires recorded input data or real-time surveillance footage. The improved YOLO-based vehicle detection yields better results.
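Detections in the image plane are boxes with x and y coordinates, and YOLO-style detectors typically prune overlapping candidates with non-maximum suppression (NMS) based on intersection over union (IoU). The post-processing used in this work is not described, so the sketch below is only a generic illustration; it assumes corner-format boxes `(x1, y1, x2, y2)` in pixels and an IoU threshold of 0.5.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-scoring box among heavily overlapping ones.

    detections: list of (box, score) pairs, box = (x1, y1, x2, y2).
    Returns the kept pairs, ordered by descending score.
    """
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop every remaining box that overlaps the kept one too much.
        remaining = [d for d in remaining
                     if iou(best[0], d[0]) <= iou_threshold]
    return kept
```

A lower threshold removes more duplicates but can also suppress a genuinely separate vehicle that is partly occluded by another, which is the situation the paper is concerned with.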
The parameterized output confirms these results.
In the future, traffic forecasting can be developed on top of this method.
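As one hypothetical illustration of how per-frame detections might feed such forecasting, the sketch below smooths the number of vehicles detected per frame with a moving average and flags frames where the average exceeds a limit. The window size, the limit, and the function name are assumptions made for the example and are not part of the method described here.

```python
from collections import deque


def congestion_flags(counts, window=3, limit=10.0):
    """Flag frames whose moving-average vehicle count exceeds a limit.

    counts: number of vehicles detected in each consecutive frame.
    window: number of most recent frames to average over.
    limit: average count above which the scene is called congested.
    """
    recent = deque(maxlen=window)  # oldest count drops out automatically
    flags = []
    for count in counts:
        recent.append(count)
        flags.append(sum(recent) / len(recent) > limit)
    return flags
```

Averaging over a window keeps a single crowded frame, or a single missed detection, from toggling the congestion state.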
V. DATASET COLLECTION
Creators of open-source datasets usually build them according to the requirements of their own research. As a result, the characteristics of the data may not exactly meet the requirements of the present study. Because research in a particular field requires datasets captured under specific scenarios or conditions, this paper creates its own datasets to verify its findings. The multi-angle monitoring video of a given location is collected according to the camera angle and resolution used in expressway monitoring applications.
A. Image Labeling Method For Vehicle Detection
Vehicle detection is a supervised learning task. Model training relies on the location and class information of each vehicle in the image. The practical application of vehicle target detection in highway scenarios was taken into account when classifying vehicle targets.
B. Image Annotation
The quality of the labeled dataset largely determined the quality of the model. The images were therefore annotated so that a vehicle detection model closer to real-world highway traffic could be trained.
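YOLO v5 training labels are commonly stored as one text line per object: a class index followed by the box center, width, and height, all normalized by the image size. The sketch below converts between a pixel box and such a label line. The class indices and image sizes are placeholders, since the vehicle classes used in this dataset are not listed here.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Convert a pixel box (x1, y1, x2, y2) to a YOLO label line."""
    x1, y1, x2, y2 = box
    xc = (x1 + x2) / 2.0 / image_width    # normalized center x
    yc = (y1 + y2) / 2.0 / image_height   # normalized center y
    w = (x2 - x1) / image_width           # normalized width
    h = (y2 - y1) / image_height          # normalized height
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


def from_yolo_label(line, image_width, image_height):
    """Convert a YOLO label line back to (class_id, pixel box)."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(value) for value in parts[1:5])
    x1 = (xc - w / 2.0) * image_width
    y1 = (yc - h / 2.0) * image_height
    x2 = (xc + w / 2.0) * image_width
    y2 = (yc + h / 2.0) * image_height
    return class_id, (x1, y1, x2, y2)
```

Normalized coordinates keep the labels valid when images are resized during training.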
VI. ARCHITECTURE DIAGRAM
VII. RESULT AND DISCUSSION
This section evaluates the effectiveness of the improved data augmentation method presented in this paper. The improved YOLO v5 network performed better at vehicle identification. The two models are similar, and their overall performance improved by 0.5 percent and 0.3 percent, respectively, with the improvement most evident for smaller calibration frames (bounding boxes). The primary focus of this section was on how to improve vehicle detection performance. Finally, the effectiveness of the improvement strategy proposed in this project was demonstrated by comparing performance before and after it was applied.
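For readers who want to reproduce this kind of before-and-after comparison, detection quality is usually summarized from counts of true positives, false positives, and false negatives. The sketch below shows the standard precision and recall formulas. The definition of "false detection rate" as the share of detections that are wrong is an assumption, since the exact metric definitions are not given here.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and false detection rate from counts.

    precision            = TP / (TP + FP)
    recall               = TP / (TP + FN)
    false_detection_rate = FP / (TP + FP)   (assumed definition)
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    false_detection_rate = false_positives / detected if detected else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "false_detection_rate": false_detection_rate,
    }
```

Computing the same metrics for the baseline and the improved model on an identical test set makes the reported gains directly comparable.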
VIII. CONCLUSION
First, a dataset with diverse expressway scenes and rich data samples was built. Because it covered numerous highway scenarios and included footage from various road sections and viewing angles, the dataset was highly applicable to vehicle object detection in expressway scenarios. In addition, object detection was carried out using the improved YOLO v5 network. Training on this diverse dataset improved, at the source, the accuracy with which vehicle targets could be identified. The method produced results more in line with engineering practice and significantly improved the recognition rate for similar small targets. These improvements may have a significant effect on real-world applications.