Vehicle lights are the primary safeguard for driving safety after dark. Drivers frequently turn on their high beams to widen their field of vision, to see oncoming cars more clearly, or even while following another vehicle; yet those same lights keep other drivers from checking their rearview mirrors and lead to traffic collisions. Failure to keep a safe distance from oncoming traffic is another frequent cause of accidents. As a solution, this research proposes a deep-learning-based image recognition system for headlamp control. A driver operating a car equipped with this system can identify the vehicle ahead in real time at night, estimate the distance to it in order to reduce glare damage and collisions at safe distances, and let the system consider whether vehicles are present in the forward and opposing lanes before deciding whether to turn on the high beam.
I. INTRODUCTION
Vehicle detection technology has prospective traffic applications that include autonomous driving, traffic control, and traffic monitoring.
With the rise of deep learning and the maturing of computer vision algorithms in recent years, vehicle identification based on deep learning has become an actively researched application [1][2][3][4]. In deep learning, a computer learns from a large amount of data to perform detection or classification tasks; the essential point is that the data itself teaches the detection or classification function. To meet traffic requirements, applications that identify vehicles must do so in real time.
In contrast to earlier object detection networks [6][7], the YOLOv3 [5] deep learning network builds on YOLO [8] and can perform vehicle detection swiftly and efficiently. Thanks to its single end-to-end neural network structure, it executes at very high speed while improving on YOLO's accuracy [8], and it retains reasonable bounding-box precision for both small and large objects.
This study employs the YOLOv3 [5] deep learning network for vehicle recognition, using the detected object frames to control the headlight switch and to issue distance warnings.
II. PROPOSED METHOD
This article addresses and improves on the object detection problem using the YOLOv3 [5] approach. Images of the cars, buses, trucks, motorcycles, and other vehicles on the road, drawn from the COCO data set, the BDD100K data set, and footage recorded by our own driving recorder, serve as the learning material. R-CNN [6] was employed for object detection early on, but its drawback is that computing each image takes too long. The YOLOv3 [5] deep learning network used in this article is a real-time object detection architecture, as shown in Fig. 1.
A. YOLOv3: An Incremental Improvement
YOLO [8] frames detection as a regression problem: it computes the coordinates of the bounding box and the probability of its associated category directly. Unlike the region-based technique of R-CNN [6], it employs a single regression model in which the network predicts the category scores and object frames together. Its single end-to-end neural network architecture lets it execute very quickly while retaining a certain level of accuracy.
The YOLO [8] deep learning network achieves real-time object detection, but it has one drawback: the estimated locations are not precise enough and the object-frame accuracy is poor, degrading further as objects get smaller.
To address YOLO's [8] drawbacks, YOLOv2 [9] adopts the anchor boxes of the Faster R-CNN [7] network to predict the coordinates of the object frame and uses k-means clustering to determine the width and height distribution of the anchor boxes. It also removes the fully connected layers and restructures the network as fully convolutional, preserving accuracy on smaller objects while keeping recognition real-time; overall, a modest improvement. The anchor-clustering step is sketched below.
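As an illustration of that anchor-selection step, here is a minimal Python sketch (not the authors' code) of k-means clustering over ground-truth box dimensions with the 1 - IoU distance popularized by YOLOv2; the boxes array of widths and heights is a hypothetical stand-in for a real label set.

import numpy as np

def iou_wh(box, clusters):
    # IoU between one (w, h) box and k cluster centroids, treating all
    # boxes as if they shared the same top-left corner.
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    # Cluster (N, 2) width/height pairs into k anchors using 1 - IoU distance.
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # Assign each box to the centroid with the highest IoU.
        nearest = np.array([(1 - iou_wh(b, clusters)).argmin() for b in boxes])
        for c in range(k):
            members = boxes[nearest == c]
            if len(members):
                clusters[c] = np.median(members, axis=0)  # median is robust
    return clusters[np.argsort(clusters.prod(axis=1))]    # sort by area

boxes = np.random.default_rng(1).uniform(10, 300, size=(500, 2))  # fake labels
print(kmeans_anchors(boxes, k=9))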
YOLOv3 [5] updates the base network to Darknet-53 and adopts the residual structure of ResNet (Deep Residual Learning for Image Recognition) to increase the depth of the neural network. It uses a Feature Pyramid Network (FPN) to expand the feature map from a single 13×13 layer to three layers of 13×13, 26×26, and 52×52, and the feature map of each layer predicts three types of bounding boxes. The FPN pyramid design merges low-level detail with high-level semantics and gives each layer its own independent bounding-box predictions, which improves the accuracy of small-object prediction. In the classification head, the softmax function is replaced with independent logistic classifiers that predict scores for all categories, and a threshold allows multiple labels per bounding box: the box is assigned every category whose score exceeds the cutoff, as illustrated in the sketch below.
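To make the classifier change concrete, the following sketch contrasts softmax scoring (exactly one winning label) with the independent logistic scores YOLOv3 uses, where every class whose sigmoid score clears the threshold is kept; the logits, class names, and 0.5 cutoff are illustrative values, not taken from the paper.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, 1.5, -1.0])       # illustrative raw scores for one box
labels = ["car", "bigcar", "motorcycle"]  # illustrative class names

# Softmax: scores compete and exactly one label wins.
print(dict(zip(labels, softmax(logits).round(3))))

# Independent logistic scores: every class above the threshold is kept,
# so an ambiguous box can carry more than one label.
threshold = 0.5
print([l for l, s in zip(labels, sigmoid(logits)) if s > threshold])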
B. Neural Network Architecture
This study uses the YOLOv3 [5] neural network architecture: with Darknet-53 as the base network, the feature matrices of the DarkNet network's lower and middle layers are combined into an FPN structure, and multiple matrix-concatenation and convolution operations produce three output scales of 13×13, 26×26, and 52×52. With three anchor boxes per cell, the network predicts (13×13×3) + (26×26×3) + (52×52×3) = 10,647 boxes in total. Both object identification and bounding-box coordinate prediction become more accurate. The loss function used in this paper is shown in (1).
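The box total above follows from three anchor boxes per cell at each of the three scales; a few lines verify the arithmetic (the 416×416 input implied by the 13/26/52 grids is an assumption):

# Three anchor boxes are predicted at every cell of each output scale
# (grid sizes 13, 26, 52 correspond to an assumed 416x416 input).
scales = [13, 26, 52]
boxes_per_cell = 3
total = sum(s * s * boxes_per_cell for s in scales)
print(total)  # (13*13 + 26*26 + 52*52) * 3 = 10647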
III. RESULTS
The computer equipment and platform tools used in the experiment of this paper are shown in TABLE I.
TABLE I. COMPUTER EQUIPMENT AND PLATFORM TOOLS
Mother Board: ASUS ROG STRIX B360-F GAMING
CPU: Intel Core i5-8500
Memory: 8 GB
Graphics: NVIDIA GeForce RTX 2080
Open-source tools: OpenCV-python 4.1.1.26, Keras 2.2.2, Tensorflow 1.10.0
For the neural network model to learn the features in the data, deep learning for object identification requires training with a vast quantity of images from large data sets. YOLOv3's rapid per-image computation makes real-time detection possible.
As shown in TABLE II, this study draws images of cars, motorcycles, buses, and trucks from the MSCOCO, BDD100K, and self-made image data sets.
TABLE II. DATASETS AND EXTRACTION CATEGORIES
MSCOCO: Car, Bus, Truck, Motorcycle
BDD100K: Car, Bus, Truck
Self-made image dataset: Car, Bus, Truck, Motorcycle
In this study, the neural network distinguishes three classes: car, motorcycle, and big car. To increase accuracy, the Bus and Truck categories in this paper are merged into a single Bigcar class; the resulting input classes are listed in TABLE III. The vehicle data set is divided into a training set and a test set at a ratio of 80:20 for training.
TABLE III. NETWORK INPUT CLASSES
1: Car
2: Motorcycle
3: Bigcar
During the experiment, the number of training iterations is set to 100 and the data are trained in batch mode with a batch size of 20. The input data are cut so that the first 85% are training data and the last 15% are test data, as sketched below. The training results are shown in Fig. 2.
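A minimal sketch of that split, assuming samples is the ordered list of labeled images:

def split_dataset(samples, train_frac=0.85):
    # The first 85% of the ordered samples train the network; the rest test it.
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]

samples = list(range(1000))      # hypothetical stand-in for labeled images
train, test = split_dataset(samples)
print(len(train), len(test))     # 850 150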
The trained YOLO model is then used to detect vehicles: video from the night driving recorder is fed in and the system determines whether there is a car ahead. Because cars parked on both sides of the road would also be detected and confuse the deep learning model, a region of interest is selected manually and only vehicles within it are searched for. As illustrated in Fig. 3 and Fig. 4, the high beam is switched off when the deep learning model identifies a vehicle ahead and on when none is detected; a minimal sketch of this control loop follows.
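The sketch below shows the control loop under stated assumptions: detect_vehicles is a stub standing in for the trained YOLO model, set_high_beam is a stub for the headlamp actuator, and the ROI coordinates and video path are illustrative.

import cv2

ROI = (100, 200, 540, 480)  # hypothetical (x1, y1, x2, y2) search region

def detect_vehicles(frame):
    # Stub standing in for the trained YOLO model; it should return a
    # list of (x1, y1, x2, y2) bounding boxes for detected vehicles.
    return []

def set_high_beam(on):
    # Stub standing in for the headlamp actuator.
    print("high beam", "ON" if on else "OFF")

def in_roi(box, roi):
    # True if the box center falls inside the region of interest.
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return roi[0] <= cx <= roi[2] and roi[1] <= cy <= roi[3]

cap = cv2.VideoCapture("night_drive.mp4")   # illustrative dashcam video path
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    vehicle_ahead = any(in_roi(b, ROI) for b in detect_vehicles(frame))
    # High beam off when a vehicle is detected ahead, on when the road is clear.
    set_high_beam(not vehicle_ahead)
cap.release()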
The distance is estimated from the pinhole-camera relation D = d × T / t, where D stands for the distance, d for the focal length, T for the real object's width, and t for the object's width in the image.
To sharpen the distance judgment, the distance computed from the deep-learning object frame is combined with an average horizontal line to decide whether the vehicle is too close. The warning line in this paper is fixed at 20 meters: when a vehicle is detected inside the warning line, a warning is issued to the driver, as illustrated in Fig. 5. A numerical sketch follows.
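The pinhole relation above and the fixed 20-meter warning line translate directly into code; the focal length and real vehicle width below are illustrative, uncalibrated values.

FOCAL_LENGTH_PX = 700.0   # d: focal length in pixels (illustrative)
REAL_CAR_WIDTH_M = 1.8    # T: assumed real car width in meters
WARNING_LINE_M = 20.0     # fixed warning distance from this paper

def estimate_distance(box_width_px):
    # D = d * T / t: distance from the bounding-box width t in pixels.
    return FOCAL_LENGTH_PX * REAL_CAR_WIDTH_M / box_width_px

def too_close(box_width_px):
    return estimate_distance(box_width_px) < WARNING_LINE_M

print(estimate_distance(63.0))  # 20.0 m: a box 63 px wide sits at the line
print(too_close(90.0))          # True: a wider box means a nearer vehicle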
IV. CONCLUSION
This study performs vehicle detection with a deep learning neural network trained on the MSCOCO data set, the BDD100K data set, and its own driving-recorder images. Accuracy on the test set exceeds 60%, and the driving-recorder video is fed to the neural network model for detection. Through real-time vehicle identification, switching of the high-beam headlights, and distance warnings, the system enables the driver to maintain a safe distance and helps decrease traffic accidents.
REFERENCES
[1] Y. Gao and H. J. Lee, "Vehicle Make Recognition Based on Convolutional Neural Network," International Conference on Information Science and Security, pp. 1-4, 2015.
[2] Q. Fan, L. Brown, and J. Smith, "A Closer Look at Faster R-CNN for Vehicle Detection," IEEE Intelligent Vehicles Symposium, pp. 124-129, 2016.
[3] S. Wang, Z. Li, H. Zhang et al., "Classifying Vehicles with Convolutional Neural Network and Feature Encoding," IEEE International Conference on Industrial Informatics, pp. 784-787, 2017.
[4] Y. He and L. Li, "A Novel Multi-Source Vehicle Detection Algorithm Based on Deep Learning," IEEE International Conference on Signal Processing, pp. 979-982, 2018.
[5] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," 2018, arXiv:1804.02767. [Online]. Available: https://arxiv.org/abs/1804.02767
[6] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," IEEE Conference on Computer Vision and Pattern Recognition, pp. 580-587, 2014.
[7] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2017.
[8] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788, Las Vegas, NV, 2016.
[9] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517-6525, 2017.