Detection and Identification of Non-Helmet Riders and Their License Plate Numbers

Authors: Sheik Arshad, Birali Prasanthi

DOI Link: https://doi.org/10.22214/ijraset.2023.49682

Abstract

Due to the ever-increasing number of vehicles, a variety of traffic regulations are imposed and enforced using various approaches. Two-wheelers are the most preferred and commonly used vehicles among the youth because of their low cost. Bike riders are expected to wear helmets, but this is often neglected, leading to accidents and deaths. Riding a bike without a helmet is a traffic offense: under road-safety rules, both the rider and the pillion rider must wear a helmet. In the current system, traffic offenses are largely monitored by the traffic police through CCTV records. The police zoom into every frame of the CCTV record and try to identify the license plates of non-helmet riders, which takes a lot of labor and time. The proposed system uses a deep learning algorithm, the convolutional neural network (CNN), to analyze real-time video footage from cameras installed in key locations, such as roads and intersections, and to identify individuals riding without helmets. The proposed system is expected to significantly reduce the number of non-helmet riding incidents and, therefore, decrease the number of fatalities and serious injuries resulting from motorcycle accidents.

I. INTRODUCTION

The helmet (also referred to as "protective gear") protects the motorcyclist against serious head injuries in road accidents. Although wearing a helmet is obligatory in most countries around the world, some people do not wear one while riding a bike. Although the governments of certain nations have deployed specialized sensors to verify the presence of a helmet, it would not be economically feasible to purchase sensors for each bike. Over the past several years, many studies have been conducted in traffic analysis and road safety, including vehicle recognition and categorization and the detection of helmets. This research has contributed to major technological advances in ensuring road safety. Manual enforcement of the helmet-wearing rule is not only time-consuming but also prone to errors. Automated enforcement using cameras can be more efficient and accurate, but it requires advanced image processing techniques. The proposed system uses convolutional neural networks (CNNs) to automatically identify non-helmet riders and their license plates. The system consists of three main stages: data collection, model training, and detection and identification. A prototype deep learning model is developed for the detection and identification of non-helmet riders. The results of the proposed system can be used to study further accuracy improvements in the deep learning model and to develop a practical solution that can be implemented in real-world scenarios.

II. LITERATURE SURVEY

The research focuses on the development of a deep learning model based on CNNs that can identify non-helmet riders and extract their license plate numbers. This information will help traffic enforcement agencies improve their ability to prevent accidents involving non-helmet riders through the use of new technologies.

According to the literature survey, existing methods and image processing techniques provide decent accuracy on high-quality traffic images, but they give poor results under unfavorable environmental conditions such as fog, haze, and pollution.
Most existing image processing techniques are better suited to high-quality images, and accuracy deteriorates when low-quality traffic images are used for model training.
Most detection systems employ image processing and feature extraction techniques. Deep learning techniques can be incorporated to further improve the efficiency and accuracy of the results. Because the detection system takes its input video from CCTV footage, the video can be pre-processed and each frame can be given as input to an artificial neural network built with deep learning algorithms. This might require a GPU, but the detection accuracy can be improved.

B. Srilekha, K. V. D. Kiran, and Venkata Vara Prasad Padyala in "Detection of License Plate Numbers and Identification of Non-Helmet Riders using Yolo v2 and OCR Method" [1] proposed a CNN model that uses YOLOv2 for the detection of non-helmet riders. A HOG (histogram of oriented gradients) feature vector is used to filter the bikes. When the CNN model recognizes a bike/motorcycle, it checks whether the rider is wearing a helmet. The license plate of the non-helmet rider is then located and its number extracted using Tesseract OCR.

Md. Iqbal Hossain, Raghib Barkat Muhib, and Amitabha Chakrabarty in "Identifying Bikers Without Helmets Using Deep Learning Models" [2] proposed TensorFlow-based SSD MobileNet V2 and Faster R-CNN models for the detection of non-helmet riders. A video dataset of the busiest roads of Dhaka, Bangladesh was collected in 720p HD resolution at 30 fps. The SSD MobileNet V2 and Faster R-CNN Inception V2 models, built with the TensorFlow deep learning framework, were used for object detection on the dataset. The authors report that the proposed model outperforms other related helmet detection and license plate recognition systems. The model achieved a frame rate of approximately 45 frames per second (FPS) on an NVIDIA RTX 2080 GPU and performed successfully even with 6 bikes in a single frame.

The activities of the proposed system are broken down into two stages:

a. Detection: In the detection stage, the model detects only bike/motorcycle riders and ignores others such as pedestrians and bicycle riders. The model should detect the bike riders and locate the region of interest (ROI), which is the region around the bike rider's head and is later used to check whether the rider is wearing a helmet. The model should be trained well enough to differentiate between a helmet and a head cap. The detection stage ends when the ROI has been found in the input footage.

b. Identification: Once the region of interest (ROI) is located, the system processes it to identify whether the bike rider is wearing a helmet. The model should be able to classify each ROI into one of two classes, helmet and non-helmet. If the bike rider is not wearing a helmet, or is wearing a cap, the case is assigned to the non-helmet class. When the identification stage is done, the license plates of the riders in the non-helmet class are extracted and processed. This system helps the traffic police efficiently determine non-helmet riding offenses and issue challans accordingly.

III. METHODOLOGY

A pre-trained YOLOv3 model is employed to recognize motorcycles and license plates. The convolutional neural network (CNN) model is built with the Sequential() model of tensorflow.keras. This model is trained on two classes, Helmet and Non-Helmet, for the detection of non-helmet riders. The input provided to the model is video/CCTV footage of the road, which is treated as a collection of frames. Each video frame is processed to obtain the region of interest (ROI) for the helmet, which is then passed to the trained CNN model. The model finally predicts whether the bike/motorcycle rider is wearing a helmet. The methodology for implementing the proposed system consists of two stages: model training and detection.
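The frame-by-frame flow described above can be sketched as a small driver loop. This is only an illustration under stated assumptions: the names `find_violations`, `detect_riders`, `wears_helmet`, and `locate_plate` are hypothetical and stand in for the YOLOv3 detector, the trained CNN classifier, and the plate locator, none of which are implemented here.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def find_violations(
    frames: Iterable[object],
    detect_riders: Callable[[object], List[Box]],
    wears_helmet: Callable[[object, Box], bool],
    locate_plate: Callable[[object, Box], Box],
) -> List[Tuple[int, Box, Box]]:
    """Return (frame_index, rider_box, plate_box) for every non-helmet rider.

    The three callables are hypothetical placeholders for the detector,
    the helmet classifier, and the license plate locator.
    """
    violations = []
    for index, frame in enumerate(frames):
        for rider_box in detect_riders(frame):
            if wears_helmet(frame, rider_box):
                continue  # helmet class: nothing to report
            plate_box = locate_plate(frame, rider_box)
            violations.append((index, rider_box, plate_box))
    return violations
```

Passing the models in as callables keeps the loop independent of any particular detector, so each stage can be swapped or tested with simple stubs.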

A. System Architecture

The system architecture is the conceptual model that defines how the proposed system is implemented. It gives an overview of the structure, tasks, and functions performed by the system. The basic architecture of the proposed system is given below.

B. Model Training

1. Image Pre-Processing Steps

As each frame of the input CCTV footage is read, image pre-processing steps are applied to it so that the CNN model can read it as input. These steps are mandatory for every input frame. They can include resizing and changing the brightness, intensity, and pixel values of the images.

The main aim of this step is to improve the image data and suppress distortions. Enhancing the image features can be significant for further processing. The fully connected layers of a convolutional neural network require the input images to be arrays or matrices of the same size. Image pre-processing can also help reduce the training time of the model and speed up inference.
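The same-size requirement can be illustrated with a minimal NumPy sketch. It uses nearest-neighbour resizing purely for illustration; the function name `resize_nearest` and the 64x64 target size are assumptions, and a real pipeline would more likely call a library routine such as OpenCV's `cv2.resize`.

```python
import numpy as np


def resize_nearest(image: np.ndarray, height: int, width: int) -> np.ndarray:
    """Resize an H x W (x C) image by nearest-neighbour sampling."""
    src_h, src_w = image.shape[:2]
    # For every output row/column, pick the source row/column it maps to.
    rows = np.arange(height) * src_h // height
    cols = np.arange(width) * src_w // width
    return image[rows[:, None], cols]


# Frames of different sizes become one fixed-size batch for the network.
frames = [np.zeros((120, 160, 3), np.uint8), np.zeros((90, 70, 3), np.uint8)]
batch = np.stack([resize_nearest(f, 64, 64) for f in frames])
```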

2. Image Normalization

The pixel values of the input images have to be scaled or normalized before the images can be given as input to the deep learning neural network model. Image normalization is often used to prepare the image dataset for model training.

Normalization brings the images to a common statistical distribution in terms of pixel values and size. It is a significant step because a fixed pixel distribution makes convergence faster while training the network.
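Two common normalization schemes are sketched below as an illustration; the source does not state which one the system uses, so both function names and the choice of schemes are assumptions.

```python
import numpy as np


def scale_to_unit(images: np.ndarray) -> np.ndarray:
    """Map 8-bit pixel values from [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0


def standardize(batch: np.ndarray, eps: float = 1e-7) -> np.ndarray:
    """Give each channel of an N x H x W x C batch zero mean and unit variance."""
    mean = batch.mean(axis=(0, 1, 2), keepdims=True)
    std = batch.std(axis=(0, 1, 2), keepdims=True)
    return (batch - mean) / (std + eps)  # eps guards against division by zero
```

Scaling to [0, 1] only changes the range, while standardization also centres the data, which is the property usually credited with faster convergence.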

3. Sequential Model

An artificial neural network can be created by calling the Sequential() API of Keras models. The Sequential model is appropriate for a plain stack of neural network layers in which each layer has exactly one input tensor and one output tensor. It arranges the Keras layers in sequential order so that data flows from one layer to the next in the specified order until it reaches the output layer. Convolutional layers, max-pooling layers, and fully connected (dense) layers are added to the model to form the network.

C. Model Detection

1. Forward Propagation in CNN

Forward propagation in a convolutional neural network involves receiving the input data, processing it, and generating the output. A convolutional neural network identifies and compares image objects based on their pixel values. The image is read as pixel values, and this data is processed in the multiple layers of the neural network. Significant differences among the pixel values of the input image can be used by the model for learning. To capture the pixel information, the input image is convolved with a filter, also known as a "kernel". An example of how pixel information is extracted from an image is shown below.
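As a minimal sketch of this filtering step, the function below slides a kernel over a single-channel image without padding. As in most deep learning frameworks, the kernel is not flipped, so strictly speaking this is cross-correlation. The function name and the edge-detecting kernel are illustrative assumptions, not taken from the proposed system.

```python
import numpy as np


def convolve2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w), dtype=np.float32)
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kh, col:col + kw]
            output[row, col] = np.sum(patch * kernel)
    return output


# A 5x5 image that is dark on the left and bright on the right.
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=np.float32)
# A kernel that responds to dark-to-bright transitions from left to right.
vertical_edge = np.array([[-1, 0, 1]] * 3, dtype=np.float32)
response = convolve2d_valid(image, vertical_edge)
```

The response is zero over the flat dark region and positive wherever the window covers the brightness change, which is how a kernel turns pixel differences into a feature.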

2. Blob From Image

A blob is a collection of one or more images with the same spatial dimensions (width and height) and the same depth (number of channels), all preprocessed in the same manner.

OpenCV's deep neural network module (dnn) provides two functions, blobFromImage and blobFromImages, that perform image preprocessing and prepare images for classification by trained deep learning models. The blobFromImage function creates a 4D blob from an image, optionally resizing the image and cropping it from the center, subtracting mean values, and scaling the values by a scale factor.
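To make the 4D layout concrete, the sketch below approximates the arithmetic part of blob creation in plain NumPy. It deliberately omits the resize and center-crop steps, and the name `to_blob` is hypothetical; in practice the OpenCV function itself would be called, for example `cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)`, where the 416x416 size is an assumption rather than a value given in the source.

```python
import numpy as np


def to_blob(image_bgr, scale=1 / 255.0, mean=(0.0, 0.0, 0.0), swap_rb=True):
    """Approximate the mean-subtract, scale, and reorder steps of blob creation.

    Input:  H x W x 3 image in BGR channel order.
    Output: 1 x 3 x H x W float32 array (batch, channels, height, width).
    """
    image = image_bgr.astype(np.float32)
    if swap_rb:
        image = image[..., ::-1]  # BGR -> RGB
    image = (image - np.asarray(mean, dtype=np.float32)) * scale
    # Move channels first and add the leading batch dimension.
    return np.ascontiguousarray(image.transpose(2, 0, 1)[np.newaxis])
```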

3. Region Of Interest

The region of interest (ROI) in an image is a segment of the image data selected for a particular purpose. The ROI is used as input to the network for classification. Its coordinates are calculated from the input image.

Here, the proposed system finds the ROI for helmet detection and extracts it as input to the trained network. The model then classifies the ROI into the helmet or non-helmet class based on the features it observes.

The figure above corresponds to the ROI for helmet detection. Here, the bike/motorcycle rider is wearing a helmet, and the ROI identified is provided as input to the trained network. The model detects the presence of the helmet based on its learning and classifies the ROI as helmet. Similarly, if the bike/motorcycle rider is not wearing a helmet, the corresponding ROI sent to the network is classified as non-helmet by the model, as shown in the figure below.

In the above case, the bike rider is not wearing a helmet/protective gear, which is a traffic offense. The ROI is classified as non-helmet, and when the model detects the non-helmet class it looks for the ROI of that rider's license plate. It detects this ROI and extracts the license plate.
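One simple way to obtain a head ROI from a detected rider box is sketched below. The source does not specify how the coordinates are computed, so the rule used here (take the top quarter of the rider's bounding box) and the names `head_roi` and `head_fraction` are assumptions made purely for illustration.

```python
import numpy as np


def head_roi(frame: np.ndarray, rider_box, head_fraction: float = 0.25):
    """Crop the upper part of a rider's bounding box, clipped to the frame.

    rider_box is (x, y, width, height) in pixels; head_fraction is the
    assumed share of the box height that contains the head.
    """
    x, y, w, h = rider_box
    frame_h, frame_w = frame.shape[:2]
    left, top = max(x, 0), max(y, 0)
    right = min(x + w, frame_w)
    bottom = min(y + int(h * head_fraction), frame_h)
    return frame[top:bottom, left:right]
```

The crop is a view into the frame, so it can be resized and normalized before being passed to the classifier without copying the full image.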

4. Convolutional Neural Network (CNN)

The convolutional neural network (ConvNet/CNN) is a deep learning artificial neural network used primarily for image classification. It takes an image as input and assigns learnable weights and biases (importance) to various features in the image, derived from the pixel data. A CNN needs less pre-processing than other image classification algorithms. Filters are added to the neural network layers, and the CNN learns these filters and the characteristics of the image data. For the proposed system, the CNN is instantiated as a Keras Sequential model, because each layer of a Sequential model has exactly one input and one output and the layers are stacked together to form the entire neural network.

The typical layers in the CNN detection model are stacked as: Convolutional-Pooling layer -> Convolutional-Pooling layer -> Flattened layer -> Multiple Dense layers.

Convolutional and pooling layers are used to extract features from the input images while preserving the significant dependencies among pixels. They also reduce the dimensions of the input by reducing the number of pixels, which speeds up training. These layers are stacked together in pairs when building the neural network model.
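The layer order described above can be written down as a short Keras sketch. The input size, filter counts, dense width, optimizer, and loss below are assumptions chosen for illustration; the source does not give the hyperparameters of the trained model.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_helmet_classifier(input_shape=(64, 64, 3), num_classes=2):
    """Conv-Pool -> Conv-Pool -> Flatten -> Dense layers, as a Sequential stack."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        # One output per class: helmet and non-helmet.
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_helmet_classifier().summary()` prints the shape after each layer, which makes the down-sampling effect of the pooling layers visible.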

5. YOLOv3

You Only Look Once (YOLO) is a neural network algorithm designed to detect and identify objects of different classes in an image in real time. YOLO treats object detection as a regression problem and produces class probabilities for the objects recognized in the image. It uses a convolutional neural network (CNN) to recognize and classify objects in real time. YOLO also predicts the positions of the detected objects, and the complete image is processed by a single neural network.

The proposed system uses a pre-trained YOLOv3 network to identify the bikes and the number plates. The network splits the input image into multiple regions and generates class probabilities for each region. YOLO predicts several bounding boxes covering some of these regions and keeps the one with the best probability. Here, pre-trained YOLOv3 weights and configurations are employed for the detection of bikes and their number plates.

YOLOv3 has 106 layers in total, and detections are made at layers 82, 94, and 106. Each convolutional layer is followed by a batch normalization layer and a leaky ReLU activation function. The network has no pooling layers; instead, additional convolutional layers with stride 2 are used to down-sample the feature maps.

The weights of the YOLOv3 model are saved in a binary file called yolov3.weights. The file stores weights only for the convolutional layers, and how the weights are applied depends on the type of convolutional layer.
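Picking one box among several overlapping candidates is commonly done with non-maximum suppression (NMS). The pure-Python sketch below shows the idea; the thresholds are assumed values, and a real pipeline would typically rely on a library routine such as OpenCV's `cv2.dnn.NMSBoxes` instead.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Return indices of the boxes to keep, highest score first."""
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```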
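The effect of stride-2 convolutions on feature-map size can be checked with a few lines of arithmetic. The sketch assumes a 416x416 input and 3x3 kernels with padding 1, which are common YOLOv3 settings but are not stated in the source; the function names are illustrative.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample_trace(input_size: int = 416, steps: int = 5):
    """Feature-map sizes after successive 3x3, stride-2, padding-1 convolutions."""
    sizes = []
    size = input_size
    for _ in range(steps):
        size = conv_output_size(size, kernel=3, stride=2, padding=1)
        sizes.append(size)
    return sizes


trace = downsample_trace()     # sizes after each down-sampling step
grids = trace[::-1][:3]        # coarse-to-fine grids used for detection
total_boxes = 3 * sum(g * g for g in grids)  # assuming 3 anchor boxes per cell
```

Under these assumptions, the three detection layers operate on 13x13, 26x26, and 52x52 grids, so larger objects are handled on the coarse grid and smaller ones, such as distant license plates, on the fine grid.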
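How such a file is laid out can be sketched as follows. This is based on the Darknet weight format as it is commonly documented, not on anything stated in the source, so it should be read as an assumption: a small integer header, followed by 32-bit floats stored layer by layer. For a convolutional layer with batch normalization, the stored order is usually biases, scales, running means, running variances, then kernel weights; without batch normalization it is biases, then kernel weights. All function names are illustrative.

```python
import struct


def read_darknet_header(data: bytes):
    """Parse the version header; return (version, images_seen, payload_offset)."""
    major, minor, revision = struct.unpack_from("<3i", data, 0)
    if major * 10 + minor >= 2:
        (seen,) = struct.unpack_from("<q", data, 12)  # 64-bit counter
        offset = 20
    else:
        (seen,) = struct.unpack_from("<i", data, 12)  # 32-bit counter
        offset = 16
    return (major, minor, revision), seen, offset


def read_floats(data: bytes, offset: int, count: int):
    """Read `count` little-endian float32 values; return them and the new offset."""
    values = struct.unpack_from(f"<{count}f", data, offset)
    return list(values), offset + 4 * count


def conv_param_count(in_channels, filters, kernel_size, batch_normalize):
    """Number of floats stored for one convolutional layer."""
    kernel_weights = filters * in_channels * kernel_size * kernel_size
    # With batch norm: bias, scale, mean, variance per filter. Otherwise: bias only.
    per_filter_vectors = 4 if batch_normalize else 1
    return per_filter_vectors * filters + kernel_weights
```

In practice the file is rarely parsed by hand; OpenCV's `cv2.dnn.readNetFromDarknet` takes the configuration file and the weights file and builds the network directly.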

IV. CHALLENGES

Several challenges arise in implementing the proposed system for Non-Helmet rider detection. A few of those challenges include:

A. Data Collection

Collecting an image dataset of large size with all different types of images and videos that include both helmet and non-helmet riders can be a challenging task. It requires time, effort, and resources for the collection of the dataset.

B. Model Training

Training a CNN model requires powerful hardware and can take a long time, especially with a large dataset. The images should be pre-processed with respect to their dimensions and pixel information.

C. Accuracy

Achieving high accuracy in the detection of non-helmet riders and their license plate numbers can be challenging due to factors such as lighting conditions, the quality of the images or videos, the camera angle, and head coverings such as caps and turbans that look similar to helmets but are not helmets.

V. FUTURE SCOPE

The image dataset can be enlarged by adding a variety of images that include bike riders wearing helmets, caps, turbans, scarves, etc., and bike riders not wearing any helmet.
The detection code can be embedded in live monitoring displays that take their input from real-time CCTV camera feeds.
A facility to convert the extracted license plate images to text, to obtain the actual license number of the non-helmet rider, can be added to the system. This would make it possible to design an automatic system that sends challans to the non-helmet riders.
The detection and identification could work even better when the system runs on a GPU, allowing it to run smoothly with less delay in the output display.
The entire detection system can be embedded in the cameras, building an IoT system for real-time detection.

Conclusion

While riding a motorcycle or bike, the rider must always wear a helmet; not doing so is a traffic offense. Neglect of this rule has escalated the number of bike accidents and deaths. Everyone on the road must follow the traffic rules, and bike riders have to wear helmets for their own safety. The proposed system demonstrates that deep learning, specifically CNNs, can be used to detect and identify non-helmet riders and their license plate numbers from CCTV footage. However, there are several challenges in implementing this system, such as data collection, model training, and accuracy. Further research can improve the accuracy of the model and address these challenges. The TensorFlow Keras Sequential() model is used to train the CNN model. The trained CNN model is then loaded into the system for the detection of non-helmet riders. The pre-trained YOLOv3 weights file is used for the detection of license plates. The OpenCV module is used to take the input, process it, and display the output frames as a video.

References

[1] B. Srilekha, K. V. D. Kiran and V. V. P. Padyala, "Detection of License Plate Numbers and Identification of Non-Helmet Riders using Yolo v2 and OCR Method," 2022 International Conference on Electronics and Renewable Systems (ICEARS), 2022, pp. 1539-1549, doi: 10.1109/ICEARS53579.2022.9751989.
[2] M. I. Hossain, R. B. Muhib, and A. Chakrabarty, "Identifying Bikers Without Helmets Using Deep Learning Models," 2021 Digital Image Computing: Techniques and Applications (DICTA), 2021, pp. 01-08, doi: 10.1109/DICTA52665.2021.9647170.
[3] S. Kadam, R. Hirve, N. Kawle and P. Shah, "Automatic Detection of Bikers with No Helmet and Number Plate Detection," 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), 2021, pp. 1-5, doi: 10.1109/ICCCNT51525.2021.9579898.
[4] M. A. V. Forero, "Detection of motorcycles and use of safety helmets with an algorithm using image processing techniques and artificial intelligence models," MOVICI-MOYCOT 2018: Joint Conference for Urban Mobility in the Smart City, 2018, pp. 1-9, doi: 10.1049/ic.2018.0001.
[5] K. C. D. Raj, A. Chairat, V. Timtong, M. N. Dailey and M. Ekpanyapong, "Helmet violation processing using deep learning," 2018 International Workshop on Advanced Image Technology (IWAIT), 2018, pp. 1-4, doi: 10.1109/IWAIT.2018.8369734.

Copyright

Copyright © 2023 Sheik Arshad, Birali Prasanthi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET49682

Publish Date : 2023-03-20

ISSN : 2321-9653

Publisher Name : IJRASET