Greenscore Vehicle Identification Using CNN (Convolutional Neural Network)

Authors: Anuj Gupta, Aparna Saini, Yuvraj Joshi, Ankit Gupta, Abhishek Sarkar, Sujata Kukreti, Shashank Barthwal

DOI Link: https://doi.org/10.22214/ijraset.2022.45106

Abstract

Humans, the dominant species on Earth, have invented many technologies, and one of the greatest of these inventions is the vehicle (two-wheelers, four-wheelers, heavy-load vehicles, and commercial vehicles), which saves time, money, and physical effort. What we want most from road transport is less pollution and fewer accidents. Vehicles have made life easier, but as the population grows day by day and dependence on vehicles increases, the pollution they cause affects the environment as well as living creatures. Vehicle emissions such as carbon monoxide (CO), sulfur dioxide (SO2), and hydrocarbons harm nature and living beings. Governments have introduced many precautions and policies to control this pollution, such as the PUC certificate, odd-even vehicle schemes, and electric vehicle subsidies. Buyers, however, mainly focus on speed, torque, safety, and mileage, and forget to check a vehicle's Greenscore, which is also essential when buying and indicates where a vehicle falls in the greenest and meanest rankings. This research is about the identification and selection of vehicles (cars) using machine learning. We employed a CNN with multiple layers of different types, including ReLU, pooling, and dense layers. We also use batch normalization and dropout layers to prevent the model from overfitting. To improve the accuracy of the outcome, we applied augmentation techniques. The effect of employing max pooling in the CNN for feature mapping and reducing overfitting is shown below. With a model of 5 CNN hidden layers, we achieved 93 percent training accuracy and 86 percent testing accuracy on one dataset, and 99 percent training accuracy and 95 percent testing accuracy on another, using the same model and the same number of epochs. The model's output will aid in the prediction and selection of Greenscore cars.


I. INTRODUCTION

Buying green, according to the American Council for an Energy-Efficient Economy (ACEEE), is the first step toward decreasing the environmental impact of automotive use. The most essential factor is the vehicle you choose, but how you drive and how well you maintain your car, van, or light truck also play a role. The greenscore ranges from 0 to 100. This year's top vehicle achieved a highway mileage of 59, while the worst gas guzzler achieved a highway mileage of 23. The ranking is based on automakers' reported EPA test results for fuel efficiency and emissions, as well as an estimate of pollution from vehicle manufacturing, gasoline production and distribution, and vehicle tailpipes. It also takes into account air pollution, as shown in figure 1.

Image and speech recognition are among the many domains where Deep Neural Networks have shown impressive gains over the last decade. One of the most significant advantages of CNN models is the time they save, since a conventional ANN has numerous parameters. This success has prompted the research community to examine larger-scale models to address tough problems, which was not common in the past. CNNs make this possible.

The most basic assumption about problems suited to CNNs is that their features are not spatially dependent. In a facial recognition program, for example, we do not have to pay attention to the position of the faces in the photographs. Anywhere in the image will suffice as long as the faces are detected. As data propagates to higher layers, a CNN is also able to extract complicated patterns. In image classification, edges may be found in the first layer, followed by simple shapes in the second layer, and finally higher-level features in the third layer.

II. METHODOLOGY

A. Theoretical Background

Deep learning has proven to be a particularly useful technology in recent decades because of its capacity to handle massive volumes of data. Networks with hidden layers have eclipsed traditional techniques in popularity, particularly in pattern recognition. Convolutional Neural Networks (CNNs) are among the most widely used deep neural networks. Since the 1980s, when artificial intelligence was in its infancy, scholars have attempted to create a system that can make sense of visual input. This field came to be called computer vision in the years that followed. Computer vision reached the next level in 2012, when a group of university academics built an AI system that outperformed the best image recognizers by a considerable margin. The AI system, dubbed AlexNet, took first place in the 2012 ImageNet computer vision challenge with an accuracy of 84 percent, while the runner-up scored a respectable 74 percent. Convolutional Neural Networks, a form of neural network that approximates human vision, were at the heart of AlexNet. CNNs have since become an integral part of many computer vision applications.

B. Working of CNN algorithm

Many layers are used to model the image data. CNNs exploit spatial structure by building a local connection pattern between neurons in neighboring layers. In the CNN technique, each filter is replicated over the entire visual field. These replicated units share the same weights and bias and together form a feature map. The image depicts three hidden units of one feature map: weights of the same color are shared and must be identical. The gradient of a shared weight is the sum of the gradients of the parameters being shared. Replication allows us to recognize an object no matter where it is in our field of view. Weight sharing also reduces the number of free parameters that have to be learned. These constraints allow a CNN to generalize better on vision problems. CNNs typically make use of the non-linear down-sampling technique known as "max-pooling." This approach divides the input image into non-overlapping rectangles and outputs the maximum value of each sub-region, as shown in figure 2.

When we think about neural networks, we usually think of matrix multiplications, but this is not the case with CNNs. They employ a technique known as convolution. Convolution is a mathematical operation on two functions that yields a third function expressing how the shape of one is modified by the other.
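For reference, the standard textbook definition of convolution can be written as follows. The symbols f, g, I (image), and K (kernel) are notation introduced here for illustration and do not come from the paper.

```latex
% Continuous convolution of two functions f and g
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau

% Discrete 2-D convolution of an image I with a kernel K
(I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m,\; j - n)

% Cross-correlation, the unflipped variant that many deep learning
% libraries compute under the name "convolution"
S(i, j) = \sum_{m} \sum_{n} I(i + m,\; j + n)\, K(m, n)
```

Because the kernel values are learned, the flipped and unflipped forms are interchangeable in practice for a CNN.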

Here is an explanation of how a CNN recognizes an image: when the image is converted into RGB format, the dark pixels are marked as 1 and the light pixels as 0, as shown in figure 3.

CNN models were invented by Yann LeCun, the director of Facebook's AI Research group. In 1988, he created LeNet, the first convolutional neural network. Pattern classification tasks such as reading zip codes and digits were performed using LeNet.

McCulloch and Pitts, who provided an initial model in [3], devised the first technique. Neural networks are made up of layers that are connected to form the network. A feedforward neural network is a type of neural network that learns from its mistakes. Convolutional neural networks are feedforward neural networks that are commonly used to analyze images by processing data in a grid-like fashion.

Layers in a Convolutional Neural Network: a convolutional neural network has multiple hidden layers that help extract information from an image. The four important layers in a CNN are the convolution layer, the ReLU layer, the pooling layer, and the fully connected layer.

C. Convolution

This is the first step in the process of extracting valuable features from an image. A convolution layer has several filters that perform the convolution operation. Every image is treated as a matrix of pixel values. Consider a 5x5 image whose pixel values are either 0 or 1, together with a filter matrix of dimension 3x3. Slide the filter matrix over the image and compute the dot product at each position to get the convolved feature matrix.
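The sliding dot product described above can be sketched in a few lines of plain Python. The pixel and filter values below are hypothetical, chosen only for illustration (the paper's own values are not reproduced here), and the function name `convolve2d_valid` is likewise an assumption.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding, and
    return the matrix of dot products (the convolved feature map)."""
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(image_h - kernel_h + 1):
        row = []
        for c in range(image_w - kernel_w + 1):
            # Dot product of the filter with the window anchored at (r, c).
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5x5 binary image and 3x3 filter (illustrative values only).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

A 5x5 input and a 3x3 filter with stride 1 and no padding give a 3x3 feature map, because the filter fits in 5 - 3 + 1 = 3 positions along each axis.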

This is the starting layer, to which the 224x224 input image is passed and where 64 filters are applied. The values of the filter matrices are decided by the model itself; this is the good part of a CNN, as we do not have to design the filters by hand. The padding and shifting (stride) of the filter matrix have been set to 2. At the end, the CNN model produces a flattened 1-D matrix, which is finally classified. The CNN formula used to produce the matrix and the final output is shown in figure 4.

D. ReLU Activation Function

ReLU stands for rectified linear unit. Once the feature maps are extracted, the next step is to pass them to a ReLU layer. The results of a linear operation, such as convolution, are passed through a nonlinear activation function. Although smooth nonlinear functions like the sigmoid or hyperbolic tangent, shown in figure 5, were used previously, the rectified linear unit (ReLU) has become the most widely used nonlinear activation function because it is simple to compute.
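As a minimal sketch of this activation, ReLU can be applied element-wise to a feature map as shown below. The input values are hypothetical and the helper names `relu` and `relu_map` are assumptions made for illustration.

```python
def relu(x):
    """Rectified linear unit: negative inputs become 0, others pass through."""
    return max(0.0, x)


def relu_map(feature_map):
    """Apply ReLU to every element of a 2-D feature map."""
    return [[relu(value) for value in row] for row in feature_map]


# Hypothetical feature map containing negative and positive responses.
activated = relu_map([[-2.0, 3.0], [0.5, -0.1]])
print(activated)
```

The function needs only a comparison per element, which is the simplicity the text refers to when contrasting it with the sigmoid and hyperbolic tangent.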

E. Pooling Layer

A pooling layer down-samples the feature maps in-plane, which makes them translation invariant to slight shifts or distortions and reduces their number. Although filter size, stride, and padding are hyper-parameters of the pooling operation, as they are in convolution, none of the pooling layers contains learnable parameters, as shown in figure 6.
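The following sketch illustrates max pooling over non-overlapping windows. The 4x4 input is hypothetical, and the function name and its default window size and stride of 2 are assumptions for illustration.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Down-sample a 2-D feature map by keeping the maximum of each window.
    `size` and `stride` are hyper-parameters; nothing here is learned."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map, pooled with a 2x2 window and stride 2.
pooled = max_pool2d([
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
])
print(pooled)
```

Each 2x2 block collapses to a single value, so the height and width are halved while the strongest response in each region is kept.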

F. Fully Connected Layer

The features extracted by the pooling layer are usually flattened, that is, converted into a 1-D array of numbers (or vector), and connected to one or more fully connected layers, also known as dense layers, in which a learnable weight connects each input to each output. A subset of fully connected layers maps the features extracted by the convolution operation and down-sampled by the pooling layers to the network's final output, such as the probability of each class in classification tasks. The number of output nodes in the final fully connected layer is typically equal to the number of classes. Each fully connected layer is applied in turn, as shown in figure 7.
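A small sketch of the flatten, dense, and class-probability steps is given below. The weights, biases, and input values are hypothetical, the two-class output is chosen only to keep the example short, and softmax is used here as one common way of producing class probabilities (the text above does not name a specific function).

```python
import math


def flatten(feature_map):
    """Convert a 2-D feature map into a 1-D vector."""
    return [value for row in feature_map for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: every input is connected to every output
    by one weight, plus one bias per output node."""
    return [
        sum(w * x for w, x in zip(weight_row, inputs)) + bias
        for weight_row, bias in zip(weights, biases)
    ]


def softmax(logits):
    """Turn raw output scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pooled feature map and hypothetical weights for two classes.
vector = flatten([[1.0, 2.0], [0.0, 1.0]])
weights = [
    [0.5, -0.5, 0.0, 1.0],   # weights feeding output node 0
    [-0.5, 0.5, 1.0, 0.0],   # weights feeding output node 1
]
biases = [0.0, 1.0]
probabilities = softmax(dense(vector, weights, biases))
print(probabilities)
```

The output vector has one entry per class, which matches the statement that the final layer has as many nodes as there are classes.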

III. MODELING AND ANALYSIS

A. Experimental Procedure

1. Image Processing

The photos of heavy vehicles used in the study were obtained from Kaggle. We used the whole dataset except for a few images of vehicles that are not used as commercial vehicles for road shipments. All the heavy-load images are 224 by 224 pixels. The database contains 242 photos with a resolution of 224 x 224 pixels and 3 RGB channels, covering two different scenarios and split into two categories. For better results, we applied the augmentation technique to many types of automobiles. We need augmentation because it lets the model handle rotated and scaled images. For example, in an ANN, if the same image changes its axis and rotation, it becomes difficult to identify, as shown in figure 8.
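The idea behind augmentation can be sketched on a tiny grid of pixel values. The specific transformations below (a horizontal flip and a 90-degree rotation) are assumptions chosen for illustration; the paper does not list which augmentations were applied, and the function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus two transformed copies, so that the
    model sees the same content in several orientations."""
    return [image, flip_horizontal(image), rotate90_clockwise(image)]


# Hypothetical 2x2 "image" used only to make the transformations visible.
variants = augment([[1, 2], [3, 4]])
for variant in variants:
    print(variant)
```

Each training image yields several variants, which enlarges a small dataset and exposes the network to rotated versions of the same vehicle.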

2. Network Training and Testing

A training algorithm's goal is to train a network to have the smallest possible difference between its output and the desired output. This is how the error function is defined:

3. Model Preparation

There are three types of layers in a Convolutional Neural Network (CNN): the convolutional, pooling, and fully connected layers. The image is transformed into an RGB grid, and the CNN assigns the pixel values -1 and 1 in the matrix. The CNN is trained for 30 epochs and uses a dimension-reduction operation called max pooling.

In layer 1, we used the sequential model. This first CNN layer uses a 3x3 filter matrix and 64 filters, and the input image size is 224 x 224 pixels with 3 RGB channels. After that, the ReLU activation is applied, which converts all negative values to 0 and leaves the remaining values unchanged, introducing non-linearity; the output is 224 x 224 pixels with 64 filters. ReLU also helps to speed up computation in the CNN. Batch normalization is then applied to the 224 x 224 pixel output with 64 filters. Faster training is possible with batch normalization, which in some circumstances reduces the number of epochs by half or more.

In layer 2, we have conv2d_11 (Conv2D), (None, 222, 222, 64); here the CNN filter converts the 224 x 224 pixel input to 222 x 222 pixels with 64 filters. activation_15 (Activation), (None, 222, 222, 64), applies the same activation function, now at 222 x 222 pixels. max_pooling2d_4 (MaxPooling2D), (None, 111, 111, 64), is the first max pooling in this CNN; it uses a 2 x 2 window across the 64 filters and decreases the dimensions of the previous matrix. It is followed by batch_normalization_13 (BatchNormalization), (None, 111, 111, 64), and dropout_6 (Dropout), (None, 111, 111, 64).

In layer 3, conv2d_12 (Conv2D), (None, 111, 111, 64), operates on the image that has been reduced from 222 x 222 pixels to 111 x 111 pixels with 64 filters. activation_16 (Activation), (None, 111, 111, 64), is the same as the previous activation but at 111 x 111 pixels with 64 filters. It is followed by batch_normalization_14 (BatchNormalization), (None, 111, 111, 64).

In layer 4, conv2d_13 (Conv2D), (None, 109, 109, 64), decreases the image to 109 x 109 pixels, and this output is passed on with 64 filters for further processing. activation_17 (Activation), (None, 109, 109, 64), keeps the size at 109 x 109 pixels. max_pooling2d_5 (MaxPooling2D), (None, 54, 54, 64), is the second max pooling used in this CNN and reduces the output to 54 x 54 pixels with 64 filters. It is followed by batch_normalization_15 (BatchNormalization), (None, 54, 54, 64), and dropout_7 (Dropout), (None, 54, 54, 64).

In layer 5, we have conv2d_14 (Conv2D), (None, 54, 54, 64), followed by activation_18 (Activation), (None, 54, 54, 64), and batch_normalization_16 (BatchNormalization), (None, 54, 54, 64).
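The output sizes reported for layers 1 to 5 can be checked with simple arithmetic. The sketch below assumes 3x3 filters with stride 1, "same" padding in layers 1, 3, and 5, "valid" (no) padding in layers 2 and 4, and 2x2 max pooling. This padding pattern is inferred from the reported shapes and is not stated in the text, and the helper names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=1, padding="valid"):
    """Spatial output size of a convolution along one axis."""
    if padding == "same":
        # "same" padding keeps ceil(size / stride) positions.
        return -(-size // stride)
    # "valid" padding: the filter must fit entirely inside the input.
    return (size - kernel) // stride + 1


def pool_output_size(size, window=2):
    """Spatial output size of non-overlapping pooling along one axis."""
    return size // window


# Assumed sequence of operations, inferred from the reported shapes.
steps = [
    ("conv", "same"),    # layer 1: 224 -> 224
    ("conv", "valid"),   # layer 2: 224 -> 222
    ("pool", None),      #          222 -> 111
    ("conv", "same"),    # layer 3: 111 -> 111
    ("conv", "valid"),   # layer 4: 111 -> 109
    ("pool", None),      #          109 -> 54
    ("conv", "same"),    # layer 5: 54 -> 54
]

sizes = [224]
for kind, padding in steps:
    current = sizes[-1]
    if kind == "conv":
        sizes.append(conv_output_size(current, padding=padding))
    else:
        sizes.append(pool_output_size(current))

flattened = sizes[-1] * sizes[-1] * 64  # 64 feature maps per layer
print(sizes, flattened)
```

Under these assumptions the trace ends at 54 x 54 x 64 values, which is the size of the flattened vector passed to the dense layers.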

4. Fully Connected Layer

Once the size has been reduced to 54 x 54 pixels with 64 filters, the output reaches the final dense stage, where it is flattened into a single 1-D matrix: flatten_2 (Flatten), (None, 186624); dropout_8 (Dropout), (None, 186624); dense_4 (Dense), (None, 512); activation_19 (Activation), (None, 512); and batch_normalization_17 (BatchNormalization), (None, 512). In total, we trained 95,704,194 of the 95,705,858 parameters; the remaining 1,664 parameters are not trained (Total params: 95,705,858; Trainable params: 95,704,194; Non-trainable params: 1,664). For the other dataset we used the same model and the same number of epochs, so the parameter counts are identical.
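The reported parameter totals can be reproduced with the usual counting rules. The sketch below assumes five 3x3 convolutions with 64 filters, each followed by batch normalization, a 512-unit dense layer with batch normalization, and a final dense layer with 2 output nodes. The final 2-node layer is an assumption: it is not listed in the text above, but it is consistent with the two image categories and with the stated totals. The helper names are hypothetical.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights (kernel x kernel x in x out) plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_inputs, n_outputs):
    """One weight per input-output pair plus one bias per output."""
    return n_inputs * n_outputs + n_outputs


def batchnorm_params(channels):
    """Returns (trainable, non_trainable): the scale and shift are trained,
    while the moving mean and variance are tracked but not trained."""
    return 2 * channels, 2 * channels


NUM_CLASSES = 2  # assumption: two categories, as described for the dataset

trainable = 0
non_trainable = 0

in_channels = 3  # RGB input
for _ in range(5):  # five convolutional blocks with 64 filters each
    trainable += conv2d_params(3, in_channels, 64)
    bn_trainable, bn_frozen = batchnorm_params(64)
    trainable += bn_trainable
    non_trainable += bn_frozen
    in_channels = 64

flattened = 54 * 54 * 64  # 186624 values after the last block
trainable += dense_params(flattened, 512)
bn_trainable, bn_frozen = batchnorm_params(512)
trainable += bn_trainable
non_trainable += bn_frozen
trainable += dense_params(512, NUM_CLASSES)

total = trainable + non_trainable
print(total, trainable, non_trainable)
```

Almost all of the parameters sit in the first dense layer (186,624 x 512 weights), and the non-trainable parameters are the moving statistics kept by the batch normalization layers.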

IV. RESULTS AND DISCUSSION

The CNN model predicts the greenscore and the vehicle type for the input images, as shown in figure 10.

Conclusion

Having gone through the whole CNN model, we conclude that the model is able to predict the greenscore vehicle, showing the type and model of the vehicle. When training with a small dataset we obtained good accuracy, but accuracy decreases for vehicles of the same brand.

References

[1] Deepika Jaswal, Soman Kp, Sowmya Vishvanathan, "Image Classification Using Convolutional Neural Networks," International Journal of Scientific and Engineering Research, 5(6):1661-1668, 2014.
[2] Li Deng and Dong Yu, "Deep Learning: Methods and Applications," Microsoft Research. [Online]. Available at: http://research.microsoft.com/pubs/209355/NOW-Book-Revised-Feb2014-online.pdf
[3] Warren McCulloch and Walter Pitts, "A Logical Calculus of Ideas Immanent in Nervous Activity," Bulletin of Mathematical Biophysics, 5(4):115–133, 1943.
[4] An introduction to convolutional neural networks. [Online]. Available at: http://white.stanford.edu/teach/index.php/An_Introduction_to_Convolutional_Neural_Networks
[5] D. Hubel and T. Wiesel, "Receptive fields and functional architecture of monkey striate cortex," Journal of Physiology (London), 195, 215–243, 1968.
[6] Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner, "Gradient-Based Learning Applied to Document Recognition," Proc. of IEEE, November 1998.
[7] S. L. Phung and A. Bouzerdoum, "MATLAB library for convolutional neural network," Technical Report, ICT Research Institute, Visual and Audio Signal Processing Laboratory, University of Wollongong. Available at: http://www.uow.edu.au/~phung
[8] Tutorial on deep learning. [Online]. Available at: http://deeplearning.net/tutorial/lenet.html
[9] Edward H. Adelson, Charles H. Anderson, James R. Bergen, Peter J. Burt, and Joan M. Ogden, "Pyramid methods in image processing," RCA Engineer, 29(6):33-41, 1984.
[10] M. Riedmiller and H. Braun, "A direct adaptive method of faster backpropagation learning: The rprop algorithm," in IEEE International Conference on Neural Networks, San Francisco, 1993, pp. 586–591.
[11] S. L. Phung, A. Bouzerdoum, and D. Chai, "Skin segmentation using color pixel classification: analysis and comparison," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 1, pp. 148–154, 2005.
[12] Yi Yang and Shawn Newsam, "Bag-Of-Visual-Words and Spatial Extensions for Land-Use Classification," ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS), 2010.
[13] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba, "SUN Database: Large-scale Scene Recognition from Abbey to Zoo," IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] J. Xiao, K. A. Ehinger, J. Hays, A. Torralba, and A. Oliva, "SUN Database: Exploring a Large Collection of Scene Categories," (in revision) International Journal of Computer Vision (IJCV).
[15] R. E. Turner, "Lecture 14: Convolutional neural networks for computer vision," 2014.
[16] Source for highway images. [Online]. National Highway Authority of India, nhai.org
[17] Wei Xiong, Bo Du, Lefei Zhang, Ruimin Hu, and Dacheng Tao, "Regularizing Deep Convolutional Neural Networks with a Structured Decorrelation Constraint," IEEE 16th International Conference on Data Mining (ICDM), pp. 3366–3370, 2016.
[18] Y. Taigman, M. Yang, M. A. Ranzato, and L. Wolf, "Deepface: Closing the gap to human-level performance in face verification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701-1708, 2014.
[19] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, 86(11), pp. 2278-2324, 1998.

Copyright

Copyright © 2022 Anuj Gupta, Aparna Saini, Yuvraj Joshi, Ankit Gupta, Abhishek Sarkar, Sujata Kukreti, Shashank Barthwal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET45106

Publish Date : 2022-06-30

ISSN : 2321-9653

Publisher Name : IJRASET
