Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Niharika Gupta, Priya Khobragade
DOI Link: https://doi.org/10.22214/ijraset.2023.48665
Humans are very proficient at perceiving natural scenes and understanding their contents. Every day, the volume of image content across the globe increases rapidly, and these images need to be classified for further research. Scene classification is a challenging task because some natural scenes share common features, and some images contain both indoor and outdoor scene features. In this project we classify natural scenery in images using Artificial Intelligence. Based on an analysis of the error backpropagation algorithm, we propose a training criterion for deep neural networks, the maximum-margin minimum classification error (M3CE). At the same time, the cross-entropy and M3CE criteria are analyzed and combined to obtain better results. Finally, we tested the proposed M3CE-CEc on two standard deep learning databases, MNIST and CIFAR-10. The experimental results show that M3CE can enhance the cross-entropy and is an effective supplement to the cross-entropy criterion. M3CE-CEc obtained good results on both databases.
I. INTRODUCTION
Traditional machine learning methods mostly use shallow structures to deal with a limited number of samples and computing units. The convolutional neural network (CNN) developed in recent years has been widely used in the field of image processing because it is good at image classification and recognition problems, and it has greatly improved the accuracy of many machine learning tasks. It has become a powerful and universal deep learning model.
Image classification is a task that requires a machine to distinguish between different classes of objects in images, such as streets, buildings, seas, glaciers, forests and mountains. This task is challenging due to the sheer variety of objects that can appear in an image. Traditional approaches involve manually labelling each image with the categories to which its objects belong. However, as the number of categories and the complexity of the images increase, manual labelling becomes increasingly difficult and time consuming. A new reconstruction algorithm based on convolutional neural networks is proposed by Newman et al., and its advantages in speed and performance are demonstrated.
Deep learning is an AI technique that has been shown to be effective at classifying complex images. It uses a convolutional neural network (CNN), which is a type of artificial neural network (ANN), to extract features from each image and then trains its weights and biases to recognize patterns in the data. The CNN can then be used to identify objects of different classes in the images and to predict which class each object belongs to. Additionally, transfer learning can further improve accuracy by reusing a network trained on other datasets, which yields models that generalize better. Finally, computer vision algorithms such as edge detection, colour histogram analysis, region segmentation, and pattern recognition can also be employed to classify images.
II. PROPOSED METHODOLOGY
Although all study domains share some steps in the experimental design, the use of an ML approach must be cross-disciplinary. For image classification specifically, the ML methodology consists of the following essential steps: data collection, data pre-processing and augmentation, model selection, model training, model evaluation, and parameter tuning.
6. Parameter Tuning: Tune the hyperparameters of the model to further optimize the accuracy. Hyperparameter tuning consists of finding the set of hyperparameter values that a learning algorithm should use on a given data set. The chosen combination of hyperparameters maximizes the performance of the model by minimizing a predefined loss function, producing better results with fewer errors. On Google Cloud, hyperparameter tuning uses the platform's processing infrastructure to test different hyperparameter configurations while the model is trained. It can return optimized hyperparameter values that maximize the model's predictive accuracy.
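The search over hyperparameter configurations described in this step can be illustrated with a minimal grid search. This is only a sketch under stated assumptions: `validation_loss` is a hypothetical stand-in for training the classifier and measuring its validation loss, and the search space values are invented for illustration; they are not the settings used in this work.

```python
from itertools import product


def validation_loss(learning_rate, batch_size):
    """Hypothetical stand-in for 'train the model, return validation loss'.

    A real run would train the classifier with these settings. This toy
    surface simply has its minimum at learning_rate=0.01, batch_size=32.
    """
    return (learning_rate - 0.01) ** 2 * 1e4 + abs(batch_size - 32) / 32


def grid_search(grid, objective):
    """Evaluate every combination in `grid` and return the best one."""
    names = list(grid)
    best_params, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Assumed search space, chosen only for illustration.
search_space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [16, 32, 64],
}
best_params, best_loss = grid_search(search_space, validation_loss)
print(best_params, best_loss)
```

Grid search is the simplest strategy; managed tuning services typically offer more sample-efficient searches, but the loop above shows the underlying idea of scoring each configuration with the same loss function.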
Machine learning techniques are widely applied, and the number of published articles has grown considerably in recent years. The steps above are those by which the model is trained and tested, so that it reaches an acceptable classification accuracy.
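As an illustration of the data pre-processing and augmentation step named in this section, the following sketch scales pixel values and adds a mirrored copy of each image. It assumes images are plain nested lists of 8-bit grayscale values; the function names and the tiny example image are hypothetical, and the specific augmentations used in this work are not stated in the text.

```python
def normalize(image):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right; the scene label is unchanged."""
    return [row[::-1] for row in image]


def augment(dataset):
    """Return the original samples plus a flipped copy of each."""
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((horizontal_flip(image), label))
    return augmented


# Hypothetical 2x3 image, used only to show the transformations.
tiny_image = [[0, 128, 255], [255, 128, 0]]
dataset = [(normalize(tiny_image), "sea")]
augmented = augment(dataset)
print(len(augmented))
```

A horizontal flip is a reasonable choice for natural scenes because a mirrored forest or sea is still a forest or sea, so the label can be kept as it is.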
III. DATABASE, PACKAGES
A. Database: Intel image classification
The Intel Image Classification dataset is a collection of images of natural scenes from around the world, organized into six categories: buildings, forests, glaciers, mountains, sea, and streets. The dataset includes approximately 25,000 images, each with a size of 150x150 pixels. The images are split into training, test, and prediction sets, with approximately 14,000, 3,000, and 7,000 images in each set, respectively.
Intel first made the dataset available as part of an image classification challenge on the Analytics Vidhya website. It can be used to train and evaluate machine learning models for tasks such as image classification and object recognition. The six classes offer a diverse collection of images for evaluating a model's performance on a variety of natural scenes. Overall, the Intel Image Classification dataset is a valuable resource for researchers and developers working in machine learning and image processing.
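A dataset organized by category is often stored with one folder per class. The sketch below indexes such a layout and counts images per class. The folder names in `CLASS_NAMES`, the `.jpg` extension, and the `data/train` path are assumptions made for illustration and should be adjusted to the actual download.

```python
from pathlib import Path

# Assumed folder names, one per scene category (label index = position).
CLASS_NAMES = ["buildings", "forest", "glacier", "mountain", "sea", "street"]


def index_images(root):
    """Return (image_path, label_index) pairs found under root/<class>/*.jpg."""
    samples = []
    for label, name in enumerate(CLASS_NAMES):
        class_dir = Path(root) / name
        if not class_dir.is_dir():
            continue  # tolerate a missing class folder
        for path in sorted(class_dir.glob("*.jpg")):
            samples.append((path, label))
    return samples


def class_counts(samples):
    """Count how many indexed images belong to each class."""
    counts = {name: 0 for name in CLASS_NAMES}
    for _, label in samples:
        counts[CLASS_NAMES[label]] += 1
    return counts


# "data/train" is a hypothetical location; an absent folder yields zero counts.
print(class_counts(index_images("data/train")))
```

Checking the per-class counts before training is a quick way to confirm that all six categories were found and that no class is badly under-represented.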
B. Software: Colab Notebook, Visual Studio
C. Packages
The packages that are used to build the model are
IV. ALGORITHMS
A. Transfer Learning
Transfer learning is a machine learning technique in which a model trained on one task is used as the starting point for a model on a second, related task. This allows the second model to benefit from the knowledge and experience of the first model, and can greatly speed up training and improve performance.
Transfer learning is often used when there is a lack of labeled data for the target task, or when the target task is related to the source task. For example, a model trained on image classification tasks could be used as the starting point for a model that performs object detection, since the two tasks are related. Similarly, a model trained on natural language processing tasks could be used as the starting point for a model that performs sentiment analysis, since both tasks involve working with text data.
Transfer learning is a powerful tool that can greatly speed up model training and improve performance. By leveraging the knowledge and experience of pre-trained models, transfer learning allows you to quickly and easily build and train models for new tasks, even with limited data. It is used extensively in a variety of fields, including speech recognition, natural language processing, and computer vision.
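The idea of reusing a pre-trained model can be sketched in a few lines. In this toy example a fixed "base" stands in for layers learned on a source task and is kept frozen, while only a new classification head is trained on a small target dataset. The weights in `PRETRAINED_W`, the data, and the learning settings are all hypothetical; a real application would load an actual pre-trained CNN rather than this stand-in.

```python
import math

# Hypothetical "pre-trained" base: fixed weights standing in for layers
# learned on a source task. They are never updated below (frozen).
PRETRAINED_W = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen base: linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only a new logistic-regression head on the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = extract_features(x)
            p = sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias)
            error = p - y  # gradient of the cross-entropy w.r.t. the logit
            weights = [w - lr * error * fi for w, fi in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    f = extract_features(x)
    return int(sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias) >= 0.5)


# Small target-task dataset (invented): label is 1 when x[0] > x[1].
target_data = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
base_before = [row[:] for row in PRETRAINED_W]
weights, bias = train_head(target_data)
predictions = [predict(x, weights, bias) for x, _ in target_data]
print(predictions)
```

Freezing the base and training only the head is the cheapest form of transfer learning; fine-tuning some of the base layers afterwards is a common next step when more target data is available.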
B. Convolutional Neural Networks
A convolutional neural network (CNN) is a type of neural network designed to process data that has a grid-like structure, such as an image. It is composed of multiple layers, including convolutional layers, pooling layers, and fully-connected layers.
The convolutional layers of a CNN apply a series of filters to the input data in order to extract features. In most cases these filters are small two-dimensional matrices that are applied to local regions of the input. By sliding the filters across the entire input, the convolutional layers extract a rich set of features that capture the spatial relationships in the data. The pooling layers of a CNN downsample the output of the convolutional layers, reducing the spatial dimensions of the data and increasing the robustness of the features. This reduces the number of parameters in the subsequent layers and helps prevent overfitting. Finally, the fully-connected layers of a CNN make predictions based on the extracted features. The last of these layers typically uses a softmax activation function to produce a probability for each class, allowing the CNN to make multi-class predictions. Overall, CNNs are a powerful and widely used tool for image classification tasks, including multi-class classification. They learn complex patterns in the data automatically and achieve state-of-the-art performance on numerous tasks.
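The three layer types described above can be traced on a toy input. The following sketch applies one filter, one 2x2 max-pooling step, and a fully-connected layer with softmax. The image, the edge-detecting kernel, and the dense weights are hypothetical values picked to make the arithmetic easy to follow; in a trained CNN the kernel and dense weights are learned from data.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D cross-correlation (what CNN libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + a][c * size + b] for a in range(size) for b in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def dense(inputs, weights, biases):
    """Fully-connected layer: one weighted sum per output class."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# 5x5 image with a vertical edge between the dark left and bright right side.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

features = conv2d(image, kernel)          # 4x4 feature map
pooled = max_pool(features)               # 2x2 after pooling
flat = [v for row in pooled for v in row]
# Hypothetical dense weights for two classes.
logits = dense(flat, [[1, 0, 1, 0], [0, 1, 0, 1]], [0.0, 0.0])
probs = softmax(logits)
print(features[0], pooled, probs)
```

The feature map is non-zero only where the filter overlaps the edge, and pooling keeps that response while halving each spatial dimension, which is the downsampling effect described in the text.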
There are several types of convolutional neural networks (CNNs), which differ in the architecture and the specific operations used in the convolutional and pooling layers. Some common types of CNNs include:
Overall, there are many different types of CNNs, each with its own strengths and weaknesses. The choice of architecture will depend on the specific characteristics of the data and the requirements of the task.
V. FUTURE SCOPE
Classification is the process of organizing data into classes or categories based on particular characteristics. It is a crucial step in machine learning, where algorithms are trained to classify data based on previously labeled examples. The model built in this work is used to classify images obtained from various sources. It can be developed further by increasing the number of training epochs to attain more accurate results.
The model's performance can be enhanced, and its potential applications expanded, by increasing the number of epochs, that is, the number of complete passes the model makes over the training data. Potential applications include the recognition of objects and scenes in surveillance footage, the classification of terrain from satellite imagery, the filtering and organization of personal photo or video collections, assistance with artistic or creative projects, and the classification of geopolitical regions.
VI. ACKNOWLEDGEMENT
The author would like to express their appreciation to Prof. Priya Khobragade for her wise counsel and ongoing assistance during the project. They would also like to give particular thanks to Prof. Minakshee Chandankhede for her diligent supervision of the improvisation. We successfully finished this paper with your helpful guidance, and we are grateful to have both of you as our mentors.
Copyright © 2023 Niharika Gupta, Priya Khobragade. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET48665
Publish Date : 2023-01-15
ISSN : 2321-9653
Publisher Name : IJRASET