Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Sneha Verma, Mr. Aman Kumar Sharma
DOI Link: https://doi.org/10.22214/ijraset.2023.57662
Image classification is an active research topic and an important direction in image processing research. It is a supervised learning task in which images are assigned to predefined classes. This paper analyses four common image classification algorithms: convolution neural network, support vector machine, artificial neural network and logistic regression. Both theoretical and empirical approaches were followed. For the theoretical approach, secondary data was reviewed alongside results obtained by applying the algorithms with software tools. Secondary data was acquired from research articles, textbooks, journals, technical reports, published theses, websites, e-journals, software tool manuals, conference proceedings and other research articles published in the related domain. The empirical study consisted of a set of experiments carried out using software tools, and the results of these experiments were analysed to produce the findings of the research. The paper compares the results of the four algorithms when tested on the same dataset, in the same environment and on the same system. The paper shows that the results obtained from the theoretical analysis agree with those obtained from the experiments. The study found that the best results were given by the convolution neural network, followed by the support vector machine, the artificial neural network and, last, logistic regression.
I. INTRODUCTION
Recently, machine learning (ML) has become very widespread in research and has been incorporated in a variety of applications, including text mining, spam detection, video recommendation, image classification, and multimedia concept retrieval.[1] Image classification is a big part of machine learning.[2] Image classification involves assigning a label or tag to an entire image based on pre-existing training data of already labelled images. While the process may appear simple at first glance, it actually entails pixel-level image analysis to determine the most appropriate label for the overall image. This provides valuable data and insights, enabling informed decisions and actionable outcomes. However, data labelling must be completed accurately in the training phase to avoid discrepancies in the data. To facilitate accurate data labelling, publicly available datasets are often used in the model training phase.[3] There are many image classification algorithms; the more common ones come from traditional machine learning and deep learning. Different models perform differently on different problems.[4] In this paper, four image classification algorithms are compared on accuracy. The four algorithms used for comparison are convolution neural network, support vector machine, artificial neural network and logistic regression.
II. TAXONOMY
The four image classification algorithms are discussed briefly below:
A. Logistic Regression (LR)
Logistic regression is a supervised machine learning algorithm that accomplishes binary classification tasks by predicting the probability of an outcome, event, or observation. The model delivers a binary or dichotomous outcome limited to two possible outcomes: yes/no, 0/1, or true/false.[5] The logistic function was invented in the 19th century by Pierre François Verhulst, a Belgian mathematician, to describe the growth of human populations and the course of autocatalytic chemical reactions.[6] Logistic regression is an extension of linear regression. When the outcome variable is continuous, simple linear regression is usually used. For example, to see how body mass index (BMI) predicts blood cholesterol level (continuous), we would use simple linear regression. But in many cases the outcome variable is categorical in nature. For example, if we want to see how smoking habit predicts the odds of having cardiovascular disease, then the outcome variable is categorical and has two categories (have cardiovascular disease: yes/no). In such cases it is not ideal to use simple linear regression because, when the outcome variable is not continuous, the assumptions of linear regression, i.e., linearity, normality and continuity, are violated. To deal with categorical outcome variables, logistic regression was introduced as an alternative to simple linear regression.[7]
The biggest difference from linear regression is that in logistic regression the data points do not lie along a straight line.[8] Unlike linear regression, in logistic regression we fit an "S"-shaped logistic function instead of a regression line. Linear regression solves regression problems, while logistic regression solves classification problems. Logistic regression can be of different types, such as binomial (binary), multinomial, or ordinal, depending on the nature of the outcome variable. When the outcome variable can have only two categories (e.g., disease present vs. disease absent, dead vs. alive), binomial or binary logistic regression is used. If the outcome variable has more than two categories that are not ordered (e.g., drug A, drug B, and drug C), multinomial logistic regression is used, and if the categories of the outcome variable are ordered (e.g., poor, fair, good, very good, excellent), ordinal logistic regression is used.[7]
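As an illustration of the S-shaped logistic function and the binary decision described above, a minimal Python sketch is given here. The weight and bias values are hypothetical numbers chosen for illustration, not fitted parameters from the experiments, and the 0.5 decision threshold is an assumed default.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    # Two branches keep the computation numerically stable for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Binary decision: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical parameters for a single feature (illustration only).
weights, bias = [1.5], -3.0
for x in (0.0, 2.0, 4.0):
    p = predict_proba([x], weights, bias)
    print(f"x={x}: probability={p:.3f}, label={predict_label([x], weights, bias)}")
```

The linear part of the model is the same weighted sum used in linear regression; the sigmoid turns that sum into a probability, which is then thresholded to obtain a class.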
B. Artificial Neural Network (ANN)
The study of the human brain is thousands of years old. With the advent of modern electronics, it was only natural to try to harness this thinking process. The first step toward artificial neural networks came in 1943 when Warren McCulloch, a neurophysiologist, and a young mathematician, Walter Pitts, wrote a paper on how neurons might work. They modelled a simple neural network with electrical circuits.[9] An Artificial Neural Network (ANN) is a computational model inspired by the human brain’s neural structure. It consists of interconnected nodes (neurons) organized into layers. Information flows through these nodes, and the network adjusts the connection strengths (weights) during training to learn from data, enabling it to recognize patterns, make predictions, and solve various tasks in machine learning and artificial intelligence.[10] ANNs have strong pattern-recognition abilities, which are needed for decision-making, and are robust classifiers able to generalize and make decisions from large and somewhat fuzzy input data.[11] Much of today's machine learning is based on artificial neural networks. Because of the recent rise in computing power, these neural networks have become tremendously popular, and they are presently found practically everywhere.[12] Areas that use artificial neural networks include medical diagnosis and health care, facial recognition, analysis of the behaviour of social media users, stock market forecasting, weather forecasting, and robotics and dynamics.
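The two ideas in this description, information flowing through layers of neurons and weights being adjusted during training, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the 2-3-1 network uses hand-picked, hypothetical weights, and the training loop fits only a single sigmoid neuron by gradient descent on the logical OR function. It is not the network used in the experiments.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every neuron takes a weighted sum of all inputs."""
    return [activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

def train_neuron(samples, epochs=2000, lr=0.5):
    """Fit one sigmoid neuron by gradient descent on the cross-entropy loss."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            error = out - target  # loss gradient with respect to the weighted sum
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
            b -= lr * error
    return w, b

# Forward pass: 2 inputs -> 3 hidden neurons -> 1 output (hypothetical weights).
hidden = dense([1.0, 0.5], [[0.2, -0.4], [0.7, 0.1], [-0.5, 0.9]], [0.0, 0.1, -0.2], relu)
output = dense(hidden, [[0.3, -0.6, 0.8]], [0.05], sigmoid)

# Learning: the weights of one neuron adapt until it reproduces logical OR.
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train_neuron(or_data)
predictions = [round(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)) for x, _ in or_data]
print(predictions)
```

A full ANN repeats the same weight update for every neuron in every layer, with the error signal propagated backwards from the output layer.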
C. Support Vector Machines (SVM)
The support vector machine was first proposed by Vapnik in 1995.[13] Support vector machines were developed in the framework of Statistical Learning Theory.[14] Traditional statistics studies the situation in which the number of samples tends to infinity. However, the number of samples available in practice is usually limited. Unlike traditional statistics, Statistical Learning Theory (SLT) specializes in studying the laws of machine learning in the case of small samples. SLT provides a new framework for dealing with the general learning problem.[15] The support vector machine is a small-sample learning method. Because it is based on the principle of structural risk minimization rather than the traditional principle of empirical risk minimization, it is superior to existing methods on many performance measures.[16] Support vector machines, being computationally powerful tools for supervised learning, are widely used in classification, clustering and regression problems. SVMs have been successfully applied to a variety of real-world problems such as particle identification, face recognition, text categorization, bioinformatics, civil engineering and electrical engineering.[17]
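The contrast between structural and empirical risk minimization can be made concrete with a linear soft-margin SVM, whose objective adds a regularization term (which controls the margin) to the average hinge loss on the training samples. The sketch below is an illustration only: it uses plain subgradient descent rather than the solver of any particular library, and the data points, the regularization strength `lam` and the learning rate are hypothetical choices.

```python
def decision(w, b, x):
    """Signed score of sample x; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def objective(w, b, data, lam):
    """Regularization term plus average hinge loss over the training data."""
    reg = 0.5 * lam * sum(wi * wi for wi in w)
    hinge = sum(max(0.0, 1.0 - y * decision(w, b, x)) for x, y in data) / len(data)
    return reg + hinge

def train_linear_svm(data, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the regularized hinge loss (labels must be +1 or -1)."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            violated = y * decision(w, b, x) < 1.0   # inside the margin or misclassified
            w = [wi - lr * lam * wi for wi in w]      # the regularizer always shrinks w
            if violated:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Small, linearly separable toy set (hypothetical points, labels +1 / -1).
toy_data = [([2, 2], 1), ([3, 3], 1), ([2, 3], 1),
            ([-1, -1], -1), ([-2, -1], -1), ([-1, -2], -1)]
w, b = train_linear_svm(toy_data)
print(w, b, objective(w, b, toy_data, lam=0.01))
```

Only samples that violate the margin change the weights, which is why the final decision boundary depends on a small number of samples, the support vectors.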
D. Convolution Neural Network (CNN)
In the field of deep learning, the CNN is the most famous and commonly employed algorithm.[1] LeNet, the first Convolutional Neural Network (CNN), was proposed by Yann LeCun in 1998. It was used mainly to recognize digits and handwritten numbers on bank checks.[18] Convolutional Neural Networks (CNNs) are analogous to traditional ANNs in that they are composed of neurons that self-optimise through learning.[19] The CNN is largely responsible for the current popularity of deep learning. Its primary benefit over its predecessors is that it identifies the relevant features automatically, without human supervision, which has made it the most widely used. Convolutional neural networks automatically learn a hierarchy of features that can then be used for classification, as opposed to relying on manually created features. To achieve this, a hierarchy of feature maps is constructed by iteratively convolving the input image with learned filters. Because of this hierarchical method, higher layers can learn more intricate features that are also distortion and translation invariant.[20] Generally, a CNN consists of three main types of neural layers: convolutional layers, pooling layers, and fully connected layers. These different kinds of layers play different roles.[18] The first two, convolution and pooling layers, perform feature extraction, whereas the third, a fully connected layer, maps the extracted features into the final output, such as a classification.[21]
1. Feature Extraction
a. Input: The input to a convolutional neural network is an image.[20]
b. Convolution: The convolution layer applies a set of learnable filters (also known as kernels) to the input data. These filters are small grids that slide over the input image to perform element-wise multiplications and additions. Each filter extracts specific features from the input data, such as edges, textures, or more complex patterns. Multiple filters are used to capture different features. The output of this layer is called feature maps.[22]
c. Pooling: Pooling is the step that condenses the feature maps output by a convolutional layer. It reduces their dimensionality by keeping the key features of each local region and combining the values in that region into a single one.[23]
2. Classification
a. Fully Connected Layer: The fully connected layer is where image classification happens in the CNN based on the features extracted in the previous layers.[24] A fully connected layer is employed to predict the most suitable label for the given image. This involves flattening the output from the previous layers. The feature maps are typically flattened into a one-dimensional vector. This is done to match the dimensionality between the convolutional/pooling layers and the fully connected layers.[22] Here, fully connected means that all the inputs or nodes from one layer are connected to every activation unit or node of the next layer.[24]
b. Output: Finally, the class label predicted for the image from the extracted features is given as output.[23]
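The layer sequence described above (convolution, pooling, flattening, fully connected layer, output) can be traced on a toy example. The following pure-Python sketch is an illustration under stated assumptions, not the TensorFlow model used in the experiments: the 6x6 image, the hand-written vertical-edge filter and the output-layer weights are all hypothetical, whereas a real CNN learns its filters and weights from data.

```python
import math

def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding), summing element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu_map(fmap):
    """Set negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [[max(fmap[i + di][j + dj] for di in range(size) for dj in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def flatten(fmap):
    """Turn a 2-D feature map into a one-dimensional vector."""
    return [v for row in fmap for v in row]

def fully_connected(vector, weights, biases):
    """Every input value is connected to every output node."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into class probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 6x6 image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]  # hypothetical hand-written filter

feature_map = relu_map(conv2d(image, vertical_edge))  # 4x4 feature map
pooled = max_pool(feature_map)                        # 2x2 after pooling
features = flatten(pooled)                            # vector of length 4
probs = softmax(fully_connected(features, [[0.5] * 4, [-0.5] * 4], [0.0, 0.0]))
print(feature_map, pooled, probs)
```

The filter responds only where the window straddles the edge, pooling keeps that response while quartering the map, and the fully connected layer turns the flattened features into class probabilities.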
III. ENVIRONMENT
The experiment was written in Python in a Jupyter notebook. Jupyter Notebook is an application available in Anaconda Navigator, which acts as a graphical user interface. The experiment was conducted on a 64-bit Windows 10 system with an Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz processor and 8.00 GB of RAM. The dataset was taken from https://www.kaggle.com/ on 1st December 2023 at 05:00 PM. The dataset used contains binary images only. The size of the dataset is 62 MB and it comprises 48x48 pixel grayscale images of faces. The faces have been automatically registered so that the face is more or less centred and occupies about the same amount of space in each image. There are seven categories of emotions in the dataset (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral). The training set consists of 28,709 examples and the public test set consists of 3,589 examples.[25] TensorFlow is required for implementing the neural networks. TensorFlow is a Python library for high-performance numerical computation that allows users to create sophisticated deep learning and machine learning applications.[26]
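The paper does not describe how the images were prepared for the classifiers, so the following Python sketch is only one plausible preparation step, stated as an assumption: each 48x48 grayscale image (pixel values assumed to lie in 0 to 255) is flattened into a vector of 2,304 values scaled to [0, 1], and each integer label is turned into a one-hot vector over the seven emotion categories. The function names are hypothetical.

```python
# Label mapping as listed in the dataset description above.
EMOTIONS = {0: "Angry", 1: "Disgust", 2: "Fear", 3: "Happy",
            4: "Sad", 5: "Surprise", 6: "Neutral"}
IMAGE_SIZE = 48

def to_feature_vector(image):
    """Flatten a 48x48 grayscale image and scale pixels from [0, 255] to [0, 1]."""
    if len(image) != IMAGE_SIZE or any(len(row) != IMAGE_SIZE for row in image):
        raise ValueError("expected a 48x48 image")
    return [pixel / 255.0 for row in image for pixel in row]

def one_hot(label, num_classes=len(EMOTIONS)):
    """Encode an integer class label as a vector with a single 1.0."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

# Hypothetical all-white image, used only to show the resulting shapes.
blank = [[255] * IMAGE_SIZE for _ in range(IMAGE_SIZE)]
vector = to_feature_vector(blank)
print(len(vector), EMOTIONS[3], one_hot(3))
```

A flattened vector of this kind suits logistic regression, the support vector machine and the artificial neural network, while a CNN would normally keep the two-dimensional 48x48 layout.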
IV. RESULTS AND ANALYSIS
The experiment was carried out to compare the accuracy of four image classification algorithms, namely convolution neural network, support vector machine, artificial neural network and logistic regression. All four algorithms were tested on the same dataset, in the same environment and on the same system.
A. Dataset
The dataset consists of seven categories of emotions (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral). The training set consists of 28,709 examples and the public test set consists of 3,589 examples. The bar graph shown in Figure 1 displays the distribution of emotion classes in the training and testing sets.
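A class distribution of the kind plotted in the bar graph can be computed directly from the integer labels. The sketch below is illustrative only: the label list is made up and does not reflect the real class counts of the dataset, and the text-based bar chart stands in for a plotted figure.

```python
from collections import Counter

EMOTIONS = {0: "Angry", 1: "Disgust", 2: "Fear", 3: "Happy",
            4: "Sad", 5: "Surprise", 6: "Neutral"}

def class_distribution(labels):
    """Return {emotion name: (count, share of the set)} for integer labels 0..6."""
    counts = Counter(labels)
    total = len(labels)
    return {name: (counts[idx], counts[idx] / total) for idx, name in EMOTIONS.items()}

def text_bar_chart(distribution, width=40):
    """Render the distribution as one text bar per class."""
    largest = max(count for count, _ in distribution.values()) or 1
    lines = []
    for name, (count, share) in distribution.items():
        bar = "#" * round(width * count / largest)
        lines.append(f"{name:<9}{count:>6} {share:6.1%} {bar}")
    return "\n".join(lines)

# Made-up labels purely for illustration; not the dataset's real counts.
toy_labels = [3, 3, 3, 6, 6, 4, 0, 2, 5, 3]
print(text_bar_chart(class_distribution(toy_labels)))
```

Checking this distribution before training is useful because a strongly unbalanced class can make overall accuracy look better than the per-class results.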
V. CONCLUSION
This paper gives a brief account of image classification and its role as a major part of machine learning. It focuses on four image classification algorithms and briefly describes each of them: convolution neural network, support vector machine, artificial neural network and logistic regression. The results given by these algorithms when implemented in the same environment and on the same dataset were compared and analysed. The results are expressed using confusion matrices, classification reports and graphs. On the basis of these results, it is concluded that the convolution neural network yields the best results, with an accuracy of 64.25%. The second-best results are given by the support vector machine (52.42%), followed by the artificial neural network (35.11%) and logistic regression (32.22%).
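For readers unfamiliar with the evaluation measures named above, the following Python sketch shows how a confusion matrix, the overall accuracy and the per-class precision and recall of a classification report are derived from true and predicted labels. The label lists are hypothetical three-class examples, not outputs of the experiments.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def precision_recall(matrix, cls):
    """Precision and recall for one class, as listed in a classification report."""
    tp = matrix[cls][cls]
    predicted = sum(row[cls] for row in matrix)  # column sum: predicted as cls
    actual = sum(matrix[cls])                    # row sum: truly cls
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

# Hypothetical labels for a three-class problem (illustration only).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm, accuracy(cm), precision_recall(cm, 1))
```

Accuracy summarizes the whole matrix in one number, whereas precision and recall show which individual classes a model confuses.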
[1] L. Alzubaidi et al., “Review of deep learning: concepts, CNN architectures, challenges, applications, future directions,” J Big Data, vol. 8, no. 1, Dec. 2021, doi: 10.1186/s40537-021-00444-8.
[2] R. S. Chugh, V. Bhatia, K. Khanna, and V. Bhatia, “A comparative analysis of classifiers for image classification,” in Proceedings of the Confluence 2020 - 10th International Conference on Cloud Computing, Data Science and Engineering, Institute of Electrical and Electronics Engineers Inc., Jan. 2020, pp. 248–253. doi: 10.1109/Confluence47617.2020.9058042.
[3] “What is image classification? Basics you need to know | SuperAnnotate.” Accessed: Dec. 10, 2023. [Online]. Available: https://www.superannotate.com/blog/image-classification-basics
[4] P. Wang, E. Fan, and P. Wang, “Comparative analysis of image classification algorithms based on traditional machine learning and deep learning,” Pattern Recognit Lett, vol. 141, pp. 61–67, Jan. 2021, doi: 10.1016/j.patrec.2020.07.042.
[5] “Logistic Regression: Equation, Assumptions, Types, and Best Practices.” Accessed: Dec. 06, 2023. [Online]. Available: https://www.spiceworks.com/tech/artificial-intelligence/articles/what-is-logistic-regression/
[6] E. Y. Boateng and D. A. Abaye, “A Review of the Logistic Regression Model with Emphasis on Medical Research,” Journal of Data Analysis and Information Processing, vol. 7, no. 4, pp. 190–207, Sep. 2019, doi: 10.4236/JDAIP.2019.74012.
[7] T. Abedin, M. Ziaul, I. Chowdhury, A. Afzal, F. Yeasmin, and T. C. Turin, “Application of Binary Logistic Regression in Clinical Research.” [Online]. Available: https://www.researchgate.net/publication/320432727
[8] Dalian jiao tong da xue and Institute of Electrical and Electronics Engineers, Proceedings of IEEE 7th International Conference on Computer Science and Network Technology: ICCSNT 2019: October 19-21, 2019, Dalian, China.
[9] S. B. Maind and P. Wankar, “Research Paper on Basic of Artificial Neural Network,” International Journal on Recent and Innovation Trends in Computing and Communication. [Online]. Available: http://www.ijritcc.org
[10] “Introduction to Artificial Neural Networks - Analytics Vidhya.” Accessed: Dec. 06, 2023. [Online]. Available: https://www.analyticsvidhya.com/blog/2021/09/introduction-to-artificial-neural-networks/
[11] E. Grossi and M. Buscema, “Introduction to artificial neural networks,” European Journal of Gastroenterology and Hepatology, vol. 19, no. 12, pp. 1046–1054, Dec. 2007, doi: 10.1097/MEG.0b013e3282f198a0.
[12] A. Goel, A. K. Goel, and A. Kumar, “The role of artificial neural network and machine learning in utilizing spatial information,” Spatial Information Research, vol. 31, no. 3, Springer Science and Business Media B.V., pp. 275–285, Jun. 01, 2023, doi: 10.1007/s41324-022-00494-x.
[13] H. Wang, J. Xiong, Z. Yao, M. Lin, and J. Ren, “Research Survey on Support Vector Machine,” vol. 9, 2017, doi: 10.475/123.
[14] T. Evgeniou and M. Pontil, “Support vector machines: Theory and applications,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer Verlag, 2001, pp. 249–257. doi: 10.1007/3-540-44673-7_12.
[15] Z. Jun, “The Development and Application of Support Vector Machine,” in Journal of Physics: Conference Series, IOP Publishing Ltd, Jan. 2021. doi: 10.1088/1742-6596/1748/5/052006.
[16] C. Liu, L. Wang, A. Yang, and Y. Zhang, “Support Vector Machine Classification Algorithm and Its Application,” 2012.
[17] J. Nayak, B. Naik, and H. S. Behera, “A Comprehensive Survey on Support Vector Machine in Data Mining Tasks: Applications & Challenges,” International Journal of Database Theory and Application, vol. 8, no. 1, pp. 169–186, Feb. 2015, doi: 10.14257/ijdta.2015.8.1.18.
[18] A. A. Komlavi, K. Chaibou, and H. Naroua, “Comparative study of machine learning algorithms for face recognition.” [Online]. Available: https://inria.hal.science/hal-03620410v
[19] K. O’Shea and R. Nash, “An Introduction to Convolutional Neural Networks,” Nov. 2015, [Online]. Available: http://arxiv.org/abs/1511.08458
[20] M. M. Taye, “Theoretical Understanding of Convolutional Neural Network: Concepts, Architectures, Applications, Future Directions,” Computation, vol. 11, no. 3, MDPI, Mar. 01, 2023, doi: 10.3390/computation11030052.
[21] R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi, “Convolutional neural networks: an overview and application in radiology,” Insights into Imaging, vol. 9, no. 4, Springer Verlag, pp. 611–629, Aug. 01, 2018, doi: 10.1007/s13244-018-0639-9.
[22] “What is CNN? Explain the Different Layers of CNN.” Accessed: Dec. 13, 2023. [Online]. Available: https://www.theiotacademy.co/blog/layers-of-cnn/
[23] B. Koodalsamy, M. B. Veerayan, and V. Narayanasamy, “Face Recognition using Deep Learning,” in E3S Web of Conferences, EDP Sciences, May 2023. doi: 10.1051/e3sconf/202338705001.
[24] “What are Convolutional Neural Networks? | Definition from TechTarget.” Accessed: Dec. 13, 2023. [Online]. Available: https://www.techtarget.com/searchenterpriseai/definition/convolutional-neural-network
[25] “Find Open Datasets and Machine Learning Projects | Kaggle.” Accessed: Dec. 11, 2023. [Online]. Available: https://www.kaggle.com/datasets
[26] “Anaconda | TensorFlow in Anaconda.” Accessed: Dec. 11, 2023. [Online]. Available: https://www.anaconda.com/blog/tensorflow-in-anaconda
[27] P. Singh, N. Singh, K. K. Singh, and A. Singh, “Diagnosing of disease using machine learning,” Machine Learning and the Internet of Medical Things in Healthcare, pp. 89–111, Jan. 2021, doi: 10.1016/B978-0-12-821229-5.00003-3.
[28] “Classification Report — Yellowbrick v1.5 documentation.” Accessed: Dec. 15, 2023. [Online]. Available: https://www.scikit-yb.org/en/latest/api/classifier/classification_report.html
Copyright © 2023 Sneha Verma, Mr. Aman Kumar Sharma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET57662
Publish Date : 2023-12-20
ISSN : 2321-9653
Publisher Name : IJRASET