Food Recognition Using Extreme Learning Machines

Authors: Sanjeev T K, Merin Meleet

DOI Link: https://doi.org/10.22214/ijraset.2022.47302

Abstract

In open-ended continual learning, new images of existing classes keep arriving and new classes keep appearing. Because pre-trained deep learning networks generalize well, transfer learning was used to identify the most effective network for extracting features from food photos. During transfer learning, online data augmentation compensates for the scarcity of images taken from other orientations. Experiments show that online data augmentation greatly improves the model's ability to classify food photos taken from various orientations. Second, this study reduces the dimensionality of the extracted features by ranking them with the Relief F technique, since redundant features make the model's computations more expensive. The best epoch reaches a training accuracy of 98 percent and a validation accuracy of 92 percent.

I. INTRODUCTION

In the training stage of the prototype system, a machine-learning model built from convolutional layers classifies food into several categories. Machine learning techniques are used in every aspect of our lives, and object recognition through image processing is one of them. With the advent of feature-rich mobile devices and cloud services, building a modern computer-based system for reliable food recognition is now feasible. Detecting food in images is made harder by the wide range of foods, which show low inter-class and high intra-class variation, and by the limited information available in a single image. To improve the discriminative power of features generated by different deep models, earlier work has suggested fusing multiple trained classifiers.

In open-ended continual learning, new images of existing classes and entirely new classes appear frequently. Data incremental learning improves accuracy on the current classes by using newly available images, and it adapts to domain changes. Class incremental learning, on the other hand, continually acquires knowledge of new classes. Food photos also show a notable amount of intra-class variation and inter-class similarity. Because the vocabulary of food is extensive, the food database places no predefined limit on new labels. Although open-ended learning has drawbacks that must be taken into account, it can address the need for food recognition in practical settings. Because the database keeps growing and new concepts emerge over time, open-ended learning is advantageous in a variety of real-world recognition and classification tasks, including food recognition.

Convolutional neural networks for food image classification have achieved state-of-the-art performance on various food datasets. Because these methods are built on predefined datasets, however, the gap between the lab and the real world has grown. In practice, most image databases are dynamic and open-ended. With static food datasets, a model cannot easily be trained on additional samples of existing food classes or on new food classes without degrading its earlier performance. These two problems make open-ended continual learning highly challenging for neural networks. Two hypotheses have been proposed for how people learn incrementally without catastrophic forgetting. According to the first, neurons that hold previously learned knowledge become less adaptable, and this decreased neuroplasticity helps retain the earlier information. The second proposes that, while maintaining episodic memories, people extract high-level information and store it in distinct brain regions. Based on these two hypotheses, researchers have created deep learning systems that learn incrementally.

Automated identification of food in photographs has many interesting applications, such as nutritional monitoring in medical cohorts. For almost all of human history, obtaining enough food was the primary concern, and the main purpose of eating was to supply adequate energy. Today the primary public health issues are the prevention of excessive calorie intake and the nutritional composition of diets. Because food comes in an enormous variety, recognizing food items from their images is typically very difficult. EfficientNetB2 is a cutting-edge deep learning model that has recently been shown to be a very strong image recognition technique, and it achieves much higher accuracy for food image identification than a traditional approach. The convolution kernels show that the extracted features are dominated by color. Because deep learning features generalize well, transfer learning is advantageous.

II. RELATED WORK

Many researchers have worked on similar topics. Ghalib Ahmed Tahir et al. [1] observed that modern deep learning methods for food recognition generally suffer severe catastrophic interference during class incremental learning and do not support data incremental learning. This is a crucial problem in food recognition because real food datasets are open-ended and dynamic, with constant growth in both food samples and food categories. Model retraining is frequently used to cope with the changing data, but it takes a lot of time and expensive computing resources. To construct a new open-ended continual learning framework, the authors use Relief F for feature selection together with a novel adaptive reduced class-incremental kernel extreme learning machine for classification.
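For orientation, the basic (non-incremental, non-kernel) extreme learning machine that such class-incremental variants build on can be sketched as follows. This is a minimal illustration and not the method of [1]: the hidden-layer size, the tanh activation, the ridge term, and the toy data are all assumptions made for the example.

```python
import numpy as np

def train_elm(X, y, n_classes, n_hidden=64, ridge=1e-3, seed=0):
    """Fit a basic ELM: random fixed hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    T = np.eye(n_classes)[y]                     # one-hot targets
    # Regularized least squares: beta = (H^T H + ridge * I)^-1 H^T T
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(model, X):
    """Predict class indices with a trained ELM."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy data (assumption): two separated Gaussian blobs standing in for extracted features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 4)),
               rng.normal(2.0, 0.5, size=(50, 4))])
y = np.array([0] * 50 + [1] * 50)
model = train_elm(X, y, n_classes=2)
train_acc = float(np.mean(predict_elm(model, X) == y))
```

Only the output weights are solved for, in a single linear step, which is why ELM-style classifiers are attractive when the classifier has to be updated often.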

Chengpeng Chen et al. [2] address intelligent technology, an important field of research. Hand-held object recognition is a specialized but significant instance of object recognition, and it plays an important part in intelligent systems through applications such as visual reasoning and question answering. Datasets in real-world settings are dynamic and open-ended, with an ongoing influx of new object classes and object samples. To learn the new information effectively, the authors propose a hybrid incremental learning method that supports both data-incremental and class-incremental learning. As a consequence, their system can simultaneously extend the prior model to recognize unfamiliar objects and improve the recognition accuracy of known concepts by lowering the prediction error.

Jackson Kamiri et al. [3] surveyed the most popular tools for building, training, and testing models, namely the Python programming language and its associated libraries, as well as the most popular techniques for handling classification and prediction problems. The choice of research methodology is crucial in machine learning because it affects the reliability and accuracy of the results. To maximize the effectiveness of machine learning algorithms, researchers are turning to optimal feature selection. The primary methods for assessing the effectiveness of algorithms are still the multilayer perceptron and its variations. Machine learning is used to address problems in society; it should not exist in a vacuum.

Jiangpeng He et al. [4] note that classifying food images is difficult in real-world applications because current approaches need static datasets for learning and cannot learn sequentially from newly available food photographs. Online continual learning aims to learn new classes from a stream of data, using each new sample only once, without forgetting what has already been learned. Food classification is the first and most important step in image-based dietary assessment, which promises to offer valuable information for the prevention of numerous chronic illnesses. The ideal food classification system would update continuously using all newly recorded food images without losing track of previously learned food classes. Achieving this aim would greatly benefit the deployment of such an automated system for nutritional assessment and monitoring.

Pengcheng Duan et al. [5] observe that food image recognition is becoming increasingly important for e-health applications. Given the variety of foods and the effect that color, lighting, and viewing angle have on how food appears, however, this is a difficult problem. Because serving numerous concurrent recognition requests is computationally demanding, the authors also suggest using the pervasive cloud computing paradigm to increase the performance of food image recognition. Evaluations show that, compared with the conventional client-server approach, the proposed technique provides adequate recognition accuracy, and Hadoop programming can offer a potential performance benefit.

A. Suzuki et al. [6] applied a deep neural network to the detection and recognition of food photos. Because food products are so diverse, image recognition of food items is often quite difficult. Deep learning has recently been shown to be a highly effective way of recognizing images, and the CNN is a state-of-the-art deep learning approach. The authors applied a CNN, with parameter optimization, to the tasks of food detection and recognition. To assess recognition performance, they created a dataset of the most popular food items in a publicly accessible food-logging system. Compared with previous support-vector-machine-based approaches using handcrafted features, the CNN demonstrated much greater accuracy. They also found that the convolution kernels show that the extracted features are dominated by color.

III. SYSTEM DESIGN AND ARCHITECTURE

A system architecture is a model that identifies a system's structure, components, behavior, and other aspects. It represents the system and the components that work together to realize the overall framework. Data is unprocessed information; here it is gathered from several sources and assembled into a dataset, which is used for training and validation. The dataset for the model is taken from a medical dataset from the life science database archive. The architecture involves implementing and training the classifier so that it accurately predicts test scores.

The above figure shows the relationships and interactions between the components of the system. The predictive model is trained with the dataset obtained from various sources such as food archives. The models predict the test score for the given dataset, and the system finally shows the scores of the various classifier models used in this project. Both classification and regression models belong to supervised learning; classification is applied when the outcome takes a finite set of values. The model can therefore be evaluated against human validation results.

IV. METHODOLOGY

During the training phase, the dataset is also preprocessed to eliminate outliers and select the key features for training the model, and weights and parameters are applied to the model to produce the standard evaluation metrics. The dataset needed for the model comes from the food image database for machine learning.

The following phases are included:

1. Phase 1: Data Collection

Data collection is the process of acquiring and evaluating information with the available software. The data used to train the models is very important. Data collection, which includes gathering and categorizing organized quantitative data, is required before any analysis. The data was gathered on a group of individuals who all had the same condition. A model's outputs are only as good as the data they are based on, so this is an important stage in building high-performance models.

2. Phase 2: Data Loading

Training image data commonly contains faults or garbage values. These errors can be removed by checking whether the data has any missing values and whether each value falls within the required range. If a variable contains many missing values, it must be removed. Cleaning the image data will not increase the model's accuracy, but model selection will at least mitigate any detrimental consequences.
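The two checks described above (missing values and value range) can be sketched as a simple filter. This is an illustrative sketch under stated assumptions: images are taken to be NumPy arrays with pixel values expected in [0, 255], the function names are hypothetical, and the class labels other than "burger" are placeholders.

```python
import numpy as np

def is_valid_image(img, low=0, high=255):
    """Return True if the array has no missing values and all pixels lie in [low, high]."""
    arr = np.asarray(img, dtype=float)
    if arr.size == 0 or np.isnan(arr).any():
        return False
    return bool(arr.min() >= low and arr.max() <= high)

def clean_dataset(images, labels):
    """Keep only (image, label) pairs whose image passes the validity check."""
    kept = [(img, lab) for img, lab in zip(images, labels) if is_valid_image(img)]
    return [img for img, _ in kept], [lab for _, lab in kept]

# Three tiny 2x2 RGB arrays: one valid, one with a missing value, one out of range.
good = np.full((2, 2, 3), 128.0)
with_nan = good.copy()
with_nan[0, 0, 0] = np.nan
out_of_range = good.copy()
out_of_range[1, 1, 2] = 300.0

images, labels = clean_dataset(
    [good, with_nan, out_of_range],
    ["burger", "pizza", "salad"],  # placeholder labels
)
```

Here only the first image survives the filter; in practice the valid range depends on how the images were decoded and scaled.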

3. Phase 3: Training

This phase examines the results of classification predictive modeling techniques. Classification accuracy is a popular metric for assessing a model's performance based on predicted class labels. Although classification accuracy is not ideal, it is an excellent place to start for many classification tasks. There is no solid theory for mapping algorithms onto problem types; instead, practitioners are advised to run controlled experiments to determine which algorithm and configuration performs best for a specific classification task. The training dataset is used to find the best way to map samples of input data to predetermined class labels. The training set of food images should be representative of the problem and contain several examples of each class.

4. Phase 4: Fine Tuning

Because the number of parameters in this machine learning method can range from hundreds to millions, training it from scratch on a small food image dataset will only result in overfitting. It is therefore usually preferable to start with a pre-trained model and then fine-tune it on a comparatively small portion of the food image dataset that is relevant to the domain.

5. Phase 5: Validation

Model validation is usually done after a model has been trained. It determines whether the model is effective on the provided dataset by evaluating the trained model on a test dataset. It also analyzes the outputs of the model under assessment.
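As a rough illustration of this step, the sketch below holds out part of the data and scores predictions on it. The 80/20 split, the function names, and the toy labels are assumptions for the example; the source does not state the split that was actually used.

```python
import numpy as np

def holdout_split(n_samples, val_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into training and validation index arrays."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_val = int(round(n_samples * val_fraction))
    return idx[n_val:], idx[:n_val]

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Split ten samples, then score four toy predictions (three of four correct).
train_idx, val_idx = holdout_split(10, val_fraction=0.2)
val_acc = accuracy([0, 1, 2, 2], [0, 1, 1, 2])
```

Keeping the validation indices disjoint from the training indices is what makes the reported validation accuracy an estimate of performance on unseen images.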

V. RESULTS

In this last stage, the classifiers are tested on the prepared image dataset and the model's performance is evaluated. Using accuracy criteria, we evaluate the success of our developed model and compare it to existing methodologies.

An experiment is a systematic operation that is performed under predetermined conditions to test a hypothesis, explain a known effect, or uncover an undiscovered effect. Process analysis determines how input influences output and what the optimal input level should be to produce the intended outcome. Here the data is used to train the EfficientNetB2 model for several epochs. The best epoch achieves a training accuracy of 98 percent and a validation accuracy of 92 percent.
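One common way to pick a best epoch is to take the one with the highest validation accuracy; the source does not state its criterion, so that rule is an assumption here. In the sketch below, only the 0.98 and 0.92 figures come from the text; the other history values are placeholders and not results from the paper.

```python
def best_epoch(history):
    """Return the 1-based epoch with the highest validation accuracy (earliest on ties)."""
    best = max(range(len(history)), key=lambda i: history[i]["val_acc"])
    return best + 1, history[best]

# Per-epoch metrics; every row except the 0.98 / 0.92 one is a made-up placeholder.
history = [
    {"train_acc": 0.80, "val_acc": 0.75},
    {"train_acc": 0.93, "val_acc": 0.88},
    {"train_acc": 0.98, "val_acc": 0.92},
    {"train_acc": 0.99, "val_acc": 0.91},
]
epoch, metrics = best_epoch(history)
```

Selecting on validation rather than training accuracy avoids favoring a later epoch that has merely started to overfit.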

The output is the accuracy score of the predictions of the different trained models, which is given in the figure below. The same accuracy scores are plotted in a bar graph so that they are easy to visualize and compare across algorithms. The confusion matrix is important for computing metrics such as recall (sensitivity), accuracy, and precision. Here the class-by-class predicted and actual values are given for the trained model.
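To make the link between the confusion matrix and these metrics concrete, the following sketch builds the matrix from label pairs and derives per-class precision and recall plus overall accuracy. The three-class toy labels are assumptions for illustration, not data from the experiments.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Build a matrix whose rows are actual classes and columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_metrics(cm):
    """Return per-class precision, per-class recall (sensitivity), and overall accuracy."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # column sums: how often each class was predicted
    actual = cm.sum(axis=1)     # row sums: how often each class truly occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    accuracy = float(tp.sum() / cm.sum())
    return precision, recall, accuracy

# Toy labels for three classes (assumption).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
precision, recall, acc = per_class_metrics(cm)
```

Precision reads down a column (of everything predicted as a class, how much was right), while recall reads across a row (of everything that truly is a class, how much was found).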

Here the model is given one of the images and asked to recognize it. Based on its training, the system recognized the image as a burger and assigned it to the burger class. It also reports a probability of 97.54 percent, which indicates how confident the model is in the result.
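A class probability of this kind is typically obtained by applying a softmax to the network's raw class scores; the source does not say how its value was produced, so that is an assumption. The label set and scores below are hypothetical and are not meant to reproduce the 97.54 percent figure.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of class scores."""
    z = np.asarray(logits, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def top_prediction(logits, class_names):
    """Return the most probable class name and its probability in percent."""
    probs = softmax(logits)
    k = int(np.argmax(probs))
    return class_names[k], 100.0 * float(probs[k])

class_names = ["burger", "pizza", "salad"]  # hypothetical label set
logits = [5.0, 1.0, 0.5]                    # hypothetical raw scores
label, confidence = top_prediction(logits, class_names)
```

A high softmax probability signals that one class dominates the scores, although it is not a calibrated guarantee of correctness.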

Conclusion

The dataset for food recognition is dynamic and open-ended, and new food classes and samples keep appearing. Currently available deep learning techniques for food identification assume a fixed initial set of food classes and of variations within each class, and they suffer from catastrophic forgetting when learning classes incrementally. This work addresses these problems by outlining a novel, open-ended, continual learning paradigm for food identification. Instead of training from scratch, this work adopts transfer learning, and models that require retraining have not been investigated. Training the model from scratch did not serve the goal of the study: it needs large amounts of training time and powerful computing resources such as GPUs, which makes such feature extractors unsuitable for real-time scenarios. By combining transfer learning with incremental classifiers, which need less training time and produce strongly discriminative features, the goal of open-ended continual learning was attained. The features extracted by the deep learning model were not suitable for real-time use because of their high dimensionality. For this reason, the Relief F technique is used to rank the features, and the best features are selected according to the ranking.
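The ranking idea can be illustrated with basic Relief, a simplified relative of Relief F that uses a single nearest hit and nearest miss per sample instead of k neighbors per class. The sketch below is not the implementation used in this work; the function names, the Manhattan distance, and the synthetic data are assumptions.

```python
import numpy as np

def relief_scores(X, y):
    """Basic Relief: reward features that differ at the nearest miss, penalize at the nearest hit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero for constant features
    Xn = (X - X.min(axis=0)) / span            # scale every feature to [0, 1]
    n, d = Xn.shape
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(Xn - Xn[i]).sum(axis=1)  # Manhattan distance to every sample
        dist[i] = np.inf                       # exclude the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        w += np.abs(Xn[i] - Xn[miss]) - np.abs(Xn[i] - Xn[hit])
    return w / n

def select_top_k(X, y, k):
    """Return indices of the k highest-scoring features, best first."""
    return np.argsort(relief_scores(X, y))[::-1][:k]

# Synthetic data (assumption): column 1 tracks the label, the other columns are noise.
rng = np.random.default_rng(0)
y = np.array([0] * 30 + [1] * 30)
informative = y + rng.normal(0, 0.1, size=60)
noise = rng.normal(0, 1, size=(60, 3))
X = np.column_stack([noise[:, 0], informative, noise[:, 1], noise[:, 2]])
top = select_top_k(X, y, k=1)
```

Features that separate neighboring samples of different classes receive high scores, so keeping only the top-ranked columns shrinks the feature vector passed to the classifier.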

References

[1] G. A. Tahir and C. K. Loo, "An Open-Ended Continual Learning for Food Recognition Using Class Incremental Extreme Learning Machines," in IEEE Access, vol. 8, pp. 82328-82346, 2020, doi: 10.1109/ACCESS.2020.2991810.
[2] Chengpeng Chen, Weiqing Min, Xue Li, Shuqiang Jiang, "Hybrid incremental learning of new data and new classes for hand-held object recognition", Journal of Visual Communication and Image Representation, Volume 58, 2019.
[3] Kamiri, J., & Mariga, G. (2021). "Research Methods in Machine Learning: A Content Analysis," International Journal of Computer and Information Technology (2279-0764), 10(2). https://doi.org/10.24203/ijcit.v10i2.79.
[4] He, Jiangpeng, and Fengqing Zhu. "Online continual learning for visual food classification." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.
[5] P. Duan, W. Wang, W. Zhang, F. Gong, P. Zhang, and Y. Rao, "Food Image Recognition Using Pervasive Cloud Computing," 2013 IEEE International Conference on Green Computing and Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing, 2013, pp. 1631-1637, doi: 10.1109/GreenCom-iThings-CPSCom.2013.296.
[6] A. Suzuki, H. Akutsu, T. Naruko, K. Tsubota, and K. Aizawa, "Learned Image Compression with Super-Resolution Residual Modules and DISTS Optimization," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 1906-1910, doi: 10.1109/CVPRW53098.2021.00215.
[7] Fanyu Kong, Jindong Tan, "DietCam: Automatic dietary assessment with mobile camera phones", Pervasive and Mobile Computing, Volume 8, Issue 1, 2012.
[8] B. Deng, X. Zhang, W. Gong, and D. Shang, "An Overview of Extreme Learning Machine," 2019 4th International Conference on Control, Robotics and Cybernetics (CRC), 2019, pp. 189-195, doi: 10.1109/CRC.2019.00046.
[9] D. Keysers, T. Deselaers, C. Gollan and H. Ney, "Deformation Models for Image Recognition," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 8, pp. 1422-1435, Aug. 2007, doi: 10.1109/TPAMI.2007.1153.
[10] G. -B. Huang, H. Zhou, X. Ding, and R. Zhang, "Extreme Learning Machine for Regression and Multiclass Classification," in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 42, no. 2, pp. 513-529, April 2012, doi: 10.1109/TSMCB.2011.2168604.
[11] J. Xu, C. Jianjiang, X. Dingyu, and P. Feng, "Near infrared vein image acquisition system based on image quality assessment," 2011 International Conference on Electronics, Communications, and Control (ICECC), 2011, pp. 922-925, doi: 10.1109/ICECC.2011.6066618.
[12] K. Aizawa and M. Ogawa, "FoodLog: Multimedia Tool for Healthcare Applications," in IEEE MultiMedia, vol. 22, no. 2, pp. 4-8, Apr.-June 2015, doi: 10.1109/MMUL.2015.39.

Copyright

Copyright © 2022 Sanjeev T K, Merin Meleet. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET47302

Publish Date : 2022-11-04

ISSN : 2321-9653

Publisher Name : IJRASET
