Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Preeti Bailke, Krisha Patel, Aayushi Patel, Rohan More, Sudhanshu Pathrabe, Shreyash Patil
DOI Link: https://doi.org/10.22214/ijraset.2023.53481
In recent years, demand for automated food recognition systems has grown, driven by the large variety of dishes available, especially in Indian cuisine, and by increasing awareness of the importance of a healthy diet. Identifying multiple food items in an image is a challenging task, particularly for Indian cuisine, which is known for its diverse range of dishes and ingredients. The goal of this paper is to build a food dish predictor for five common Indian dishes using the state-of-the-art object detection algorithm YOLOv4. The five foods selected for this study are Aloo paratha, Biryani, Poha, Khichdi, and Chapati. A dataset was created by collecting images of the selected dishes from various platforms, such as social media, and organising them into classes; this dataset was then used to train the YOLOv4 model. To train the model to identify the dishes accurately, the dataset was manually labeled. The proposed YOLOv4-based food dish predictor could be put to use in a number of applications. In food delivery services, it can help ensure correct order fulfillment by recognising food plates in real time from customer-provided photos. The predictor's capacity to extract dish information automatically can support menu recognition software, saving restaurant personnel time and enabling rapid menu revisions. Meal recommendation engines can use the predictor to provide personalized meal choices based on user preferences and previous eating experiences. Overall, the YOLOv4-based food dish predictor can help multiple sectors of the food business improve productivity, consumer experience, and decision-making. The findings of this investigation show how well YOLOv4 can detect and identify food items, particularly Indian food items.
I. INTRODUCTION
India has a rich and diverse culinary heritage; with a vast variety of flavors and spices, Indian cuisine is both rich and eclectic. With the growth of meal delivery services and restaurant recommendation systems, there is increasing demand to automatically identify and recognise the food items present in a dish from an image of that dish, especially for Indian cuisine. Such systems can be integrated with several applications to enhance the user experience, such as restaurant recommendation systems, food ordering apps, and recipe websites. For restaurants and catering services, these systems can help with menu planning and cost estimation: by predicting the food items in a meal, chefs and caterers can better plan the ingredients and quantities required, leading to better cost estimation and less food waste. Since ingredients are identified from meals, inventory can be managed according to sales volume, and knowing the size of the inventory further aids cost estimation.
With the help of these systems, users can simply take a picture of a meal and receive information about all the dishes in it, along with their ingredients and nutritional information. By analyzing the food items in a meal and their nutritional information, these systems can provide personalized health and nutrition recommendations and can help people with dietary restrictions, allergies, or health conditions identify suitable meal options. These systems can also educate people about different Indian dishes and their origins, promoting cultural understanding and appreciation. The accuracy and effectiveness of such a system depend on the quality of the models and algorithms used. Systems that predict food dish names from an image can greatly enhance our interactions with food and make food-related tasks more efficient.
Indian food is usually served as a cluster of multiple items, which makes it even more difficult to recognise the name of each item accurately. Several object detection algorithms, such as YOLO, SSD, and R-CNN, have been extensively used to identify multiple objects in photos, including food dishes. These models can identify the location and type of different objects in an image by generating bounding boxes around them.
In this study, a food dish predictor is proposed that uses YOLOv4 to detect five well-known Indian dishes: Aloo paratha, Biryani, Poha, Khichdi, and Chapati. These dishes were chosen for their widespread appeal and variety within Indian cuisine. The proposed system trains a YOLOv4 model to reliably detect and recognise the five food items using a sizable dataset of annotated photos. The dataset was manually annotated to ensure the model is trained on high-quality data, which is essential for accurate predictions.
The YOLOv4-based food dish predictor presented in this research holds significant potential for various applications, such as automated restaurant menu recognition and meal delivery services. By leveraging the capabilities of YOLOv4, this predictor offers an efficient and accurate solution for streamlining food-related processes in diverse settings, ultimately enhancing efficiency and user experience in the domain of Indian food recognition.
II. LITERATURE REVIEW
III. METHODOLOGY
The project started with data preparation: converting the images, removing the unwanted classes from the dataset, and adding the label files so that the YOLOv4 model could use the data for training. The original dataset contained 10 classes of Indian dishes; to keep only the 5 selected classes, we wrote functions to remove the unwanted images, delete the class IDs assigned to the removed classes, and renumber the remaining IDs in the correct order for training. Around 1,000 images per class were taken for training, and the cfg file was changed accordingly: the number of filters and the maximum number of training iterations were set for 5 classes, and the learning-rate steps were set to 80% and 90% of the maximum iterations respectively.
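As a hedged illustration of these cfg edits, the values below follow the darknet repository's published conventions for a 5-class custom model; the exact numbers used in this paper's configuration may differ:

```text
# yolov4-custom.cfg — edits for 5 classes (illustrative, per darknet conventions)
max_batches = 10000        # convention: classes * 2000
steps = 8000,9000          # 80% and 90% of max_batches

# In each of the three [yolo] layers:
classes = 5
# In the [convolutional] layer directly before each [yolo] layer:
filters = 30               # (classes + 5) * 3
```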
After completing the data pre-processing for training, the next step was creating the custom input data files, obj.data and obj.names.
The first file contains the paths needed by the training run: the training list, the testing list, and the directory in which to store the backup weights files. The second file, obj.names, lists the classes in the same order in which they are numbered in the labels. Any additional class or incorrect ordering would lead to wrong training.
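As an illustrative sketch, an obj.data and obj.names pair for the five classes described here might look as follows (the file paths are assumptions, not taken from the paper):

```text
# obj.data
classes = 5
train  = data/train.txt
valid  = data/test.txt
names  = data/obj.names
backup = backup/

# obj.names — one class per line, in label-ID order
Aloo_paratha
Biryani
Poha
Khichdi
Chapati
```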
To split the data into training and testing sets, all image paths are added to a text file. Using pandas, these files are then loaded into data frames, which are divided into 70% for training and 30% for testing.
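The split described above can be sketched in a few lines of pandas. This is a minimal illustration, not the paper's code; the file names and the in-memory path list are assumptions (the paper first collects the paths into a text file, which loads the same way):

```python
import pandas as pd

# Assumed stand-in for the collected image paths (one per line in practice).
image_paths = [f"data/obj/img_{i:04d}.jpg" for i in range(100)]
df = pd.DataFrame({"path": image_paths})

# Shuffle for an unbiased split, then take the first 70% for training.
df = df.sample(frac=1.0, random_state=42).reset_index(drop=True)
cut = int(0.7 * len(df))
train_df, test_df = df.iloc[:cut], df.iloc[cut:]

# Darknet expects plain text files listing one image path per line, e.g.:
# train_df["path"].to_csv("data/train.txt", index=False, header=False)
```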
The Google Colab platform is used for training; YOLOv4 can be trained there using the darknet repository. To train YOLOv4 on a custom dataset, certain files must be changed: all images are placed in the data folder under a sub-folder called obj. This obj folder is later referenced by the names and data files. Once this is done and the obj.data and obj.names files are created, training of the model is started. A GPU is required for training the YOLOv4 model.
A. Algorithm
A CNN can be combined with the predictions of YOLOv4. After YOLOv4 has completed object detection, its output can be fed into a CNN model. One strategy is to use the bounding boxes produced by YOLOv4 to crop the identified items from the original image, and then run the CNN model on the cropped images to extract features.
The following approach can be implemented:
The YOLOv4 algorithm's final output is a set of bounding boxes, each of which represents an object found in the image or video. Each bounding box also carries a class label and a confidence score expressing how certain the network is about the object's classification.
By using a CNN model to extract more specific and discriminative features from the objects, this method may increase the accuracy and specificity of the object detection pipeline. However, because the CNN model must analyse numerous cropped images, it may significantly raise computational cost and memory use.
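The cropping step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name is hypothetical, and the boxes are assumed to arrive in YOLO's centre/width/height pixel convention:

```python
import numpy as np

def crop_detections(image, boxes):
    """Crop each YOLOv4 bounding box out of the image for a second-stage CNN.

    `image` is an H x W x 3 array; `boxes` is a list of (x, y, w, h) tuples in
    pixel coordinates, where (x, y) is the box centre (the convention YOLO uses).
    """
    h_img, w_img = image.shape[:2]
    crops = []
    for (x, y, w, h) in boxes:
        # Convert centre/size to corner coordinates, clamped to the image bounds.
        x1 = max(int(x - w / 2), 0)
        y1 = max(int(y - h / 2), 0)
        x2 = min(int(x + w / 2), w_img)
        y2 = min(int(y + h / 2), h_img)
        crops.append(image[y1:y2, x1:x2])
    return crops
```

Each returned crop can then be resized to the CNN's input size and passed through the network for feature extraction.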
B. Flowchart
Figure 1 represents the overall flow of how the names of the food items present in a given image are detected. In step 1, the input image is fed to the YOLOv4 model and the results are stored; the bounding box drawn by YOLOv4 around each detected object is then cropped and serves as input to the second model, a CNN. Finally, the results of both models are combined into one final output.
IV. RESULTS AND DISCUSSION
With the use of customised datasets, the powerful object detection system YOLOv4 can be trained to recognise specific objects. The standard measure of an object detection model's accuracy is mean average precision (mAP), used to evaluate models such as R-CNN and YOLO. The mAP computes a score by comparing each detected box to the ground-truth bounding box; the higher the score, the more precise the model's detections.
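The box comparison underlying mAP is Intersection over Union (IoU): a detection is counted as correct when its IoU with a ground-truth box exceeds a threshold (commonly 0.5). A minimal sketch, assuming corner-coordinate (x1, y1, x2, y2) boxes for illustration:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Intersection area is zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

Averaging precision over recall levels per class, and then over classes, yields the mAP figures reported below.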
Figure 2 shows the progress of training: the mAP curve rising and the loss curve falling over the first 3,600 iterations. In this model, mAP is calculated only after the first 1,000 iterations, so the figure shows it starting at a 22% mAP score at 1,000 iterations. Continuing in figure 3, the mAP score reaches 78% after 7,000 iterations and rises to 87% by 9,000 iterations; figure 4, which shows the last 1,000 iterations of training, shows the mAP score settling at 86-87%.
The loss, which gradually decreases across all iterations, can be seen in all three figures 2, 3 and 4. The blue curve represents the training error or loss (more specifically, the Complete Intersection-over-Union, or CIoU, loss used by YOLOv4) on the training dataset.
V. FUTURE SCOPE
Extension of the food dish predictor to recognise other Indian foods and perhaps other cuisines could be the subject of future research. Using larger datasets and implementing additional pre-processing methods to increase the quality of the training images could also help the model perform even better.
VI. CONCLUSION
We have created a food dish predictor using YOLOv4 to recognise five well-known Indian dishes: Aloo paratha, Biryani, Poha, Khichdi, and Chapati. Trained on a sizable dataset of annotated photos, the proposed model demonstrated high accuracy in detecting and identifying Indian food items, localising each item within the image.

Table 1. Loss and mAP values
Iteration   Loss   mAP (%)
1k          3.9    22
2k          3.8    50
3.6k        2.5    69

Table 1 shows the loss and mAP values. Training ran for 3,600 iterations, with the loss and mAP recorded roughly every 1,000 iterations. As training progresses, the loss decreases and the mAP improves, indicating that the model is becoming better at identifying the dishes. The findings of this investigation show how well YOLOv4 recognises and classifies food items, particularly Indian cuisine items. Potential applications of the proposed food dish predictor include meal delivery services, restaurant menu recognition, and food recommendation systems. Overall, a YOLOv4-based food dish predictor has significant implications for the food sector and has the potential to change the way we recognise and identify food dishes.
Copyright © 2023 Preeti Bailke, Krisha Patel, Aayushi Patel, Rohan More, Sudhanshu Pathrabe, Shreyash Patil. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET53481
Publish Date : 2023-05-31
ISSN : 2321-9653
Publisher Name : IJRASET