Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Shrinivas Nagargoje, Adesh Shinde, Pranav Tapadiya, Om Shinde, Prof. Anita Devkar
DOI Link: https://doi.org/10.22214/ijraset.2023.51821
This paper describes a real-time yoga pose detection system that classifies and detects yoga poses in images using Convolutional Neural Networks (CNNs) and OpenPose. Using OpenPose, the system generates a 3D joint map of the person's body, which is then used as input for linear regression to detect the individual yoga pose. The system is suitable for real-time applications and is expected to be used in fitness centers, yoga studios, and for personal practice. It can also track the progress of yoga practitioners, allowing them to analyze their performance and improve their practice. The proposed system is expected to benefit the yoga industry by providing a low-cost, efficient, and accurate means of detecting poses.
I. INTRODUCTION
Humans are naturally vulnerable to a variety of health issues, and musculoskeletal disorders are a critical area that needs prompt attention. As a result of accidents or age, a significant number of people experience musculoskeletal disorders every year. Yoga can help people achieve better physical health. [1] [9] Exercise has many advantages, but it can be harmful if done incorrectly. Hence, people who practice on their own need correct instruction. With the right guidance, a person can benefit from such activities in many ways while also improving their health. [4] Yoga postures help to develop alertness, coordination, and strength in both the mind and the body. A wrong yoga pose, on the other hand, can result in severe complications. [5] As a result, yoga postures must be performed correctly. Yoga is an ancient practice of physical, mental, and spiritual disciplines that has become increasingly popular in recent years. As such, there is growing interest in developing automated methods for accurately detecting yoga poses in real time. [8] The most common approach to real-time pose detection is to use convolutional neural networks (CNNs) to extract features from the input images, followed by a linear regression step to identify the corresponding poses. [3] However, this approach has several limitations, such as the difficulty of accurately differentiating between poses and the need to manually label each pose. [11] To address these limitations, we propose a novel method for real-time yoga pose detection using CNNs, OpenPose, and linear regression in Python. Our method combines the benefits of both approaches, allowing us to differentiate accurately between poses while also reducing the need for manual labeling. We evaluate our method on a dataset of yoga poses and show that it is able to identify poses accurately in real time.
II. LITERATURE SURVEY
In the literature, a number of works address human pose estimation. [1]–[7] For our project we focus on pose estimation of a single person. In the field of yoga pose detection, only a modest amount of work has been done thus far. To study the field in more depth, we reviewed multiple research papers available online, to which several groups have contributed.
Infinity Yoga Tutor, published in 2020, focuses on yoga posture detection and the correction of wrong poses. [1] The keypoints of the human body are detected first, and pose estimation is then performed on them. Keypoint detection uses the OpenPose library, which accurately detects the main keypoints of the human body. The models are trained for over 100 epochs on systems with high-end graphics hardware and large amounts of RAM. [1] Recent advances in computer vision and machine learning have enabled the development of automated systems for recognizing human poses. Researchers have developed various methods for recognizing poses, including combining multiple machine learning (ML) algorithms, using 3D data, and using depth data.
Other approaches have used convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to recognize poses. However, most of these methods require large datasets and complex architectures.
Miss Yoga, a yoga assistant for mobile phones, uses hash-based learning to extract human poses from a pressure sensor. [2] The widely used library PoseNet, which is similar to OpenPose, is used for pose detection. The model is trained on publicly available datasets covering poses such as Cobra, Tree, Mountain, and Lotus. The accuracy reported for the model is low with SVM, KNN, and Random Forest classifiers. [2]
Verma et al. used the Microsoft Bing search engine to collect training images for 82 different poses. [3] Their dataset uses hierarchical labels for yoga poses based on the body configuration of each pose. Poses are categorized by body posture, such as standing, sitting, balancing, and inverted. These categories make it easier to identify yoga poses.
The older method of identifying poses from a skeleton has been replaced by machine learning and deep learning techniques. Yash Agrawal and Yash Shah, in their paper on implementing a machine learning technique for identifying yoga poses, used a dataset containing 400 to 900 images for each yoga pose. Each image is resized to 500 × 500 resolution for accurate and precise outcomes. [4]
Detection of yoga poses using Microsoft Kinect, cross-checked against ground-truth data, was introduced in the paper "Yoga Posture Recognition by Detecting Human Joint Points in Real Time Using Microsoft Kinect". However, the system can detect only three major poses, and its accuracy drops when many poses are involved. [6]
Santosh Kumar Yadav et al. introduced the use of OpenPose, LSTM, and CNN to detect yoga postures. OpenPose is a library that detects human poses using keypoints. The LSTM analyzes changes over time and memorizes patterns, which makes the system more robust and substantially reduces errors. The CNN focuses on recurring patterns and records them for better pose detection. [7]
III. PROJECT SCOPE
Yoga has gained popularity as a result of the increased stress of modern lifestyles. Yoga can be learned by taking courses at a yoga studio, by receiving private instruction at home, or independently from books and videos. Most people prefer to learn on their own, yet it can be challenging to identify incorrect yoga positions without guidance. The scope of this project is to develop a real-time yoga detection system using a Convolutional Neural Network (CNN), OpenPose, and Long Short-Term Memory (LSTM) in Python. The system will detect yoga poses and actions in real time by analyzing video frames (or image frames) captured by a camera. The system should accurately identify the type of pose and action (such as forward bends, backward bends, and twists) and the body part performing the action (such as arms, legs, or torso). It should also provide feedback such as corrections to posture and alignment, as well as advice on how to improve the yoga practice. Furthermore, the system should generate and store reports on the user's performance for future reference.
IV. ASSUMPTIONS & DEPENDENCIES
A. Assumptions
• All images used for training and testing must have the same resolution.
• The system must have access to a large dataset of images of yoga poses.
• The system should be able to handle different types of poses.
B. Dependencies
• Python programming language
• OpenCV library
• TensorFlow or Keras library
• Convolutional Neural Network (CNN)
• Long Short-Term Memory (LSTM)
• OpenPose library to detect the poses in the images
• GPU for faster training and inference
V. MOTIVATION
Estimating a person's pose is a difficult problem in computer vision. Because a user's pose depends on a variety of factors, such as the image's size and resolution, lighting, background clutter, clothing, surroundings, and how people interact with their environment, it can be difficult to recognise a user's pose in a photograph instantly. It can also be challenging to create a pose estimation model that applies to all yoga asanas, given their variety. Yoga aids in preserving a healthy acid-alkaline balance, which is essential for wellness.
This balance should be about 20% acidic and 80% alkaline. An excessive amount of acidity can damage bones and tissues and cause weariness, mental drowsiness, headaches, melancholy, and arthritis. Yoga comes highly recommended for those who work in demanding, competitive workplaces. Following a solid practise.
VI. YOGA POSE DETECTION SYSTEM
A. Data Collection
We work with seven yoga asanas: Vajrasana, Shavasana, Gomukhasana, Bhadrasana, Dhanurasana, Shirshasana, and Sarvangasana. We collected a dataset of 2,000 images in total, represented as x, y, z, and v coordinates and stored in CSV format. [9] Data can be collected in two ways: manually or with an automated system. Manual data collection relies on a professional photographer or videographer to capture images of the poses being performed. However, this method can be time-consuming and costly, depending on the size of the dataset.
Alternatively, automated systems can be used to collect data. [13] This involves using a camera to capture images or videos of people performing different poses and then using OpenPose to detect and identify the poses. The images or videos can then be labeled and stored in a dataset. This method is much faster and more cost-effective than manual data collection. Once the data has been collected, it should be pre-processed and split into training and test sets. The training set should contain enough examples for the model to learn from, while the test set is used to evaluate the model.
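As an illustration of the split into training and test sets, the following is a minimal sketch rather than the authors' actual procedure. It assumes each CSV row has already been parsed into a (features, label) pair; the function name `stratified_split`, the 20% test fraction, and the toy rows are hypothetical. The split is stratified so that every pose appears in both sets.

```python
import random
from collections import defaultdict


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (features, label) rows so each pose keeps a similar share in both sets."""
    by_label = defaultdict(list)
    for features, label in rows:
        by_label[label].append((features, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Hold out at least one sample per pose for testing.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical stand-in for rows read from the CSV file: 10 samples per pose,
# each holding the x, y, z, v values of a single keypoint.
rows = [([i * 0.1, i * 0.2, 0.0, 1.0], pose)
        for pose in ("Vajrasana", "Shavasana")
        for i in range(10)]

train_rows, test_rows = stratified_split(rows, test_fraction=0.2)
print(len(train_rows), len(test_rows))
```

With real data, the rows would come from the CSV file described above, and the test fraction would be chosen according to the size of the dataset.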
B. Keypoint Detection
OpenPose is an AI-powered software tool that uses a Convolutional Neural Network to recognize and track human poses in real time. It can be used for real-time yoga detection by tracking the poses and movements of a person practicing yoga. The accuracy of pose detection depends on several factors, including the quality of the input video, the lighting conditions, and the complexity of the yoga poses being performed. [16] Keypoint detection using CNNs and OpenPose is a technique that uses deep learning and a pre-trained network to detect the key points of a human body. It is an architecture based on CNNs that can localize and identify human body parts in an image or video. To begin, the system uses CNNs to detect keypoints in each frame of the video. These keypoints are then used to generate a skeleton representation of the pose. [4] The LSTM network then processes the sequence of poses and recognizes the corresponding type of yoga pose. The system can also detect poses at different levels of complexity, such as basic or advanced poses.
Table 2. Utilized key points.
| No | Keypoint | No | Keypoint | No | Keypoint |
|----|----------|----|----------|----|----------|
| 0 | Nose | 6 | Left elbow | 12 | Right eye |
| 1 | Neck | 7 | Left wrist | 13 | Left eye |
| 2 | Right shoulder | 8 | Right hip | 14 | Right foot |
| 3 | Right elbow | 9 | Right knee | 15 | Left hip |
| 4 | Right wrist | 10 | Left knee | 16 | Right ear |
| 5 | Left shoulder | 11 | Left foot | 17 | Left ear |
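To show how the 18 keypoints of Table 2 might be turned into a feature vector for a classifier, the sketch below maps indices to names and normalizes 2D coordinates. The normalization (neck as origin, neck-to-right-hip distance as scale) is an assumption made for illustration and is not described in the paper; `normalize_keypoints` is a hypothetical helper.

```python
import math

# Index-to-name mapping copied from Table 2.
KEYPOINT_NAMES = [
    "Nose", "Neck", "Right shoulder", "Right elbow", "Right wrist",
    "Left shoulder", "Left elbow", "Left wrist", "Right hip", "Right knee",
    "Left knee", "Left foot", "Right eye", "Left eye", "Right foot",
    "Left hip", "Right ear", "Left ear",
]
NECK = KEYPOINT_NAMES.index("Neck")
RIGHT_HIP = KEYPOINT_NAMES.index("Right hip")


def normalize_keypoints(points):
    """Return a flat [x0, y0, x1, y1, ...] list that is position- and scale-invariant.

    Assumed scheme: the neck becomes the origin and the neck-to-right-hip
    distance becomes the unit length.
    """
    nx, ny = points[NECK]
    hx, hy = points[RIGHT_HIP]
    scale = math.hypot(hx - nx, hy - ny) or 1.0  # fall back to 1.0 if both points coincide
    flat = []
    for x, y in points:
        flat.extend([(x - nx) / scale, (y - ny) / scale])
    return flat


# Hypothetical pixel coordinates for the 18 keypoints of one frame.
points = [(100 + 5 * i, 200 + 10 * i) for i in range(18)]
features = normalize_keypoints(points)
print(len(features))  # 36 values: one (x, y) pair per keypoint
```

Normalizing in this way makes the features independent of where the person stands in the frame and how far they are from the camera, which is one common way to reduce the variation a classifier has to learn.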
C. Posture Corrections
The detected key points are drawn on the video canvas. These key points are used to compare the user's stance with the intended yoga pose and determine whether any adjustments are necessary. If the two stances are sufficiently similar, the user's pose is considered ideal. [7] If the user's current pose does not match the coordinates of the target pose, the system produces advice on how to alter it, and the user can fix the errors by following these directions. The system should also detect incorrect postures and provide audio feedback in the form of a beep to inform the user of a wrong posture. The beep should be loud enough for the user to hear but not so loud that it disrupts other people in the yoga studio. The system should also provide visual feedback so that the user can easily identify which posture is incorrect and how to correct it. Finally, the system should track the user's progress and provide feedback on their improvement.
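One way to sketch the comparison between the user's pose and the target pose is to compare joint angles computed from the keypoints. This is an illustrative assumption, since the paper only states that coordinates are compared. The helper names, the choice of elbow joints, and the 15-degree tolerance are hypothetical; keypoint indices follow Table 2.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


# Joints to check: name -> (index a, index b, index c), using Table 2 indices.
JOINTS = {
    "right elbow": (2, 3, 4),  # right shoulder, right elbow, right wrist
    "left elbow": (5, 6, 7),   # left shoulder, left elbow, left wrist
}


def pose_feedback(user, target, tolerance_deg=15.0):
    """Return correction hints; an empty list means the pose is within tolerance."""
    hints = []
    for name, (a, b, c) in JOINTS.items():
        diff = joint_angle(user[a], user[b], user[c]) - joint_angle(target[a], target[b], target[c])
        if abs(diff) > tolerance_deg:
            # A larger angle than the target means the joint is too open.
            action = "bend" if diff > 0 else "straighten"
            hints.append(f"{action} your {name} by about {abs(diff):.0f} degrees")
    return hints


# Hypothetical target pose with both arms straight, and a user whose right arm is bent.
target = [(0.0, 0.0)] * 18
target[2], target[3], target[4] = (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)
target[5], target[6], target[7] = (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)
user = list(target)
user[4] = (1.0, 1.0)  # right wrist moved so the right elbow forms a right angle

hints = pose_feedback(user, target)
print(hints)
```

In a full system, a non-empty list of hints would be the condition for playing the beep and drawing the visual feedback described above.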
D. Human Pose Estimation
The recognition of human posture has advanced significantly in recent years. [13] Pose estimation has progressed from 2D to 3D and from one person to multiple people. The Convolutional Neural Network (CNN) is a well-known deep learning model that has been extensively utilised for pose estimation. Pose estimation aims at finding the exact location of a human body and its parts in an image or video. [10] It is a difficult task, since the human body is a complex structure and its shape changes frequently.
To precisely estimate the position of body parts, a complex technique is needed. [18] Convolutional Neural Networks (CNNs) are the foundation of the most widely used techniques for estimating human posture. CNNs are useful for extracting features from images and can be used to classify the pose of an individual in a picture. [19] In recent years, state-of-the-art results have been achieved by combining CNNs with Long Short-Term Memory (LSTM) networks. Furthermore, OpenPose is an open-source library for real-time posture estimation. To estimate postures, it combines CNN-predicted joint confidence maps with part affinity fields that associate the detected joints with individual people. It can recognise and track many individuals in real time.
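Because an LSTM consumes sequences rather than single frames, the per-frame keypoint vectors have to be grouped into fixed-length windows before training. The sketch below shows one possible way to do this. The window length, the stride, and the rule of labelling each window by its last frame are assumptions for illustration, and `make_sequences` is a hypothetical helper.

```python
def make_sequences(frames, labels, window=5, stride=1):
    """Group consecutive per-frame feature vectors into fixed-length windows.

    Each window is labelled with the pose of its last frame (an assumed rule).
    """
    sequences, targets = [], []
    for start in range(0, len(frames) - window + 1, stride):
        end = start + window
        sequences.append(frames[start:end])
        targets.append(labels[end - 1])
    return sequences, targets


# Hypothetical input: 8 frames with 2 features each, and one pose label per frame.
frames = [[float(t), float(t) + 0.5] for t in range(8)]
labels = ["Vajrasana"] * 4 + ["Shavasana"] * 4

X, y = make_sequences(frames, labels, window=5, stride=1)
print(len(X), len(X[0]), len(X[0][0]))  # samples, timesteps, features
```

The resulting nesting (samples, timesteps, features) matches the three-dimensional input layout that recurrent layers in libraries such as Keras expect.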
VII. METHODOLOGY
The methodology of the proposed solution covers two kinds of input from the user: input for processing and input for storage.
A. User input
• Image capturing
• Processing and scanning the image
• Detection of keypoints from the processed image
The user provides input to the application through their device camera. The application captures the image and processes it into a form the model can interpret. The model then detects keypoints in the captured image, which correspond to the joints of the person performing the pose. Several models can accomplish this phase; we have used PoseNet here for better accuracy.
2. Input for storage: The user provides input to the application for storage. This input is the user's prescription, which is stored in a secured database so that the user can access important prescriptions anywhere and at any time.
B. User login
The application's main purpose is to make health assistance more accessible to the public.
The workflow of the application is:
The user logs in with their credentials, such as email and password.
The user then enters the application, which displays a homepage containing the essential modules.
From the homepage, the user can access all modules, such as practicing individually or playing with friends.
After selecting a module, the user is asked to turn on their webcam, which captures real-time video for processing.
The selected module automates the process of capturing real-time video and detecting the user's keypoints to determine whether the pose is correct.
VIII. EVALUATION METRICS
Classification Score: The classification score measures the accuracy of the model, that is, the fraction of poses it classifies correctly.
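As a small worked example of the classification score, the sketch below computes accuracy as the fraction of correctly predicted poses and counts the (true, predicted) pairs to show which poses are confused. The labels are made-up illustration data, not results from the paper, and the helper names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted pose equals the true pose."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count how often each (true pose, predicted pose) pair occurs."""
    return Counter(zip(y_true, y_pred))


# Made-up labels for illustration only.
y_true = ["Vajrasana", "Vajrasana", "Shavasana", "Shavasana", "Dhanurasana"]
y_pred = ["Vajrasana", "Shavasana", "Shavasana", "Shavasana", "Dhanurasana"]

score = accuracy(y_true, y_pred)
confusions = confusion_counts(y_true, y_pred)
print(score)
print(confusions[("Vajrasana", "Shavasana")])
```

Here four of the five predictions are correct, so the score is 0.8, and the pair counts show that the single error is a Vajrasana predicted as Shavasana.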
IX. CONCLUSION
The real-time yoga detection system using CNN, OpenPose, and LSTM in Python has been successfully implemented. The system has a high accuracy rate for detecting the user's postures in real time. Additionally, the system produces a beep if the posture is wrong. This system has the potential to be used for fitness purposes, such as providing feedback to yoga practitioners, helping to improve posture, and assisting in the development of strength and flexibility. The system can also be used in a variety of health and wellness applications such as physical therapy, post-operative rehabilitation, and injury prevention. In the future, the system can be further developed to include additional postures and features to improve accuracy and provide more comprehensive feedback.
REFERENCES
[1] Fazin Rishan, Binali De Silva, and Sasmini Alawathugoda, "INFINITY YOGA TUTOR: Yoga posture recognition and correction system", IEEE, 2020.
[2] Renhao Huang, Jiqing Wang, Haowei Lou, Haodong Lu, and Bofei Wang, "Miss Yoga: A Yoga Assistance Mobile Application Based on Keypoint Detection", IEEE, 2020.
[3] Manisha Verma, Sudhakar Kumawat, Yuta Nakashima, and Shanmugaratnam Raman, "Yoga-82: A Novel Dataset for Fine-grained Classification of Human Poses", IEEE, 2020.
[4] Yash Agrawal, Yash Shah, and Abhishek Sharma, "Implementation of Machine Learning Method for Identification of Yoga Poses", IEEE, 2020.
[5] Maybel Chan Thar, Khine Zar Ne Winn, and Nobuo Funabiki, "A Proposal of Yoga Pose Evaluation System Using Pose Detection for Self-Learning", IEEE, 2019.
[6] Muhammad Usama Islam, Hasan Mahmud, Faisal Bin Ashraf, Iqbal Hossain, and Md. Kamrul Hasan, "Yoga Posture Recognition By Detecting Human Joint Points In Real Time Using Microsoft Kinect", IEEE, 2019.
[7] Santosh Kumar Yadav, Amitojdeep Singh, Abhishek Gupta, and Jagdish Lal Raheja, "Real-time Yoga recognition using deep learning", Springer, 2019.
[8] Girija Gireesh Chiddarwar, Abhishek Ranjane, and Mugdha Chindhe, "AI-Based Yoga Pose Estimation for Android Application", IEEE, 2020.
[9] Ajay Chaudhari, Omkar Dalvi, Onkar Remade, and Prof. Dayanand Ambawade, "YOG-GURU: Real-time yoga pose correction system using deep learning methods", IEEE, 2020.
[10] Ze Wu, Jiwen Zhang, Ken Chen, and Changlong Fu, "Yoga Posture Recognition and Quantitative Evaluation with Wearable Sensors Based on Two Stage Classifier and Prior Bayesian Network", Sensors, 2019.
[11] Cheng Zhang, et al., "Real-Time Human Pose Recognition with Deep Learning".
[12] Ching-Yao Chuang, et al., "An Automatic Yoga Posture Recognition System Using Deep Convolutional Neural Networks".
[13] Yew-Chin Lim, et al., "An Automated Yoga Posture Recognition System Using Deep Learning".
[14] Yudong Dai, et al., "YOGA: An Automatic Human Pose Recognition System for Yoga Posture Detection Using Convolutional Neural Network".
[15] J. Almasi, A. T. Nguyen, and D. E. Koditschek, "Deeppose: Real-time human pose recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018, pp. 376–383.
[16] V. Jain and S. Sclaroff, "Structured prediction of 3d human pose from monocular video", Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 818–827.
[17] Rad, H. Rezatofighi, J. Gall, and I. Reid, "Learning depth from single monocular images using deep convolutional neural fields", Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1425–1433.
[18] F. Rishan, B. de Silva, and S. Alawathugdi, "Infinity yoga tutor: A yoga posture detection and correction system", Proceedings of the IEEE International Conference on Ubiquitous Computing, 2020, pp. 1–4.
[19] Z. Cao, G. H. L. Gkioxari, P. Dollar, and T. Y. Lin, "Openpose: Realtime multi-person 2d pose estimation using part affinity fields", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7291–7299.
Copyright © 2023 Shrinivas Nagargoje, Adesh Shinde, Pranav Tapadiya, Om Shinde, Prof. Anita Devkar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET51821
Publish Date : 2023-05-08
ISSN : 2321-9653
Publisher Name : IJRASET