OpenPose is a powerful tool for human pose estimation that localizes anatomical key points, i.e., body parts, in images. Leveraging part affinity fields and a multi-stage convolutional neural network (CNN) architecture, OpenPose detects human body parts and associates them into individual skeletons with strong performance. Its integrated design jointly learns part detection and part association, with the part affinity scores guiding the connections between different body parts.
I. INTRODUCTION
Yoga, rooted in ancient India, has evolved into a popular practice worldwide, attracting enthusiasts with its holistic benefits for mental, physical, and spiritual well-being. While combining yoga with sport is a longstanding practice, its adoption has surged over the last decade owing to its health advantages. A growing concern, however, is that individuals who begin practicing yoga on their own, without proper guidance, often sustain injuries due to incorrect postures.
Because trained instructors are not always accessible, computer vision and data science techniques offer an alternative: AI-driven software for yoga pose assessment can empower self-learners to practice yoga with precision and safety.
The proposed method employs OpenPose [5] and a PC camera to detect and assess yoga poses. As users perform a pose, the software calculates the difference between specified body angles in their execution and in an instructor's reference pose. If the deviation exceeds a predefined threshold, the software provides real-time feedback, guiding users to correct their posture for optimal effectiveness and injury prevention.
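As an illustration of this feedback loop, the minimal Python sketch below compares a learner's per-joint angles with an instructor's reference; the angle extraction itself (via OpenPose) is assumed to happen elsewhere, and the joint names and the 10-degree threshold are illustrative assumptions, not the system's actual parameters.

THRESHOLD_DEG = 10.0  # assumed maximum tolerated deviation per joint

def feedback(learner_angles, instructor_angles, threshold=THRESHOLD_DEG):
    # Both arguments map joint names to angles in degrees for the current frame.
    messages = []
    for joint, reference in instructor_angles.items():
        deviation = abs(learner_angles.get(joint, reference) - reference)
        if deviation > threshold:
            messages.append(f"Adjust {joint}: off by {deviation:.1f} degrees")
    return messages or ["Pose looks good"]

# Example with made-up angles for two joints:
print(feedback({"left_elbow": 150.0, "right_knee": 95.0},
               {"left_elbow": 170.0, "right_knee": 92.0}))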
II. RELATED WORK
Examining the existing literature reveals a multitude of performance assessment systems geared towards understanding human body movements by utilizing simulators, sensors, and various sensing equipment. However, many of these systems come with inherent limitations, including high costs and complexities that may deter widespread adoption among self-learners. Notably, the ease of use for diverse user groups remains an ongoing challenge for these technologies.
To address these challenges, a related system [1] takes a distinct approach, employing postural feature extraction with enhanced methods for feature point detection and assistant axis generation. It uses Microsoft Kinect and the OpenNI library to generate body skeletons for yoga poses, facilitating performance comparison, and integrates both an RGB camera and a depth sensor to obtain depth information about human body articulation.
One notable advantage of this methodology is its potential for wider applicability among self-learners. Unlike some earlier systems, it aims to be inclusive, accommodating a range of individuals seeking to engage in self-paced yoga practice. By relying on accessible hardware such as Microsoft Kinect and the OpenNI library, it makes the learning process more user-friendly and approachable for a broader audience.
III. PROPOSED ALGORITHM
A. SVM Algorithm
The Support Vector Machine (SVM) is a supervised learning algorithm applicable to both classification and regression tasks. Its goal is to find the hyperplane in an N-dimensional feature space (where N is the number of features) that best separates the data points of the different classes.
The optimal hyperplane is the one that maximizes the margin, i.e., the distance to the nearest training samples (the support vectors) of each class.
The dimensionality of the problem grows with the number of features: in a two-dimensional feature space the decision boundary is a line, while in higher-dimensional spaces it becomes a hyperplane that can no longer be visualized directly. Kernel functions allow the SVM to separate classes that are not linearly separable in the original feature space.
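As a rough illustration only (not the implementation used in this work), the following Python sketch trains an SVM on vectors of joint angles with scikit-learn; the synthetic data, the eight-angle feature vector, and the three pose labels are placeholder assumptions.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical training data: 200 samples of 8 joint angles (degrees),
# each labelled with one of three yoga poses.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 180.0, size=(200, 8))
y = rng.integers(0, 3, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Scale the angle features, then fit an SVM with an RBF kernel; the kernel implicitly
# maps the features into a higher-dimensional space where a separating hyperplane is sought.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

With real labelled joint-angle features, a pipeline of this kind could serve as the pose classifier in a system like the one described here.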
B. CNN Algorithm
The Convolutional Neural Network (CNN) is a class of deep learning models widely used for image recognition and other computer vision tasks, owing to its ability to extract intricate patterns and features directly from images.
A CNN alternates convolutional layers, whose learned filters respond to local patterns such as edges and textures, with pooling layers that reduce spatial resolution and make the representation more robust to small translations. Successive layers combine these low-level responses into increasingly abstract representations, which fully connected layers then map to task-specific outputs such as class scores.
CNNs have driven a paradigm shift in computer vision, with impact across diverse domains including healthcare, autonomous vehicles, surveillance, and artistic applications.
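For illustration, the sketch below defines a small CNN in Keras for classifying camera frames into a handful of pose classes; the input resolution, layer sizes, and number of classes are assumptions rather than the architecture used in this work.

from tensorflow.keras import layers, models

NUM_POSES = 3  # assumed number of target yoga poses

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),          # RGB frame from the PC camera (assumed size)
    layers.Conv2D(16, 3, activation="relu"),    # convolution extracts local visual patterns
    layers.MaxPooling2D(),                      # pooling reduces spatial resolution
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_POSES, activation="softmax"),  # one probability per pose class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # prints the layer structure and parameter counts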
C. Haar Cascade Algorithm
The Haar cascade algorithm is an efficient and versatile method for real-time object detection across a range of scales and image positions. Compared with more complex detectors, it is simple yet effective, which has made it a common component of machine vision applications.
The detector slides a window over the image at multiple scales and locations and evaluates simple Haar-like features within each window. The classifiers are arranged in a cascade: early stages use a few inexpensive features to quickly reject regions that clearly do not contain the target object, while later, stricter stages examine the remaining candidates, so that only windows likely to contain an object of interest pass through the entire cascade.
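As a concrete example of this mechanism, the sketch below runs OpenCV's bundled frontal-face Haar cascade over an image; the face cascade is a stand-in detector and the input file name is a placeholder.

import cv2

# Load a pretrained cascade shipped with OpenCV (frontal faces, used here only as an example).
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("frame.jpg")                  # placeholder input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # Haar features are computed on grayscale

# detectMultiScale slides windows over the image at multiple scales; each window passes
# through the cascade of increasingly strict stages, so most regions are rejected early.
detections = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in detections:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", frame)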
IV. BODY ANGLE CALCULATION
The proposed system uses the angle at each joint of the whole body to characterize each yoga pose. The angle between the two body parts meeting at a joint is calculated from the positions of the joint and its two adjacent key points.
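A standard way to express this angle, assuming each body part is represented as a 2D vector from the joint to its adjacent key point (an assumption; the authors' exact notation is not reproduced here), is:

theta = arccos( ((A - J) . (B - J)) / ( |A - J| * |B - J| ) )

where J is the position of the joint and A and B are the positions of the two neighbouring joints. A minimal Python sketch of this calculation, with hypothetical key-point coordinates:

import math

def joint_angle(joint, neighbor_a, neighbor_b):
    # Angle (in degrees) between the two body parts meeting at `joint`;
    # each argument is an (x, y) key-point coordinate, e.g. from OpenPose output.
    v1 = (neighbor_a[0] - joint[0], neighbor_a[1] - joint[1])
    v2 = (neighbor_b[0] - joint[0], neighbor_b[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Example: elbow angle from made-up shoulder, elbow, and wrist coordinates.
print(joint_angle(joint=(120, 200), neighbor_a=(80, 150), neighbor_b=(170, 240)))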
V. SIMULATION RESULTS
Pose Classification
The proposed method incorporates a pose classification step into the evaluation process. This classification gives self-learners an intuitive summary of their performance by categorizing each pose into one of four levels: "perfect," "good," "not good," and "bad." The level is determined from the average angle difference across all joints between the learner's pose and the instructor's reference.
The formula used for determining the result value is:
Result Value = Total Angle Difference / Total Number of Joints
This result value is then mapped to a performance level using a range function. The range spans from 0 to 9 and covers the possible degrees of deviation within a pose, up to a maximum of 360 degrees. This classification reflects how learners typically engage with feedback: they are primarily interested in an overall performance result and are motivated to raise their level. The thresholds for each performance level are defined over the result value, allowing learners to gauge the quality of their pose execution.
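A minimal Python sketch of this mapping is shown below; the numeric boundaries between the four levels are illustrative assumptions, since the text defines the thresholds only qualitatively.

def classify_pose(total_angle_difference, total_number_of_joints,
                  thresholds=(5.0, 15.0, 30.0)):
    # Result value = average angle difference per joint, in degrees (as defined above);
    # the three threshold values separating the levels are assumed for illustration.
    result_value = total_angle_difference / total_number_of_joints
    if result_value <= thresholds[0]:
        return "perfect"
    if result_value <= thresholds[1]:
        return "good"
    if result_value <= thresholds[2]:
        return "not good"
    return "bad"

# Example: a total deviation of 84 degrees spread over 12 joints averages 7 degrees per joint.
print(classify_pose(total_angle_difference=84.0, total_number_of_joints=12))  # -> "good"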
VI. CONCLUSION
In this paper, we introduced a performance evaluation system designed to function as a yoga pose training system for self-learners. The system operates in several stages: detecting the yoga pose and its skeleton, calculating body angle differences between the instructor and the learner, identifying incorrectly positioned body parts, and classifying each pose into one of four levels based on the average angle difference.
The efficacy of the proposed system was validated through practical applications involving individuals with diverse characteristics, such as varying ages, genders, and body shapes, performing three distinct yoga poses. The results demonstrated the system's adaptability and reliability across a spectrum of users, reinforcing its potential as a valuable tool for individuals engaging in self-paced yoga learning.
REFERENCES
[1] H.-T. Chen, Y.-Z. He, and C.-C. Hsu, "Computer Assisted Yoga Training System," Multimedia Tools and Applications, vol. 77, no. 18, pp. 23969-23991, September 2018.
[2] K.-M. Chen, W.-S. Tseng, L.-F. Ting, and G.-F. Huang, "Development and Evaluation of a Yoga Exercise Programme for Older Adults," Journal of Advanced Nursing, vol. 57, no. 4, 2007.
[3] H. E. Downs, R. Miltenberger, J. Biedronski, and L. Witherspoon, "The Effects of Video Self-Evaluation on Skill Acquisition with Yoga Postures," Journal of Applied Behavior Analysis, vol. 48, no. 4, pp. 930-935, 2015.
[4] M. Eichner and V. Ferrari, "Human Pose Co-Estimation and Applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2282-2288, November 2012.
[5] Z. Cao, G. Hidalgo, T. Simon, S.-E. Wei, and Y. Sheikh, "OpenPose: Real-time Multi-person 2D Pose Estimation Using Part Affinity Fields," December 2018, https://arxiv.org/abs/1812.08008.
[6] D. Osokin, "Real-time Multi-person Pose Estimation on CPU: Lightweight OpenPose," November 2018, https://arxiv.org/abs/1811.12004.
[7] B. Xiao, H. Wu, and Y. Wei, "Simple Baselines for Human Pose Estimation and Tracking," August 2018, https://arxiv.org/abs/1804.06208.