The air writing and recognition system is a method that lets users write in the air with hand gestures: the MediaPipe module tracks the finger movements precisely, while AWS Textract recognises the written strokes and converts them into text.
The system applies machine learning techniques to analyse how the hand landmarks move over time and to distinguish the different gestures. With real-time tracking and accurate recognition, it offers an effective and user-friendly solution for air writing and other text-input tasks.
I. INTRODUCTION
The air writing and recognition system is an innovative technology that allows users to write in the air using hand gestures and accurately recognizes the written text.
This system leverages computer vision and machine learning techniques to track the movement of the user's hand and convert it into digital text.
The air writing and recognition system has a wide range of applications. It can be used in virtual reality environments, where traditional input methods like keyboards or touchscreens are impractical. It can also find applications in smart devices, enabling users to interact with devices using hand gestures instead of physical touch.
Overall, the air writing and recognition system combines computer vision, machine learning, and natural language processing techniques to enable users to write in the air and convert their gestures into digital text. It offers a unique and intuitive way of text input, expanding the possibilities for interaction in various domains.
Amazon Textract is built on the same proven, highly scalable deep-learning technology that Amazon's computer vision scientists use to analyse billions of images and videos every day. It offers straightforward, easy-to-use API operations that can analyse image and PDF files, so no machine learning expertise is required to use it. Amazon Textract continually learns from fresh data, and Amazon keeps expanding the service's functionality.
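As an illustrative sketch (not the project's exact code), a rendered stroke image can be passed to Textract's text detection API through boto3; the region name and file name below are placeholder assumptions:

```python
# Hedged sketch: sending a rendered stroke image to Amazon Textract for text detection.
# The region name and file name are placeholder assumptions, not values from the project.
import boto3

textract = boto3.client("textract", region_name="us-east-1")

with open("air_writing_canvas.png", "rb") as f:   # hypothetical exported canvas image
    image_bytes = f.read()

response = textract.detect_document_text(Document={"Bytes": image_bytes})

# Collect the detected words from the response blocks.
words = [block["Text"] for block in response["Blocks"] if block["BlockType"] == "WORD"]
print(" ".join(words))
```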
In recent years, air writing has become one of the most popular dynamic gestures. MediaPipe Hands is a high-fidelity hand and finger tracking solution: it uses machine learning (ML) to infer 21 3D hand landmarks from a single frame. The ability to perceive the shape and motion of hands in this way can be a major factor in building gesture-based interfaces.
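As a minimal sketch of how these landmarks are exposed, assuming the standard MediaPipe Hands solution API (the image path is a placeholder assumption):

```python
# Hedged sketch: extracting the 21 normalized hand landmarks from a single image
# with MediaPipe Hands. The image path is a placeholder assumption.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

image = cv2.imread("hand.jpg")                        # hypothetical input image
with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    for idx, lm in enumerate(results.multi_hand_landmarks[0].landmark):
        # Each of the 21 landmarks carries normalized x, y (and relative z) coordinates.
        print(idx, lm.x, lm.y, lm.z)
```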
A. Motivation
The motivation behind developing an air writing and recognition system stems from the limited accessibility of traditional text input methods such as keyboards and touchscreens. Individuals with disabilities or impairments may find it challenging to use conventional input devices.
Air writing provides an alternative method that can be more accessible and inclusive, allowing a wider range of individuals to engage in written communication.
Air writing taps into the natural human instinct of using gestures to communicate. By enabling users to write in the air using hand movements, the system provides an intuitive and familiar interaction paradigm.
It eliminates the need for learning complex keyboard layouts or operating touchscreens, making it easier for users to express themselves through writing.
B. Problem Statement
The problem addressed by the air writing and recognition project is the limited accessibility and usability of traditional text input methods, such as keyboards or touchscreens, for certain individuals or in specific contexts. This includes people with disabilities, situations where touch-based input is impractical, or virtual reality environments that lack physical input devices. The goal is to provide an alternative and intuitive method for users to write and input text.
II. RELATED WORKS
A. Chaur-Heh Hsieh, You-Shen Lo, Jen-Yang Chen, and Sheng-Kai Tang
Due to its potential use in intelligent systems, air-writing recognition has drawn a lot of attention. Some of the fundamental issues with isolated writing have not yet been adequately addressed.
An air-writing recognition method based on deep convolutional neural networks (CNNs) is presented in this work, together with a reliable and effective hand tracking method for extracting air-writing trajectories captured by a single web camera. The technique solves the push-to-write problem without requiring a delimiter or an imaginary writing box, thereby avoiding writing restrictions for users. A novel preprocessing method transforms the writing trajectory into suitable data representations, which makes training CNNs on these data easier and more efficient.
B. Ashutosh Kr. Pandey, Dheeraj, Manas Tripathi, and Vidyotma
In recent years, one of the most captivating and difficult research areas in the fields of image processing and pattern recognition has been writing in the air. It makes a significant contribution to the advancement of automated processes and can improve the interface between man and machine in a variety of applications. Object tracking is regarded as an important task in the realm of computer vision. The advent of faster PCs, the availability of affordable, high-quality video cameras, and the demand for automated video analysis have made object tracking systems increasingly popular.
C. Prof. S. U. Saoji, Nishtha Dua, Akash Kumar Choudhary, and Bharat Phogat
Writing in the air has been one of the most exciting and difficult research areas in image processing and pattern recognition in recent years.
It makes a significant contribution to the advancement of an automation process and can improve the interaction between man and machine in a variety of applications.
Several research studies have focused on novel strategies and methodologies that would cut processing time while improving recognition accuracy.
Object tracking is regarded as a critical task in the field of computer vision. Object tracking techniques have gained popularity as a result of faster computers, the availability of low-cost, high-quality video cameras, and the demands for automated video analysis.
In general, the video analysis technique consists of three basic steps: the object is detected, it is tracked from frame to frame, and its tracks are analysed to recognise its behaviour. The system has the potential to challenge traditional writing methods. It eliminates the need to hold a mobile phone to jot down notes, providing a simple on-the-go way to do the same, and it can also help specially abled people communicate more easily.
D. Hui Chen, Tarig Ballal, Ali H. Muqaibel, Xiangliang Zhang, and Tareq Y. Al-Naffouri
Recently, air-writing devices have been suggested as tools for human-machine interaction that allow for the writing of letters or numbers in the air to represent commands.
Various technologies have made different air-writing systems possible. This study proposes an acoustic wave-based air-writing system.
The proposed system consists of two components: a motion tracking component and a text recognition component. Direction-of-arrival (DOA) information is used for motion tracking: by monitoring the shift in the DOA of the signals, an array of ultrasonic receivers follows the movement of a wearable ultrasonic transmitter. The authors put forward a new 2-D DOA estimation algorithm, the phase-difference projection (PDP) algorithm, which uses the measured phase differences between the receiver array elements to follow the transmitter's changing direction.
III. FLOW DIAGRAM
IV. ALGORITHM
A. Algorithm For Hand Tracking
Import the necessary libraries: cv2, MediaPipe and time.
Set up the video capture using cv2.VideoCapture().
Initialize the MediaPipe hands module and drawing utility.
Enter the main loop for capturing video frames from the camera.
Read a frame from the video capture.
Convert the BGR image to RGB format, as the MediaPipe Hands module requires RGB images.
Process the RGB image with the MediaPipe Hands module to detect hands.
Check if any hands are detected in the frame.
For each detected hand, iterate over the landmarks and extract each landmark's index and its x and y coordinates.
Calculate the pixel coordinates (cx, cy) from the normalized landmark coordinates.
If the index fingertip is detected, draw a filled circle at the fingertip on the image.
Draw the hand landmarks and connections on the image using the drawing utility.
This is the basic algorithm for the hand tracking module. It continuously captures frames from the camera, detects hand landmarks using MediaPipe, and visualises the detected landmarks on the image along with the frames-per-second count. A minimal sketch of this loop is given below.
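The following sketch assumes the standard MediaPipe Hands solution API; the fingertip landmark id (8) and the drawing parameters are illustrative choices rather than the project's exact values:

```python
# Hedged sketch of the hand-tracking loop described above.
# Landmark id 8 (index fingertip) and the drawing parameters are illustrative assumptions.
import time
import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)
mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils
hands = mp_hands.Hands()

prev_time = 0
while True:
    success, frame = cap.read()
    if not success:
        break

    # MediaPipe Hands expects RGB input.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(rgb)

    if results.multi_hand_landmarks:
        for hand_lms in results.multi_hand_landmarks:
            h, w, _ = frame.shape
            for idx, lm in enumerate(hand_lms.landmark):
                cx, cy = int(lm.x * w), int(lm.y * h)   # pixel coordinates
                if idx == 8:                            # index fingertip
                    cv2.circle(frame, (cx, cy), 10, (255, 0, 255), cv2.FILLED)
            mp_draw.draw_landmarks(frame, hand_lms, mp_hands.HAND_CONNECTIONS)

    # Frames-per-second overlay.
    cur_time = time.time()
    fps = 1 / (cur_time - prev_time) if prev_time else 0
    prev_time = cur_time
    cv2.putText(frame, f"FPS: {int(fps)}", (10, 40),
                cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)

    cv2.imshow("Hand Tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```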
B. Algorithm For Virtual Canvas
Import the necessary libraries and modules: cv2, NumPy, os, time, handTrackingModule, boto3, textwrap, tkinter, pytesseract, and PIL.
Set up the AWS session and Textract client using your AWS access key, secret access key, and region.
Set up variables for pen and eraser thickness, as well as the folder path for header images.
Load the header images from the specified folder and store them in the overlay list.
Create a tkinter window and set up the UI elements, such as labels and a text widget.
Enter the main loop for capturing video frames from the camera:
Read a frame from the camera.
Flip the frame horizontally using cv2.flip.
Use the handTrackingModule to detect and track hands in the frame.
Get the hand landmarks and finger positions.
Based on the finger positions, determine the selected color and header image.
If the index and little fingers are extended, update the drawing position.
If the little finger is extended and other fingers are not, draw on the canvas using the selected color.
If the thumb, index, middle, ring, and little fingers are extended, enter convert mode.
Capture the inverted image of the canvas and send it to Textract for text recognition.
Extract the recognized words from the Textract response and append them to the words array.
Display the recognized words in the text widget.
If the thumb, index, and little fingers are extended and the middle and ring fingers are not, enter delete word mode.
Remove the last word from the words array and update the text widget.
Display the camera image with overlays and the canvas.
Update the tkinter window.
This is the basic algorithm for the virtual canvas. It continuously captures frames from the camera, tracks hand movements, allows drawing on the canvas, performs word recognition using Textract, and provides functionality for deleting words.
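As an illustrative sketch of the drawing step alone, the fingertip stroke can be drawn on a separate NumPy canvas and then merged onto the camera frame; the colour, thickness, and bitwise-merge strategy below are assumptions rather than the project's exact settings:

```python
# Hedged sketch of the canvas-drawing step only: draw a line on a separate canvas that
# follows the fingertip, then merge the strokes onto the camera frame.
# Colour, thickness and the bitwise-merge approach are illustrative assumptions.
import cv2
import numpy as np

def draw_stroke(canvas, prev_point, point, color=(255, 0, 255), thickness=15):
    """Draw one stroke segment from the previous fingertip position to the current one."""
    if prev_point is not None:
        cv2.line(canvas, prev_point, point, color, thickness)
    return point  # becomes the new previous point

def merge_canvas(frame, canvas):
    """Overlay the drawn strokes onto the camera frame."""
    gray = cv2.cvtColor(canvas, cv2.COLOR_BGR2GRAY)
    _, inv_mask = cv2.threshold(gray, 50, 255, cv2.THRESH_BINARY_INV)
    inv_mask = cv2.cvtColor(inv_mask, cv2.COLOR_GRAY2BGR)
    frame = cv2.bitwise_and(frame, inv_mask)   # black out the stroke area in the frame
    return cv2.bitwise_or(frame, canvas)       # paste the coloured strokes on top

# Example usage with a blank 720p canvas and two fingertip positions:
canvas = np.zeros((720, 1280, 3), np.uint8)
prev = draw_stroke(canvas, None, (100, 100))
prev = draw_stroke(canvas, prev, (200, 150))
frame = np.zeros((720, 1280, 3), np.uint8)   # stand-in for a camera frame
output = merge_canvas(frame, canvas)
```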
V. CONCLUSION
In conclusion, the project's goal is to create a system that enables users to write in the air using hand gestures and precisely recognises and translates those motions into words. The purpose behind the project, the problem definition, objectives, scope, limitations, assumptions, dependencies, and functional requirements are some of the important topics covered during the project. Maintaining effective communication and collaboration between the project team and stakeholders is crucial throughout the development phase, as is routinely checking on the status of the project. By implementing the air writing and recognition system effectively, the project aims to give users a novel and simple way to communicate with digital devices and to get around the constraints of conventional input methods. Overall, the air writing and recognition project endeavours to deliver a reliable, accurate, and user-friendly system that enhances the way people interact with technology and opens up new possibilities for input methods.
REFERENCES
[1] Chaur-Heh Hsieh, You-Shen Lo, Jen-Yang Chen, and Sheng-Kai Tang, "Air-Writing Recognition Based on Deep Convolutional Neural Networks", Oct. 2021.
[2] Ashutosh Kr. Pandey, Dheeraj, Manas Tripathi, and Vidyotma, "Air Writing Using Python (2021-2022)", May 2022.
[3] Prof. S. U. Saoji, Nishtha Dua, Akash Kumar Choudhary, and Bharat Phogat, "Air Canvas Application Using OpenCV and NumPy in Python", Aug. 2021.
[4] Md. Shahinur Alam, Ki-Chul Kwon, Md. Ashraful Alam, Mohammed Y. Abbass, Shariar Md Imtiaz, and Nam Kim, "Trajectory-Based Air-Writing Recognition Using Deep Neural Network and Depth Sensor", Jan. 2020.
[5] Hui Chen, Tarig Ballal, Ali H. Muqaibel, Xiangliang Zhang, and Tareq Y. Al-Naffouri, "Air-Writing via Receiver Array-Based Ultrasonic Source Localization", 2020.