This work presents a cursor control system that uses a webcam to capture human movements and a voice assistant to quickly traverse system controls. Using MediaPipe, the system lets the user move the computer cursor with hand motions and uses distinct hand gestures to perform actions such as left click and dragging. It also allows the user to select multiple items, adjust the volume, and adjust the brightness. The system is built with advanced Python libraries such as MediaPipe and OpenCV. Hand gestures and a voice assistant together control all I/O operations without physical contact. To recognise hand gestures and voice instructions, the project employs state-of-the-art machine learning and computer vision techniques, which operate effectively without the need for extra computer hardware. Hand gestures are a simple and natural way to communicate.
I. INTRODUCTION
Non-verbal communication in the form of gestures is used to convey a particular message. The movements of a person's body, hands, or face can carry this message, and gestures range from the simple to the incredibly complex. For example, we can point to something (an object or a person), or use structured gestures that are integrated with their own syntax and vocabulary, better known as sign languages. With the help of computers, humans can therefore communicate more effectively by employing hand motions as an input device.
Hand gestures take over mouse functions such as controlling the movement of a visual object. The work is intended to be low-cost, using inexpensive input devices such as a webcam to capture hand movements. On-screen content is manipulated by modelling predetermined, command-based gestures.
A. Scope and Proposed Model
There are several existing systems. One is the regular mouse, a hardware tool used to navigate the monitor; with it, hand motions cannot be used to access the screen.
The other is a gesture system that uses coloured tapes to identify gestures, and the functionalities it performs are static and basic in nature.
With the proposed system, a laptop or computer with a web camera and microphone can be used to control the mouse and execute simple operations without additional computer hardware. In addition, a voice assistant is used to perform further tasks.
II. LITERATURE SURVEY
A. Recognition of Hand Gestures
Gesture recognition is an active topic in computer science that involves developing systems which interpret human movements so that anyone can interact with a device without touching it directly. Gesture recognition is the process of detecting and representing gestures and turning them into a precise intended command. The aim of hand gesture recognition is to identify a given hand movement and map its representation to an output command on the device.
Across a variety of sources, three approaches to hand gesture recognition can be identified:
Machine Learning Methods
Algorithm Methods
Rule-based Methods
B. MediaPipe Framework
MediaPipe gives life to products and services we use on a daily basis. Unlike other machine learning frameworks that consume a lot of resources, MediaPipe uses very little; it is small and efficient enough to run on embedded IoT devices. Since its release in 2019, MediaPipe has opened up a whole new universe of possibilities for researchers and developers. MediaPipe implements the pipeline shown in Figure 2.1, which consists of two models, a palm detector and a hand landmark model, with a gesture recognizer built on top:
Palm detector model
Hand landmark model
Gesture recognizer
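A minimal sketch of this pipeline, assuming the mediapipe and opencv-python packages, is given below. It runs the palm detector and hand landmark model on webcam frames and draws the 21 landmarks per hand; the gesture-recognition logic built on top is project-specific and omitted here.

    # Minimal sketch of the MediaPipe hand-tracking pipeline (assumes the
    # mediapipe and opencv-python packages; gesture classification omitted).
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands
    mp_draw = mp.solutions.drawing_utils

    cap = cv2.VideoCapture(0)  # default webcam
    with mp_hands.Hands(max_num_hands=1,
                        min_detection_confidence=0.7,
                        min_tracking_confidence=0.7) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV captures BGR frames.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                for hand in results.multi_hand_landmarks:
                    # 21 normalised (x, y, z) landmarks per detected hand.
                    mp_draw.draw_landmarks(frame, hand,
                                           mp_hands.HAND_CONNECTIONS)
            cv2.imshow("Hand landmarks", frame)
            if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
                break
    cap.release()
    cv2.destroyAllWindows()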
C. Voice Assistant
The figure below shows the basic pipeline that forms the basis for any kind of voice assistant.
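Since the referenced figure is not reproduced here, the following minimal Python sketch illustrates the same record, recognise, and act loop. It assumes the SpeechRecognition and pyttsx3 packages and the Google Web Speech backend; the paper's assistant supports a richer command set.

    # Minimal record -> recognise -> act loop underlying a voice assistant
    # (assumes the SpeechRecognition and pyttsx3 packages).
    import speech_recognition as sr
    import pyttsx3

    recognizer = sr.Recognizer()
    engine = pyttsx3.init()

    def speak(text):
        engine.say(text)
        engine.runAndWait()

    def listen():
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source)
            audio = recognizer.listen(source)
        try:
            # Google Web Speech API via SpeechRecognition (needs internet).
            return recognizer.recognize_google(audio).lower()
        except sr.UnknownValueError:
            return ""

    while True:
        command = listen()
        if "exit" in command:
            speak("Goodbye")
            break
        elif command:
            speak("You said " + command)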
III. SYSTEM ARCHITECTURE
The proposed system is started by invoking either the voice assistant program or the gesture control program; either one can then launch the other. In the gesture control program, the user's gestures are captured through the webcam, each frame is passed through MediaPipe's hand-tracking module (mp.solutions.hands), and landmarks are established. From these landmarks a gesture is recognised with some computation, and a controller class then performs the action corresponding to the recognised command. This loop repeats for every frame. In the voice assistant program, speech is recorded through the microphone, interpreted as a command, and the corresponding action is performed.
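As a hedged illustration of the landmark-to-action stage, the sketch below classifies two simple gestures from the 21 MediaPipe landmarks and drives the cursor. The finger heuristic, the gesture-to-action mapping, and the use of pyautogui as the controller are illustrative assumptions, not the paper's exact rules.

    # Illustrative landmark -> gesture -> action stage (assumptions noted
    # above; pyautogui stands in for the controller class).
    import pyautogui

    SCREEN_W, SCREEN_H = pyautogui.size()

    def fingers_up(lm):
        """lm: the 21-landmark list from results.multi_hand_landmarks[i].landmark.
        Returns which of index..pinky fingers are extended (tip above PIP)."""
        tips, pips = (8, 12, 16, 20), (6, 10, 14, 18)
        return [lm[t].y < lm[p].y for t, p in zip(tips, pips)]

    def handle_gesture(lm):
        up = fingers_up(lm)
        if up == [True, True, False, False]:     # index + middle: move cursor
            # Landmarks are normalised to [0, 1]; scale to screen pixels.
            pyautogui.moveTo(lm[8].x * SCREEN_W, lm[8].y * SCREEN_H)
        elif up == [True, False, False, False]:  # index only: left click
            pyautogui.click()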
The project uses gesture control to provide the following functions (a sketch of the volume-control mapping follows the list):
Move the cursor
Stop gesture
Left click
Double click
Scrolling
Drag and Drop
Multiple Item Selection
Volume Control
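As an example of one item from the list above, the sketch below implements a pinch-based volume control on Windows via the pycaw library. The thumb-to-index distance mapping and the 0.25 normalisation constant are assumptions; the paper does not specify its exact formula.

    # Illustrative pinch-based volume control (Windows, via pycaw; the
    # distance-to-volume mapping is an assumption).
    import math
    from ctypes import cast, POINTER
    from comtypes import CLSCTX_ALL
    from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume

    _device = AudioUtilities.GetSpeakers()
    _iface = _device.Activate(IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
    _volume = cast(_iface, POINTER(IAudioEndpointVolume))

    def set_volume_from_pinch(lm):
        """lm: the 21-landmark list; map thumb-tip (4) to index-tip (8)
        distance onto the master volume."""
        d = math.hypot(lm[4].x - lm[8].x, lm[4].y - lm[8].y)
        # Clamp the normalised distance into [0, 1]; ~0.25 is a wide pinch.
        level = max(0.0, min(d / 0.25, 1.0))
        _volume.SetMasterVolumeLevelScalar(level, None)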
The project uses the voice assistant to provide the following functions (a sketch of the command dispatch follows the list):
a. Launch / Stop gesture recognition
b. Google Search
c. Find a location on google maps
d. File navigation
e. Date & time
f. Copy Paste
g. Sleep/wake voice assistant
h. Exit
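The sketch below suggests how recognised phrases might be dispatched to some of these actions. The phrase-matching rules and the Gesture_Controller entry point are assumptions; only the webbrowser, datetime, and pyautogui calls are standard.

    # Illustrative command dispatch for the voice assistant (matching rules
    # and Gesture_Controller entry point are assumptions).
    import datetime
    import webbrowser
    import pyautogui

    def execute(command):
        if "launch gesture recognition" in command:
            # Assumed entry point into the paper's Gesture_Controller.py.
            import Gesture_Controller
            Gesture_Controller.main()  # hypothetical function name
        elif "search" in command:
            query = command.split("search", 1)[1].strip()
            webbrowser.open("https://www.google.com/search?q=" + query)
        elif "find location" in command:
            place = command.split("find location", 1)[1].strip()
            webbrowser.open("https://www.google.com/maps/place/" + place)
        elif "date" in command or "time" in command:
            print(datetime.datetime.now().strftime("%d %B %Y, %H:%M"))
        elif "copy" in command:
            pyautogui.hotkey("ctrl", "c")  # system copy shortcut
        elif "paste" in command:
            pyautogui.hotkey("ctrl", "v")  # system paste shortcut
        elif "exit" in command:
            raise SystemExit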
IV. SYSTEM IMPLEMENTATION
The gesture control interface is started by running Gesture_Controller.py in the Anaconda prompt. The implementation uses Python, HTML, CSS, and JavaScript, with Anaconda as the platform.
V. ACKNOWLEDGEMENT
We would like to express our deep and sincere gratitude to our research guide, Mrs. Vemula Geeta, Assistant Professor, Computer Science and Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, for giving us the opportunity to do this research and providing invaluable guidance throughout. We are extremely grateful for what she has offered us.
VI. CONCLUSION
Hand gesture recognition combined with a voice assistant plays an important role in building efficient human-machine interaction. Implementations based on hand gesture recognition promise wide-ranging applications in the technology industry. MediaPipe, a framework based on machine learning, plays an effective role in developing such applications.