AI-Based Virtual Assistant Using Python: A Systematic Review

Authors: Patil Kavita Manojkumar, Aditi Patil, Sakshi Shinde, Shaktiprasad Patra, Saloni Patil

DOI Link: https://doi.org/10.22214/ijraset.2023.49519

Abstract

An intelligent virtual assistant (IVA), or intelligent personal assistant (IPA), is a software agent that carries out tasks or provides services in response to a user's instructions or inquiries. A virtual assistant that is accessed via web chat is sometimes called a "chatbot," and some online chat systems are used purely for amusement. Some virtual assistants can comprehend spoken language and answer with synthetic voices. In addition to asking their assistants questions, controlling home automation devices, and controlling media playback, users can use voice commands to manage other basic chores such as email, to-do lists, and calendars. The virtual personal assistant (VPA) is one of the best applications of artificial intelligence, offering a new way for people to delegate tasks to machines. Creating a VPA and using it in various software applications relies on certain approaches and principles. Speech recognition systems, also known as Automatic Speech Recognition (ASR), play a crucial role in enabling users to communicate with virtual assistants.


I. INTRODUCTION

In today's technological age, machines are taking over many tasks that humans once performed, and improved performance is one of the primary reasons. We now teach machines to think like people and to do tasks on their own, and the idea of a virtual assistant emerged as a result. A virtual assistant is a digital assistant that recognizes a user's voice commands and carries out their requests using speech recognition technology and language processing algorithms. It can filter out background noise and return pertinent information based on the user's particular needs.

Virtual assistants are entirely software-based, although they are now integrated into a variety of devices, and some assistants, like Alexa, are made expressly for particular devices. Given the recent dramatic changes in technology, these systems are trained using machine learning, deep learning, and neural networks. Voice assistants now allow us to communicate with our devices by speaking.

Today, many major corporations use voice assistant technology so that customers can speak to a machine for assistance. With voice assistants, we are reaching the stage where we can simply talk to our machines. Because they make it easier for humans to engage with machines, these virtual assistants are highly helpful for people who are elderly, physically disabled, blind, or caring for children. Even blind people who cannot see the device can communicate with it using only their voice.

A voice assistant can help with fundamental tasks such as reading a newspaper, getting mail updates, searching the web, playing music or videos, setting reminders and alarms, opening any program or application, and getting weather updates.

These are only a few examples; many more tasks can be supported depending on our needs. We have created a desktop-based voice assistant for Windows users that makes use of Python modules and libraries. The current technology can still be combined with machine learning and the Internet of Things (IoT) for further improvements. Nonetheless, this version of the assistant can accomplish all the fundamental functions discussed above.

The model was created using Python modules and libraries, and machine learning was used to train it. Certain Windows commands were also added to the model to ensure that it would function properly on this operating system.

Depending on the purpose for which the user needs assistance, our model will mostly operate in three modes:

Supervised Learning
Unsupervised Learning
Reinforcement Learning

Deep learning and machine learning can be used to implement these modes. The voice assistant makes it unnecessary to repeatedly type commands to carry out specific tasks. Once a model is built, it can be used as many times as necessary, by as many people as needed, in the simplest manner. Hence, with the aid of a virtual assistant, we will be able to manage numerous aspects of our environment from a single platform.

A. Aim and Motivation

The creation of genuine conversation between humans and machines is the primary objective of artificial intelligence (AI).

To strengthen the connection between humans and machines, numerous IT firms have developed various types of Virtual Personal Assistants (VPAs) based on their applications and use cases, including Google Assistant, Amazon Alexa, Apple's Siri, and Microsoft's Cortana. We have developed our own virtual personal assistant for Windows using Python, which can be used on any Windows version, including Windows 7, 8, and 10. Python is used as the programming language because it has a number of important libraries for carrying out the required tasks. Our personal virtual assistant can detect the user's speech and carry out tasks by using installed Python packages.

II. LITERATURE REVIEW

Virtual assistants have a long history, with many significant advancements over the years. Smartphones and wearable technology now come bundled with a voice assistant for dictation, search, and voice commands. Nowadays, practically all digital devices include voice assistants that help users control them through speech recognition. New ways to enhance the effectiveness of voice-based automated search are continually being developed.

A voice assistant that can understand commands and carry out tasks given by the user was created using Python and artificial intelligence technology [1]. The virtual assistant used NLP to translate user speech or text input into actionable commands. Because the process is quick, time is saved. The project's modular design makes it more adaptable and simple to grasp. All necessary Python packages were installed, and the code was written in the VS Code integrated development environment (IDE). The data for the various noises were obtained from the environment, and the project used Python 3.x.

Mobile professionals can use an intelligent computer secretarial service through the Virtual Personal Assistant [5]. The new service is built on the confluence of mobile, internet, and speech recognition technologies.

The VPA reduces the user's interruptions, enhances their time management, and offers a central hub for all of their communications, contacts, schedule, and information sources. The study also suggests a decision-making framework for call screening and for dealing with meeting and appointment requests.

The survey provided in [6] aids in acquiring a thorough knowledge of the distinctions between IBM Watson and Google Dialogflow as Natural Language Understanding platforms.

The survey also gave information on research done on various projects using IBM Watson and Google Dialogflow to identify their features. In this experiment, the Google Dialogflow-powered application ERAA was able to access other apps installed on the device, such as WhatsApp, Instagram, and Gmail. Its user-friendly interface, which was designed with the aid of Flutter, made access to the application simple.

The application's Speech Recognition functionality allowed users to carry out tasks by issuing Voice Commands. Also, the application may engage in light conversation with the user.

The work in [7] describes the design of a portable, large-vocabulary voice recognition system that functions accurately, quickly, and reliably. It achieves this by applying a CTC-based LSTM acoustic model, which predicts context-independent phones, and compressing it to a tenth of its original size using a mix of SVD-based compression and quantization. To achieve real-time performance on contemporary smartphones, quantized deep neural networks (DNNs) and on-the-fly language model rescoring were used.

The literature study on Smart Assistant [4] covers voice recognition algorithms, enabling SSH on Raspberry Pi 4, and face detection using OpenCV and Raspberry Pi.

The goal of the voice assistant in [3], which was created using Python, machine learning, and AI algorithms, is to support people by responding to their voice instructions. A voice recognition API is used to translate the user's audio input into an English sentence.

III. PROPOSED SYSTEM ARCHITECTURE

System architecture is the conceptual model that describes a system's structure, behavior, and other aspects. An architecture description is a formal description and representation of a system, organized to facilitate analysis of its structures and behaviors. A system architecture can comprise designed subsystems and system components that cooperate to implement the entire system.

This section gives a succinct summary of our findings after analyzing and comparing our suggested work. We have used Python, machine learning, and AI to implement this concept.

Our primary goal is to enable consumers to do their jobs using voice commands.

The suggested system has the following capabilities:

The system will continuously listen for commands, and it can adjust the amount of time it spends doing so as per user preferences.
The system will keep requesting the user to repeat their input the desired number of times if it cannot get the information from it.
The user's preferences can determine whether the system uses male or female voices.
The current version supports features including playing music, sending emails and texts, searching Wikipedia, opening system-installed programs, and accessing any website.
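The listening and repeat-request behavior listed above can be sketched as a small retry loop. This is a minimal sketch, not the authors' implementation: `listen_once` is a hypothetical callable standing in for whatever speech recognizer is used, and the parameter names `listen_seconds` and `max_attempts` are assumptions introduced for illustration.

```python
from typing import Callable, Optional


def listen_with_retries(
    listen_once: Callable[[float], Optional[str]],
    listen_seconds: float = 5.0,
    max_attempts: int = 3,
    prompt: Callable[[str], None] = print,
) -> Optional[str]:
    """Return the first recognized command, or None after max_attempts failures.

    listen_once is a hypothetical callable that listens for up to
    listen_seconds and returns the recognized text, or None (or an empty
    string) if nothing usable was heard.
    """
    for attempt in range(1, max_attempts + 1):
        text = listen_once(listen_seconds)
        if text:
            return text.strip().lower()
        if attempt < max_attempts:
            # Ask the user to repeat, up to the configured number of times.
            prompt("Sorry, I did not catch that. Please repeat.")
    return None


# Usage with a stand-in recognizer that fails twice, then succeeds.
_replies = iter([None, "", "Play Music"])
command = listen_with_retries(lambda seconds: next(_replies), listen_seconds=3.0)
print(command)
```

Passing the recognizer in as a function keeps the retry policy independent of any particular speech library, so the listening duration and the number of repeats can follow user preferences.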

IV. EXPERIMENTAL RESULTS

Using a virtual assistant saves time. A virtual assistant is software that can follow instructions and carry out tasks given by the user. Virtual assistants use NLP to translate user speech or text input into actionable commands.

With the aid of a virtual assistant, you can control devices such as laptops and PCs on your own. Because the process is quick, time is saved. The assistant is available to you constantly and can swiftly adjust to new needs. It can also assist others, such as relatives and coworkers.

V. IMPORTANT LIBRARIES/PACKAGES

A. Speech To Text

Speech-to-text is an application that transforms audio into text. It only transcribes what is said; it does not understand the meaning of the words.

B. Text Analyzing

To a computer, transcribed text is just a sequence of characters. Software must therefore analyze the text before the computer can act on it. Virtual assistants like Siri translate the text into a command that the computer understands: the VPA maps the words to functions and parameters to construct that command.
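Mapping words to functions and parameters can be illustrated with a simple keyword dispatcher. This is a minimal sketch under stated assumptions: the trigger phrases, the handler functions `play_music` and `search_web`, and the `COMMANDS` table are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Optional, Tuple


def play_music(argument: str) -> str:
    return f"Playing {argument or 'your default playlist'}"


def search_web(argument: str) -> str:
    return f"Searching the web for {argument}"


# Hypothetical trigger phrases mapped to handler functions.
COMMANDS: Dict[str, Callable[[str], str]] = {
    "play": play_music,
    "search for": search_web,
}


def parse_command(text: str) -> Optional[Tuple[Callable[[str], str], str]]:
    """Split transcribed text into (handler, parameter), or None if unknown."""
    cleaned = text.strip().lower()
    # Try longer triggers first so a longer phrase wins over a shorter overlap.
    for trigger in sorted(COMMANDS, key=len, reverse=True):
        if cleaned == trigger or cleaned.startswith(trigger + " "):
            return COMMANDS[trigger], cleaned[len(trigger):].strip()
    return None


def execute(text: str) -> str:
    parsed = parse_command(text)
    if parsed is None:
        return "Sorry, I do not know that command."
    handler, argument = parsed
    return handler(argument)


print(execute("Search for Python tutorials"))
```

The trigger selects the function and the remaining words become its parameter; a production assistant would likely replace the prefix match with a proper intent classifier.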

C. Speech Recognition Modules

The system makes use of the Python text-to-speech package pyttsx3 for speech synthesis. It converts text into spoken output. Note that pyttsx3 provides text-to-speech conversion only; speech-to-text conversion requires a separate speech recognition package. Spoken responses increase user interaction with the system.
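A possible way to produce spoken output with pyttsx3 is sketched below. It assumes pyttsx3 is installed (`pip install pyttsx3`) and that a voice can be chosen by matching part of its name; the default `"zira"` and the helper `choose_voice_id` are assumptions for illustration, since installed voice names differ between machines.

```python
from typing import Optional, Sequence


def choose_voice_id(voices: Sequence, preferred: str) -> Optional[str]:
    """Return the id of the first voice whose name contains `preferred`.

    Falls back to the first available voice, or None if there are no voices.
    Matching on the voice name is an assumption of this sketch.
    """
    for voice in voices:
        if preferred.lower() in (voice.name or "").lower():
            return voice.id
    return voices[0].id if voices else None


def speak(text: str, preferred_voice: str = "zira", rate: int = 170) -> None:
    """Speak `text` aloud with pyttsx3."""
    import pyttsx3  # imported lazily so the helper above works without it

    engine = pyttsx3.init()
    voice_id = choose_voice_id(engine.getProperty("voices"), preferred_voice)
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(text)
    engine.runAndWait()  # blocks until the queued speech has been spoken
```

Calling `speak("Hello, how can I help you?")` would read the sentence aloud through the default audio device.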

D. API Calls

An application programming interface (API) is a software interface that enables communication between two programs. In short, an API is the connection that routes a request to the provider and subsequently returns the provider's response to the requester.
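The request and response cycle can be sketched with the standard library alone. The endpoint `https://api.example.com/v1/weather`, its query parameters, and the JSON fields `city` and `temp` are hypothetical and serve only to illustrate the flow; no real provider is implied.

```python
import json
from urllib.parse import urlencode

# Hypothetical provider endpoint, used only for illustration.
BASE_URL = "https://api.example.com/v1/weather"


def build_request_url(city: str, units: str = "metric") -> str:
    """Encode the user's request as a URL the provider can understand."""
    return f"{BASE_URL}?{urlencode({'q': city, 'units': units})}"


def parse_response(body: str) -> str:
    """Turn the provider's JSON reply into a sentence the assistant can speak."""
    data = json.loads(body)
    return f"It is {data['temp']} degrees in {data['city']}."


url = build_request_url("New Delhi")
# A real client would now send `url`, for example with urllib.request.urlopen,
# and read the response body. A made-up reply stands in for it here.
sample_body = '{"city": "New Delhi", "temp": 31}'
print(url)
print(parse_response(sample_body))
```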

E. Python Backend

The Python backend receives the user's text-based request and analyzes it to determine whether it requires an API call or data extraction. The system can then return the appropriate response.
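One simple way to decide between an API call and data extraction is keyword matching, sketched below. The keyword lists and the function name `route_request` are assumptions for illustration; the paper does not specify how the backend makes this decision.

```python
from typing import Tuple

# Hypothetical keyword sets; a real assistant would tune or learn these.
API_KEYWORDS = ("weather", "news", "email")
EXTRACTION_KEYWORDS = ("who is", "what is", "tell me about")


def route_request(text: str) -> Tuple[str, str]:
    """Classify a text request as 'api_call', 'data_extraction', or 'unknown'."""
    cleaned = text.strip().lower()
    for keyword in API_KEYWORDS:
        if keyword in cleaned:
            return "api_call", keyword
    for phrase in EXTRACTION_KEYWORDS:
        if cleaned.startswith(phrase):
            # Everything after the phrase is treated as the topic to look up.
            return "data_extraction", cleaned[len(phrase):].strip(" ?")
    return "unknown", cleaned


print(route_request("What is the weather today?"))
print(route_request("Who is Alan Turing?"))
```

API keywords are checked first, so a question that mentions a service such as the weather is routed to that service even when it begins with a lookup phrase.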

F. Data Extraction

Data extraction involves extracting structured data from unstructured or semi-structured machine-readable documents. In our proposed system, this step returns the pertinent data that the user requested.
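As a small illustration of turning unstructured text into structured data, the sketch below pulls a task and a time out of a spoken reminder using a regular expression. The sentence pattern and the function `extract_reminder` are hypothetical examples, not part of the described system.

```python
import re
from typing import Dict, Optional

# Matches e.g. "remind me to call the bank at 5:30 pm".
REMINDER_PATTERN = re.compile(
    r"remind me to (?P<task>.+?) at (?P<hour>\d{1,2})(?::(?P<minute>\d{2}))?\s*(?P<period>am|pm)",
    re.IGNORECASE,
)


def extract_reminder(text: str) -> Optional[Dict[str, object]]:
    """Pull a structured reminder (task, 24-hour time) out of free text."""
    match = REMINDER_PATTERN.search(text)
    if match is None:
        return None
    hour = int(match.group("hour")) % 12  # maps 12 to 0 before the pm offset
    if match.group("period").lower() == "pm":
        hour += 12
    minute = int(match.group("minute") or 0)
    return {"task": match.group("task"), "hour": hour, "minute": minute}


print(extract_reminder("Please remind me to call the bank at 5:30 pm"))
```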

G. Text-To-Speech Module

This feature is quite helpful for people who struggle with reading. Using third-party libraries, text-to-speech engines can be created for a wide range of languages, dialects, and specialized vocabularies.

H. Datetime

The datetime module is used to display the date and time. It is part of the Python standard library, so no separate installation is needed.
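A short sketch of how the standard datetime module could produce a spoken time announcement follows. The function name `describe_time` and the sentence format are assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional


def describe_time(now: Optional[datetime] = None) -> str:
    """Return a sentence with the time and date for the assistant to speak."""
    now = now or datetime.now()
    # Numeric format codes avoid locale-dependent month and weekday names.
    return f"The time is {now:%H:%M} on {now:%d-%m-%Y}."


print(describe_time(datetime(2023, 3, 12, 15, 45)))
```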

I. Wikipedia

We use the wikipedia module in our project to search Wikipedia and retrieve information from it, since Wikipedia is an enormous source of knowledge, much like GeeksforGeeks or other reference sources. The module can be installed with pip install wikipedia.
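A Wikipedia lookup could be wired up roughly as follows. This is a sketch that assumes the third-party wikipedia package is installed; the helper names `wikipedia_query` and `answer_from_wikipedia`, and the optional `summarize` parameter used to swap in another lookup function, are hypothetical.

```python
from typing import Callable, Optional


def wikipedia_query(command: str) -> str:
    """Remove the trigger word so only the search topic remains."""
    return " ".join(command.lower().replace("wikipedia", " ").split())


def answer_from_wikipedia(
    command: str, summarize: Optional[Callable[[str], str]] = None
) -> str:
    """Return a short spoken answer for a command such as 'Alan Turing wikipedia'."""
    topic = wikipedia_query(command)
    if not topic:
        return "What should I look up on Wikipedia?"
    if summarize is None:
        import wikipedia  # third-party package, imported lazily

        def summarize(name: str) -> str:
            return wikipedia.summary(name, sentences=2)

    try:
        return "According to Wikipedia, " + summarize(topic)
    except Exception:
        # The package raises its own errors for ambiguous or missing pages;
        # a broad catch keeps this sketch short.
        return f"Sorry, I could not find a clear Wikipedia entry for {topic}."
```

For example, `answer_from_wikipedia("Alan Turing wikipedia")` would fetch a two-sentence summary when the package and a network connection are available.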

J. Web Browser

The webbrowser module, part of the Python standard library, is used to open web pages and conduct web searches in the user's browser.
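A web search can be sketched by building a search URL and handing it to the standard webbrowser module. The choice of Google as the search engine and the helper names are assumptions for illustration.

```python
import webbrowser
from urllib.parse import quote_plus


def build_search_url(query: str) -> str:
    """Build a Google search URL for the spoken query."""
    return "https://www.google.com/search?q=" + quote_plus(query.strip())


def search_web(query: str) -> bool:
    """Open the search results in the default browser; True if one was launched."""
    return webbrowser.open(build_search_url(query))


print(build_search_url("python voice assistant"))
```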

K. OS

Python's os module offers tools for interacting with the operating system. It is one of Python's standard utility modules and makes it possible to use operating system-dependent functionality.
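Opening a desktop program through the os module might look like the sketch below. The application table and its paths are hypothetical and must be adjusted to the local installation; `os.startfile` is available only on Windows, so the sketch checks for it before calling.

```python
import os
from typing import Dict, Optional

# Hypothetical application paths; adjust them to the local installation.
APPLICATIONS: Dict[str, str] = {
    "notepad": r"C:\Windows\System32\notepad.exe",
    "calculator": r"C:\Windows\System32\calc.exe",
}


def resolve_application(command: str) -> Optional[str]:
    """Return the path of the first known application named in the command."""
    cleaned = command.lower()
    for name, path in APPLICATIONS.items():
        if name in cleaned:
            return path
    return None


def open_application(command: str) -> bool:
    """Launch the requested application; returns False if it cannot be opened."""
    path = resolve_application(command)
    # os.startfile exists only on Windows, so check before calling it.
    if path is None or not hasattr(os, "startfile") or not os.path.exists(path):
        return False
    os.startfile(path)
    return True


print(resolve_application("Open Notepad please"))
```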

L. Pyjokes

Pyjokes is a Python library that provides one-line jokes, mostly aimed at programmers. The jokes ship with the package, so no internet connection is needed. We include it in our project to make the assistant more engaging.

M. Pyaudio

PortAudio is a cross-platform C library that interfaces with audio drivers. PyAudio is a set of Python bindings for PortAudio.

Conclusion

In this paper, we discussed a Python-based voice assistant for Windows. At present, the assistant operates as an application and carries out routine duties such as checking the weather, streaming music, searching Wikipedia, and opening desktop programs. The current system's functionality is restricted to working with application-based data only. Virtual assistants make people's lives easier, and users have the freedom to use a virtual assistant only for the services they require. Like Alexa, Cortana, Siri, and Google Assistant, our assistant makes use of artificial intelligence technology, and we created it using Python for all Windows versions. Using a virtual personal assistant to manage or organize your schedule is a good idea. Because they are more portable, reliable, and always accessible, virtual personal assistants can be more dependable than human personal assistants.

References

[1] Vishal Kumar Dhanraj, Lokesh Kriplani, Semal Mahajan, "Research Paper on Desktop Voice Assistant," International Journal of Research in Engineering and Science, Volume 10 Issue 2, February 2022.
[2] Prof. Suresh V. Reddy, Chandresh Chhari, Prajwal Wakde, Nikhi Kamble, "Review on Personal Desktop Virtual Voice Assistant using Python," International Advanced Research Journal in Science, Engineering and Technology, Vol. 9 Issue 2, February 2022.
[3] Nivedita Singh, Dr. Diwakar Yagyasen, Mr. Surya Vikram Singh, Gaurav Kumar, Harshit Agrawal, "Voice Assistant Using Python," International Journal of Innovative Research in Technology, Volume 8 Issue 2, July 2021, ISSN: 2349-6002.
[4] Edwin Shabu, Tanmay Bore, Rohit Bhatt, Rajat Singh, "A Literature Review on Smart Assistant," International Research Journal of Engineering and Technology (IRJET), Volume 08 Issue 04, April 2021.
[5] A. Sudhakar Reddy M, Vyshnavi, C. Raju Kumar, and Saumya, "Virtual Assistant using Artificial Intelligence and Python," Journal of Emerging Technologies and Innovative Research (JETIR), Volume 7 Issue 3, March 2020, ISSN: 2349-5162.
[6] Dr. Jaydeep Patil, Atharva Shewale, Ekta Bhushan, Alister Fernandes, Rucha Khartadkar, "A Voice Based Assistant Using Google Dialogflow and Machine Learning," International Journal of Scientific Research in Science and Technology, Volume 8 Issue 3, May 2021.
[7] Xin Lei, Andrew Senior, Alexander Gruenstein, Jeffrey Sorensen, "Accurate and compact large vocabulary speech recognition on mobile devices," in INTERSPEECH, 2013, pp. 662–665, ISCA.

Copyright

Copyright © 2023 Patil Kavita Manojkumar, Aditi Patil, Sakshi Shinde, Shaktiprasad Patra, Saloni Patil. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET49519

Publish Date : 2023-03-12

ISSN : 2321-9653

Publisher Name : IJRASET
