Artificial Intelligence-Based Voice Assistant

Authors: Vigneswaran A, Dr. Gowri J, Aakash B

DOI Link: https://doi.org/10.22214/ijraset.2022.48147

Abstract

Machines replaced workers throughout the industrial revolution, pushing more people into the service industry. Chatbots and voice assistants, which can offer support to customers or users, are now part of the digital revolution's impact on this field. Voice assistants (VA) are a type of voice-enabled artificial intelligence (AI). AI refers to some level of intelligence displayed by digital interfaces, or the ability of algorithms to mimic intelligent human behaviour. More specifically, AI refers to the “cognitive” functions that we associate with the human mind, including problem solving and learning. The counselling response model provides a suitable response by combining the user's input with the user's emotional status; this can have a consoling effect that helps a user burdened with depression feel better.

I. INTRODUCTION

A voice assistant (VA) is a form of artificial intelligence that can respond to voice commands. Voice is currently incorporated in various products in consumers' homes, including smartphones and smart speakers, and voice assistants are growing more and more important in our daily lives. AI-based voice assistants are software systems that can recognize the human voice and respond with synthesized voices. This voice assistant gathers audio from the microphone and converts it into text. Text is later passed to pyttsx3 to be spoken aloud. pyttsx3 supports multiple TTS engines, including sapi5, nsss, and espeak. Just as human personalities influence how we connect with the world, voice assistant personalities (VAP) can affect how we interact with our surroundings on a daily basis.
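As a minimal sketch of the speech-output step, the snippet below picks one of the three pyttsx3 engines named above according to the platform and speaks a reply. The platform-to-driver mapping and the helper names `driver_for_platform` and `speak` are illustrative assumptions rather than the authors' code, and pyttsx3 has to be installed separately.

```python
import sys


def driver_for_platform(platform: str = sys.platform) -> str:
    """Return the pyttsx3 driver name usually associated with a platform."""
    if platform.startswith("win"):
        return "sapi5"   # Microsoft Speech API on Windows
    if platform == "darwin":
        return "nsss"    # NSSpeechSynthesizer on macOS
    return "espeak"      # eSpeak on Linux and other platforms


def speak(text: str) -> None:
    """Speak text aloud. pyttsx3 is imported lazily so this file loads without it."""
    import pyttsx3  # third-party: pip install pyttsx3

    engine = pyttsx3.init(driverName=driver_for_platform())
    engine.say(text)        # queue the utterance
    engine.runAndWait()     # block until speech has finished
```

Calling `speak("Hello, how can I help?")` would then use whichever engine matches the machine the assistant runs on.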

This analysis identifies seven personality traits shared by three popular applications: Microsoft's Cortana, Google's Assistant, and Amazon's Alexa. This study uses and extends flow theory to investigate why VAP has the impact it does, and which aspects of VAP generate the voice interaction flow experience that can influence consumers' attitudes, behavioural intentions, and current mood.

The analysis reveals that voice engagement with a virtual assistant that integrates operational intelligence, sincerity, and creativity encourages customers to take charge of their voice interactions with the VA, focus on them, and engage in exploratory behaviour. Consumer satisfaction and willingness to use voice assistants are influenced by this exploratory behaviour. VAP refers to the attribution of cognitive, emotional, and social human traits to a VA in order to personalize interactions with customers. Consumers are more engaged in their dealings with a VA because of these human-like features.

II. EXISTING SYSTEM

Suppose a user wants to open an application such as Paint, MS Office, or VLC player on a personal computer or laptop. They need to search for the application manually and then click it to open it. With the latest operating system versions, users often struggle to search for and open applications. The existing system has no voice assistance option for opening applications, so opening one takes a lot of time, and the system is not user-friendly. Another important problem is sentiment analysis: in the existing system, sentiment analysis was performed only on Twitter data, and no sentiment analysis is provided for user voice input.

A. Drawbacks

Time-consuming process
Sentiment analysis is performed only on Twitter data
The existing system has no voice assistance option for opening applications
It is not a user-friendly application

III. PROPOSED METHODOLOGY

The drawbacks of the existing system can be overcome by using this application. The main objective of the proposed system is to provide a user-friendly application that gives users a voice assistance option for opening any application using AI. Users no longer need to open applications manually. Whenever a user gives voice input, the application analyses the voice data using AI and then opens the requested application. The application also integrates sentiment analysis: after receiving the voice input, it converts the voice to text, passes the text to the sentiment analyzer, and finally displays the sentiment to the user.

A. Advantages

Less time-consuming process
Effective sentiment analysis is performed on user voice input
The proposed system provides a voice assistance option for opening any application
It is a user-friendly application

IV. INPUT PROCESSING

Input design converts user-oriented inputs into computer-based representations. Inaccurate input data is the most common cause of errors in data processing. Errors entered by the data entry operator can be controlled through input design. The goal of input design is to make data entry easy, logical, and as free from errors as possible. The proposed system is completely menu-driven, which is a powerful approach to interactive design. It helps the user understand the range of options offered while also preventing them from making an incorrect choice. All of the entry screens are interactive and have been designed with the constraints of end users in mind.

Some other features included are:

The form title clearly states the purpose of the form
Adequate space is given for data entry
Data validation is performed to eliminate duplicate entries

V. PROCESSED RESPONSE

Outputs are the most important and direct source of information for the customer and management. Intelligent output design improves the system's relationship with the user and helps in decision-making. Outputs are also used to make a permanent record of the results for later consultation. The output generated by a system is often regarded as the standard for evaluating its performance.

The output design was based on the following factors.

Usefulness: determining the various outputs to be presented to the system user.
Differentiating between the outputs to be displayed and those to be printed.
The format in which the output will be presented.

For the proposed system, the output must be compatible with the existing manual reports, and the outputs are formatted with this consideration in mind. The outputs are obtained after all phases of the system have run and can be displayed or produced as hard copy. Text output is preferred, since the controller section can use it for future reference and for maintaining records.

VI. SEGMENTS

A. User Enrollment Process

This module lets users register with the application. Registration is mandatory, since it is required before users can use the voice assistance options for opening applications. In the registration form, the user fills in personal details such as name, address, date of birth, mobile number, and email address, and selects a username and password; the username must be unique. All the details are stored in the user table. Users can then log in to the software with their username and password.
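The enrollment flow can be sketched with Python's built-in `sqlite3` module. The paper does not name a database or describe how passwords are stored, so the column names, the salted PBKDF2 hashing, and the helper names `create_user_table`, `register`, and `login` below are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import sqlite3


def create_user_table(conn: sqlite3.Connection) -> None:
    # The PRIMARY KEY constraint enforces the unique-username rule.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS user (
               username TEXT PRIMARY KEY,
               name TEXT, address TEXT, dob TEXT, mobile TEXT, email TEXT,
               salt BLOB NOT NULL, password_hash BLOB NOT NULL)"""
    )


def _hash_password(password: str, salt: bytes) -> bytes:
    # Assumption: store a salted hash instead of the plain password.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn: sqlite3.Connection, username: str, password: str, **details) -> bool:
    """Insert a new user; return False if the username is already taken."""
    salt = secrets.token_bytes(16)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                (username, details.get("name"), details.get("address"),
                 details.get("dob"), details.get("mobile"), details.get("email"),
                 salt, _hash_password(password, salt)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def login(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Check a username and password against the stored salted hash."""
    row = conn.execute(
        "SELECT salt, password_hash FROM user WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(stored, _hash_password(password, salt))
```

With an in-memory database (`sqlite3.connect(":memory:")`), a second `register` call with the same username returns `False`, which mirrors the uniqueness requirement.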

B. Voice Process

This module is dedicated to voice input from users. Speech recognition, or speech-to-text, is the ability of a machine or program to identify words spoken aloud and convert them into readable text. This readable text is used to open applications and carry out the other activities in the system. Spoken output is produced with a Python text-to-speech converter, which relies on engines such as sapi5 and nsss.
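The paper does not say which library performs the speech-to-text step. The sketch below assumes the third-party SpeechRecognition package (which needs PyAudio for microphone access, and an internet connection for `recognize_google`); the helper names `normalize_transcript` and `listen_once` are hypothetical.

```python
import re


def normalize_transcript(text: str) -> str:
    """Lowercase a transcript and strip punctuation and extra whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(cleaned.split())


def listen_once() -> str:
    """Capture one utterance from the microphone and return it as text.

    Returns an empty string when the speech could not be understood.
    """
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for background noise
        audio = recognizer.listen(source)            # record until the speaker pauses
    try:
        return normalize_transcript(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        return ""
```

Normalizing the transcript before any further processing keeps later keyword and name matching independent of capitalization and punctuation.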

C. Pattern Matching

Pattern matching is a technique that finds predetermined patterns among sequences of raw data or processed tokens.

After the user's voice has been successfully converted to text, the text is compared with a list of process names in the system using a pattern-matching algorithm. The matching application is then opened based on the user's voice input. Pattern matching also helps with queries to the voice assistant that are not related to the computer system: the machine learning algorithm matches a query against the patterns of previously asked queries and gives the best and most relevant answer to the user.
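One simple way to compare recognized text with a list of process names is shown below, using exact substring matching with a fuzzy fallback from the standard `difflib` module. This is an illustrative stand-in, not the algorithm used in the paper; the `APPLICATIONS` table, its launch commands, and the 0.6 cutoff are assumptions.

```python
import difflib
import subprocess

# Hypothetical mapping from spoken application names to launch commands.
APPLICATIONS = {
    "paint": ["mspaint"],
    "vlc player": ["vlc"],
    "notepad": ["notepad"],
    "calculator": ["calc"],
}


def match_application(command: str, cutoff: float = 0.6):
    """Return the application name that best matches the command, or None."""
    text = command.lower()
    # First pass: the application name appears verbatim in the command.
    for name in APPLICATIONS:
        if name in text:
            return name
    # Second pass: tolerate recognition errors by fuzzy-matching each word.
    for word in text.split():
        close = difflib.get_close_matches(word, list(APPLICATIONS), n=1, cutoff=cutoff)
        if close:
            return close[0]
    return None


def open_application(command: str) -> bool:
    """Launch the matched application; return False when nothing matches."""
    name = match_application(command)
    if name is None:
        return False
    subprocess.Popen(APPLICATIONS[name])
    return True
```

For example, `match_application("open calculater")` still resolves to `"calculator"` despite the misrecognized word, while an unrelated request such as `"tell me a joke"` returns `None`.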

D. Text Classification

A support vector machine (SVM) is a machine learning algorithm that analyzes data for classification and regression analysis. It is a learning method that looks at data and sorts it into categories. An SVM maps the sorted data so that the margin between two categories is as wide as possible. Every sentence is segmented, and each keyword is matched against prefixes and suffixes.

Based on this analysis, each sentence is classified using the SVM classification algorithm.
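For reference, the idea of keeping the margin between two categories as wide as possible corresponds to the standard hard-margin SVM problem. The symbols used here (weight vector w, bias b, feature vectors x_i, and labels y_i in {-1, +1}) are notation introduced for this note and do not appear in the paper.

```latex
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1, \qquad i = 1, \dots, n
```

The distance between the two margin boundaries is 2/||w||, so minimizing ||w|| widens the margin while the constraints keep every training point on its correct side.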

1. Pros

a. With a distinct dividing margin, it works incredibly well.

b. It works well in high-dimensional spaces.

c. When the number of dimensions exceeds the number of samples, this method works well.

d. It is memory-efficient because it uses a subset of training points (called support vectors) in the decision function.

2. Cons

a. When we have a large data collection, it does not perform well since the necessary training time is longer.

b. When the data set contains more noise, such as overlapping target classes, it does not perform well.

c. SVM does not directly provide probability estimates; they are computed using an expensive five-fold cross-validation. This option is available in the SVC class of the Python scikit-learn library.

E. Text Result Analysis

This module is a very important part of the process: the user's voice input data is finally passed to the SVM classification algorithm. The algorithm classifies user sentences into sentiments such as positive, negative, and neutral, so that users can easily view sentence-based sentiment information.
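To make the classification step concrete, the sketch below trains a small linear SVM for each sentiment (one-vs-rest) by subgradient descent on the regularized hinge loss, using only the standard library. It is a simplified stand-in and not the authors' implementation, which would more likely rely on a library such as scikit-learn; the toy `TRAINING` sentences, the bag-of-words features, and the hyperparameters are invented for illustration.

```python
from collections import defaultdict

LABELS = ("positive", "negative", "neutral")

# Toy training sentences, invented for this example.
TRAINING = [
    ("i feel wonderful today", "positive"),
    ("this is a great result", "positive"),
    ("i feel terrible today", "negative"),
    ("this is an awful result", "negative"),
    ("open the browser", "neutral"),
    ("what time is it", "neutral"),
]


def tokenize(sentence: str) -> set:
    """Binary bag-of-words features: the set of lowercase words."""
    return set(sentence.lower().split())


def train(samples, epochs: int = 50, lr: float = 0.1, lam: float = 0.01):
    """Train one linear SVM per label by subgradient descent on the hinge loss."""
    weights = {label: defaultdict(float) for label in LABELS}
    for _ in range(epochs):
        for sentence, gold in samples:
            words = tokenize(sentence)
            for label in LABELS:
                w = weights[label]
                y = 1.0 if label == gold else -1.0
                margin = y * sum(w[word] for word in words)
                for word in list(w):      # L2 regularization: shrink every weight
                    w[word] *= 1.0 - lr * lam
                if margin < 1.0:          # hinge loss active: move toward the sample
                    for word in words:
                        w[word] += lr * y
    return weights


def classify(weights, sentence: str) -> str:
    """Return the label whose classifier gives the highest score."""
    words = tokenize(sentence)
    scores = {label: sum(weights[label].get(word, 0.0) for word in words)
              for label in LABELS}
    return max(LABELS, key=lambda label: scores[label])
```

After `weights = train(TRAINING)`, a sentence such as "such wonderful news" is classified as positive, because "wonderful" only ever receives positive updates in the positive classifier. Real voice input would need far more training data and richer features than this toy set.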

F. Other Modules

The voice assistant model also contains several smaller modules.

SMTP: The Simple Mail Transfer Protocol is used to send email entirely by voice.
Selenium: Selenium WebDriver is used to automate the browser actions that users request by voice.
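The mail-sending module can be sketched with the standard `smtplib` and `email` packages. The server host, port, and credentials are placeholders that the caller must supply, and the helper names `build_email` and `send_email` are assumptions; in the assistant, the subject and body would come from the recognized speech.

```python
import smtplib
from email.message import EmailMessage


def build_email(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text email message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_email(message: EmailMessage, host: str, port: int,
               username: str, password: str) -> None:
    """Send the message over a connection upgraded to TLS with STARTTLS."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(message)
```

Keeping message construction separate from sending makes it possible to read the draft back to the user for confirmation before anything leaves the machine.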

VII. WORKING MODEL

When the assistant first starts, it waits for the user to provide input. When a voice command is given, the assistant records it and searches it for a known keyword. If the assistant finds a keyword, it carries out the corresponding action and speaks the output back to the user. If not, the assistant goes back to waiting for input from the user. Each of these functions plays a distinct role in how well the entire system works.
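The wait, match, act, and respond cycle described above can be sketched as a small dispatch loop. The keyword table `ACTIONS` and its canned responses are hypothetical, and the commands are passed in as text so that the loop can be exercised without a microphone.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical keyword-to-action table; each action returns the text to speak.
ACTIONS: Dict[str, Callable[[str], str]] = {
    "time": lambda command: "Telling the time",
    "joke": lambda command: "Telling a joke",
    "open": lambda command: "Opening " + command.split("open", 1)[1].strip(),
}


def find_keyword(command: str) -> Optional[str]:
    """Return the first known keyword that appears as a word in the command."""
    words = command.lower().split()
    for keyword in ACTIONS:
        if keyword in words:
            return keyword
    return None


def run_assistant(commands: Iterable[str],
                  speak: Callable[[str], None] = print) -> List[str]:
    """Process commands one by one and return the responses that were spoken."""
    responses = []
    for command in commands:          # wait for the next input
        keyword = find_keyword(command)
        if keyword is None:
            continue                  # no keyword found: go back to waiting
        response = ACTIONS[keyword](command.lower())
        speak(response)               # speak the output back to the user
        responses.append(response)
    return responses
```

In a running assistant, `commands` would be a generator that yields one recognized utterance at a time, and `speak` would be a text-to-speech function instead of `print`.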

Conclusion

The proposed system is a simple, easy-to-use voice assistant for computers that performs simple operations such as responding to the user, opening programs (system apps, Google, and other applications), telling jokes, finding lyrics, sending emails, and providing dictionaries and converters. It has an OCR function that helps extract text from images. It can also be connected to the IoT to control electrical appliances such as lights, fans, and air conditioners by voice, turning them on and off (depending on the hardware and software).

Copyright

Copyright © 2022 Vigneswaran A, Dr. Gowri J, Aakash B. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET48147

Publish Date : 2022-12-14

ISSN : 2321-9653

Publisher Name : IJRASET
