SARA: A Voice Assistant Using Python

Authors: Ayush Chinchane, Aryan Bhushan, Ayush Helonde, Prof. Kiran Bidua

DOI Link: https://doi.org/10.22214/ijraset.2022.44517

Abstract

Artificial intelligence technologies are starting to be actively utilized in human life, thanks to the Internet of Things' debut and widespread distribution. Autonomous gadgets are growing more intelligent in their interactions with humans and with each other. New capabilities have led to the development of various solutions for integrating smart devices into Internet of Things social networks. Recognizing human natural language is one of the most important trends in artificial intelligence. New insights into this area could lead to new forms of natural human-machine interaction, in which the computer learns to understand and engage with human language. A voice assistant is one such tool, and it can be integrated into a variety of intelligent systems. This paper outlines the basics of voice assistant operation, as well as its major flaws and limitations. It also explains an approach for building a local voice assistant without relying on cloud services, which could considerably expand future applications of such devices.

I. INTRODUCTION

Today, humans still rely on other humans for help or services. Digitalization has made it possible to shift much of that reliance onto systems, allowing work to be done more efficiently and reliably and giving people a device that can take care of their daily needs. Computers, cell phones, laptops, and other electronic devices have become an indispensable part of our daily lives. They can perform simple calculations as well as run complex programs, reducing monotonous work and wasted personnel effort. To solve problems quickly, virtual personal assistants have practically become a must-have feature in all electronic devices, and they can help the user in a variety of ways. Speech recognition is a relatively new addition to the virtual world. Although it is reasonably effective, its high error rate means that many users do not find it useful and do not use it. Even with an error rate of about 5%, a virtual assistant is not yet ready to become a routine part of the user's life. As a result, the project's goal is to develop a virtual assistant with low-error-rate speech recognition.

We developed a voice assistant that allows users to accomplish any task on the system without having to use a keyboard, decreasing the number of input devices.

The elderly, the visually and physically handicapped, children, and others benefit from virtual assistants since engaging with machines is no longer a challenge. Even blind people who can't see the computer can communicate with it simply by speaking to it. Some of the basic tasks that a voice assistant can assist you with are listed below.

Reading Newspaper
Getting updates on mail
Search on the web
Play music or video
Setting a reminder and alarm
Run any program or application
Getting weather updates

These are only some examples; many more tasks can be supported according to our requirements.

II. COMPREHENSIVE EXISTING WORK SURVEY

Works	Name	Algorithms/Techniques used	Type	Research
[1]	A Voice Based Assistant Using Google Dialogflow and Machine Learning	Artificial Intelligence, Natural Language Understanding, IBM Watson, Google Dialogflow, Speech Recognition	Personal Voice Assistant	In this project, the application, ERAA, developed with the help of Google Dialog Flow, is able to perform various tasks like accessing the other applications like WhatsApp, Instagram, and Gmail that are installed on the device. It is user-friendly and was developed with the help of Flutter, which provided ease in accessing the application. With the help of graphics packages in Flutter, they were able to develop an attractive user interface. It is able to perform the basic features as required in an ideal Personal Assistant.
[2]	Desktop Voice Assistant Using Natural Language Processing (NLP)	Speech Recognition, Python Backend, System Calls, Google-Text-To-Speech	Desktop Voice Assistant	In this study, they have developed a voice assistant that can perform any kind of task in exchange for commands given by the users without any error. They have added more features like listening to the user’s voice only and not being activated by environmental noise.
[3]	Desktop Assistant AI Using Python	Desktop Assistant, Python, Machine Learning, Text to Speech, Speech to Text, Language Processing, Voice Recognition, Artificial Intelligence, Internet of Things (IoT), Pyttsx3, Speech Recognition, SQLite	Desktop Voice Assistant	They discussed a Python-based voice-activated personal assistant in this paper. This assistant currently works online and performs basic tasks like weather updates, streaming music, searching Wikipedia, opening desktop applications, etc. The functionality of the current system is limited to working online only.
[4]	JARVIS: A PC Voice Assistant	Python[pyttsx] and gTTS[Google Text to Speech]	PC Voice Assistant	This voice assistant has automated various services using a single line command. It eases most of the tasks of the user like searching the web, retrieving weather forecast details, translating words from one language to another language, accessing youtube videos, sending mail through voice, and solving computational queries.
[5]	The Voice Enabled Personal Assistant for Pc using Python	Python, Quepy, Pyttsx3, Speech Recognition, SQLite	Personal Voice Assistant	This paper presents a comprehensive overview of the design and development of a voice-enabled personal assistant for the PC using the Python programming language. This voice-enabled personal assistant, in today's lifestyle, will be more effective in saving time compared to the previous days. Furthermore, there are many things that this assistant is capable of doing, like turning our PC off, restarting it, or reciting the latest news, with just one voice command.
[6]	Smart Voice Based Virtual Personal Assistants with Artificial Intelligence	Python, Text-to-Speech, Speech-to-Text, Voice Recognition	Virtual Assistant	In this paper, the design and implementation of an Intelligent Personal Voice Assistant are described. The project is built using available open-source software modules with visual studio code community backing, which can accommodate any updates in the future.
[7]	Voice Assistant Using Python	Desktop Assistant, Python, Text to Speech, Virtual Assistant, Voice Recognition	Voice Assistant	This paper discusses a voice assistant developed using Python. This assistant currently works as an application and performs basic tasks like weather updates, streaming music, searching Wikipedia, opening desktop applications, etc.
[8]	AI Based Voice Assistant Using Python	Speech Recognition, Python, Speech-to-Text, Text-to-Speech	Voice Assistant	In this paper, they have developed a voice assistant using Python. The project is built using open-source software modules with the PyCharm community. This project can be further improved by implementing the voice command in Google search queries.
[9]	Personal Assistant with Voice Recognition Intelligence	Google Voice Search, Voice Pattern Detection, Keyword Learning	Personal Voice Assistant	This paper focuses on "PARI", which is specially designed to help Native and Blind people who work on their Voice Commands. It also has the capability of recognizing voice commands without an internet connection. It has various functionalities for mobile devices, like network connection and managing various applications with just voice commands. It contains key features like Voice Pattern Detection, Keyword Learning, etc.
[10]	INTELLIGENT VOICE ASSISTANT	wake-word, voice assistant, voice recognition, Alexa, API-Application Program Interface, localization	Voice Assistant	This intelligent voice assistant responds to the commands given by the user. To accept the command, first, the voice recognition tool of the system has to be awakened to accept and execute the request. In this, various skills have been created in Hindi and Marathi languages, such as facts, good thoughts, weather, time, nearby hospitals, and a city guide. These skills are created using an Amazon Developer account.
[11]	Desktop Voice Assistant	Speech recognition, Python, and Google text-to-speech	Voice Assistant	Here they have developed a voice assistant which can perform any kind of task in exchange for commands given by the users without any error. They have also added more features to it, like that it will listen to the user’s voice only and will not be activated by environmental noise. All the packages required in the Python programming language have been installed and the code was implemented using the VS Code Integrated Development Environment (IDE).
[12]	Vitro: Designing a Voice Assistant for the Scientific Lab Workplace	Design Research, Conversational Agent, Augmented Scientific Workplace	Voice Assistant	In this paper, they have designed a voice assistant and also done research on how the voice assistant can play a major role in a laboratory.
[13]	Firefox Voice: An Open and Extensible Voice Assistant Built Upon the Web	Conversational user interface, CUI, Browser Extension, OpenSource	Open source voice assistant	This paper mainly focuses on "Firefox Voice", a voice assistant which is developed by the Mozilla Foundation and its subsidiary, the Mozilla Corporation. It shows how the voice assistant works and what its features are.
[14]	Voice Assistant Application for the Serbian Language	Voice assistant, continuous speech recognition, Kaldi speech recognition toolkit, Serbian, Android	Voice Assistant	This Voice Assistant is the first mobile phone application developed for the Serbian language, allowing faster and more natural communication between the phone and the user, even under noisy conditions. The results were highly improved by incorporating the noise itself within the acoustic model.
[15]	Home Automation using Arduino and Smart Phone	Nodemcu microcontroller, Google Assistant APP, IFTTT, Adafruit IO	Voice assistant for home	It is a home automation system using the Arduino Uno board and wireless fidelity technology. It accepts commands only through clicks. Home automation through Android mobile is designed for physically challenged and disabled people.
[16]	A voice based text mail system for visually impaired	Voice-based, Visually handicapped, Email System	Voice-based Text Mail System	Here they have proposed an android application designed specifically for visually challenged people. It provides a voice-based mailing service where they can read and send mail on their own, without any guidance. The users have to use certain keywords which will perform certain actions, e.g., read, send, compose mail, address book, etc. This email system can be used by a blind person to access mail easily and efficiently.
[17]	Voice Control Human Assistance Robot	Raspberry pi, Assistant, voice recognition, etc	Voice Controlled Robot	The robot developed in this project is able to move in any direction, like the front, back, left, right, according to the voice commands received from the user through a microphone as part of our hardware in this project. There is an autonomous voice command which can instantly make the robot move automatically without hitting any obstacle using an ultrasonic sensor. This device will help users give a uniform look to their lawn with ease. As well, it can also be used for blind or physically challenged people by embedding this system into the wheelchair, which will make an autonomous wheelchair.
[18]	Virtual Assistant in Native Language	Virtual Assistant, Sinhala, Cloud Deployment, Speech recognition, Translation	Virtual Assistant	This paper discusses the Sinhala language, its usefulness, and the role it plays in today’s tech industry. Around 8 million people around the world use Sinhala as their preferred oral language, and most of them are associated with technological advancements. Many Sinhala speakers struggle to find the correct words in English, so this is a start-up solution that assists such people virtually in their mother tongue to get work done. The project gives a basis for how virtual assistants work and for what is probably the fastest way for Sinhala to establish an identity in the tech industry.
[19]	Voice Control Device using Raspberry Pi	Virtual Personal Assistant, Natural Language Processing, Query Processing, Raspberry Pi	Voice-Controlled Device	This paper describes the working of a device based on the implementation of a voice command system as an intelligent personal assistant. This voice-driven device uses the Raspberry Pi as its main hardware. A speech-to-text engine is used to convert the voice command to simple text. Query processing is then applied using natural language processing (NLP) to this text to interpret the intended meaning of the command given by the user. After interpreting the intended meaning, text-to-speech conversion is used to give the appropriate output in the form of speech. The services provided by the device depend on the input given, such as weather, telling time, or accessing online applications to listen to music.
[20]	Speech Emotion Recognition using Neural Network and MLP Classifier	MLP-Classifier, MFCC, Model, Neural Networks, Prediction	Speech Emotion Recogniser	Speech Emotion Recognition (SER) is a technique that uses Neural Networks to classify emotions from a given speech. It is based on the fact that the voice often reflects underlying emotion through tone and pitch. Speech Emotion Recognition helps to classify and elicit specific types of emotions. The MLP-Classifier is used to classify the emotions from the given wave signal, which makes the choice of learning rate adaptive. The dataset used will be RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song dataset).
[21]	Voice Recognition based Intelligent Wheelchair and GPS Tracking System	Voice Recognition, GPS module, smartphone application, Firebase, Wi-Fi module, speed control, obstacle detection	Intelligent wheelchair and GPS Tracking System	Here is a development of a voice recognition-based intelligent wheelchair system for physically handicapped people who are unable to drive the wheelchair by hand. So this system works like this: the patient can operate the wheelchair using voice commands and the location of the patient can be tracked using a GPS module in the wheelchair that tracks and sends the information to a smartphone application (app) via Firebase. The Voice Module V3 is used to record a patient's voice and recognize that voice to follow the instructions of the patient.

III. PROBLEMS IDENTIFIED

Voice AI devices, such as Amazon Alexa, Microsoft Cortana, Apple Siri, and Google Assistant, will be the channel through which we communicate with one another and with our software.

For starters, voice AI systems should ideally provide nuanced responses, making us feel as if we're conversing with a fellow human, or something close to it. Conversational AI is currently a command line, not a genuine dialogue, as everyone with a smart speaker or a personal assistant on their phone knows.

We also faced a few problems:

The weather forecasting feature was a problem because we need to pay every time we use it.
Outside Anaconda and Jupyter Notebook, we faced many problems while running the program in other software environments. The user interface of Anaconda is easy to work with, and it does not give many indentation errors.

The problems that we faced are minor, and we are currently working on them. These drawbacks do not mean that our system is not working properly. Setting them aside, our system can do most of the things that a typical voice assistant can.

IV. PROPOSED APPROACH

The following features will be included in the proposed system:

The system will continue to listen for commands, and the length of time it spends listening is adjustable to meet the needs of the user.
If the system is unable to extract information from the user's input, it will prompt the user to repeat the command, up to a configurable number of attempts.
The system will be voiced by a woman.
Playing music, sending emails, sending texts, searching Wikipedia, accessing system-installed applications, opening anything in the web browser, and so on are all supported in the present edition.
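The listen-and-retry behaviour described in the first two items can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `listen_with_retries`, `listen_once`, `prompt`, and the default values are hypothetical names and settings, and the actual microphone capture is passed in as a function so that the retry logic can be shown on its own.

```python
from typing import Callable, Optional


def listen_with_retries(
    listen_once: Callable[[float], Optional[str]],
    prompt: Callable[[str], None],
    listen_seconds: float = 5.0,   # assumed default listening window
    max_attempts: int = 3,         # assumed default retry limit
) -> Optional[str]:
    """Return the first recognized command, or None if every attempt fails.

    listen_once(seconds) is expected to return the recognized text, or
    None / an empty string when nothing could be extracted.
    """
    for attempt in range(1, max_attempts + 1):
        text = listen_once(listen_seconds)
        if text and text.strip():
            # Normalize so later keyword matching is case-insensitive.
            return text.strip().lower()
        if attempt < max_attempts:
            # Ask the user to repeat only if another attempt will follow.
            prompt("Sorry, I did not catch that. Please repeat.")
    return None
```

In a real assistant, `listen_once` would wrap the speech recognition call and `prompt` would wrap the text-to-speech call; both are left abstract here.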

V. WORKFLOW & METHODOLOGY

The study began with an analysis of the user's auditory commands delivered through the microphone. This can include obtaining any information, accessing the computer's internal data, and so on. This is empirical qualitative research based on reading the material indicated above and putting the instances to the test. Tests are carried out by programming in accordance with books and internet resources, with the stated purpose of discovering best practices and a deeper understanding of Voice Assistant.

Speech Recognition	The system converts speech input to text using Google's online speech recognition system. Voice input, along with texts that users can obtain from the special corpora organized on the computer network server at the information center, is temporarily stored in the system before being sent to the Google cloud for speech recognition. The equivalent text is then received and fed into the central processor.
Python Backend	The Python backend reads the speech recognition module's output and determines whether the command is an API call, a context extraction, or a system call. The result is then returned to the Python backend, which provides the user with the desired output.
API Calls	API is an abbreviation for Application Programming Interface. An application programming interface (API) is a software interface that enables two applications to communicate with one another. In other words, we can say that an API serves as a messenger, who can deliver your request to the provider and then return the response to you.
Context Extraction	Context extraction (CE) is the process of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases, this activity involves processing human language texts using natural language processing (NLP). Recent developments in multimedia document processing, such as automatic annotation and content extraction from images, audio, and video, could also be viewed as context extraction.
System Calls	A system call is a programmatic method by which a computer program requests a service from the kernel of the operating system on which it is running. This can include hardware-related services, the creation and execution of new processes, and communication with core kernel services like process scheduling. System calls serve as a vital link between a process and the operating system.
Text-to-Speech	The capacity of computers to read text aloud is referred to as text-to-speech (TTS). Written text is converted to a phonemic representation, which is subsequently converted to waveforms that can be generated as sound by a TTS Engine. Third-party publishers offer TTS engines in a variety of languages, dialects, and specialized vocabularies.
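The routing step performed by the Python backend can be sketched as follows. This is only an illustration of the idea, not the authors' implementation: the `CommandKind` enum, the keyword tuples, and the `classify` and `handle` helpers are hypothetical, and simple keyword matching stands in for whatever logic the backend actually uses.

```python
from enum import Enum
from typing import Callable, Dict


class CommandKind(Enum):
    API_CALL = "api_call"
    CONTEXT_EXTRACTION = "context_extraction"
    SYSTEM_CALL = "system_call"


# Hypothetical keyword lists; a real assistant would use its own vocabulary.
SYSTEM_KEYWORDS = ("open", "run", "shutdown", "restart")
API_KEYWORDS = ("weather", "news", "mail")


def classify(text: str) -> CommandKind:
    """Decide which branch of the workflow should handle the recognized text."""
    words = text.lower().split()
    if any(word in SYSTEM_KEYWORDS for word in words):
        return CommandKind.SYSTEM_CALL
    if any(word in API_KEYWORDS for word in words):
        return CommandKind.API_CALL
    # Anything else is treated as a question to extract an answer for.
    return CommandKind.CONTEXT_EXTRACTION


def handle(
    text: str,
    handlers: Dict[CommandKind, Callable[[str], str]],
    speak: Callable[[str], None],
) -> CommandKind:
    """Route the text to its handler and pass the reply to text-to-speech."""
    kind = classify(text)
    reply = handlers[kind](text)
    speak(reply)
    return kind
```

Keeping `speak` and the handlers as parameters mirrors the separation in the workflow: recognition, routing, and speech output remain independent stages.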

VI. SYSTEM ARCHITECTURE

Speech Recognition: Speech recognition is the ability of a machine to understand what humans are saying. In our project, we're using Python and Google Speech API to develop software that can run devices on command. To recognize voice commands, we must install the Pyaudio Python package. The ‘pip install Pyaudio’ command is used to install Pyaudio.
DateTime: The DateTime package is used to display Date and Time on our output screen. It comes built-in with Python.
Wikipedia: In our project, we used the Wikipedia module to get more information from Wikipedia or to perform a Wikipedia search. We have used ‘pip install wikipedia’ to install this Wikipedia module.
Webbrowser: The Webbrowser package is used to perform a web search. It comes built-in with Python.
gTTS: gTTS stands for Google Text-to-Speech. It converts text to audio: the response produced by the lookup function that you write to answer the question or command is converted into spoken form. This package connects to the Google Translate text-to-speech API.
OS: OS basically stands for Operating System. Python's OS module provides functions for interacting with the operating system. Python's standard utility modules include OS. This module enables the use of operating system-dependent functionality.
Pyjokes: Pyjokes is a Python library that provides one-line jokes for programmers. We have included it in our project because the jokes add interest to the assistant.
Playsound: The Playsound module is a platform-independent module that can play audio files. It is simple to use from Python. It has no dependencies; simply install it with pip in your virtual environment and run.
Pyaudio: Pyaudio is a Python binding for PortAudio, a cross-platform audio input/output library. This essentially means that we can use Pyaudio to record and play sound on any platform or operating system, including Windows, Mac, and Linux.
WolframAlpha: WolframAlpha is a Wolfram Research computational knowledge engine and answer engine. It provides direct answers to factual queries by computing the answer from externally sourced data.
Selenium: Selenium is an open-source umbrella project for a variety of browser automation tools and libraries.
Requests: Python's Requests module allows you to send HTTP requests. It's used to send GET and POST requests. It hides the complexities of making requests behind a beautiful, straightforward API.
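As a small illustration of the two built-in modules in this list, the sketch below uses `datetime` to build a spoken time reply and `webbrowser` to perform a web search. The function names, the reply wording, and the choice of Google's search URL are assumptions made for the example; they are not taken from the project's code.

```python
import datetime
import webbrowser
from urllib.parse import quote_plus


def time_reply(now: datetime.datetime) -> str:
    """Build the sentence the assistant could speak for a time/date request."""
    return f"The time is {now:%H:%M} and the date is {now:%Y-%m-%d}"


def search_url(query: str) -> str:
    """Build a search URL; the search engine used here is an assumption."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def open_search(query: str) -> bool:
    """Open the search in the default browser; returns True on success."""
    return webbrowser.open(search_url(query))
```

Third-party modules such as PyAudio or gTTS would be wired in around helpers like these, with the reply string handed to the text-to-speech step.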

VII. RESULTS

This part of the research paper briefly describes the output of our project. We chose Python as the programming language and worked primarily on AI and machine learning, focusing on the tasks performed by the voice assistant. The main reason for using Python is that it was easy to deploy in the Jupyter notebook; we compared other deployment environments and found this one comfortable and easy to use.

The output shown in the figure is the first thing that our voice assistant responds with, i.e.:

“Hi, I am sara your personal voice assistant”

“How may I help you?”

Following this greeting, the user needs to give a command in a form the voice assistant recognizes so that it can respond accordingly and give the required output. We have used a different algorithm for each command; for example, there are commands that open Google Chrome, YouTube, Wikipedia, Facebook, etc.

There are a few screenshots of the output that our voice assistant gives on executing the following commands:

“Open Youtube”
“Open Wikipedia”
“Open Facebook”
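A command of this form could be handled roughly as follows. This is a hedged sketch rather than the project's actual code: the `SITES` table, its URLs, and the helper names are illustrative assumptions, and only two-word "open <site>" commands are recognized.

```python
import webbrowser
from typing import Callable, Optional

# Hypothetical lookup table from the spoken site name to a URL.
SITES = {
    "youtube": "https://www.youtube.com",
    "wikipedia": "https://www.wikipedia.org",
    "facebook": "https://www.facebook.com",
}


def resolve_open_command(command: str) -> Optional[str]:
    """Return the URL for an 'open <site>' command, or None if unknown."""
    words = command.lower().split()
    if len(words) == 2 and words[0] == "open":
        return SITES.get(words[1])
    return None


def run_open_command(
    command: str,
    opener: Callable[[str], object] = webbrowser.open,
) -> bool:
    """Open the requested site; returns False when the command is not known."""
    url = resolve_open_command(command)
    if url is None:
        return False
    opener(url)
    return True
```

Passing the opener as a parameter keeps the lookup logic testable without launching a browser; by default the standard `webbrowser.open` function is used.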

VIII. COMPARISON BETWEEN OUR PROJECT AND EXISTING PROJECTS

Works	Name	Research	Comparison
[2]	Desktop Voice Assistant Using Natural Language Processing (NLP)	In this study, they have developed a voice assistant that can perform any kind of task in exchange for commands given by the users without any error. They have added more features like listening to the user’s voice only and not being activated by environmental noise.	Unlike the assistant in this paper, our voice assistant does not have the feature of listening to the user’s voice only. We are currently working on this and will include it in the next updates of our software.
[3]	Desktop Assistant AI Using Python	They discussed a Python-based voice-activated personal assistant in this paper. This assistant currently works online and performs basic tasks like weather updates, streaming music, searching Wikipedia, opening desktop applications, etc. The functionality of the current system is limited to working online only.	In comparison to their voice assistant, our voice assistant does not work online. We are currently working on that feature and will roll it out in the coming updates of our voice assistant.
[4]	JARVIS: A PC Voice Assistant	This voice assistant has automated various services using a single line command. It eases most of the tasks of the user like searching the web, retrieving weather forecast details, translating words from one language to another language, accessing youtube videos, sending mail through voice, and solving computational queries.	JARVIS and our project are broadly similar, but ours does not have features like retrieving weather forecast details and translating words from one language to another. This is a minor difference, and we would like to include these features in the coming updates.
[9]	Personal Assistant with Voice Recognition Intelligence	This paper focuses on "PARI", which is specially designed to help Native and Blind people who work on their Voice Commands. It also has the capability of recognizing voice commands without an internet connection. It has various functionalities for mobile devices, like network connection and managing various applications with just voice commands. It contains key features like Voice Pattern Detection, Keyword Learning, etc.	PARI is a brilliant voice assistant that is specially designed for blind people. It is a great initiative and is very helpful software. Our voice assistant does not focus on this but we would like to add these extra features in our project so that it can be beneficial to the blind as well as disabled people.
[12]	Vitro: Designing a Voice Assistant for the Scientific Lab Workplace	In this paper, they have designed a voice assistant and also done research on how the voice assistant can play a major role in a laboratory.	Here they have built a smart voice assistant that is specifically used for Scientific Lab Workspace. In comparison, our voice assistant is not designed to work in the laboratory but we have this concept on our mind. This is a new concept, so maybe this concept can further be included in our project.
[13]	Firefox Voice: An Open and Extensible Voice Assistant Built Upon the Web	This paper mainly focuses on "Firefox Voice", a voice assistant which is developed by the Mozilla Foundation and its subsidiary, the Mozilla Corporation. It shows how the voice assistant works and what its features are.	Now the voice assistant over here is mainly designed for a web browser. In comparison, our voice assistant is not designed for a web browser and also does not work online. We are currently working on this and this feature will be seen in the coming updates of our software.
[15]	Home Automation using Arduino and Smart Phone	It is a home automation system using the Arduino Uno board and wireless fidelity technology. It accepts commands only through clicks. Home automation through Android mobile is designed for physically challenged and disabled people.	This is a home automation and it is specially designed for physically challenged and disabled people using Arduino. In comparison, our voice assistant is not that advanced and is also not built for physically challenged and disabled people.
[16]	A voice based text mail system for the visually impaired	Here they have proposed an android application designed specifically for visually challenged people. It provides a voice-based mailing service where they can read and send mail on their own, without any guidance. The users have to use certain keywords which will perform certain actions, e.g., read, send, compose mail, address book, etc. This email system can be used by a blind person to access mail easily and efficiently.	Here we have a text-based mail system that is specifically built for visually impaired people. In comparison to their voice assistant, our system does not provide voice-based mail delivery and is also not built for visually impaired people. We are currently working on getting these features onboard as soon as possible. It will be rolled in further updates of our voice assistant.
[17]	Voice Control Human Assistance Robot	The robot developed in this project is able to move in any direction, like the front, back, left, or right, according to the voice commands received from the user through a microphone as part of our hardware in this project. There is an autonomous voice command which can instantly make the robot move automatically without hitting any obstacle using an ultrasonic sensor. This device will help users give a uniform look to their lawn with ease. As well, it can also be used for the blind or physically challenged people by embedding this system into the wheelchair, which will make an autonomous wheelchair.	The project that we have made is different than theirs because we have not developed a robot system. They have developed an advanced robot that is controlled by voice commands. We have this project on our mind and maybe in the future, our voice assistant can be incorporated into a robot system.
[19]	Voice Control Device using Raspberry Pi	This paper describes the working of a device based on the implementation of a voice command system as an intelligent personal assistant. This voice-driven device uses the Raspberry Pi as its main hardware. A speech-to-text engine is used to convert the voice command to simple text. Query processing is then applied using natural language processing (NLP) to this text to interpret the intended meaning of the command given by the user. After interpreting the intended meaning, text-to-speech conversion is used to give the appropriate output in the form of speech. The services provided by the device depend on the input given, such as weather, telling time, or accessing online applications to listen to music.	The device created here is an advanced voice-controlled device developed using the Raspberry Pi. It is considerably more capable than ours and can perform many tasks that our voice assistant cannot. It is an advanced technology, and we can research it and build a working model in the upcoming updates of our software.
[21]	Voice Recognition based Intelligent Wheelchair and GPS Tracking System	Here is a development of a voice recognition-based intelligent wheelchair system for physically handicapped people who are unable to drive the wheelchair by hand. So this system works like this: the patient can operate the wheelchair using voice commands and the location of the patient can be tracked using a GPS module in the wheelchair that tracks and sends the information to a smartphone application (app) via Firebase. The Voice Module V3 is used to record a patient's voice and recognize that voice to follow the instructions of the patient.	This is an intelligent wheelchair and GPS tracking system which can be used by physically handicapped people. This is a great initiative by this team and we are looking for this technology for physically handicapped people. We can take points from this project and implement these in the coming updates of our system.

Conclusion

The virtual assistant we have created is able to do almost everything that the user commands it to do, from opening a particular file on the system to surfing the web to gather information on a required topic. We kept a simple approach to our problem using Python. The main Python packages used in our product are Speech Recognition, PyAudio, and Python TTS. We have successfully made a working virtual assistant that can be activated by the user with the wake keyword “SARA” and can manipulate the system using verbal commands. It eases most of the user's tasks, like searching the web, accessing YouTube videos, sending mail through voice, etc. In the future, we hope to incorporate more artificial intelligence into our project, such as machine learning and neural networks, as well as the Internet of Things. With the addition of these elements, we will be able to improve our voice assistant by adding new features to it.

References

[1] Jaydeep, Dr, P. A. Shewale, E. Bhushan, A. Fernandes, and R. Khartadkar. "A Voice Based Assistant Using Google Dialogflow and Machine Learning." International Journal of Scientific Research in Science and Technology 8, no. 3 (2021): 06-17.
[2] Kumar, Lalit. "Desktop Voice Assistant Using Natural Language Processing (NLP)." International Journal for Modern Trends in Science and Technology (2020): n. pag.
[3] "Desktop Assistant AI Using Python." International Journal of Advanced Research in Science, Communication and Technology (IJARSCT), Volume 6, Issue 2, June 2021.
[4] Vora, Jash, Deepak Yadav, Ronak Jain, and Jaya Gupta. "JARVIS: A PC Voice Assistant." (2021).
[5] Geetha, V., C. K. Gomathy, Kottamasu Manasa Sri Vardhan, and Nukala Pavan Kumar. "The Voice Enabled Personal Assistant for Pc using Python."
[6] Pandey, Ankit, Vaibhav Vashist, Prateek Tiwari, Sunil Sikka, and Priyanka Makkar. "Smart Voice Based Virtual Personal Assistants with Artificial Intelligence." Artificial & Computational Intelligence/Published Online: June (2020).
[7] Nivedita Singh, Dr. Diwakar Yagyasen, Mr. Surya Vikram Singh, Gaurav Kumar, Harshit Agrawal. "Voice Assistant Using Python." IJIRT, Volume 8, Issue 2, July 2021. ISSN: 2349-6002.
[8] Shende, Deepak, Ria Umahiya, Monika Raghorte, Aishwarya Bhisikar, and Anup Bhange. "AI Based Voice Assistant Using Python." Journal of Emerging Technologies and Innovative Research 6, no. 2 (2019): 506-509.
[9] Kulhalli, Kshama V., Kotrappa Sirbi, and Mr Abhijit J. Patankar. "Personal assistant with voice recognition intelligence." International Journal of Engineering Research and Technology 10, no. 1 (2017).
[10] Patil, Akshay, Suyash Samant, Mohit Ramtekkar, Shubham Ragaji, and Jayashree Khanapuri. "Intelligent Voice Assistant." In Proceedings of the 3rd International Conference on Advances in Science & Technology (ICAST). 2020.
[11] "Research Paper on Desktop Voice Assistant." International Journal of Research in Engineering and Science (IJRES), Volume 10, Issue 2, 2022, pp. 15-20. ISSN (Online): 2320-9364, ISSN (Print): 2320-9356. www.ijres.org
[12] Cambre, Julia, Ying Liu, Rebecca E. Taylor, and Chinmay Kulkarni. "Vitro: Designing a Voice Assistant for the Scientific Lab Workplace." In Proceedings of the 2019 on Designing Interactive Systems Conference, pp. 1531-1542. 2019.
[13] Cambre, Julia, Alex C. Williams, Afsaneh Razi, Ian Bicking, Abraham Wallin, Janice Tsai, Chinmay Kulkarni, and Jofish Kaye. "Firefox Voice: An Open and Extensible Voice Assistant Built Upon the Web." In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1-18. 2021.
[14] Popović, Branislav, Edvin Pakoci, Nikša Jakovljević, Goran Kočiš, and Darko Pekar. "Voice assistant application for the Serbian language." In 2015 23rd Telecommunications Forum Telfor (TELFOR), pp. 858-861. IEEE, 2015.
[15] T. M. Senthil Ganesan, M. Rama Jothi, R. S. Sangavi, L. Umayal. "Home Automation using Arduino and Smart Phone." International Journal of Engineering Research & Technology (IJERT), ETEDM, 2019.
[16] M.J, Carmel Mary Belinda, N. Rupavathy and Mahalakshmi N.R. "A voice based text mail system for visually impaired." International Journal of Engineering and Technology 7 (2018): 132.
[17] John, Linda, Nilesh Vishwakarma, and Rajat Sharma. "Voice Control Human Assistance Robot." In National Conference on Technical Advancements for Social Upliftment, Proceedings of the 2nd VNC. 2020.
[18] Dias, Pubudu M., and Kithsiri Jayakody. "Virtual Assistant in Native Language." In 2020 IEEE Asia-Pacific Conference on Geoscience, Electronics and Remote Sensing Technology (AGERS), pp. 16-18. IEEE, 2020.
[19] Singh, Pooja, Pinki Nayak, Arpita Datta, Depanshu Sani, Garima Raghav, and Rahul Tejpal. "Voice Control Device using Raspberry Pi." In 2019 Amity International Conference on Artificial Intelligence (AICAI), pp. 723-728. IEEE, 2019.
[20] Joy, Jerry, Aparna Kannan, Shreya Ram, and S. Rama. "Speech Emotion Recognition using Neural Network and MLPClassifier." IJESC, April-2020 (2020).
[21] Aktar, Nasrin, Israt Jaharr, and Bijoya Lala. "Voice recognition based intelligent wheelchair and GPS tracking system." In 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE), pp. 1-6. IEEE, 2019.

OTHER REFERENCES

[1] Cambre, Julia, Alex C. Williams, Afsaneh Razi, Ian Bicking, Abraham Wallin, Janice Tsai, Chinmay Kulkarni, and Jofish Kaye. "Firefox Voice: An Open and Extensible Voice Assistant Built Upon the Web." In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1-18. 2021.
[2] Liao, Song, Christin Wilson, Long Cheng, Hongxin Hu, and Huixing Deng. "Measuring the effectiveness of privacy policies for voice assistant applications." In Annual Computer Security Applications Conference, pp. 856-869. 2020.
[3] Pérez, Anxo, Paula Lopez-Otero, and Javier Parapar. "Designing an Open Source Virtual Assistant." Multidisciplinary Digital Publishing Institute Proceedings. Vol. 54. No. 1. 2020.
[4] Braun, Michael, Anja Mainz, Ronee Chadowitz, Bastian Pfleging, and Florian Alt. "At your service: Designing voice assistant personalities to improve automotive user interfaces." In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1-11. 2019.
[5] Zhang, Ying, Mohammad Pezeshki, Philémon Brakel, Saizheng Zhang, Cesar Laurent, Yoshua Bengio, and Aaron Courville. "Towards end-to-end speech recognition with deep convolutional neural networks." arXiv preprint arXiv:1701.02720 (2017).
[6] Shalini, Shradha, Trevor Levins, Erin L. Robinson, Kari Lane, Geunhye Park, and Marjorie Skubic. "Development and comparison of customised voice-assistant systems for independent living older adults." In International Conference on Human-Computer Interaction, pp. 464-479. Springer, Cham, 2019.
[7] Palanica, Adam, Anirudh Thommandram, Andrew Lee, Michael Li, and Yan Fossat. "Do you understand the words that are coming outta my mouth? Voice assistant comprehension of medication names." NPJ digital medicine 2, no. 1 (2019): 1-6.
[8] Friedman, Natalie, Andrea Cuadra, Ruchi Patel, Shiri Azenkot, Joel Stein, and Wendy Ju. "Voice assistant strategies and opportunities for people with tetraplegia." In The 21st International ACM SIGACCESS Conference on Computers and Accessibility, pp. 575-577. 2019.
[9] Marr B. Artificial intelligence in practice: how 50 successful companies used AI and machine learning to solve problems. John Wiley & Sons; 2019 May 28.
[10] Braun, Michael, Anja Mainz, Ronee Chadowitz, Bastian Pfleging, and Florian Alt. "At your service: Designing voice assistant personalities to improve automotive user interfaces." In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1-11. 2019.
[11] Beirl, Diana, Y. Rogers, and Nicola Yuill. "Using voice assistant skills in family life." In Computer-Supported Collaborative Learning Conference, CSCL, vol. 1, pp. 96-103. International Society of the Learning Sciences, Inc., 2019.
[12] Lötsch J, Ultsch A. Machine learning in pain research. Pain. 2018 Apr;159(4):623.
[13] Winkler R, Söllner M. Unleashing the potential of chatbots in education: A state-of-the-art analysis. In Academy of Management Annual Meeting (AOM) 2018.
[14] Kepuska V, Bohouta G. Next-generation of virtual personal assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home). In 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC) 2018 Jan 8 (pp. 99-103). IEEE.
[15] True, A. Miracle Made. "IBM's Watson Analytics for Health Care." (2017).
[16] Thakur N, Hiwrale A, Selote S, Shinde A, Mahakalkar N. Artificially Intelligent Chatbot. Universal Research Reports. 2017;4(6):43.
[17] Hill J, Ford WR, Farreras IG. Real conversations with artificial intelligence: A comparison between human–human online conversations and human–chatbot conversations. Computers in human behaviour. 2015 Aug 1;49:245-50.
[18] Noda K, Arie H, Suga Y, Ogata T. Multimodal integration learning of robot behaviour using deep neural networks. Robotics and Autonomous Systems. 2014 Jun 1;62(6):721-36.
[19] Lei, Xin, Andrew Senior, Alexander Gruenstein, and Jeffrey Sorensen. "Accurate and compact large vocabulary speech recognition on mobile devices." (2013).
[20] Aron, Jacob. "How innovative is Apple's new voice assistant, Siri?" (2011): 24.
[21] Nguyen P, Heigold G, Zweig G. Speech recognition with flat direct models. IEEE Journal of Selected Topics in Signal Processing. 2010 Sep 27;4(6):994-1006.
[22] Huang J, Zhou M, Yang D. Extracting Chatbot Knowledge from Online Discussion Forums. In IJCAI 2007 Jan 6 (Vol. 7, pp. 423-428).
[23] Fryer L, Carpenter R. Bots as language learning tools. Language Learning & Technology. 2006;10(3):8-14.
[24] Allen JB. From Lord Rayleigh to Shannon: How do humans decode speech? In International Conference on Acoustics, Speech and Signal Processing 2002.
[25] Atal B, Rabiner L. A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing. 1976 Jun;24(3):201-12.

Copyright

Copyright © 2022 Ayush Chinchane, Aryan Bhushan, Ayush Helonde, Prof. Kiran Bidua. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET44517

Publish Date : 2022-06-18

ISSN : 2321-9653

Publisher Name : IJRASET
