JARVIS: A Virtual Assistant

Authors: Dr. Yatu Rani, Ms. Gurminder Kaur, Harsh Rana, Sagar, Nikhil

DOI Link: https://doi.org/10.22214/ijraset.2023.49111

Abstract

Python is a popular and approachable language, which makes it straightforward to write a script for a voice assistant, and the assistant's instructions can be tailored to the user's requirements. Speech recognition is the process of converting speech into text; it is commonly used in voice assistants such as Alexa and Siri. In Python, the SpeechRecognition library allows us to convert speech into text. Building our own assistant was an interesting task. With it, it became easier to send emails without typing a word, search Google without opening the browser, and perform many other daily tasks, such as playing music or opening a favorite IDE, with a single voice command. Technology has now advanced to the point where such systems can perform many tasks as effectively as we can, or even more effectively. We found that applying AI across fields reduces human effort and saves time.

I. INTRODUCTION

Our digital lives are shaped by innovation. In recent years especially, new technologies have been developed to ease our professional lives. The intelligent personal assistant has proved to be one of the most valuable of these innovations, easing everyday tasks and providing a hands-free experience. We are building a PC personal assistant that works on voice commands and executes the user's queries. A person no longer has to learn to speak to a system; instead, the computer system learns to speak with the person, observing their actions, habits, and behavior and trying to become their customized assistant.

Speech recognition is useful across many applications and environments in our everyday life.

This system is designed to be used efficiently on desktops. Personal assistant software improves user productivity by managing the user's routine tasks and by providing information from online sources. JARVIS is effortless to use: say the wake word ‘JARVIS’ followed by the command. Voice search is gaining ground on text search, and web searches made from mobile devices have only recently overtaken those made from a computer. An intelligent assistant can also make email work for you: it can detect intent, pick out important information, automate processes, and deliver personalized responses. This project started from the premise that enough data and information is openly available on the web to build a virtual assistant capable of making intelligent decisions for routine user activities. The SpeechRecognition library performs speech-to-text conversion, the wikipedia library fetches information from Wikipedia, the pyttsx3 library performs text-to-speech conversion, and so on. All tasks are handled in textual form, and the resulting text is then converted into an audio signal. A text-to-speech engine converts the text into a phonemic representation and then converts that representation into waveforms that are output as sound.
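The flow described above (wake word, speech to text, lookup, text to speech) can be sketched roughly as follows. This is a minimal illustration, not the implementation used in this work: the function names, the "wikipedia <topic>" phrasing, and the choice of Google's web recognizer through `recognize_google` are assumptions. Third-party imports are deferred into the functions that need them so that the text-handling logic can be tried without a microphone.

```python
import re

WAKE_WORD = "jarvis"  # wake word named in the text


def extract_command(transcript, wake_word=WAKE_WORD):
    """Return the words spoken after the wake word, or None if it is absent."""
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    if wake_word not in words:
        return None
    return " ".join(words[words.index(wake_word) + 1:])


def transcribe_from_microphone():
    """Speech to text with the SpeechRecognition package (needs PyAudio)."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    # Assumption: Google's web recognizer, which requires internet access.
    return recognizer.recognize_google(audio)


def wikipedia_summary(topic, sentences=2):
    """Fetch a short summary with the wikipedia package."""
    import wikipedia

    return wikipedia.summary(topic, sentences=sentences)


def speak(text):
    """Text to speech with pyttsx3."""
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def respond(transcript, lookup=wikipedia_summary):
    """Turn a transcript into reply text; None means the wake word was missing."""
    command = extract_command(transcript)
    if command is None:
        return None
    if command.startswith("wikipedia "):
        return lookup(command[len("wikipedia "):])
    return "Sorry, this sketch only handles Wikipedia lookups."
```

With a microphone attached, one full round trip would call `transcribe_from_microphone()`, pass the text to `respond`, and hand any reply that is not `None` to `speak`.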

II. LITERATURE REVIEW

Intelligent Personal Assistants (IPAs) are implemented and used in operating systems, the Internet of Things (IoT), and a variety of other systems. Several implementations of IPAs exist today, and companies such as Apple, Google, and Microsoft all have their own implementations as a major feature of their operating systems and devices. With the use of Natural Language Processing (NLP), Machine Learning (ML), Artificial Intelligence (AI), and prediction models from these fields of Computer Science (CS), as well as theory and techniques from Human-Computer Interaction (HCI), IPAs are becoming more intelligent and relevant.

This paper aims to analyze and compare the current major implementations of IPAs in order to determine which implementation is the most developed at this moment in time and is contributing to the sustainable future of AI. Jarvis is a system designed to respond to user-issued commands to provide convenient control over a variety of electronic devices. These devices may be lights, TVs, radios, stereos, etc. The system is designed to work best within a moderately sized home with the convenience of a wireless router. The system takes a voice input from a user and matches that input to an appropriate command in its library of recognized commands, or rejects the command if it is not recognized by the system, and transmits an appropriate message via a router. The router then sends the operation to the right device so that the operation can be performed [2].

Assisting users in their tasks is the main goal of today’s personal assistant applications. Many such applications are being developed that are capable of discovering the user’s habits, abilities, preferences, and goals ever more accurately, predicting the user’s actions in advance, and performing them without the user’s interaction. The assistant agent has to continuously improve its behavior based on previous experience. Improvements are achieved in personal assistant applications through a learning mechanism. Agents are capable of accessing information from databases to guide people through different tasks, deploying a learning mechanism to acquire new information about user behavior. Additionally, resources need to be used in a highly efficient manner, resulting in lower power consumption. In this paper, we have proposed a machine learning approach for the learning mechanism of a personal assistant agent.
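The match-or-reject step of the home-control system reviewed above can be illustrated with a small sketch. The command library, the similarity cutoff, and the use of the standard library's `difflib` are all assumptions made for illustration; the reviewed system may match commands quite differently.

```python
from difflib import get_close_matches

# Hypothetical command library: spoken phrase -> (device, action)
COMMAND_LIBRARY = {
    "lights on": ("lights", "on"),
    "lights off": ("lights", "off"),
    "tv on": ("tv", "on"),
    "radio off": ("radio", "off"),
}


def match_command(heard, library=COMMAND_LIBRARY, cutoff=0.75):
    """Return the (device, action) for the closest known phrase, or None to reject."""
    phrase = " ".join(heard.lower().split())  # normalize case and spacing
    best = get_close_matches(phrase, list(library), n=1, cutoff=cutoff)
    return library[best[0]] if best else None
```

A returned pair would then be turned into the message sent through the router, while `None` corresponds to rejecting the command. The cutoff trades tolerance for slightly misheard phrases against the risk of triggering the wrong device.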

III. REAL LIFE APPLICATION

Saves Time: JARVIS is a desktop voice assistant that works on the voice commands given to it. It supports voice search and voice-activated device control, and lets us complete a set of tasks.
Conversational Interaction: JARVIS makes tasks easier to complete because it carries them out automatically, using the essential Python modules and libraries, in a conversational way. When users give it a task, it feels like instructing a human assistant, because both giving the input and receiving the desired output (the completed task) happen through conversation.
Reactive Nature: The desktop assistant is reactive, which means it understands human language and the context provided by the user, and it responds in the same way, i.e., in human-understandable language (English). The user therefore experiences its responses as informed and smart.
Multitasking: One of its main applications is its multitasking ability. It keeps asking for instructions, one after another, until the user tells it to “QUIT”.
No Trigger Phrase: It asks for an instruction, listens to the user's response without needing any trigger phrase, and only then executes the task.
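The ask, listen, execute cycle that continues until the user says QUIT can be sketched as a simple loop. The three callables passed in are placeholders (assumptions) for the real microphone, speech, and task-handling code, and the prompt wording is invented for illustration.

```python
def run_session(listen, speak, handle, quit_word="quit"):
    """Keep taking instructions until the user says the quit word.

    listen() -> str, speak(str) and handle(str) -> str are supplied by the
    caller. Returns the number of tasks carried out.
    """
    completed = 0
    while True:
        speak("What should I do next?")   # the assistant asks first
        command = listen().strip().lower()
        if command == quit_word:
            speak("Goodbye.")
            return completed
        speak(handle(command))            # execute only after hearing the instruction
        completed += 1
```

Passing the functions in as parameters keeps the loop independent of any particular speech library, so it can be exercised with plain text before audio is wired up.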

A. Security

Security testing mainly focuses on vulnerabilities and risks. Because JARVIS is a local desktop application, the risk of a data breach through remote access is reduced. The software is tied to a specific system, so it is activated when the user logs in.

B. Stability

The stability of a system depends on its output: if the output is bounded for every bounded input, the system is said to be stable. If the system behaves correctly across its full range of functionality, it is considered stable.

IV. METHODOLOGY

We give our application a voice by using the system's speech engine with the help of SAPI5 and pyttsx3. pyttsx3 is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3. The Speech Application Programming Interface, or SAPI, is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications. We then define the speak function so that the program can speak its outputs. After that, we define a function to take voice commands using the system microphone. Finally, the main function is defined, in which all the capabilities of the program are set out.
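The three functions described above (speak, take a voice command, main) might look roughly like this. It is a sketch under stated assumptions: the function names, the timeout values, and the use of Google's web recognizer are illustrative choices, and the `sapi5` driver is only available on Windows.

```python
def make_engine(driver_name="sapi5"):
    """Create a pyttsx3 engine; 'sapi5' is the Windows driver named in the text."""
    import pyttsx3  # third-party

    return pyttsx3.init(driver_name)


def speak(engine, text):
    """Queue the text and block until it has been spoken."""
    engine.say(text)
    engine.runAndWait()


def take_command(timeout=5, phrase_time_limit=8):
    """Listen on the system microphone; return lowercase text, or None on failure."""
    import speech_recognition as sr  # third-party (SpeechRecognition + PyAudio)

    recognizer = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source, duration=0.5)
            audio = recognizer.listen(
                source, timeout=timeout, phrase_time_limit=phrase_time_limit
            )
        # Assumption: Google's web recognizer, which needs internet access.
        return recognizer.recognize_google(audio).lower()
    except (sr.WaitTimeoutError, sr.UnknownValueError, sr.RequestError):
        return None


def main():
    """Wire the pieces together for a single spoken exchange."""
    engine = make_engine()
    speak(engine, "I am listening.")
    command = take_command()
    speak(engine, f"You said {command}" if command else "I did not catch that.")
```

Calling `main()` on a Windows machine with a microphone would run one exchange. Speech synthesis itself works offline, whereas the recognizer assumed here does not.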

The proposed system is intended to have the following functionality:

Jarvis asks the user for input and keeps listening for commands. The listening time can be set according to the user's preference.
If the assistant fails to understand a command, it keeps asking the user to repeat the command until it is understood.
The assistant can be customized to have either a male or a female voice, according to the user’s preference.
The current version of the assistant supports features such as checking weather updates, sending and checking emails, searching Wikipedia, streaming music, opening applications, sending text messages, checking dates and times, taking notes, showing notes, opening YouTube, etc.
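Two of the behaviors listed above, choosing a male or female voice and asking the user to repeat a command that was not understood, can be sketched as follows. The name hints "david" and "zira" are assumptions based on the default English voices commonly installed on Windows, and pyttsx3 drivers do not always fill in a voice's `gender` attribute, so the guess is best-effort only.

```python
def guess_gender(voice):
    """Best-effort guess of 'male' or 'female' from a pyttsx3 voice, else None."""
    gender = getattr(voice, "gender", None) or ""
    name = getattr(voice, "name", None) or ""
    text = f"{gender} {name}".lower()
    # Check 'female' first because the word contains 'male'.
    if "female" in text or "zira" in text:
        return "female"
    if "male" in text or "david" in text:
        return "male"
    return None


def choose_voice_id(voices, preferred):
    """Return the id of the first voice matching the preference, else the first voice."""
    for voice in voices:
        if guess_gender(voice) == preferred:
            return voice.id
    return voices[0].id if voices else None


def listen_until_understood(listen, speak, max_attempts=None):
    """Call listen() until it returns text, asking the user to repeat after each miss.

    With max_attempts=None the assistant keeps asking indefinitely.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        command = listen()
        if command:
            return command
        attempts += 1
        speak("Sorry, I did not understand. Please repeat the command.")
    return None
```

With a real engine, the selection would be applied with `engine.setProperty("voice", choose_voice_id(engine.getProperty("voices"), "female"))`. The optional `max_attempts` is an addition for illustration, so that an unattended session can give up instead of prompting forever.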

a. Streaming Music: The user can ask Jarvis to play a music track; Jarvis executes the command and searches for the track in the song folder.

b. Read the Latest News Headlines: Jarvis will read out the latest headlines from news outlets on the topics you care about or need information on.

c. Keep Tabs on the Traffic & the Weather: Jarvis can look up the weather forecast or alert you if there is an accident that will delay your morning commute.

d. Set Reminders/Timers: You can tell Jarvis to wake you up every morning at 4 a.m.

e. Answer the Following Questions: Jarvis can look up simple information, solve mathematical problems, or tell you a joke.
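A few of the features above can be sketched as small skills behind a single dispatcher. The command phrasings ("play ...", "calculate ..."), the `Music` folder name, and the accepted audio extensions are hypothetical, and the arithmetic solver is one possible approach rather than the method used in this project.

```python
import ast
import operator
from datetime import datetime
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".wav"}  # assumed file types


def match_track(filenames, query):
    """Return the first audio filename whose stem contains the query, else None."""
    query = query.lower().strip()
    for name in sorted(filenames):
        path = Path(name)
        if path.suffix.lower() in AUDIO_EXTENSIONS and query in path.stem.lower():
            return name
    return None


def find_track(song_dir, query):
    """Search a song folder (path supplied by the caller) for a matching track."""
    names = [p.name for p in Path(song_dir).iterdir() if p.is_file()]
    match = match_track(names, query)
    return Path(song_dir) / match if match else None


_OPERATORS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def solve(expression):
    """Evaluate +, -, *, / arithmetic without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


def handle(command, song_dir="Music"):
    """Route a recognized command to a skill and return the reply text."""
    if command.startswith("play "):
        track = find_track(song_dir, command[len("play "):])
        return f"Playing {track.name}" if track else "I could not find that track."
    if command.startswith("calculate "):
        try:
            return f"The answer is {solve(command[len('calculate '):])}"
        except (ValueError, SyntaxError, ZeroDivisionError):
            return "I could not work that out."
    if "time" in command:
        return datetime.now().strftime("It is %H:%M.")
    return "I do not know how to do that yet."
```

The reply text would be passed to the text-to-speech function, and actually playing the matched file is left to the caller. Walking the parsed expression instead of calling `eval` keeps a misheard or malicious phrase from running arbitrary code.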

Conclusion

In this paper we have discussed a voice-activated personal assistant developed using Python. The assistant currently works online and performs basic tasks such as giving weather updates, streaming music, searching Wikipedia, and opening desktop applications. The functionality of the current system is limited to working online only. Future updates of this assistant will incorporate machine learning, which will result in better suggestions, along with IoT integration to control nearby devices, similar to what Amazon’s Alexa does.

References

[1] Rabiner Lawrence, Juang Bing-Hwang, Fundamentals of Speech Recognition, Prentice Hall, New Jersey, 1993, ISBN 0-13015157-2
[2] Deller John R., Jr., Hansen John J.L., Proakis John G., Discrete-Time Processing of Speech Signals, IEEE Press, ISBN 0-78035386-2
[3] Hayes H. Monson, Statistical Digital Signal Processing and Modeling, John Wiley & Sons Inc., Toronto, 1996, ISBN 0-47159431-8
[4] Proakis John G., Manolakis Dimitris G., Digital Signal Processing: Principles, Algorithms, and Applications, Third Edition, Prentice Hall, New Jersey, 1996, ISBN 0-13-394338-9
[5] Ashish Jain, Hohn Harris, Speaker Identification Using MFCC and HMM Based Techniques, University of Florida, April 25, 2004.

Copyright

Copyright © 2023 Dr. Yatu Rani, Ms. Gurminder Kaur, Harsh Rana, Sagar, Nikhil. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET49111

Publish Date : 2023-02-14

ISSN : 2321-9653

Publisher Name : IJRASET
