Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Aditi Dhumal, Vishal Yadav, Aniket Dhere, Navnath Bagal
DOI Link: https://doi.org/10.22214/ijraset.2024.62158
AURA is a project that proposes the development of a versatile desktop voice assistant focused primarily on user privacy. The convenience of voice interaction has surged in popularity, but concerns about user privacy remain. Conventional cloud-based voice assistants raise anxieties regarding data collection, storage, and potential misuse. This survey paper explores the burgeoning field of privacy-preserving desktop voice assistants powered by Artificial Intelligence (AI). We delve into the core challenges of on-device speech recognition, natural language understanding, and response generation while maintaining user privacy. We conclude by outlining open research questions and future directions for this evolving field, paving the way for the development of secure and user-centric voice interaction experiences on desktops.
I. INTRODUCTION
The primary objective of the project is to develop a desktop voice assistant that can perform multiple tasks based on the user's commands while respecting the user's privacy. In today's digital world, voice assistants have become increasingly popular, offering a convenient, hands-free way to interact with technology through spoken commands. However, these assistants often raise privacy concerns because they require access to user audio data for speech recognition and potentially for other functionality. This is particularly true for desktop environments, where users may handle sensitive information.
Such assistants typically require constant access to user data and may transmit recordings to remote servers for processing. This creates the need for a new generation of voice assistants that prioritise user privacy while maintaining functionality.
This paper proposes the development of a privacy-preserving multifunctional desktop voice assistant. This novel system aims to address the growing demand for user privacy in voice interaction while offering a comprehensive suite of features for an enhanced desktop experience.
The following sections will explore the importance of privacy in voice assistants, the limitations of current systems, and the functionalities envisioned for this new privacy-focused desktop assistant. We will then delve into the proposed technical approach that ensures user privacy while enabling robust voice interaction functionalities.
This research aims to contribute to the field of human-computer interaction by providing a secure and user-centric voice assistant experience for desktop environments.
II. LITERATURE SURVEY
III. PROPOSED WORK
A voice assistant can be one of the most effective problem solvers for desktop users. Many of the problems a user faces can be resolved in a fraction of a second, using voice commands alone.
The voice assistant AURA performs multiple tasks in response to the user's voice commands while respecting the user's privacy. It lets users carry out tasks by voice alone, and it is also useful for visually impaired users who cannot perform tasks visually: AURA can help them complete those tasks through voice commands.
The user can perform multiple tasks using AURA; some of them are listed below:
These are the important features of the voice assistant, but the assistant can do much more besides.
The user gives voice input through the microphone. The assistant performs STT (Speech to Text) to convert the speech into text, interprets that text, and performs the task the user requested.
IV. IMPLEMENTATION OF PROPOSED WORK
A. Technologies Used
5. Anonymization function: There are two types of anonymization: text anonymization and voice anonymization.
This is an added security feature of the project. It identifies particular patterns in the data and protects data of that type, e.g. email addresses, phone numbers, addresses, and locations.
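The pattern-based protection described above can be illustrated with a minimal Python sketch. The patterns, the `PII_PATTERNS` table, and the `mask_pii` function are assumptions made for illustration; they are deliberately simplified and are not taken from the AURA implementation.

```python
import re

# Hypothetical, simplified patterns for two of the data types named above.
# Real-world email and phone formats are more varied than these cover.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{8,}\d"),
}


def mask_pii(text: str) -> str:
    """Replace every match of each pattern with a label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(mask_pii("mail me at user@example.com or call +91 98765 43210"))
```

Addresses and locations are harder to capture with regular expressions alone and would likely need a different technique, which this sketch does not attempt.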
B. System Architecture
The overall system design consists of the following phases:
In the first phase, data is collected in the form of speech and stored as input for the next phase. In the second phase, the input voice is continuously processed and converted to text using STT. In the third phase, the converted text is analysed and processed by a Python script to identify the response to the command. Finally, once the response is identified, the output is generated by converting the response text to speech using TTS.
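The four phases above can be sketched as a small Python pipeline. This is a hedged illustration, not the AURA code: the handler names, the keyword table `HANDLERS`, and the functions `identify_response` and `run_pipeline` are hypothetical, and the STT and TTS engines are passed in as plain callables so that no particular speech library is assumed.

```python
from typing import Callable, Dict


# Phase 3 helpers: hypothetical command handlers keyed by a trigger keyword.
def open_notepad(command: str) -> str:
    return "Opening notepad"


def tell_joke(command: str) -> str:
    return "Here is a joke"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "notepad": open_notepad,
    "joke": tell_joke,
}


def identify_response(text: str) -> str:
    """Phase 3: pick the first handler whose keyword occurs in the text."""
    lowered = text.lower()
    for keyword, handler in HANDLERS.items():
        if keyword in lowered:
            return handler(lowered)
    return "Sorry, I did not understand that"


def run_pipeline(audio: bytes,
                 stt: Callable[[bytes], str],
                 tts: Callable[[str], None]) -> str:
    """Phases 2 to 4: speech to text, response selection, text to speech."""
    text = stt(audio)                    # phase 2: STT
    response = identify_response(text)   # phase 3: command analysis
    tts(response)                        # phase 4: TTS
    return response


if __name__ == "__main__":
    # Stand-ins for real engines: "audio" is just UTF-8 text, "speech" is print.
    fake_stt = lambda audio: audio.decode("utf-8")
    run_pipeline(b"Open Notepad please", fake_stt, print)
```

Injecting the STT and TTS functions keeps the sketch runnable without a microphone, and it would let an on-device engine be swapped in without changing the command logic.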
Security Feature Added:
Text Anonymization:
Text anonymization is the process of modifying text data to prevent the identification of individuals or sensitive information. It's crucial for protecting privacy in various situations, such as sharing medical records for research or publishing survey results. Here's a breakdown of key concepts and techniques:
Why Text Anonymization?
-Privacy Protection: Anonymization safeguards individuals' identities by redacting or modifying personal details like names, addresses, phone numbers, and even certain demographics.
-Data Sharing: Enables secure sharing of sensitive data for research, analysis, or public reporting while upholding privacy regulations like GDPR and HIPAA.
-Mitigates Risks: Reduces the risk of data breaches or unauthorised access that could expose personally identifiable information (PII).
Text Anonymization Techniques:
-Redaction: Simply removes or replaces sensitive text with symbols (****) or generic terms (e.g., "patient ID").
-Pseudonymization: Replaces PII with fictitious but consistent identifiers throughout the text. This allows some analysis while protecting identities (e.g., "John Doe" becomes "Participant A").
-Tokenization: Replaces sensitive words or phrases with random tokens that don't hold inherent meaning. This can be useful for anonymizing specific terminology within a broader text corpus.
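The three techniques listed above can be contrasted in a short Python sketch. It assumes the sensitive terms are already known and supplied as a list; the function names, the `TOK_` prefix, and the `vault` dictionary are hypothetical choices for illustration rather than part of the described system.

```python
import re
import secrets
from typing import Dict, Iterable, Optional


def _pattern(terms: Iterable[str]) -> "re.Pattern[str]":
    # Longest terms first so that a longer term wins over its substrings.
    ordered = sorted(terms, key=len, reverse=True)
    if not ordered:
        return re.compile(r"(?!)")  # a pattern that never matches
    return re.compile("|".join(re.escape(term) for term in ordered))


def redact(text: str, terms: Iterable[str], mask: str = "****") -> str:
    """Redaction: replace each sensitive term with a fixed mask."""
    return _pattern(terms).sub(mask, text)


def pseudonymize(text: str, terms: Iterable[str],
                 mapping: Optional[Dict[str, str]] = None) -> str:
    """Pseudonymization: the same term always gets the same identifier."""
    mapping = {} if mapping is None else mapping

    def replace(match: "re.Match[str]") -> str:
        term = match.group(0)
        if term not in mapping:
            # Letters A, B, C, ...; adequate for a demo with up to 26 terms.
            mapping[term] = f"Participant {chr(ord('A') + len(mapping))}"
        return mapping[term]

    return _pattern(terms).sub(replace, text)


def tokenize(text: str, terms: Iterable[str], vault: Dict[str, str]) -> str:
    """Tokenization: swap each occurrence for a random token.

    The token-to-original mapping is kept in `vault` (an assumption here),
    which would have to be stored securely in a real system.
    """
    def replace(match: "re.Match[str]") -> str:
        token = "TOK_" + secrets.token_hex(4)
        vault[token] = match.group(0)
        return token

    return _pattern(terms).sub(replace, text)


if __name__ == "__main__":
    names = ["John Doe", "Jane Roe"]
    sample = "John Doe met Jane Roe; John Doe left"
    print(redact(sample, names))
    print(pseudonymize(sample, names))
    print(tokenize(sample, names, {}))
```

Pseudonymization keeps repeated mentions linked, which is what allows some analysis to continue, whereas redaction and per-occurrence tokens discard that link.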
V. RESULTS
This report has explained the AURA voice assistant project, how useful it is, and how a user can rely on a voice assistant to perform the tasks they need to complete. It performs tasks that are useful to the user, and many users can operate it without difficulty using voice commands alone.
In this project we also created a GUI that gives the user easy access to AURA. Users activate AURA to use it and can deactivate it whenever they want. All commands and their outputs are visible in the GUI.
Development of the software is almost complete on our side, and it is working as expected. Some extra development has been discussed, so further advancements may come in the near future that make the assistant even more useful than it is now.
VI. FUTURE SCOPE
Through the AURA voice assistant, we have automated various services using single-line commands. It eases many of the user's tasks, such as searching the web, retrieving weather forecast details, vocabulary help, and medical queries. We aim to make this project a complete server assistant, smart enough to act as a replacement for general server administration. AURA mainly focuses on providing these services while respecting the user's privacy. Future plans include integrating Jarvis with mobile devices using React Native to provide a synchronised experience between the two connected devices. The privacy-preserving multifunctional desktop voice assistant is a significant step forward in the development of voice interaction technologies that prioritise user privacy and security.
Copyright © 2024 Aditi Dhumal, Vishal Yadav, Aniket Dhere, Navnath Bagal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET62158
Publish Date : 2024-05-15
ISSN : 2321-9653
Publisher Name : IJRASET