Data Entry Using Speech Recognition

Authors: Suhasini Konar, Ishita Bhargava, Anushika Balamurgan, Aakanksha Bhatt, Urvashi Patkar

DOI Link: https://doi.org/10.22214/ijraset.2022.40939

Abstract

Data entry systems have changed little over time, and entering large amounts of data still seems daunting to many. To make this process easier, we propose a project that uses two modules to convert raw data into Excel format. The first module converts speech to text and saves the output in Excel format. The second module converts a PDF file to Excel format. Excel was chosen over other text-based alternatives because it allows the user to perform analysis on important data tables and to apply prediction or analytical algorithms. Using Python, this project takes the user's speech input and produces accurate text output.

I. INTRODUCTION

Speech to text is a useful feature in the data entry domain. Data management in spreadsheets can be a tedious task, and as technology changes the way things function, it is important to stay up to date. Data collection, entry, and validation are typically time-consuming tasks that require multiple people to complete. Devoting more time to them also ties up an organization's manpower and, in turn, may reduce the quality of the output. Data entry should be a job one person can do, and so the process should be designed around an individual.

Speech-to-text modules have already been incorporated into many other domains. Recent software takes the user's speech input and converts it into data it can use more effectively. For example, AI assistants like Google Assistant and Siri rely prominently on a speech-to-text module. The logical upgrade to the existing system of data entry is therefore to incorporate speech to text. Instead of typing every value into individual cells or remembering long and difficult formulas, a speech-to-Excel system for data entry lets you save time and work faster. It also converts text-format data into an Excel-formatted file for further processing and ease of work.

II. NEED FOR PROJECT

Manual data entry is very time-consuming, and a manually entered data file can contain human errors. We therefore propose a system that automates this process and makes it faster and more efficient. Instead of typing on a keyboard, the user dictates the input, which is stored, extracted, and converted into an Excel output.

Voice recognition software is increasingly used for data entry, and dictation has several advantages over typing on a keyboard: it saves time and can improve accuracy. The need for voice data entry is growing as advances in technology have made this method easier to use than a computer keyboard. The system also collects and syncs the data to the Excel file in real time. It enhances productivity by eliminating the keyboard and the manual errors that come with it, errors that can be very time-consuming to identify and rectify in spreadsheets. Voice data entry is one way that AI can help people keep up with large amounts of work.

Some data entry jobs require the person to accept inbound calls to gather information. The user might need to take down the name, address, email address, and phone number of the caller, type them into a file, and then transfer them into an existing Excel spreadsheet. This example shows how long and drawn-out data entry can be. With speech recognition software, both data entry errors and time consumption can be reduced, because the process is automated. Another advantage is that the data is stored in Excel format rather than an alternative. This allows easier processing and analysis, as Excel files can readily be taken as input by many data analysis and prediction algorithms or software.

III. LITERATURE REVIEW

A. A Review on Methods for SPEECH-TO-TEXT and TEXT-TO-SPEECH Conversion

This paper lays out the development of an interactive voice response-based mailing system that lets users manage their email accounts using audio commands only, and it analyzes various methods used for speech-to-text conversion. The algorithms used combine the Hidden Markov Model with a neural network. The accuracy of the proposed model is reported to be good, and the system can be very useful for the visually impaired. According to the paper, the most suitable technique for speech-to-text conversion is a combination of the Hidden Markov Model with a Deep Neural Network.

B. Direct Voice Control Speech Data Entry and Database Query Models

The paper discusses examples such as (1) sophisticated applications like Direct Voice Input/Direct Voice Control (DVI/DVC), the personalized speech piloting system in Europe's newest military aircraft, the Eurofighter Typhoon by EADS, and (2) Business-to-Business Electronic Commerce (B2B EC). The algorithms used are Markov pattern matching and more advanced natural language stimulation techniques. Speech, as a result of natural language processing, provides a natural, efficient, and flexible means of user interaction with computers in a messaging and communication environment. The classification of attribute objects supports searches for products of a certain product group. The remaining issue to be solved is the mapping of arbitrary product catalogs to the paper's data structure. Future work will extend the integration and classification techniques to the upcoming standards around the semantic web.

C. Free-Text Data Entry by Speech Recognition Software

This paper evaluates speech recognition software in a clinical unit and assesses its impact on productivity before general implementation. The algorithms used are advanced natural language stimulation techniques. The study compares word recognition error rates for different text types and determines their impact. The small amount of time required to enter text into the speech recognition system confirmed that current speech software can recognize and record fluent, conversational, and open language without major problems.

D. A Survey on Optical Character Recognition System

This paper summarizes the research done in the field of OCR. It gives an overview of many aspects of OCR and explores relevant solutions for fixing OCR concerns. It specifies various types, applications, and major phases of OCR which can be useful while making the project. This paper goes over each stage in great detail. OCR is a multi-phase procedure that includes acquisition, pre-processing, segmentation, feature extraction, classification, and post-processing. As a future project, an efficient OCR system could be constructed using a mix of these strategies. Multilingual character recognition systems are another major field of research.

IV. METHODOLOGY

The main goal of this project is to convert speech to text, for which the Google API can be used, and then to save the text to an Excel file.
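The flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the third-party SpeechRecognition and PyAudio packages for the microphone path, and the names `listen_with_google`, `append_transcript`, and `entries.csv` are hypothetical. The transcriber is passed in as a callable so the file-writing step can be exercised without a microphone.

```python
import csv
from typing import Callable


def listen_with_google() -> str:
    """Capture one utterance from the default microphone and transcribe it.

    Assumption: the SpeechRecognition and PyAudio packages are installed.
    """
    import speech_recognition as sr  # imported lazily; only needed for live audio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    # recognize_google sends the recorded audio to Google's Web Speech API.
    return recognizer.recognize_google(audio)


def append_transcript(csv_path: str, transcribe: Callable[[], str]) -> str:
    """Transcribe one utterance and append it to the CSV as a one-cell row."""
    text = transcribe().strip()
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([text])
    return text
```

With a microphone attached, a call such as `append_transcript("entries.csv", listen_with_google)` would add one dictated line per invocation. A CSV file is used here because it needs only the standard library and opens directly in Excel.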

V. OBJECTIVES

Typing for hours to complete data entry is both time-consuming and exhausting. The major goal of our proposed system is therefore to ease the manual entry of large amounts of data by converting speech to text and saving it to an Excel file. Additionally, data can be converted from a PDF file to an Excel file. We are working to make our system more user-friendly and to reduce the time and energy spent on manual data entry.

VI. PROPOSED SYSTEM

Our proposed system is divided into two parts. The first module is the speech-to-Excel conversion and the second module is the PDF-to-Excel conversion. In the first module, we use the Google API and the PyAudio library for speech recognition and convert the speech input to text. We use the hash method for word separation and extract the words one after the other. We then write this output to a text file (i.e., .txt format), which is converted to a .csv file (i.e., a format that Excel can open). The dictated data is stored in column-wise order. In the second module, we use a PDF-to-tables API. After the conversion, a data elimination step removes unnecessary raw data.
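The word separation and column-wise storage in the first module could look something like the sketch below. The paper does not spell out the "hash method", so this assumes one plausible reading: the spoken word "hash" marks the boundary between values, and each utterance fills one column. The function names are hypothetical, and the intermediate .txt file is skipped for brevity.

```python
import csv
import re
from itertools import zip_longest


def split_fields(transcript: str, separator: str = "hash") -> list[str]:
    """Split a transcript into values at each standalone separator word."""
    pattern = rf"\b{re.escape(separator)}\b"
    parts = re.split(pattern, transcript, flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]


def write_columns(utterances: list[str], csv_path: str) -> None:
    """Write each utterance's values down its own column of a CSV file."""
    columns = [split_fields(utterance) for utterance in utterances]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # zip_longest transposes columns into rows, padding short columns.
        for row in zip_longest(*columns, fillvalue=""):
            writer.writerow(row)
```

For instance, dictating "Name hash Asha hash Ravi" and then "City hash Mumbai" would yield a Name column with two values and a City column with one value and one empty cell.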

VII. RESULTS

VIII. LIMITATIONS

A. Misinterpretation

Programs cannot understand the context of language the way that humans can, which leads to errors of misinterpretation. For example, speech recognition software cannot always differentiate between homophones such as "their" and "there." It may also have problems with slang, technical words, and acronyms.

B. Background Noise Interference

To get the best out of voice recognition software, you need a quiet environment. Systems don't work so well if there is a lot of background noise. They may not be able to differentiate between your speech, other people talking and other ambient noise, leading to transcription mix-ups and errors. Wearing close-talking microphones or noise-canceling headsets can help the system focus on your speech.
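On the software side, one common mitigation is to calibrate the recognizer against ambient noise before listening. The sketch below assumes the SpeechRecognition package, whose `Recognizer` class provides `adjust_for_ambient_noise`, `listen`, and `recognize_google`. The function name `listen_in_noise` and the one-second default are illustrative choices, not part of the original system.

```python
def listen_in_noise(recognizer, source, calibration_seconds: float = 1.0) -> str:
    """Calibrate for background noise, then capture and transcribe one utterance.

    `recognizer` is expected to behave like speech_recognition.Recognizer and
    `source` like an already opened speech_recognition.Microphone.
    """
    # Sample the room briefly so the energy threshold sits above ambient noise.
    recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
    audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)
```

In practice this would be called inside `with sr.Microphone() as source:` with `sr.Recognizer()` as the first argument. Calibration reduces false triggers from steady noise but is unlikely to help with other people talking nearby.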

IX. FUTURE SCOPE

We plan to integrate the modules described above into a Tkinter interface. We also plan to add conversion of screenshot images (i.e., .png format) into Excel-compatible .csv files.

X. ACKNOWLEDGMENT

We thank Mrs. Urvashi Patkar (SIES Graduate School of Technology) for guiding us throughout the project and for helping us achieve its objectives.

References

[1] Kothadiya, Deep & Pise, Nitin & Bedekar, Mangesh. (2020). Different Methods Review for Speech to Text and Text to Speech Conversion. International Journal of Computer Applications. 175. 9-12. 10.5120/ijca2020920727.

[2] U. H. Langanke, "Direct Voice Control Speech Data Entry and Database Query Models," 2007 International Symposium on Logistics and Industrial Informatics, 2007, pp. 111-115, doi: 10.1109/LINDI.2007.4343522.

[3] Noman Islam, Zeeshan Islam, Nazia Noor. A Survey on Optical Character Recognition System. https://arxiv.org/pdf/1710.05703

[4] Ilgner J, Düwel P, Westhofen M. Free-text data entry by speech recognition software and its impact on clinical routine. Ear Nose Throat J. 2006 Aug;85(8):523-7. PMID: 16999060.

Copyright

Copyright © 2022 Suhasini Konar, Ishita Bhargava, Anushika Balamurgan, Aakanksha Bhatt, Urvashi Patkar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET40939

Publish Date : 2022-03-23

ISSN : 2321-9653

Publisher Name : IJRASET
