Real Time Object Detection Using Deep Learning

Authors: Adwitiya Padigel, Tushar Chintanwar, Shruti Landge, Pooja Khobragade, Tanu Awachat, Prof. Manoj Lade

DOI Link: https://doi.org/10.22214/ijraset.2022.45355

Abstract

Visually impaired people have difficulty moving safely and independently, which interferes with normal indoor and outdoor work and social activities. Similarly, they have a hard time identifying the basic features of their environment. This paper presents a model for detecting the brightness and key colors of real-time images using the RGB method with an external camera, and for identifying basic objects and recognizing faces from human datasets [2]. Object detection is a branch of computer vision that looks for instances of semantic objects in images and videos. The device uses the ESP32-CAM's camera to continuously capture multiple frames, which are subsequently converted to audio segments. In this project, we use the You Only Look Once V3 (YOLO v3) algorithm, which runs through a variation of a very complex Convolutional Neural Network architecture with OpenCV. Then, using Google Text to Speech, we convert the image to text and afterwards text to speech for the visually impaired person. Thus, the visually impaired person receives the location of the objects in the camera's view through audio. Distance calculation is aided by an ultrasonic sensor. The collected results show that the proposed prototype is successful in providing visually impaired users with the ability to understand unfamiliar settings using a user-friendly device that integrates this object detection model [1].

Introduction

I. INTRODUCTION

A large number of people live with a limited understanding of their surroundings because of visual impairment. Although they can develop alternative approaches to manage daily routines, they experience certain navigation difficulties as well as social awkwardness. For example, it is hard for them to find a particular room in a new environment. Furthermore, blind and visually impaired people find it hard to tell whether a person is talking to them or to someone else.

Object recognition has been a noteworthy direction and focus of computer vision research, with applications in autonomous vehicles, robotics, video surveillance, and pedestrian recognition. The advent of deep learning technology has changed the traditional way of identifying and recognizing objects. Deep neural networks have a powerful feature representation capacity in image processing and are usually used as the feature extraction module in object recognition. Deep learning models do not require specially hand-crafted features and can be designed to act as both classifier and regressor. Therefore, deep learning technology is very important for object recognition. The problem of object detection is to determine where an object is located in a given frame (object localization) and to recognize it. The pipeline of traditional object recognition models is mainly divided into three stages: informative region selection, feature extraction, and recognition.

II. OVERVIEW OF DEEP LEARNING

Deep learning is an artificial intelligence function that imitates the way the human brain processes data and creates patterns for use in decision making. It is a subset of machine learning in artificial intelligence whose networks are capable of learning unsupervised from data.

III. LITERATURE REVIEW

Ce Li et al. [3] present an object detector based on deep learning of small samples. The proposed model uses the semantic relevance of objects to improve the accuracy of detecting weak objects in complex scenes. Cong Tang et al. [4] describe the framework design and working principle of the models and analyze their real-time performance and detection accuracy. Christian Szegedy et al. [5] present a simple yet powerful formulation of object detection as a regression problem to object bounding box masks. They define a multi-scale inference procedure that produces high-resolution object detections at low cost from a small number of network applications. Xiaogang Wang et al. [6] focus on an overview of deep learning and its applications to object recognition, detection, and segmentation, which are major issues in computer vision with image and video applications. Shuai Zhang et al. [7] propose a framework to accomplish the task in a network with multiple cameras. A new object detection algorithm based on mean shift (MS) segmentation is introduced, and the segmented objects are further separated with the help of depth information derived from stereo vision. Malay Shah et al. [8] discuss applying a fixed number of sliding-window templates and note that supervised learning, for example with a decision tree or SVM, can also be used in the implementation. Xinyi Zhou et al. [9] deal with the field of computer vision, mainly the application of deep learning to object recognition tasks. They give a simple overview of the datasets and deep learning algorithms used in computer vision. Zhong-Qiu Zhao et al. [10] provide a detailed overview of deep learning-based object recognition frameworks that address various sub-problems, such as clutter and low resolution, with varying degrees of modification to R-CNN. Sandeep Kumar et al. [11] look at the Easynet model, in which detection is predicted by a single network. The Easynet model sees the whole image at test time, so its predictions are informed by global context. Adami Fatima Zohra et al. [12] focus on vehicle detection and classification from video streams. This method gives better results in terms of accuracy, detection, and classification; an accuracy of 99.2% is achieved.

IV. METHODOLOGY

The first step in using the ESP32-CAM with TensorFlow.js is to identify the elements that make up the web page where inference occurs. To use the TensorFlow JavaScript library, we follow these steps: first, import the TensorFlow.js libraries; then load the model (in this project the pre-trained COCO-SSD model is used); and finally create labels for the detected objects, which are displayed on the input video by drawing rectangles around the objects recognized by the COCO-SSD model.
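The labelling and rectangle steps above can be sketched independently of the browser. The project itself runs TensorFlow.js in a web page; the Python below only illustrates the data handling. It assumes each detection is a dictionary with `bbox` as `[x, y, width, height]`, a `class` name, and a `score`, which mirrors the result format of the TensorFlow.js COCO-SSD model. The function names and the 0.5 score threshold are hypothetical choices for illustration, not taken from the paper.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]


def make_label(detection: Dict) -> str:
    """Format a caption such as 'person 87%' for one detection."""
    return f"{detection['class']} {round(detection['score'] * 100)}%"


def to_corners(bbox: List[float]) -> Rect:
    """Convert [x, y, width, height] to (left, top, right, bottom) for drawing."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


def annotate(detections: List[Dict], min_score: float = 0.5) -> List[Tuple[str, Rect]]:
    """Keep confident detections and pair each caption with its rectangle."""
    # min_score is an assumed cut-off; the paper does not state one.
    return [
        (make_label(d), to_corners(d["bbox"]))
        for d in detections
        if d["score"] >= min_score
    ]
```

Each returned pair holds the caption to print and the rectangle to draw over the video frame; low-confidence detections are dropped before drawing.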

A. Components

1. ESP32-CAM

The board is powered by an ESP32-S SoC from Espressif, a powerful, programmable MCU with out-of-the-box Wi-Fi and Bluetooth. It is the cheapest (around $7) ESP32 dev board that offers an onboard camera module, MicroSD card support, and 4MB PSRAM at the same time. Adding an external Wi-Fi antenna for signal boosting requires extra soldering.

2. Ultrasonic Sensor

The HC-SR04 ultrasonic sensor is used in this project. The distance to the object is calculated from the time delay between the pulse sent by the transmitter and the echo picked up by the receiver.
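The time-of-flight relation described above can be sketched as follows. This is a minimal illustration, assuming the echo delay has already been measured in microseconds and that sound travels at roughly 343 m/s in air at about 20 °C. The function name `echo_to_distance_cm` is hypothetical and is not taken from the project.

```python
# Speed of sound in air at about 20 °C (assumed; it varies with temperature).
SPEED_OF_SOUND_M_PER_S = 343.0


def echo_to_distance_cm(echo_us: float) -> float:
    """Convert a round-trip echo delay in microseconds to a one-way distance in cm."""
    if echo_us < 0:
        raise ValueError("echo delay must be non-negative")
    # 343 m/s = 34300 cm/s = 0.0343 cm per microsecond.
    cm_per_us = SPEED_OF_SOUND_M_PER_S * 100.0 / 1_000_000.0
    # The pulse travels to the object and back, so halve the path length.
    return echo_us * cm_per_us / 2.0
```

For example, an echo delay of 1000 microseconds corresponds to a distance of roughly 17 cm.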

3. FTDI232 Module

FTDI USB to TTL serial converter modules are used for general serial applications. They are popularly used for communication to and from microcontroller development boards such as ESP-01s and Arduino micros, which do not have USB interfaces.

V. PROPOSED SYSTEM

The system converts images to text and then text to speech using the COCO-SSD model, a convolutional neural network run with TensorFlow.js, together with Google Text to Speech. It converts the annotated text into audio responses that give the location of the objects in the camera's view. The system continuously captures multiple frames using the camera on the ESP32-CAM, and the frames are then converted to audio segments. The ultrasonic sensor measures the distance of the object from the device. For better communication, a small external antenna is added, which provides better Wi-Fi range and stability.
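The step from a detection to the text that is spoken can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the split of the frame into left, center, and right thirds, the sentence wording, and the function names are all hypothetical. The detection dictionary follows the `bbox` (`[x, y, width, height]`), `class`, `score` layout returned by the TensorFlow.js COCO-SSD model, and the call to the text-to-speech service is left out.

```python
from typing import Dict, List, Optional


def horizontal_position(bbox: List[float], frame_width: float) -> str:
    """Classify a box as left, center, or right by the x-coordinate of its center."""
    x, _y, width, _height = bbox
    center_x = x + width / 2.0
    # Assumed rule: divide the frame into three equal vertical bands.
    if center_x < frame_width / 3.0:
        return "left"
    if center_x < 2.0 * frame_width / 3.0:
        return "center"
    return "right"


def describe(detection: Dict, frame_width: float,
             distance_cm: Optional[float] = None) -> str:
    """Build the sentence that would be handed to a text-to-speech engine."""
    position = horizontal_position(detection["bbox"], frame_width)
    where = "in the center" if position == "center" else f"on the {position}"
    sentence = f"{detection['class']} {where}"
    # The distance is optional because the ultrasonic reading may be missing.
    if distance_cm is not None:
        sentence += f", about {round(distance_cm)} centimeters away"
    return sentence
```

A short sentence per object keeps the audio response brief, which matters when frames are processed continuously.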

VI. SYSTEM ARCHITECTURE

In this project, images are streamed from the ESP32-CAM board and received and displayed in the browser, where TensorFlow.js processes them using the default pre-trained models. As soon as an image is received by the web application running in the browser, it runs inference and identifies the objects in the image. In this project, only the default objects are recognizable. TensorFlow provides several pre-trained models that make it easy to get started with machine learning. COCO-SSD is an ML model used to localize and identify objects in an image. The same model is used in this project.

VII. RESULT

The performance of the object detection model is evaluated based on the precision and recall across the best-matching bounding boxes for the known objects in the images.
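The evaluation idea above can be made concrete with a short sketch. It assumes that a predicted box is matched to a known object using intersection over union (IoU), a common criterion that the paper does not name explicitly, and that the counts of true positives, false positives, and false negatives are already available. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    intersection = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall
```

Precision measures how many of the reported objects are correct, while recall measures how many of the known objects were found.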

Conclusion

The proposed model is expected to be highly beneficial for visually impaired people. Further refinements to the project should bring more accurate results while keeping its main goal of being low-cost and user-friendly.

References

[1] Ferdousi Rahman, Israt Jahan Ritun, Nafisa Farhin and Jia Uddin, “An Assistive Model for Visually Impaired People using YOLO and MTCNN,” Proceedings of the 3rd International Conference on Cryptography, Security and Privacy - ICCSP '19, 2019.
[2] Zhong-Qiu Zhao, Peng Zheng, Shou-Tao Xu and Xindong Wu, “Object Detection With Deep Learning: A Review,” IEEE Transactions on Neural Networks and Learning Systems, PP(99):1-21, January 2019. Impaired U.S. Patent No. 9,488,833.8 Nov. 2016.
[3] Ce Li, Yachao Zhang and Yanyun Qu, “Object Detection Based on Deep Learning of Small Samples,” International Conference, pp. 1-6, March 2018.
[4] Cong Tang, Yunsong Feng, Xing Yang, Chao Zheng and Yuanpu Zhou, “The Object Detection Based on Deep Learning,” International Conference, pp. 1-6, 2017.
[5] Christian Szegedy, Alexander Toshev and Dumitru Erhan, “Deep Neural Networks for Object Detection,” IEEE, pp. 1-9, 2007.
[6] Xiaogang Wang, “Deep Learning in Object Recognition, Detection, and Segmentation,” IEEE, pp. 1-40, Apr. 2014.
[7] Shuai Zhang, Chong Wang and Shing-Chow Chan, “New Object Detection, Tracking, and Recognition Approaches for Video Surveillance Over Camera Network,” IEEE Sensors Journal, vol. 15, no. 69, pp. 1-13, May 2015.
[8] Malay Shah and Prof. Rupal Kapdi, “Object Detection Using Deep Neural Networks,” International Conference, IEEE, pp. 1-4, 2017.
[9] Xiao Ma, Ke Zhou and Jiangfeng Zheng, “Photo Realistic Face Age Progression/Regression Using a Single Generative Adversarial Network,” Neurocomputing, Elsevier B.V., pp. 1-16, July 2019.
[10] Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu and Xindong Wu, “Object Detection with Deep Learning: A Review,” IEEE, pp. 1-21, 2019.
[11] Sandeep Kumar, Aman Balyan and Manvi Chawla, “Object Detection and Recognition in Images,” IJEDR, pp. 1-6, 2017.
[12] Adami Fatima Zohra, Salmi Kamilia, Abbas Faycal and Saadi Souad, “Detection And Classification Of Vehicles Using Deep Learning,” International Journal of Computer Science Trends and Technology (IJCST), vol. 6, pp. 1-7, 2018.

Copyright

Copyright © 2022 Adwitiya Padigel, Tushar Chintanwar, Shruti Landge, Pooja Khobragade, Tanu Awachat, Prof. Manoj Lade. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET45355

Publish Date : 2022-07-05

ISSN : 2321-9653

Publisher Name : IJRASET
