Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Prof. Mrs. A. H. Renushe, Miss. Varsha Poojary, Miss. Salonee Shirsat, Miss. Sakshi Sonawale, Miss. Pratiksha Yadav
DOI Link: https://doi.org/10.22214/ijraset.2024.58499
We present a deep neural network model that identifies firearms in images, together with a machine learning and computer vision pipeline that detects abandoned luggage, in order to flag potential gun-based crime and abandoned-luggage incidents in surveillance footage. Suspicious activity detection is the task of identifying undesired human activity in a given place and situation. To do this, footage is converted into frames, and the processed frames are then used to analyze people's activities. YOLOv3 is used to detect suspicious activities such as lock breaking and bag snatching, among others. Our system combines high processing speed with good detection accuracy. Detecting objects in video is harder than in still images because of degradations such as motion blur and occlusion. Prior work proposes the Single Shot Video Object Detector, a faster kind of detector for videos. It combines information from nearby frames to localize objects more reliably. Unlike other methods, it estimates how content moves between frames and uses that motion to guide the combination of features. It also creates new features by sampling information directly from neighboring frames using a special structure. Automated surveillance in public areas plays a crucial role in upholding law and order and proactively identifying potential risks to the public. The procedure not only identifies and detects known offenders automatically, but also tracks the movements of people and objects and uses machine learning algorithms to alert the authorities to any suspicious activity.
I. INTRODUCTION
In cities where crime rates are rising, surveillance systems are a major aid. They gather large volumes of video data, but it is difficult for computers to recognize complex events in that footage. Breaking these tasks into smaller sub-problems that computers can handle helps. We focus on two of them: detecting guns and spotting abandoned luggage in surveillance footage. For gun detection, we used deep learning models trained on large datasets. Spotting abandoned luggage in busy places is a further challenge. It is hard for human operators, so we aim to build systems that do it automatically. We also reviewed prior work, including image-analysis methods for spotting concealed weapons.
Computer vision is about building applications that "see" and understand images and videos as people do. It centers on three tasks: finding objects (detection), determining what they are (recognition), and following them over time (tracking). These capabilities are valuable in security, traffic control, and other domains, especially for observing people and understanding what they are doing. To spot unusual behavior, we first need to detect people in each video frame and then track them as they move.
Deep learning has accelerated research in computer vision and has made object detection in still images much more accurate. Applying these detectors to video, however, is difficult: videos contain a great deal of detail, and individual frames may be blurred or partly occluded. Object detection in video therefore still needs improvement. At the same time, videos carry temporal patterns that can help detection, which makes detectors that work across the whole video a promising area to develop.
Closed-circuit television (CCTV), a common form of video surveillance, transmits signals from cameras to a control center. It is widely used in places such as banks, stores, and airports. Many countries are increasing surveillance in public places to help police investigate crimes and catch suspects. In busy areas, however, it is hard for people to watch all of the footage. This calls for a system that automatically detects and tracks suspicious people or actions. To improve public safety, we aim to augment existing surveillance with a machine learning system. With multiple cameras, such a system could predict and track suspicious activities and so assist law enforcement. Our paper focuses on machine learning tools for object detection, face recognition, tracking, and the handling of suspicious objects or people.
II. LITERATURE SURVEY
A. Suspicious Activity Detection in Surveillance Footage
Activities that raise suspicion are a concern because of the potential risks they pose to people. Given the rise in such activities in urban and suburban regions, it is critical to detect them so that these incidents can be reduced. In the past, surveillance was carried out manually by human operators, which was laborious because suspicious activity is rare compared with regular activity.
B. Paying Attention to Video Object Patterns Understanding
This work systematically studies the role of visual attention in understanding video object patterns. By carefully adding dynamic eye-tracking data to three widely used video segmentation datasets (DAVIS16, YouTube-Objects, and SegTrackV2) in the context of unsupervised video object segmentation (UVOS), the authors quantitatively demonstrated the high consistency of visual attention behavior among human observers and found a strong correlation between human attention and explicit primary object judgments during dynamic, task-driven viewing.
C. A Machine Learning Approach for Localization of Suspicious Objects using Multiple Cameras
Automated surveillance in public areas plays a crucial role in upholding law and order and proactively identifying potential risks to the public. This paper proposes surveillance automation, that is, automating the identification of suspects and suspicious human behavior within crowds. The proposal is based on an analysis of the approaches currently used to monitor crowd dynamics and of the tactics employed to capture fleeing suspects, especially in public areas such as bus stops, train stations, and airports.
D. Suspicious Activity Detection from Videos using YOLOv3
Human activity detection for video systems is an automated technique for examining video clips and making intelligent decisions about the activities they depict. It is one of the emerging areas in artificial intelligence and computer vision. Suspicious activity detection is the practice of identifying undesired human behavior in a given place and situation. This is accomplished by splitting video into frames and then using those frames to analyze human activity.
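The frame-splitting step can be sketched as follows. This is a minimal illustration, not the authors' code: `iter_frames`, `FakeCapture`, and the `every_n` sampling parameter are hypothetical names introduced here. The only assumption about the video source is that it exposes a `read()` method returning an `(ok, frame)` pair, which is the shape of result that OpenCV's `cv2.VideoCapture.read()` returns.

```python
from typing import Any, Iterator, Tuple


def iter_frames(capture: Any, every_n: int = 1) -> Iterator[Tuple[int, Any]]:
    """Yield (frame_index, frame) for every `every_n`-th frame of a source.

    `capture` is any object whose read() returns (ok, frame); sampling
    every n-th frame is an assumption made here to limit the workload.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of stream or read failure
            break
        if index % every_n == 0:
            yield index, frame
        index += 1


class FakeCapture:
    """Hypothetical stand-in source so the sketch runs without a video file."""

    def __init__(self, n_frames: int) -> None:
        self._frames = iter("frame-{}".format(i) for i in range(n_frames))

    def read(self) -> Tuple[bool, Any]:
        frame = next(self._frames, None)
        return (frame is not None), frame


# Keep every 4th frame of a 10-frame source.
sampled = list(iter_frames(FakeCapture(10), every_n=4))
```

With a real camera or file, the `FakeCapture` object would be replaced by an opened video source, and each yielded frame would be passed on to the detector.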
E. Single Shot Video Object Detector
For object detection in videos, single-shot detectors have the potential to be more practical than two-stage detectors, being faster and simpler to operate. However, it is not simple to extend these object detectors from images to videos, particularly when the video suffers from visual degradation such as motion blur or occlusion.
III. SOFTWARE REQUIREMENT
Operating System | Windows 10 (64-bit)
IDE | Spyder
Programming Language | Python 3.6, 3.7, 3.8
Libraries | TensorFlow, OpenCV, Keras, NumPy
IV. ALGORITHM
A. YOLOv3
YOLOv3, or You Only Look Once version 3, is a widely used object detection algorithm in computer vision. Its main strength is real-time object detection, which makes it a popular choice in many applications, especially surveillance and security systems. The algorithm analyzes video frames quickly and with good accuracy, identifying many kinds of objects in a scene, including people, vehicles, and other items. What sets YOLOv3 apart is its unified framework, which divides an image into a grid and predicts bounding boxes and associated probabilities for each grid cell. This grid-based approach lets the algorithm process the entire image in a single pass, eliminating the need for multiple scans and significantly speeding up detection. YOLOv3's architecture also improves on its predecessors: it makes predictions at three different scales, which enables it to detect objects of varying sizes more effectively and gives a more complete view of the scene being analyzed.
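To make the grid-based prediction concrete, the sketch below decodes one raw prediction from one grid cell into a pixel-space box, following the box parameterization described in the YOLOv3 paper: the center offsets pass through a sigmoid and are added to the cell index, and the width and height scale an anchor prior exponentially. The function name `decode_cell` and the example values (a 13 x 13 grid on a 416 x 416 input, anchor 116 x 90) are illustrative assumptions, not settings reported for this system.

```python
import math
from typing import Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_cell(
    raw: Tuple[float, float, float, float, float],
    cell: Tuple[int, int],
    anchor: Tuple[float, float],
    grid_size: int,
    image_size: int,
) -> Tuple[float, float, float, float, float]:
    """Turn raw outputs (tx, ty, tw, th, to) into (cx, cy, w, h, objectness).

    cell   -- (column, row) of the grid cell that made the prediction
    anchor -- (width, height) of the anchor prior, in input-image pixels
    """
    tx, ty, tw, th, to = raw
    col, row = cell
    stride = image_size / grid_size  # pixels covered by one grid cell
    # The sigmoid keeps the predicted center inside its own cell.
    cx = (sigmoid(tx) + col) * stride
    cy = (sigmoid(ty) + row) * stride
    # Width and height rescale the anchor prior.
    w = anchor[0] * math.exp(tw)
    h = anchor[1] * math.exp(th)
    return cx, cy, w, h, sigmoid(to)


# All-zero raw outputs from the central cell of a 13 x 13 grid.
box = decode_cell((0.0, 0.0, 0.0, 0.0, 0.0), cell=(6, 6),
                  anchor=(116.0, 90.0), grid_size=13, image_size=416)
```

Here `box` is centered at (208.0, 208.0), the middle of the image, with the anchor's own width and height and an objectness of 0.5. A full detector repeats this for every cell, anchor, and scale, and then filters overlapping boxes.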
V. METHODOLOGY
A. Object Detection
In this context, "objects" are people and baggage (or comparable items) in a public transportation setting such as an airport. Our task is to identify any suspicious association or dissociation between people and bags (or other objects): a bag left unattended can be dangerous, and the person associated with the bag is treated as the suspect. Multiple cameras are used because a person may be carrying a bag that is hidden from one camera but visible to another. This stage yields a bounding box for each detected object. After putting this multi-camera system into place, we were satisfied with the outcome.
B. Face Recognition
A camera at the entrance to the surveillance environment is set up to identify faces and compare them to a database of passengers and performers within the system. The output of this step is the name of every person found during the object detection stage. This metadata from all of the cameras is relayed to the server.
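The database comparison can be illustrated with a nearest-neighbor match. The paper does not say how faces are represented or compared, so everything here is an assumption made for illustration: faces are taken to be fixed-length feature vectors, similarity is Euclidean distance, and `identify`, `max_distance`, and the sample names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(
    face: Sequence[float],
    database: Dict[str, Sequence[float]],
    max_distance: float = 0.6,  # assumed acceptance threshold
) -> Optional[str]:
    """Return the name of the closest enrolled face, or None if no
    enrolled face lies within `max_distance`."""
    best_name, best_dist = None, float("inf")
    for name, reference in database.items():
        dist = euclidean(face, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Hypothetical enrolled faces with 3-dimensional feature vectors.
enrolled = {
    "passenger_a": [0.0, 1.0, 0.0],
    "passenger_b": [1.0, 0.0, 0.0],
}
known = identify([0.1, 0.9, 0.0], enrolled)
unknown = identify([5.0, 5.0, 5.0], enrolled)
```

In this sketch `known` is `"passenger_a"` and `unknown` is `None`. Returning `None` for faces that match nobody keeps strangers from being mislabeled as enrolled people.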
C. Handling of Suspicious Activities
As noted above, a bag or other item left unattended can be dangerous, and the owner of the bag is the person under suspicion. The primary focus of our approach is therefore on baggage and on the people and items nearby. An object ID is assigned to every recognized object. Whenever a bag object is found, the bag is tagged "Person with the bag" and mapped onto a nearby person. This mapping is retained and repeated until the object ID of that bag disappears from the surveillance environment.
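The bag-to-person mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions: object positions are given as 2-D points keyed by object ID, "nearby" is taken to mean the nearest person by straight-line distance, and the `unattended` check with its `max_gap` threshold is a hypothetical addition showing one way an alert could be raised. The class and method names are not from the paper.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def distance(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


class BagOwnerMap:
    """Keeps a bag ID -> person ID mapping while the bag stays in view."""

    def __init__(self) -> None:
        self.owner_of: Dict[int, int] = {}

    def update(self, persons: Dict[int, Point],
               bags: Dict[int, Point]) -> Dict[int, int]:
        # Drop mappings for bags whose IDs have left the scene.
        for bag_id in list(self.owner_of):
            if bag_id not in bags:
                del self.owner_of[bag_id]
        # Map each newly seen bag onto the nearest person (assumption).
        for bag_id, bag_pos in bags.items():
            if bag_id not in self.owner_of and persons:
                nearest = min(
                    persons, key=lambda pid: distance(persons[pid], bag_pos))
                self.owner_of[bag_id] = nearest
        return dict(self.owner_of)

    def unattended(self, persons: Dict[int, Point], bags: Dict[int, Point],
                   max_gap: float) -> List[int]:
        """Bags whose mapped owner is missing or farther than max_gap."""
        flagged = []
        for bag_id, owner_id in self.owner_of.items():
            if bag_id not in bags:
                continue
            owner_pos = persons.get(owner_id)
            if owner_pos is None or distance(owner_pos, bags[bag_id]) > max_gap:
                flagged.append(bag_id)
        return sorted(flagged)


tracker = BagOwnerMap()
first = tracker.update({1: (0.0, 0.0), 2: (10.0, 0.0)}, {100: (1.0, 0.0)})
later_persons = {1: (8.0, 0.0), 2: (10.0, 0.0)}
later_bags = {100: (1.0, 0.0)}
second = tracker.update(later_persons, later_bags)
alerts = tracker.unattended(later_persons, later_bags, max_gap=3.0)
final = tracker.update(later_persons, {})
```

Bag 100 is first mapped to person 1 and stays mapped after person 1 walks away, so `alerts` contains bag 100. The mapping is cleared once the bag's ID is no longer reported.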
D. User interface
The user interface is built with a Python graphics library. It superimposes the required metadata over the identified objects in an easy-to-use display. It presents the information the user needs and keeps track of the alarms generated by the tracker and the server. The user can view the identified objects' locations on a two-dimensional map of the monitored region. A path is drawn in the user interface for every object that the tracker has recognized and is tracking. When there is suspicious behavior, this helps the surveillance team narrow down the area in which to search for a specific suspect. The system also provides visual feedback whenever it notices suspicious activity.
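The per-object paths shown on the two-dimensional map can be kept in a simple structure like the one below. This is a sketch under assumptions: positions are map coordinates keyed by object ID, and `search_area` (the bounding rectangle of a path plus a margin) is a hypothetical example of how a path could narrow the search region. None of these names come from the paper.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


class PathRecorder:
    """Accumulates the map positions visited by each tracked object."""

    def __init__(self) -> None:
        self.paths: DefaultDict[int, List[Point]] = defaultdict(list)

    def add_frame(self, positions: Dict[int, Point]) -> None:
        for object_id, pos in positions.items():
            path = self.paths[object_id]
            # Skip repeated points so a stationary object adds no clutter.
            if not path or path[-1] != pos:
                path.append(pos)

    def search_area(self, object_id: int, margin: float = 0.0) -> Optional[Rect]:
        """Rectangle enclosing everywhere the object has been seen."""
        path = self.paths.get(object_id)
        if not path:
            return None
        xs = [p[0] for p in path]
        ys = [p[1] for p in path]
        return (min(xs) - margin, min(ys) - margin,
                max(xs) + margin, max(ys) + margin)


recorder = PathRecorder()
for frame_positions in ({7: (1.0, 2.0)}, {7: (1.0, 2.0)},
                        {7: (3.0, 2.0)}, {7: (3.0, 5.0)}):
    recorder.add_frame(frame_positions)

path_7 = recorder.paths[7]
area_7 = recorder.search_area(7, margin=1.0)
```

The recorded path for object 7 has three points, since the repeated position is stored once, and `area_7` is the rectangle a display could highlight for the surveillance team.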
VI. CONCLUSION
In this paper, we have demonstrated a system that detects objects in real time and then attempts to classify them according to whether or not they are suspicious. False negative detections were less common, and detection accuracy was better than anticipated. We succeeded in identifying faces from our database with a comparatively good degree of precision. This was made feasible by a carefully curated, high-quality training dataset. Accuracy was further improved through model optimization.
Copyright © 2024 Prof. Mrs. A. H. Renushe, Miss. Varsha Poojary, Miss. Salonee Shirsat, Miss. Sakshi Sonawale, Miss. Pratiksha Yadav. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET58499
Publish Date : 2024-02-19
ISSN : 2321-9653
Publisher Name : IJRASET