IJRASET: International Journal for Research in Applied Science and Engineering Technology
Authors: Vishal R, Praneet V K, Suvidha Manjunath Naik, Varun S Raju, Bhanujyothi H C
DOI Link: https://doi.org/10.22214/ijraset.2023.51835
Since the beginning of the twenty-first century, fashion has evolved into a way of life. The amount and style of clothing we wear can vary significantly with skin tone, physical stature, gender, and social and geographic factors. For a large share of the population, shopping still means the in-store buying experience. Customers can try on clothing in real time, but this process becomes slow when a store has too few trial rooms. The goal here is to create an engaging, interactive, and highly realistic virtual system that allows customers to select from a wide variety of clothing designs and then simulates those outfits on them virtually. In this study, we propose a system that aids in coordinating a person's daily attire. A virtual styling room driven by a live video feed may change how a person shops for and tries on new clothing. Using the idea of "virtual reality", customers can try on a wide range of clothing items without the need to actually wear them. The benefit of this approach is that it takes less time and effort than physically trying the garments on. The project also aids store management by reducing the need for customers to try on every article of clothing, and retailers can save time, money, and space by not keeping a large inventory on hand in the store.
I. INTRODUCTION
There are two worlds: the real world and the virtual world. With the advent of computers, humans began to operate digitally and have since made efforts to integrate the digital and physical worlds seamlessly. Numerous technologies were developed in an effort to close the gap between the virtual and physical worlds.
Virtual reality, augmented reality, and mixed reality are three examples of software that connects the virtual and real worlds. This gave rise to a plethora of devices that enable users to experience virtual and real worlds simultaneously. Owing to the technology industry's rapid growth, smart technologies that streamline our activities have a significant impact on our daily lives. Online purchasing, for example, developed quickly. People are becoming more accustomed to using internet stores, online deals, and the like to buy the things they are interested in. This kind of sale has become the most popular one and offers customers a great deal of convenience. Virtual trial rooms are now used in various shopping malls and stores. As the technologies have improved, virtual reality has been used to connect the virtual world and the real world. With existing solutions, however, customers have to be very particular while trying on clothes. By using virtual reality we also ensure that users enjoy selecting clothes and buy them if they love them, since it enhances their shopping experience. This is also referred to as the virtual try-on requirement.
A. Background
A virtual dressing room is a technology-based tool that enables customers to try on clothing digitally. The system typically overlays images of clothing on the user's body using augmented reality (AR) or virtual reality (VR), letting them see how the apparel looks and fits without actually putting it on.
Although the idea of a virtual changing room has been around for a while, it has recently become more widely known because of rising smartphone usage and developments in AR and VR technologies. The first virtual dressing rooms were created early in the new millennium, but they had few features and were not extensively used.
As e-commerce and internet purchasing have grown, however, virtual changing rooms have come into their own. Here we use a try-on module and a geometric matching module to create a virtual dressing room, which lets customers try on clothing and see how it would look without physically wearing it. Try-on module: using augmented reality technology, the try-on module places a virtual representation of the garment on the customer's body, so the consumer can visualize how the garment will fit and appear without actually donning it. Geometric matching module: the geometric matching module assesses the customer's body size and shape using computer vision techniques, and then matches the garment's geometry to that body.
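As a concrete illustration of how such a try-on module might work, the sketch below anchors a transparent garment image to the shoulders reported by an off-the-shelf pose estimator and alpha-blends it onto the camera frame. The choice of MediaPipe, the landmark indices, and the 1.8 width margin are assumptions made for this example; the paper does not prescribe a particular library or scaling rule.

```python
# Minimal sketch of an AR try-on overlay, assuming MediaPipe for pose landmarks.
import cv2
import mediapipe as mp

pose = mp.solutions.pose.Pose(static_image_mode=True)

def overlay_garment(frame, garment_rgba):
    """Scale a garment PNG (with alpha channel) to the detected shoulder width and blend it in."""
    h, w = frame.shape[:2]
    result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if not result.pose_landmarks:
        return frame                          # no person detected; leave frame untouched
    lm = result.pose_landmarks.landmark
    lx, rx = lm[11].x * w, lm[12].x * w       # 11/12 = left/right shoulder in MediaPipe
    top_y = min(lm[11].y, lm[12].y) * h
    shoulder_px = max(abs(lx - rx), 1.0)
    scale = shoulder_px * 1.8 / garment_rgba.shape[1]   # 1.8: assumed width margin
    g = cv2.resize(garment_rgba, None, fx=scale, fy=scale)
    gh, gw = g.shape[:2]
    x0 = int(min(lx, rx) - (gw - shoulder_px) / 2)      # center garment over the shoulders
    y0 = int(top_y - 0.1 * gh)                          # start slightly above the shoulder line
    x0, y0 = max(x0, 0), max(y0, 0)
    gh, gw = min(gh, h - y0), min(gw, w - x0)
    alpha = g[:gh, :gw, 3:] / 255.0                     # per-pixel transparency from the PNG
    roi = frame[y0:y0 + gh, x0:x0 + gw]
    frame[y0:y0 + gh, x0:x0 + gw] = (alpha * g[:gh, :gw, :3] + (1 - alpha) * roi).astype(frame.dtype)
    return frame
```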
B. Overview of Present Work
The primary goal is to improve the in-store clothes-buying experience. To save time, we set out to develop an "augmented reality" fitting room. It gives customers a way to try different clothing before making a purchase without actually touching it, and it reduces the need to physically put clothes on, which also lowers the chance of contracting COVID-19. It enables consumers to make wiser decisions and helps create a genuine connection between the user and virtual clothing.
C. Problem Statement
We design a solution that reduces human time and effort: a virtual styling room that uses a live video feed and images to provide a virtual space for trying on different clothing. The proposed system can handle any posture of the user and renders clothes that align properly with the person's body measurements and fit well. It can therefore serve as an effective and accurate alternative to trying on a garment in person.
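A minimal sketch of the measurement step this implies is given below: coarse body measurements are derived from 2D pose keypoints and used to compute a garment scale factor. The COCO keypoint convention and the shoulder-matching rule are assumptions for illustration, not details fixed by this paper.

```python
# Hedged sketch: derive coarse body measurements from 2D pose keypoints so a
# garment can be scaled to the wearer. Indices follow the COCO keypoint
# convention (an assumption; the paper does not fix a keypoint format).
import numpy as np

SHOULDER_L, SHOULDER_R, HIP_L, HIP_R = 5, 6, 11, 12   # COCO indices

def body_measurements(keypoints):
    """keypoints: (17, 2) array of (x, y) pixel coordinates."""
    shoulder_width = np.linalg.norm(keypoints[SHOULDER_L] - keypoints[SHOULDER_R])
    hip_width = np.linalg.norm(keypoints[HIP_L] - keypoints[HIP_R])
    torso_height = np.linalg.norm(
        (keypoints[SHOULDER_L] + keypoints[SHOULDER_R]) / 2
        - (keypoints[HIP_L] + keypoints[HIP_R]) / 2)
    return {"shoulder": shoulder_width, "hip": hip_width, "torso": torso_height}

def garment_scale(measurements, garment_shoulder_px):
    """Scale factor that matches the garment's shoulder line to the user's."""
    return measurements["shoulder"] / garment_shoulder_px
```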
D. Objectives
This work mainly focuses on the following objectives:
Describe the cutting-edge tools and methods used in virtual changing rooms in general, with an emphasis on those that involve machine learning.
Highlight the advantages and disadvantages of each machine learning method by contrasting different algorithms' results in virtual changing rooms.
Examine the elements, such as lighting conditions, body shapes, clothing textures, and camera quality, that have an impact on the accuracy and dependability of virtual dressing rooms.
Identify the various data formats, such as 3D scans, images, and video, that can be used to train algorithms for virtual changing rooms.
Examine how machine learning can be utilized to enhance personalization, increase realism, and reduce processing time to improve the user experience of virtual changing rooms.
II. LITERATURE SURVEY
In [1], Hsiao et al. propose an image-based virtual try-on system that aims to transfer a desired target garment onto a person, a task that has attracted increasing attention. Prior methods rely heavily on accurate parsing results, and it remains a major challenge to generate highly realistic try-on images without a human parser. To address this, they proposed a Parser-Free Virtual Try-On Network (PF-VTON), which can generate high-quality try-on images without relying on a human parser. Compared with previous methods, they introduced two key innovations. One is a new geometric matching module, which warps the pixels of the target clothes and the features of the preliminary warped clothes to obtain final warped clothes with realistic texture and robust alignment. The other is a new U-Transformer, which is highly effective at generating realistic images in try-on synthesis. In [2], the authors investigate virtual try-on under arbitrary poses, which has attracted considerable research attention because of its huge potential applications. Existing methods, however, can hardly preserve the details of clothing texture and facial identity (face, hair) while fitting new clothes and poses onto a person. The paper proposes a multi-stage framework that synthesizes person images in stages so that rich details in salient regions are well preserved. Specifically, the framework decomposes generation into spatial alignment followed by coarse-to-fine generation. To preserve the details in salient areas such as clothing and facial regions, the authors propose a Tree-Block (tree dilated fusion block) that harnesses multi-scale features in the generator networks. With end-to-end training of the multiple stages, the whole framework can be jointly optimized, yielding results with significantly better visual fidelity and richer detail. Extensive experiments on standard datasets show that the proposed framework achieves state-of-the-art performance, especially in preserving the visual details of clothing texture and facial identity.
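Geometric matching modules of this kind typically rely on thin-plate-spline (TPS) warping to deform the flat garment image toward the target pose. The snippet below is a generic OpenCV sketch of such a warp, not the authors' code; it assumes opencv-contrib-python is installed and that corresponding control points on the garment and on the body are already known.

```python
# Illustrative thin-plate-spline garment warp, a common building block of
# geometric matching modules like the one PF-VTON describes.
import cv2
import numpy as np

def tps_warp(garment, src_pts, dst_pts):
    """Warp `garment` so control points src_pts land on dst_pts (both N x 2 arrays)."""
    tps = cv2.createThinPlateSplineShapeTransformer()
    src = np.asarray(src_pts, np.float32).reshape(1, -1, 2)
    dst = np.asarray(dst_pts, np.float32).reshape(1, -1, 2)
    matches = [cv2.DMatch(i, i, 0) for i in range(src.shape[1])]
    # OpenCV applies the inverse map inside warpImage, hence the (dst, src) ordering.
    tps.estimateTransformation(dst, src, matches)
    return tps.warpImage(garment)
```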
In [3], it remains a major challenge to generate photo-realistic try-on images when large occlusions and complex human poses are present in the reference person. To address this issue, the authors propose a new visual try-on network, namely the Adaptive Content Generating and Preserving Network (ACGPN). In particular, ACGPN first predicts the semantic layout of the reference image that will change after try-on (e.g., long sleeve shirt → arm, arm → jacket), and then determines whether image content needs to be generated or preserved according to the predicted semantic layout, leading to photo-realistic try-on results with rich clothing details. ACGPN comprises three major modules.
First, a semantic layout generation module uses the semantic segmentation of the reference image to progressively predict the desired semantic layout after try-on. Second, the clothes-warping module warps clothing images according to the generated semantic layout, with a second-order difference constraint introduced to stabilize the warping process during training. Third, an inpainting module for content fusion integrates all information (e.g., reference image, semantic layout, and warped clothes) to adaptively produce each semantic part of the human body.
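One plausible reading of the second-order difference constraint is a curvature penalty on the TPS control grid, which keeps neighbouring control points moving consistently. The sketch below implements that reading in PyTorch; the exact formulation in ACGPN may differ.

```python
# Hedged sketch of a second-order difference constraint on TPS control points.
import torch

def second_order_constraint(grid: torch.Tensor) -> torch.Tensor:
    """grid: (H, W, 2) tensor of TPS control-point coordinates."""
    # |2*p[i] - p[i-1] - p[i+1]| along both grid axes approximates the discrete
    # second derivative; it is zero for purely affine (distortion-free) warps.
    d2y = (2 * grid[1:-1, :] - grid[:-2, :] - grid[2:, :]).abs().sum()
    d2x = (2 * grid[:, 1:-1] - grid[:, :-2] - grid[:, 2:]).abs().sum()
    return d2x + d2y
```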
In paper [4], the recently proposed image-based virtual try-on (VTON) approaches face several challenges regarding different human poses and clothing styles. First, cloth-warping networks often generate highly distorted and misaligned warped clothes, owing to incorrect clothing-agnostic human representations, mismatches in the input images used for clothing-human matching, and inappropriate regularization of the transform parameters. Second, blending networks can fail to retain the remaining clothes because of a wrong human representation and an inappropriate training loss for composition-mask generation. Hence, the authors proposed CP-VTON (Clothing shape and texture Preserving VTON) to overcome these issues; it significantly outperforms state-of-the-art methods both quantitatively and qualitatively.

In [5], image style transfer is treated as an underdetermined problem in which a large number of results can satisfy the same constraint (the content and style). Although there have been some efforts to improve the diversity of style transfer by introducing an alternative diversity loss, they suffer from restricted generality, limited diversity, and poor scalability. The authors attack these limitations and propose a simple yet effective method for diversified arbitrary style transfer. The key idea is an operation called deep feature perturbation (DFP), which uses an orthogonal random noise matrix to perturb the deep image feature maps while keeping the original style information unchanged. The DFP operation can be easily integrated into many WCT (whitening and coloring transform) based methods, empowering them to generate diverse results for arbitrary styles. Experimental results demonstrate that this learning-free and universal method can greatly increase diversity while maintaining the quality of stylization.

In [6], the first image-based full-body generative model of persons wearing garments is presented. It avoids the requirement for high-quality 3D scans of dressed persons as well as the frequently employed, complicated graphics rendering pipeline. The model first produces a semantic segmentation of the body and clothing, then applies a conditional model to the obtained segments to produce realistic images. The entire model can be conditioned on pose, shape, or colour, so it can generate examples of people wearing many kinds of attire and can create entirely new persons wearing realistic clothing.
The model may be biased as a result of its training dataset, which might not be representative of the broader population. Although the model can produce a wide range of clothing designs, the authors admit that it might have trouble producing some styles of clothes, such as those with intricate patterns or textures. The suggested model also has a number of intricate parts, including a body shape model, a clothing model, and a texture transfer technique, which could make scaling it to bigger datasets or real-time applications challenging.
Automatically generated clothing designs from the model might not always match user requirements or preferences.
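Returning briefly to [5], the deep feature perturbation operation lends itself to a compact sketch: inside a whitening-and-coloring transform, the whitened content features are multiplied by a random orthogonal matrix, which changes the result while leaving the feature covariance, and hence the style statistics, untouched. The shapes and QR-based sampling below are illustrative assumptions, not the authors' implementation.

```python
# Sketch of deep feature perturbation (DFP) between the whitening and coloring
# steps of a WCT-based style transfer method.
import numpy as np

def perturb_whitened(whitened, seed=None):
    """whitened: (C, N) whitened feature matrix; returns a perturbed version."""
    rng = np.random.default_rng(seed)
    c = whitened.shape[0]
    # QR decomposition of a Gaussian matrix yields a random orthogonal Q.
    q, _ = np.linalg.qr(rng.standard_normal((c, c)))
    # Covariance (Q W)(Q W)^T = Q (W W^T) Q^T stays the identity for whitened W,
    # so the style statistics imposed later by the coloring step are unchanged.
    return q @ whitened
```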
This paper [7] presents the first semantic image synthesis model that can generate photorealistic outputs for a variety of scenarios, including indoor, outdoor, landscape, and street scenes, as a result of the proposed spatially-adaptive normalization. Despite the diversity of the generated garment designs, they might not always appear realistic or organic; this is especially true for complex or intricate clothing designs, which the suggested method might not simulate effectively. The model concentrates on creating garments at a high degree of abstraction, which can leave out small elements such as folds or wrinkles and so reduce how realistic the generated designs look. The suggested method mainly targets garment design for computer graphics and virtual try-on systems; it may be less applicable to other fields, such as fashion design or clothing production.
In paper [8], the authors demonstrate how deep neural network features extracted from large-scale image classification models can be used to predict human judgments of image quality. Specifically, they show that human perceptual judgments of the difference between two images correlate closely with the distance between the images' deep features. Comparing their deep feature metric against conventional image quality measures such as PSNR and SSIM, they show that it correlates better with subjective evaluations of image quality. The authors additionally show how the metric can be used in a variety of image processing tasks, such as image denoising, image inpainting, and image style transfer. One potential limitation is that the study is restricted to natural photographs and might not apply to other types of visual content, such as graphics, text, or video; the authors acknowledge this and note that further study is required to extend the findings to those fields. Another potential weakness is the restriction to a particular set of deep features trained on the ImageNet dataset: these features may not be optimal for all perceptual tasks, even if they have proven useful for image classification, so future work may need to examine other feature types or architectures.

In [9], a method and system were presented to recognize the body from gestures that represent commands, so that actions can be initiated within an electronic marketplace on behalf of the user. A model of the user's body is generated from a first set of spatial data, and a second model is then generated by the action engine from the second spatial dataset received. In [10], a virtual dressing room application using Kinect sensors was introduced. The approach is based on extracting the user from a video stream, together with skin-colour detection and alignment of models; to align the 2D cloth models with the user, the 3D locations of the joints are used for positioning, scaling, and rotation. In [11], a new framework, GANalyze, based on Generative Adversarial Networks (GANs), is introduced to study the visual features and properties that underlie high-level cognitive attributes; it focuses on image memorability as a case study, but shows that the same methods can be applied to image aesthetics and emotional valence. In [12], garment modelling based on creating virtual bodies from standard measurements was presented. The 3D reconstruction methods focus on recovering the actual 3D shape of a captured garment alone, the body shape of a subject wearing the garment, or both simultaneously. Methods utilizing controlled RGB and RGB-D images have been presented that select and refine 3D garment templates based on image observations.
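The perceptual-metric idea of [8] can be approximated in a few lines: extract deep features of both images with a pretrained network and measure their distance. The sketch below uses torchvision's VGG16 as the extractor; the paper's actual metric additionally learns per-channel weights, so this is an approximation rather than a reimplementation.

```python
# Rough sketch of a deep-feature perceptual distance, assuming torchvision.
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF

# Truncate VGG16 to an intermediate conv block as the feature extractor.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16].eval()
MEAN, STD = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]   # ImageNet statistics

def perceptual_distance(img_a, img_b):
    """img_a, img_b: (3, H, W) tensors in [0, 1]; returns a feature-space distance."""
    with torch.no_grad():
        fa = vgg(TF.normalize(img_a, MEAN, STD).unsqueeze(0))
        fb = vgg(TF.normalize(img_b, MEAN, STD).unsqueeze(0))
    # Unit-normalize along the channel axis before comparing, as LPIPS-style metrics do.
    fa = fa / (fa.norm(dim=1, keepdim=True) + 1e-8)
    fb = fb / (fb.norm(dim=1, keepdim=True) + 1e-8)
    return (fa - fb).pow(2).mean().item()
```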
In paper [13], a new augmented reality concept for dressing rooms was introduced. It enables customers to combine an easy simulated try-on with a tactile experience of the fabrics: the dressing room has a camera and a projection surface instead of a mirror, and the customers put visual tags on their clothes. Relatedly, facial image manipulation is an important task in computer vision and computer graphics, enabling applications such as automatic transfer of facial expressions and styles (e.g., hairstyle, skin color).
In [14], while physics-based simulation (PBS) can accurately drape a 3D garment on a 3D body, it remains too expensive for real-time applications such as virtual try-on. By contrast, inference in a deep network, requiring only a single forward pass, is much faster. Taking advantage of this, the authors propose a new method to fit a 3D garment template to a 3D body. Specifically, they build on recent progress in processing 3D point clouds with deep networks to extract garment features at varying levels of detail, including point-wise, patch-wise, and global features. They fuse these features with features extracted in parallel from the 3D body so as to model the cloth-body relations. The resulting two-stream architecture, which they call GarNet, is trained with a loss function inspired by physics-based modeling and delivers visually plausible garment shapes whose 3D points are, on average, less than 1 cm away from those of a PBS method, while running 100 times faster. Moreover, the proposed method can model various garment types with different cutting patterns when the parameters of those patterns are given as input to the network.
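The two-stream idea can be caricatured in a few lines of PyTorch: one stream embeds the garment points, the other summarizes the body points into a global descriptor, and the fused features regress per-point displacements. Layer sizes, the max-pooling summary, and the displacement head are illustrative choices, not GarNet's actual architecture.

```python
# Minimal two-stream fusion sketch in the spirit of GarNet [14].
import torch
import torch.nn as nn

class TwoStreamDraper(nn.Module):
    def __init__(self, feat=64):
        super().__init__()
        self.garment_mlp = nn.Sequential(nn.Linear(3, feat), nn.ReLU(), nn.Linear(feat, feat))
        self.body_mlp = nn.Sequential(nn.Linear(3, feat), nn.ReLU(), nn.Linear(feat, feat))
        self.head = nn.Sequential(nn.Linear(2 * feat, feat), nn.ReLU(), nn.Linear(feat, 3))

    def forward(self, garment_pts, body_pts):
        """garment_pts: (B, Ng, 3); body_pts: (B, Nb, 3) -> (B, Ng, 3) displacements."""
        g = self.garment_mlp(garment_pts)                 # point-wise garment features
        b = self.body_mlp(body_pts).max(dim=1).values     # global body descriptor (max pool)
        b = b.unsqueeze(1).expand(-1, g.shape[1], -1)     # broadcast to every garment point
        return self.head(torch.cat([g, b], dim=-1))       # per-point 3D displacement
```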
In [15], the authors propose TailorNet, a neural model that predicts clothing deformation in 3D as a function of three factors, pose, shape, and style (garment geometry), while retaining wrinkle detail. This goes beyond previous models, which are either specific to one style and shape, or generalize to different shapes but produce overly smooth results despite being style specific. The hypothesis is that (even non-linear) combinations of examples smooth out high-frequency components such as fine wrinkles, which makes learning the three factors jointly hard. At the heart of the approach is a decomposition of deformation into a high-frequency and a low-frequency component. While the low-frequency component is predicted from pose, shape, and style parameters with an MLP, the high-frequency component is predicted with a mixture of shape-style-specific pose models. The weights of the mixture are computed with a narrow-bandwidth kernel to guarantee that only predictions with similar high-frequency patterns are combined. Style variation is obtained by computing, in a canonical pose, a subspace of deformation that satisfies physical constraints such as avoiding inter-penetration, and draping on the body. TailorNet delivers 3D garments that retain the wrinkles of the physics-based simulation (PBS) it is learned from, while running more than 1000 times faster. In contrast to classical PBS, TailorNet is easy to use and fully differentiable, which is crucial for computer vision and learning algorithms. Several experiments demonstrate that TailorNet produces more realistic results than previous work, and it even generates temporally coherent deformations on sequences of the AMASS dataset, despite being trained on static poses from a different dataset. To stimulate further research in this direction, they used a dataset consisting of 55,800 frames.
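The narrow-bandwidth mixture weighting admits a simple sketch: an RBF kernel over shape-style codes produces weights that are effectively sparse, so only exemplars with similar high-frequency behaviour contribute. Everything below (shapes, the distance measure, the bandwidth value) is an assumption for illustration, not TailorNet's released code.

```python
# Sketch of narrow-bandwidth kernel mixing of K shape-style specific predictions.
import numpy as np

def mixture_weights(query, exemplars, bandwidth=0.1):
    """query: (d,) shape-style code; exemplars: (K, d); returns (K,) weights."""
    d2 = np.sum((exemplars - query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))    # small bandwidth -> near-sparse mixture
    return w / (w.sum() + 1e-12)

def high_frequency_displacement(query, exemplars, hf_preds):
    """hf_preds: (K, V, 3) per-exemplar high-frequency vertex displacements."""
    w = mixture_weights(query, exemplars)
    return np.tensordot(w, hf_preds, axes=1)    # weighted sum over the K models
```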
In [16], the authors present a simple yet effective method to automatically transfer the textures of clothing images (front and back) to 3D garments worn on top of the SMPL body model, in real time. They first automatically compute training pairs of images aligned with 3D garments using a custom non-rigid 3D-to-2D registration method, which is accurate but slow. Using these pairs, they learn a mapping from pixels to the 3D garment surface. The idea is to learn dense correspondences from garment image silhouettes to a 2D UV map of the 3D garment surface using shape information alone, completely ignoring texture, which allows generalization to a wide range of web images. Several experiments demonstrate that the model is more accurate than widely used baselines such as thin-plate-spline warping and image-to-image translation networks, while being orders of magnitude faster. The model opens the door to applications such as virtual try-on and allows for the generation of 3D humans with varied textures, which is necessary for learning.
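Once such a dense correspondence is available, the final texture-baking step reduces to a single resampling call, as the toy sketch below shows. The correspondence field itself, which is what [16] actually learns, is treated here as a given input.

```python
# Toy texture-baking step: resample a garment photo into a UV texture map,
# given a known texel-to-pixel correspondence field.
import cv2
import numpy as np

def bake_uv_texture(photo, corr_x, corr_y):
    """photo: (H, W, 3) garment image; corr_x/corr_y: (Ht, Wt) float maps giving,
    for every texel of the UV texture, its source pixel location in `photo`."""
    return cv2.remap(photo, corr_x.astype(np.float32), corr_y.astype(np.float32),
                     interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
```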
III. CONSOLIDATED TABLE
| SL NO | REFERENCE | YEAR | DESCRIPTION | LIMITATIONS |
|-------|-----------|------|-------------|-------------|
| 1 | [1] | 2019 | Parser-Free Virtual Try-On Network (PF-VTON) with a new geometric matching module and a U-Transformer for realistic try-on synthesis. | - |
| 2 | [2] | 2019 | Multi-stage framework for virtual try-on under arbitrary poses; a Tree-Block preserves detail in clothing and facial regions. | - |
| 3 | [3] | 2020 | ACGPN: adaptively generates or preserves image content from a predicted semantic layout for photo-realistic try-on. | - |
| 4 | [4] | 2020 | CP-VTON: clothing shape and texture preserving image-based virtual try-on. | - |
| 5 | [5] | 2018 | Diversified arbitrary style transfer via deep feature perturbation (DFP) integrated into WCT-based methods. | - |
| 6 | [6] | 2017 | First image-based full-body generative model of people in clothing, conditioned on pose, shape, or colour. | Possible dataset bias; struggles with intricate patterns and textures; complexity hinders scaling to large datasets or real time. |
| 7 | [7] | 2019 | Semantic image synthesis with spatially-adaptive normalization for photorealistic outputs across varied scenes. | Complex designs may look unrealistic; high abstraction omits fine details such as folds and wrinkles. |
| 8 | [8] | 2018 | Deep network features used as a perceptual metric of image quality, outperforming PSNR and SSIM. | Restricted to natural photographs and to ImageNet-trained features. |
| 9 | [9] | 2019 | Gesture-based body recognition that initiates marketplace actions from spatial-data body models. | - |
| 10 | [10] | 2019 | Kinect-based virtual dressing room: user extraction, skin-colour detection, and joint-based alignment of 2D cloth models. | - |
| 11 | [11] | 2019 | GANalyze: GAN framework for studying visual features behind memorability, aesthetics, and emotional valence. | - |
| 12 | [12] | 2020 | Garment modelling from standard body measurements; 3D garment and body reconstruction from RGB/RGB-D images. | - |
| 13 | [13] | 2020 | AR dressing room combining simulated try-on with a tactile fabric experience via camera, projection surface, and visual tags. | - |
| 14 | [14] | 2019 | GarNet: two-stream network for fast, accurate 3D cloth draping, about 100x faster than physics-based simulation. | - |
| 15 | [15] | 2020 | TailorNet: predicts 3D clothing deformation from pose, shape, and style while retaining wrinkle detail. | - |
| 16 | [16] | 2020 | Real-time transfer of clothing-image textures to 3D garments via learned pixel-to-surface correspondences. | - |
IV. ACKNOWLEDGEMENT
Any achievement does not depend solely on individual efforts but on the guidance, encouragement, and cooperation of intellectuals, elders, and friends. We extend our sincere thanks to Dr. Kamalakshi Naganna, Professor and Head, Department of Computer Science and Engineering, Sapthagiri College of Engineering, and Bhanujyothi H C, Assistant Professor, Department of Computer Science and Engineering, Sapthagiri College of Engineering, for the constant support, advice and regular assistance throughout the work. Finally, we thank our parents and friends for their moral support.
V. CONCLUSION
The popularity of online shopping, and people's desire to make full use of it when buying clothes, justify the need for an algorithm that digitally dresses them in the chosen clothing. A common problem customers run into when shopping for clothing is having to spend hours physically trying on a range of outfits; the time available might not be enough, and the process can be exhausting. The suggested remedy for this issue is a virtual styling room that serves as a trial room using a live video feed. The nodes and keypoints of the human body are plotted using the pose estimation module, and this information is then used to render an image of clothing over the user's body, obviating the need for physical fittings and saving time and effort. Online buyers would greatly appreciate being able to see themselves in many different outfits with fewer limits, which is the main advantage of this technology. We conclude that this approach genuinely saves time and demands no extra work. Anyone who is not technically inclined can use this virtual dressing room; it does not call for much technical expertise and is hence accessible to everyone, making it a perfect addition for anyone who loves clothes. Overall, the suggested virtual dressing room appears to be a great option for accurate and speedy virtual clothes fitting.
REFERENCES
[1] W. L. Hsiao, I. Katsman, C. Y. Wu et al., "Fashion++: Minimal Edits for Outfit Improvement," in IEEE International Conference on Computer Vision, pp. 5047-5056, 2019.
[2] J. Wang, W. Zhang, W. Liu et al., "Down to the Last Detail: Virtual Try-on with Detail Carving," 2019.
[3] H. Yang, R. Zhang, X. Guo et al., "Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content," in IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7850-7859, 2020.
[4] M. R. Minar, T. T. Tuan, H. Ahn et al., "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On," in IEEE International Conference on Computer Vision Workshops, 2020.
[5] L. Liu, H. Zhang, X. Zhao et al., "Collocating Clothes with Generative Adversarial Networks Co-supervised by Categories and Attributes: A Multi-discriminator Framework," in IEEE TNNLS, 2019.
[6] C. Lassner, G. Pons-Moll and P. V. Gehler, "A Generative Model of People in Clothing," in IEEE International Conference on Computer Vision, pp. 853-862, 2017.
[7] T. Park, M. Y. Liu and T. C. Wang, "Semantic Image Synthesis with Spatially-Adaptive Normalization," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 2337-2346, 2019.
[8] R. Zhang, P. Isola, A. A. Efros et al., "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 586-595, 2018.
[9] Y. Jo and J. Park, "SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color," in IEEE International Conference on Computer Vision, pp. 1745-1753, 2019.
[10] C. H. Lee, Z. Liu, L. Wu et al., "MaskGAN: Towards Diverse and Interactive Facial Image Manipulation," in IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5549-5558, 2020.
[11] A. Masri and M. Al-Jabi, "Virtual Dressing Room Application," Computer Engineering Department, An-Najah National University, Nablus, Palestine, 2019.
[12] J. Johnson, A. Alahi and L. Fei-Fei, "Perceptual Losses for Real-Time Style Transfer and Super-Resolution," in European Conference on Computer Vision, Springer, Cham, pp. 694-711, 2019.
[13] A. Mir, T. Alldieck and G. Pons-Moll, "Learning to Transfer Texture from Clothing Images to 3D Humans," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[14] C. Patel, Z. Liao and G. Pons-Moll, "TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style," in IEEE Conference on Computer Vision and Pattern Recognition, 2020.
[15] E. Gundogdu, V. Constantin, A. Seifoddini et al., "GarNet: A Two-Stream Network for Fast and Accurate 3D Cloth Draping," in IEEE/CVF International Conference on Computer Vision (ICCV), 2019.
[16] A. Mir, T. Alldieck and G. Pons-Moll, "Learning to Transfer Texture from Clothing Images to 3D Humans," in IEEE Conference on Computer Vision and Pattern Recognition, 2020.
Copyright © 2023 Vishal R, Praneet V K, Suvidha Manjunath Naik, Varun S Raju, Bhanujyothi H C. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET51835
Publish Date : 2023-05-09
ISSN : 2321-9653
Publisher Name : IJRASET