Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Nihar Ranjan
DOI Link: https://doi.org/10.22214/ijraset.2023.55459
Recently developed deep neural networks offer insight into how high-level image representations can be learned automatically from raw pixels. Deep learning with convolutional neural networks (CNNs) has shown significant potential in image classification and enhancement, but is frequently unsuitable for predictive modelling when the data lack spatial relationships. We offer a method of representation that organizes high-dimensional vectors compactly for CNN-based deep learning. This study demonstrates the use of deep neural networks to build a system that can identify different texture characteristics.
I. INTRODUCTION
Texture is a vital component in computer graphics for many applications. Texture generally represents the surface appearance of an object, although its definition is slightly loose, and it does not reflect the shape of the object. For example, a full human face image is typically not a texture, but a close-up of human skin is. Artists use textures in rendering to give surface detail to objects without increasing geometric complexity. In image processing, texture is used to represent surface types independently of shape. Texture can be thought of as a basic element that captures the appearance of object surfaces. Accurate classification of textures is also fundamental to many important applications, such as inspection and segmentation in image processing and the generation of texture databases for rendering. At the same time, texture feature representation is a challenging problem, since textures often vary considerably within the same class due to changes in viewpoint, scale, lighting configuration, etc. In addition, textures usually do not contain enough information about the shape of objects, which is informative for distinguishing different objects in image classification tasks. Due to such difficulties, even the latest approaches based on convolutional neural networks have achieved only limited success compared to other tasks such as image classification. We propose a unification of two major classification approaches, convolutional neural networks and spectral analysis, to address the difficulty of texture feature representation.
In the last three years, mainly due to advances in deep learning, more concretely convolutional networks, the quality of image recognition and object detection has been progressing at a dramatic pace. One encouraging piece of news is that most of this progress is not just the result of more powerful hardware, larger datasets, and bigger models, but mainly a consequence of new ideas, algorithms, and improved network architectures. The main idea of the Inception architecture is to find out how an optimal local sparse structure in a convolutional vision network can be approximated and covered by readily available dense components. Note that assuming translation invariance means that our network will be built from convolutional building blocks; all we need is to find the optimal local construction and repeat it spatially. Arora et al. suggest a layer-by-layer construction in which one analyzes the correlation statistics of the last layer and clusters them into groups of units with high correlation. These clusters form the units of the next layer and are connected to the units of the previous layer. We assume that each unit from the earlier layer corresponds to some region of the input image, and these units are grouped into filter banks. In the lower layers (the ones close to the input), correlated units concentrate in local regions. This means we would end up with many clusters concentrated in a single region, and they can be covered by a layer of 1×1 convolutions in the next layer, as suggested in [12]. However, one can also expect a smaller number of more spatially spread-out clusters that can be covered by convolutions over larger patches, and a decreasing number of patches over larger and larger regions. To avoid patch-alignment issues, current incarnations of the Inception architecture are restricted to filter sizes 1×1, 3×3, and 5×5; this decision was based more on convenience than necessity. The suggested architecture is therefore a combination of all these layers, with their output filter banks concatenated into a single output vector that forms the input of the next stage. Additionally, since pooling operations have been essential to the success of current state-of-the-art convolutional networks, adding an alternative parallel pooling path in each such stage should have an additional beneficial effect.
Convolutional neural networks (CNNs) process an input texture and collect statistics in the spatial domain, whereas spectral analysis transforms an input texture into the spectral domain and uses frequency statistics. CNNs are usually good at capturing spatial features, while spectral analysis is good at capturing scale-invariant features. We aim to consider both spatial and spectral information so that both types of features are captured well within a single model. The key idea is that the pooling layer and the convolution layer in CNNs can be seen as a limited form of spectral analysis. Based on this idea, we generalize both layers to perform spectral analysis using multiresolution analysis by wavelet transform; we therefore name our model wavelet convolutional neural networks (wavelet CNNs). An overview of wavelet CNNs is shown in the architecture figure; our model is thus easier to train and consumes less memory than conventional CNNs. To summarize, our contributions are: (1) a combination of CNNs and spectral analysis using GoogLeNet; (2) accurate and efficient texture feature representation using our model; and (3) several numerical experiments in the results section that validate that our model successfully classifies the failure cases of existing models. Fuzzy image processing is special in its relation to other computer vision techniques. It is not a solution for one special task, but rather describes a new class of image processing techniques; it provides a new methodology, augmenting classical logic as a component of any computer vision tool, and a new type of image understanding and treatment has to be developed. Fuzzy image processing can be a single image processing routine, or it can complement parts of a complex image processing chain.
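To make the pooling-as-spectral-analysis idea concrete, the following sketch shows a single-level Haar decomposition whose low-pass band coincides with 2×2 average pooling, while the detail bands keep the high-frequency information that plain pooling discards. It is an illustrative toy under an averaging normalization convention, written here in PyTorch by assumption; it is not the paper's exact wavelet CNN.

```python
import torch

def haar_decompose(x):
    """Single-level 2-D Haar split of x with shape (N, C, H, W), H and W even."""
    a = x[:, :, 0::2, 0::2]   # top-left of each 2x2 block
    b = x[:, :, 0::2, 1::2]   # top-right
    c = x[:, :, 1::2, 0::2]   # bottom-left
    d = x[:, :, 1::2, 1::2]   # bottom-right
    low = (a + b + c + d) / 4            # low-pass band == 2x2 average pooling
    lh  = (a - b + c - d) / 4            # horizontal detail
    hl  = (a + b - c - d) / 4            # vertical detail
    hh  = (a - b - c + d) / 4            # diagonal detail
    return low, torch.cat([lh, hl, hh], dim=1)  # keep detail as extra channels

# Sanity check: the low band is exactly what 2x2 average pooling would produce.
x = torch.randn(1, 3, 8, 8)
low, detail = haar_decompose(x)
assert torch.allclose(low, torch.nn.functional.avg_pool2d(x, 2))
```

A wavelet CNN can then feed the detail bands back into later stages instead of discarding them, which is the sense in which ordinary pooling is a restricted special case.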
II. LITERATURE SURVEY
1. Apostolos Chondronasios, Ivan Popov, Ivan Jordanov (2015) et al
A computer-based vision method has been developed in this paper to inspect surface defects on aluminum profiles. Two fault types, blisters and scratches, were studied, and the analysis divided surfaces into three classes (non-defective, blister, scratch). For this application, the authors carried out feature selection, which resulted in 98.6 percent precision with only two features. The high detection precision combines the existing field literature with a new approach to the variables, namely selecting and modifying co-occurrence matrix statistical characteristics computed from the Sobel gradient magnitude of the image. The authors dubbed this matrix GOCM to differentiate it from the standard GLCM and GLGCM approaches. Although the aluminum surface texture is almost stochastic, making flaws extremely difficult to detect, they showed that using GOCM statistical features is better suited to extruded aluminum surface inspection. More examples of various defect forms, such as die lines, pick-up, breaking, stamping contours, welding lines, and black lines, can be added to further enhance the analysis. The accuracy is predicted to fall when further defects are detected with only two characteristics, since the features lose their ability to discriminate. Related faults can best be distinguished using different types of cameras and measurement devices. Future study could also assess the performance of various classification schemes, for example support vector machines, and compare them with the performance of neural networks. Last but not least, an important path for future research is checking the suggested GOCM methodology against regular texture benchmarks.
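As a rough illustration of the GOCM idea, the sketch below computes co-occurrence statistics on the quantized Sobel gradient magnitude rather than on raw gray levels. The scikit-image calls are standard (version 0.19+ naming); the function name, quantization level, distance, and angle choices are assumptions for illustration, not the settings used in the cited paper.

```python
import numpy as np
from skimage import io, filters
from skimage.feature import graycomatrix, graycoprops

def gocm_features(image_path, levels=32):
    """Haralick-style statistics of a co-occurrence matrix built on the
    Sobel gradient magnitude (GOCM-style) instead of raw gray levels."""
    img = io.imread(image_path, as_gray=True)
    grad = filters.sobel(img)                          # gradient magnitude
    q = np.uint8(grad / (grad.max() + 1e-9) * levels)  # quantize to 0..levels-1
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    # Average each statistic over the sampled distances and angles.
    return {p: graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")}
```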
2. MyeongAh Cho, Taeoh Kim, Ig-Jae Kim, Kyungjae Lee, and Sangyoun Lee (2021) et al
The Relative Graph Module (RGM) derives relational knowledge of each identity by embedding every facial component into a node vector and modeling the relationships between them. A hierarchical methodology focused on extracting relationships solved the problem of divergence between HFR domains. In addition, by plugging in a pre-trained face feature extractor and fine-tuning, the RGM resolved the issue of the lack of suitable HFR databases. Moreover, node-wise recalibration was performed through the Node Attention Unit (NAU) to concentrate on globally informative nodes among the propagated node vectors. The authors' new C-softmax loss helped to adjust conventional projection spaces by increasing the level of similarity. They also applied the RGM module to a number of pre-trained backbones and observed improved results on NIR-to-VIS and Sketch-to-VIS tasks. In addition, each proposed component demonstrated its effect through improved performance in ablation studies. Moreover, the visualization of relational knowledge in VIS, NIR, and sketch pictures revealed that relationships within the face are identical in each domain, showing representative domain-invariant characteristics. The authors' methodology outperformed state-of-the-art approaches on CASIA NIR-VIS 2.0, IIIT-D Sketch, BUAA-VisNir, Oulu-CASIA NIR-VIS, and TUFTS.
3. Z. Chen, R. R. Derakhshani, C. Halmen, and J. T. Kevern et al
The authors used a DSLR camera with macro lighting. Perpendicular and angled lighting was used for 11 specimens containing lightly and mildly cracked concrete surfaces.
Textural characteristics were derived from gray-level co-occurrence matrix data, from which 3-6 features were chosen. Cross-validation accuracies with a neural network classifier were as high as 94 percent, suggesting the feasibility of fast, automated concrete crack evaluation with COTS digital imagery.
4. Lin Chen, Meng Yang (2016) et al
This paper suggested a semi-supervised paradigm for discriminative dictionary learning. By combining the label distribution with the class-specific reconstruction error of each unlabeled sample, the class of an unlabeled sample can be estimated, so the authors' model can be trained more accurately.
The discriminative property of labeled training data is also exploited by using the concept of separability and reducing the within-class scatter of the coding coefficients. Several trials, including face recognition, digit recognition, and texture classification, demonstrate the method's superiority over supervised and other semi-supervised dictionary learning approaches. Further classification questions may be addressed in the future, for example cases where the training samples do not belong to any recognized class.
5. Kaveri Chatra, Venkatanareshbabu Kuppili, Damodar Reddy Edla (2019) et al
This paper suggests a new approach to texture image classification, called BDADNN, based on the fitness function of the Binary Dragonfly Algorithm combined with a deep neural network. It begins with two-level thresholding to decompose gray images into a number of binary images. Fractal dimensions are measured to capture the boundary complexity of the binary images, and GLCM and GLRLM matrix features are derived to capture the spatial pixel dependences in the gray image. The fusion of fractal-dimension characteristics with GLCM and GLRLM features also proves to be a strong solution.
However, to reduce the resulting high dimensionality, the authors chose a nature-inspired algorithm, the Binary Dragonfly Algorithm, with a novel fitness function for feature selection. The proposed fitness function is formulated both to optimize the deep neural network and to minimize the number of characteristic features of the input data. The approach was assessed experimentally with different groups on two well-known texture image datasets, Textured Surfaces and KTH-TIPS. Cross-validation is employed to consider the effect of training-set size on classification efficiency. The performance of the suggested solution is compared to an SVM; in terms of correct labeling, the proposed procedure exceeds the SVM, and the rise in classification accuracy is statistically significant.
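The cited paper's novel fitness function is not reproduced here; the sketch below only shows the generic wrapper-style trade-off that such feature-selection fitness functions typically encode, with an assumed weighting alpha between classifier error and the fraction of features retained.

```python
def bda_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Generic wrapper-style fitness for binary feature selection:
    lower is better; alpha trades accuracy against feature-subset size."""
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)

# Example: 5% error using 12 of 60 features.
print(bda_fitness(0.05, 12, 60))
```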
6. Hyun Sung Chang and Kyeongok Kang (2005) et al
In this article, the authors introduced a new algorithm, useful in a broad range of applications, to detect and identify edge components at the block level [7], [8], [14], [16]. The scheme, derived systematically from a pixel-domain algorithm, performs with minimal arithmetic operations (especially multiplications) in the DCT coefficient domain. As outlined in Sections IV and V, the proposed approach also applies to moving images without significantly increasing complexity, so it can be used for effective video analyses such as scene recognition and classification; this is one of the authors' ongoing research topics.
The paper suggests a new technique, which uses gray-scale images and a location histogram with horizontal and vertical edge processing and segmentation, to reliably detect the desired ROI.
The authors' data collection, which includes different lighting conditions, distances, and resolutions, is used for the experimental evaluation of their algorithm.
The outcome of the recommended work is highly successful, providing a detection rate of 93.34 percent for vehicle license plates, which illustrates the applicability of the proposed work to license plate classification. The method is nevertheless vulnerable to variables that degrade the detection rate, such as broken license plates, tags and stamps placed on the car's body parts, plate location, appearance, and exterior factors.
These problems can be addressed by different image processing techniques, including histogram equalization and high dynamic range (HDR) imaging. Some issues, such as number plate identification under image rotation, must be resolved in the future.
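For illustration, here is a minimal sketch of the edge-projection idea described above: vertical edges dominate license plate regions, so thresholded row and column histograms of an edge map suggest a candidate ROI. The OpenCV call is standard; the function name and the threshold fraction are assumptions, not the cited paper's exact method.

```python
import numpy as np
import cv2

def plate_roi(gray, frac=0.5):
    """Candidate plate bounding box from peaks of vertical-edge projections."""
    mag = np.abs(cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3))  # vertical edges
    row_hist = mag.sum(axis=1)        # location histogram over rows
    col_hist = mag.sum(axis=0)        # location histogram over columns
    rows = np.where(row_hist > frac * row_hist.max())[0]
    cols = np.where(col_hist > frac * col_hist.max())[0]
    return rows.min(), rows.max(), cols.min(), cols.max()
```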
III. PROPOSED METHODOLOGY
Texture representation, i.e., the extraction of features that describe texture information, is at the core of texture analysis. As a classical pattern recognition problem, texture classification primarily consists of two critical subproblems: texture representation and classification. It is generally agreed that the extraction of powerful texture features plays the relatively more important role, since if poor features are used, even the best classifier will fail to achieve good results.
Image classification means assigning a label to an input image from a fixed set of categories. It has a variety of applications, such as robot design, object identification, autonomous cars, and traffic signal processing. Feature extraction is the most important task for image representation in classification problems. DNNs are applied to large-scale datasets to learn image representations and reuse them for classification.
The objective of this paper is to apply a convolutional neural network to the image classification problem. A DNN architecture is proposed. To test the performance of the CNN, we have used GoogLeNet.
The main objective of the convolutional layer is to obtain the features of an image by sliding a smaller matrix (kernel or filter) over the entire image and generating feature maps. The pooling layer retains the most important aspects by reducing the feature maps. The fully connected layer interconnects every neuron in the layer with the neurons of the previous and next layers; it takes the matrix inputs from the previous layers and flattens them to pass on to the output layer, which makes the prediction.
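The three stages just described map directly onto a few lines of PyTorch. The toy network below is a sketch with assumed layer sizes and an assumed 32×32 RGB input; it is not the architecture used in this work.

```python
import torch
import torch.nn as nn

class TinyTextureNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # kernel slides -> feature maps
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                             # keep salient responses
        )
        # 16 channels at 16x16 after pooling a 32x32 input
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))             # flatten, then predict

# Example: a batch of four 32x32 RGB images -> four class-score vectors.
scores = TinyTextureNet()(torch.randn(4, 3, 32, 32))
print(scores.shape)  # torch.Size([4, 10])
```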
In recent years, machine learning (ML) has produced numerous insights from the surge of data generated in diverse areas. In computer science, the representation of an image can take many forms. Most of the time, it refers to the way the conveyed information, such as color, is coded digitally and how the image is stored, i.e., how an image file is structured. Several open or proprietary standards have been proposed to create, manipulate, store, and exchange digital images. They describe the format of image files, the algorithms for image encoding such as compression, and the format of additional information often called metadata. Separately, the visual content of the image can also take part in its representation.
Figure: System Architecture
Mathematical Model:
A. System Description
S = {I, F, O}, where:
I (input): texture images
F (functions):
F1 = image processing applied on natural textures
F2 = feature extraction from images
O (output):
R1 = model creation from training
R2 = model-based image testing
B. Space Complexity
The space complexity depends on the presentation and visualization of the discovered patterns: the more data stored, the greater the space complexity.
C. Time Complexity
We use GoogLeNet for fast recognition with higher accuracy, so the time complexity is low. The time complexity of this algorithm is O(????????).
1. Success:
a. High accuracy is achieved by using all types of image datasets.
b. Users get results very fast, according to their needs.
2. Failures:
a. A huge database can lead to more time being consumed to retrieve the information.
b. Hardware failure.
c. Software failure.
Mathematical Model in Equation Format:
A Neuro-Fuzzy Function:
A neuro-fuzzy function is a fuzzy system that uses a learning algorithm derived from or inspired by neural network theory to determine its parameters (fuzzy sets and fuzzy rules) by processing texture images. The neuro-fuzzy function is explained in more detail below.
Modern neuro-fuzzy functions are usually represented as special multilayer feedforward neural networks (see, for example, models such as ANFIS [13], FuNe [12], Fuzzy RuleNet [16], GARIC [8], or NEFCLASS and NEFCON [14]).
However, fuzzifications of other neural network architectures are also considered, for example self-organizing feature maps [9, 17]. In those neuro-fuzzy networks, the connection weights and the propagation and activation functions differ from those of common neural networks. Although there are many different approaches [10, 11, 14, 15], we usually use the term neuro-fuzzy function for approaches that display the following properties:
A neuro-fuzzy function depends on a fuzzy system that has been trained using a neural-network-based learning algorithm. The learning procedure operates on local information and results in only small variations to the overall fuzzy structure.
A neuro-fuzzy function can be viewed as a three-layer feedforward neural network. The first layer represents the pixels of the input image, the middle (hidden) layer represents the fuzzy rules, and the third layer reflects the texture representation. (Fuzzy) connection weights are used to encode the fuzzy sets. It is not always necessary to interpret a fuzzy inference system in this manner in order to introduce supervised learning, but the interpretation can be useful, since it reflects the flow of data during processing and training within the system. A five-layer structure is often used, with the fuzzy sets defined in the units of the second and fourth layers.
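A minimal sketch of such a three-layer structure follows, assuming Gaussian membership functions and a product t-norm; it is a generic ANFIS-like toy written in PyTorch by assumption, not this paper's exact model.

```python
import torch
import torch.nn as nn

class TinyNeuroFuzzy(nn.Module):
    def __init__(self, n_inputs, n_rules, n_classes):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_rules, n_inputs))  # fuzzy set centers
        self.widths  = nn.Parameter(torch.ones(n_rules, n_inputs))   # fuzzy set widths
        self.out     = nn.Linear(n_rules, n_classes)                 # rules -> classes

    def forward(self, x):                          # x: (batch, n_inputs)
        diff = (x.unsqueeze(1) - self.centers) / self.widths
        memberships = torch.exp(-0.5 * diff ** 2)  # layer 1: Gaussian fuzzification
        rule_strength = memberships.prod(dim=-1)   # layer 2: product t-norm per rule
        return self.out(rule_strength)             # layer 3: texture class scores

# Example: 64 pixel features, 8 fuzzy rules, 4 texture classes.
scores = TinyNeuroFuzzy(64, 8, 4)(torch.randn(2, 64))
print(scores.shape)  # torch.Size([2, 4])
```

Because the membership parameters are ordinary tensors, the fuzzy sets can be tuned by backpropagation, which is exactly the neural-network-based learning described above.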
IV. ALGORITHM USED: DEEP NEURAL NETWORK
The deep neural network paradigm provides the mainstream classifier for image data categorization. CNNs quickly came to dominate the field of visual recognition with highly accurate results, despite the attempts of the scientific community to realize visual target detection and classification using standard machine learning methods. Like many deep neural network concepts, CNNs need minimal prior understanding of the training data compared to standard machine learning methods; this is a significant benefit of CNNs over traditional learning processes. A CNN has several hidden layers plus one input layer and one output layer. The hidden CNN layers comprise convolution, pooling, fully connected, and normalization layers.
The input (in our case) is the target picture to be identified, and the output is the predicted texture class of the picture. There is also a cost function that identifies the most appropriate set of parameters, together with an activation function for the final output. Our method uses a CNN classification model on the backend server to precisely identify the texture representations of the picture.
GoogLeNet is a 22-layer deep convolutional neural network built by Google researchers as a version of the Inception network. In the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14), this architecture successfully addressed computer vision challenges such as image classification and object recognition. GoogLeNet is currently used for various computer vision tasks such as face detection and recognition, adversarial training, etc.
A neural network structure dubbed the Inception module improves image classification and object identification; the network built from it is known as GoogLeNet. In our project, the GoogLeNet CNN architecture is employed. The architecture of GoogLeNet maximizes the usage of computing resources: it increases the breadth and depth of the neural network at the lowest cost. The quality of the architectural optimization rests on the Hebbian principle and the avoidance of redundant calculations. The rectified linear activation function is used by every convolutional layer. GoogLeNet comprises 22 layers, with 21 convolutional layers linked to one fully connected layer. The network consists of blocks of convolutional layers that are applied repeatedly to spatial features to determine the optimal way to model a local region. The lowest layers focus on local regions of the input, and the next layer applies 1×1 convolutions. In the following convolutions, the filter sizes vary among 1×1, 3×3, and 5×5 to avoid patch-alignment issues. Finally, a softmax classifier provides the output classification of the supplied input picture.
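The sketch below shows an Inception-style block as just described: parallel 1×1, 3×3, and 5×5 convolutions plus a pooling path, concatenated along the channel axis. The channel counts are illustrative assumptions, not GoogLeNet's exact configuration.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, 1)                          # 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 16, 1),           # 1x1 reduce
                                nn.Conv2d(16, 32, 3, padding=1))   # 3x3 conv
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, 8, 1),            # 1x1 reduce
                                nn.Conv2d(8, 16, 5, padding=2))    # 5x5 conv
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, 16, 1))           # pooling path

    def forward(self, x):
        # Concatenate all branch outputs along the channel dimension.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

# Example: a 64-channel feature map becomes an 80-channel map (16+32+16+16).
y = InceptionBlock(64)(torch.randn(1, 64, 28, 28))
print(y.shape)  # torch.Size([1, 80, 28, 28])
```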
V. CONCLUSION
We will use this approach, a CNN based on a DNN with fuzzy logic, for representing and classifying images. The architecture is proposed as a DNN. To test the performance of the CNN, we used the MNIST dataset. The primary objective of the convolutional layer is to slide a kernel (filter) over the complete image and generate feature maps that acquire the picture's characteristics. The pooling layer preserves the most important aspects while reducing the feature maps. The fully connected layers connect each of the layer's neurons to the neurons of the previous and following layers, taking the input matrices of the previous layer. Future work will be based on various texture representations and a comparison of their results.
REFERENCES
[1] Apostolos Chondronasios, Ivan Popov, Ivan Jordanov, "Feature selection for surface defect classification of extruded aluminum profiles," Springer-Verlag London, 2015.
[2] MyeongAh Cho, Taeoh Kim, Ig-Jae Kim, Kyungjae Lee, and Sangyoun Lee, "Relational Deep Feature Learning for Heterogeneous Face Recognition," IEEE, 2020.
[3] Nihar Ranjan, Zubair Ghouse, "A Multi-function Robot for Military Application," Imperial Journal of Interdisciplinary Research (IJIR), Vol. 3, Issue 3, ISSN: 2454-1362, pp. 1785-1788, 2017.
[4] Z. Chen, R. R. Derakhshani, C. Halmen, and J. T. Kevern, "A Texture-based Method for Classifying Cracked Concrete Surfaces from Digital Images using Neural Networks."
[5] Lin Chen, Meng Yang, "Semi-supervised dictionary learning with label propagation for image classification," Springer, 2016.
[6] Nihar Ranjan, Midhun C., "A Brief Survey of Machine Learning Algorithms for Text Document Classification on Incremental Database," TEST Engineering and Management, ISSN: 0193-4120, Vol. 83, pp. 25246-25251, June 2020.
[7] R. V. Darekar, Nihar Ranjan, et al., "A hybrid meta-heuristic ensemble based classification technique for speech emotion recognition," Advances in Engineering Software, Elsevier, Vol. 180, June 2023.
[8] Kaveri Chatra, Venkatanareshbabu Kuppili, Damodar Reddy Edla, "Texture Image Classification Using Deep Neural Network and Binary Dragonfly Optimization with a Novel Fitness Function," Springer Science+Business Media, 2019.
[9] Deepak Mane, Ranjeet Bidve, et al., "Traffic Density Classification for Multiclass Vehicles Using Customized Convolutional Neural Network for Smart City," Lecture Notes in Networks and Systems, Springer Nature, pp. 1015-1030, September 2022.
[10] Hyun Sung Chang and Kyeongok Kang, "A Compressed Domain Scheme for Classifying Block Edge Patterns," IEEE, 2005.
[11] Manisha Gawade, Tejashree Mane, et al., "Text Document Classification by using WordNet Ontology and Neural Network," International Journal of Computer Applications (0975-8887), Vol. 182, No. 33, December 2018.
[12] Akarsh Aggarwal, Anuj Rani, Manoj Kumar, "A robust method to authenticate car license plates using segmentation and ROI based approach," Smart and Sustainable Built Environment.
[13] V. Brindha Devi, Nihar M. Ranjan, Himanshu Sharma, "IoT Attack Detection and Mitigation with Optimized Deep Learning Techniques," Cybernetics and Systems, Taylor & Francis, December 202.
[14] Nihar M. Ranjan, Rajesh S. Prasad, "Text Analytics: An Application of Text Mining," Journal of Data Mining and Management, Vol. 6, Issue 3, September-December 2021.
[15] Shikha Singh, Priti Sarote, et al., "Detection of Parkinson's Disease using Machine Learning Algorithm," International Journal of Computer Applications (0975-8887), Vol. 184, No. 6, April 2022.
Copyright © 2023 Nihar Ranjan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET55459
Publish Date : 2023-08-22
ISSN : 2321-9653
Publisher Name : IJRASET