In this paper, we present a novel approximate computing scheme for energy-efficient multiply-accumulate (MAC) processing. First, we design approximate 4-2 compressors that generate errors in opposite directions while minimizing computational cost. Based on probabilistic analysis, positive and negative multipliers are then developed to provide similar error distances. Simulation results on various practical applications show that the proposed MAC processing improves energy efficiency by extending the range of the approximate parts. The design is implemented in Verilog HDL and simulated with ModelSim 6.4c. Performance is measured using the synthesis process of the Xilinx tool. The proposed Sobel edge detection algorithm uses approximation methods to replace complex operations. This design is carried out in MATLAB and ModelSim using hdldaemon, and the proposed multipliers replace the exact multipliers in the Sobel operator-based image edge detection.
I. INTRODUCTION
In error-tolerant applications such as multimedia signal processing and data mining, exact computing units are not always necessary and can be replaced with approximate counterparts. Research on approximate computing for error-tolerant applications is on the rise. Adders and multipliers are the key components in these applications. Approximate full adders have been proposed at the transistor level and utilized in digital signal processing applications. These full adders are used in the accumulation of partial products in multipliers. To reduce the hardware complexity of multipliers, truncation is widely employed in fixed-width multiplier designs. A constant or variable correction term is then added to compensate for the quantization error introduced by the truncated part. Approximation techniques in multipliers focus on the accumulation of partial products, which is crucial in terms of power consumption. In the broken array multiplier, the least significant bits of the inputs are truncated while forming partial products to reduce hardware complexity. This multiplier saves a few adder circuits in partial product accumulation. Two designs of approximate 4-2 compressors have been presented and used in the partial product reduction tree of four variants of an 8 × 8 array multiplier. The major drawback of these compressors is that they give a nonzero output for zero-valued inputs, which largely affects the mean relative error (MRE), as discussed later. The approximate design proposed in this brief overcomes this drawback, which leads to better precision. In the static segment multiplier (SSM), m-bit segments are derived from n-bit operands based on the leading 1 bit of the operands. Then, an m × m multiplication is performed instead of an n × n multiplication, where m < n. The partial product perforation (PPP) multiplier omits k successive partial products starting from the jth position of an n-bit multiplier, where j ∈ [0, n-1] and k ∈ [1, min(n-j, n-1)].
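The perforation rule above can be made concrete with a short behavioral model. The following Python sketch is illustrative only: the function name `perforated_multiply` is hypothetical, and it assumes unsigned operands with partial products indexed by the bits of the second operand, which the text does not specify.

```python
def perforated_multiply(a: int, b: int, n: int, j: int, k: int) -> int:
    """Behavioral model of an n-bit unsigned multiply that omits the
    k successive partial products starting at position j.

    Assumption: partial product i is (a << i) when bit i of b is set.
    """
    if not 0 <= j <= n - 1:
        raise ValueError("j must lie in [0, n-1]")
    if not 1 <= k <= min(n - j, n - 1):
        raise ValueError("k must lie in [1, min(n-j, n-1)]")

    mask = (1 << n) - 1
    a &= mask  # restrict both operands to n bits
    b &= mask

    result = 0
    for i in range(n):
        if j <= i < j + k:
            continue  # perforated partial product: never generated or added
        if (b >> i) & 1:
            result += a << i
    return result
```

With n = 8, j = 0, and k = 2, for instance, the two least significant partial products are dropped, so the result equals a multiplied by b with its two lowest bits cleared.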
In [8], a 2 × 2 approximate multiplier based on modifying an entry in the Karnaugh map is proposed and used as a building block to construct 4 × 4 and 8 × 8 multipliers. In [9], an inaccurate counter design is proposed for use in a power-efficient Wallace tree multiplier. A new approximate adder has also been presented and utilized for partial product accumulation in the multiplier; for a 16-bit approximate multiplier, a 26% reduction in power is achieved compared to the exact multiplier. Approximation of an 8-bit Wallace tree multiplier due to voltage over-scaling (VOS) has also been discussed: lowering the supply voltage creates paths that fail to meet delay constraints, leading to errors. Previous works on logic complexity reduction focus on the straightforward application of approximate adders and compressors to the partial products. In this brief, the partial products are altered to introduce terms with different probabilities. The probability statistics of the altered partial products are analyzed, followed by systematic approximation. Simplified arithmetic units (half-adder, full-adder, and 4-2 compressor) are proposed for approximation. The arithmetic units are not only reduced in complexity, but care is also taken to keep the error value low. While systematic approximation helps achieve better accuracy, the reduced logic complexity of the approximate arithmetic units lowers power and area. The proposed multipliers outperform the existing multiplier designs in terms of area, power, and error, and achieve better peak signal-to-noise ratio (PSNR) values in an image processing application.
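To illustrate how a 2 × 2 approximate block can be composed into larger multipliers and scored with the mean relative error, the following Python sketch may help. It rests on assumptions not stated in the text: the modified Karnaugh-map entry is taken to be 3 × 3 → 7 (a widely described choice, but the entry used in [8] is not given here), the four sub-products are combined with exact adders, and all function names are hypothetical.

```python
def approx_mul2x2(a: int, b: int) -> int:
    """2 x 2 approximate multiplier.

    Assumption: the single modified entry is 3 * 3 -> 7 instead of 9,
    which lets the product fit in 3 bits.
    """
    return 7 if (a == 3 and b == 3) else a * b


def approx_mul(a: int, b: int, n: int) -> int:
    """n x n multiplier (n a power of two, n >= 2) built recursively from
    2 x 2 approximate blocks; sub-products are summed exactly (assumed)."""
    if n == 2:
        return approx_mul2x2(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    ll = approx_mul(a_lo, b_lo, h)
    lh = approx_mul(a_lo, b_hi, h)
    hl = approx_mul(a_hi, b_lo, h)
    hh = approx_mul(a_hi, b_hi, h)
    return (hh << n) + ((lh + hl) << h) + ll


def mean_relative_error(n: int) -> float:
    """MRE over all operand pairs with a nonzero exact product."""
    total = 0.0
    count = 0
    for a in range(1, 1 << n):
        for b in range(1, 1 << n):
            exact = a * b
            total += abs(exact - approx_mul(a, b, n)) / exact
            count += 1
    return total / count
```

Under these assumptions the block returns zero whenever an operand is zero, so it avoids the nonzero-output-for-zero-input behavior that the introduction identifies as harmful to the MRE.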
II. PROPOSED MULTIPLIER BLOCK DIAGRAM
Three approximate 4-2 compressors (UCAC1, UCAC2, and UCAC3) are proposed in this section. Then, the ECM is presented to detect an input pattern that occurs with a large probability and to correct the erroneous compensation in this case. Furthermore, the proposed designs are embedded in 8-bit multipliers based on the partial product tree. All analyses are performed assuming a uniform input distribution. The proposed approximate compressors and ECM are designed to simplify and accelerate the compression process; accordingly, four 8-bit multipliers (N = 8) are designed to evaluate these blocks.
1. MUL1: multiplier with UCAC1 and constant correcting bit;
2. MUL2: multiplier with UCAC1 and ECM;
3. MUL3: multiplier with UCAC2 and EC
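As a reference point for the compressor-based multipliers listed above, the sketch below models an exact 4-2 compressor built from two full adders and a small harness that tabulates the error of an approximate one. The UCAC designs are not specified in this section, so the sketch does not reproduce them: `toy_approx_compressor_4_2` is a purely hypothetical stand-in chosen for illustration, and all names are assumptions.

```python
from itertools import product


def full_adder(a: int, b: int, c: int):
    """Exact 1-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def exact_compressor_4_2(x1, x2, x3, x4, cin):
    """Exact 4-2 compressor from two cascaded full adders.

    Satisfies x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def toy_approx_compressor_4_2(x1, x2, x3, x4):
    """Hypothetical approximate compressor (NOT one of the UCAC designs).

    Drops cin/cout and produces only (sum, carry).
    """
    s = (x1 ^ x2) | (x3 ^ x4)
    carry = (x1 & x2) | (x3 & x4)
    return s, carry


def error_profile(approx):
    """Signed error (approximate value minus exact count of ones) for all
    16 input patterns, each weighted equally (uniform distribution)."""
    errors = []
    for bits in product((0, 1), repeat=4):
        s, carry = approx(*bits)
        errors.append((s + 2 * carry) - sum(bits))
    return errors
```

For this toy design, every nonzero entry returned by `error_profile` is negative, meaning it only underestimates. The abstract's idea of compressors that err in opposite directions would pair such a unit with one that overestimates, so that the errors tend to cancel during accumulation.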
Edge detection algorithms are widely used in research fields such as image processing, video processing, and artificial intelligence. Edges are among the most important attributes of image information, and many edge detection algorithms are defined in the literature. The Sobel edge detection algorithm is chosen among them because it deteriorates less under high levels of noise. The FPGA has become the dominant form of programmable logic over the past few years; it offers low investment cost and desktop testing with moderate processing speed, which makes it suitable for real-time applications. This paper describes an efficient architecture for the Sobel edge detector that is faster and occupies less area than the existing architecture.
IV. SOBEL EDGE DETECTION
Algorithm for Sobel Edge Detection: The Sobel edge detection operator is a 3×3 spatial mask based on a first-derivative operation. The Sobel masks are defined as:
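The masks themselves do not appear in the text above. The following Python sketch is a minimal illustration that assumes the commonly used 3×3 Sobel kernels (sign and orientation conventions vary between references) and approximates the gradient magnitude as |Gx| + |Gy| rather than a square root, a simplification often used in hardware. The function name `sobel_edges` and the `threshold` parameter are hypothetical.

```python
# Assumed kernels: a common Sobel convention; the paper's exact masks
# are not shown in the text.
GX = ((-1, 0, 1),
      (-2, 0, 2),
      (-1, 0, 1))   # responds to horizontal intensity changes
GY = ((-1, -2, -1),
      (0, 0, 0),
      (1, 2, 1))    # responds to vertical intensity changes


def sobel_edges(image, threshold):
    """Return a binary edge map for a grayscale image (list of rows).

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    height = len(image)
    width = len(image[0])
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = 0
            gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    # These mask multiplications are where an approximate
                    # multiplier could be substituted for the exact one.
                    gx += GX[dr][dc] * pixel
                    gy += GY[dr][dc] * pixel
            magnitude = abs(gx) + abs(gy)  # approximation of sqrt(gx^2 + gy^2)
            edges[r][c] = 1 if magnitude > threshold else 0
    return edges
```

Applied to an image whose left columns are dark and right columns are bright, the map marks the pixels on either side of the vertical boundary and leaves uniform regions at zero.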
References
[1] P. N. Whatmough, S. K. Lee, H. Lee, S. Rama, D. Brooks, and G.-Y. Wei, “14.3 A 28 nm SoC with a 1.2 GHz 568 nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2017, pp. 242–243.
[2] V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, “Low-power digital signal processing using approximate adders,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 32, no. 1, pp. 124–137, Jan. 2013.
[3] M. Kazhdan, “An approximate and efficient method for optimal rotation alignment of 3D models,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 7, pp. 1221–1229, Jul. 2007.
[4] M. J. Wainwright and M. I. Jordan, “Log-determinant relaxation for approximate inference in discrete Markov random fields,” IEEE Trans. Signal Process., vol. 54, no. 6, pp. 2099–2109, Jun. 2006.
[5] J. Jo, J. Kung, and Y. Lee, “Approximate LSTM computing for energy-efficient speech recognition,” Electronics, vol. 9, no. 12, p. 2004, Nov. 2020.