A Methodology for Evaluating Delay and Power on Binary Counters and Block Level Optimization

Authors: Yalla Taraka Venkata Ramana, Bighneswar Panda, B Jeevana Rao, Bhaskara Rao Doddi

DOI Link: https://doi.org/10.22214/ijraset.2022.40647

Abstract

In this paper, slice-level optimization is performed on the conventional 6:3 counter, and the optimized slices are then integrated back into the original design. Slice-level optimization partitions the given circuit into a number of blocks such that the final integration can be done effectively. Power and delay testing were carried out on each individual block: power results were taken by triggering the activities that lead to power consumption, all possible critical paths were tested for every block, and the results were then compared. Test vectors are applied such that the output is complemented on every consecutive cycle, so that low-to-high and high-to-low delays can be captured within a small number of test vectors. An identical strategy is applied to measure power, because only one power-consuming event occurs every two cycles on a single node under consideration. The proposed 6:3 counter is 36% faster than the conventional one and also saves about 56% of the power. Using more NAND, NOR and AOI gates instead of AND and OR gates led to the achieved optimization.

I. INTRODUCTION

Row compression techniques are used in [1,2,3] for integrating the partial products effectively. Delay is high in these counters because an equality circuit is needed in the maximum-delay paths. Compressors of size 5:2 and 4:2 are proposed in [4,5]. A data selector is used to improve the delay in the maximum-delay paths in [6,7]. A low-power compressor was proposed in [8], and an adder architecture was proposed in [9]. In this paper, we present a slice-level optimization method for the existing design [10]: every slice is optimized to the best possible extent with respect to power and delay, and the slices are finally integrated. A slice is typically a sub-circuit with primary inputs and intermediate outputs, or with intermediate inputs and intermediate outputs; the final slice takes intermediate outputs as its inputs and produces the primary outputs. Detailed power and delay testing were performed on every slice. Delay testing corresponds to examining all possible critical paths for low-to-high and high-to-low transitions of the output. Power testing corresponds to examining all possible low-to-high transitions of the particular node.

II. LITERATURE REVIEW [10]

In the 3-bit stacker, the basic hardware required is carry logic for output Y1, and a three-input AND gate and a three-input OR gate for the remaining outputs. The outputs are generated within two levels, and the delay is the sum of the two-input AND and three-input OR delays. The first output is “0” if all three bits are zero, the second output is “0” if any two of the inputs are zero, and the final output is “0” if any one of the inputs is zero.
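The stacker behavior described above can be illustrated with a short behavioral sketch. It is not taken from [10]; the function name `stacker3` and the output ordering (Y2 from the OR gate, Y0 from the AND gate) are assumptions made for illustration, with Y1 produced by the carry (majority) logic as stated in the text.

```python
from itertools import product


def stacker3(x0: int, x1: int, x2: int) -> tuple:
    """Behavioral model of a 3-bit stacker; returns (y2, y1, y0).

    Output naming is an assumption: y2 = OR, y1 = carry, y0 = AND.
    """
    y2 = x0 | x1 | x2                        # "0" only if all three inputs are zero
    y1 = (x0 & x1) | (x0 & x2) | (x1 & x2)   # carry logic: "0" if two or more inputs are zero
    y0 = x0 & x1 & x2                        # "0" if any input is zero
    return y2, y1, y0


# Exhaustive check: the ones of the input are "stacked" into the leading outputs.
for bits in product((0, 1), repeat=3):
    ones = sum(bits)
    expected = tuple(1 if k < ones else 0 for k in range(3))
    assert stacker3(*bits) == expected
```

The exhaustive loop confirms that the three outputs depend only on how many inputs are high, which is what lets the later blocks work on the count rather than on the individual bits.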

All the blocks were analyzed for their performance. Every block in this design is optimized for the VLSI constraints. Detailed delay and power analysis is done on the existing design, and a new design is proposed.

The 16T block is used for generating the output S of the 6:3 counter, and two copies of this hardware are needed to realize the circuit. Its delay is the sum of a NOT gate in the first level, an AND gate in the second level and an OR gate in the third level. There are six power-consuming internal and external nodes, which may lead to more power consumption. A two-input XOR is needed: one input is a 16T block with inputs H2, H1, H0, and the other is a 16T block with inputs I2, I1, I0. The output produced is S of the 6:3 counter.
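The S path described above can be checked with a small behavioral sketch. The text gives only the gate levels (NOT, AND, OR) and the final XOR, not the Boolean equations, so the expression in `odd_flag` is an assumption chosen to fit that structure. The ordering of the stacked bits (H2 = at least one input high, H0 = all three high) and the helper names are likewise hypothetical.

```python
from itertools import product


def stack(bits):
    """Stack three bits: returns (t2, t1, t0) = (>=1 ones, >=2 ones, 3 ones)."""
    n = sum(bits)
    return int(n >= 1), int(n >= 2), int(n >= 3)


def odd_flag(t2, t1, t0):
    """Assumed NOT -> AND -> OR form: high when the stacked count is odd."""
    not_t1 = 1 - t1          # level 1: NOT
    one_only = t2 & not_t1   # level 2: AND (exactly one input high)
    return one_only | t0     # level 3: OR (or all three high)


def sum_bit(x):
    """S of the 6:3 counter as the XOR of the two per-triple flags."""
    h = stack(x[:3])
    i = stack(x[3:])
    return odd_flag(*h) ^ odd_flag(*i)


# Exhaustive check against the least significant bit of the ones count.
for x in product((0, 1), repeat=6):
    assert sum_bit(x) == sum(x) % 2
```

Under these assumptions the XOR of the two block outputs equals the parity of all six inputs, which is the S bit of a 6:3 counter.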

The 26T block requires two levels of logic to generate its output: AND gates in the first level and an OR gate in the second level. Four of its eight power-consuming internal and external nodes are active at a time. It is used to generate the C2 output of the 6:3 counter.

The 34T block is used to generate the C1 output of the 6:3 counter. It requires five levels to produce the output, using AND, OR, NOT, AND and OR gates in levels one to five, respectively.

The 6:3 counter requires six levels to produce the output S, three levels to generate the C2 output and six levels to produce the C1 output. There are 48 power-consuming internal and external nodes in the circuit.
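A two-level AND-OR form for C2, as described for the 26T block, can be sketched as follows. The exact product terms are not given in the text, so the expression in `c2_and_or` is an assumption: it pairs the stacked bits of the two triples so that the output is high whenever at least four of the six inputs are high. The helper `stack` and the bit ordering are hypothetical.

```python
from itertools import product


def stack(bits):
    """Stack three bits: returns (t2, t1, t0) = (>=1 ones, >=2 ones, 3 ones)."""
    n = sum(bits)
    return int(n >= 1), int(n >= 2), int(n >= 3)


def c2_and_or(h, i):
    """Assumed AND-OR form of C2 on two stacked triples."""
    h2, h1, h0 = h
    i2, i1, i0 = i
    # Level 1: three two-input ANDs; level 2: one three-input OR.
    return (h2 & i0) | (h1 & i1) | (h0 & i2)


def c2(x):
    return c2_and_or(stack(x[:3]), stack(x[3:]))


# C2 is the most significant output bit: high when four or more inputs are high.
for x in product((0, 1), repeat=6):
    assert c2(x) == int(sum(x) >= 4)
```

Each AND term covers one way of reaching a total of four (1+3, 2+2 or 3+1), which is why two logic levels are enough once the inputs have been stacked.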

III. PROPOSED COUNTER

A. Stacker 3-BIT

The basic hardware required is shown in Fig. 1. It requires carry logic for output Y1, and a two-input NOR gate and a three-input OR gate for the remaining outputs. The outputs are generated within two levels, and the delay is the sum of the two-input NAND and three-input NAND delays. There are seven power-consuming nodes in the proposed circuit, compared with 12 power-consuming nodes in the design of [10], which increase its dynamic power consumption.
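The delay quoted above (a two-input NAND followed by a three-input NAND) is what a NAND-NAND realization of the carry logic would give. The sketch below assumes that this is how the carry output is built; the text does not state the netlist. It shows, by De Morgan's law, that the NAND-NAND form computes the same function as the AND-OR form.

```python
from itertools import product


def nand(*inputs):
    """N-input NAND: low only when every input is high."""
    return 0 if all(inputs) else 1


def carry_and_or(a, b, c):
    """Carry (majority) logic in AND-OR form."""
    return (a & b) | (a & c) | (b & c)


def carry_nand_nand(a, b, c):
    """Same function with three NAND2 gates feeding one NAND3 gate."""
    return nand(nand(a, b), nand(a, c), nand(b, c))


# The two forms agree on all eight input combinations.
for a, b, c in product((0, 1), repeat=3):
    assert carry_and_or(a, b, c) == carry_nand_nand(a, b, c)
```

In static CMOS, NAND gates avoid the output inverter that AND and OR gates need, which is consistent with the reduced node count reported for the proposed stacker.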

B. 8T Block

The proposed circuit in Fig. 2 has a NOT gate followed by an OAI21 gate and needs eight transistors. There are only two power-consuming nodes in the proposed design, compared with six in the design of [10]. The proposed design needs two levels of logic, whereas the design of [10] needs five.
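One way a NOT gate followed by an OAI21 gate can serve the S path is sketched below. The input assignment is an assumption, since the text names the gates but not their connections: with stacked inputs, `even_flag` is high when the count in the triple is even, and the XOR of two such flags still equals the S bit. The helper names and the bit ordering are hypothetical.

```python
from itertools import product


def stack(bits):
    """Stack three bits: returns (t2, t1, t0) = (>=1 ones, >=2 ones, 3 ones)."""
    n = sum(bits)
    return int(n >= 1), int(n >= 2), int(n >= 3)


def oai21(a, b, c):
    """OR-AND-Invert gate: NOT((a OR b) AND c)."""
    return 1 - ((a | b) & c)


def even_flag(t2, t1, t0):
    """Assumed wiring: one inverter on t1, then OAI21(t0, NOT t1, t2)."""
    not_t1 = 1 - t1
    return oai21(t0, not_t1, t2)


def sum_bit(x):
    # XOR of the two complemented parities equals XOR of the parities.
    return even_flag(*stack(x[:3])) ^ even_flag(*stack(x[3:]))


for x in product((0, 1), repeat=6):
    assert sum_bit(x) == sum(x) % 2
```

The wiring relies on the stacked inputs being ordered (t0 implies t1, and t1 implies t2); it would not compute parity for arbitrary three-bit inputs.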

C. XOR Block

The XOR block is not modified; the same design as in [10] is used, as shown in Fig. 3.

D. 18T Block

The proposed block in Fig. 4 requires two levels of logic for the computation, whereas the design of [10] needs four. There are eight power-consuming nodes in the design of [10], and the proposed one has four. This gives considerable savings in both power and delay.
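The block names can be related to transistor counts with a small tally. The per-gate counts below are the usual static CMOS values. The gate lists for the 26T and 18T blocks are assumptions chosen because they match the descriptions and reproduce the block names; only the 8T composition (a NOT gate plus an OAI21 gate) is stated in the text.

```python
# Transistor counts for standard static CMOS gates.
TRANSISTORS = {
    "INV": 2,
    "NAND2": 4,
    "NAND3": 6,
    "AND2": 6,    # NAND2 followed by an inverter
    "OR3": 8,     # NOR3 followed by an inverter
    "OAI21": 6,
}


def transistor_count(gates):
    """Total transistors for a block given as {gate name: instance count}."""
    return sum(TRANSISTORS[name] * n for name, n in gates.items())


# Hypothetical gate lists (assumptions, not taken from the source).
block_26t = {"AND2": 3, "OR3": 1}        # AND-OR form
block_18t = {"NAND2": 3, "NAND3": 1}     # NAND-NAND form of the same function
block_8t = {"INV": 1, "OAI21": 1}        # composition stated in the text

for name, gates in [("26T", block_26t), ("18T", block_18t), ("8T", block_8t)]:
    print(name, transistor_count(gates))
```

Replacing AND and OR gates with NAND gates removes the output inverters, which accounts for the eight-transistor difference between the two assumed forms.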

E. 22T Block

The proposed design in Fig. 5 needs four levels of logic to produce the result, whereas the design of [10] needs five. There are five power-consuming nodes in the proposed design, compared with eleven in [10].

F. Counter

The proposed counter in Fig. 6 has been optimized at almost all of the intermediate blocks used in the design, which leads to maximum optimization. The existing architecture [10] is used to design the counter, with the majority of sub-blocks being optimized.

IV. PERFORMANCE ANALYSIS

Performance analysis is needed to estimate how well the existing and proposed designs perform with respect to the constraints of interest. This section evaluates every block of the proposed counter for power and delay. To accomplish this, suitable input combinations are applied so that they trigger all the power-consuming events for power estimation and activate each critical path for delay estimation. Tables I, II and III show the delays and test sets for the 3-bit stacker in [10].
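The idea of applying vectors so that the output is complemented on every cycle can be sketched as follows. This is only an illustration, not the authors' test generator: `toggling_sequence` is a hypothetical helper that interleaves inputs driving the output low with inputs driving it high, and the carry function is used as an example output.

```python
from itertools import product


def carry(a, b, c):
    """Example output under test: carry (majority) of three inputs."""
    return (a & b) | (a & c) | (b & c)


def toggling_sequence(f, n_inputs):
    """Order input vectors so that f alternates 0, 1, 0, 1, ... every cycle."""
    vectors = list(product((0, 1), repeat=n_inputs))
    lows = [v for v in vectors if f(*v) == 0]
    highs = [v for v in vectors if f(*v) == 1]
    if not lows or not highs:
        raise ValueError("output is constant; there are no transitions to exercise")
    sequence = []
    for k in range(max(len(lows), len(highs))):
        sequence.append(lows[k % len(lows)])    # drives the output low
        sequence.append(highs[k % len(highs)])  # drives the output high
    return sequence


sequence = toggling_sequence(carry, 3)
outputs = [carry(*v) for v in sequence]

# Each (previous, current) pair is one rising or falling event on the output.
transitions = list(zip(sequence, sequence[1:]))
assert all(a != b for a, b in zip(outputs, outputs[1:]))
```

Every consecutive pair in `transitions` corresponds to a "previous input / current input" row of the kind listed in the delay tables, and every second cycle produces one low-to-high event on the node for power measurement.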

Table I shows a maximum delay of 195 ps, which occurs when the current applied input is “111” and the previous input is “000”. This can be validated because when all the inputs are ‘0’ the output node is strongly discharged, and there is only one possible way to get an output of ‘1’.

Table II shows a maximum delay of 283 ps, which occurs when the current applied input is “101” and the previous input is “000”. The delay is larger when the output node changes from low to high than when it changes from high to low.

Table III shows a maximum delay of 167 ps, which occurs when the current applied input is “000” and the previous input is “111”. The delay is larger when the output node changes from high to low than when it changes from low to high.

Table IV shows a maximum delay of 167 ps, which occurs when the current applied input is “000” and the previous input is “111”. The delay is larger when the output node changes from high to low than when it changes from low to high.

Table V shows a maximum delay of 174 ps, which occurs when the current applied input is “001” and the previous input is “111”. For the proposed circuit, the delay is larger when the output node changes from high to low than when it changes from low to high.

Table VI shows a maximum delay of 166 ps, which occurs when the current applied input is “111” and the previous input is “000”. This can be validated because when all the inputs are ‘0’ the output node is strongly discharged, and there is only one possible way to get an output of ‘1’. Tables IV, V and VI show the delays and test sets for the proposed 3-bit stacker.

Table VII shows a maximum delay of 256 ps, which occurs when the current applied input is “011” and the previous input is “101”. For the conventional circuit [10], the delay is larger when the output node changes from low to high than when it changes from high to low.

Table VIII shows a maximum delay of 98 ps, which occurs when the current applied input is “001” and the previous input is “010”. For the proposed circuit, the delay is larger when the output node changes from high to low than when it changes from low to high.

Table IX shows a maximum delay of 147 ps, which occurs when the current applied input is “00” and the previous input is “10”. For the proposed circuit, the delay is larger when the output node changes from high to low than when it changes from low to high. The same delay is obtained for the proposed XOR and the XOR in [10].

Table X shows a maximum delay of 280 ps, which occurs when the current applied input is “000111” and the previous input is “111111”. For the conventional design [10], the delay is larger when the output node changes from high to low than when it changes from low to high.

Table XI shows a maximum delay of 145 ps, which occurs when the current applied input is “100110” and the previous input is “111111”. For the proposed design, the delay is larger when the output node changes from high to low than when it changes from low to high.

Table XII shows a maximum delay of 506 ps, which occurs when the current applied input is “0010010” and the previous input is “0000001”. For the conventional design [10], the delay is larger when the output node changes from low to high than when it changes from high to low.

Table XIII shows a maximum delay of 365 ps, which occurs when the current applied input is “1010000” and the previous input is “1111110”. For the proposed design, the delay is larger when the output node changes from high to low than when it changes from low to high.

Table XIV shows the overall delay, which is computed at three levels. Level 1 needs 283 ps, level 2 needs 280 ps and level 3 takes 506 ps, giving a total of 1069 ps for the conventional design [10].

Table XV shows the overall power consumption, which is computed at three levels. Level 1 needs 10.5 mW, level 2 needs 8.9 mW and level 3 takes 5.72 mW, giving a total of 25.12 mW for the conventional design [10].

Table XVI shows the overall delay, which is computed at three levels. Level 1 needs 174 ps, level 2 needs 145 ps and level 3 takes 365 ps, giving a total of 684 ps for the proposed design.
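The overall delay figures in Tables XIV and XVI can be reproduced with a short calculation. The sketch assumes that the level 1 delay is the worst case over the three stacker outputs (Tables I to III for the conventional design, Tables IV to VI for the proposed one) and takes the level 2 and level 3 values directly from the text; the variable names are illustrative.

```python
# Worst-case delay per stacker output, in picoseconds, as reported in the text.
conventional_stacker_ps = {"Table I": 195, "Table II": 283, "Table III": 167}
proposed_stacker_ps = {"Table IV": 167, "Table V": 174, "Table VI": 166}

# Level 1 is limited by the slowest stacker output (assumption stated above).
level1_conventional = max(conventional_stacker_ps.values())
level1_proposed = max(proposed_stacker_ps.values())

# Levels 2 and 3 as quoted for Tables XIV and XVI.
conventional_levels_ps = [level1_conventional, 280, 506]
proposed_levels_ps = [level1_proposed, 145, 365]

total_conventional_ps = sum(conventional_levels_ps)
total_proposed_ps = sum(proposed_levels_ps)

print(f"conventional: {total_conventional_ps} ps")
print(f"proposed:     {total_proposed_ps} ps")
```

Summing the per-level worst cases gives an upper bound on the path delay, since the slowest input pattern of one level need not coincide with the slowest pattern of the next.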

Table XVII shows the overall power consumption, which is computed at three levels. Level 1 needs 5 mW, level 2 needs 2.6 mW and level 3 takes 3.15 mW, giving a total of 10.90 mW for the proposed design.

Table XVIII shows the overall power-delay product (PDP) comparison, where the PDP is computed as the product of delay and average power consumption. The proposed 6:3 counter has a delay of 684 ps, and the conventional one [10] has a delay of 1069 ps. The proposed counter consumes 10.9 mW, and the existing one [10] consumes 25 mW. The power-delay product is 26.863 aJ for the existing design and 7.456 aJ for the proposed one.
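The power-delay product can be recomputed from the totals quoted above as a sanity check. The sketch multiplies delay in ps by power in mW, and the function name is illustrative. Since 1 ps × 1 mW = 1 fJ, the unit prefix of the result depends on the power unit actually used in the measurement; the raw products are divided by 1000 here only so that they can be compared numerically with the reported figures.

```python
def power_delay_product(delay_ps, power_mw):
    """Product of delay (ps) and average power (mW); 1 ps * 1 mW = 1 fJ."""
    return delay_ps * power_mw


pdp_conventional = power_delay_product(1069, 25.12)
pdp_proposed = power_delay_product(684, 10.90)

# Scaled by 1/1000 for comparison with the reported 26.863 and 7.456.
print(f"conventional: {pdp_conventional / 1000:.3f}")
print(f"proposed:     {pdp_proposed / 1000:.3f}")
```

The results land close to the reported values, with the small difference for the conventional design attributable to rounding of the power figure.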

Conclusion

The proposed counter can be applied where high efficiency is needed. Delay analysis was done on every block of the counter by evaluating its critical paths, and suitable test vectors were generated to activate the events. The proposed counter logic outperforms the conventional counter in delay, power and PDP. The proposed counter needs 126 transistors, whereas the conventional counter needs 186, so 60 transistors are saved in total. Power consumption is reduced by about 56.6%, the PDP is about 72.25% better, and the delay is about 36% better for the proposed design.
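The percentage figures in the conclusion follow from the totals reported in the performance analysis. The sketch below recomputes them as relative savings, (old − new) / old; the function name `saving_percent` is illustrative.

```python
def saving_percent(old, new):
    """Relative reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


delay_saving = saving_percent(1069, 684)        # delay totals in ps
power_saving = saving_percent(25.12, 10.90)     # power totals in mW
pdp_saving = saving_percent(26.863, 7.456)      # reported PDP values
transistors_saved = 186 - 126

print(f"delay: {delay_saving:.1f}%")
print(f"power: {power_saving:.1f}%")
print(f"PDP:   {pdp_saving:.2f}%")
print(f"transistors saved: {transistors_saved}")
```

The recomputed values agree with the quoted 36%, 56.6% and 72.25% to within rounding.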

References

[1] C. S. Wallace, (1964) A suggestion for a fast multiplier, IEEE Trans. Electron. Comput., vol. EC-13, no. 1, pp. 14–17, Feb.
[2] L. Dadda, (1965) Some schemes for parallel multipliers, Alta Freq., vol. 34, pp. 349–356, May.
[3] Z. Wang, G. A. Jullien, and W. C. Miller, (1995) A new design technique for column compression multipliers, IEEE Trans. Comput., vol. 44, no. 8, pp. 962–970, Aug.
[4] J. Gu and C.-H. Chang, (2003) Low voltage, low power (5:2) compressor cell for fast arithmetic circuits, in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), vol. 2, Apr., pp. 661–664.
[5] S.-F. Hsiao, M.-R. Jiang, and J.-S. Yeh, (1998) Design of high-speed low-power 3-2 counter and 4-2 compressor for fast multipliers, Electron. Lett., vol. 34, no. 4, pp. 341–343, Feb.
[6] S. Veeramachaneni, L. Avinash, M. Krishna, and M. B. Srinivas, (2007) Novel architectures for efficient (m, n) parallel counters, in Proc. 17th ACM Great Lakes Symp. VLSI, pp. 188–191.
[7] S. Veeramachaneni, K. M. Krishna, L. Avinash, S. R. Puppala, and M. B. Srinivas, (2007) Novel architectures for high-speed and low-power 3-2, 4-2 and 5-2 compressors, in Proc. 20th Int. Conf. VLSI Design Held Jointly 6th Int. Conf. Embedded Syst. (VLSID), Jan., pp. 324–329.
[8] K. Prasad and K. K. Parhi, (2001) Low-power 4-2 and 5-2 compressors, in Proc. Conf. Rec. 35th Asilomar Conf. Signals, Syst. Comput., vol. 1, Nov., pp. 129–133.
[9] D. Radhakrishnan, (2001) Low-voltage low-power CMOS full adder, IEEE Proc.-Circuits, Devices Syst., vol. 148, no. 1, pp. 19–24, Feb.
[10] Christopher Fritz and Adly T. Fam, (2016) Fast Binary Counters Based on Symmetric Stacking, IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 10, pp. 2971–2975.

Copyright

Copyright © 2022 Yalla Taraka Venkata Ramana, Bighneswar Panda, B Jeevana Rao, Bhaskara Rao Doddi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET40647

Publish Date : 2022-03-05

ISSN : 2321-9653

Publisher Name : IJRASET
