## DESIGN OF CMOS 8-BIT PARALLEL ADDER ENERGY EFFICIENT STRUCTURE USING SR-CPL LOGIC STYLE

Felix Muthu, Aravinth.T.S, Rajendran.T

Department of ECE, FOE-CB, Karpagam University, Coimbatore, Tamil Nadu, India. E-mails: felixmuthul1@gmail.com, arajrajesh1990@gmail.com & aravinth.tsa@gmail.com

#### ABSTRACT

**Objectives:** We present high-speed, low-power 8-bit parallel adder cells designed with a modified SR-CPL logic style that achieve a reduced power-delay product (PDP) compared with the earlier DPL and pass-transistor logic styles. **Tool Used:** All the parallel adders were designed in a 0.18 µm CMOS technology using the Cadence Virtuoso environment. **Results:** Circuit simulations show that the proposed parallel adders reduce the power from 0.33 mW to 0.24 mW. **Applications:** In the near future, the design can be implemented in high-speed processors to achieve low power consumption.

Keywords: SR-CPL Logic Styles, PDP, DPL, Pass Transistor Logic, Cadence Virtuoso Environment.

#### I. INTRODUCTION

Energy efficiency is one of the most desired features of modern low-power devices, which are designed for high-performance portable applications. The ever-growing market for portable devices demands low-power building blocks that enable long-lasting battery-operated systems. At the same time, the general trend toward higher operating frequencies and greater circuit complexity, needed to meet the throughput of modern high-performance applications, demands very high-speed circuitry. The PDP metric expresses the energy consumed in performing a given task, and is the fairest performance metric for comparing implementations of a module designed and tested using different technologies, operating frequencies and architectures. Addition is a fundamental arithmetic operation that is widely used in many VLSI systems, such as application-specific DSP building blocks and microprocessors. The adder is the core of many arithmetic operations such as addition, subtraction, multiplication, division and address generation<sup>1-23</sup>.
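For reference, the PDP and the closely related energy-delay product (EDP) are commonly written as below. The symbols $P_{avg}$ (average power), $I_{avg}$ (average supply current), $V_{DD}$ (supply voltage) and $t_p$ (propagation delay) are notation assumed here for illustration and are not taken from the original text.

$$
\begin{aligned}
P_{avg} &= V_{DD}\, I_{avg} \\
\mathrm{PDP} &= P_{avg}\, t_p \\
\mathrm{EDP} &= \mathrm{PDP}\, t_p = P_{avg}\, t_p^{2}
\end{aligned}
$$

The PDP therefore has units of energy (joules), while the EDP additionally penalizes slow circuits by weighting the delay a second time.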

As stated above, the PDP exhibited by the parallel adder affects the overall performance of the circuit. The design of a parallel adder with low power consumption and low propagation delay is therefore of great interest for the implementation of modern digital systems<sup>1-23</sup>. In this paper, we report the design and performance comparison of parallel-adder cells implemented with an alternative internal logic structure, based on multiplexing the Boolean functions XOR/XNOR and AND/OR to obtain balanced delays at the SUM and CARRY outputs, respectively, and with SR-CPL logic styles to reduce power consumption. The resulting parallel adders prove more efficient in power consumption and delay than others previously reported as good candidates for building low-power arithmetic blocks. This work is organized as follows. Section II presents the internal logic adopted as standard in the existing literature for designing a parallel-adder cell<sup>1-23</sup>. Section III introduces the alternative internal logic structure and the SR-CPL logic styles used to build the proposed parallel adders. Section IV details the simulation environment used for the comparison carried out to obtain the power and speed performance of the parallel adders. Section V reviews the results obtained from the simulations, and Section VI concludes this work<sup>1-23</sup>.

#### **II. Literature Review**

One approach proposes a zero-delay-overhead self-timed pipeline style that supports very high-speed operation. Techniques were developed to apply the zero-delay-overhead self-timed pipeline in this context and to realize run-time pipeline depth control. Simulations under variable data rate scenarios demonstrate a significant performance gain<sup>1</sup>.

The design of high-speed, low-power full adder cells based on an alternative logic approach has been presented. This approach yields a great improvement in the power-delay metric for the proposed adders when compared with several previously published realizations<sup>2</sup>.

The Parallel Asynchronous Self-Timed Adder (PASTA) circuit is described in terms of a handshaking protocol and compared with other adders. A MAC unit is also implemented effectively. Simulation results demonstrate the effectiveness of this framework in a parallel prefix adder that performs multiplication through the addition process<sup>3</sup>.

Another work describes an asynchronous parallel adder. It is based on a radix method that speeds up the computation of the sum and reduces the delay caused by the carry chain. The computation is carried out in parallel. The aim of that work is to reduce the power-delay product (PDP) and energy-delay product (EDP) of the adder<sup>4</sup>.

A further article presents a Parallel Single-Rail Self-Timed Adder (PSRSTA). It is based on a recursive formulation for performing multi-bit binary addition. The operation is parallel for those bits that do not need any carry chain propagation. Thus, the PSRSTA architecture attains logarithmic performance over random operand conditions without any special speed-up circuitry or look-ahead scheme<sup>5</sup>.

## III. The Existing Methods

a. Pass Transistor Logic (PTL)

In electronics, PTL describes several logic families used in the design of ICs. It reduces the number of transistors needed to build logic gates and other functional blocks by eliminating redundant transistors. Transistors are used as switches to pass logic levels between the nodes of a circuit, instead of as switches connected directly to the supply voltages<sup>1-17</sup>. This reduces the number of active devices, but has the disadvantage that the difference between the high and low logic voltage levels diminishes at each stage. Each transistor in series is less saturated at its output than at its input<sup>18</sup>. If several devices are chained in series in a logic path, a conventionally constructed gate may be required to restore the signal voltage to its full value. By contrast, in conventional CMOS logic the transistors connect the output to one of the power supply rails, so the logic voltage levels in a chain do not degrade. Circuit simulation may be required to ensure adequate performance<sup>1-23</sup>.

b. Complementary Pass Transistor Logic (CPL)

CPL is a logic style for implementing logic gates that uses transmission gates built from CMOS pass transistors<sup>19</sup>. Other researchers use the term CPL to denote a style in which each gate consists of an NMOS-only pass transistor network followed by a CMOS output inverter<sup>20</sup>, and still others use it to denote a style of implementing logic gates with dual-rail encoding. Every CPL gate has two output wires, carrying both the true signal and its complement, which eliminates the need for inverters<sup>21-23</sup>. CPL, or DPL, refers to a logic family designed for certain advantages. It is commonly used for multiplexers and latches. CPL uses series transistors to select between possible inverted outputs of the logic, and the selected output can be used to drive an inverter. In a CMOS transmission gate, the NMOS and PMOS transistors are connected in parallel<sup>1-23</sup>.
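The dual-rail idea described above can be illustrated with a small behavioural model. The following is only a sketch: the function names (`mux`, `cpl_and_nand`, `cpl_xor_xnor`, `check_dual_rail`) are hypothetical, each pass-transistor pair is idealized as a 2:1 selector, and voltage-level degradation is ignored. It shows how one pass network produces both a function and its complement from true and complemented inputs.

```python
from itertools import product


def mux(sel, when_one, when_zero):
    """Idealized pass-transistor pair: one branch conducts for sel=1, the other for sel=0."""
    return when_one if sel else when_zero


def cpl_and_nand(a, a_n, b, b_n):
    """Dual-rail AND/NAND gate; a_n and b_n are the complemented inputs."""
    f = mux(b, a, b)        # AND rail: pass A when B=1, otherwise pass B (which is 0)
    f_n = mux(b, a_n, b_n)  # NAND rail: pass not-A when B=1, otherwise pass not-B (which is 1)
    return f, f_n


def cpl_xor_xnor(a, a_n, b, b_n):
    """Dual-rail XOR/XNOR gate; b_n is kept only for a uniform dual-rail interface."""
    f = mux(b, a_n, a)      # XOR rail: pass not-A when B=1, otherwise pass A
    f_n = mux(b, a, a_n)    # XNOR rail: pass A when B=1, otherwise pass not-A
    return f, f_n


def check_dual_rail():
    """Check both gates against Boolean operators and confirm the rails are complementary."""
    for a, b in product((0, 1), repeat=2):
        a_n, b_n = 1 - a, 1 - b
        assert cpl_and_nand(a, a_n, b, b_n) == (a & b, 1 - (a & b))
        assert cpl_xor_xnor(a, a_n, b, b_n) == (a ^ b, 1 - (a ^ b))
    return True
```

Because both rails are always available, a following stage can take whichever polarity it needs without inserting an extra inverter in the signal path.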

Many works have been published on the optimization of low-power full adders (FA), trying different combinations of logic styles, including standard CMOS, Differential Cascode Voltage Switch (DCVS), CPL, DPL and the present scheme, Swing Restored Complementary Pass Transistor Logic (SR-CPL), together with the logic structure used to build the full adder circuit. The same internal logic block structure has been adopted as the standard configuration in most of the enhancements developed for single-bit FA cells. In this scheme, the FA circuit is formed by three main logic blocks: an XOR-XNOR block, and XOR blocks or MUXs that produce the SUM and CARRY outputs. The major drawback of this scheme is the propagation delay of an FA built from these logic blocks along its critical path<sup>1-23</sup>.
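The three-block structure described above can be sketched behaviourally as follows. This is a minimal illustration under stated assumptions, not a transcription of the circuit in Figure 1: the names `conventional_full_adder` and `verify` are hypothetical, and the particular choice of a second XOR for SUM and a multiplexer for CARRY is one common arrangement of the blocks.

```python
from itertools import product


def conventional_full_adder(a, b, cin):
    """Full adder whose output stage depends on an internally generated signal h."""
    h = a ^ b               # block 1: XOR of the operands (its complement would be the XNOR)
    s = h ^ cin             # block 2: second XOR produces SUM
    cout = cin if h else a  # block 3: multiplexer selected by h produces CARRY
    return s, cout


def verify(adder):
    """Compare a 1-bit adder function against integer addition for all 8 input cases."""
    return all(
        adder(a, b, cin) == ((a + b + cin) & 1, (a + b + cin) >> 1)
        for a, b, cin in product((0, 1), repeat=3)
    )
```

Both outputs wait for `h` before they can settle, so the first block lies on the critical path of SUM and CARRY alike, which is the delay drawback noted above.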

## IV. The Proposed Method

In the proposed technique, the signals that control the output MUXs are not generated internally. Instead, the input signal, which exhibits a full voltage swing and no additional delay, is used to drive the MUXs, reducing the propagation delay. The capacitive load on this input is also reduced, as it is connected only to transistor gates and no longer to drain or source terminals, where the diffusion capacitance is very large in sub-micron technologies. Hence, the overall delay along the critical paths of larger modules is reduced. The propagation delays of the SUM (So) and CARRY (Co) outputs can be tuned separately by adjusting the XOR/XNOR gates and the AND/OR gates. This feature is advantageous for applications where the skew between arriving signals is critical for correct operation, and for obtaining well-balanced propagation delays at the outputs to minimize glitches in cascaded circuits (see Figures 1 to 5)<sup>1-23</sup>.
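A behavioural sketch of this multiplexer-based structure is given below. It assumes that the input driving the MUXs is Cin and that XOR/XNOR feed the SUM multiplexer while AND/OR feed the CARRY multiplexer. The function names `mux_full_adder` and `add8` are hypothetical, and chaining eight cells in ripple-carry fashion is an assumption made only to check the arithmetic; it is not meant to reproduce the circuit of Figure 5.

```python
def mux_full_adder(a, b, cin):
    """Full adder in which Cin directly selects between precomputed functions of A and B."""
    xor_ab = a ^ b
    xnor_ab = 1 - xor_ab
    and_ab = a & b
    or_ab = a | b
    s = xnor_ab if cin else xor_ab   # SUM multiplexer, selected by Cin
    cout = or_ab if cin else and_ab  # CARRY multiplexer, selected by Cin
    return s, cout


def add8(x, y, cin=0):
    """Add two 8-bit values by chaining eight cells (assumed ripple-carry connection)."""
    carry = cin
    total = 0
    for i in range(8):
        s, carry = mux_full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry
```

For example, `add8(200, 100)` returns the low eight bits of the sum together with the carry out. In this arrangement the XOR/XNOR and AND/OR functions depend only on A and B, so they can settle before Cin arrives, and Cin then passes through a single multiplexer stage to reach either output.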

## V. Energy Efficient Parallel Full Adder

Tables 1, 2 and 3 list the current dissipation, delay and power-delay product, respectively. The 8-bit parallel full adder using the SR-CPL logic style shows better performance for each of its transitions from $A \rightarrow B$ and vice versa, as tabulated below<sup>1-23</sup>.

## **VII. Results**

Two designs based on SR-CPL and DPL style full adders are examined in this work. The main advantages of this design are that the multiplexers are controlled directly by Cin instead of by internally generated signals, which reduces the delay, and that the capacitive load on Cin is reduced (see Tables 1 to 3 and Figures 6 to 8).

#### VI. Discussion

The propagation delays of So and Co can be tuned by sizing the XOR/XNOR gates appropriately. Buffering at the inputs can be incorporated by using NAND/NOR gates instead of XOR/XNOR gates. Buffers are placed at the inputs to account for the load that the device presents at its inputs.

Also, since the designs presented here consist of pass transistor logic, which has no direct power supply connection, the power consumed by the device is also drawn through these input buffers. The output inverters account for the power due to the degraded voltage swing and slopes of the full adder outputs. The full adders have been simulated in a 180 nm CMOS technology using Cadence Virtuoso. The supply voltage VDD was 1.8 V.

#### **VII. Conclusion**

We have presented 8-bit PFA adders in the SR-CPL logic style. The key features observed were (see Tables 1 to 3 and Figures 6 to 8):

1. The proposed designs reduce both the total average power and the worst-case delay of the circuit (see Table 1).

2. The delay of D1 is comparable to that of the CPL logic (it is slightly greater for most transitions). The delay of the proposed logic style is significantly smaller than that of the reference DPL and CPL designs, by about 16% to 31%.

3. The overall power-delay product of the proposed SR-CPL design is reduced compared with the DPL and CPL designs.

4. The proposed design is more efficient in both power and delay than the DPL and CPL designs.

5. The transistor count of the proposed SR-CPL design (26) is also lower than that of the CPL (28) and DPL (38) styles.

6. The proposed SR-CPL designs occupy less area (116  $\mu$m<sup>2</sup>) than the CPL (118  $\mu$m<sup>2</sup>) and DPL (238  $\mu$m<sup>2</sup>) adders.
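The relative improvements summarized in this list can be recomputed from the tabulated values with a short script. The sketch below is illustrative only: the delay figures are copied from Table 2 (in picoseconds), the transistor counts and areas are those quoted in items 5 and 6, and the helper names `reduction_pct` and `delay_reduction_range` are hypothetical.

```python
# Delay per input transition in picoseconds, copied row by row from Table 2.
DELAY_PS = {
    "CPL": [361, 391, 386, 383, 329, 368],
    "DPL": [330, 304, 288, 322, 299, 313],
    "SR-CPL": [239, 241, 240, 229, 205, 188],
}


def reduction_pct(reference, proposed):
    """Percentage by which `proposed` is smaller than `reference`."""
    return 100.0 * (reference - proposed) / reference


def delay_reduction_range(reference_style):
    """Smallest and largest per-transition delay reduction of SR-CPL versus one style."""
    values = [
        reduction_pct(ref, new)
        for ref, new in zip(DELAY_PS[reference_style], DELAY_PS["SR-CPL"])
    ]
    return min(values), max(values)


for style in ("DPL", "CPL"):
    low, high = delay_reduction_range(style)
    print(f"Delay reduction vs {style}: {low:.1f}% to {high:.1f}%")

print(f"Transistor count vs DPL: {reduction_pct(38, 26):.1f}% fewer")
print(f"Area vs DPL: {reduction_pct(238, 116):.1f}% smaller")
```

Running it prints the minimum and maximum per-transition delay reduction for each reference style, which can be compared with the range quoted in item 2.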

### REFERENCES

- [1] A.P. Chandrakasan, et al., Low-power CMOS digital design. IEEE JSSC (1992).
- [2] A.M. Shams & M. Bayoumi, Performance evaluation of 1-bit CMOS adder cells. IEEE ISCAS (1999).
- [3] K. M. Chu & D. Pulfrey, A comparison of CMOS circuit techniques: Differential cascode voltage switch logic versus conventional logic. IEEE J. Solid-State Circuits (1987).
- [4] N. Weste & K. Eshraghian, Principles of CMOS design, A system perspective, Addison-Wesley (1988).
- [5] K. Yano, et al., A 3.8 ns CMOS 16-b multiplier using complementary pass-transistor logic. IEEE J. Solid-State Circuits (1990).
- [6] M. Suzuki, et al., A 1.5 ns 32-b CMOS ALU in double pass-transistor logic. IEEE J. Solid-State Circuits (1993).
- [7] R. Zimmermann & W. Fichtner, Low-power logic styles: CMOS versus pass-transistor logic. IEEE J. Solid-State Circuits (1997).
- [8] N. Zhuang & H. Wu, A new design of the CMOS full adder. IEEE JSSC (1992).
- [9] A.M. Shams & M. Bayoumi, A new cell for low power adders, Proceedings of the Int. Conf. MWSCAS (1995).
- [10] M. Aguirre & M. Linares, An alternative logic approach to implement high-speed low-power full adder cells, Proc. SBCCI (2005).
- [11] C. Chang, J. Gu, & M. Zhang, A review of 0.18-μm full adder performances for tree structured arithmetic circuits, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. (2005).
- [12] Reto Zimmermann & Wolfgang Fichtner, Low-Power Logic Styles: CMOS Versus Pass-Transistor Logic. IEEE Journal Of Solid-State Circuits (1997).
- [13] Aguirre-Hernandez, et al., CMOS full adders for energy-efficient arithmetic applications. IEEE Transactions (2011).
- [14] Quintana, J. M. et al., Low-power logic styles for full-adder circuits. Electronics, Circuits and Systems, The 8th IEEE International Conference Vol. 3 (2001).
- [15] Jaume Segura & Charles F. Hawkins, CMOS electronics: how it works, how it fails, Wiley-IEEE (2004).
- [16] Clive Maxfield, Bebop to the boolean boogie: an unconventional guide to electronics, Newnes (2008).
- [17] J. Rabaey, et al., Digital Integrated Circuits: A Design Perspective. Upper Saddle River, NJ: Prentice-Hall (2003).
- [18] Vijay, V., et al., A review of the 0.09 μm standard full adders. Int. J. of VLSI Design & Communication Systems (2012).
- [19] Gary K. Yeap, Practical Low Power Digital VLSI Design (2012).
- [20] V.G. Oklobdzija, Digital Design and Fabrication (in press).
- [21] Ajit Pal, Low-Power VLSI Circuits and Systems (in press).
- [22] Wai-Kai Chen, Logic Design (2003).
- [23] V.G. Oklobdzija, The Computer Engineering Handbook (2001).



Figure 1 Existing Circuitry<sup>1-23</sup>



Figure 2 Completion Detection Circuit



Figure 3 SR-CPL Logic Style



Figure 4 Proposed Parallel Half Adder using SR-CPL



Figure 5 8-Bit Parallel Full Adder using SR-CPL



Figure 6 Output Plot of Proposed Circuit



#### **Table 1 Current Dissipation**

| Transitions               | CPL Logic | Existing DPL Logic | Proposed SR-CPL Logic |
|---------------------------|-----------|--------------------|-----------------------|
| A=0 to 1 to 0; B=0; Cin=1 | 32.6 µA   | 47.2 µA            | 32.5 µA               |
| A=0 to 1 to 0; B=1; Cin=0 | 32 µA     | 47.4 µA            | 31.4 µA               |
| B=0 to 1 to 0; A=0; Cin=1 | 32.5 µA   | 45.2 µA            | 31.3 µA               |
| B=0 to 1 to 0; A=1; Cin=0 | 35.5 µA   | 45.4 µA            | 32.7 µA               |
| Cin=0 to 1 to 0; A=0; B=1 | 27.7 µA   | 45.2 µA            | 24.7 µA               |
| Cin=0 to 1 to 0; A=1; B=0 | 31 µA     | 45.2 µA            | 25.1 µA               |

#### Table 2 Delay

| Transitions               | CPL Logic | Existing DPL Logic | Proposed SR-CPL Logic |
|---------------------------|-----------|--------------------|-----------------------|
| A=0 to 1 to 0; B=0; Cin=1 | 361 ps    | 330 ps             | 239 ps                |
| A=0 to 1 to 0; B=1; Cin=0 | 391 ps    | 304 ps             | 241 ps                |
| B=0 to 1 to 0; A=0; Cin=1 | 386 ps    | 288 ps             | 240 ps                |
| B=0 to 1 to 0; A=1; Cin=0 | 383 ps    | 322 ps             | 229 ps                |
| Cin=0 to 1 to 0; A=0; B=1 | 329 ps    | 299 ps             | 205 ps                |
| Cin=0 to 1 to 0; A=1; B=0 | 368 ps    | 313 ps             | 188 ps                |

#### Table 3 Power Delay Product

| Transitions               | CPL Logic | Existing DPL Logic | Proposed SR-CPL Logic |
|---------------------------|-----------|--------------------|-----------------------|
| A=0 to 1 to 0; B=0; Cin=1 | 21        | 28                 | 14                    |
| A=0 to 1 to 0; B=1; Cin=0 | 23        | 30                 | 14                    |
| B=0 to 1 to 0; A=0; Cin=1 | 23        | 24                 | 14                    |
| B=0 to 1 to 0; A=1; Cin=0 | 25        | 26                 | 14                    |
| Cin=0 to 1 to 0; A=0; B=1 | 17        | 25                 | 9                     |
| Cin=0 to 1 to 0; A=1; B=0 | 21        | 26                 | 9                     |
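
The relation between the three tables can be explored with a short calculation. The sketch below rests on assumptions that the text does not state explicitly: that the Table 1 entries are average supply currents in microamperes, that the power is obtained as VDD times this current with the 1.8 V supply mentioned in the Discussion, and that the PDP is expressed in femtojoules. The names `CURRENT_UA`, `DELAY_PS`, `pdp_fj` and `pdp_table` are hypothetical.

```python
VDD = 1.8  # supply voltage in volts, as stated in the Discussion

# Values copied row by row from Table 1 (assumed µA) and Table 2 (ps).
CURRENT_UA = {
    "CPL": [32.6, 32.0, 32.5, 35.5, 27.7, 31.0],
    "DPL": [47.2, 47.4, 45.2, 45.4, 45.2, 45.2],
    "SR-CPL": [32.5, 31.4, 31.3, 32.7, 24.7, 25.1],
}
DELAY_PS = {
    "CPL": [361, 391, 386, 383, 329, 368],
    "DPL": [330, 304, 288, 322, 299, 313],
    "SR-CPL": [239, 241, 240, 229, 205, 188],
}


def pdp_fj(current_ua, delay_ps, vdd=VDD):
    """PDP in femtojoules: (µA x V) gives µW, and µW x ps gives 1e-18 J, i.e. 1e-3 fJ."""
    return current_ua * vdd * delay_ps * 1e-3


def pdp_table():
    """PDP per style and transition, rounded to one decimal place."""
    return {
        style: [
            round(pdp_fj(i, t), 1)
            for i, t in zip(CURRENT_UA[style], DELAY_PS[style])
        ]
        for style in CURRENT_UA
    }
```

Under these assumptions the first transition evaluates to roughly 21.2, 28.0 and 14.0 for CPL, DPL and SR-CPL, which is close to the first row of Table 3.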

Figure 7 Power Delay Plot for 8-Bit PFA SR-CPL vs Existing Styles



Figure 8 Power Dissipation Plot for 8-Bit PFA SR-CPL vs Existing Styles



Figure 9 Energy Delay Product Plot for 8-Bit PFA SR-CPL vs Existing Styles