# Data Path Design

## Contents

#### Adder

- Bit adder circuits
- Ripple Carry Adder
- CLA Adder
- Multipliers and Shifter
  - Partial-product generation
  - Partial-product accumulation
  - Final addition
  - Barrel shifter

## A Generic Digital Processor



## Building Blocks for Digital Architectures

#### □Arithmetic unit

• Bit sliced data path – adder, multiplier, shifter, comparator, etc.

#### □Memory

• RAM, ROM, buffers, shift registers

#### □Control

- Finite state machine (PLA, random logic)
- Counters

#### □Interconnect

• Switches, arbiters, bus

## Bit-Sliced Design



Tile identical processing elements

### Bit Adder Circuits

□ Consider two binary digits A an B, binary sum is denoted by A+B such that

0 + 0 = 00 + 1 = 11 + 0 = 11 + 1 = 10Half adder В А Half Carry Adder Sum

| A | В | Sum | Carry |
|---|---|-----|-------|
| 0 | 0 | 0   | 0     |
| 0 | 1 | 1   | 0     |
| 1 | 0 | 1   | 0     |
| 1 | 1 | 0   | 1     |

$$Sum = A \oplus B$$
$$Carry = A \cdot B$$



### Full-Adder



sum\_out =  $A \oplus B \oplus C$ = $ABC + A\overline{B}\overline{C} + \overline{A}\overline{B}C + \overline{A}\overline{C}B$ carry\_out=AB + AC + BC

AOI Full-Adder Logic





## Full-Adder Circuits

Adding *n*-bit binary words

 $a_3 a_2 a_1 a_0$ 

 $+ b_3 b_2 b_1 b_0$  (12.3)

 $c_4 \ s_3 s_2 s_1 s_0$ 

In the standard carry algorithm, each of the *i*th columns (*i* = 0, 1, 2, 3) operates according to the full-adder equation

 $\begin{array}{c} c_i \\ a_i \\ + b_i \end{array} (12.4) \\ \hline c_{i+1} s_i \end{array}$ 

$$s_{i} = a_{i} \oplus b_{i} \oplus c_{i}$$

$$c_{i+1} = a_{i} \cdot b_{i} + c_{i} \cdot (a_{i} \oplus b_{i})$$
Of
$$c_{i+1} = a_{i}b_{i} + c_{i} \cdot (a_{i} + b_{i})$$
(12.5)
(12.6)



### Full-adder logic networks



Full-adder logic networks

### Full-adder truth-table

| A | В | <i>C</i> <sub><i>i</i></sub> | S | С, | Carry<br>status |
|---|---|------------------------------|---|----|-----------------|
| 0 | 0 | 0                            | 0 | 0  | delete          |
| 0 | 0 | 1                            | 1 | 0  | delete          |
| 0 | 1 | 0                            | 1 | 0  | propagate       |
| 0 | 1 | 1                            | 0 | 1  | propagate       |
| 1 | 0 | 0                            | 1 | 0  | propagate       |
| 1 | 0 | 1                            | 0 | 1  | propagate       |
| 1 | 1 | 0                            | 0 | 1  | generate        |
| 1 | 1 | 1                            | 1 | 1  | generate        |

Generate (G) = AB

**Propagate** (P) =  $A \oplus B$ 

$$C_o(G, P) = G + PC_i$$
$$S(G, P) = P \oplus C_i$$

### Inversion Property



$$\bar{S}(A, B, C_{i}) = S(\bar{A}, \bar{B}, \bar{C}_{i})$$
$$\overline{C}_{o}(A, B, C_{i}) = C_{o}(\bar{A}, \bar{B}, \bar{C}_{i})$$

#### Minimize Critical Path by Reducing Inverting Stages



**Exploit Inversion Property** 

### **Mirror Circuits**

- Mirror circuits are based on series-parallel logic gates, but are usually faster and have a more uniform layout
  - » Output 0's imply that an nFET chain is conducting to ground
  - » Output 1's means that a pFET group provides support from the power supply



| a | b | $a \oplus b$ | On devices |
|---|---|--------------|------------|
| 0 | 0 | 0            | - nFET     |
| 0 | 1 | 1            | - pFET     |
| 1 | 0 | 1            | - pFET     |
| 1 | 1 | 0            | - nFET     |

Figure 9.1 XOR function table



### Mirror Circuits

- □ Based on series-parallel logic gates
- □ Usually faster, more uniform layout
- Same transistor topology for nFETs and pFETs



### Evolution of carry-out circuit



#### A Better Structure: The Mirror Adder



24 transistors

### Transmission-gate full-adder circuit



#### **Complementary Pass-Transistor Logic**

Complementary Pass-Transistor (CPL): an dual-rail tech. that is based on nFET logic equations

$$f = a \cdot b + \overline{a} \cdot a \qquad (9.41)$$
$$\Rightarrow a \cdot \overline{b} + \overline{a} = \overline{a} + \overline{b} = \overline{a \cdot b} \qquad (9.42)$$

- CPL has several 2-input gates that can be created by using the same transistor topology with different input sequences
  - » Less layout area
  - » However, threshold will be loss and the fact that an input variable may have to drive more than one FET terminal



#### **Complementary Pass-Transistor Logic**

Dual-rail complementary pass-transistor logic (CPL)

 $a_i \oplus b_i$  and  $a_i \oplus b_i$  $s_n = \overline{(a_i \oplus b_i)} \cdot c_i + (a_i \oplus b_i) \cdot \overline{c_i}$  $\overline{a_i} \cdot b_i + \overline{b_i} \cdot \overline{c_i}$  $\overline{b_i} \cdot c_i + a_i \cdot b_i$  $\overline{a_i} \cdot \overline{b_i} + b_i \cdot \overline{c_i}$  $b_i \cdot c_i + a_i \cdot \overline{b_i}$ 



(a) 2-input array



(b) Sum circuit



#### Full Adder using Half Adders



## **Ripple Carry Adders**

• A carry ripple adder chain : 16-bit binary adder



- Cascade-connection
- Fast carry-out response is essential
- the delay will increase significantly when the number of bit is increased

## **Ripple-Carry Adders**



Figure 12.12 A 4-bit ripple-carry adder



Figure 12.13 Worst-case delay through the 4-bit ripple adder



Figure 12.14 4-ibt adder-subtractor circuit

### Carry Look-Ahead Adder





 $c_1$ 

C<sub>0</sub>

Ca

 $c_A$ 

Figure 12.15 Basis of the carry look-ahead algorithm Figure 12.16 Logic network for 4-bit CLA carry bits Figure 12.17 Sum calculation using the CLA network

### Carry Look-Ahead Adders



+ C1

## Carry Look-Ahead Adders

 $c_2$ 



(a) C1 logic





(a) Complementary

(b) Pseudo nMOS





Figure 12.18 nFET logic arrays for the CLA terms

Figure 12.19 Possible uses of the nFET logic arrays in Figure 12.18

(c) Dynamic

### Manchester Carry Chains

| $a_i$ | $b_i$ | $p_i$ | $g_i$ | $k_i$ |
|-------|-------|-------|-------|-------|
| 0     | 0     | 0     | 0     | 1     |
| 0     | 1     | 1     | 0     | 0     |
| 1     | 0     | 1     | 0     | 0     |
| 1     | 1     | 0     | 1     | 0     |

Figure 12.23 Propagate, generate, and carry-kill values



Figure 12.24 Switching network for the carry-out equation



Figure 12.26 Dynamic Manchester carry chain

 $c_2$ 

C 1

c3

### Multipliers



Figure 12.39 Bit-level multiplier

|       |           |           |           | $a_3 \times b_3$ | $a_2$<br>$b_2$ | $a_1$<br>$b_1$ | $a_0$<br>$b_0$ | multiplicand<br>multiplier |
|-------|-----------|-----------|-----------|------------------|----------------|----------------|----------------|----------------------------|
|       |           |           |           | a3 b0            | a2 b0          | a1 b0          | a0 b0          | manipilei                  |
|       |           | +         | $a_3 b_1$ | $a_2 b_1$        | $a_1 b_1$      | $a_0 b_1$      |                |                            |
|       |           | + a3 b2   | $a_2 b_2$ |                  | $a_0 b_2$      |                |                |                            |
| +     | $a_3 b_3$ | $a_2 b_3$ | $a_1 b_3$ | $a_0 b_3$        |                |                |                |                            |
| $p_7$ | $p_6$     | $p_5$     | $p_4$     | $p_3$            | $p_2$          | $p_1$          | $p_0$          | product                    |

Figure 12.40 Multiplication of two 4-bit words





## Multipliers

|       |                  |         | ×                | $a_3$<br>$b_3$ | $a_2$<br>$b_2$ | $a_1$<br>$b_1$ | a <sub>0</sub><br>b <sub>0</sub> | multiplicand<br>multiplier |
|-------|------------------|---------|------------------|----------------|----------------|----------------|----------------------------------|----------------------------|
|       |                  |         |                  | (a3            | $a_2$          | $a_1$          | $a_0$ ) $	imes$ $b_0$            | ( $a \times b_0$ ) $2^0$   |
|       |                  |         | ( a <sub>3</sub> | $a_2$          | $a_1$          | $a_0$ )        | $\times b_1$                     | ( $a \times b_1$ ) $2^1$   |
|       |                  | ( $a_3$ | $a_2$            | $a_1$          | $a_0$ )        | $\times b_2$   |                                  | ( $a 	imes b_2$ ) $2^2$    |
| +     | ( a <sub>3</sub> | $a_2$   | $a_1$            | $a_0$ )        | $\times b_3$   |                |                                  | ( $a 	imes b_3$ ) $2^3$    |
| $p_7$ | $p_6$            | $p_5$   | $p_4$            | $p_3$          | $p_2$          | $p_1$          | $p_0$                            | product                    |

Figure 12.42 Alternate view of multiplication process



Figure 12.43 Using a product register for multiplication

### Partial product generation logic



### Shift-right multiplication sequence



Figure 12.44 Shift-right multiplication sequence

### Register based multiplier network



## Partial Product Accumulation- Array Multipliers



Figure 12.46 An array multiplier



Figure 12.47 Modularized view of the multiplication sequence



Figure 12.48 Details for a 4 X 4 array multiplier

### General Rotator



### 4-bit rotate-right network



### Left-rotate switching array



### 8 x 4 Barrel Shifter



| shift  | $b_0b_1b_2b_3$                    |
|--------|-----------------------------------|
| 0      | $a_0 a_1 a_2 a_3 a_1 a_2 a_3 a_4$ |
| 2<br>3 | $a_2 a_3 a_4 a_5 a_3 a_4 a_5 a_6$ |
| 4      | $a_4 a_5 a_6 a_7$                 |

### FET Array Barrel Shifter



# Thank You