l8 9 arithmetic

L8/9: Arithmetic Structures
L8/9: Arithmetic Structures
Acknowledgements:
Materials in this lecture are courtesy of the following sources and are used with permission.
Rex Min
Kevin Atkinson
Alice Wang
Prof. Randy Katz (Unified Microelectronics Corporation Distinguished Professor in Electrical
Engineering and Computer Science at the University of California, Berkeley) and Prof. Gaetano
Borriello (University of Washington Department of Computer Science & Engineering) from
Chapter 2 of R. Katz, G. Borriello. Contemporary Logic Design. 2nd ed. Prentice-Hall/Pearson
Education, 2005.
J. Rabaey, A. Chandrakasan, B. Nikolic. Digital Integrated Circuits: A Design Perspective.
Prentice Hall/Pearson, 2003.
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 1
Number Systems Basics
Number Systems Basics
How to represent negative numbers?
Three common schemes: sign-magnitude, ones
complement, twos complement
Sign-magnitude: MSB = 0 for positive, 1 for negative
Range: -(2N-1 1) to +(2N-1 1)
Two representations for zero: 0000& & 1000&
Simple multiplication but complicated addition/subtraction
_
Ones complement: if N is positive then its negative is N
Example: 0111 = 7, 1000 = -7
Range: -(2N-1 1) to +(2N-1 1)
Two representations for zero: 0000& & 1111&
Subtraction implemented as addition and negation
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 2
Twos Complement Representation
Twos Complement Representation
Twos complement = bitwise complement + 1
0111 1000 + 1 = 1001 = -7
1001 0110 + 1 = 0111 = 7
Asymmetric range: -2N-1 to +2N-1-1
Only one representation for zero
Simple addition and subtraction
Most common representation
0100
1100 0100
4 1100
-4 -4
4
0011
1101 1101
+ 3 0011
+ (-3) + 3
- 3
0111
11001 10001
7 1111
-7 -1
1
[Katz05]
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 3
Overflow Conditions
Overflow Conditions
Add two positive numbers to get a negative number or two negative numbers
to get a positive number
-1
+0
-1
+0
-2
-2
1111
0000
1111
0000
+1
+1
1110
1110
0001
0001
-3
-3
+2
+2
1101
1101
0010
0010
-4
-4
1100
1100
+3
+3
0011
0011
-5
-5
1011
1011
0100
0100
+4
+4
1010
1010
-6
-6
0101
0101
+5
+5
1001
1001
0110
0110
-7
-7
+6
+6
1000
0111 1000
0111
-8
-8
+7
+7
-7 - 2 = +7!
5 + 3 = -8!
1 00 0
0 1 1 1
-7 1 0 0 1
5 0 1 0 1
-2 1 1 0 0
3 0 0 1 1
7 1 0 1 1 1
-8 0 1 0 0 0
If carry in to sign equals carry out then can ignore carry out, otherwise have overflow
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 4
Binary Full Adder
Binary Full Adder
A B
S = A �" B �" Ci
= ABCi + ABCi + ABCi + ABCi
Full
Ci
Co
Adder
Co = AB + Ci (A+B)
S
A B CI S CO A B
0 0 0 0 0
CI 00 01 11 10
0 0 1 1 0
0 0 1 0 1
0 1 0 1 0
S
0 1 1 0 1
1
1 0 1 0
1 0 0 1 0
1 0 1 0 1
A B
1 1 0 0 1
1 1 1 1 1 CI 00 01 11 10
0 0 0 1 0
CO
1
0 1 1 1
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 5
Ripple Carry Adder Structure
Ripple Carry Adder Structure
B1 A1
B2 A2
B3 B0
A3 A0
Co,3
Co,1 Ci,0
Co,0
Co,2
Full
Full Full Full
Adder
Adder Adder Adder
S1
S3 S2 S0
Worst case propagation delay linear with the number of bits
tadder = (N-1)tcarry + tsum
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 6
Extension to Subtraction
Extension to Subtraction
Under twos complement, subtracting B is the same as
adding the bitwise complement of B then adding 1
Combination addition/subtraction system:
_
mux selects B for addition, B for subtraction
B1 B1
B2 B2
B0 B0
B3 B3
A2 0 1 A1 0 1
A0 0 1
A3
0 1
Add/Subtract
FA FA
FA
FA
Co,1 Co,0
Co,2
Co,3
Add 1 for
S0
S1
S3 S2
subtraction using
carry in
Overflow occurs if carry in to sign bit differs from final carry out
overflow
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 7
Comparator (one approach)
Comparator (one approach)
B2 B2 B1 B1
B0 B0
B3 B3
A2 0 1 A1 0 1
A0 0 1
A3
0 1
1
FA FA
FA
FA
Co,1 Co,0
Co,2
Co,3
S2 S0
S1
S3
N
true if negative
result
true if zero result
Z
A < B = N
A = B = Z
A d" B = Z + N
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 8
Alternate Adder Logic Formulation
Alternate Adder Logic Formulation
How to Speed up the Critical (Carry) Path?
(How to Build a Fast Adder?)
A B
Full
Cin
Co
Adder
S
Generate (G) = AB
Propagate (P) = A B
�"
Note: can also use P = A + B for Co
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 9
Carry Bypass Adder
Carry Bypass Adder
A0 B0 A1 B1 A2 B2 A3 B3
Can compute P, G
P,G P,G P,G
P,G
in parallel for all bits
P0 G0 P1 G1 P2 G2 P3 G3
FA FA
FA
FA
Ci,0
Co,0 Co,1 Co,2
Co,3
BP= P0P1P2P3
P,G P,G P,G
P,G
P0 G0 P1 G1 P2 G2 P3 G3
Ci,0
FA FA 0
FA
FA
Co,3
Co,0 Co,1 Co,2
1
Key Idea: if (P0 P1 P2 P3) then Co,3 = Ci,0
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 10
16-bit Carry Bypass Adder
16-bit Carry Bypass Adder
BP= P0P1P2P3
BP= P4P5P6P7
BP= P8P9P10P11
BP= P12P13P14P15
P,G P,G P,G
P,G
P,G P,G P,G
P,G
P,G P,G P,G
P,G
P,G P,G P,G
P,G
Ci,0
Co,3
Co,11
Co,7
FA FA FA 0
FA
FA FA FA
FA 0
FA FA FA
FA 0
FA FA FA
FA 0
Co,0 Co,1
Co,2
Co,4 Co,5
Co,6
Co,8 Co,9
1 Co,10 Co,12 Co,13
Co,14
1
1
1
Co,15
Assume the following for delay each gate:
P, G from A, B: 1 delay unit
P, G, Ci to Co or Sum for a FA: 1 delay unit
2:1 mux delay: 1 delay unit
What is the worst case propagation delay for the 16-bit adder?
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 11
Critical Path Analysis
Critical Path Analysis
BP= P0P1P2P3
BP2= P4P5P6P7
BP3= P8P9P10P11
BP4= P12P13P14P15
P,G P,G P,G
P,G
P,G P,G P,G
P,G
P,G P,G P,G
P,G
P,G P,G P,G
P,G
Ci,0
Co,3
Co,11
Co,7
FA FA FA 0
FA
FA FA FA
FA 0
FA FA FA
FA 0
FA FA FA
FA 0
Co,0 Co,1
Co,2
Co,4 Co,5
Co,6
Co,8 Co,9
1 Co,10 Co,12 Co,13
Co,14
1
1
1
Co,15
For the second stage, is the critical path:
BP2 = 0 or BP2 = 1?
Message: Timing Analysis is Very Tricky
Must Carefully Consider Data Dependencies For
False Paths
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 12
Carry Lookahead Adder
Carry Lookahead Adder
Re-express the carry logic as follows:
C1 = G0 + P0 C0
C2 = G1 + P1 C1 = G1 + P1 G0 + P1 P0 C0
C3 = G2 + P2 C2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 C0
C4 = G3 + P3 C3 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 C0
&
Each of the carry equations can be implemented in a two-level logic
network
Variables are the adder inputs and carry in to stage 0
Ripple effect has been eliminated!
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 13
Carry Lookahead Logic
Carry Lookahead Logic
Ai
Pi
Bi
Adder with propagate and
Si
Ci
generate outputs
Gi
Later stages have increasingly complex logic
C0 C0 C0
P0 C1 P0 P0
P1 P1
G0
P2 P2
P3
G0
P1 G0
C0
P2 P1
P0
P2
P1
G1 C3 P3
G0 P2
C2 G1
P1
P2
G2 P3
G1 C4
G2
P3
G3
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 14
Block Generate and Propagate
Block Generate and Propagate
Gj:i and Pj:i denote the Generate and Propagate functions, respectively, for a group of bits
from positions i to j. We call them Block Generate and Block Propagate. Gj:i equals 1 if
the group generates a carry independent of the incoming carry. Pj:i equals 1 if an
incoming carry propagates through the entire group. For example, G3:2 is equal to 1 if a
carry is generated at bit position 3, or if a carry out is generated at bit position 2 and
propagates through position 3. G3:2 = G3 + P3G2. P3:2 is true if an incoming carry
propagates through both bit positions 2 and 3. P3:2 = P3P2
C2 = (G1 + P1 G0 ) + (P1 P0 )C0 = G1:0 + P1:0 C0
C4 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 C0
= (G3 + P3 G2 ) + (P3 P2 )Co,1 = G3:2 + P3:2 C2
= G3:2 + P3:2(G1:0 + P1:0 C0) = G3:0 + P3:0 C0
The carry out of a 4-bit block can thus be computed using only the block generate and propagate
signals for each 2-bit section, plus the carry in to bit 0. The same formulation will be used to generate
the carry out signals for a 16-bit adder using the block generate and propagate from 4-bit sections.
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 15
More Definitions&
More Definitions&
( , ) " ( , ) ( , )
g p g' p' = g + pg' pp'
The above dot operator obeys the associative property, but it is not commutative
(G3:2,P3:2) = (G3,P3) " (G2,P2)
Co, 3, 0 = G3 P3 G2 P2 G1 P1 G0 P0 Ci, 0, 0
() (( , ) " ( , ) " ( , ) " ( , )) " ()
G3:0 P3:0 = G3 P3 G2 P2 G1 P1 G0 P0
() [( , ) " ( , )] " [( , ) " ( , ) ]
,
= G3:2 P3:2 G1:0 P1:0
() " ()
, ,
Co, k, 0 Ci, 0, 0
() (( , ) " ( " & " ( , )) " ()
= Gk Pk Gk 1, Pk 1) G0 P0
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 16
Logarithmic Look-Ahead Adder
Logarithmic Look-Ahead Adder
A0
F
A1 A2 A3 A4 A5 A6 A7
tp: O(N)
A0
A1
A2
A3
F
A4
A5
tp:O(log2N)
A6
A7
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 17
16-bit Kogge-Stone Tree Adder
16-bit Kogge-Stone Tree Adder
Sum Logic
Propagate, Generate Logic
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 18
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
S
S
S
S
S
S
S
S
S
S
S
S
S
S
S
S
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
12
12
13
13
14
14
15
15
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
(
A
,
B
)
Adder Performance
Adder Performance
Ripple
Bypass
Select
Lookahead
Delay vs. number of bits
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 19
Addition of M, N-bit Numbers
Addition of M, N-bit Numbers
IN1N-1
+
+ +
+
INM-1N-1
IN0N-1 IN3N-1
IN2N-1
IN1N-2
+
+
+ +
INM-1N-2
IN0N-2 IN3N-2
IN2N-2
IN11
+
+ +
+
INM-11
IN01 IN31
IN21
IN10
+
+ +
+
INM-10
IN00 IN30
IN20
Cin =0
Cin =0 Cin =0
Cin =0
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 20
16-bit Carry Lookahead Schematic
16-bit Carry Lookahead Schematic
181 configured for A+B:
M = 0, S3-0 = 1001
A3:0 A7:4 A11:8
B3:0 B7:4 B11:8
A15:12
B15:12
Cn+4 Cn+4 Cn+4
Cn Cn Cn Cn+4
Cn
181 181 181
181
P G P G P G
P G
S3:0 S7:4 S11:8
S15:12
P3:0
G3:0
P0 G0 P1 G1 P2 G2 P3 G3
G
182
Cn
P
Cin
Cn+x Cn+y Cn+z
182 computes Cin for later stages,
using block G & P from earlier stages
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 21
Binary Multiplication
Binary Multiplication
y0
x3
x2 x1 x0
y1
x3
Partial product computation x2 x1 x0
z0
is simple (single and gate)
HA
FA FA HA
y2
x3 x2
x1
x0
z1
FA FA
FA
HA
x2
x3 x1
x0 y3
z2
FA
FA FA
HA
z5
z7 z6 z4
z3
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 22
A Serial (Magnitude) Multiplier
A Serial (Magnitude) Multiplier
Shift/LD
Shift
xBus
[7]
0
0
D Q
1
rst
[6]
0
0
D Q
1
[5] 8
0
0
D Q
1
Q
D
+
acc_out
[4]
8
0 xBus
0
D Q
1
[3]
8 Shift
0
x3
D Q
1
[2]
LD
0
x2
D Q CLK
1
Q
yReg D
[1]
XY
0
x1
D Q
1
CLK
[0]
0
x0
D Q
CLK
1
0
CLK
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 23
add_out
Q
Q
Q
Q
D
D
D
D
0
1
0
1
0
1
3
Y
Shift/LD
2
0
1
Y
Y
Y
Timing Diagram
Timing Diagram
CLK
Shift
0 0 0 0 x3 x2 x1 x0
0 0 0 0 x3 x2 x1 x0 0 0 0 x3 x2 x1 x0 0 0 0 x3 x2 x1 x0 0 0 0 x3 x2 x1 x0 0 0 0
xreg
y0 y1 y2 y3
y0 y1 y2 y3 y1 y2 y3 X y2 y3 X X y3 X X X
yreg
00000000
00000000 Accum_1 Accum_2 Accum_3
Acc_out
PRODUCT
PRODUCT
X*Y
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 24
Verilog of Serial Multiplier
Verilog of Serial Multiplier
always @ (posedge clk)
module serialmult(shift, clk,
x, y, xy);
begin
input shift, clk;
if (shift == 1'b0)
input [3:0] x, y;
begin
output [7:0] xy;
xReg <= {4'b0, x};
reg [7:0] xReg;
yReg <= y;
reg [3:0] yReg;
acc_out <= 8'b0;
reg [7:0] xBus, acc_out,
xy_int <= add_out;
end
xy_int;
else
wire[7:0] add_out;
begin
assign add_out = xBus +
xReg <= {xReg[6:0], 1'b0};
acc_out;
yReg <= {y[3], yReg[3:1]};
assign xy = xy_int;
acc_out <= add_out;
xy_int <= xy;
always @ (yReg[0] or xReg)
end // if shift
begin
end // always
endmodule
if (yReg[0] == 1'b0) xBus =
8'b0;
else xBus = xReg;
end
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 25
Simulation
Simulation
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 26
Baugh Wooley Formulation
Baugh Wooley Formulation
Assuming X and Y are 4-bit twos complement numbers:
2
2
X = -23x3 + Ł xi2i Y = -23y3 + Ł yi2i
i=0 i=0
The product of X and Y is:
2 2 2
2
XY = x3y326 - Ł xiy32i+3 - Ł x3yj2j+3 + ŁŁxiyj2i+j
j=0
i=0
j=0
i=0
For twos complement, the following is true:
3 3
-Ł xi2i = -24 + Ł xi2i + 1
i=0
i=0
The product then becomes:
2 2 2
2
XY = x3y326 + Ł xiy32i+3 + 23 - 26 + Ł x3yj2j+3 + 23 26 + ŁŁxiyj2i+j
i=0 j=0 i=0
j=0
2 2 2 2
= x3y326 + Ł xiy32i+3 + Ł x3yj2j+3 + ŁŁxiyj2i+j + 24 27
i=0 j=0
i=0 j=0
= 27 + x3y326 + (x2y3 + x3y2)25 + (x1y3 + x3y1 + x2y2 +1)24
+ (x0y3 + x3y0 + x1y2 + x2y1)23 + (x0y2 + x1y1 + x2y0)22 + (x0y1 + x1y0)21
+ (x0y0)20
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 30
Twos Complement Multiplication
Twos Complement Multiplication
y0
x3
x2 x1 x0
y1
x3
x2 x1 x0
z0
1
FA
FA FA HA
y2
x3 x2
x1
x0
z1
FA FA
FA
HA
x2
x3 x1
x0 y3
1
z2
FA
HA FA FA
HA
z7 z5
z6 z4
z3
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 27
Summary
Summary
Performance of arithmetic blocks dictate the
performance of a digital system
Architectural and logic transformations can
enable significant speed up (e.g., adder delay
from O(N) to O(log2(N))
Similar concepts and formulation can be applied
at the system level
Timing analysis is tricky: watch out for false
paths!
Area-Delay trade-offs (serial vs. parallel
implementations)
L8/9: 6.111 Spring 2006 Introductory Digital Systems Laboratory 28

Wyszukiwarka

Podobne podstrony:
chap2 l8
ArithmeticModulo README
math Complex Numbers and Complex Arithmetic
ponadgim m3 L8
Binary Arithmetic
language operators arithmetic
M6 Engine Workshop Manual L8 LF L3 1 (2)
K4 L8
ArithmeticException
ArithmeticException
Arithmetic
Arithmetic3 README
L8
ArithmeticModulo2 ZADANIA
V L82809?lass101

więcej podobnych podstron