L16: 6.111 Spring 2006
Introductory Digital Systems Laboratory
L16: Power Dissipation in Digital Systems
Problem #1: Power Dissipation/Heat
[Figure: Power (Watts, log scale) vs. Year, 1971 to 2008, for Intel processors from the 4004 to the Pentium® proc; annotated values: 500W, 1.5KW, 5KW, 18KW.]
[Figure: Power Density (W/cm2, log scale) vs. Year, 1970 to 2010, for the 4004 to the P6; reference levels: Hot Plate, Nuclear Reactor, Rocket Nozzle, Sun’s Surface.]
Courtesy Intel (S. Borkar)
How do you cool these chips??
[Figure: chip with heat sink.]
Problem #2: Energy Consumption
(Image by MIT OCW. Adapted from Jon Eager, Gates Inc., S. Watanabe, Sony Inc.)
No Moore’s law for batteries…
Today: Understand where power goes and ways to manage it
The Energy Problem
AA battery (7.5 cm3), Alkaline: ~10,000J
What can One Joule of energy do?
Mow your lawn for 1 ms
Send a 1 Megabyte file over 802.11b
Operate a processor for ~7s
Image by MIT OCW.
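A rough back-of-the-envelope check of the battery numbers above. This is only a sketch: it assumes the "~7 s per Joule" figure implies a constant processor power of about 1/7 W, and the variable names are illustrative rather than taken from the slides.

```python
# Figures quoted on the slide.
BATTERY_ENERGY_J = 10_000.0   # alkaline AA battery, ~10,000 J
SECONDS_PER_JOULE = 7.0       # "operate a processor for ~7 s" on one Joule

# Assumption: the processor draws constant power while it runs.
processor_power_w = 1.0 / SECONDS_PER_JOULE
runtime_s = BATTERY_ENERGY_J * SECONDS_PER_JOULE
runtime_hours = runtime_s / 3600.0

print(f"Implied processor power: {processor_power_w * 1e3:.0f} mW")
print(f"Runtime on one AA battery: {runtime_hours:.1f} hours")
```

Under that assumption a single AA battery lasts well under a day, which is why the energy budget matters for portable systems.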
Dynamic Energy Dissipation
Charging (IN = 0): $C_L$ charges to $V_{DD}$ through $R_P$
$E_{0 \to 1} = C_L V_{DD}^2$ (drawn from the supply by $i_{DD}$)
$E_{cap} = \frac{1}{2} C_L V_{DD}^2$
$E_{diss,RP} = \frac{1}{2} C_L V_{DD}^2$
Discharging (IN = 1): $C_L$ discharges through $R_N$
$E_{diss,RN} = \frac{1}{2} C_L V_{DD}^2$
$P = C_L V_{DD}^2 f_{clk}$
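The energy bookkeeping for one charge/discharge cycle can be sketched numerically. The component values below (10 fF, 2 V, 1 GHz) are illustrative assumptions, not numbers from the slides, and the function names are hypothetical.

```python
def switching_energy(c_load, v_dd):
    """Energy drawn from the supply for one 0->1 output transition (J)."""
    return c_load * v_dd ** 2


def dynamic_power(c_load, v_dd, f_clk):
    """Average power if the output makes one 0->1 transition per clock cycle (W)."""
    return switching_energy(c_load, v_dd) * f_clk


# Illustrative values (assumptions).
C_L = 10e-15    # load capacitance, 10 fF
V_DD = 2.0      # supply voltage, V
F_CLK = 1e9     # clock frequency, Hz

e_supply = switching_energy(C_L, V_DD)   # taken from the supply while charging
e_cap = 0.5 * e_supply                   # stored on C_L
e_diss_rp = e_supply - e_cap             # lost in R_P while charging
e_diss_rn = e_cap                        # lost in R_N while discharging

print(f"E(0->1) = {e_supply:.2e} J, P = {dynamic_power(C_L, V_DD, F_CLK):.2e} W")
```

Note that neither $R_P$ nor $R_N$ appears in the result: the resistances set how fast the capacitor charges, not how much energy is dissipated.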
The Transition Activity Factor $\alpha_{0 \to 1}$
Current Input   Next Input   Output Transition
00              00           1 -> 1
00              01           1 -> 1
00              10           1 -> 1
00              11           1 -> 0
01              00           1 -> 1
01              01           1 -> 1
01              10           1 -> 1
01              11           1 -> 0
10              00           1 -> 1
10              01           1 -> 1
10              10           1 -> 1
10              11           1 -> 0
11              00           0 -> 1
11              01           0 -> 1
11              10           0 -> 1
11              11           0 -> 0
$\alpha_{0 \to 1} = 3/16$
Assume inputs (A,B) arrive at rate f and are uniformly distributed
What is the average power dissipation?
$P = \alpha_{0 \to 1} C_L V_{DD}^2 f$
[Figure: two-input NAND gate with inputs A, B and output Z.]
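The 3/16 figure can be reproduced by enumerating every (current input, next input) pair, as in the table. This is a minimal sketch; `nand` and `activity_factor` are illustrative names, and the uniform-input assumption is the one stated on the slide.

```python
from fractions import Fraction
from itertools import product


def nand(a, b):
    """Two-input NAND: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def activity_factor(gate):
    """Fraction of (current, next) input pairs that cause a 0->1 output transition."""
    inputs = list(product((0, 1), repeat=2))
    rising = 0
    total = 0
    for current, nxt in product(inputs, repeat=2):
        total += 1
        if gate(*current) == 0 and gate(*nxt) == 1:
            rising += 1
    return Fraction(rising, total)


alpha = activity_factor(nand)
print(alpha)  # 3/16
```

Passing a different two-input function to `activity_factor` gives the corresponding factor for that gate under the same uniform-input assumption.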
Junction (Silicon) Temperature
Simple Scenario
$T_j - T_a = R_{\theta JA} P_D$
$R_{\theta JA}$ is the thermal resistance between silicon and ambient
$T_j = T_a + R_{\theta JA} P_D$
Make $R_{\theta JA}$ as low as possible
Realistic Scenario
[Figure: thermal path from Silicon ($T_J$) through $R_{\theta JC}$ to Case ($T_C$), through $R_{\theta CS}$ to Sink ($T_S$), through $R_{\theta SA}$ to ambient ($T_A$), with $P_D$ as the heat source.]
$R_{\theta CA} = R_{\theta CS} + R_{\theta SA}$ is minimized by facilitating heat transfer (bolt case to extended metal surface, i.e., a heat sink)
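The series thermal-resistance model can be sketched as follows. The resistance and power values are illustrative assumptions chosen for round numbers, not data from the slides, and `node_temperatures` is a hypothetical helper.

```python
def node_temperatures(t_ambient, p_d, r_jc, r_cs, r_sa):
    """Temperatures along the silicon -> case -> sink -> ambient path (deg C).

    Thermal resistances are in deg C per watt; p_d is dissipated power in watts.
    """
    t_sink = t_ambient + r_sa * p_d
    t_case = t_sink + r_cs * p_d
    t_junction = t_case + r_jc * p_d
    return t_junction, t_case, t_sink


# Illustrative values (assumptions): 10 W chip in 25 deg C ambient.
t_j, t_c, t_s = node_temperatures(t_ambient=25.0, p_d=10.0,
                                  r_jc=1.0, r_cs=0.5, r_sa=2.5)
print(t_j, t_c, t_s)  # 65.0 55.0 50.0
```

The junction temperature equals $T_a + (R_{\theta JC} + R_{\theta CS} + R_{\theta SA}) P_D$, so lowering any one resistance in the chain lowers $T_j$ by the same amount per watt.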
Intel Pentium 4 Thermal Guidelines
Pentium 4 @ 3.06 GHz dissipates 81.8W!
Maximum $T_C$ = 69 °C
$R_{CA}$ < 0.23 °C/W for 50 °C ambient
Typical chips dissipate 0.5-1W (cheap packages without forced air cooling)
Image by MIT OpenCourseWare.
Image by MIT OpenCourseWare. Adapted from Intel Pentium 4 documentation.
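The case-to-ambient bound follows directly from the three quoted numbers. A short check, using only the figures on the slide (variable names are illustrative):

```python
# Values quoted on the slide.
P_D = 81.8          # dissipated power, W
T_CASE_MAX = 69.0   # maximum case temperature, deg C
T_AMBIENT = 50.0    # ambient temperature, deg C

# T_C = T_A + R_CA * P_D, so the largest allowed case-to-ambient resistance is:
r_ca_max = (T_CASE_MAX - T_AMBIENT) / P_D

print(f"R_CA must stay below {r_ca_max:.2f} deg C/W")
```

A resistance this low is what forces the use of a large heat sink with forced air cooling.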
Power Reduction Strategies
$P = \alpha_{0 \to 1} C_L V_{DD}^2 f$
Reduce Transition Activity or Switching Events
Reduce Capacitance (e.g., keep wires short)
Reduce Power Supply Voltage
Frequency is typically fixed by the application, though this can be adjusted to control power
Optimize at all levels of design hierarchy
Clock Gating is a Good Idea!
[Figure: Global Clock gated by Enable_Adder and Enable_Multiplier to produce the Adder Clock (Adder Off) and the Multiplier Clock (Multiplier On).]
100’s of different clocks in a microprocessor
Clock Gating Reduces Energy; does it reduce Power?
Clock gating reduces activity and is the most common low-power technique used today
Does your GHz Processor run at a GHz?
Use of Thermal Feedback
[Figure: feedback loop in which the Processor's Thermal Sensor drives Chip Activity Control.]
Note that there is a difference between average and peak power
On-chip thermal sensor (diode based) measures the silicon temperature
If the silicon junction gets too hot (say 125 °C), then the activity is reduced (e.g., reduce clock rate or use clock gating)
Power Supply Resonance
[Figure: power delivery network with $L_{board}$, $L_{package}$, $R_{grid}$, board decap, on-die decap, and switching currents.]
Can write a Virus to Activate Power Supply Resonance!
Image removed due to copyright restrictions.
Number Representation: Two’s Complement vs. Sign Magnitude
Consider a 16 bit bus where the input toggles between +1 and -1 (i.e., a small noise input)
Which representation is more energy efficient?
[Figure: 4-bit number wheels (codes 0000 to 1111) for Sign-Magnitude, with values +0 to +7 and -0 to -7, and for Two’s complement.]
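The question above can be answered by counting how many bus lines change when the value flips between +1 and -1. This is a small sketch; the helper names are illustrative.

```python
WIDTH = 16  # bus width from the slide


def twos_complement(value, width=WIDTH):
    """Encode a signed integer as a two's complement bit pattern."""
    return value & ((1 << width) - 1)


def sign_magnitude(value, width=WIDTH):
    """Encode a signed integer as sign bit (MSB) plus magnitude."""
    sign = (1 << (width - 1)) if value < 0 else 0
    return sign | abs(value)


def toggled_bits(x, y):
    """Number of bus lines that change between two bit patterns."""
    return bin(x ^ y).count("1")


tc = toggled_bits(twos_complement(+1), twos_complement(-1))
sm = toggled_bits(sign_magnitude(+1), sign_magnitude(-1))
print(f"two's complement: {tc} bits toggle, sign-magnitude: {sm} bit toggles")
```

For this particular input, sign-magnitude toggles only the sign bit, while two's complement toggles every bit except the least significant one.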
Time Sharing is a Bad Idea
Time Sharing Increases Switching Activity
Not just a 6-1 Issue: “Cool” Software ???
Array addresses seen by the CPU:
a[0]   01111111 00000000
a[1]   01111111 00000001
a[2]   01111111 00000010
a[3]   01111111 00000011
b[0]   10000000 00000000
b[1]   10000000 00000001
b[2]   10000000 00000010
b[3]   10000000 00000011
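#include <math.h>   /* sin, cos */

/* Version 1: two separate loops */
float a[256], b[256];
float pi = 3.14;
int i;
for (i = 0; i < 256; i++) { a[i] = sin(pi * i / 256); }
for (i = 0; i < 256; i++) { b[i] = cos(pi * i / 256); }

/* Version 2: one loop, alternating between a[] and b[] */
float a[256], b[256];
float pi = 3.14;
int i;
for (i = 0; i < 256; i++) {
    a[i] = sin(pi * i / 256);
    b[i] = cos(pi * i / 256);
}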
[Figure: CPU drives a 16-bit address bus to MEMORY.]
Separate loops: 2(8)+2(2+4+8+16+32+64+128+256) = 1030 transitions
Alternating accesses: 512(8)+2+4+8+16+32+64+128+256 = 4607 bit transitions
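The address-bus activity of the two access orders can also be counted directly. This sketch assumes the address map shown on the slide (a[] starting at 0x7F00, b[] at 0x8000, one address per element), 256 elements per array, and that only the array accesses appear on the bus; instruction fetches are ignored. Under those assumptions the exact counts land close to, but not exactly on, the slide's estimates.

```python
A_BASE = 0x7F00   # address of a[0] from the slide
B_BASE = 0x8000   # address of b[0] from the slide
N = 256           # elements per array


def bus_transitions(addresses):
    """Total number of address-bus bit flips over a sequence of accesses."""
    return sum(bin(prev ^ cur).count("1")
               for prev, cur in zip(addresses, addresses[1:]))


# Separate loops: all of a[], then all of b[].
separate = [A_BASE + i for i in range(N)] + [B_BASE + i for i in range(N)]

# One loop: a[0], b[0], a[1], b[1], ...
interleaved = [base + i for i in range(N) for base in (A_BASE, B_BASE)]

print("separate loops :", bus_transitions(separate))
print("alternating    :", bus_transitions(interleaved))
```

The alternating order flips all eight upper address bits on every access, which is where most of the extra transitions come from.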
Glitching Transitions
Balancing paths reduces glitching transitions
Structures such as multipliers have a lot of glitching transitions
Keeping logic depths short (e.g., pipelining) reduces glitching
Chain Topology: (((A+B) + C) + D)
Tree Topology: (A+B) + (C+D)
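One way to see why the tree is better balanced is to compare logic depth and input arrival skew for the two adder arrangements. This is a simplified sketch: it assumes every adder has unit delay and all primary inputs arrive at time 0, and it uses arrival skew only as a rough proxy for glitching. The nested-tuple representation is an illustrative choice.

```python
def depth(expr):
    """Number of adder stages between the primary inputs and this node's output."""
    if isinstance(expr, str):       # primary input such as "A"
        return 0
    left, right = expr
    return 1 + max(depth(left), depth(right))


def max_skew(expr):
    """Largest difference in arrival time between the two inputs of any adder."""
    if isinstance(expr, str):
        return 0
    left, right = expr
    here = abs(depth(left) - depth(right))
    return max(here, max_skew(left), max_skew(right))


chain = ((("A", "B"), "C"), "D")    # (((A+B) + C) + D)
tree = (("A", "B"), ("C", "D"))     # (A+B) + (C+D)

print("chain: depth", depth(chain), "max skew", max_skew(chain))
print("tree : depth", depth(tree), "max skew", max_skew(tree))
```

In the chain, the last adder sees D immediately but its other input two adder delays later, so its output can switch more than once per evaluation; in the tree both inputs of every adder arrive together.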
Reduce Supply Voltage : But is it Free?
[Figure: transistor (G, D, S) with gate at $V_{DD}$ discharging $C_L$, initially at $V_{DD}$, at t = 0+; drain current $\frac{k}{2}(V_{DD} - V_T)^2$.]
$Delay = \frac{C_L \cdot \Delta V}{i_D} = \frac{C_L \cdot (V_{DD}/2)}{\frac{k}{2}(V_{DD} - V_T)^2} \propto \frac{V_{DD}}{(V_{DD} - V_T)^2} \approx \frac{1}{V_{DD}}$
$V_{DD}$ from 2V to 1V: energy ↓ by x4, delay ↑ x2
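The energy/delay trade-off can be checked numerically with the delay expression above. The values of `C_L`, `K`, and the nonzero threshold voltage are illustrative assumptions; the x2 delay figure corresponds to the approximation that neglects $V_T$.

```python
def switching_energy(c_load, v_dd):
    """Energy per 0->1 transition, C_L * V_DD^2 (J)."""
    return c_load * v_dd ** 2


def gate_delay(c_load, v_dd, k, v_t=0.0):
    """Time to move the output by V_DD/2 with current (k/2)(V_DD - V_T)^2."""
    return c_load * (v_dd / 2) / ((k / 2) * (v_dd - v_t) ** 2)


C_L = 10e-15   # illustrative load capacitance (assumption)
K = 1e-4       # illustrative transistor gain factor (assumption)

energy_ratio = switching_energy(C_L, 2.0) / switching_energy(C_L, 1.0)
delay_ratio = gate_delay(C_L, 1.0, K) / gate_delay(C_L, 2.0, K)
delay_ratio_vt = gate_delay(C_L, 1.0, K, v_t=0.4) / gate_delay(C_L, 2.0, K, v_t=0.4)

print(f"energy saving: x{energy_ratio:.1f}")
print(f"delay penalty: x{delay_ratio:.1f} (V_T neglected), "
      f"x{delay_ratio_vt:.2f} (V_T = 0.4 V, illustrative)")
```

With a nonzero threshold the delay penalty is larger than x2, since $(V_{DD} - V_T)^2$ shrinks faster than $V_{DD}^2$.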
Transistors Are Free… (What do you do with a Billion Transistors?)
Serial: one multiplier (IN to OUT), f = 1GHz, $V_{DD}$ = 2V
$P_{serial} = C_{mult} \cdot 2^2 \cdot f$
Parallel: two multipliers, each at f = 500MHz, $V_{DD}$ = 1V, with SELECT choosing OUT
$P_{parallel} = (2 C_{mult} \cdot 1^2 \cdot f/2) = P_{serial}/4$
Trade Area for Low Power
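The serial versus parallel comparison can be reproduced with the same power formula. The capacitance value is an illustrative assumption, and the sketch ignores the extra capacitance of the output multiplexer and input routing, as the slide's expression does.

```python
def dynamic_power(c, v_dd, f):
    """P = C * V_DD^2 * f (W)."""
    return c * v_dd ** 2 * f


C_MULT = 1e-12   # switched capacitance of one multiplier (illustrative assumption)

# One multiplier at full rate and full voltage.
p_serial = dynamic_power(C_MULT, v_dd=2.0, f=1e9)

# Two multipliers, each at half rate, which allows the supply to drop to 1 V.
p_parallel = dynamic_power(2 * C_MULT, v_dd=1.0, f=0.5e9)

print(f"P_serial / P_parallel = {p_serial / p_parallel:.1f}")
```

Throughput is unchanged because two results are produced every 2 ns instead of one every 1 ns; the cost is twice the area.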
Algorithmic Workload
Exploit Time Varying Algorithmic Workload To Vary the Power Supply Voltage
Image by MIT OCW.
Dynamic Voltage Scaling (DVS)
Fixed Power Supply (ACTIVE, then IDLE): $E_{FIXED} = \frac{1}{2} C V_{DD}^2$
Variable Power Supply (ACTIVE): $E_{VARIABLE} = \frac{1}{2} C (V_{DD}/2)^2 = E_{FIXED}/4$
[Figure: Normalized Energy vs. Normalized Workload, both from 0 to 1.0, for Fixed Supply and Variable Supply.] [Gutnik97]
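A first-order model of the fixed versus variable supply comparison is sketched below. It is an idealization and not the [Gutnik97] data: it assumes delay is proportional to $1/V_{DD}$ (threshold voltage neglected), no lower limit on the supply voltage, no converter losses, and zero energy while idle. Workload and energy are both normalized to 1 at full load.

```python
def fixed_supply_energy(workload):
    """Run at full speed and full voltage for a fraction of the time, then idle."""
    return workload


def variable_supply_energy(workload):
    """Stretch the work over the whole interval at a proportionally lower voltage.

    Energy per operation scales as V_DD^2, i.e. workload^2, and the number of
    operations scales as workload, giving workload^3 overall.
    """
    return workload ** 3


for w in (0.25, 0.5, 1.0):
    print(w, fixed_supply_energy(w), variable_supply_energy(w))
```

At half workload the variable supply runs at $V_{DD}/2$ and uses one quarter of the fixed-supply energy, matching $E_{VARIABLE} = E_{FIXED}/4$.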
DVS on a Processor
Digitally adjustable DC-DC converter powers SA-1110 core
μOS selects appropriate clock frequency based on workload and latency constraints
[Figure: block diagram with SA-1110, μOS, Control (5), Controller, 3.6V, and $V_{out}$.]
[Figure: Energy per Operation (0 to 1) vs. Frequency (MHz: 59.0, 88.5, 118.0, 147.5, 176.9, 206.4) and Core Voltage (V: 0.9 to 1.6).]
Figure by MIT OpenCourseWare. Adapted from R. Min, T. Furrer, and A. P. Chandrakasan. "Dynamic Voltage Scaling Techniques for Distributed Microsensor Networks." Workshop on VLSI (April 2000): 43-46.
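One plausible way to choose a clock frequency from a latency constraint is to pick the slowest setting that still meets the deadline. The sketch below is hypothetical: the frequency list comes from the figure's axis, but `select_frequency` and its cycle-count input are illustrative and do not describe the actual μOS policy.

```python
# Frequency settings (MHz) read from the figure's axis.
SA1110_FREQS_MHZ = [59.0, 88.5, 118.0, 147.5, 176.9, 206.4]


def select_frequency(cycles, deadline_s, freqs_mhz=SA1110_FREQS_MHZ):
    """Return the lowest frequency (MHz) that finishes `cycles` within the deadline.

    Returns None if even the fastest setting is too slow.
    """
    for f_mhz in sorted(freqs_mhz):
        if cycles / (f_mhz * 1e6) <= deadline_s:
            return f_mhz
    return None


# A task of one million cycles with a 10 ms deadline needs at least 100 MHz.
print(select_frequency(1_000_000, 0.010))
```

Running at the lowest adequate frequency is what allows the converter to lower the core voltage and reduce energy per operation.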
Energy Efficiency of Software
“Software” Energy Dissipation has Large Overhead
Processor (StrongARM-1100)
[Figure: Average Current (A), 0 to 0.25, across ARM Instructions.] Figure by MIT OpenCourseWare. Adapted from A. Sinha, DAC.
[Figure: Power (%), 0 to 45, by block: Cache, Control, GCLK, EBOX, I/O,PLL.] Figure by MIT OpenCourseWare. Adapted from Montanaro 1996, JSSC.
FPGA (Xilinx)
[Figure: array of CLBs; power breakdown across Interconnect, Clock, CLB, and I/O with shares of 5%, 9%, 21%, and 65%.] Image by MIT OpenCourseWare. Adapted from Kusse 1998, UCB.
Trends: Leakage and Power Gating
Low $V_T$ devices are leaky
A High $V_T$ device (Sleep) is used to gate leakage current
Switching (computing): $E = C V_{DD}^2$
Leakage (standby): $E = V_{DD} I_0 10^{-V_T/S}$
[Figure: Total Energy / Switching Energy vs. Duty Cycle (%), from 0 to 1.]
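The exponential dependence of leakage on threshold voltage is what makes a high-$V_T$ sleep transistor effective. The sketch below evaluates the $I_0 10^{-V_T/S}$ term for two thresholds; all numeric values are illustrative assumptions, and $S$ is taken to be the subthreshold slope in volts per decade of current.

```python
def leakage_current(i_0, v_t, s):
    """Subthreshold leakage current I_0 * 10^(-V_T / S) (A)."""
    return i_0 * 10 ** (-v_t / s)


# Illustrative values (assumptions, not from the slides).
I_0 = 1e-6      # A
S = 0.1         # V per decade
V_DD = 1.0      # V

low_vt = leakage_current(I_0, v_t=0.2, s=S)
high_vt = leakage_current(I_0, v_t=0.4, s=S)

print(f"low-V_T leakage power : {V_DD * low_vt:.1e} W")
print(f"high-V_T leakage power: {V_DD * high_vt:.1e} W")
print(f"reduction: x{low_vt / high_vt:.0f}")
```

Each increase of $V_T$ by $S$ cuts leakage by a factor of ten, so a modest threshold difference gives a large standby saving.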
Trends: Energy Scavenging
Image removed due to copyright restrictions.
MEMS Generator: Vibration-to-Electric Conversion, ~10 μW
Power Harvesting Shoes: after 3-6 steps, it provides 3 mA for 0.5 sec, ~10mW
Courtesy of Joe Paradiso (MIT Media Lab). Used with permission.
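The scavenging figures can be put in the same units as the earlier one-Joule examples. This is simple derived arithmetic from the quoted numbers; the variable names are illustrative, and the implied voltage assumes the ~10 mW is delivered at the stated 3 mA.

```python
# Power harvesting shoes (figures from the slide).
shoe_power_w = 10e-3     # ~10 mW
shoe_current_a = 3e-3    # 3 mA
burst_s = 0.5            # delivered for 0.5 s after 3-6 steps

burst_energy_j = shoe_power_w * burst_s
implied_voltage_v = shoe_power_w / shoe_current_a

# MEMS vibration-to-electric generator (figure from the slide).
mems_power_w = 10e-6     # ~10 uW
hours_per_joule = (1.0 / mems_power_w) / 3600.0

print(f"Shoe burst energy: {burst_energy_j * 1e3:.1f} mJ at ~{implied_voltage_v:.1f} V")
print(f"MEMS generator needs ~{hours_per_joule:.0f} hours to collect 1 J")
```

Both sources are orders of magnitude below the ~143 mW implied for a processor earlier in the lecture, so scavenged energy suits only very low duty-cycle loads.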