UMA 3.x Cycle Accurate Simulator
User’s Guide
CCS Version: CCS Reindeer 3.2.40.13
December 2006
IMPORTANT NOTICE
Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections,
modifications, enhancements, improvements, and other changes to its products and services at any
time and to discontinue any product or service without notice. Customers should obtain the latest
relevant information before placing orders and should verify that such information is current and
complete. All products are sold subject to TI’s terms and conditions of sale supplied at the time of order
acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of sale in
accordance with TI’s standard warranty. Testing and other quality control techniques are used to the
extent TI deems necessary to support this warranty. Except where mandated by government
requirements, testing of all parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are
responsible for their products and applications using TI components. To minimize the risks associated
with customer products and applications, customers should provide adequate design and operating
safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any TI
patent right, copyright, mask work right, or other TI intellectual property right relating to any
combination, machine, or process in which TI products or services are used. Information published by
TI regarding third-party products or services does not constitute a license from TI to use such products
or services or a warranty or endorsement thereof. Use of such information may require a license from a
third party under the patents or other intellectual property of the third party, or a license from TI under
the patents or other intellectual property of TI.
Reproduction of information in TI data books or data sheets is permissible only if reproduction is without
alteration and is accompanied by all associated warranties, conditions, limitations, and notices.
Reproduction of this information with alteration is an unfair and deceptive business practice. TI is not
responsible or liable for such altered documentation.
Resale of TI products or services with statements different from or beyond the parameters stated by TI
for that product or service voids all express and any implied warranties for the associated TI product or
service and is an unfair and deceptive business practice. TI is not responsible or liable for any such
statements.
The following URLs provide information on other Texas Instruments products and application solutions:
Products                                        Applications
Amplifiers             amplifier.ti.com         Audio               www.ti.com/audio
Data Converters        dataconverter.ti.com     Automotive          www.ti.com/automotive
DSP                    dsp.ti.com               Broadband           www.ti.com/broadband
Interface              interface.ti.com         Digital Control     www.ti.com/digitalcontrol
Logic                  logic.ti.com             Military            www.ti.com/military
Power Mgmt             power.ti.com             Optical Networking  www.ti.com/opticalnetwork
Microcontrollers       microcontroller.ti.com   Security            www.ti.com/security
Low power transmitter                           Telephony           www.ti.com/telephony
                                                Video & Imaging     www.ti.com/video
                                                Wireless            www.ti.com/wireless
Mailing Address:
Texas Instruments
Post Office Box 655303 Dallas, Texas 75265
Copyright © 2006, Texas Instruments Incorporated
Preface
Read This First
About this Manual
This document provides an overview of the UMA 3.x Cycle Accurate
Simulator. This guide includes basic guidelines that are related to the
simulator.
Intended Audience
This document is intended for users and developers of the TI simulator.
How to Use This Manual
This document includes the following chapter:
• Chapter 1, UMA 3.x Cycle Accurate Simulator – Provides an introduction to the
UMA 3.x Cycle Accurate Simulator. It also covers the installation procedure,
resources, features, limitations, analysis events, and cycle accuracy.
Trademarks
All trademarks are the property of Texas Instruments Inc.
Software Copyright
Software Copyright © 2006 Texas Instruments Inc.
Contents
Installation Procedure for Code Composer Studio (CCS)
Features Not Supported in This Release
Chapter 1
UMA 3.x Cycle Accurate Simulator
This chapter provides an overview of the UMA 3.x Cycle Accurate
Simulator. It also provides detailed information on installation, resources,
and features. This chapter contains the following sections:
Topic
1.1 Introduction
1.2 Installation Procedure for Code Composer Studio (CCS)
1.3 Supported Hardware Resources
1.4 Features in This Release
1.5 Features Not Supported in This Release
1.6 Known Issues
1.7 Supported Analysis Events
1.1 Introduction
The C55x+ is a new C55x CPU core in the C5500 family. The C55x+ core
is assembly-code compatible with the other cores in the C5500 family. It
supports new instructions, pipeline stages, and parallelism.
UMA 3.1 consists of the C55x+ core along with the memory subsystem.
The memory subsystem supports an instruction cache, two levels of data
cache, and Open Core Protocol (OCP) interfaces for data, program, and
peripheral access.
1.2 Installation Procedure for Code Composer Studio (CCS)
CCS Reindeer 3.2.40.13 supports the following simulators:
• C55x+ Functional Simulator
• C55x+ Cycle Accurate Simulator
• UMA 3.1 Cycle Accurate Simulator
Do the following to install the simulator in CCS 3.2:
Step 1
  Action: Select Code Composer Studio Setup v3.2 from the Programs menu.
  Result: Displays the Code Composer Studio Setup screen. See Figure 1-1,
          Code Composer Studio Showing Simulators.
Step 2
  Action: In the Factory Boards display, select C55x under the Family
          drop-down menu. (Select the simulator that you want to work with.)
  Result: Displays the types of simulators that are available.
Step 3
  Action: Drag and drop the simulator that you want to work with into the
          System Configuration window.
          - or -
          Right-click the simulator that you want to work with and select
          Add to System.
  Result: The simulator is added to the system. See Figure 1-2, Adding
          Simulator to the System Configuration.
Step 4
  Action: Select Save & Quit.
  Result: Quits Code Composer Studio Setup and invokes Code Composer Studio.
1.2.1 Code Composer Studio Screens
Figure 1-1. Code Composer Studio Showing Simulators
Figure 1-2. Adding Simulator to the System Configuration
1.3 Supported Hardware Resources
The following section provides detailed information on supported hardware
resources:
1.3.1 CPU
The C55x+ functional simulator available with this product simulates all
instructions functionally and neglects pipeline effects. The C55x+ functional
simulator does not model the following:
• Instruction buffer unit
• Pipeline protection unit
• Instruction fetch or execution pipeline
• Memory bypass mechanism
The C55x+ Cycle Accurate Simulator models the pipeline effects without
any limitations.
1.3.2 Memory
The C55x+ simulator uses a flat memory system (memory without latency and
DARAM/SARAM). In the memory subsystem, the UMA 3.1 simulator
models the following hardware blocks:
• L1 SARAM – Banked memory
  o 16 Banks
• L1 SARAM – Regular memory
• L1 PDROM
• L0 Data cache or memory buffers
  o 4 Lines
• L1 Data Cache
  o 2-way set associative (each way: 4KW, 8KW total)
• L1 Instruction Cache
  o 2-way set associative (each way: 4KW, 8KW total)
• OCP Interfaces
  o Instruction OCP interface, 64 bit
  o Data OCP interface, 64 bit
  o Peripheral OCP interface, 16 bit
1.3.3 Peripherals
The C55x+ simulator simulates the UMA 3.x Megacell Timer.
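The 2-way set-associative organization listed for the L1 caches in Section 1.3.2 can be illustrated with a small sketch. This is a toy model, not the simulator's implementation. The way size (4KW) and the number of ways (2) come from this guide. The line size (LINE_WORDS) and the least-recently-used replacement policy are assumptions made only for illustration, because this guide does not state them. The class and function names are hypothetical.

```python
from collections import OrderedDict

WAY_SIZE_WORDS = 4 * 1024  # from this guide: each way is 4KW
NUM_WAYS = 2               # from this guide: 2-way set associative
LINE_WORDS = 8             # ASSUMPTION: line size is not stated in this guide
NUM_SETS = WAY_SIZE_WORDS // LINE_WORDS


class TwoWayCache:
    """Toy set-associative cache. LRU replacement is an ASSUMPTION."""

    def __init__(self):
        # One OrderedDict per set: keys are tags, oldest entry first.
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, word_addr):
        """Classify one word access as 'hit', 'miss', or 'miss+victim'."""
        line = word_addr // LINE_WORDS
        index = line % NUM_SETS
        tag = line // NUM_SETS
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)  # mark the line as most recently used
            return "hit"
        result = "miss"
        if len(ways) == NUM_WAYS:
            ways.popitem(last=False)  # evict the least recently used line
            result = "miss+victim"
        ways[tag] = None
        return result


# Three word addresses that map to the same set (conflicting locations).
a, b, c = 0, WAY_SIZE_WORDS, 2 * WAY_SIZE_WORDS
cache = TwoWayCache()
trace = [cache.access(addr) for addr in (a, a, b, a, c, b)]
print(trace)
```

In this sketch, two conflicting addresses fit in the same set (one per way), and a third conflicting access forces an eviction. This matches the general pattern that Section 1.7 describes for conflict misses and victims, under the stated assumptions.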
1.4 Features in This Release
This section provides detailed information on the features supported in
this release:
1.4.1 Architecture Feature Support
The following architectural features are supported:
• C55x+ Instruction encoding (Ryujin is not binary compatible with Laijin)
• Supports C55x+ ISA additions:
  o New registers
  o New instructions
• Support for AU and DU data forwarding
• Support for byte pointer mode:
  o Supports new byte addressing instructions
  o Supports data address computation and memory accesses in byte
    pointer mode
• Stack address computations in linear mode
• Supports reset vector changes for stack and pointer mode
  configurations
• Supports branch prediction
• Supports banked and regular SARAM
• Supports L0/L1 data cache
• Supports L1 instruction cache
• Supports data, program, and peripheral OCP ports
• Supports external memory with configurable latency
1.4.2 CCS Features Support
The following CCS features are supported:
• Pin connect feature
• Port connect feature
• Pipeline stall analyzer
• Simulator analysis plug-in
• CCS profiler
1.5 Features Not Supported in This Release
The following features are not supported in this release:
• RTDX support
• Host-target and target-host communications
• Address trace support
• Support for generic hardware accelerator interface
• Rewind
• Stack size reporting
• Analysis toolkit
1.6 Known Issues
This section provides detailed information on the known issues related to
the UMA 3.x cycle accurate simulator:
• The C55x+ Cycle Accurate Simulator has limitations on cycle
  accuracy under the following circumstances:
  o Bus errors
  o Illegal prediction (self-modifying code)
  o C54CM compatibility mode
• When the simulator encounters an invalid program opcode, it displays an
  error message informing the user of the error. If the user continues
  to run the simulator, the behavior is undefined.
• The UMA 3.1 simulator has limitations on cycle accuracy for
  peripheral/IO accesses.
• Cache, OCP port, MMU, and System Module registers are not
  supported.
• Slave port (Mport) is not supported.
1.7 Supported Analysis Events
The following table lists the analysis events supported by the
UMA 3.x Cycle Accurate Simulator.
Table 1-1. UMA 3.1 C Model Analysis Events
Event Name                            Description
CPU.discontinuity.summary             Summary of the events.
CPU.discontinuity.branch              Only 'taken' branches are counted; conditional
                                      branches whose condition is false are not counted.
CPU.discontinuity.interrupt.summary   This event is counted when the interrupt is
                                      'taken' (not when latched); if an interrupt is
                                      asserted more than once before it is 'taken',
                                      the event is counted only once.
CPU.execute_packet                    This event counts the number of execute packets
                                      that are decoded.
CPU.instruction.decoded               This is an additive event.
CPU.instruction.executed              This event is counted whenever an instruction
                                      is executed.
CPU.instruction.condition_false       This event is counted whenever an instruction
                                      is decoded, but stopped due to a false predicate.
CPU.NOP                               This event is counted whenever the CPU executes
                                      the standard 'No Operation' instruction.
CPU.stall.ppu.summary                 This event counts the number of cycles that
                                      the PPU is stalled.
CPU.stall.ppu.ac2                     This event counts the number of cycles that
                                      the PPU stalls the AC2 phase.
CPU.stall.ppu.ad2                     This event counts the number of cycles that
                                      the PPU stalls the AD2 phase.
CPU.stall.ppu.dc                      This event counts the number of cycles that
                                      the PPU is stalled in the DC phase.
CPU.stall.prefetch                    This event counts the number of cycles that
                                      the execution pipeline of the CPU is stalled
                                      due to lack of pre-fetch.
CPU.stall.bypass.ac1                  This event counts the number of cycles that
                                      the CPU is stalled in the AC1 phase due to a
                                      bypass condition.
CPU.stall.mem.wr3_stall               This event counts the number of cycles that
                                      the CPU is stalled due to memory latency on
                                      memory reads and memory writes.
L1P.access                            The event count is equal to (L1P.miss + L1P.hit).
L1P.hit                               Multiple fetches to the same address in a
                                      cacheable location cause one hit per access
                                      from the second access onwards.
L1P.miss                              First-time accesses to a cacheable location.
                                      Consecutive accesses to conflicting locations.
                                      Consecutive accesses to addresses such that the
                                      total access size exceeds the size of the cache.
                                      Resetting the cache and accessing any location.
L1P.miss.conflict                     Consecutive accesses to conflicting locations.
L1P.miss.non_conflict                 First-time accesses to a cacheable location.
                                      Consecutive accesses to conflicting locations.
L1D.access                            The event count is equal to (L1D.hit + L1D.miss).
L1D.hit                               The event count is equal to
                                      (L1D.hit.read + L1D.hit.write).
L1D.miss                              The event count is equal to
                                      (L1D.miss.read + L1D.miss.write).
L1D.miss.conflict                     Consecutive accesses to conflicting locations.
L1D.miss.non_conflict                 Consecutive accesses to addresses such that the
                                      total access size exceeds the size of the cache.
                                      Resetting the cache and accessing some location.
L1D.hit.read                          Multiple accesses to the same address in a
                                      cacheable location cause one hit per access
                                      from the second access onwards.
                                      Accesses to two addresses both mapping to the
                                      same set (different ways).
                                      Read falls in a line.
                                      Parallel reads to the same addresses that are
                                      not already cached cause a read miss and a
                                      read hit (streaming).
L1D.hit.write                         Read followed by a write to the same cacheable
                                      address location.
                                      Reading two addresses both mapping to the same
                                      set (different ways) and writing to those two
                                      locations.
                                      Write falls completely in a line.
                                      Parallel read and write to the same addresses
                                      cause a read miss and a write hit.
L1D.miss.read                         First-time access to a cacheable location.
                                      Consecutive accesses to conflicting locations.
                                      Consecutive accesses to addresses such that the
                                      total access size exceeds the cache size.
                                      Resetting the cache and accessing some location.
                                      Access spills over a line such that one line
                                      is cached.
                                      Parallel reads to the same addresses that are
                                      not already cached cause a read miss and a
                                      read hit.
                                      Parallel read and write to the same addresses
                                      cause a read miss and a write hit.
L1D.victim                            Three write accesses to conflicting locations.
                                      Two write accesses and one read access to
                                      conflicting locations.
                                      Consecutive read accesses to conflicting
                                      locations.
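Several entries in Table 1-1 are defined as sums of other events (for example, L1P.access is L1P.miss + L1P.hit). The sketch below checks those identities on a set of event counts. It assumes the counts have been copied by hand into a Python dictionary keyed by event name; this guide does not describe an export format, and the function name and the sample values are hypothetical and for illustration only.

```python
# Identities stated in Table 1-1: each total should equal the sum of its parts.
EVENT_IDENTITIES = {
    "L1P.access": ("L1P.miss", "L1P.hit"),
    "L1D.access": ("L1D.hit", "L1D.miss"),
    "L1D.hit": ("L1D.hit.read", "L1D.hit.write"),
    "L1D.miss": ("L1D.miss.read", "L1D.miss.write"),
}


def find_inconsistent_totals(counts):
    """Return the names of totals that do not equal the sum of their parts."""
    failures = []
    for total, parts in EVENT_IDENTITIES.items():
        if counts[total] != sum(counts[part] for part in parts):
            failures.append(total)
    return failures


# Made-up counts, for illustration only (not simulator output).
sample_counts = {
    "L1P.access": 120, "L1P.hit": 100, "L1P.miss": 20,
    "L1D.access": 80, "L1D.hit": 70, "L1D.miss": 10,
    "L1D.hit.read": 50, "L1D.hit.write": 20,
    "L1D.miss.read": 7, "L1D.miss.write": 3,
}
print(find_inconsistent_totals(sample_counts))  # prints []
```

A non-empty result would suggest that the counts were transcribed incorrectly or were collected over different intervals.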
1.8 Cycle Accuracy (CA)
The following table lists the results of CA benchmarking completed for the
sample rate converter (SRC) and full rate (FR) applications (internal memory).
Table 1-2. CA Results for SRC and Full Rate Encoder and Decoder Applications
Model                   Application      CA % deviation from RTL
Sample Rate Converter   8_48_Mixer       -0.0007
                        48_8_Mixer       -0.00258
                        16_8_Mixer       -0.00223
                        8_16_Mixer       -0.00204
                        16_48_Mixer      -0.00069
                        48_16_Mixer      -0.0008
                        8_48_NoMixer     -0.00031
                        48_8_NoMixer     -0.00213
                        16_8_NoMixer     -0.001
                        8_16_NoMixer     -0.00092
                        16_48_NoMixer    -0.00031
                        48_16_NoMixer    -0.00036
Full Rate               Decoder          -0.0096
                        Encoder          -0.01971
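This guide does not define how the CA percentage in Table 1-2 is computed. A common convention, assumed here, is the signed difference between the simulator cycle count and the RTL cycle count, expressed as a percentage of the RTL count. The sketch below uses that assumed formula; the function name and the cycle counts are hypothetical and are not taken from the benchmark.

```python
def ca_percent_deviation(sim_cycles, rtl_cycles):
    """Signed deviation of the simulator from RTL, in percent.

    ASSUMPTION: this formula and its sign convention are not stated in the
    guide. A negative value means the simulator reported fewer cycles.
    """
    if rtl_cycles <= 0:
        raise ValueError("rtl_cycles must be positive")
    return (sim_cycles - rtl_cycles) / rtl_cycles * 100.0


# Made-up cycle counts, for illustration only.
deviation = ca_percent_deviation(sim_cycles=999_990, rtl_cycles=1_000_000)
print(f"{deviation:.5f}")  # prints -0.00100
```

Under this assumption, the negative values in Table 1-2 would mean that the simulator reported slightly fewer cycles than RTL for every application listed.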