IDT79RC64T574-250DP

Advanced 64-bit Microprocessors Product Family

Manufacturer: IDT (Integrated Device Technology)

Manufacturer website: http://www.idt.com/

Features
High-performance 64-bit embedded Microprocessor
– 250MHz operating frequency
– >330 Dhrystone MIPS performance
– 300MFLOPS/s floating-point performance
– Up to 125 million multiply accumulate per second (MAC/s)
– MIPS-IV Instruction Set Architecture (ISA), with integer DSP
and 3-operand integer multiply extensions
– Limited dual-issue microarchitecture
Compatible with RC4640 and RC32364 DSP extensions
– DSP Extensions, for consumer applications
– 2-cycle repeat rate, on atomic Multiply-add
– Multiply-subtract (MSUB) support, for complex number
processing
– Count-leading-zero/one support, for string searches and
normalization
High-performance on-chip cache subsystem
– 32kB, two-set associative instruction cache (I-cache)
– 32kB, two-set associative data cache (D-cache)
– Write-through and write-back data cache operations
– High-performance cache-ops, bandwidth management
I-cache and D-cache locking capability (per line), provides
improved real-time support
Joint TLB on-chip, for virtual-to-physical address mapping
79RC64574
79RC64575
Big- or Little-endian capability
RC5000 compatible memory management
– On-chip 48-entry, 96-page TLB, for advanced operating
system support
– Compatible with major operating systems:
Windows® CE, VxWorks, and others
Bus compatible with IDT 64-bit microprocessor families
– Pipeline runs at 2 to 8 times the bus frequency
– Bus speeds to 125MHz
– 32-bit bus option, for lower cost systems
– Enhanced timing protocol for SyncDRAM systems (compatible
with IDT79RC64474/475)
RC64574:
– 32-bit SysAd bus, for low-cost systems
– Pin compatible with RC4640 and RC64474
– 128-pin QFP package
RC64575:
– 64-bit SysAd bus interface
– Pin compatible with RC4650 and RC64475
– 208-pin QFP package
Industrial temperature range support
JTAG Boundary Scan Interface
2.5V operation with 3.3V tolerant I/O
Block Diagram
[Figure 1 RC64574/RC64575 Block Diagram. Blocks shown: PLL (ClkIn input);
64-bit Integer Execution Unit; DSP Accelerator; Dual-Issue Instruction
Fetch Unit; Primary Cache Controller; RC5000 Compatible System Control
Coprocessor with 48-entry, 96-page TLB; Floating-Point Accelerator
(666 MFLOPS, IEEE 1284); 32kB 2 set-associative Instruction Cache
(Lockable); 32kB 2 set-associative Data Cache (Lockable); 64-bit/32-bit
RC64474/475 Compatible System Interface.]
IDT and the IDT logo are trademarks of Integrated Device Technology, Inc.
© 2001 Integrated Device Technology, Inc.
December 14, 2001
DSC 5607
79RC64574™ 79RC64575™
Device Overview¹
IDT’s 79RC64574/575 processors serve a wide range of performance-critical
embedded applications that include high-end internetworking systems,
digital set-top boxes, web browsers, color printers, and graphics
terminals.
The RC64574/575 allow a socket-compatible upgrade path for IDT’s
RC4640/50 and RC64474/475 processors. This unprecedented upgradability
allows a 2:1 range of frequencies, a 4:1 range of cache size, a 15:1 range
of floating-point performance, and a 4:1 range of DSP performance in a
single socket.
With special emphasis on system bandwidth, floating-point, and DSP
operations, the RC64574/575 have been optimized for high-performance
applications through the integration of high-performance computational
units and a high-performance memory hierarchy. The result is a low-cost
CPU that is capable of more than 330 Dhrystone MIPS.
Through the RC64574/RC64575 processors, IDT offers:
– High-performance upgrade paths to existing embedded customers in the
internetworking, office automation, and visualization markets.
– Significant floating-point performance improvements over currently
available, moderately priced MIPS CPUs.
– Performance improvements through the use of the MIPS-IV ISA.
– High-performance DSP acceleration.
Instruction Issue Mechanism
The RC64574 and RC64575 are limited dual-issue superscalar machines that
use a traditional 5-stage integer pipeline, as shown in the pipeline
diagram on Page 3. For multi-issue operations, these devices recognize the
following two general classes of instructions:
– Floating-point ALU
– All others
Such a broad separation of instruction classes ensures that there are no
data dependencies to restrict multi-issue performance. As they are brought
on-chip, instructions are pre-decoded into these classes by the
RC64574/575, and the class information is then stored in the instruction
cache. Assuming there are no pending resource conflicts, the devices can
issue one instruction per class per pipeline clock cycle.
However, longer latency resources, in either the floating-point ALU (for
example, division or square root instructions) or the integer unit (such
as multiply), can restrict the issue of instructions. Note that these
processors do not perform out-of-order or speculative execution; instead,
the pipeline slips until the required resource becomes available.
On dual-issue instruction pairs, there are no alignment restrictions, and
the RC64574/575 fetch two instructions from the cache per cycle. However,
because the instruction cache only performs aligned fetches, compilers
should attempt to align branch targets to allow dual-issue on the first
target cycle for optimal performance.
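The issue rule described above can be illustrated with a small model. This is a minimal sketch under simplifying assumptions that are not taken from the data sheet: every instruction is tagged only with its class ('F' for floating-point ALU, 'O' for all others), two adjacent instructions issue together whenever their classes differ, and resource conflicts, long-latency slips, and fetch alignment are ignored. The function name `issue_cycles` is hypothetical.

```python
def issue_cycles(classes):
    """Count issue cycles for a sequence of instruction classes.

    classes: string of 'F' (floating-point ALU) and 'O' (all others).
    Assumption: two adjacent instructions issue together only when they
    belong to different classes; everything else issues alone.
    """
    cycles = 0
    i = 0
    while i < len(classes):
        if i + 1 < len(classes) and classes[i] != classes[i + 1]:
            i += 2  # one instruction of each class issued in this cycle
        else:
            i += 1  # same class (or last instruction): single issue
        cycles += 1
    return cycles


if __name__ == "__main__":
    for stream in ("FOFO", "OOOO", "FFO"):
        print(stream, issue_cycles(stream))
```

In this simplified model a well-mixed stream needs about half the cycles of a single-class stream, which is the motivation for interleaving floating-point ALU work with other instructions.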
1. Detailed system operation information is provided in the
RC64574/RC64575 user’s manual.
RISCore4000/RISCore5000 Family of Socket Compatible Processors

32-bit External Bus Processors

RC4640
  CPU: 64-bit RISCore4000 w/ DSP extensions
  Performance: >350MIPS
  FPA: 89 mflops, single precision only
  Caches: 8kB/8kB, 2-way, lockable by set
  External Bus: 32-bit
  Voltage: 3.3V
  Frequencies: 100-267 MHz
  Packages: 128 PQFP
  MMU: Base-Bounds
  Key Features: Cache locking, on-chip MAC, 32-bit external bus

RC64474
  CPU: 64-bit RISCore4000
  Performance: >330MIPS
  FPA: 125 mflops, single and double precision
  Caches: 16kB/16kB, 2-way, lockable by set
  External Bus: 32-bit, Superset pin compatible w/RC4640
  Voltage: 3.3V
  Frequencies: 180-250 MHz
  Packages: 128 QFP
  MMU: 96 page TLB
  Key Features: Cache locking, JTAG, syncDRAM mode, 32-bit external bus

RC64574
  CPU: 64-bit RISCore5000 w/ DSP extensions
  Performance: >330MIPS
  FPA: 666 mflops, single and double precision
  Caches: 32kB/32kB, 2-way, lockable by line
  External Bus: 32-bit, Superset pin compatible w/RC4640, RC64474
  Voltage: 2.5V
  Frequencies: 200-250 MHz
  Packages: 128 QFP
  MMU: 96 page TLB
  Key Features: Cache locking, JTAG, syncDRAM mode, 32-bit external bus

64-bit External Bus Processors

RC4650
  CPU: 64-bit RISCore4000 w/ DSP extensions
  Performance: >350MIPS
  FPA: 89 mflops, single precision only
  Caches: 8kB/8kB, 2-way, lockable by set
  External Bus: 32- or 64-bit
  Voltage: 3.3V
  Frequencies: 100-267 MHz
  Packages: 208 QFP
  MMU: Base-Bounds
  Key Features: Cache locking, on-chip MAC, 32-bit & 64-bit bus option

RC64475
  CPU: 64-bit RISCore4000
  Performance: >330MIPS
  FPA: 125 mflops, single and double precision
  Caches: 16kB/16kB, 2-way, lockable by set
  External Bus: 32- or 64-bit, Superset pin compatible w/RC4650
  Voltage: 3.3V
  Frequencies: 180-250 MHz
  Packages: 208 QFP
  MMU: 96 page TLB
  Key Features: Cache locking, JTAG, syncDRAM mode, 32-/64-bit bus option

RC64575
  CPU: 64-bit RISCore5000 w/ DSP extensions
  Performance: >330MIPS
  FPA: 666 mflops, single and double precision
  Caches: 32kB/32kB, 2-way, lockable by line
  External Bus: 32- or 64-bit, Superset pin compatible w/RC4650, RC64475
  Voltage: 2.5V
  Frequencies: 250 MHz
  Packages: 208 QFP
  MMU: 96 page TLB
  Key Features: Cache locking, JTAG, syncDRAM mode, 32-/64-bit bus option

Table 1 RISCore4000/RISCore5000 Processor Family
Instruction Set Architecture
The RC64574/575 implement a superset of the MIPS-IV 64-bit ISA, including
CP1 and CP1X functional units and their instruction set. Both 32- and
64-bit data operations are performed by utilizing thirty-two general
purpose 64-bit registers (GPR) that are used for integer operations and
address calculation. The complete on-chip floating-point coprocessor
(CP1), which includes a floating-point register file and execution units,
forms a “seamless” interface, decoding and executing instructions in
parallel with the integer unit.
CP1’s floating-point execution units support both single and double
precision arithmetic, as specified in the IEEE Standard 754, and are
separated into a multiply unit and a combined add/convert/divide/square
root unit. Overlap of multiplies and add/subtract is supported, and the
multiplier is partially pipelined, allowing the initiation of a new
multiply instruction every fourth pipeline cycle. The floating-point
register file is made up of thirty-two 64-bit registers. The
floating-point unit can take advantage of the 64-bit wide data cache and
issue a coprocessor load or store doubleword instruction in every cycle.
The system control coprocessor (CP0) registers are also incorporated
on-chip and provide the path through which the virtual memory system’s
page mapping is examined and changed, exceptions are handled, and any
operating mode selections are controlled. A secure user processing
environment is provided through the user, supervisor, and kernel operating
modes of virtual addressing to system software. Bits in a status register
determine which of these modes is used.
Load and branch latencies are minimized by the short pipeline of the
RC64574/575, and the caches contain special logic that will allow any
combination of loads and stores to execute in back-to-back cycles without
requiring pipeline slips or stalls, assuming the operation does not miss
in the cache.
Computational Units
The RC64574/575 implement a full, single-cycle 64-bit arithmetic logic
unit (ALU) for Integer ALU functions other than multiply and divide.
Bypassing is used to support back-to-back ALU operations at the full
pipeline rate, without requiring stalls for data dependencies.
To allow the longer latency operations to run in parallel with other
operations, the Integer Multiply/Divide unit of the RC64574/575 is
separated from the primary ALU. The pipeline stalls only if an attempt to
access the HI or LO registers is made before an operation completes.
The Floating-point ALU unit is responsible for all of the CP1/CP1X ALU
operations other than DIV/SQRT operations, and is pipelined to allow a
single-cycle repeat rate for single-precision operations.
The Floating-point DIV/SQRT unit is separated from the floating-point ALU,
to ensure that these longer latency operations do not prevent the issue of
other floating-point operations. Separate logical units are also provided
on the RC64574/575 to implement load, store, and branch operations.
To enhance the performance of DSP algorithms, new instructions such as
fast fused multiply-adds, multiply-subtracts, and three-operand multiply
operations have been added over and above the MIPS-IV ISA.
Integer Pipeline
The integer instruction execution speed is tabulated, in number of
pipeline clocks, as follows:

Operation            Latency   Repeat
Load                    2         1
Store                   2         1
MULT/MULTU              4         3
DMULT/DMULTU            6         5
DIV/DIVU               36        36
DDIV/DDIVU             68        68
MAD/MADU                3         2
MSUB/MSUBU              4         3
Other Integer ALU       1         1
Branch                  2         2
Jump                    2         2
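The latency and repeat figures above can be turned into rough cycle and throughput estimates. The following is a minimal sketch, assuming independent back-to-back operations of one kind, where the first result arrives after the latency and each further operation starts one repeat interval later; the names `INTEGER_TIMING`, `back_to_back_cycles`, and `peak_ops_per_second` are illustrative, not part of any IDT tool.

```python
# (latency, repeat) in pipeline clocks, copied from the table above.
INTEGER_TIMING = {
    "Load": (2, 1),
    "Store": (2, 1),
    "MULT/MULTU": (4, 3),
    "DMULT/DMULTU": (6, 5),
    "DIV/DIVU": (36, 36),
    "DDIV/DDIVU": (68, 68),
    "MAD/MADU": (3, 2),
    "MSUB/MSUBU": (4, 3),
    "Other Integer ALU": (1, 1),
    "Branch": (2, 2),
    "Jump": (2, 2),
}


def back_to_back_cycles(op, count):
    """Pipeline clocks for `count` independent operations of one kind.

    Assumption: the first result takes `latency` clocks and every later
    operation completes `repeat` clocks after the previous one.
    """
    if count <= 0:
        return 0
    latency, repeat = INTEGER_TIMING[op]
    return latency + (count - 1) * repeat


def peak_ops_per_second(op, pclock_hz):
    """Upper bound on sustained operations per second at a pipeline clock."""
    _, repeat = INTEGER_TIMING[op]
    return pclock_hz / repeat


if __name__ == "__main__":
    print(back_to_back_cycles("MAD/MADU", 4))
    print(peak_ops_per_second("MAD/MADU", 250e6))
```

With the 2-cycle MAD repeat rate and a 250MHz pipeline clock, the upper bound works out to 125 million multiply-accumulates per second, which agrees with the figure quoted in the Features list.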
System Interfaces
The RC64575 supports a 64-bit system interface that is pin and bus
compatible with the RC4650 and RC64475 system interface. The system
interface consists of a 64-bit Address/Data bus with eight parity-check
bits and a 9-bit command bus.
During 64-bit operation, RC64575 system address/data (SysAD) transfers are
protected with an 8-bit parity check bus, SysADC. When initialized for
32-bit operation, the RC64575’s SysAD can be viewed as a 32-bit
multiplexed bus that is protected by four parity-check bits.
The RC64574 supports a 32-bit system interface that is pin and bus
compatible with the RC4640 and RC64474. During 32-bit operation, SysAD
transfers are performed on a 32-bit multiplexed bus (SysAD 31:0) that is
protected by 4 parity check bits (SysADC 3:0).
Writes to external memory, whether cache miss write-backs or stores to
uncached or write-through addresses, use the on-chip write buffer. The
write buffer holds a maximum of four 64-bit address and 64-bit data pairs.
The entire buffer is used for a data cache write-back and allows the
processor to proceed in parallel with memory updates.
Included in the system interface are six handshake signals: RdRdy*,
WrRdy*, ExtRqst*, Release*, ValidOut*, and ValidIn*; six interrupt inputs;
and a simple timing specification that is capable of transferring data
between the processor and memory at a peak rate of 1000MB/sec. A boot-time
selectable option to run the system interface as 32-bits wide, using
basically the same protocols as the 64-bit system, is also supported.
To ensure that the maximum frequency of operation is not limited by the
speed of the multiplier unit, a “fast multiply” disable reset mode bit
(see Table 3) is featured. When this bit is asserted, each multiply
operation shown in Table 2 (Integer Instruction Execution Speed) has its
latency and repeat rate increased by one cycle.
A boot-time mode control interface initializes fundamental processor modes
and is a serial interface that operates at a very low frequency (SysClock
divided by 256). This low-frequency operation allows the initialization
information to be kept in a low-cost EPROM; alternatively, the
twenty-or-so bits could be generated by the system interface ASIC or a
simple PAL. The boot-time serial stream is shown in Table 3.
Table 3 Boot-time Mode Stream
(Serial Bit: Description, followed by Value & Mode Setting)

Serial Bit 0: Reserved
    Must be set to 0.

Serial Bits 1:4: Transmit-data-pattern. Bit 4 is MSB
    64-bit bus width:
        0: DDDD
        1: DDxDDx
        2: DDxxDDxx
        3: DxDxDxDx
        4: DDxxxDDxxx
        5: DDxxxxDDxxxx
        6: DxxDxxDxxDxx
        7: DDxxxxxxDDxxxxxx
        8: DxxxDxxxDxxxDxxx
        9-15: Reserved. Must not be selected.
    32-bit bus width:
        0: WWWWWWWW
        1: WWxWWxWWxWWx
        2: WWxxWWxxWWxxWWxx
        3: WxWxWxWxWxWxWxWx
        4: WWxxxWWxxxWWxxxWWxxx
        5: WWxxxxWWxxxxWWxxxxWWxxxx
        6: WxxWxxWxxWxxWxxWxxWxxWxx
        7: WWxxxxxxWWxxxxxxWWxxxxxxWWxxxxxx
        8: WxxxWxxxWxxxWxxxWxxxWxxxWxxxWxxx
        9-15: Reserved. Must not be selected.

Serial Bits 5:7: PClock-to-SysClk-Ratio. Bit 7 is MSB
    0: 2
    1: 3
    2: 4
    3: 5
    4: 6
    5: 7
    6: 8
    7: Reserved

Serial Bit 8: Endianness
    0: Little endian
    1: Big endian

Serial Bits 9:10: Non-block write Mode. Bit 10 is MSB
    00: R4400 compatible
    01: Reserved
    10: Pipelined-Write-Mode
    11: Write-Reissue-Mode

Serial Bit 11: TimerIntEn
    Timer interrupt settings:
    0: Enable Timer Interrupt on Int(5)
    1: Disable Timer Interrupt on Int(5)

Serial Bit 12: System Interface Bus Width
    Interface bus width control settings:
    0: 64-bit system interface
    1: 32-bit system interface

Serial Bits 13:14: Drv_Out. Bit 14 is MSB
    Slew rate control of the output drivers:
    10: 100% strength (fastest)
    11: 83% strength
    00: 67% strength
    01: 50% strength (slowest)

Serial Bits 15:17: Write address to write data delay
    From 0 to 7 SysClk cycles:
    0: AD...
    1: AxD...
    2: AxxD...
    3: AxxxD...
    4: AxxxxD...
    5: AxxxxxD...
    6: AxxxxxxD...
    7: AxxxxxxxD...

Serial Bit 18: Reserved
    User must select ‘0’

Serial Bit 19: Extend Multiplication Repeat Rate
    Initial setting of the “Fast Multiply” bit.
    0: Enable Fast Multiply
    1: Do not Enable Fast Multiply
    Note: For pipeline speeds >250MHz, this bit must be set to ‘1’.

Serial Bits 20:24: Reserved
    User must select ‘0’

Serial Bits 25:26: System configuration identifier
    Software visible in processor Config[21:20]
    0: Config[21:20] = Mode Bit [25:26]

Serial Bits 27:256: Reserved
    Must be set to 0.
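The field layout of the boot-time mode stream can be sketched as a small encoder. This is only an illustration under stated assumptions: the field names and the function `build_mode_bits` are hypothetical, only serial bits 0 through 26 are produced, reserved bits are left at 0, and the highest-numbered bit of every multi-bit field is treated as its MSB (the table states this for most fields, and it is assumed here for bits 15:17).

```python
# Field name -> (lowest serial bit, highest serial bit). Names are
# illustrative; bit positions follow the boot-time mode stream table.
MODE_FIELDS = {
    "xmit_pattern": (1, 4),
    "pclock_ratio": (5, 7),       # code 0..6 selects multiplier 2..8
    "big_endian": (8, 8),
    "nonblock_write": (9, 10),
    "timer_int_disable": (11, 11),
    "bus_32bit": (12, 12),
    "drv_out": (13, 14),
    "write_delay": (15, 17),      # MSB position assumed to be bit 17
    "fast_multiply_off": (19, 19),
    "sys_config_id": (25, 26),
}


def build_mode_bits(**settings):
    """Return serial bits 0..26 as a list of 0/1 values."""
    bits = [0] * 27  # reserved bits stay 0
    for name, value in settings.items():
        low, high = MODE_FIELDS[name]
        width = high - low + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        for offset in range(width):
            # Least significant value bit goes to the lowest serial bit.
            bits[low + offset] = (value >> offset) & 1
    return bits


if __name__ == "__main__":
    bits = build_mode_bits(pclock_ratio=2, big_endian=1, bus_32bit=1,
                           drv_out=0b10, write_delay=3)
    print("".join(str(b) for b in bits))
```

In the usage shown, `pclock_ratio=2` selects a PClock-to-SysClk ratio of 4, `drv_out=0b10` selects 100% drive strength, and `write_delay=3` inserts three SysClk cycles between write address and write data.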
The clocking interface allows the CPU to be easily mated with external
reference clocks. The CPU input clock is the bus reference clock and can
be between 33 and 125MHz. An on-chip phase-locked-loop (PLL) generates the
pipeline clock (PClock) through multiplication of the system interface
clock by values of 2, 3, 4, 5, 6, 7, or 8, as defined at system reset.
This allows the pipeline clock to be implemented at a significantly higher
frequency than the system interface clock. The RC64574/575 support both
single data (one byte through full CPU bus width) and 8-word block
transfers on the SysAD bus.
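The relationship between the pipeline clock and the bus clock can be checked with a few lines of arithmetic. The sketch below assumes the 33 to 125MHz input range is inclusive and ignores any per-speed-grade limits; `sysclk_options` is a hypothetical helper name.

```python
MULTIPLIERS = (2, 3, 4, 5, 6, 7, 8)  # PLL multiplication values
SYSCLK_MIN_MHZ = 33.0                # lower bound of the input clock
SYSCLK_MAX_MHZ = 125.0               # upper bound of the input clock


def sysclk_options(pclock_mhz):
    """List (multiplier, SysClk MHz) pairs that reach a target PClock."""
    options = []
    for multiplier in MULTIPLIERS:
        sysclk = pclock_mhz / multiplier
        if SYSCLK_MIN_MHZ <= sysclk <= SYSCLK_MAX_MHZ:
            options.append((multiplier, sysclk))
    return options


if __name__ == "__main__":
    for multiplier, sysclk in sysclk_options(250.0):
        print(f"x{multiplier}: SysClk = {sysclk:.2f} MHz")
```

For a 250MHz pipeline clock, a multiplier of 8 would need a 31.25MHz input clock, which falls below the stated range, so only multipliers 2 through 7 remain.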
The RC64574/575 implement additional write protocols that double the
effective write bandwidth. The write re-issue protocol has a repeat rate
of 2 cycles per write. Pipelined writes have the same 2-cycle per write
repeat rate, but can issue an additional write after WrRdy* de-asserts.
Choosing a 32- or 64-bit wide system interface dictates whether a
cache line block transaction requires 4 double word data cycles or 8
single word cycles as well as whether a single data transfer—larger than
4 bytes—must be divided into two smaller transfers.
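The effect of bus width and transmit-data-pattern on block transfers can be estimated as follows. This is a rough sketch, assuming an 8-word (32-byte) block, counting only the cycles named in the data pattern, and ignoring address cycles and memory latency; the function names are illustrative. Patterns use the notation of Table 3, where D or W marks a data cycle and x an idle cycle.

```python
BLOCK_BYTES = 32  # one 8-word block transfer


def beats_per_block(bus_bytes):
    """Data cycles needed for one block: 4 on a 64-bit bus, 8 on 32-bit."""
    return BLOCK_BYTES // bus_bytes


def block_bandwidth_mb_s(pattern, bus_bytes, sysclk_mhz):
    """Data-phase bandwidth in MB/s for one transmit-data-pattern."""
    data_cycles = sum(1 for symbol in pattern if symbol in "DW")
    if data_cycles != beats_per_block(bus_bytes):
        raise ValueError("pattern does not match the bus width")
    # bytes * MHz / cycles gives millions of bytes per second
    return BLOCK_BYTES * sysclk_mhz / len(pattern)


if __name__ == "__main__":
    print(block_bandwidth_mb_s("DDDD", 8, 125.0))
    print(block_bandwidth_mb_s("DDxDDx", 8, 125.0))
    print(block_bandwidth_mb_s("WWWWWWWW", 4, 125.0))
```

With the DDDD pattern on a 64-bit bus at 125MHz, the estimate equals the 1000MB/sec peak rate quoted for the system interface; the 32-bit bus halves it.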
To facilitate discrete interface to SyncDRAM, the RC64574/575 bus
interface is enhanced during write cycles with a programmable delay that
is inserted between the write address and the write data (for both block
and non-block writes).
As shown in Table 3, this delay can be defined as 0 to 7 SysClock cycles
and is activated and controlled through the mode bit (17:15) settings
selected during the reset initialization sequence. The ‘000’ setting
provides the same write operation timing protocol as the RC4640, RC4650,
and RC5000 processors.
Board-level testing during Run-Time mode is facilitated through the full
JTAG boundary scan facility. Five pins (TDI, TDO, TMS, TCK, TRST*) have
been incorporated to support the standard JTAG interface.
The RC64574/575 devices offer a direct migration path for designs that are
based on IDT’s RC4640/RC4650 and RC64474/RC64475 processors², through full
pin and socket compatibility. Full 64-bit-family software and bus protocol
compatibility ensures the RC64574/575 processors access to an existing
market and development infrastructure, allowing quicker time to market.
To lock critical sections of code and/or data into the caches for quick
access, a per line “cache locking” feature has been implemented. Once the
feature is enabled, a cache line is said to be locked when a particular
piece of code or data is loaded into it and that cache location will not
be selected later for refill by other data.
Power Management
Executing the WAIT instruction enables the processor to enter Standby
mode. The internal clocks will shut down, thus freezing the pipeline. The
PLL, internal timer, and some of the input pins (Int[5:0]*, NMI*,
ExtRqst*, Reset*, and ColdReset*) will continue to run. Once in Standby
Mode, any interrupt, including the internally generated timer interrupt,
will cause the CPU to exit Standby Mode.
Thermal Considerations
The RC64574 is packaged in a 128-pin QFP footprint package and
uses a 32-bit external bus, offering the ideal combination of 64-bit
processing power and 32-bit low-cost memory systems. The RC64575
is packaged in a 208-pin QFP footprint package and uses the full 64-bit
external bus. The RC64575 is ideal for applications requiring 64-bit
performance and 64-bit external bandwidth.
Both devices are guaranteed in a case temperature range of 0° to
+85° C for commercial temperature devices and -40° to +85° C for
Industrial temperature devices. Package type, speed (power) of the
device, and air flow conditions affect the equivalent ambient temperature
conditions that will meet these specifications.
Using the thermal resistance from case to ambient ($\emptyset_{CA}$) of
the given package, the equivalent allowable ambient temperature, $T_A$,
can be calculated. The following equation relates ambient and case
temperatures:

$$T_A = T_C - P \cdot \emptyset_{CA}$$

where $P$ is the maximum power consumption at hot temperature, calculated
by using the maximum $I_{CC}$ specification for the device.
Typical values for $\emptyset_{CA}$ at various airflows are shown in
Table 4. Note that the RC64574/575 processor implements advanced power
management, which substantially reduces the typical power dissipation of
the device.
∅CA (°C/W) by airflow:

Airflow (ft/min)     0    200    400    600    800   1000
128 QFP             16     10      9      7      6      5
208 QFP             20     13     10      9      8      7
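As a worked example of the relation T_A = T_C - P * ∅CA, the sketch below looks up the thermal resistance for a package and airflow and returns the allowable ambient temperature. The 3.0 W power figure in the usage lines is purely hypothetical, since the maximum power must be derived from the I_CC specification, which is not reproduced in this section; the names `THETA_CA` and `allowable_ambient_c` are illustrative.

```python
# Case-to-ambient thermal resistance in degrees C per watt, indexed by
# package and airflow in ft/min (values from the table above).
THETA_CA = {
    "128 QFP": {0: 16, 200: 10, 400: 9, 600: 7, 800: 6, 1000: 5},
    "208 QFP": {0: 20, 200: 13, 400: 10, 600: 9, 800: 8, 1000: 7},
}


def allowable_ambient_c(case_temp_c, power_w, package, airflow_ft_min):
    """Apply T_A = T_C - P * theta_CA for a tabulated airflow value."""
    theta = THETA_CA[package][airflow_ft_min]
    return case_temp_c - power_w * theta


if __name__ == "__main__":
    # Hypothetical 3.0 W dissipation at the 85 degree C case limit.
    print(allowable_ambient_c(85, 3.0, "128 QFP", 0))
    print(allowable_ambient_c(85, 3.0, "208 QFP", 1000))
```

The lookup only accepts the tabulated airflow points; interpolating between them would be a further assumption.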
Development Tools
An array of hardware and software tools is available to assist system
designers in the rapid development of RC64574/575 based systems. This
accessibility allows a wide variety of customers to take full advantage of
the device’s high-performance features while addressing today’s aggressive
time-to-market demands.
Cache Memory
To keep the high-performance pipeline of the RC64574/575 full and
operating efficiently, on-chip instruction and data caches have been
incorporated. Each cache has its own data path and can be accessed in the
same single pipeline clock cycle.
The 32kB two-way set associative instruction cache is virtually indexed,
physically tagged, and word parity protected. Because this cache is
virtually indexed, the virtual-to-physical address translation occurs in
parallel with the cache access, further increasing performance by allowing
both operations to occur simultaneously. The instruction cache provides a
peak instruction bandwidth of 2GB/sec at 250MHz.
The 32kB two-way set associative data cache is byte parity protected and
has a fixed 32-byte (eight words) line size. Its tag is protected with a
single parity bit. To allow simultaneous address translation and data
cache access, the D-cache is virtually indexed and physically tagged. The
data cache can provide 8 bytes each clock cycle, for a peak bandwidth of
2GB/s.
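The cache figures above can be cross-checked with simple arithmetic. This sketch assumes that 32kB means 32 × 1024 bytes and that GB/s is counted in units of 10^9 bytes; the helper names are hypothetical. The geometry is computed for the data cache only, because the 32-byte line size is stated for that cache.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Derive set count and address-bit split for a set-associative cache."""
    sets = size_bytes // (ways * line_bytes)
    return {
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2 of line size
        "index_bits": sets.bit_length() - 1,         # log2 of set count
    }


def peak_bandwidth_gb_s(bytes_per_cycle, pclock_mhz):
    """Peak cache bandwidth in GB/s (10**9 bytes per second)."""
    return bytes_per_cycle * pclock_mhz / 1000.0


if __name__ == "__main__":
    print(cache_geometry(32 * 1024, ways=2, line_bytes=32))
    # 8 bytes per cycle: one doubleword of data, or two 4-byte instructions
    print(peak_bandwidth_gb_s(8, 250.0))
```

Eight bytes per pipeline clock at 250MHz gives the 2GB/s quoted for both caches; for the instruction cache this corresponds to the two 4-byte instructions fetched per cycle.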
Table 4 Thermal Resistance (∅CA) at Various Airflows
Revision History
July 22, 1999: Original data sheet.
2. To ensure socket compatibility, refer to Table 8 and Table 9.