DSP Bricklayer: FP hardware

Floating Point (FP) hardware
The goal of this section is twofold:
1) Emphasize the historical importance of the FP COP among all classes of COP. To that end, it covers all types of FP hardware: peripheral, coprocessor, execution unit (FPU), building block, FP board and FP DSP.
2) Show the natural relation to Matlab (and the ease of implementation).

Background

On May 8, 1980, Electronics [ref 1] ran a wonderful concept drawing (*) of two hands working together to illustrate the upcoming Intel Numeric Processor (sic!) [ref 2], which happened to be the birth of a new type of chip: the coprocessor (COP).
At the time, this was real progress, as the only ways to add FP performance to microprocessor-based equipment were the AMD 9511/2 peripheral [ref 7] or a specialized board.

Of course, computationally intensive co-processing had already existed in mainframes, including the somewhat difficult dialog between the Cray-1 and DEC minicomputers.
Today, the FPU is so much a part of the CPU that it is difficult to understand all the fuss about a separate chip. But one has to put oneself in the context of limited silicon real estate. Incidentally, this is still the case for today's embedded resources.
Very soon Motorola answered with the 68881, which benefited from the more advanced interface of the 68000 architecture [ref 4]. In the same spirit, but without the same success, NS [ref 5] came up with its own chips.
Meanwhile, as the years went by, Intel was climbing up the numbering scheme with the 80287, then the 80387 [ref 3], then the 80487. Except that it took us a while to realize that the 80487 was not a chip. While we would not bet on it, the 487 was purely an integration exercise, and the interface might still have been the same COP interface. But it was now 1990, the heyday of computer architecture, and the RISC architecture was dominant. The FP COP became the FP unit (FPU), totally integrated into the pipeline [ref 8] and, in superscalar designs, working in parallel with the Integer Processing (IP) unit. From that point on, the story is largely available on the web [refer to Hot Chips] and goes well beyond the scope of this background. Much more relevant to our scope is the story of the FP Building Blocks (see FP BB).

FPU and FP extensions

We will give an incomplete list of FP units and FP extensions. An FP extension is characterized by a separate ISA and document added to the base architecture. This is not the case for the Pentium, which is natively FP, but it applies to pretty much all other architectures, even the PPC.

PPC: Book E may 99
ARM:
MIPS:
Hitachi: SH7705 FPU
Tensilica:
TriCore: TriCore FPU
TI DSP C28xxx

This is an interesting case, as TI offers two very different solutions.
The 283xx core is a 28xx core to which an FPU has been added. This is the standard situation.
The 2803x family, called Piccolo, does not change the 28xx core. The FPU is an I/O COP that acts as a CPU front end by processing signals coming from the ADC modules [ref Piccolo Control Law Accelerator].

FP BB (1985-1995) [ref 10]

1) At first we had the usual DSP/bit-slice school (TRW, ADI, AMD), which naturally extended its portfolio with a DAU (see bit slice section). These vendors started at 4 bits and climbed the integer curve: 8, 16, 32. At 32 bits, FP had the same data size, so why not? FP has the disadvantage of a larger die size due to the FP adder, but note that the multiplier is only 24x24. The biggest issue is the large step in complexity (such as the IEEE standard), which is a relatively small price to pay for the comfort of numeric accuracy. Very soon, as always in FP, 32 bits was not enough and 64-bit chips appeared. Going further, this was the only market specifically associated with the IEEE standard.
2) Driven by new names (Weitek, Cyrix, IDT), the 64-bit FP BB turned into a chip for the Intel coprocessor socket, which at this writing baffles me completely. They must have missed the RISC revolution somehow. Even with the 486, Weitek was still pushing the use of an external COP. That being said, there are quite a few lessons to learn: their block diagrams are marvels, with a heck of a datapath and a good host-to-COP interface (that was no Cray 2).
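The remark in item 1 that a 32-bit FP multiplier is "only 24x24" can be illustrated with a small Python sketch. It assumes the IEEE single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits plus a hidden leading 1) and handles normal numbers only. The function names (f32_bits, bits_f32, split, fmul32) are made up for this illustration and do not come from any of the chips listed here. Zeros, subnormals, infinities, NaNs and exponent overflow are deliberately left out, which gives a feel for where the extra IEEE complexity comes from.

```python
import struct

def f32_bits(x):
    """Return the 32-bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b):
    """Return the single-precision value encoded by the 32-bit pattern b."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def split(x):
    """Split a normal float32 into sign, biased exponent, 24-bit significand."""
    b = f32_bits(x)
    sign = b >> 31
    exp = (b >> 23) & 0xFF
    frac = b & 0x7FFFFF
    assert 0 < exp < 255, "sketch handles normal numbers only"
    return sign, exp, frac | 0x800000   # restore the hidden leading 1

def fmul32(a, b):
    """Multiply two normal float32 values with one 24x24 integer multiply."""
    sa, ea, ma = split(a)
    sb, eb, mb = split(b)
    sign = sa ^ sb
    exp = ea + eb - 127                 # add exponents, remove one bias
    prod = ma * mb                      # the 24x24 multiply, 48-bit result
    if prod >> 47:                      # product in [2, 4): one-bit normalize
        shift = 24
        exp += 1
    else:                               # product in [1, 2)
        shift = 23
    mant = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # round to nearest, ties to even
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
        if mant >> 24:                  # rounding carried out of 24 bits
            mant >>= 1
            exp += 1
    assert 0 < exp < 255, "result must stay in the normal range"
    return bits_f32((sign << 31) | (exp << 23) | (mant & 0x7FFFFF))

print(fmul32(1.5, 2.5))
```

The exponent path is a small adder and the rounding logic is a handful of gates; the bulk of the datapath is the 24x24 integer array, which is why the multiplier chips came first in the quick list that follows.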
Quick list

MULTIPLIER

TRW 1042
Weitek 1032,1064
Weitek 2264 (IEEE)
ADI 3210

ALU, RALU,

TRW 1022/3 (22-bit FP)
Weitek 1033, 1065
Weitek 2265 (IEEE)
ADI 3220

DAU --> COP

AMD 29325/C327
Weitek 3132/3, 3364
LSI 64132, TI ACT8847

Intel COP

Weitek 3167, Cyrix 83D87

FP DSP

follow this link

FP IP (1995-1999), FP in FPGA (now) [ref 11]

Last in the whole story is the Intellectual Property (IP) craze of the second half of the 90s, when amateurs were writing a bit of C code and selling it as a product (virtual silicon?).
Today, the best ones are productized pieces of IP for FPGAs. Xilinx, Altera, etc. have solid specs describing these components. Multiple other sources can be found on the web.

GPU (now)

The story cannot be complete without mentioning the GPU. While the architecture is in some ways exemplary, it is well beyond the scope of this section.

Topics

Coprocessor Interface

The 68881 outsmarts the 8087 with non-blocking operation

How much IEEE complexity?

The "sacred" standard.
ARM's cheeky solution: a problem with exceptions? No problemo, invent a new IEEE model

TI gives up integer because of matrices and Matlab: the C66

References

John Palmer and his Intel colleagues, "Making mainframe mathematics accessible to micro computers", Electronics, May 8, 1980

(*) from Fred J. Sklenar

Intel AFN 01525A, "8087 80-bit HMOS Numeric Data Processor", June 1980
8087, 80287, 80387

Intel, "80387 Programmer's Reference Manual", 1987
Prakash Chandra, "Programming the 80387 Coprocessor", Byte, 1988
Cedar Yoram, Meir Ben-Nun, "Improving computational throughput", Computer Design, March 1983

Motorola 68881

C. Huntsman "bla bla 6881" IEEE micro, dec 1983

NS 16881

Moshe Gavrielov (yes, the same), etc., "NS 16081 something..", Computer Design, October 1983
Subash Bal, etc., "NS 16081 something...", ED, March 5, 1981
Robert Grappel, "FPU improves 16-bit uC performance", EDN, September 15, 1983

Diverse

Frederick Ware, "HP stuff", Electronics, Feb 10, 1982

FP peripheral : AMD 9511, 9512

Krishna Rapallali, "Chips make fast math a snap for microprocessors", Electronics, April 1980

Placeholder for FPU

PPC: Book E may 99
SH7705 FPU
TriCore FPU

Placeholder for BB FP

Weitek 1032,1064
Weitek 2264 (IEEE)
ADI 3210

TRW 1022/3 (22-bit FP)
Weitek 1033, 1065
Weitek 2265 (IEEE)
ADI 3220

AMD 29325
Weitek 3132/3

T. Fleming, "Monolithic floating point processors streamline microcomputer design", EDN, June 23, 1988
John Gallant, "Math coprocessor ICs", EDN, June 7, 1990
Mauro Bonomi, Pete Wilson, etc., "Floating Point Processing", BYTE special issue, March 1988

Placeholder for FP operator

Amphion, others?
Xilinx, Altera

Placeholder for GPU

Saturday, December 17, 2011