Under the benchmark heading, we will cover all the techniques also known as performance analysis in computer architecture.
Background
With each new type of chip, it was usual for electronics magazines to do a special report where the main item was a synthetic table of features. So we became very adept at doing the same, and for a DSP a table of 20 features was largely sufficient. Even better, it could be reduced to two lines: MHz and number of MAC operations/s. Thinking about it, in those days all DSPs were single-MAC, so MHz alone was sufficient. The big issue was the number of data buses.
- We also used graphical analysis. In a nutshell, memory buses, MAC and ALU were drawn with their data path width; the bigger the area, the better.
Before going further we must note two very important concepts.
The first one is the concept of the benchmark mix. The biggest advance of BDTI was to standardize the mix in a simple and intelligent way.
The second one is that when using kernel benchmarks, we gave up on speed and instead used cycles/kernel. In other words, smaller was better, as opposed to the CA methodology of bigger is better (e.g. D-MIPS).
And then, we naturally went from kernel to application benchmarks. By that time we had plenty of competition from ersatz DSPs which used MHz as the measure of benchmarks, and so we came up with the concept of DSP MIPS.
- By then, the CA guys were completely lost. Our reasoning is complex, but trust us, this is the best way to tackle the problem.
And then, we were so clever, we could also use this figure of merit for DSPs. All DSPs were single-issue, so MHz and DSP MIPS were synonymous. So it was very easy and safe to predict that a 50 MHz DSP had the workpower of 50 DSP MIPS, enough to implement a G.723 speech coder.
By this simple technique we had merged chips and algorithms under one umbrella. Kernel benchmarks could also easily be integrated; for example, the BDT 256-point FFT cost 0.008 DSP MIPS (8000 cycles).
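The bookkeeping behind these figures is simple enough to sketch. The snippet below reuses the 50 MHz budget and the 8000-cycle FFT figure from above purely as illustrative numbers:

```python
# DSP MIPS bookkeeping, as described above. A single-issue DSP executes
# one instruction per cycle, so MHz and DSP MIPS are synonymous.

DSP_MHZ = 50          # 50 MHz single-issue DSP -> a budget of 50 DSP MIPS
FFT_CYCLES = 8000     # 256-point FFT kernel, cycles per invocation

def mips_load(cycles_per_call, calls_per_second):
    """DSP MIPS consumed by a kernel executed at a given rate."""
    return cycles_per_call * calls_per_second / 1e6

# One FFT per second costs 0.008 DSP MIPS out of the 50 available:
print(mips_load(FFT_CYCLES, 1))       # 0.008

# Conversely, the 50 DSP MIPS budget fits 6250 such FFTs per second:
print(DSP_MHZ * 1e6 / FFT_CYCLES)     # 6250.0
```

This is exactly the umbrella described above: chips contribute a MIPS budget, kernels and algorithms consume slices of it.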
But then, all hell broke loose for multiple reasons:
- DSPs became CPU-like, and the CPU standard is Dhrystone MIPS.
- C became the only acceptable means of benchmarking applications.
- Because multimedia was kind of becoming synonymous with DSP, we ended up with all the benchmark politics of data size.
- Audio is 24-bit, so how do you compare apples and oranges, etc.? To avoid the problem, BDTI uses the concept of native size.
- Even more to the point, CPUs and cores took a back seat to application platforms. Applications became the only respectable benchmarks (and, by the same token, totally impractical).
- Nobody in his right mind is going to implement a full application for benchmarking purposes.
- So, as of 2012, benchmarking is 90% Linux testbench (downloading, rebuilding, compiling with -O3, running, testing and measuring) and 10% optimization. We are miles away from evaluating the performance of a DSP.
List of techniques
- Table of features
- Problem: nowadays this is quantitatively more difficult. A standard SoC is made of hundreds of basic pieces of IP.
- Problem: when a 32-bit shifter is not equal to a 32-bit shifter
- Single figures of merit
- MIPS
- DSP MIPS
- MMAC/s
- the problem
- MOPS
- the problem: 1 MMAC = 2 MOPS
- ITU WMOPS
- D-MIPS (Dhrystone MIPS)
- Graphical figure of merit: the Lucent Cube
- Graphical analysis
- Manufacturer benchmarks (kernels)
- personal and custom benchmarks
- Industry standard DSP benchmark (assembler) - BDT
- Industry standard DSP benchmarks - the rest -> from bad to worse
- Benchmark results give a ranking linearly proportional to MHz! So why bother?
- Application benchmarks
- Models
- graphical models; fatter is better
- Bob Owen's nice little drawings.
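The "problem" flagged under MOPS in the list above is plain marketing arithmetic, worth spelling out. The 100 MHz device in this sketch is hypothetical:

```python
# A MAC is one multiply plus one add, so datasheets can quote the same
# hardware as N MMAC/s or as 2N MOPS. Hypothetical 100 MHz device:

mhz = 100                    # clock of a single-issue, single-MAC DSP
mmac_per_s = mhz * 1         # one MAC per cycle -> 100 MMAC/s
mops = mmac_per_s * 2        # 1 MMAC = 2 MOPS: same chip, doubled number

print(mmac_per_s, "MMAC/s ==", mops, "MOPS")
```

Two vendors quoting the same silicon with the two different units will look a factor of two apart, which is why a single figure of merit needs its definition attached.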
References
- BDTI web site
- Eric Martin