Friday, November 27, 2015

T.O.C 27Nov2015

=====================   last updated: 17dec2015 ============================
  1. So you want to be a DSP architect?
    1. THE LONG STORY
      1. DSP Architecture today
      2. Why Matlab?
      3. The SSS (Sad State of Software) story
    2. BACKGROUND CHECKS 
      • Hennessy and Patterson (CA)
      • Coprocessing
      • The "CORE"/the "CELL" / The "SLICE"
      • Benchmarking
        • BDT as an architectural tool
      • Fixed Point dialects
      • Algorithms and Data types
      • Platforms and SOC architecture
      • BUILDING BLOCK ISSUES
        1. Physical limits
        2. Design time, Build time and Run time
      • THE FLOW: METHODOLOGY AND TOOL ISSUES
        1. My stories
          1. Back to 1979: What is descriptive language? 
          2. Mapping vs Compiling: the never ending story?
          3. Soft vs Hard Macros - is Firm the answer?
        2. NPU lessons
        3. Retargetable compilers
        4. Simulation: we KNOW profiling is THE key! got the answer?
        5. Beserkley or Berkeley? and Alberto during that time...
      • FOUND IN THE WEBS
        1. The Matlab Engine 
        2. Is Kurt Keutzer the Donald Trump of Hardwired Processing? 
        3. Another Berkeley Randy Cat?
        4. Found in the cobwebs: my garage
        5. Jeff Bier is promoting CNN! 
        6. Andre Dehon is promoting Multics!
        7. The Trailing Edge
      • DSP (uP) (HISTORY OF ~ ARCHITECTURES) 
        1. DSP of the First Kind (1980-95)
        2. DSP of the Second Kind (1995-2005)
        3. DSP of the Third Kind (2010- ?)
        4. DSP of the lost kind (1995-2005)
        5. DSP  of any kind (1950-2050)
        6. DSP vs Micros
        7. DSP vs FPGA
        8. The Archives and the DSP historian
      • FROM DSP TO DSP
        1. This wonderful world of DSP
          1. A world? More like a sect! 
            1. The Pope (Will), the Cardinal (Jeff) and the Wizard (Gene)  
        2. The DSP Old Timer's Club.
        3. Stop me if you've heard this one before!
        4. DSP history
          1. Bit slice: a saga 1975-1985
          2. Building Block:  was 1992 and IDT  the last DSP BB? or Weitek?
      • TARGET BBs
        1. Analog programs (Basic, HPC and other gizmos)
        2. TI C25 Tips and Tricks 
        3. SPmag Tips and Tricks
        4. SPMag 1990-2005
    3. BB Level 1
      1. Arithmetic BB
        1. add, mul, compare, conditionals, degenerated, compound, muladd
      2. Bit wise BB
        1. Shift, 
        2. Bit-field, 
        3. Byte field, 
        5. Bit count logic (cls, clz, clo, etc.)
        5. Boolean processing 
      3. Vector BB
        1. Vector arithmetic, 
        2. Memory vector search, structured tables, lookup
        3. Shapes (Matlab)
          1. Shape generation
          2. Matlab primitives
          3. Reporting
    4. BB1 DSP STRUCTURES
      1. DSP trilogy 
        1. ALU
        2. BMU
        3. MULT  
      2. MAC
      3. DAU
      4. AGU 
      5. PCU
      6. RF (Register File) and MEM
      7. VPU (Vector Processing Units)
      8. SU (The Shuffle Unit) 
      9. Bit slice: Bit slice building blocks
    5. BB2 DSP FUNCTIONS
      1. Filter
      2. FFT
      3. Correlators & bit comm. engines
      4. Sampling functions
      5. Matlab dsp functions
        1. accumarray
        2. diff
        3. sum
    6. BB3 MATLAB CONSTRUCTS
      1. Find
      2. Predication with Matlab
      3. Matrix functions
    7. BB4 MATH 
      1. Divider
      2. Math.h
      3. Complex numbers
      4. Further with numerical recipes
    8. BB5 CODING
      1. Basic Coders
        1. Coding generalized engine
        2. Gray, Hamming, etc..
        3. Parity
        4. Cyclic codes: Firecode
        5. CRC
        6. Huffman coding
        7. Arithmetic coding
        8. Galois Fields
      2. Convolutional coders
        1. Gsm decode
        2. Hard decision
        3. distab
        4. distaabcc
        5. Viterbi equalizer
        6. Viterbi butterfly
      3. Com. Block coders
      4. Iterative coders
        1. Turbo encoding
        2. Turbo decoding
        3. LDPC
      5. Walsh, Hadamard and cdma
        1. FHT
        2. Happy Chirper.
    9. 5 MATLAB TOOLBOXES
      1. NUFFT
      2. SAR
    10. APPENDIX 

    Sunday, November 22, 2015

    Is Andre Dehon a genuine philosopher and Multics misunderstood?

    The answer is: YES
    =====================================================================
    That beats all for this week!

    We all know how bad Multics was and so Unix was created. From that point history has been written:

    • Unix is the white knight....mac OS ...embedded linux
    • Microsoft windows/dos/PC  is the dark side
    • Multics? dead and buried.. never again.

    So imagine my surprise when, while checking some older MIT stuff from AD (Andre DeHon), I came across the Multics Reunion MIT 2014 with a paper by AD.

    Abstract: At a time when computers are increasingly involved in all aspects of our lives, our computer systems are too easily broken or subverted.  The current state of affairs is, no doubt, unsurprising to Multicians who are painfully aware of the design and security compromises that went into the base design of today's mainstream systems.  The past 30 years has also brought vast changes in the availability and costs of computer hardware as well as significant advances in formal methods.  How do we exploit these advances to make computer systems worthy of the trust we are now placing in them?  We specifically take a clean-slate approach to computer architectures and system designs based on modern costs and threats.  We spend now cheap hardware to reduce or eliminate traditional security-performance tradeoffs and to provide stronger hardware safety and security interlocks that prevent gross security and safety violations even when there are bugs in the code.  We embrace well-known security principles of least and separate privileges and complete mediation of operations.  Our system revisits many pioneering Multics concepts including gates between software components with different-privileges, small and verified system components, and formal information flow properties and guarantees.
    Project paper: http://www.crash-safe.org     

    Now, I consider AD a madman since he went back to the east coast instead of staying in sunny California. But then AD was (is) always the ultimate contrarian, lateral thinker, deep searcher of truth, and so I should not be surprised by such support for Multics.
    Ten years ago he was fighting for the replacement of silicon platforms by {bio? to be checked}. And asked about the prospect, he rightly said that he was an academic and would not dare fight the semiconductor industry...

    Is there a lesson to learn? Obviously, cutting corners and bowing to economic pressure will bite you in the long run. Was Multics the answer to hacking? I have my doubts (but a deep interest in this kind of rhetorical question).

    And I cannot leave without mentioning Andre DeHon's bio
    Andre DeHon received S.B., S.M., and Ph.D. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 1990, 1993, and 1996 respectively.  From 1996 to 1999, Andre co-ran the BRASS group in the Computer Science Department at the University of California at Berkeley.  From 1999 to 2006, he was an Assistant Professor of Computer Science at the California Institute of Technology. 
    In 2006 he joined the Electrical and Systems Engineering Department at the University of Pennsylvania, where he is now a Full Professor.  He is broadly interested in how we physically implement computations from substrates, including VLSI and molecular electronics, up through architecture, CAD, and programming models.  
    He places special emphasis on spatial programmable architectures (e.g. FPGAs) and interconnect design and optimization.  

    Multics BIO:  Andre DeHon is a bastard child of the tail end of LISP Machine and Multics eras, having been a research assistant for Knight and a teaching assistant for Saltzer.  As a member of MIT's Student Information Processing Board (SIPB), he was part of the group that pushed Multics access to MIT students and was logged in during the decommissioning of MIT-Multics. So, while he never contributed to Multics, he was around in time to learn that there were computer systems that predated Unix and Windows and that did have a principled way to address safety and security.  He hopes the world is now ready for many of the Multics and LISPM ideas that were ahead of their time and have mostly been forgotten during the dark ages of mainstream Internet growth.

    1.4 THE FLOW: METHODOLOGY AND TOOL ISSUES

    The goal of this section is to put together all topics relevant to methodology and tools

    The proposed M2IMPIR flow

    M2IMPIR (Matlab2Implementation) quick description 
    1. Matlab BB (Building Blocks)
      1. Develop specific BBs. 
      2. Reuse BBs, out of one of the several thousands of Toolboxes and libraries.
      3. Build AS-DSP and simulate it.
    2. Map to Ideal Platform (BB to BB)
    3. IMPlementation with Ideal Resources
      1. Run and verify (a tiny sketch of this step follows below)
      2. Iterate 
    M2IMPIR issues
    • There are 3 types of issues: Matlab, Mapping, Implementation platform. 
    1. Matlab: Here the problem is "Matlab BBs". 
      1. Because, more often than not, a Matlab BB ends up as an FPGA IP, i.e. HDL code, i.e. concurrency. And Matlab is anything but concurrent.
      2. The other issues have to do with the language itself. Going through all of them is outside the scope of this section; let us just say the number is large and many are not obvious.
    2. Mapping:  Mapping has a few pitfalls but they are well known. 
    3. Implementation platform:
      1. My current methodology is to use a platform with ideal resources (which is not the same as infinite resources). That's all. 
      2. Even an ideal platform has some issues 
        1. but at least the flow is manageable.
      3. For those interested in implementation platforms and architecture, I propose starting with Andre DeHon's web site (at UPenn), reading all the papers, and going from there.
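
    As a companion to the flow above, here is a minimal sketch of what step 1 (a specific Matlab BB) and step 3.1 (run and verify) can look like. The BB name sat_add16 and the golden model are my own illustrative choices, not part of the flow itself.

    % step 1: a specific BB, here a hypothetical 16-bit saturating add modelled in doubles
    sat_add16 = @(a,b) min(max(a + b, -32768), 32767);
    % step 3.1: run and verify against an independent golden model
    % (Matlab int16 arithmetic saturates by default, so it can serve as the reference)
    a = randi([-32768 32767], 1, 1000);
    b = randi([-32768 32767], 1, 1000);
    ref = double(int16(a) + int16(b));
    assert(isequal(sat_add16(a,b), ref), 'BB disagrees with the golden model');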

    Further Topics

    1. My early life: looking for an oxymoron: the descriptive Language
    2. The NPU lessons
      1. The thesis of Norwich Rich. 
    3. Retargetable compiler: from bit slice to Archelon.
    4. Meet the infinite loop: simulation and profiling on RC platforms (reconfigurable computing).

    Friday, November 20, 2015

    Jeff Bier is promoting CNN

    ----  Jeff Bier is promoting CNN! ------

    Thesis: Neural Networks? not again!

    I used to say "when I hear the word RISC (or speech recog) I draw my guns". Things have slightly changed, so instead these days my current one is "when I hear Neural Network..."
    I never had a really good fight with NN because I never met NN in the open. 
    NN was an esoteric digital technique (like many others) found in the DSP conference papers. Over the years I have kept NN implementations of DCT, FFT, MAC, etc., and never had a chance to take a closer look.
    But recently I have seen  many micro-controllers promoting the technique. 

    Antithesis: Neural Networks are alive and well

    1. Matlab has a Toolbox
      1. and maybe much more than that. 
    2.  Jeff Bier put his money where his mouth is:
      1. The Embedded Vision Alliance invites you to attend an exciting event --> see below:
    Synthesis: No thanks, I pass on that one. I still think it is bleeding edge. 


    Wednesday, November 18, 2015

    DSP architecture today


    The goal of this section is to put in one place any architecture bits and pieces which are interesting for this blog. 

    The generic design methods                                                                                                  
    • Vector processing 
    • SIMD, swp (sub-word parallelism), multi lanes, packed arithmetic... (a packed-arithmetic sketch follows this list)
      • where to stop? 256? 1024?
      • why stop at 1K? Matlab row vectors, column vectors and arrays are an order of magnitude higher
      • Minimum parallelism= 64K 
        • simd of 16  x 16 clusters x 16 cores  x 16 modules 
      • Golden ratio (x4, x16, x64, x256) 
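
    What I mean by packed arithmetic, in plain M-code: four 8-bit lanes packed in one uint32 and added with no carry crossing the lanes (a classic SWAR trick; the values here are made up).

    a  = uint32(hex2dec('11FF7F80'));            % lanes 0x11, 0xFF, 0x7F, 0x80
    b  = uint32(hex2dec('01017F80'));            % lanes 0x01, 0x01, 0x7F, 0x80
    m7 = uint32(hex2dec('7F7F7F7F'));            % low 7 bits of every lane
    m8 = uint32(hex2dec('80808080'));            % the lane MSBs
    lanes = bitxor(bitand(a,m7) + bitand(b,m7), bitand(bitxor(a,b), m8));
    dec2hex(lanes)                               % per-lane (a+b) mod 256: '1200FE00'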

    On My watch                                                                                                                          

    • "ASPI world"
      • Andre DeHon
      • FCCM
      • Xilinx at Large
    • GPU
      • Matlab and GPU
      • CUDA
      • Deep packet searching
    • Hot Chips
    • Tensilica cores and domain-specific platforms
      • Cadence

    They Still Matter

    • ARM Cortex A - family and evolution
      • how far will they go with speculation? 
      • repeating the same mistakes in a new way?
    • Intel ecosystem
      • Altera integration
    • TI Platforms evolution 
      • replacing DSP core with hardwired COP

    Lessons Learnt                                                                                                                         

      • NPU

      ---------------                                                                                                                               


          Saturday, November 14, 2015

          SPmag1990 jan

          SPmag 1990 jan

          (this post was written while listening to: Dead Europe 72 interleaved with NRPS first album)

          We start lucky. In this issue, the main article is a survey of multiplier techniques by the master himself (Fred J. Taylor). Also interesting is the introduction of Matlab version 3.5, including the all-new signal processing toolbox. All the books reviewed and many items are relevant to this work. We will NOT finish with a very fashionable topic: Neural Network.

          BB struct                                                                                                                        

          The multiplier (Fred J. Taylor)

          This is the DSP BB par excellence. There is hardly any algo which does not rely on a multiplier, and it is not cheap. As Fred mentioned, the 16x16 multiplier occupies 40% of the 32020 chip. And, I believe, 80% of the non-memory real estate.
          Now, in our world of ideal resources, why do we bother with the size of the MULT? Because ideal does not mean infinite. If we develop (say) a 2000-million-node chip, each node had better be optimised. And the best choice for a node (or PE) is a MULT. The next best choice is a DSP core, where the MULT is still 80% of the core. (It reminds me of the debate at Xilinx about replacing the embedded 18x18 MULT with a C25-like core to extend their market share; all people cannot be right all of the time.)
          So here are the techniques (flattened) given by Fred:
          • Shift-add or iterative
          • Booth algorithm (and modified~)
          • Wallace trees
          • Cellular arrays (Pezaris, Baugh-Wooley)
          • Systolic Array Multiplier
          • Bit Serial
          • Distributed arithmetic
          • Canonic signed digit number systems
          • Logarithmic number systems
          • Residue Number systems
          With the hindsight of 25 years, what to think? Well, it stands up pretty well, except for the systolic array.
          What about the non-traditional number systems? Frankly, we are NOT very positive about these, because the problem is the time lost in the interface.
          As for the others, bit serial and DA are common FPGA techniques and are available in Matlab too  (how to generate HDL code for a lowpass FIR filter with Distributed Arithmetic (DA) architecture). 
          The parallel techniques are standard logic techniques, but they are still the most promising because the exploration space is vast.
          We have a favorite, which is a kind of sub-word parallelism. The longest delay in a 16x16 MULT is the final 32-bit adder. This delay can be halved by splitting the adder into three 16-bit adders. By the same token we can subdivide further with byte and nibble adders, hence reducing the delay from 30 to 6 logic gates. We simulated this type of architecture in Matlab (using FP!) (see elsewhere).
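
          For the curious, here is a rough re-creation of the split-adder (carry-select) idea in M-code. It is my sketch in doubles, not the original simulation: one low 16-bit adder, two speculative high 16-bit adders, and a select on the low carry.

          a  = randi([0 2^32-1]);  b = randi([0 2^32-1]);
          al = mod(a, 2^16);  ah = floor(a / 2^16);       % split the operands
          bl = mod(b, 2^16);  bh = floor(b / 2^16);
          sl   = al + bl;                                 % low 16-bit adder
          clow = floor(sl / 2^16);                        % its carry out
          sh0  = mod(ah + bh,     2^16);                  % high adder, carry-in = 0
          sh1  = mod(ah + bh + 1, 2^16);                  % high adder, carry-in = 1 (speculative)
          s    = mod(sl, 2^16) + 2^16*(clow*sh1 + (1-clow)*sh0);   % the select
          assert(s == mod(a + b, 2^32))                   % matches a plain 32-bit add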

          Workshop: VLSI for Signal Processing at ICASSP 90 (Edward Lee)

          It is mentioned that several Experimental Parallel Machines for signal processing will be explained. OK, let us see.
          With the hindsight of 25 years, what to think? A bit of a disaster array (sorry, area) and we know the result (the GPU). However, in the M2IMP world, where the resources are ideal and the software is straightforward, each of these EPMs will be given a good look in due time.

          Vector Quantizer (Tran et al., IEEE Tr. on Com., Sep 89)

          It is worth asking if the VQ is a structural or functional (speech specific) BB. 
          In our experience we consider it a generic unit with app-specific parameters.
          Matlab? It is an object in the DSP System Toolbox.
          With the hindsight of 25 years, what to think? This is a good example where Matlab is now the reference. 
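
          To make the "generic unit" point concrete, here is a bare-bones VQ encode in plain M-code (my sketch; the codebook and inputs are made up, and the toolbox object mentioned above wraps the same idea):

          cb = [0 0; 1 1; 2 0; 0 2];                 % codebook: 4 codewords, 2-D
          x  = [0.2 0.1; 1.9 0.2; 0.8 1.1];          % input vectors, one per row
          idx = zeros(size(x,1), 1);
          for n = 1:size(x,1)
              d = sum((cb - repmat(x(n,:), size(cb,1), 1)).^2, 2);   % squared distances
              [~, idx(n)] = min(d);                                  % nearest codeword index
          end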
            

          BB DSP fun                                                                                                                     

          FFT workshop at ICASSP 90 (John Cooley)

          In this workshop the FFT master explains all the options and innards, including the different radices. Great!
          Matlab? FFT is a built-in. The number of samples is totally flexible. The problem is that it is a black box. Not quite all dark, as it is based on FFTW. But we had a mixed experience with the golden-model methodology while trying to develop a radix-9 FFT and comparing it to Matlab, stage by stage. As a matter of fact it was easier in Excel. As I am writing this, it does not make sense, since in effect I am saying that the FFT is non-deterministic. But so is my memory of it. I do remember Marc having a lot of trouble with a Viterbi decoder, but that is a different case because it is a trellis.
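
          The golden-model check itself is trivial; the pain is in matching intermediate stages. For reference, here is the simplest form of the check, a naive DFT against the built-in fft (nothing radix-9 specific, just my illustration):

          N = 9;  x = randn(1,N) + 1j*randn(1,N);
          W = exp(-2j*pi*(0:N-1)'*(0:N-1)/N);    % DFT matrix
          X = (W * x.').';                       % naive O(N^2) DFT
          max(abs(X - fft(x)))                   % should be at round-off level (~1e-15)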

          Further Transforms - workshop at ICASSP 90 (John Cooley)


          In this workshop, John goes further with number-theoretic transforms (NTT) and polynomial transforms.
          Matlab? As far as I know they are not included. Fortunately, the community supplies them:
          • NTT: NTT.m on matlab central. 
          • PT brought me to the Chemnitz site of Daniel Potts, which in turn brought me to the NUFFT toolbox.
          With the hindsight of 25 years, what to think? 
          • FFT: now old hat, {see elsewhere}.
          • NTT: I have no clue, but an NTT engine would be nice (a toy sketch follows this list).
          • NUFFT (non-uniform FFT): an extremely promising field, which I bundled with CS (compressed sampling).
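
          The toy NTT sketch, just to show the mechanics (my own few lines, not the NTT.m file mentioned above): length 8, modulus 17, root of unity 9, all exact in integer arithmetic.

          p = 17;  N = 8;  w = 9;  winv = 2;  Ninv = 15;   % 9*2 = 1 mod 17, 8*15 = 1 mod 17
          n  = 0:N-1;
          F  = mod(w   .^ mod(n'*n, N), p);                % forward matrix (exponents reduced mod N)
          Fi = mod(winv.^ mod(n'*n, N), p);                % inverse matrix
          x  = randi([0 16], N, 1);
          X  = mod(F * x, p);                              % forward NTT
          xr = mod(Ninv * mod(Fi * X, p), p);              % inverse NTT
          assert(isequal(xr, x))                           % exact reconstruction, no rounding error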

          Communication paper: dpll

          Matlab? Not included; see the ecosystem: "Modeling and Simulating an All-Digital Phase-Locked Loop" by Russell Mohn, Epoch Microelectronics Inc.

          Communication paper:  sigma delta 

          Matlab? included

          BB in DSP domain                                                                                                           

          A tutorial at ICASSP 90: SAR (Synthetic Aperture radar) (David Munson) 

          Matlab?  Included, excellent Block diagram and tutorial.  


          BB in Matlab specific                                                                                                      


          These Building Blocks are Matlab-specific because, if they exist, it will more likely be in the shape of M-code than anything else. I have never met them in DSP software libraries or commercial chips.

          The Signal Processing Toolbox is introduced {advert for Matlab 3.5}
          First ever SP toolbox. It includes a large number of functions, many of which are not very common.

          Non-Linear Spectral Estimation Methods (Prabhu et al., IEEE Proceedings, June 1989)
          Here is an article which covers algos that a standard DSP cannot implement. Among these techniques are ARMA, MUSIC and ESPRIT.
          Matlab? They can be found across diverse Matlab toolboxes.
          Also, many of them can be found in the very professional-looking HOSA toolbox, which is available from the Central. {HOSA - Higher Order Spectral Analysis Toolbox, updated 12 Feb 2003: spectral and polyspectral analysis, and time-frequency distributions.}
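
          A quick sanity check of the MUSIC side, assuming the Signal Processing Toolbox pmusic function is on the path (the signal and settings are my own, not from the article):

          fs = 1000;  t = (0:1023)/fs;
          x  = cos(2*pi*120*t) + cos(2*pi*128*t) + 0.5*randn(size(t));   % two close tones in noise
          [S, f] = pmusic(x, 4, 2048, fs);    % p = 4: two real sinusoids
          plot(f, 10*log10(S)), grid on       % expect two sharp peaks near 120 and 128 Hz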

          SUMMARY TABLE 




          Monday, November 9, 2015

          The DSP Historian

          DSPs - Archives

          1. The DSP Historian (see below)
          2. 1978 - 1987: the call-Up
          3. 1988 - 1995
          4. 1996 - 2004
          5. 2004 - Today 
          6. Pre History

          The DSP Historian:
          The goal of this section is to go through all the papers whose topic is "the DSP uP chips". This will become clearer as we go.

          Background
          The knowledge of the "History of DSP uP chips" is a major tool in my "alternative world" methodology to develop AS-DSP IP. As explained elsewhere, I believe that anything after 1990 has little value compared to whatever happened before. Hence the "History of DSP uP chips" becomes 2 major fields:
          • before 1990: the chips and architectures to know
          • after 1990: the field of ideas: sounding board, the mistakes, the real advances, etc..
          Taking that at face value, there are plenty of budding architects who would like a quick summary of DSP uP chips before investing further time.
          Hence, the issue becomes:  where is the best story? "who is the best DSP historian?"

           Anything in my garage?                                                                        
          I will do the usual and go through my stash. In one folder I found stuff which should be available on the web, but not always free.
          • Noted, in a 10-page paper [1]:
            • the first-generation DSPs (AMD 2900 and TI C10) followed by the 2nd generation (C25, DSP16). This is so wrong that you have to admire the author (a lateral thinker? a hyperspace traveller?).
            • Anyway, the point here is that by studying the history of DSPs, the architect is obliged to classify (by generation, by features) and hence make choices, which, as we all know, are the lifeline of a DSP architect. If you cannot even classify DSPs, don't bother.
            • Also noted are references to the pipeline, including the mind-boggling <data stationary versus time stationary>. This, I believe, is a complete red herring, but in the free world of DSP anything is fair and square.
            • OOh, and the DSP32C has a reservation table!! 
            • Finally, another "gem" is Edward Lee's "interleaved pipelined" architecture. Hum?
          • Speaking of Edward Lee, not only is he the grandfather of many things (see BDT) but also the father of the classical paper [2,3], a first, dated 1988: <Programmable DSP Architectures>. If you have access to the paper, you can stop here.
            • There is an IEEE Micro version dated 1990 [4], in which Lee conservatively dropped the architecture term.
              • In those days, only a few people could risk using the term (see MPR Oct 1989).
            • In my case, Lee was far from being the first one. Since 1978, I had already written 4 different versions of the History of DSP chips, including a 5m-long mural, so Lee was of little value to me.
          • At this stage it is worth mentioning the TRILOGY (Will, Gene, Jeff). What were they doing in 1988? For different reasons, none of them had written "THE REFERENCE".
            • Jeff: obviously, being a student of Lee, he postdates his work.
            • Gene: being at full war with his competitors, he was not in a position to write objective reports. Still, good grunting noises could be heard at TI conferences.
            • Will: I came across his first(?) report circa 1986. Primarily a marketing document, but in those days it was a technical gem.
          • Following this, here are some much more recent papers, freely available, all being university courses on DSP.
            • With the title Special Purpose Digital Processors (DSP) [5], a new type of acronym is introduced: the warped acronym. Why not? e.g. General Purpose Processor (CPU). It is a rather exhaustive 29-page lecture-note document from TUHH.
            • Less exhaustive: a Texas-loaded ppt from Austin [6].
            • A French 144-slide pdf, in French, from IRISA [7]. Note the T.O.C. is perfect for a classical story of DSPs.
              • I. Introduction
              • II. MAC/Harvard architectures
              • III. Evolution of DSPs
              • IV. Development flow
            • Finally, [8] is a French paper in English from 2002. The authors are from McGill.
          Gene's law
          As already mentioned, Gene Frantz was kind of busy proselytizing the world to the DSP mantra until, in 2000, he was convinced to write a Millennium paper [9] for IEEE Micro in which he developed his popular (among us DSP architects) Gene's law. Gene's law is the DSP equivalent of Moore's Law.

          First, the not-so-good.
          • Here is the lesson for all of us architects: we DON'T KNOW HOW TO limit ourselves to history. It is always past, present, future. Extrapolation is what sells the paper, and 9 times out of 10 we are wrong.
          • Unfortunately this paper was badly timed, as Moore's law (speed) was going broke.
          • Gene extrapolated to 2010 and came up with the 10 GHz DSP.
            • Myself, around that time I had papers extrapolating speed for CPUs and embedded CPUs: 2010: 10G, 2020: 20G, 2030: 30G for CPUs; you see the trend. Embedded CPUs had a lower slope and were around 5G in 2010.
          • Now, even better is the Moore analogy. Little known but true: in multiple interviews circa 1980, Moore explained that with ultra-VLSI, except for FFT, he had no idea how to fill a logic chip. Good one, Gordon, no wonder you came up with the iAPX 432. In the same way, here is Gene turning the TI slogan "limits to your imagination" on its head; the excerpt: (Determining how to use that processing power effectively will require imagination that goes beyond conventional engineering methodologies.)... ooooh my oh my!
          • In all fairness, the paper was called DSP trends, so...
          Now, for the good things.
          • Moore's law is broken but I would not be so sure about Gene's Law
          • Gene changed architecture by putting power consumption (and efficiency) to the forefront.
            • This is what matters to me. Effectively it means that in the end, after all the low-hanging fruit is gone, there is only 1 technique left to improve efficiency: hardwired is replacing software. Customization is replacing CPU clock. {At this stage parallelism and multicore are just an implementation detail.}
          • Gene extrapolated the 50 GIPS DSP, which did happen and by today even seems pretty tame.
          • Gene mentioned a lot of stuff, details that I picked up, and in the end this is it: MY MOST RECENT PAPER ON THIS TOPIC WITH ANY VALUE.
            From Lee to BDT
            By the 1990s, Jeff Bier was starting an exceptional business <www.BDTI.com>, the B standing for Berkeley (see the story of B&B elsewhere). As part of his duties to the DSP community at large, Jeff was giving lectures on DSP chips. So if you have read Lee's seminal paper, the next logical step is a Jeff paper. Which one? No idea; I have 41 in my stash. Even Jeff does not know how many he made, so better to go to the BDTI web site.
            Also, there is much more to BDTI than DSP chips history and trends. They are THE reference in DSP benchmark (see elsewhere).   

            Krishna Yarlagadda
            By 1996, BDT had the monopoly on "DSP chips - history and trends". Serious talk about breaking it was heard in congress. Courageously, Krishna Yarlagadda attempted to infiltrate DSP conferences with his own version [10], but thinking about it, it was a clever disguise to launch HelloSoft in what became the big revolution of our world of DSP, "outsourcing assembly to India" {see elsewhere}.

            Henry Davis
            Hi Henry! 

            Robert Cushman and the EDN DSP directory
            From 1981 to 1988(?), Robert Cushman, a major contributor to EDN, wrote many articles on DSP. He put together the first EDN (microprocessor-like) DSP directory in 1987. I have all of them till 2008. Going through them in order gives an exceptional view of history.

            THE UBER REFERENCE
            Noted with pleasure that a lot of papers list the classic BDTI book as a reference.
            Phil Lapsley, Jeff Bier, Amit Shoham, Edward A. Lee, "DSP Processor Fundamentals: Architectures and Features," IEEE Press, 1996.
            This book is THE BIBLE reference for DSP architects (see elsewhere). It is much, much more than an introduction to DSP chips.

            References                                                                                                                                      
            1. <honestly I don't know; lost in history>
            2. Edward Lee "Programmable DSP Architectures Part 1" IEEE ASSP oct1988
            3. Edward Lee "Programmable DSP Architectures Part 2" IEEE ASSP jan1989
            4. Edward Lee "Programmable DSPs - a brief overview" IEEE Micro oct1990. This is essentially the intro of paper 1 + an excellent  summary table.  
            5. F. Mayer-Lindenberg, "Special Purpose Digital Processors (DSP)" TUHH (Hamburg University) lecture notes, dated ? 
            6. Brian L. Evans, "Introduction to DSP", University of Texas, Austin
            7. <http://www.irisa.fr/R2D2>   Olivier Sentieys, IRISA, 2005
            8. Benoit Champagne, Fabrice Labeau, "An Introduction to Digital Signal Processors", compiled 2002
            9. Gene Frantz "DIGITAL SIGNAL PROCESSOR TRENDS", IEEE micro nov-dec 2000 p.52
            10. Krishna Yarlagadda, "DSP chips for communications", which conf?? 2003?? (see excellent slide)
            This is a technical slide, illustrating different architecture choices!!
            Editor's note: no links are given as they become obsolete. Google keywords instead!

            Saturday, November 7, 2015

            The philosophy of BB

            The philosophy of BB

            1. A BB is either simple or complex. It should never be complicated. 
              1. The difference between complex and complicated is that complex can always be broken down in a sum of simple things. 
              2. And complicated is, well... complicated. For instance, a lot of software is complicated.
            2. People familiar with the evolution of electronics remember the SSI, MSI, LSI stages and can place the heyday of BBs in the days of Bit Slice.
              1. A bit slice was built with simple elements, and a bit slice was itself the building block used in upper DSP Building Blocks, DSP functional blocks (FFTer, Filter, Correlator), DSP CPUs or even the DEC mini-computer.
            3. For instance, a standard DSP CPU (called a DSP thanks to TI) is made of 4 standard bit-slice blocks (ALU, MULT, AGU, LSU) + memories + I/O space.
            4. A very good DSP design should have a maximum of 3 hierarchical levels (MSI, LSI, Final product). 
              1. Anything above 5 levels is prohibitive and should not be tackled here.
            5. Because we rely on a hierarchy of blocks, testing of each block is vital to the whole process. 
              1. A bug in a low-level block is a disaster since it will be repeated hundreds of times over several final products.
            6. Same thing for an inefficient block but with a twist.
              1. Contrary to a bug, efficiency is a relative term. 
                1. You can redesign the BB with a different name. 
                2. (for sophisticated designers) you can redesign an upward-compatible block with the same name
                  • but then you need a very hefty test suite (see the sketch after this list).
            7. So what about repairing a bug? 
              1. In my book, a bug is both visible and NOT compatible with itself. 
              2. Hence if you repair a bug you don't know the result; somewhere down the line somebody wrote software taking this bug into account or somebody used the BB taking the bug into account.
            8. This is the problem INTRINSIC to the BB approach. You are going BOTTOM UP in the definition. So forget about repairing bugs. Instead, create another BB.
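
            The kind of "hefty test suite" I have in mind, in miniature (the BB here, a rounding right shift, and its "improved" twin are made up purely to show the harness): the new block must reproduce the old one bit-exactly on directed corner cases plus a random soak.

            old_bb = @(x,s) floor((x + 2.^(s-1)) ./ 2.^s);   % shipped BB: rounding arithmetic shift right
            new_bb = @(x,s) floor((x + 2.^(s-1)) ./ 2.^s);   % candidate upward-compatible redesign
            x = [-32768 -1 0 1 7 8 32767 randi([-32768 32767], 1, 1e5)];   % corners + random soak
            for s = 1:8
                assert(isequal(new_bb(x,s), old_bb(x,s)), 'new BB is not upward compatible');
            end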

            Matching functions to structures

            ====== Building Block ISSUES  ======= 

            Matching functions to structures

            When you design a new BB, it is because you need it. Hence we call this need a strictly functional need. There are two ways to go:
            - stick to it and solve the present issue.
            - have a broader view and design a more general purpose brick which can be reused.
            A third case happens when the functional block can be turned into a generic structural block at no additional price. So before designing a new block, it is worth asking oneself:
            Can I make this BB more generic?
            Let us take as an example the additions needed in a complex multiplication:
            zr = xr*kr - xi*ki
            zi = xr*ki + xi*kr
            same as
            zr = p1 - p2
            zi = p3 + p4
            If we use a dual multiplier, the 4 multiplications can be done in 2 cycles.

            Then it must be followed by two BBs for the complete equations. We now write the equations as follows.
            1.  [zr zi] = thru22(p1,p3)            %  zr = p1, zi = p3
            2.  [zi zr] = addsub42(zi,p4,zr,p2)    %  zi = zi+p4, zr = zr-p2
            It is easy to see that the two BBs are generic, but the inputs and the equations are not straightforward.

            The standard alternative is
            1. zr = sub(p1,p2)
            2. zi=  add(p3,p4)
            which is simpler. So what are we talking about here?

            In fact there are two issues with the standard way:
            1. It needs the 4 input data on the first dual multiplication.
            2. It does not work when we want real_imag to be in the same register, so that the complex number is considered an entity made of 2 parts (see the sketch below).
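
            To pin the two generic BBs down, here is a runnable sketch of my reading of them (the names thru22 and addsub42 are from the text above, but the bodies and argument order are my assumption), checked against Matlab's own complex multiply:

            thru22   = @(a,b)     deal(a, b);          % [x y] = thru22(a,b): x = a, y = b
            addsub42 = @(a,b,c,d) deal(a + b, c - d);  % [x y] = addsub42(a,b,c,d): x = a+b, y = c-d
            xr = 3; xi = -2; kr = 1; ki = 4;                    % test operands
            p1 = xr*kr; p2 = xi*ki; p3 = xr*ki; p4 = xi*kr;     % the 4 products (dual MULT, 2 cycles)
            [zr, zi] = thru22(p1, p3);                          % zr = p1, zi = p3
            [zi, zr] = addsub42(zi, p4, zr, p2);                % zi = zi+p4, zr = zr-p2
            assert(zr + 1j*zi == (xr + 1j*xi)*(kr + 1j*ki))     % matches the golden complex multiply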

            Thursday, November 5, 2015

            T.O.C 4Nov2015

            1. So you want to be a DSP architect?
              1. THE LONG STORY
                1. DSP Architecture today
              2. BACKGROUND CHECKS 
              3. BUILDING BLOCK ISSUES
                1. Physical limits
                2. Design time, Build time and Run time
              4. METHODOLOGY AND TOOL ISSUES
              5. FOUND IN THE WEBS
                1. The Matlab Engine 
                2. Is Kurt Keutzer the Donald Trump of Hardwired Processing? 
                3. Found in the cobwebs: my garage
                4. The Trailing Edge
              6. DSP (GP) ARCHITECTURES 
                1. DSP of the First Kind (1980-95)
                2. DSP of the Second Kind (1995-2005)
                3. DSP of the Third Kind (2010- ?)
                4. DSP of the lost kind (1995-2005)
                5. DSP  of any kind (1950-2050)
              7. FROM DSP TO DSP
                1. This wonderful world of DSP
                  1. A world? More like a sect! 
                  2. The Pope (Will), the Cardinal (Jeff) and the Wizard (Gene)
                2. The DSP Old Timer's Club.
                3. Stop me if you've heard this one before!
                4. DSP history
                  1. Bit slice: a saga 1975-1985
                  2. Building Block:  was 1992 and IDT  the last DSP BB? or Weitek?
              8. EXAMPLES of Target AS-DSP
                1. Analog programs (Basic, HPC and other gizmos)
                2. TI C25 Tips and Tricks 
                3. SPmag Tips and Tricks

            2. BB Level 1
              1. Arithmetic BB
              2. Bit wise BB
              3. Vector BB
            3. BB1 DSP STRUCTURES
              1. Processing trilogy (ALU, BMU, MULT)  
              2. MAC
              3. DAU
              4. AGU 
              5. PCU
              6. RF (Register File) and MEM
              7. VPU (Vector Processing Units)
              8. SU (The Shuffle Unit) 
              9. Bit slice: Bit slice building blocks
            4. BB2 DSP FUNCTIONS
              1. Filter
              2. FFT
              3. Correlators & bit comm. engines
              4. Sampling
              5. Matlab functions
            5. BB3 MATLAB BB
            6. BB4 MATH 
              1. including Complex numbers
            7. APPENDIX 

            Is Kurt Keutzer the Donald Trump of Hardwired Processing?

            The answer is No!

            The link
            http://www.eecs.berkeley.edu/~keutzer/

            Background
            In the sputtering days of CPU Architecture, just after the millennium, I remember KK (and Berkeley in general) as a different voice as to which directions to take for future CPUs. Effectively, here are a few arguments of the time (*):
            -  we are now in a hardware cycle as opposed to a software cycle
            -  the next step in MIPs is NOT reached by increasing the clock
            -  instead programmable logic or FPGA or hardware or ASIP (Application Specific IP)
            -  hurray for configurable cores (typ. Tensilica, then Stretch)
            -  hurray hooray for reconfigurable computing (all dead).

            Jump to 2015. As I am going through my notes on possible architecture trends (of interest to this Blog), I have 2 KK papers that I want in electronic format. Googled the paper name + author and got nothing. Lost in the ozone. Okay, I will scan them.
            To cut a long story short, the KK I found is not the one I had in mind (*), but he is still an exciting guy to follow.
            Anyway, here is his web page; among other things, KK mentions that he lost money with Catalytic; sorry, I contributed to that, Kurt!
            Here is his current interest :
             Exploring Design Patterns for Parallel Computing
            which is somehow related to this Blog.
            I downloaded 2 papers:
            - how to map recursion in hardware
            - speculation.
            And the other papers are good too!

            (*) as I remember it; I might be confused, not sure if KK was leading all that.

            1.1 The Long Story


            1.1 - THE LONG STORY


            The goal of this section is to give some meat (and the background) to our base statements. But it does not try to discuss or justify any of our choices.
            • As mentioned in the introduction, we are in the very comfortable position of presenting a system which is outside reality.
              • In effect, all decisions have NO tangible reasons (such as based on cost, design time or available tools and platforms).
              • It is all in the eyes of the beholder.

            Q&A


            • DSP architect is Dead?
              • Today the DSP architect's job splits as: 95% platform, 2% core, 3% tools.
              • PLATFORM: 
                • Is there a DSP architect in the house? No, he is split into pieces
                • The system guy (m-code and simulation) --> 90% of the work
                • The DSP guy (*) --> 10% of the work
                • The firmware guy: can take the job of the DSP guy, especially writing in C
                • The ASIC/FPGA designer:  can take the job of the DSP guy.
            • From above, the REAL DSP architect is the system guy? yes
              • But this guy is not (does not want to be) an architect
              • {..} Here would come all the efforts such as Matlab to C/Verilog, SystemC, etc., and in general all the failures of the tool vendors to provide a flow: system to implementation.


            • Why customisation?
              • Because we used to be vertical, went horizontal and now we are back to vertical.
            • What is accelerated processing? 
              • see  IEEE micro july/aug 2008
            • Why Matlab? 
              • Because it is the most popular system tool.
              • Also, Matlab is much more than a language (Randy's 3 things):
                • a language, *** 
                • an EDE  ****
                • a simulation environment *****
              • Matlab is not a concurrent language (contrary to Verilog). Does that prevent you from sleeping? Here I avoid the use of the word parallelism because of the funny parfor.
            • What is mapping?
              • An analogy: mapping is a hard macro, compiling is a soft macro
              • An approximation: mapping is by hand, compiling uses a tool.
              • An aphorism: mapping is an assembler macro, compiling is C
            • Can mapping be used with a compiler? 
              • You bet! As a matter of fact, it is the only way to be successful.