RE: CELL processors considered interesting?

Tom Locke
18 February 2005
> From what I understand the Cell has one general purpose core and 8 SIMD 
> cores.

> Please correct me if I'm wrong... as I understand it this makes it an 
> entirely different beast to the multi-core Transputer that was once on 
> the cards. Better suited to crunching large data sets than general 
> purpose CSP style concurrency? (the vendors certainly seem to be 
> pitching it like this - video-game graphics, 'multimedia workstation')

Hi Tom,

My understanding is that the SPEs are self-contained hybrid integer/vector
processors and not just extra vector units tacked onto the standard core.

Yes, they will be best at processing large amounts of single-precision
floating point data, but then a Pentium 4 also gets its best throughput
processing large amounts of floating point data in its SSE2/3 units, and that
isn't what it spends most of its time doing...

Each SPE seems to be a fully functional integer machine in its own right,
with its own branching, logical tests, load-stores and all the other stuff
that makes up a standard RISCy processor.

The unusual things about the architecture are that the (large) register set
is shared between vector, floating point and integer operations (from what I
can tell) and also that the 256K of private memory is just that: private,
and not a hardware-controlled primary cache (although I think you can use
software to make it act a bit like a cache).
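
To make the software-cache idea concrete, here is a rough sketch in Python
(not Cell code) of a direct-mapped cache kept in a small private store, with
an explicit copy standing in for a DMA transfer.  The class name, the line
size, the number of lines and the main_memory bytearray are all invented for
illustration; I'm not claiming this is how it is done on the real hardware.

```python
LINE_SIZE = 128   # bytes per cache line (assumed, for illustration)
NUM_LINES = 16    # lines held in the private store (assumed)


class SoftwareCache:
    """Direct-mapped cache managed entirely in software (hypothetical)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory                  # stands in for system RAM
        self.store = bytearray(LINE_SIZE * NUM_LINES)   # the private memory
        self.tags = [None] * NUM_LINES                  # line held in each slot
        self.misses = 0

    def _fetch_line(self, line_no, slot):
        # Explicit copy: plays the role of a DMA transfer into private memory.
        src = line_no * LINE_SIZE
        dst = slot * LINE_SIZE
        self.store[dst:dst + LINE_SIZE] = self.main_memory[src:src + LINE_SIZE]
        self.tags[slot] = line_no
        self.misses += 1

    def read_byte(self, address):
        line_no, offset = divmod(address, LINE_SIZE)
        slot = line_no % NUM_LINES
        if self.tags[slot] != line_no:                  # tag check done in software
            self._fetch_line(line_no, slot)
        return self.store[slot * LINE_SIZE + offset]


main_memory = bytearray(i % 256 for i in range(LINE_SIZE * 64))
cache = SoftwareCache(main_memory)
values = [cache.read_byte(a) for a in (0, 1, 127, 128, 0)]
```

The point is only that the tag check and the fill are ordinary code rather
than hardware, so the program decides when and what to transfer.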

I admit I'm not 100% sure that my understanding is correct, but have a
look at this
http://www.realworldtech.com/page.cfm?ArticleID=RWT021005084318, which is
easily the most detailed article I've seen yet on the CELL, and see what you
think.

It also mentions communications channel DMA/compute overlap and 'channel'
instructions, which sounds rather familiar.
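
For anyone who hasn't met the DMA/compute overlap idea, the usual shape is
double buffering: start the transfer of the next block, then compute on the
current one while it arrives.  Below is a rough Python sketch of that shape
only.  A one-worker thread pool stands in for the DMA engine, and
fetch_block, process_blocks and BLOCK are names I've made up; Python threads
won't give real parallel speedup here, so treat it as a picture of the
control flow rather than a performance demonstration.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4  # elements per transfer (assumed, for illustration)


def fetch_block(data, index):
    """Stand-in for a DMA 'get' of one block into private memory."""
    start = index * BLOCK
    return list(data[start:start + BLOCK])


def process_blocks(data):
    """Sum of squares, fetching block i+1 while block i is being computed."""
    num_blocks = (len(data) + BLOCK - 1) // BLOCK
    total = 0
    if num_blocks == 0:
        return total
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(fetch_block, data, 0)        # prime first buffer
        for i in range(num_blocks):
            current = pending.result()                    # wait for the transfer
            if i + 1 < num_blocks:
                # Kick off the next transfer before computing on this block.
                pending = dma.submit(fetch_block, data, i + 1)
            total += sum(x * x for x in current)          # compute overlaps it
    return total


result = process_blocks(list(range(10)))
```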

Don't get me wrong, I'm not trying to say it's a Transputer, but I think it
could be a useful target for CSP style concurrency.  Think of it as a T800
with a fixed-width instruction set, a less rich set of
concurrency/communications primitives, a large register set and a vector
unit with 256K of on-chip memory.  Doesn't sound so bad?
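
By CSP style I mean something like the following: independent processes that
share nothing and talk only over channels.  This is a minimal Python sketch
with threads as the processes and bounded queues as the channels.  Note that
a queue of size 1 is a one-place buffer, not a true synchronous rendezvous
as in occam, and the DONE marker and function names are my own inventions
for the example.

```python
import queue
import threading

DONE = object()  # end-of-stream marker (an assumption of this sketch)


def producer(out_ch, items):
    for item in items:
        out_ch.put(item)        # blocks while the channel is full
    out_ch.put(DONE)


def squarer(in_ch, out_ch):
    while True:
        item = in_ch.get()      # blocks until a value arrives
        if item is DONE:
            out_ch.put(DONE)    # pass the end-of-stream marker downstream
            return
        out_ch.put(item * item)


def run_pipeline(items):
    a = queue.Queue(maxsize=1)  # channel: producer -> squarer
    b = queue.Queue(maxsize=1)  # channel: squarer -> collector
    workers = [
        threading.Thread(target=producer, args=(a, items)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for w in workers:
        w.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results


pipeline_output = run_pipeline([1, 2, 3, 4])
```

Each stage keeps its own private state and only ever blocks on a channel,
which is the property that would map naturally onto cores with private
memory.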

Jim Moores

