The XMOS threads are all programmable.
Another way to view a single XMOS core is as 8 logical cores
sharing the same memory. So a logical core can be used as a
very flexible DMA controller - it can scatter/gather,
compress/decompress, match external protocols, and so
on - it's all software.
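As a rough illustration of the kind of scatter/gather work such a logical core could do entirely in software - the descriptor format below is invented for illustration, not an XMOS API:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter/gather descriptor: each entry names one
 * fragment to move.  Because the "DMA engine" is just software,
 * any transform (compression, protocol framing, byte matching)
 * could be added inside the copy loop. */
typedef struct {
    const uint8_t *src;   /* where to read this fragment from */
    uint8_t       *dst;   /* where to write it                */
    size_t         len;   /* number of bytes in the fragment  */
} sg_desc;

/* Walk a descriptor list, copying each fragment in turn. */
static void sg_copy(const sg_desc *descs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < descs[i].len; j++)
            descs[i].dst[j] = descs[i].src[j];
}
```

The point is only that the descriptor walk is an ordinary loop, so the "controller" can be reprogrammed per protocol.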
I think this is all described in the invited presentation
to CPA a year or two ago - and also in the Hot Chips
presentation last year (somewhere on YouTube).
On 3 Oct 2012, at 19:42, Larry Dickson wrote:
The reference said:
The difference can mainly be attributed to the fact
that on the XMOS architecture each response is handled by
a separate hardware thread which can respond to the input
signal without needing a context switch.
Question: Is this "hardware thread" programmable, or is it just a converter that changes a hard link to a soft channel input? If the latter, then we also need to know how fast XMOS software can respond to the input. If the former, then it's basically the same as a Transputer high-priority process dedicated to each input, with no context switching between the high-priority processes, which is the right way to go, I think. But it still has to get state change information to the rest of the processes.
I reluctantly added a single priority level to the transputer - at the
time the economics seemed to support this. But I'm still not sure.
Today, I'm sure the economics doesn't support the complexity
and overheads of priority schedulers, interrupts, and so on. The
hardware event-handling on the XMOS cores is faster than
anything an interrupt-based system can deliver - and
much easier to design with - here's a link:
Even for a single low-end core, this approach will easily
out-perform a conventional interrupt system.
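To make the contrast concrete, here is a plain-C simulation of the event model: a thread registers a handler per event source and then waits; when an event fires, control goes straight to that event's handler, with no interrupt entry or context save/restore. The ready bitmask stands in for the hardware event unit - the names are invented for illustration, not an XMOS API.

```c
#include <stdint.h>

/* One handler per event source. */
typedef void (*event_handler)(void *ctx);

/* Dispatch the lowest-numbered pending event and clear its bit.
 * Returns the event number, or -1 if nothing is pending.  In the
 * hardware, the waiting thread would simply block here and resume
 * directly in the handler when the event arrives. */
static int dispatch_one(uint32_t *pending, event_handler handlers[], void *ctx)
{
    if (*pending == 0)
        return -1;
    int ev = 0;
    while (((*pending >> ev) & 1u) == 0)
        ev++;
    *pending &= ~(1u << ev);   /* acknowledge the event        */
    handlers[ev](ctx);         /* go straight into the handler */
    return ev;
}
```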
With a lot of cores, as Larry has said, the event-handling will be
at the edge of the system. The core of the system will be running
parallel communication structures but - as I said in the
presentation - these have to be designed to ensure
efficient parallel communication flows.
On 3 Oct 2012, at 15:05, Larry Dickson wrote:
The bad news is that these chips are terribly complex to program. Silicon gates are supposed to be free, so the hardware has zillions of options. To understand the issue: the TI chips can route some 1000 interrupts to each core (using a 3-layer interrupt controller). Interrupt latency is still good, but noticeably slower than on the much simpler ARM Cortex-M3. The point is that a simple PRI ALT will not do the job; two priorities are not sufficient. It worked more or less on the transputer because that chip had only one event pin. Impressive as the performance was at the time, it was still lacking for hard real-time when tens of processes were competing for the CPU. 32 to 256 priorities are needed to manage all the on-chip resources more or less satisfactorily.
So, it looks like the occam restaurant is still at the end of the universe, as Douglas Adams would have said. And looking at this discussion, the Babel fish isn't helping very much.
Simple design answer: a chip (like the XMOS or Adapteva?) with a boundary of (say) 28 cores, each serving a single event/interrupt/link. Two-level PRI PAR runs independently on each core, so that each boundary core is absolutely responsive to its event. Extremely fast internal comms between cores, and hardware FIFOs between the boundary cores and the 36 internal cores, so that reliably captured hard I/O can tolerate slight delays before soft processing.
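The FIFO in that design could be sketched as a small single-producer/single-consumer ring buffer: the boundary core pushes captured I/O, and the internal core pops it when it gets around to it. This is a software sketch only (sizes and names are invented); a real inter-core version would also need the memory barriers or hardware ordering the chip provides.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIFO_SLOTS 16u   /* must be a power of two */

typedef struct {
    uint32_t buf[FIFO_SLOTS];
    uint32_t head;   /* advanced by the consumer (internal core) */
    uint32_t tail;   /* advanced by the producer (boundary core) */
} fifo;

/* Producer side (boundary core): returns false when the FIFO is
 * full, i.e. the internal core has fallen too far behind. */
static bool fifo_push(fifo *f, uint32_t v)
{
    if (f->tail - f->head == FIFO_SLOTS)
        return false;
    f->buf[f->tail % FIFO_SLOTS] = v;
    f->tail++;
    return true;
}

/* Consumer side (internal core): returns false when nothing is
 * queued; otherwise delivers items in arrival order. */
static bool fifo_pop(fifo *f, uint32_t *out)
{
    if (f->tail == f->head)
        return false;
    *out = f->buf[f->head % FIFO_SLOTS];
    f->head++;
    return true;
}
```

The depth of the FIFO is exactly the "slight delay" the internal cores are allowed before hard I/O is lost.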
The reason I resist multiple hard priorities is that I think that solution is mostly a chimera. The top guy is OK, but number 2 and below swiftly become subject to occasional bad delays (when the interrupts happen to fall on top of each other). "[T]ens of processes … competing for the CPU" are only a real-time problem if they depend on lots of independent asynchronous stimuli (given a fast CPU). Multicore, which was not available in the time of the Transputer (except in the form of multiple Transputers, which was way expensive) lets you be responsive to all the stimuli independently.