Larry
My comment was not about the size of the transistors; it was about the density of transistors within the real-estate dedicated to links. I don’t *know* why this is, but it is likely that the density is limited by wiring. This is a perennial problem - the manufacturing process used for the transputer had (from memory) 2 layers of interconnect - 1 metal and 1 polysilicon. Modern processes have a lot more capability - perhaps 10 layers of wiring. In practice, even with this, logic density is limited by interconnect, not by transistors. “Transistors are free” isn’t quite true, but they aren’t the critical resource in modern designs. Again, the constraints on interconnect are such that local (same clock domain) is cheap and fast, while non-local is expensive and slow. N.B. Throughput is cheap - “just” go parallel, it’s a matter of economics. Latency is hard - you’re up against the laws of physics.
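To put a rough number on the latency point, here is a minimal sketch that computes how far a signal could travel in one clock cycle if it moved at the speed of light in vacuum. The clock rates are illustrative assumptions, not figures from this thread, and the function names are hypothetical. Real on-chip wires are considerably slower than this bound, so treat the result as a best case only.

```python
# Best-case signalling distance per clock cycle (speed-of-light bound).
# Assumption: vacuum propagation; real wires are slower, so this is an upper bound.
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum (exact by SI definition)


def max_distance_per_cycle_mm(clock_hz: float) -> float:
    """Distance light covers in vacuum during one clock period, in millimetres."""
    return C_VACUUM_M_PER_S / clock_hz * 1000.0


def min_round_trip_cycles(distance_mm: float, clock_hz: float) -> float:
    """Lower bound on clock cycles for a request and reply over distance_mm."""
    return 2.0 * distance_mm / max_distance_per_cycle_mm(clock_hz)


if __name__ == "__main__":
    # Illustrative clock rates only (hypothetical, chosen to span old and new parts).
    for clock_hz in (20e6, 1e9, 3e9):
        reach = max_distance_per_cycle_mm(clock_hz)
        print(f"{clock_hz / 1e6:7.0f} MHz: at most {reach:10.1f} mm per cycle")
```

Adding more links in parallel multiplies throughput but leaves this bound untouched, which is the sense in which latency, unlike throughput, cannot simply be bought.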
“Use case” really matters. We know that for some domains, specialised machines (GPUs, ML/AI-PUs) do very well. You have to get the balance between compute, store and communication right. If your processor is underpowered you have to use too many of them, causing you problems with storage and communication. Remember, the program has to be stored locally, and so requiring double the number of processors because of limited computation capability means twice the store dedicated to program. You’ll also need more communication capability to deal with the number of processors. It’s absolutely the case that the transputer processor is underpowered by today’s standards - I don’t know by how much. So, in your budget for this device, you need to allow many more transistors for the processor, more for RAM - I’m sure 32k is too small - and a lot more for interconnect.
The system structure is likely to be “transputer” (processor, RAM and limited comms) and routers. The nearest machine I know of that might have this sort of architecture is the Graphcore Colossus (https://www.graphcore.ai/products/ipu): 60B transistors, 1472 processor cores, with 900MB of on-chip memory in total (roughly 600KB per core). Its comms system provides non-blocking communication for any communication pattern (so arguably under-resourced as a general purpose machine - programs have to be bulk synchronous - which is a problem).
Roger