KRoC continues to win out against the competition
I may well be wildly off here, but I seem to remember that the KRoC compiler speeds up the context switch by arranging for the registers to hold nothing important at the points in the code where a switch can occur, thus eliminating the need to save and restore them.
If I am remembering correctly on this point, it strikes me as very misleading to compare only context switch times. The overhead seems to have been moved to the computation that the process is performing, which can no longer take full advantage of the register set.
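To make the worry concrete, here is a toy cost model. Every number in it is invented purely for illustration; none of them are measurements of KRoC or of anything else. The names (`scheme_a`, `scheme_b`, `total_cycles`) are made up too. It also assumes the register restriction shows up as a flat slowdown on each unit of work, which may well be too crude.

```python
def total_cycles(switches, work_units, switch_cost, work_cost):
    """Total cost of a run: context switches plus the computation between them."""
    return switches * switch_cost + work_units * work_cost


# Hypothetical figures only, chosen to show the shape of the trade-off.

def scheme_a(switches, work_units):
    # Registers saved/restored on every switch; computation uses all registers.
    return total_cycles(switches, work_units, switch_cost=100, work_cost=10)


def scheme_b(switches, work_units):
    # Registers empty at switch points: cheap switches, slightly slower computation.
    return total_cycles(switches, work_units, switch_cost=20, work_cost=12)


if __name__ == "__main__":
    for switches, work_units in [(1000, 1000), (10, 100000)]:
        a = scheme_a(switches, work_units)
        b = scheme_b(switches, work_units)
        print(f"switches={switches} work={work_units}: A={a} B={b}")
```

With these made-up costs the cheap-switch scheme comes out ahead when the program does little else but switch, and behind when there is a lot of computation between switches. A benchmark that times only the switch would never show the second case.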
It's easy to win micro-benchmark games if you put all of your focus on the micro-benchmark in question (e.g. on reducing context-switch overhead). Perhaps this is what Adam was getting at.
From what (little!) I understand, some of the biggest wins in compiler optimisation come from making good use of the registers. For example, I've heard it said that a lot of effort goes into inlining function calls, not because of the call overhead, but because the compiler then gets to really spread out and use all the registers. Would KRoC's strategy of having the registers empty at sync points cause havoc with this kind of optimisation?
Please take all this chiefly as questions - I am well outside my comfort zone here : )
Tom