[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Java Live | January 9, 2001

Rats - I was told about this event but completely missed it :-( ...

> T_Christopher: I'm not familiar with the specifics of the University of
> Kent's efforts, but I strongly disapprove of Occam being a model for
> threading. It is based on Hoare's "Communicating Sequential Processes"
> and connects threads through communication channels. Although it extends
> to distributed memory systems easily, and prevents system crashes caused
> by producer threads running faster than consumers, it doesn't do very
> much that I like to do. I really prefer handling the stuff myself. It's
> easy for me, and I like the flexibility. An example from the book: I
> have a version of Shellsort that requires sharing an array and
> dynamically creating runnables to process it. The communicating process
> structure of Occam isn't appropriate to the algorithm.

The above is a very naive description of the occam/CSP model and its benefits.
So naive, in fact, that I would despair of talking him out of it ...

occam/CSP is neutral with regard to any arguments between shared memory and
distributed memory concurrency.  Channels are in fact an easy and safe way
to secure synchronised access to shared data structures!  And so is the
barrier synchronisation implicit at the end of a PAR construct!!

For sorting, a pipeline of bubble-sort cells is a much simpler was to describe
parallel sorting than any parallel version of shellsort or quickersort (say).
And, given parallel hardware, it has an O(n) complexity that can't be beaten
by any parallelised shellsort or quickersort.  It's a classic example of how
concurrency means we can use simpler algorithms and win performance.

Here's a little example of a shared memory algorithm using the occam/CSP
model - it requires a recursive occam but we're still using the model.
[See also my and David Wood's paper in WoTUG-20 ("Higher Levels of Process
Synchronisation") for a fuller story, my and Kevin Vella's paper on
shared memory occam/CSP (which does a parallel shared memory quickersort)
in WoTUG-22 and David's paper on recursion in occam in WoTUG-23/CPA-2000.]

The following function sums an array in O(log(n)) time:

    INT result:
      VAL INT n IS SIZE A:
        n = 1
          result := A[0]
          VAL INT n2 IS n >> 1:
          INT hi, lo:
              hi := sum ([A FOR n2])
              lo := sum ([A FROM n2])
            result := hi + lo
      RESULT result

It uses no channels (:-) but just relies on the PAR termination
to synchronise things perfectly.  It's recursive, of course, and
requires a fine-grained parallel architecture, with a sufficient
amount of parallelism, to reach the O(log(n)) time.

A JCSP/CTJ version would actually work!  But the overheads would
require either someting more complex than the "+" operation or
some tuning to do bigger clumps serially - see below.

The WoTUG-20 paper presented a more functional version of the above,
where we take advantage of the side-effect-freedom of occam expressions
to allow the compiler to generate parallel code for separate parts
of an expression:

    INT result:
      VAL INT n IS SIZE A:
        n = 1
          result := A[0]
          VAL INT n2 IS n >> 1:
          result := sum ([A FOR n2]) + sum ([A FROM n2])
      RESULT result

bur I prefer the "one-liner":

    [sum ([A FOR (SIZE A) >> 1]) + sum ([A FROM (SIZE A) >> 1]), A[0]]
    [INT ((SIZE A) = 1)]:

where we need lazy evaluation of in-lined-array index lookup (and take
advantage that INT FALSE is 0 and INT TRUE is 1).

Unrolling things so as to "do bigger clumps serially" is easy:

      A[0] + A[1],
      A[0] + (A[1] + A[2]),
      (A[0] + A[1]) + (A[2] + A[3]),
      (A[0] + A[1]) + (A[2] + (A[3] + A[4])),
      (A[0] + (A[1] + A[2])) + (A[3] + (A[4] + A[5])),
      (A[0] + (A[1] + A[2])) + ((A[3] + A[4]) + (A[5] + A[6])),
      ((A[0] + A[1]) + (A[2] + A[3])) + ((A[4] + A[5]) + (A[6] + A[7]))]
     [(SIZE A) - 1],
     sum ([A FROM 0 FOR (SIZE A) >> 1]) +
     sum ([A FROM (SIZE A) >> 1 FOR (SIZE A) - ((SIZE A) >> 1)])]
    [INT ((SIZE A) > 8)]:

So, the occam/CSP model seems to take on board parallel functional
programming as well as shared memory parallelism - and the parallelism
can be as dynamic as you wish.

Now, what was that guy saying again about occam and CSP?

As for "handling the stuff myself" to get the "flexibilty" - good luck!
That's definitely not for the masses and, for the experts, definitely
not the sort of thing you want to do every day.  And, even then, it takes
a long time before you really feel comfortable that you haven't missed
anything - like a race hazard or deadlock or starvation or livelock ...

But CSP is for all and for every day work.