

Dear Per,

> Here is a partial response to your email of May 20:

Many thanks ...

> ........................ When I finally got around to
> looking at Java's parallelism I concluded that its designers
> were amateurs who were completely out of their depth.

And its users are running scared!  Most tutorials are pretty
frightening stuff ... apart from ours [1] of course!

> ..... So I carefully explained by example that Java never
> got beyond my earliest ideas in parallel programming - and
> got them all wrong! I was not advocating the use of these
> early ideas today. If I had had more space, I would have
> explained that the final monitor concept in Concurrent
> Pascal did not use waiting loops [2]. I would also have
> liked to point out that Java threads are beyond repair in
> the sense that they make interference check impossible in
> general [14]. So forgive me for omitting occam this time.

Forgiven!  I apologise for not directly referencing your work
in papers on CSP and monitors ... only indirectly as "Hoare and
others" :-( ... will make amends in future!

I'm pleased you eliminated waiting loops in the end.  I'm afraid
I didn't get into parallel computing till around 1983 - enticed
by the elegance and efficiency of occam and CSP away from the
delights of functional programming and the lambda calculus.
It's only since Java appeared that I went back into the literature
to check out the monitor stuff ... and I only went back as far
as the Hoare CACM (1974) paper.

> In 1989, a Computing Surface with 48 T800 transputers was
> installed in my lab at Syracuse. I spent five years using
> this wonderful multicomputer to write parallel occam
> programs for computational science [9-11, 15].

We had a Meiko Computing Surface installed in 1987 and it grew to 174
T800s by 1989.  It's been out of maintenance since 1991 but still keeps
on running.  Sadly, I'm told the `millennium bug' will kill off
its SunOS (SPARC-hosted) operating system at the end of this
year and nobody will upgrade it to Solaris ... :-(

> By 1994 I had also designed and implemented the parallel
> publication language SuperPascal based on CSP and occam
> [12-16].

I just scanned the SP&E article during lunch - I had seen this
before.  You seem to have occam/CSP parallelism but without any
ALTs ... this gives you completely deterministic parallelism
(so long as you get those "sic" statements right!).  Maybe that's
all you need for supercomputing applications ... although I would
miss the non-determinism.  Many globally deterministic processes
can have non-deterministic subcomponents and I'd miss the freedom
they give me.

> In contrast to occam, Joyce and SuperPascal both support
> recursive process activation. It took me thirty years to
> solve the problem of reclaiming the storage of parallel
> recursion without garbage collection [16]. But now that the
> problem has been solved I would expect future programming
> languages to support efficient parallel recursion as a
> matter of course.

This I have got to look up!

> So I am well aware of the advantages of occam. I was,
> however, unaware of the recent work involving occam and
> its successors. This is partly my fault since I don't read
> as much about parallel programming as I used to. However,
> it also seems to me that the internet makes it too easy to
> communicate with a small group of insiders and makes it
> less attractive to publish papers in widely read journals,
> such as IEEE Computer and SIGPLAN Notices.

There's never any time in modern academia!  It's hard enough
finding the time to write up our ideas and, when we do, we
naturally circulate them to our own community conferences.
It then becomes difficult to re-draft them sufficiently to be
accepted by the widely-read journals as `new' ...

But we must:

> * Write letters to the editor of SIGPLAN Notices and express
>   your own opinions about Java's parallelism.
> * Present your arguments in carefully reasoned papers for
>   computer scientists in general.

In your posting to Adrian Lawrence:

> In my own work on computational science, I used SuperPascal as a
> publication language and occam2 as an implementation language.
> SuperPascal enabled me to write elegant recursive algorithms for
> parallel versions of Quicksort and the Fast Fourier Transform. I then
> rewrote these programs in occam2 using process trees with fixed depths
> and tested their performance on a Computing Surface.
> The difference between the two notations was striking. Here is the
> SuperPascal definition of a binary process tree:

The occam configuration language for placing processes on a distributed
memory (e.g. transputer) multiprocessor was always a pain!  But we always
modelled such things in plain occam first (the *model* language, which runs
on a uniprocessor) and implemented them on the target multiprocessor later
(using the configuration language).

Your parallel recursive tree process is very elegant:

>   procedure tree(depth: integer; bottom: channel);
>   var left, right: channel;
>   begin
>     if depth > 0 then
>       begin
>         open(left, right);
>         parallel
>           tree(depth - 1, left)|
>           tree(depth - 1, right)|
>           root(bottom, left, right)
>         end
>       end
>     else leaf(bottom)
>   end;

BTW, it looks like your channels are to be used in both directions?  This
seems a little naughty as it would offer great scope for misunderstanding
between the processes at either end as to which one is currently reading
and which writing??  Maybe it's no more dangerous, though, than the
pair of channels that occam would demand (?) ...

Anyway, even when the notational clutter of the PLACE statements is removed,
and we compute the depth value statically (rather than sending it around
by initial communications), and we use an occam3 channel structure (a LINK
binding up the bi-directional channel pair), the cleaned up occam is still
not so elegant as your recursion:

  PROC tree (VAL INT depth, LINK bottom)  -- depth is #levels in the tree
    VAL INT n.leaves IS 2^(depth - 1):
    VAL INT n.roots IS n.leaves - 1:
    VAL INT n.links IS (n.leaves + n.roots) - 1:
    [n.links + 2]LINK link:               -- elements 0 and 1 left unused
    PAR
      root (bottom, link[2], link[3])
      PAR i = 2 FOR n.roots - 1
        root (link[i], link[2*i], link[(2*i) + 1])
      PAR i = n.roots + 1 FOR n.leaves
        leaf (link[i])
  :

where, of course, the depth parameter would have to be replaced by a global
VAL INT for real occam -- but we are only modelling here!

> Even as I retype this 1991 program piece, I have to proofread it as
> carefully as assembler language. When you have figured out how the tree
> nodes and channels are numbered, you will appreciate that the need for
> process and channel arrays is due solely to the limitations of the
> programming notation: the absence of recursion.

Yes!  The above code is also pretty horrible ... and I'm not convinced
I've got it right ... I abandoned using elements link[0] and link[1]
to keep the algebra slightly simpler.
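To keep myself honest, here is a quick sanity check -- plain Java, my own
hypothetical sketch, nothing from the papers -- of the numbering I believe
the above intends: node i's parent link is link[i], its children's links
are link[2*i] and link[(2*i) + 1], with the top node taking `bottom'
instead of link[1]:

```java
// Heap-style numbering for the fixed-depth tree: internal nodes are
// 2 .. n.roots (the top node uses `bottom'), leaves are
// n.roots+1 .. n.roots+n.leaves.  Elements 0 and 1 are unused.

class LinkNumbering {

  // returns {n.leaves, n.roots, n.links} for a tree of `depth' levels
  static int[] counts (int depth) {
    int nLeaves = 1 << (depth - 1);
    int nRoots = nLeaves - 1;
    int nLinks = (nLeaves + nRoots) - 1;
    return new int[] {nLeaves, nRoots, nLinks};
  }

  public static void main (String[] args) {
    int depth = 3;
    int[] c = counts (depth);
    int nLeaves = c[0], nRoots = c[1], nLinks = c[2];
    System.out.println ("leaves=" + nLeaves + " roots=" + nRoots
                        + " links=" + nLinks);
    System.out.println ("top  : bottom <- link[2], link[3]");
    for (int i = 2; i <= nRoots; i++) {
      System.out.println ("root : link[" + i + "] <- link[" + (2*i)
                          + "], link[" + ((2*i) + 1) + "]");
    }
    for (int i = nRoots + 1; i <= nRoots + nLeaves; i++) {
      System.out.println ("leaf : link[" + i + "]");
    }
  }
}
```

For depth = 3 this wires the top onto link[2] and link[3], link[2] onto
link[4] and link[5], link[3] onto link[6] and link[7], with leaves on
link[4..7] -- so the highest index used is n.links + 1, if I've done the
algebra right this time.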

BUT ...  this tree is a recursion-removing transformation from the much
prettier recursive occam:

  PROC tree (VAL INT depth, LINK bottom)  -- depth is #levels in the tree
    CASE depth
      1
        leaf (bottom)
      ELSE
        LINK left, right:
        PAR
          tree (depth - 1, left)
          tree (depth - 1, right)
          root (bottom, left, right)
  :
which is now isomorphic with your code ;-) ...

> It hardly matters that occam has changed since then, considering that
> the occam community still shies away from parallel recursion.

Only because we didn't know how to implement it efficiently!

> ............................................................... So I
> have a simple suggestion to offer:
> Develop a publication language that extends occam with parallel
> recursion using the memory allocation scheme described in [1].

Wow!  I'm trying to get hold of a copy!!

I've always wanted an occam with (parallel) recursion.  Recursion has
never been a semantic problem -- CSP is recursive.  In the shared-
memory (e.g. SMP or ParaPC) world, recursion becomes very practical
since we don't have to worry about placement.  A paper by me and
Kevin Vella [2] at WoTUG-22 shows a recursive parallel quickersort,
written in occam, executing on a 4-processor SMP with as good a parallel
efficiency as anyone could expect.  The problem is that with current
occam, we can do recursion but only to a fixed maximum depth.  That
way, the compiler pre-calculates all the process workspaces assuming
that maximum depth occurs.  An efficient parallel memory allocation
scheme opens up a whole new ball game ... and not just for occam as
a publication language ;-) ...
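To illustrate that fixed-maximum-depth style in something executable,
here is a sketch in plain Java threads -- my own hypothetical code, not
the occam/CCSP from [2]: the sort only forks a new thread while it still
has depth budget, so the number of threads (like the pre-allocated occam
workspaces) is bounded before the recursion starts:

```java
// Depth-capped parallel quicksort: fork a thread for the left partition
// only while maxDepth > 0; once the budget is spent, recurse sequentially.

class ParQuickSort {

  static void sort (final int[] a, final int lo, final int hi,
                    final int maxDepth) {
    if (lo >= hi) return;
    final int p = partition (a, lo, hi);
    if (maxDepth > 0) {
      Thread left = new Thread () {
        public void run () { sort (a, lo, p - 1, maxDepth - 1); }
      };
      left.start ();                        // left half in parallel ...
      sort (a, p + 1, hi, maxDepth - 1);    // ... right half here
      try { left.join (); } catch (InterruptedException e) {}
    } else {
      sort (a, lo, p - 1, 0);               // budget spent: stay sequential
      sort (a, p + 1, hi, 0);
    }
  }

  // standard Lomuto partition around a[hi]
  static int partition (int[] a, int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++) {
      if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
  }

  public static void main (String[] args) {
    int[] a = {5, 3, 8, 1, 9, 2, 7, 4, 6, 0};
    sort (a, 0, a.length - 1, 2);           // fork at most 2 levels deep
    for (int i = 0; i < a.length; i++) System.out.print (a[i] + " ");
    System.out.println ();
  }
}
```

The point of the cap is exactly the occam compromise: the compiler (or
here, the programmer) knows the worst-case number of parallel branches
in advance.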

Lots to think about!

Peter Welch.


The parallel recursive occam above can be written (and executed) in
C (using Jim Moores' CCSP user-level kernel [3]) and in Java (using
our JCSP library).

Here is the JCSP version:

  import jcsp.lang.*;

  class Link {
    One2OneChannel up = new One2OneChannel ();
    One2OneChannel down = new One2OneChannel ();
  }

  class Tree implements CSProcess {

    private final int depth;
    private final Link bottom;

    public Tree (int depth, Link bottom) {
      this.depth = depth;
      this.bottom = bottom;
    }

    public void run () {
      switch (depth) {
        case 1:
          new Leaf (bottom).run ();
          break;
        default:
          Link left = new Link ();
          Link right = new Link ();
          new Parallel (
            new CSProcess[] {
              new Tree (depth - 1, left),
              new Tree (depth - 1, right),
              new Root (bottom, left, right)
            }
          ).run ();
          break;
      }
    }
  }

which, while not quite as snappy as your original or the recursive occam,
is beginning to look quite presentable.
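For completeness, here is a self-contained plain-Java-threads rendering of
the same tree -- my own sketch, with the Leaf and Root processes replaced
by hypothetical stand-ins (each leaf sends a 1 up, each root sums its two
subtrees) and the JCSP channels by a tiny hand-rolled monitor, which seems
only fitting given the discussion above:

```java
// A one-slot synchronising channel built from a Java monitor.
class Channel {
  private Integer slot = null;

  public synchronized void write (int x) throws InterruptedException {
    while (slot != null) wait ();          // wait for the slot to empty
    slot = Integer.valueOf (x);
    notifyAll ();
  }

  public synchronized int read () throws InterruptedException {
    while (slot == null) wait ();          // wait for a value to arrive
    int x = slot.intValue ();
    slot = null;
    notifyAll ();
    return x;
  }
}

class TreeDemo {

  // tree (depth, bottom): depth is #levels, as in the occam version
  static Thread tree (final int depth, final Channel bottom) {
    return new Thread () {
      public void run () {
        try {
          if (depth == 1) {
            bottom.write (1);              // leaf: contribute one unit
          } else {
            Channel left = new Channel ();
            Channel right = new Channel ();
            Thread l = tree (depth - 1, left);
            Thread r = tree (depth - 1, right);
            l.start (); r.start ();
            bottom.write (left.read () + right.read ());  // root: sum
            l.join (); r.join ();
          }
        } catch (InterruptedException e) {}
      }
    };
  }

  // builds and runs a tree; the answer is the number of leaves
  static int run (int depth) {
    try {
      Channel bottom = new Channel ();
      Thread root = tree (depth, bottom);
      root.start ();
      int total = bottom.read ();
      root.join ();
      return total;
    } catch (InterruptedException e) {
      return -1;
    }
  }

  public static void main (String[] args) {
    System.out.println (run (4));          // 2^(4-1) = 8 leaves
  }
}
```

Each call spawns its subtree as real threads, so this is the genuinely
recursive parallel structure -- at the cost, of course, of Java's rather
heavyweight threads.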


[1] "CSP for Java, Multithreading for All",
    abstract only,

[2] "CSP/occam on Shared Memory Multiprocessor Workstations",
    K.Vella and P.H.Welch,
    `Concurrent Computing, Architectures, Languages and Techniques',
    Proceedings of WoTUG-22,
    IOS Press (Amsterdam), ISBN ???,
    April, 1999

[3] "CCSP: a Portable CSP-based Run-time System supporting C and occam",
    J.Moores,
    `Concurrent Computing, Architectures, Languages and Techniques',
    Proceedings of WoTUG-22,
    IOS Press (Amsterdam), ISBN ???,
    April, 1999

(I'll post you a copy of these proceedings)