[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Concurrency, Exceptions and Poison



Help please!

I've been thinking about how exceptions and concurrency interact - prompted
by Gerald's idea of poisoning channels (rather than sending poison, as a legal
object, through them) and throwing an ArrrghException if anyone, then, tries
to use them.

I mentioned that I remembered (?) Geoff Barrett saying, at the time he was
designing occam3, that he had thought about introducing exceptions but shied
away because he wasn't happy about the semantics.

So far as serial processing is concerned, exceptions screws up the simplicity
(i.e. WYSIWYG-ness) of the SEQ (sequences do not necessarily run) and - worse
- loops do not end only because the WHILE-condition becomes FALSE (or the SEQ
replication count reaches zero).  However, these are semantic complexities
to which we might be able to stretch.

For parallel processes, what should happen when one of them breaks with
an unhandled exception?

In Java, a thread with an unhandled exception just prints out the exception
and dies - the exception propagates no further.

In occam/CSP/JCSP/CTJ, should we do the same?  Thus, if P is a process that
throws an exception, then in:

                               PAR
                                 P
                                 Q

P's exception could be printed (to a SHARED error channel) and P would then
terminate.  The PAR construct then waits for Q to terminate before it itself
terminates.  If both P and Q throw exceptions, both exceptions would be
printed and both processes terminate followed by the PAR.  In all cases,
no exceptions get propagated beyond the PAR.  That seems clean enough ...

BUT ... it really screws up some basic algebraic laws!  For example:

               P       =       PAR          =       PAR
                                 P                    SKIP
                                 SKIP                 P

where the = means semantic equivalence.  But, on the LHS, P's exception is
thrown.  On the middle/RHS, it isn't!  So, with exceptions, that law no longer
holds :-( ...

Suppose we try to fix this by saying that the PAR *does* propagate exceptions
thrown by its component processes (not print them).  Then, we will have to
accommodate the idea of exception *sets* being thrown - since PAR will have
to throw the union of all exceptions thrown by its processes.  Or, PAR just
chooses (arbitrarily?) one of its received exceptions to propagate??

Extending the semantics of concurrency (e.g. for unhandled exceptions) in
a way that is not matched by a similar extension in the semantics for
(external) choice is bound to lead to inconsistency.  This is because PAR
is related to (and actually defined in terms of) choice.  This screws
things up if we say that synchronising on a poisoned event throws an
exception ...

Let's say that we apply the above fix so that a PAR propagates exceptions.

Consider P = (p -> P') and Q = (q -> Q'), where perhaps p and q are channel
input events.  Suppose, also, that p and q are different.  Then, we should
have:

              PAR     =     PAR           =     ALT
                P             p -> P'             p
                Q             q -> Q'               PAR
                                                      P'
                                                      q -> Q'
                                                  q
                                                    PAR
                                                      p -> P'
                                                      Q'

(Sorry about the mix of occam and CSP notation :-)

Now, suppose that p is a poisoned channel - i.e. touching it throws the
ArrrghException!  On the LHS (or middle) expression, P immediately throws
its exception and terminates - meanwhile Q must run to completion.  When
and if Q terminates, the PAR terminates and propagates P's exception.

On the RHS, a poisoned event guard may (or may not) trigger the exception.
Suppose it does - then, presumably, the exception gets propagated by the ALT
and *none* of the Q process is executed.  This is not the same behaviour as
the LHS :-(.

OK - suppose a poisoned event is *never* taken.  Then, the RHS becomes:

  ALT
    q
      PAR
        p -> P'
        Q'

which now does equal the semantics of the LHS (first we wait until q happens;
then Q' happens in parallel with P throwing its exception and terminating;
finally, when and if Q' terminates, the PAR terminates and propagates P's
exception; P's throwing of its exception is here forced to follow q, whereas
on the LHS it may precede it; but that first throwing of the exception (an
event?) is hidden to the outside world and, therefore, does not change the
semantics).

But no cheers yet!  Suppose both p and q are poisoned.  Now the RHS behaves
as STOP ... but the LHS terminates (and propagates two exceptions)!

Hmmm ... I still like the idea of poisoning channels (or any synchronisation
object) but I'm not sure about exceptions :-( ...

Either:

  (a) we give up the algebraic laws in the presence of exceptions;
  (b) we give up exceptions and think of a better way to react to
      poisoned events;
  (c) we give up posoning events;
  (d) there's a clean semantics that unifies excpetions and concurrency
      and preserves existing algebraic laws and I can't see it.

I don't believe (a) is acceptable.  I don't like (c) since it's such a good
idea.  I hope it's (d) ... help!

Meanwhile, here's a go at (b).  It gives a consistent semantics to poison
and concurrency ... but we have to forrget about throwing exceptions!

The semantics of a poisoned event is just that you can't sync on it.  This
means that, as above, a poisoned event is *never* taken in a choice (ALT).
If you don't have a choice (e.g. p -> P), then it's a STOP.  Note that
this is the way run-time errors (e.g. divide-by-zero) are currently handled
by occam.  So, that's consistent - being forced to use a poisoned event
leads to a run-time error :) ...

Stopping an event from happening is easy in CSP - just get a parallel process
(that has the event in its alphabet) to refuse it.  For example:

  WHILE TRUE
    ALT
      p
        SKIP
      poison ? any
        STOP

Note that it's just as easy to model a temporary paralysis of the event:

  WHILE TRUE
    ALT
      p
        SKIP
      poison ? any
        STOP
      suspend ? any
        release ? any

So, with this semantics for a poisoned channel, we will have consistency.
Consider the earlier example:

              PAR     =     PAR           =     ALT
                P             p -> P'             p
                Q             q -> Q'               PAR
                                                      P'
                                                      q -> Q'
                                                  q
                                                    PAR
                                                      p -> P'
                                                      Q'

If neither p and q are poisoned, it holds by the usual laws.  If both p and
q are poisoned, the RHS is STOP (neither event can be taken).  The LHS and
middle resuce to:

  PAR
    STOP
    STOP

which is, of course, STOP.

If p is poisoned and q is not, the LHS/middle is:

  PAR
    STOP
    q -> Q'

which is:

  SEQ
    q -> Q'
    STOP

The RHS, as before, reduces to:

  ALT            =     ALT           =     ALT           =     SEQ
    q                    q                   q                   q -> Q'
      PAR                  PAR                 SEQ               STOP
        p -> P'              STOP                Q'
        Q'                   Q'                  STOP

Q.E.D. :)

The good thing about this is that it gives us a safe way of freezing processes
temporarilly or permanently.  The bad thing is that, in the latter case, it
doesn't permit any tidy-up actions (that involve further synchronisation).

I suppose this also gives option (d) above.  We have:

  (1) throw (and catch from) exception sets (not just singletons);
  (2) PAR propagates the union of exceptions thrown by its components
      - unless that union is empty, in which case it terminates normally;
  (3) a process cannot engage in a poisoned (or suspended) event.

That gives us consistency (I think?) and it gives a simple mechanism for
freezing processes.  But it doesn't quite give us what we want - namely
the *termination* of processes that touch poisoned events (and the chance
to trap the termination).

I'd like to get the semantics of poison and exceptions right.  The obvious:

  (4) a process trying to engage in a poisoned event throws an exception;
  (5) PAR does not propagate exceptions.

appears to be wrong.  Separately or together, clauses (4) and (5) leads to
inconsistency.

Peter.