[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: A CSP model for Java threads

Hi Jeremy,

> So far my FDR analysis of JCSP has been met with total silence, perhaps
> because most of you are away on holiday ...

Sorry - I'm back for a while now but hope to grab a few more days sometime!

> Here is an FDR analysis of your model for java threads which shows that
> the JCSP channel is equivalent to something very simple in CSP.

You have had a very good idea here!!  I was getting worried that that CSP
model of Java threading wasn't helping ... very well done :-) :-) :-)

Error in my CSP model for Java threads

>                                                                  I found
> a couple of very minor bugs in your initial CSP specification, but
> otherwise I stayed true to your definition.

And in your last message:

> In Peter's CSP model the wait method was defined like this...
> WAIT(o,me) = release.o.me -> waita.o.me ->  waitb.o.me -> claim.o.me ->
> Using this definition I found that JCSP was incorrect - the channel may
> deadlock. The reason is that there is window of time between events
> release.o.me and waita.o.me where a thread neither holds the monitor
> nor belongs to the auxilliary pool of waiting threads. During this
> time interval it is possible for a notify event to be fired and
> to miss its target.
> So I redefined the wait method like this...
> WAIT(o,me) = waita.o.me -> release.o.me ->  waitb.o.me -> claim.o.me ->
> Now everything works out fine, because a thread holds onto the monitor
> until after it has joined the waiting pool..
> But was I right to do this? Or do implementations of Java exist where the
> time interval that Peter modelled really does happen? I hope not, because
> if that were so JCSP would be prone to arbitrary deadlocks, and we would
> all have to go back to the drawing board.

YES - you are right to make this change.  The JCSP channel implementation
stays ... I just got the CSP model of the wait wrong (as you immediately
dicovered when you tried to run a proof of the channel correctness through
FDR - another win for the use of formal tools).  Quoting from Doug Lea's
"Concurrent Programming in Java":

  (Following a wait) the JVM places the thread in an internal and otherwise
  inaccessible `wait set' associated with the target object.

  The synchronization lock for the target object is released, but all other
  locks held by the thread are retained.

where the context of the quote implies that the above paragraphs happen in
the given sequence.  So all JVMs are obliged to follow the sequence of your
above revision.

[Aside: Doug Lea, whilst not being a part of Sun-JavaSoft, is an authority
we can trust on these matters.]

However, there is one knock-on from this change that ought to be addressed.
When the WAIT(o,me) kicked off with the release.o.me, it was automatically
prevented from happening unless the LOCK(o) had been claim.o.me'd.  This
is because that release.o.me is part of the alphabet of LOCK(o) which, if
not claim.o.me'd, would refuse it.  [I think an earlier draft of the model
had the relase and the waita the right way round ... I think I was tempted
astray by this automatic `benefit' :-( ...]

Fixing WAIT(o,me) so that it kicks off with the waita.o.me means that this
can go ahead (putting `me' into the `wait set' for object `o') without having
first claimed the LOCK(o).  That shouldn't be allowed.  To fix that, we just
need to extend the alphabet of LOCK(o) to include all the waita.o.me events
and refuse them when not locked - i.e.

  alphaLOCK(o) = {claim.o.t, release.o.t, notify.o.t, notifyall.o.t, waita.o.t
                  | t <- Threads}

  LOCK(o) = claim.o?t -> LOCKED(o,t)

  LOCKED(O,t) = release.o.t -> LOCK(o)
              notify.o.t -> LOCKED(O,t)
              notifyall.o.t -> LOCKED(O,t)
              waita.o.t -> LOCKED(O,t)

[Aside: Java doesn't actually deadlock if you try to do a wait on an object
you haven't locked - it throws an exception.  I don't know how to try and
model an exception in CSP, so a deadlock seemed reasonable ... ?]

Non-deterministic Release following a Notify/NotifyAll

In my version of the WAIT[o][n] process, I maintained just a *count*, n, of
the threads in the waiting set.  Consequently, when a notify (or notifyall)
occurred, I had to use the external choice operator, [], to listen for
wait-b[o][s] across *all* threads s.  An internal choice, |~|, would have
been wrong as it could chose to go for a wait-b[o][s] where the s was not
one of those in the wait set.

Your version, SWAIT(o,ts), maintains the whole wait *set*, ts, which is neat.
It also allows you to use the (non-determiistic) internal choice operator
when reacting to a notify/notifyall for a non-null wait set.

But, in this context, are not [] and [~] equivalent?  All those processes
in the wait set will be trying (or are about to try) to do the wait.o.t
events - i.e. the environment is offering *all* the events over which the
choice is to be made - in which case, [] and [~] look equivalent to me?
Is there some technical advantage (to do with FDR?) in going for |~|?

Actually, since we now have the wait set held by the SWAIT(o) process,
we can simplify the desciption of the RELEASE(o,ts), following a notifyall,
by simply doing a parallel clearing of all the waitb.o.t events.  I couldn't
do this before when I maintained just a wait *count*:

  RELEASE(o.ts) = (|| t:ts @ waitb.o.t -> SKIP) ; SWAIT(o,{})

if I've got the syntax right?

[Note: I think there's a typo in the model in your first mailing.  In your
SWAIT(o,ts) and RELEASE(o.ts), you use waitb.o!s and waitb.o!t.  In your
WAIT(o,me), you also use waitb.o!me.  Both sides of the channel are writing!
Shouldn't that get rejected by FDR?  One side ought to be reading or, as in
your last email, just use dots.]

[Another typo: in your definition of alphaCHANNEL(o,t,t'), the "t'"s on the
write.o/ack.o events should be on the ready.o/read.o ones instead - I think?]

Simplifying the (JCSPCHANNEL = CHANNEL) Model and Proof

You have proved equivelence between a JCSPCHANNEL and a CHANNEL made up
from two processes:

   write |    __________                   ___________    | read
  --->---|---|          |    transmit     |           |---|--->---
         |   |   LEFT   |-------->--------|   RIGHT   |   |
  -------|---|__________|                 |___________|---|-------
    ack  |                                                | ready
         |                    CHANNEL                     |

where, loosely speaking:

  LEFT  = write?x -> transmit!x -> ack -> LEFT
  RIGHT = ready -> transmit?x -> read!x -> RIGHT

The thesis being that:

      _____________                             _____________
     |             |                           |             |
     |             |             c             |             |
     |      A      |------------->-------------|      B      |
     |             |                           |             |
     |_____________|                           |_____________|

is the same as:

      _____________                             _____________
     |             | write  ___________  read  |             |
     |             |--->---|           |--->---|             |
     |      A'     |       |  CHANNEL  |       |      B'     |
     |             |-------|___________|-------|             |
     |_____________|  ack                ready |_____________|

where A = A',

  except that (c!x -> P) from A is replaced by (write!x -> ack -> P) in A'

and B = B',

  except that (c?x -> Q) from B is replaced by (ready -> read?x -> Q) in B'.

Now, this thesis is self-evident (!) ... it would be nice to prove though???

BUT, can't we simplify the model (and the proof) as follows.  Drop the RIGHT
process, leaving the channel just as LEFT.  Also, we can drop the ack event
and reuse the write channel for this purpose.  So, consider:

         |                                                |
   write |                                                | read
  --->---|                    CHANNEL                     |--->---
         |                                                |


  CHANNEL = write?x -> read!x -> write?any -> CHANNEL

and now our thesis is that:

      _____________                             _____________
     |             |                           |             |
     |             |             c             |             |
     |      A      |------------->-------------|      B      |
     |             |                           |             |
     |_____________|                           |_____________|

is the same as:

      _____________                             _____________
     |             |        ___________        |             |
     |             | write |           | read  |             |
     |      A'     |--->---|  CHANNEL  |--->---|      B'     |
     |             |       |___________|       |             |
     |_____________|                           |_____________|

where A = A',

  except that (c!x -> P) from A is replaced by (write!x -> write!x -> P) in A'

and B = B',

  except that (c?x -> Q) from B is replaced by (read?x -> Q) in B'.

In the READ(o,t) process within JCSPCHANNEL, remove the ready.o.t from its
alphabet and its first line of code.  In the WRITE(o,t), remove the ack.o.t
from its alphabet and replace its occurrence in its code with write.o.t?any.

Now the assertions that JCSPCHANNEL and CHANNEL refine each other should still
go through ... ;-) ... ???

1-1, Many-1, 1-Many and Many-Many Channels

After your FDR assertions that CHANNEL and JCSPCHANNEL refine each other,
you write:

-- That works fine with the current system where only two threads
-- are in existence, but when we increase the number of threads in the
-- system beyond two, perhaps by redefining Threads = {0,1,2}, we find that
-- the above pair of assertions no longer holds. FDR reveals that this is
-- because the new threads may `fiddle' with the state of the channel
-- implementation.
-- So how do we stop other threads from interfering with the channel object?
-- In CSP we can add a parallel process to the channel implementation which
-- blocks access to any of the the relevant events! (Unfortunately this is
-- not supported by the actual Java implementation, and so must be
-- regarded as a usage rule for JCSP.)

The read/write methods you checked were from page 274 of the WoTUG-21
proceedings and are only safe to use for a channel between one reader
and one writer (at a time) - i.e. a 1-1 channel.  This is a usage rule
for occam channels and it is only intended to be secure if we abide by that.
In occam, under KRoC, SPoC or a transputer, if we switch off that usage
checking and allow more than one process to read from (or write to) a channel
in parallel, the channel implementation also fails.  So, I'm comfortable
with this usage rule.  [Have I understood your point here?]

I would be interested in your checking the other forms of channel :-) ...

A Many-1 channel is an efficient, but restricted, form of (fair) ALTing -
just like an occam3 SHARED channel.  The Many-1 channel code can be extracted
from the code on page 276 of WoTUG-21 - replace the `read' with the `read'
from page 274.  Page 276 gives the Many-Many channel.  A 1-Many channel is
the page 276 stuff, but using the `write' from page 274.

CSP, of course, doesn't have a Many-1 channel primitive - but it does have
choice primitives.  What we want to prove is that a Many2OneJCSPCHANNEL is
the same as a Many2OneCHANNEL process modelled, in the spirit of the above
simplification, by something like:

  Many2OneCHANNEL(o,t') = [] t:diff(Threads,{t'}) @ write.o.t?mess ->
                                            read.o.t'!mess ->
                                            write.o.t?any ->

  alphaMany2OneCHANNEL(o,t') = {write.o.t.m, read.o.t'.m |
                                  t <- diff(Threads,{t'}),
                                  m <- Data}

The One2ManyCHANNEL model may need to go back to your (LEFT||RIGHT) model,
as will certainly the Many2ManyCHANNEL - left for another day.


Many ALTing scenarios can be modelled more simply and efficiently with
a Many-1 (SHARED) channel.  But ALTing *is* more flexible than Many-1
and probably needs to be retained - for example, sometimes we may want
to choose between channels {a,b,c} and other times from {b,c,d}.  That
can't be supported by a Many-1 concept.

But we leave this for later!