
Re: jcsp Parallel doesn't handle Error cleanly



Hi Peter, Rick, and everybody else reading this,

This is a most interesting problem.


On Friday 24 February 2006 17:54, P.H.Welch wrote:

> Otherwise, what should be done when a process crashes with an uncaught
> exception? Should we bring down other processes in the same PAR (and
> their children)?

Well, if I understand the code correctly, any exception thrown by a CSProcess 
is caught by the ParThread class, which then prints a trace message, resigns 
from the parallel.barrier and parks the thread. Thus, the Parallel and the 
ParThread treat this as a successful termination of the process rather than 
an exceptional one.
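
A minimal sketch of the behaviour as described above, in plain Java rather 
than JCSP. The class SwallowingPar and its counter are hypothetical names made 
up for illustration, and a CountDownLatch merely stands in for the barrier; 
this is not the actual ParThread code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the behaviour described above; NOT JCSP code.
final class SwallowingPar {
    // Exists only so a demo can observe what the caller normally cannot see.
    static final AtomicInteger swallowed = new AtomicInteger();

    // Runs all processes in parallel and returns when every one has finished,
    // whether it finished normally or by throwing.
    static void run(Runnable... processes) {
        CountDownLatch barrier = new CountDownLatch(processes.length);
        for (Runnable p : processes) {
            new Thread(() -> {
                try {
                    p.run();
                } catch (Throwable t) {
                    // The failure is traced, but never reported to the caller.
                    swallowed.incrementAndGet();
                    System.err.println("process crashed: " + t);
                } finally {
                    // "Resign" from the barrier in both cases.
                    barrier.countDown();
                }
            }).start();
        }
        try {
            barrier.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling SwallowingPar.run with one process that throws and one that does not 
returns normally, so the caller cannot tell the crash from a clean finish.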

This behaviour is wrong in my opinion, because it can cause deadlocks when 
other processes want to communicate with the exceptionally terminated 
process. Therefore, I would opt for terminating the whole PAR, but even then 
it may cause problems as we cannot be sure that processes outside the PAR do 
not want to communicate with processes inside the PAR. 



I see at present two possible solutions:

1.) Ignore the exception and restart the process in case an exception is 
caught. Printing a trace message, or logging of some sort, should be done 
nevertheless.

The problem with this is, of course, that a restarted process may behave 
differently from how it was behaving at the time it threw the exception. 
This approach is therefore somewhat dangerous, as it may itself cause 
deadlocks.

The use of valves, which when closed avoid deadlocks by discarding messages, 
is not appropriate here, as we cannot afford to lose messages. This is in 
contrast to a Software Defined Radio, where signal data may be discarded 
while a software module is exchanged.

This option is only good for legacy code, but even there it may not be proper.
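
Option 1 could be sketched roughly as follows, in plain Java. RestartingProcess 
is a hypothetical wrapper, not part of JCSP, and the cap on the number of 
restarts is an added assumption so that a permanently broken process does not 
loop forever.

```java
// Hypothetical wrapper illustrating option 1; NOT part of JCSP.
final class RestartingProcess implements Runnable {
    private final Runnable body;
    private final int maxRestarts; // assumption: a cap, not in the proposal
    private int restarts = 0;

    RestartingProcess(Runnable body, int maxRestarts) {
        this.body = body;
        this.maxRestarts = maxRestarts;
    }

    @Override
    public void run() {
        while (true) {
            try {
                body.run();
                return; // normal termination
            } catch (RuntimeException | Error e) {
                // Log first, as suggested above, then decide whether to retry.
                System.err.println("process failed, restart " + restarts + ": " + e);
                if (restarts >= maxRestarts) {
                    throw e; // give up and let the failure surface
                }
                restarts++;
                // The body restarts from scratch; any state it had is gone.
            }
        }
    }

    int restarts() {
        return restarts;
    }
}
```

Note that the sketch does nothing about messages that were in flight when the 
exception was thrown, which is exactly the weakness discussed above.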



2.) Bring down the whole program, to ensure that it does not deadlock. 
However, unless an advanced termination technique is used, this may leave 
the system in an unknown state. This is undesirable, especially when dealing 
with hardware devices. It would therefore have to be a graceful termination.

The question is how to do it. Even with Poisonable-Channels this is not so 
easy, as at least one dedicated channel-end would need to be available to the 
PAR. This blurs the border between processes and the Parallel, which is not 
desirable.

Another way of gracefully terminating the network in case of an uncaught 
exception is to register the channel-ends of a process with an instance of 
ParThread. This instance could then initiate the termination by poisoning 
the channel-ends. This would be the easiest and safest way. First, it does 
not require the creation of extra channels. Second, the channel-ends are only 
accessed once an exception has been caught, so it is safe to do so.
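
The registration idea might look roughly like this in plain Java. All names 
here (PoisonException, PoisonableChannel, PoisoningRunner) are hypothetical 
and not JCSP API, and the channel is a simple buffered queue rather than a 
synchronising CSP channel, to keep the sketch short.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical: thrown to any process that touches a poisoned channel.
final class PoisonException extends RuntimeException {
    PoisonException() { super("channel poisoned"); }
}

// Hypothetical buffered channel with poison support; NOT a JCSP channel.
final class PoisonableChannel<T> {
    private static final Object POISON = new Object();
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private volatile boolean poisoned = false;

    void write(T value) {
        if (poisoned) throw new PoisonException();
        queue.add(value);
    }

    @SuppressWarnings("unchecked")
    T read() {
        try {
            Object o = queue.take();
            if (o == POISON) {
                queue.add(POISON); // keep the poison for other readers
                throw new PoisonException();
            }
            return (T) o;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading", e);
        }
    }

    void poison() {
        poisoned = true;
        queue.add(POISON); // wakes a reader that is already blocked
    }
}

// Hypothetical runner: the channel-ends are registered up front and are
// touched by the runner only after the process has thrown.
final class PoisoningRunner implements Runnable {
    private final Runnable process;
    private final PoisonableChannel<?>[] ends;

    PoisoningRunner(Runnable process, PoisonableChannel<?>... ends) {
        this.process = process;
        this.ends = ends;
    }

    @Override
    public void run() {
        try {
            process.run();
        } catch (Throwable t) {
            System.err.println("process crashed, poisoning its channels: " + t);
            for (PoisonableChannel<?> end : ends) {
                end.poison();
            }
        }
    }
}
```

Because PoisonException is caught by the same handler, a process that dies 
from poison passes the poison on to its own registered channel-ends, so the 
termination spreads through the network without any extra channels.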


Both solutions I proposed above assume that there is no higher instance that 
could possibly bring the network back into a functioning state. If we want to 
go in this direction, I think it is worthwhile to take a look at these papers 
from CPA 2005: 
- `On Issues of constructing an Exception Handling Mechanism for CSP-Based 
Process-Oriented Concurrent Software' by Jovanovic et al. 
- `Exception Handling Mechanism in CTJ' by Hilderink.



Any comments, other ideas?

Cheers
Bernhard 

-- 
If you want the holes in your knowledge showing up try teaching
someone. -- Alan Cox