Re: jcsp Parallel doesn't handle Error cleanly
Hi Peter, Rick, and everybody else reading this,
this is a most interesting problem.
On Friday 24 February 2006 17:54, P.H.Welch wrote:
> Otherwise, what should be done when a process crashes with an uncaught
> exception? Should we bring down other processes in the same PAR (and
> their children)?
Well, if I understand the code correctly, any exception thrown by a CSProcess
is actually caught by the ParThread class. This then prints a trace
message, resigns from the parallel.barrier and parks the thread. Thus, the
Parallel and the ParThread treat the process as having terminated
successfully rather than exceptionally.
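
To illustrate the behaviour described above, here is a minimal sketch in plain Java. It is not the JCSP source: SilentPar is a hypothetical name, and java.util.concurrent.Phaser merely stands in for the barrier that the ParThreads resign from. The point is that the PAR completes normally even though one of its processes died.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a PAR; not the JCSP Parallel class.
final class SilentPar {
    /**
     * Runs all bodies in parallel and returns once every body has finished.
     * The returned failure count exists only so the effect can be observed;
     * in the behaviour described above the caller learns nothing.
     */
    static int run(Runnable... bodies) {
        Phaser barrier = new Phaser(1);          // party 1: the coordinating thread
        AtomicInteger failures = new AtomicInteger();
        for (Runnable body : bodies) {
            barrier.register();                  // one party per process
            new Thread(() -> {
                try {
                    body.run();
                } catch (Throwable t) {
                    // Trace message only; the failure is otherwise swallowed.
                    System.err.println("process failed: " + t);
                    failures.incrementAndGet();
                } finally {
                    // Resign from the barrier exactly as after a normal end.
                    barrier.arriveAndDeregister();
                }
            }).start();
        }
        barrier.arriveAndAwaitAdvance();         // the PAR "terminates successfully"
        return failures.get();
    }
}
```

A call such as SilentPar.run(okProcess, crashingProcess) returns as if nothing had happened, which is exactly what makes the failure hard to notice.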
This behaviour is wrong in my opinion, because it can cause deadlocks when
other processes want to communicate with the exceptionally terminated
process. Therefore, I would opt for terminating the whole PAR, but even then
it may cause problems as we cannot be sure that processes outside the PAR do
not want to communicate with processes inside the PAR.
I see at present two possible solutions:
1.) Ignore the exception and restart the process whenever an exception is
caught. A trace message should still be printed, or the failure logged in
some way.
The problem with this is, of course, that a restarted process may be in a
different state from the one it was in when it threw the exception. This
approach is therefore somewhat dangerous, as it may itself cause deadlocks.
The use of valves, which when closed avoid deadlocks by discarding messages, is
not appropriate here, as we cannot afford to lose messages. This is in contrast
to a Software Defined Radio, where signal data may be discarded while a
software module is exchanged.
This option is only good for legacy code, but even there it may not be proper.
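
A minimal sketch of option 1 in plain Java, under my own assumptions: RestartingProcess and the maxRestarts limit are hypothetical and not part of JCSP, and Runnable stands in for CSProcess. Note that the body is simply run again, so any state it changed before the exception is still changed, which is the danger mentioned above.

```java
// Hypothetical wrapper, not part of JCSP: reruns a process body after a failure.
final class RestartingProcess implements Runnable {
    private final Runnable body;
    private final int maxRestarts;   // assumed safety limit, to avoid looping forever
    private int restarts = 0;

    RestartingProcess(Runnable body, int maxRestarts) {
        this.body = body;
        this.maxRestarts = maxRestarts;
    }

    /** Number of restarts performed so far. */
    int restarts() {
        return restarts;
    }

    @Override
    public void run() {
        while (true) {
            try {
                body.run();
                return;                          // normal termination
            } catch (RuntimeException | Error t) {
                // Always report the failure, even though we carry on.
                System.err.println("process failed: " + t);
                if (restarts >= maxRestarts) {
                    throw t;                     // give up and let the caller decide
                }
                restarts++;
            }
        }
    }
}
```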
2.) Bring down the whole program, to ensure that the program does not
deadlock. However, unless an advanced termination technique is used, this
may leave the system in an unknown state. This is undesirable, especially when
dealing with hardware devices. It would therefore have to be a graceful
termination.
The question is how to do it. Even with poisonable channels this is not so
easy, as at least one dedicated channel-end would need to be available to the
PAR. This blurs the border between the processes and the Parallel, which is
not desirable.
Another way of gracefully terminating the network in case of an uncaught
exception is to register the channel-ends of a process with an instance
of ParThread. This instance could then initiate the termination by poisoning
the channel-ends. This would be the easiest and safest way. First, it
does not require the creation of extra channels. Second, the channel-ends
are accessed only when an exception has been caught, so it is safe to do
so.
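
A rough sketch of this registration idea in plain Java. Everything here is an assumption made for illustration: Poisonable, PoisonException, ToyChannel and PoisoningWrapper are hypothetical names, none of them is JCSP API, and the toy channel is buffered rather than a proper rendezvous channel.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical: anything that can be poisoned to start a graceful shutdown.
interface Poisonable {
    void poison();
}

// Hypothetical: thrown by any operation on a poisoned channel.
final class PoisonException extends RuntimeException {
}

// Toy buffered channel, for illustration only.
final class ToyChannel<T> implements Poisonable {
    private final Deque<T> buffer = new ArrayDeque<>();
    private boolean poisoned = false;

    synchronized void write(T value) {
        if (poisoned) throw new PoisonException();
        buffer.addLast(value);
        notifyAll();
    }

    synchronized T read() throws InterruptedException {
        while (buffer.isEmpty() && !poisoned) wait();
        if (poisoned) throw new PoisonException();   // poison takes precedence over data
        return buffer.removeFirst();
    }

    @Override
    public synchronized void poison() {
        poisoned = true;
        notifyAll();                                 // wake up any blocked reader
    }
}

// Plays the role suggested for ParThread: runs the body and poisons the
// registered channel-ends only if the body ends with an exception.
final class PoisoningWrapper implements Runnable {
    private final Runnable body;
    private final List<Poisonable> ends = new ArrayList<>();

    PoisoningWrapper(Runnable body) {
        this.body = body;
    }

    PoisoningWrapper register(Poisonable end) {
        ends.add(end);
        return this;
    }

    @Override
    public void run() {
        try {
            body.run();
        } catch (RuntimeException | Error t) {
            System.err.println("process failed, poisoning its channel-ends: " + t);
            for (Poisonable end : ends) {
                end.poison();                        // touched only on this failure path
            }
        }
    }
}
```

Because a PoisonException is handled like any other exception, a process that is itself poisoned passes the poison on to its own registered channel-ends, so the termination spreads through the network without any extra channels.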
Both solutions I proposed above assume that there is no higher instance that
could possibly bring the network back into a functioning state. When going
in this direction, I think it is worthwhile to take a look at these papers
from CPA 2005:
- `On Issues of constructing an Exception Handling Mechanism for CSP-Based
Process-Oriented Concurrent Software' by Jovanovic et al.
- `Exception Handling Mechanism in CTJ' by Hilderink.
Any comments, other ideas?
Cheers
Bernhard
--
If you want the holes in your knowledge showing up try teaching
someone. -- Alan Cox