Rick,

I think I understand the explanation in Stack Overflow. Essentially this says that if a one-buffered channel is used, the sending process won't block even if the receiver is dead, and the process and channel will be garbage collected. Presumably in the implementation, a process's memory is reclaimed only when it terminates, whereas dangling channels are garbage collected. I deduce it is not an error to have a program which outputs N items into a channel but inputs only N-1.

[I know how you get there, how you end up with buffered channels, and then end up with the nasty consequences. I think the solution is to change the programming model so the cases that cause problems can be better expressed - occam and Go are too low level.] I think the buffered channel "fix" only works if the channel eventually becomes ready - if it doesn't, the process cannot complete.

I can't be bothered to go through the whole of the background to this, and my knowledge of Go is only superficial. However, I did design and implement the occam implementation of InputOrFail etc., so I think I have some insight into this. What I don't understand is the motivation of this example, i.e. what the programmer is really trying to do. (I suspect that some problems are caused by having these mono-pole processes which fork but do not necessarily join.)

In occam we write an input with timeout as:-

    ALT
      InputChannel ? message
        HandleMessage(message)
      TIME ? AFTER timeout
        HandleLackOfMessage()

Of course, this doesn't really match the Go example, which is a bit like

    CHAN c OF Thing :
    PAR
      SEQ
        TIME ? t
        TIME ? AFTER t + delay
        c ! someThing
      ALT
        c ? message
          HandleMessage(message)
        TIME ? AFTER timeout
          HandleLackOfMessage()

which makes clear the deadlock in the case where the ALT chooses the timeout path: the SEQ branch then waits forever on c ! someThing, so the PAR never terminates. So, we might take the Go example as an example of problems with process termination rather than a problem with channels. But, as I said, I don't really understand the semantics of Go.
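For comparison, here is a rough Go sketch of the same shape as the second occam fragment above. It is only an illustration under my own assumptions: the function name timeoutDemo, the "done" channel used to observe the sender, and the particular delays are all invented here and are not taken from the Stack Overflow example. With capacity 0 the channel behaves like the synchronous occam channel and the sender is left stranded when the timeout path is taken; with capacity 1 the send completes into the buffer and the sender can terminate.

```go
package main

import (
	"fmt"
	"time"
)

// timeoutDemo starts a sender that delivers one value after a delay and
// waits for it with a shorter timeout, so the timeout path is expected to
// win. bufSize is the channel capacity (0 = synchronous, 1 = one-buffered).
// It reports whether the message arrived in time and whether the sender
// goroutine managed to terminate. All names here are hypothetical.
func timeoutDemo(bufSize int) (gotMessage, senderFinished bool) {
	c := make(chan string, bufSize)
	done := make(chan struct{}) // closed when the sender returns

	go func() {
		defer close(done)
		time.Sleep(50 * time.Millisecond)
		c <- "someThing" // with bufSize 0 this blocks until someone receives
	}()

	// The equivalent of the ALT: first ready branch wins.
	select {
	case <-c:
		gotMessage = true
	case <-time.After(5 * time.Millisecond):
		// timeout path: nobody will ever receive from c again
	}

	// Give the sender ample time, then see whether it terminated.
	select {
	case <-done:
		senderFinished = true
	case <-time.After(300 * time.Millisecond):
	}
	return gotMessage, senderFinished
}

func main() {
	for _, n := range []int{0, 1} {
		got, fin := timeoutDemo(n)
		fmt.Printf("capacity %d: gotMessage=%v senderFinished=%v\n", n, got, fin)
	}
}
```

In the capacity 0 case the goroutine is still blocked on its send when timeoutDemo returns; the program as a whole carries on, which is the Go counterpart of the PAR that never joins.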
Is the system supposed to detect and suppress deadlocked processes?

The other reason why I think you might want to write this sort of stuff is to deal with failing communication - that is, a failure of the model that the language supports. There are at least two problems with failing communication - the first is defining it and detecting it, the second is what to do about it (recovery).

With the transputer, although we were primarily concerned about link* failure (probably due to disconnection), our solution would work on internal channels. Because we were dealing with a failure of the underlying hardware to support the programming model, we couldn't rely on the use of "input with timeout" (quite apart from the fact that we didn't support "output with timeout"). [One problem is that for a link channel, it is the arrival of the first byte of the communication which causes the ALT to make its selection. A subsequent failure which meant that the message did not complete would fail to trigger the timeout.]

Addressing "the first is defining it and detecting it" - we defined a failure as a failure to complete an input or output before either a timeout or another message arrived. We provided the functionality by four procedures:

    InputOrFail.t(CHAN c, []BYTE mess, TIMER TIME, INT t, BOOL aborted)
    OutputOrFail.t(CHAN c, VAL []BYTE mess, TIMER TIME, INT t, BOOL aborted)
    InputOrFail.c(CHAN c, []BYTE mess, CHAN kill, BOOL aborted)
    OutputOrFail.c(CHAN c, VAL []BYTE mess, CHAN kill, BOOL aborted)

I can't remember exactly how they worked but it must have been something like

    PROC OutputOrFail.c(CHAN c, VAL []BYTE mess, CHAN kill, BOOL aborted)
      CHAN completed :
      INT workspace_pointer_of_first_process_of_PAR, endP_instruction_of_first_process_of_PAR :
      PAR
        -- set workspace_pointer_of_first_process_of_PAR to workspace pointer of outputting process
        SEQ -- outputting process
          c ! mess
          completed ! ANY
          LABEL: endP_instruction_of_first_process_of_PAR
        ALT
          completed ? x
            aborted := FALSE
          kill ? x
            INT c_pid, completed_pid :
            SEQ
              -- at this point (given this sequential process is running) the outputting process may be
              --   waiting for communication on c to complete
              --   on the run queue
              --   waiting to communicate on completed (i.e. the communication on c worked but we have defined this as a failure)
              aborted := TRUE -- !
              c_pid := resetch(c) -- reset channel instruction
              completed_pid := resetch(completed)
              -- we will cause outputting process to execute its join() [ENDP]
              -- set up Iptr of process workspace to be ENDP
              WriteToMemory(workspace_pointer_of_first_process_of_PAR, Iptr.s, endP_instruction_of_first_process_of_PAR)
              -- add process to queue if it isn't there already
              IF
                (c_pid <> NotProcess.p) OR (completed_pid <> NotProcess.p)
                  -- process was waiting on a channel
                  RUNP(workspace_pointer_of_first_process_of_PAR)
                TRUE
                  SKIP
    :

So, returning to the theme, this code works by causing the outputting process and the watchdog process to perform their join() in the normal way. So, I suspect that the Go system could do something similar - causing the "deadlocked" process to complete by sleight of hand.

By the way, this code relies on certain atomicity properties of the transputer scheduler (e.g. the operation of the reset channel instruction). Otherwise it can become difficult and/or expensive to deal with possibly zombie processes.

Roger

*A link failure can affect both of the channels that the link implements, and processes at either end of the link.
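As a point of comparison with OutputOrFail.c, the following is a minimal Go sketch of "output unless killed". It is not a translation of the transputer code: the name outputOrFail, the use of a struct{} channel for kill, and the []byte message type are assumptions made for illustration. Go's select allows a send to appear as one of the alternatives and commits to exactly one ready case, so for in-memory channels no separate outputting process or channel reset is needed. It says nothing about the link case, where a transfer can fail part-way through.

```go
package main

import "fmt"

// outputOrFail tries to send mess on c. If kill becomes ready first, the
// send is abandoned and aborted is reported as true. Hypothetical helper,
// loosely modelled on the shape of OutputOrFail.c.
func outputOrFail(c chan<- []byte, mess []byte, kill <-chan struct{}) (aborted bool) {
	select {
	case c <- mess:
		return false // communication completed
	case <-kill:
		return true // killed before any receiver took the message
	}
}

func main() {
	// Case 1: nobody ever receives and kill has already fired.
	dead := make(chan []byte)
	kill := make(chan struct{})
	close(kill) // a closed channel is always ready to receive from
	fmt.Println("no receiver, killed: aborted =", outputOrFail(dead, []byte("hello"), kill))

	// Case 2: a receiver is waiting and kill never fires.
	live := make(chan []byte)
	received := make(chan []byte)
	go func() { received <- <-live }()
	neverKilled := make(chan struct{})
	fmt.Println("receiver present: aborted =", outputOrFail(live, []byte("hello"), neverKilled))
	fmt.Printf("receiver got %q\n", <-received)
}
```

A timeout variant in the spirit of OutputOrFail.t could pass the channel returned by time.After as the second alternative instead of kill.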