
Re: Go - deadlocked processes causing memory leaks




On Mar 29, 2015, at 9:01 AM, Roger Shepherd <rog@xxxxxxxx> wrote:


On 29 Mar 2015, at 15:18, Rick Beton <rick.beton@xxxxxxxxx> wrote:

PS
"Because the channel is non-blocking, ..."
should read
"Because the channel is unbuffered, ..."

(edited version follows below)
Rick





Rick,

I think I understand the explanation on Stack Overflow. Essentially it says that if a one-place buffered channel is used, the sending process won’t block even if the receiver is dead, and both the process and the channel will be garbage collected. Presumably, in the implementation, a process’s memory is reclaimed only when it terminates, whereas dangling channels are garbage collected. I deduce that it is not an error to have a program which outputs N items into a channel but inputs only N-1. (I know how you get there, how you end up with buffered channels, and then end up with the nasty consequences. I think the solution is to change the programming model so that the cases that cause problems can be better expressed - occam and Go are too low level.)

I think the buffered channel “fix” only works if the channel eventually becomes ready - if it doesn’t, the process cannot complete. 

I can’t be bothered to go through the whole of the background to this, and my knowledge of Go is only superficial. However, I did design and implement the occam implementation of InputOrFail etc., so I think I have some insight into this.

I’ve always thought InputOrFail and brethren were very important, and that the theory needed to be expanded to try to embrace those cases, for reasons like those in this thread. In my book Crawl-Space Computing, I worked on one case (orchard sensors) that had failing connections, and found I had to carry it a little farther (p 133):

PRI ALT
  bool1 & c1 ? mess1
    -- code
  -- more channel guards if needed
  clock ? AFTER timeout
    -- code, e.g. timedout := TRUE
  FAILURE
    -- code, e.g. notdone := FALSE

where c1 ? mess1 is really an InputOrFail and branches to FAILURE if aborted. The reason it works is that once an inputting link channel guard is selected, and before its communication is done, its process’s address remains in its channel word and can be recovered. If the first, “unsolicited” byte has not been sent, or has not finished coming, then the channel’s branch of the ALT is not ready, and the timer branch wins. So this makes the ALT bulletproof.

Larry Dickson