
RE: FW: A "CJT" in python



Jim,
   The stack and the workspace are actually in opposition, as I
described in my 1996 paper; the contrast goes with dynamic vs static,
temporary vs eternal, sequential vs parallel, and so on.

> From: <jm40@xxxxxxxxx>
> To: "'Lawrence Dickson'" <tjoccam@xxxxxxxxxxxxx>
> Cc: <java-threads@xxxxxxxxx>
> Subject: RE: FW: A "CJT" in python
> Date: Sun, 29 Oct 2000 18:25:40 -0000
> Message-ID: <000501c041d5$a5386570$661d02d5@JIM1>
>
> > -----Original Message-----
> > From: Lawrence Dickson [mailto:tjoccam@xxxxxxxxxxxxx]
> > Sent: 29 October 2000 14:30
> > To: jm40@xxxxxxxxx; java-threads@xxxxxxxxx; occam-com@xxxxxxxxx
> > Subject: Re: FW: A "CJT" in python
> >
> >
> > Jim and all,
> >    This is an important point. The workspace pointer is NOT equivalent
> > to a stack pointer, for at least two reasons:
> >    (1) Its requirements are static and known at load time. There is
> > never any overflow (that is why it can expand DOWNWARD to a fixed
>
> Hi Larry,
>
> Why is the expansion of the workspace downwards important?  Most stack
> allocation is done downwards (including on the Intel range ISTR).
>
> > bottom memory address).
>
> Hmm.  I still don't see the difference.  I realise that the Workspace in the
> Transputer is statically allocated, but there is nothing to say that a
> normal C stack's usage must be unbounded.

But nothing prevents it from being unbounded (short of axe work like a
general protection fault on overflow).

The essential difference is that the occam root is at the top; all called
procedures and/or spawned processes (by PAR) grow downward, but CANNOT
GROW DOWNWARD PAST THE BOTTOM. This is only possible in an ESSENTIALLY
static language, which can predict all usage at load time.
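
A minimal sketch of what that load-time prediction looks like (my own
illustration, not the occam compiler's actual algorithm or data
structures): sequential calls reuse the parent's free space, so they
contribute the maximum of their needs, while PAR branches coexist, so
they contribute the sum. The whole-program bound is then fixed before
anything runs.

    /* Sketch only: computing a whole-program workspace bound before run
       time.  SEQ-style calls reuse space (take the max); PAR branches
       coexist (take the sum).  Names and structure are mine. */
    #include <stddef.h>

    enum kind { SEQ_CALLS, PAR_BRANCHES };

    struct node {
        enum kind kind;
        size_t own_words;         /* locals of this PROC / process */
        struct node **children;   /* called PROCs or PAR branches  */
        size_t n_children;
    };

    size_t workspace_words(const struct node *n)
    {
        size_t child_need = 0;
        for (size_t i = 0; i < n->n_children; i++) {
            size_t w = workspace_words(n->children[i]);
            if (n->kind == SEQ_CALLS) {
                if (w > child_need) child_need = w;   /* reuse: max   */
            } else {
                child_need += w;                      /* coexist: sum */
            }
        }
        return n->own_words + child_need;             /* fixed bound  */
    }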

>
> What I am trying to say is that its use is exactly the same as the stack
> pointer in any other machine.  When you require more local space, you move
> the workspace pointer down, and when you want to release it, you move it
> back up.

No, you start with the entire gob of space predicted by the compiler,
and then you never make any changes at run time. (You are thinking
in run-time allocation terms, but everything can be predicted at load
time.) This knowledge can also be used to eliminate the need for a
dynamic cache, by letting the innermost loops use the lowest addresses
= fastest (on-chip) memory; a trick used by the Transputer.
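
To picture that cache trick, here is a sketch with invented numbers (this
is not the real Transputer memory map): because the workspace grows
strictly downward from a fixed top, the innermost and hottest frames land
at the lowest addresses, and if the lowest addresses are on-chip RAM,
those frames get the fast memory with no dynamic cache at all.

    /* Sketch with invented sizes and addresses, not the real memory map:
       downward growth from a fixed top puts the deepest frames lowest,
       where the fast on-chip RAM can be placed. */
    #include <stdio.h>

    #define ONCHIP_LIMIT  0x1000u   /* assume the first 4 KB is on-chip RAM  */
    #define WORKSPACE_TOP 0x4000u   /* assume the root workspace starts here */

    int main(void)
    {
        /* frame sizes in bytes, outermost first, all known at load time */
        unsigned frame[] = { 0x1800, 0x1400, 0x0C00 };
        const char *name[] = { "root", "middle", "inner_loop" };
        unsigned wptr = WORKSPACE_TOP;

        for (int i = 0; i < 3; i++) {
            wptr -= frame[i];                /* grow downward, bounded */
            printf("%-10s frame at 0x%04X..0x%04X  %s\n",
                   name[i], wptr, wptr + frame[i],
                   wptr < ONCHIP_LIMIT ? "(reaches on-chip RAM)"
                                       : "(external RAM)");
        }
        return 0;
    }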

> That is Dijkstra activation record allocation if there ever was an
> example of it.  Also, the implementation of C on the Transputer must have
> used the workspace pointer directly as a `stack pointer'...
>
> >    (2) As a result, non-Transputer CSP (occam) implementations can
> > run multitasked without stack saving. In DOS, I ran over 200
> > multitasked communicating programs on a single shared stack
> > of 10 KB or
> surely they are not `sharing' a stack, but all using very small separate
> stacks.

I am most definitely SHARING a stack. I wrote the assembly code to do so.
And a "very small" stack would be unsafe: there are interrupts. At any
moment, all processes except the running process have zero stack (except
for the system's 30 bytes). This works because at the moment of context
switch, no process has any stack.
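
To make the zero-stack-at-the-switch point concrete, here is a minimal
sketch of my own (not the DOS assembly being described): every process
runs on the one machine stack, and a process may only give up the CPU by
returning to the scheduler, having parked its resume point and "locals"
in its own small, statically allocated workspace. At the moment of every
switch the shared stack is empty, so there is nothing to save.

    /* Minimal sketch: N processes share ONE machine stack.  A process
       gives up the CPU only by returning to the scheduler, so at every
       context switch its stack is empty; its state lives in a small,
       statically allocated workspace instead. */
    #include <stddef.h>

    #define NPROC 4

    struct workspace {
        int resume_point;   /* where to continue next time we are run */
        int local;          /* an example "local variable" kept here  */
    };

    typedef void (*process_fn)(struct workspace *ws);

    static void counter_process(struct workspace *ws)
    {
        switch (ws->resume_point) {
        case 0:
            ws->local = 0;
            /* fall through */
        case 1:
            ws->local++;              /* do one slice of work        */
            ws->resume_point = 1;     /* remember where to continue  */
            return;                   /* deschedule: stack now empty */
        }
    }

    static struct workspace ws[NPROC];   /* static, sized at build time */
    static process_fn proc[NPROC] = {
        counter_process, counter_process, counter_process, counter_process
    };

    int main(void)
    {
        for (int round = 0; round < 1000; round++)     /* round-robin */
            for (int i = 0; i < NPROC; i++)
                proc[i](&ws[i]);          /* runs on the shared stack */
        return 0;
    }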

> > so, using only the 30 stack bytes per program which was required by the
> > system for nested program execution (a load listing run from the
> > command prompt on top would show all 200 programs), plus a little on
> > top for interrupts and certain instructions.
>
> But you can do that in C.  There is nothing to stop you working out the
> memory requirements of threads/processes in C (as long as you avoid
> unbounded recursion of course).  It could even be done automatically.  It's
> just that it's built in with occam.

I very much doubt you could get by without a couple of KB of stack for each
process. I can get by with 30 bytes, and that's while inheriting legacy OS
code (DOS). It's a huge difference. And how do you avoid unbounded recursion
without killing the design of C? (Yes, I know the answer; it's the same
answer as for "unbounded heap memory" etc.: you kill the implementation by
crashing when some of your "#define" constants get too big.)

>
> I know that it's safer not to have colliding stacks.  I don't need to be sold
> on that.  It's just that for a long time I didn't realise that the workspace
> was essentially just a stack.  Realising this has made it much easier to
> contemplate writing CSP run-time systems that work with both C and occam.
>

What you are proposing (with limiting recursion etc.) is really just the
opposite: the stack is essentially just a workspace. Enforce that, and
you can write a C-to-occam translator, which proves my point. Of course,
all the algorithmic designs that gloss over such limitations will have
to be drastically qualified.

> >
> > > From: <jm40@xxxxxxxxx>
> > > <SNIP>
> > > Perhaps I am mis-understanding the idea of "stackless".
> >
> > At context switching between processes of the same priority, no stack
> > is saved. In my DOS implementations, small stacks would arise due to
> > certain instructions, but they would go away at a descheduling point.
> > This is all because of the static nature of occam which allows
> > pre-allocation of needed workspace.
>
> Yes, but the stack pointer _is_ saved implicitly by the fact that you use it
> _also_ as the pointer to the process descriptor.  When you deschedule a
> process due to a communication or whatever, you put it in the channel word.
> Otherwise how can a process hold any context (local variables and so on)?
> And how can you find it when you switch process back?   The
> workspace/stack-pointer duality confuses people, but really it is just as if
> you allocate a small block of local space to hold the current process's
> descriptor just like a local variable.  The decision to use the space just
> below the workspace pointer is arbitrary - it could have been done by
> actually moving the pointer down (to allocate the process descriptor), then
> deschedule, then deallocate it when the process resumed.  But why bother?

Actually it was not arbitrary: it served the static cache model of
the Transputer (fastest memory at the lowest addresses). Obviously the
software-hardware equivalence required by occam means you have to save
all state somehow, so that the context switch out and back appears to be
no change to the soft process; but the occam/Transputer method did this
by a minimalist approach made possible only by the static resource
handling (and that applies to resources other than memory too).
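
For anyone trying to follow the workspace-pointer/process-descriptor
point, here is a sketch of the kind of layout being discussed (the
offsets, the empty-channel marker, and the queue handling are
illustrative assumptions of mine, not checked against the Transputer
books): the words just below the workspace pointer hold the scheduler's
view of the process, and descheduling on a channel stores nothing but
the workspace pointer itself in the channel word.

    /* Illustrative sketch (offsets and names are assumptions, not the
       documented Transputer layout): the "process descriptor" lives in
       the words just below Wptr, and a descheduled process is found
       again purely through the channel word. */
    #include <stddef.h>
    #include <stdint.h>

    typedef uintptr_t word;

    #define IPTR_OFFSET (-1)       /* saved resume address (assumed slot) */
    #define LINK_OFFSET (-2)       /* run-queue link       (assumed slot) */
    #define NOT_PROCESS ((word)0)  /* "empty channel" marker (assumed)    */

    /* Deschedule on a channel: park the resume address below Wptr and
       leave Wptr itself in the channel word for the partner to find. */
    void deschedule_on_channel(word *wptr, word *channel, word resume_iptr)
    {
        wptr[IPTR_OFFSET] = resume_iptr;   /* saved context = one word     */
        *channel = (word)wptr;             /* channel now names the waiter */
    }

    /* The partner completes the communication and requeues the waiter,
       again using only the space just below its workspace pointer. */
    word *resume_from_channel(word *channel, word **run_queue_tail)
    {
        word *waiter = (word *)*channel;
        *channel = NOT_PROCESS;
        waiter[LINK_OFFSET] = 0;                        /* end of queue */
        (*run_queue_tail)[LINK_OFFSET] = (word)waiter;  /* thread it in */
        *run_queue_tail = waiter;
        return waiter;
    }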

> >
> > >
> > > What the Transputer does have on its side is a very minimal register
> > > context - just the Workspace Pointer needs to be saved when switching
> > > (although this is in part due to the implicit invalidation of the
> > > evaluation stack (A/B/C) after any instruction that is interruptible).
> >
> > Remember the Transputer could also be programmed in C, and would have
> > the same stack problems as any other processor. It's a language design
> > question.
> Yes, but as I said above, I don't think that the memory block being a fixed
> size stops it from being a stack.

That is quibbling over terminology. The memory block being an UNfixed
size stops it from behaving predictably. And the C stack is designed
specifically to allow the UNfixed size.

> >
> > THIS IS VERY IMPORTANT! Most modern programmers, I think, can
> > no longer
> > even conceive the possibility of operating without massive stacks -
> > massive multiple stacks in the case of multitasking.
> Yes - I have had this experience!  It is almost entirely due to the use of
> the standard C library.  This seems to have large and difficult-to-predict
> stack requirements.  I would be interested to know what they are doing in
> there...  If you stay out of the libraries, C's stack usage is reasonably
> predictable.  This is not to suggest that `reasonably' or `probably won't
> collide' is good enough in this day and age :).

When real programmers get loose on real programs, they always collide
with whatever can be collided with - in my experience.

>
> > Nevertheless it is
> > possible, and solves a whole lot of robustness and efficiency
> > problems.
>
> At the expense of recursion - such a shame that.  I'm surprised that
> alternatives like Brinch Hansen's technique (used by David Wood in his paper
> on recursion in occam) haven't been taken up by more programming languages.
> The current vogue seems to rely on dynamic stacks that use MMUs to allocate
> more space on demand.  Hardly rocket science.
> >
> > Larry Dickson
> >
> >
> Jim

I'm not familiar with Brinch Hansen's technique. Does it go beyond the
LIMITED recursion that is possible by using PAR instances, for example?

My approach is to allow one thread to be dynamic, spawning static
communicating programs at will, which also handles recursion. The one
thing it refuses to handle is fragmentation, and I think most practical
problems can be done more cleanly without that. It avoids multiple layers
of abstraction (kernel drivers, virtual memory, etc.) which have caused
code since around 1990 to become impossibly stiff.
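
A rough guess at the shape of the allocator that makes "no fragmentation"
possible (my sketch only, not the actual design): the one dynamic thread
carves statically sized programs out of a single arena in strict LIFO
order, so space always comes back in the reverse of the order it went
out and holes can never form.

    /* Sketch only: a single dynamic thread spawns statically sized
       programs from one arena, growing downward like a workspace and
       releasing in LIFO order, so the arena can never fragment. */
    #include <stddef.h>
    #include <stdint.h>

    #define ARENA_BYTES (64 * 1024)

    static uint8_t arena[ARENA_BYTES];
    static size_t  top = ARENA_BYTES;    /* grows downward */

    /* Spawn: carve a fixed-size block off the bottom of the used region. */
    void *spawn_static(size_t bytes)
    {
        if (bytes > top)
            return NULL;             /* bound known and enforced up front */
        top -= bytes;
        return &arena[top];
    }

    /* Retire: only the most recently spawned program may be released. */
    void retire_static(void *block, size_t bytes)
    {
        if (block == &arena[top])    /* LIFO discipline: no holes, ever */
            top += bytes;
    }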

Larry Dickson