
Re: Transputer - schools


This is great. You now have a compiler running
on an ARM simulator - compiling itself. This is a fairly good
test in itself - the compiler includes a good mix of high and
low level programming techniques. 

However, it should now be fairly simple to bootstrap the 
compiler so that we don't need the simulator. You are
using an ARM to run a simulator that simulates an ARM.

The changes needed are 

1: object code format - probably need to modify the 
    compiler output to produce ELF or something.

2: system interface for i/o etc. 

If you do this, the compiler will produce ARM object 
code that will run on any ARM (I think). The code will
be at least 10 times faster than running it on the
simulator.

I am working on putting concurrency into X - this will
be slightly different from occam (reflecting what I have
learned over the last few years). 

From an educational viewpoint, X is interesting for 
several reasons:

1. It is small and understandable - both language and
    compiler

2. It is powerful enough to express its own compiler

3. It has no GOTOs, Breaks, Pointers etc - I think 
    Floyd-Hoare logic can be used on any program 
    you can write in X. 

Also - my small ARM subset can easily be used to 
re-target an occam compiler (constants, static link 
and all!). I'm happy to help with this, if someone wants
to take it on. 

Best wishes



On 29 Jun 2013, at 12:10, J.B.W.Webber wrote:

Dear David,
I have now built the code you attached on a Raspberry Pi.
I followed the script you sent, and it all worked as you said.
You mention example programs – could you attach these so that I can test what I have built further?
If people are interested, I will add this compiler to the compiler set I will make available on an SD card for the Raspberry Pi, for people to experiment with.
                Beau Webber
From: Sympa,pkg125 [mailto:sympa@xxxxxxxxxx] On Behalf Of J.B.W.Webber
Sent: 29 June 2013 09:31
To: David May
Cc: Ruth Ivimey-Cook; Larry Dickson; occam-com@xxxxxxxxxx
Subject: Re: Transputer - schools
Dear David, it is good to have your input here. Thanks for attaching your stripped-down ARM code. I have been following your talks on this - really interesting, an important development.
I do now have Occam compiling .occ source on the Raspberry Pi - this is just the interpreted version of the University of Kent KRoC. It works nicely for my investigative purposes, where speed does not matter.
My interest is a follow-up to work I did on getting the concise/agile array-manipulating language APL to run on the Transputer, back in the 1980s.
To be more precise, I want APL nodes running on the processors, joined together by a communication harness. And to me, Occam is the ideal language to build that harness, given its robustness against common multiprocessing pitfalls.
I now have both the aplc APL-to-C compiler and KRoC Occam running on the Raspberry Pi, and am inviting people to join me in experimenting in this very low-cost environment.
Cheers, Beau Webber


On 28 Jun 2013, at 23:04, "David May" <dave@xxxxxxxxxxxxxxxxxxxxx> wrote:

I've seen this trail. But I'm not sure what you are trying to do. If it's to
port occam to ARM - good idea. 
The best way to do this is to retarget the back-end of the occam 
compiler. You'll also need a runtime for the scheduling instructions.
I've recently done a compiler for a simple language that targets 
a subset of ARM Thumb instructions - details attached. It has a 
codebuffer that deals with all of the constant pool issues. The 
instruction subset works on all current ARMs (I think). The language
is the sequential subset of the one I used during the design of the
XMOS architecture. 
I did this to teach students about computer architecture and 
compilers - but it's probably useful in other contexts! I'm working
on a much simpler architecture than the ARM subset - one that
also supports concurrency with thousands of cores. And in the
meantime it would be great to get occam compiled to ARM, 
Raspberry Pi etc. 
Also - I suspect a carefully written transputer simulator would
run quite fast on a modern processor. The simulator I've attached
(for the ARM subset) runs at over 100MIPS on my ageing 
MacBook - so the compiler will bootstrap using the simulator 
in under 0.5 seconds. 
Best wishes

I've attached:

1. A description of the language, thumb subset and 
   compiler - xarmnotes.pdf

2. A simulator - armsim.c - that executes a binary in a.bin. This
   uses another file - armexec.c which is generated by another
   program - armsgen.c (you only need this if you want to 
   modify the simulator). 

3. A compiler binary - xarm.bin

4. A compiler source - xarm.x

5. Another compiler source that generates easy-to-read assembly
   code (for which we have no assembler right now).

6. A source produced using a formatter and latex - xarm.pdf.

To make this lot work, you have to compile the simulator using (eg)

gcc armsim.c

Then you can copy xarm.bin to a.bin:

cp xarm.bin a.bin

and run the simulator; the compiler binary expects source from 
standard input so the command is:

a.out <xarm.x

This will produce a binary output in a file called sim2. 

You can copy sim2 to a.bin and compile again:

cp sim2 a.bin 

a.out <xarm.x

You should be able to do this over and over again with the same 
result.
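
One way to confirm that repeated compilation has reached a fixed
point is to compare a.bin with the freshly written sim2 byte for byte.
The following is only a minimal C sketch of such a check; the function
name files_identical is an illustrative assumption and is not part of
the attached sources:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return 1 if the two files hold identical bytes, 0 if they differ,
 * and -1 if either file cannot be opened.
 */
int files_identical(const char *path_a, const char *path_b)
{
    assert(path_a != NULL && path_b != NULL);

    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (a == NULL || b == NULL) {
        result = -1;
    } else {
        int ca, cb;
        /* Read both files in step; any mismatch (including one file
           ending before the other) means they differ. */
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {
                result = 0;
                break;
            }
        } while (ca != EOF);
    }

    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return result;
}
```

Calling files_identical("a.bin", "sim2") after a pass would then be
expected to return 1 once the compiler reproduces itself.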

To see what is being compiled, you can compile xarma.x:

a.out <xarma.x

Now you can copy sim2 to a.bin and compile again:

cp sim2 a.bin 

a.out <xarm.x

This will produce the assembly code for the compiler,
corresponding to the code that is being executed by the
simulator.

I also have versions that produce different loader formats
etc - as the compiler actually produces position-independent
binary in its codebuffer it should be fairly easy to bootstrap it 
to produce anything you want.    

I have not tried executing the compiler on a real ARM 
although I have run some simple programs. It has an
(undocumented) feature - if you use a constant as an 
array, the compiler will generate memory accesses 
relative to the value of the constant - this provides for
memory-mapped i/o. Also, if you use a constant as 
a procedure or function, the compiler generates an 
SVC - this is used in the simulator to do file i/o.  

The compiler has some rough edges especially in
error reporting. I'm gradually tidying it up. 

Best wishes


On 28 Jun 2013, at 17:02, Ruth Ivimey-Cook wrote:

Larry Dickson wrote:
Can you go into detail about these "literal pools"? I don't remember seeing such a thing (probably it had a different name). I remember the static chain, which sounded complicated.
What I'm wondering is whether they can be avoided or simplified by following a coding convention. Then, if the other cases were done inefficiently, it would not matter so much.
ARM processors - at least the architecture available then - were limited in the addresses they could load from or store to with a single instruction: IIRC, only those within +/- 4096 bytes of the PC. If you needed an address outside that range, you had to calculate it or load it into a register. Similarly, any non-trivial constant had to be loaded in the same way. Therefore, in ARM code you find, between blocks of opcodes, these pools or blocks of literal values that are used by the code in the surrounding area. This is because ARM opcodes are always one 32-bit word, so you can't have a 32-bit constant in one of them.
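
To make the range limit concrete, here is a small C sketch of the
reachability test a code generator might apply before placing a
literal. It assumes the classic 32-bit ARM encoding, where a
PC-relative load carries a 12-bit offset magnitude (0 to 4095) and the
PC reads as the instruction address plus 8; the function name
literal_reachable is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if a literal stored at literal_addr can be loaded by a single
 * PC-relative load placed at instr_addr, 0 otherwise.
 * Assumptions: 32-bit ARM state, PC reads as instr_addr + 8, and the
 * offset field holds a magnitude of at most 4095 bytes in either direction.
 */
int literal_reachable(uint32_t instr_addr, uint32_t literal_addr)
{
    /* ARM-state instructions are word aligned. */
    assert((instr_addr & 3u) == 0);

    int64_t pc = (int64_t)instr_addr + 8;
    int64_t offset = (int64_t)literal_addr - pc;

    return offset >= -4095 && offset <= 4095;
}
```

A pool that drifts further than this from the instructions using it
has to be split, or the value has to be built in a register instead.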

The problem I had was that, without knowing more about the context than a single instruction, it was hard to translate transputer-code values and addresses into anything remotely efficient in ARM terms, so these literal pools became big - so big in some cases that it became impossible to use just pools at function boundaries. That was when I started looking around for alternatives.

Below I've included some of the code I wrote to work out whether a constant could be loaded directly or whether it had to be pooled.


/* will v fit in mask m (eg. 0xff) with a shift of 2n? */

static __inline unsigned int rotate_left_by_2(unsigned int v)
{
    unsigned r = (v << 2) | (v >> (30));
    return r;
}

unsigned int is_arm_immed(unsigned int v, int m)
{
    int n;
    int nm = ~m; /* e.g. 0xffffff00 */

    /* rotate until v fits inside the mask, or all 16 even rotations are used up */
    n = 0;
    while ((n < 16) && ((v & nm) != 0))
    {
        v = rotate_left_by_2(v);
        n++;
    }
    return (n != 16);
}

int big (int n, int bits)                     /* deal with big operands (12..32 bits) */
{
    if (bits == 8)
    {
        /* small means it will fit in 8 bits, modified by an even rotate, with sign independent */
        int k = (n >= 0) ? n : -n;
        if (is_arm_immed(k, 0xff))
        {
            mnem("ldimm"); T0; dec(n); COMMENT("big constant"); NL;
            return 1;
        }
    }
    else /* 12 bits excl. sign, straight */
    {
        if (is_arm_immed(n, 0xfff))
        {
            mnem("ldimm"); T0; dec(n); COMMENT("big constant"); NL;
            return 1;
        }
    }
    return 0;
}
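
For anyone who wants to try the immediate test without the rest of the
code generator, the following is a self-contained C sketch of the same
idea, restricted to the 8-bit data-processing form (an 8-bit value
rotated right by an even amount). The names rotl2,
fits_arm_immediate and arm_immediate_examples are illustrative
assumptions, not taken from the code above:

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by two bit positions. */
uint32_t rotl2(uint32_t v)
{
    return (v << 2) | (v >> 30);
}

/*
 * Return 1 if v can be written as an 8-bit value rotated right by an
 * even amount (0, 2, ..., 30); return 0 if it cannot, in which case the
 * constant would have to come from a literal pool or be built in steps.
 */
int fits_arm_immediate(uint32_t v)
{
    for (int n = 0; n < 16; n++) {
        if ((v & 0xffffff00u) == 0)
            return 1;           /* v now sits entirely in the low 8 bits */
        v = rotl2(v);
    }
    return 0;
}

/* A few self-checking cases. */
void arm_immediate_examples(void)
{
    assert(fits_arm_immediate(0x000000ffu) == 1); /* no rotation needed */
    assert(fits_arm_immediate(0xff000000u) == 1); /* 0xff rotated right by 8 */
    assert(fits_arm_immediate(0xf000000fu) == 1); /* 0xff rotated right by 4 */
    assert(fits_arm_immediate(0x00000104u) == 1); /* 0x41 shifted left by 2 */
    assert(fits_arm_immediate(0x00000102u) == 0); /* would need an odd shift */
    assert(fits_arm_immediate(0x00000101u) == 0); /* set bits span 9 positions */
}
```

Values such as 0x101 and 0x102 are the kind that end up in a literal
pool when nothing is known about the surrounding context.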

Software Manager & Engineer
Tel: 01223 414180
Blog: http://www.ivimey.org/blog
LinkedIn: http://uk.linkedin.com/in/ruthivimeycook/