Thanks David.
Hmm, it is 20 years since I did much digging into the internals of compilers – I now use APL for my multi-tasking lab control and data collection programs.
Further, my understanding of the Arm is limited to what I have picked up porting these languages to the Raspberry Pi.
So I am afraid I will have to leave this next stage to others. However I am certainly interested in doing the bits I am able to do.
“System interface for I/O” is something I am interested in for my work on porting APL to the Adapteva Epiphany – again this is missing I/O – but I am fairly out of my depth here; there are people who understand this sort of thing far better than I do.
Cheers,
Beau
From: David May [mailto:dave@xxxxxxxxxxxxxxxxxxxxx]
Sent: 29 June 2013 18:41
To: J.B.W.Webber
Cc: David May; Ruth Ivimey-Cook; Larry Dickson; occam-com@xxxxxxxxxx
Subject: Re: Transputer - schools
This is great. You now have a compiler running
on an ARM simulator - compiling itself. This is a fairly good
test in itself - the compiler includes a good mix of high and
low level programming techniques.
However, it should now be fairly simple to bootstrap the
compiler so that we don't need the simulator. You are
using an ARM to run a simulator that simulates an ARM.
1: object code format - probably need to modify the
compiler output to produce ELF or something.
2: system interface for i/o etc.
If you do this, the compiler will produce ARM object
code that will run on any ARM (I think). The code will
be at least 10 times faster than running it on the
I am working on putting concurrency into X - this will
be slightly different from occam (reflecting what I have
learned over the last few years).
From an educational viewpoint, X is interesting for
1. It is small and understandable - both language and
2. It is powerful enough to express its own compiler
3. It has no GOTOs, Breaks, Pointers etc - I think
Floyd-Hoare logic can be used on any program
Also - my small ARM subset can easily be used to
re-target an occam compiler (constants, static link
and all!). I'm happy to help with this, if someone wants
On 29 Jun 2013, at 12:10, J.B.W.Webber wrote:
I have now built the code you attached on a Raspberry Pi.
I followed the script you sent, and it all worked as you said.
You mention example programs – could you attach these so that I can test what I have built further?
If people are interested, I will add this compiler to the compiler set I will make available on an SD card for the Raspberry Pi, for people to experiment with.
Dear David, it is good to have your input here. Thanks for attaching your stripped down Arm code. I have been following your talks on this - really interesting, an important development.
I do now have Occam compiling .occ source, on the Raspberry Pi - this is just the interpreted version of the University of Kent Kroc. Works nicely for my investigative purposes - where speed does not matter.
My interest is a follow up of work I did on getting the concise/agile array manipulating language APL to run on the Transputer, back in the 1980s.
To be more precise, I want APL nodes running on the processors, joined together by a communication harness. And to me, Occam is the ideal language to build that harness, given its robustness against common multiprocessing pitfalls.
I now have both aplc APL to C compiler and Kroc Occam running on the Raspberry Pi, and am inviting people to join me in experimenting in this very low cost environment.
I've seen this trail. But I'm not sure what you are trying to do. If it's to
port occam to ARM - good idea.
The best way to do this is to retarget the back-end of the occam
compiler. You'll also need a runtime for the scheduling instructions.
I've recently done a compiler for a simple language that targets
a subset of ARM Thumb instructions - details attached. It has a
codebuffer that deals with all of the constant pool issues. The
instruction subset works on all current ARMs (I think). The language
is the sequential subset of the one I used during the design of the
I did this to teach students about computer architecture and
compilers - but it's probably useful in other contexts! I'm working
on a much simpler architecture than the ARM subset - one that
also supports concurrency with thousands of cores. And in the
meantime it would be great to get occam compiled to ARM,
Also - I suspect a carefully written transputer simulator would
run quite fast on a modern processor. The simulator I've attached
(for the ARM subset) runs at over 100MIPS on my ageing
MacBook - so the compiler will bootstrap using the simulator
I've attached:
1. A description of the language, thumb subset and
compiler - xarmnotes.pdf
2. A simulator - armsim.c - that executes a binary in a.bin. This
uses another file - armexec.c which is generated by another
program - armsgen.c (you only need this if you want to
modify the simulator).
3. A compiler binary - xarm.bin
4. A compiler source - xarm.x
5. Another compiler source that generates assembly code (for
which we have no assembler right now) which is easy to read.
6. A source produced using a formatter and latex - xarm.pdf.
To make this lot work, you have to compile the simulator using (eg)
gcc armsim.c
Then you can copy xarm.bin to a.bin:
cp xarm.bin a.bin
and run the simulator; the compiler binary expects source from
standard input so the command is:
a.out <xarm.x
This will produce a binary output in a file called sim2.
You can copy sim2 to a.bin and compile again:
cp sim2 a.bin
a.out <xarm.x
You should be able to do this over and over again with the same
result.
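One way to confirm the "same result" claim is to keep a copy of each generation's output and compare it byte for byte with the next one. The following is only a minimal sketch in C: the function name files_identical is hypothetical and not part of the attached sources, and it assumes you have copied sim2 aside (for example as sim2.prev) before recompiling.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the attached sources.
 * Returns 1 if the two files have identical contents,
 * 0 if they differ, and -1 if either file cannot be opened.
 */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int result = 1;

    if (fa == NULL || fb == NULL) {
        if (fa != NULL) fclose(fa);
        if (fb != NULL) fclose(fb);
        return -1;
    }

    for (;;) {
        int ca = fgetc(fa);
        int cb = fgetc(fb);
        if (ca != cb) {        /* differing byte, or one file ended early */
            result = 0;
            break;
        }
        if (ca == EOF)         /* both files ended together */
            break;
    }

    fclose(fa);
    fclose(fb);
    return result;
}
```

Calling files_identical("sim2.prev", "sim2") after a second compile should return 1 if the bootstrap has reached a fixed point.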
To see what is being compiled, you can compile xarma.x:
a.out <xarma.x
Now you can copy sim2 to a.bin and compile again:
cp sim2 a.bin
a.out <xarm.x
This will produce the assembly code for the compiler,
corresponding to the code that is being executed by the
simulator.
I also have versions that produce different loader formats
etc - as the compiler actually produces position-independent
binary in its codebuffer it should be fairly easy to bootstrap it
to produce anything you want.
I have not tried executing the compiler on a real ARM
although I have run some simple programs. It has an
(undocumented) feature - if you use a constant as an
array the compiler will generate memory accesses
relative to the value of the constant - this provides for
memory mapped i/o. Also if you use a constant as
a procedure or function the compiler generates an
SVC - this is used in the simulator to do file i/o.
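For readers more used to C, the constant-as-array feature is roughly analogous to indexing from a fixed base address through a volatile pointer. The sketch below is an assumption-laden illustration, not the X compiler's syntax or generated code: the helper names mmio_write and mmio_read are hypothetical, and on real hardware the base would be a device address cast to a pointer rather than an ordinary array.

```c
#include <stdint.h>

/*
 * Hypothetical helpers illustrating memory-mapped I/O in C.
 * `base` is treated as the start of a block of 32-bit device
 * registers; `index` selects a register relative to that base.
 * `volatile` stops the compiler from caching or dropping accesses.
 */
static inline void mmio_write(volatile uint32_t *base, unsigned index,
                              uint32_t value)
{
    base[index] = value;
}

static inline uint32_t mmio_read(const volatile uint32_t *base,
                                 unsigned index)
{
    return base[index];
}
```

On a hosted system the helpers can be exercised against a plain uint32_t array standing in for the device registers.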
The compiler has some rough edges especially in
error reporting. I'm gradually tidying it up.
Best wishes
David
On 28 Jun 2013, at 17:02, Ruth Ivimey-Cook wrote:
Can you go into detail about these "literal pools"? I don't remember seeing such a thing (probably it had a different name). I remember the static chain, which sounded complicated.
What I'm wondering is whether they can be avoided or simplified by following a coding convention. Then, if the other cases were done inefficiently, it would not matter so much.
ARM processors - at least the architecture available then - could only load from or store to, in a single instruction, addresses within (IIRC) +/- 4096 bytes of the PC. If you needed an address outside that range, you had to calculate or load the address into a register. Similarly, any non-trivial constant had to be loaded in the same way. Therefore, in ARM code you find, between blocks of opcodes, these pools or blocks of literal values that are used by the code in the surrounding area. This is because ARM opcodes are always one 32-bit word, so you can't have a 32-bit constant in one of them.
The problem I had was that, without knowing more about the context than a single instruction, it was hard to translate transputer-code values and addresses into anything remotely efficient in ARM terms, so these literal pools became big - so big in some cases that it became impossible to use just pools at function boundaries. That was when I started looking around for alternatives.
Below I've included some of the code I wrote to work out whether a constant could be loaded directly or whether it had to be pooled.
Regards
Ruth
/* Will v fit in mask m (e.g. 0xff) after a left rotation by 2n bits? */
static inline unsigned int rotate_left_by_2(unsigned int v)
{
    /* assumes a 32-bit unsigned int */
    return (v << 2) | (v >> 30);
}

unsigned int is_arm_immed(unsigned int v, int m)
{
    int n = 0;
    unsigned int nm = ~(unsigned int)m;   /* e.g. 0xffffff00 */

    /* keep rotating while some bit still falls outside the mask */
    while ((n < 14) && ((v & nm) != 0))
    {
        v = rotate_left_by_2(v);
        n++;
    }
    return (n != 14);   /* 1 if a fitting rotation was found in 14 tries */
}
int big (int n, int bits) /* deal with big operands (12..32 bits) */
{
    if (bits == 8)
    {
        /* small means it will fit in 8 bits, modified by an even rotate, with sign independent */
        int k = (n >= 0) ? n : -n;
        if (is_arm_immed(k, 0xff))
        {
            mnem("ldimm"); T0; dec(n); COMMENT("big constant"); NL;
            return 1;
        }
    }
    else /* 12 bits excl. sign, straight */
    {
        if (is_arm_immed(n, 0xfff))
        {
            mnem("ldimm"); T0; dec(n); COMMENT("big constant"); NL;
            return 1;
        }
    }
    return 0;
}
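
As a standalone illustration of the rule being tested here, the sketch below tries all 16 even rotations allowed by the classic 32-bit ARM data-processing immediate format (an 8-bit value rotated right by twice a 4-bit count) and, where one fits, returns the 12-bit operand field. The names rotl32 and arm_encode_immediate are hypothetical and not taken from the code above; this is a sketch under those assumptions, not the translator's actual logic.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by `bits`, avoiding an undefined 32-bit shift. */
static uint32_t rotl32(uint32_t v, unsigned bits)
{
    bits &= 31u;
    return bits ? (v << bits) | (v >> (32u - bits)) : v;
}

/*
 * Hypothetical helper: try to encode v as an 8-bit value rotated
 * right by an even amount. On success returns 1 and stores the
 * 12-bit operand field (rot << 8 | imm8) in *field. Returns 0 if
 * no rotation fits, i.e. v would need a literal pool entry or a
 * multi-instruction sequence.
 */
int arm_encode_immediate(uint32_t v, uint32_t *field)
{
    for (unsigned rot = 0; rot < 16; rot++) {
        /* v == imm8 ROR (2*rot)  <=>  imm8 == v ROL (2*rot) */
        uint32_t imm8 = rotl32(v, 2 * rot);
        if (imm8 <= 0xFFu) {
            *field = ((uint32_t)rot << 8) | imm8;
            return 1;
        }
    }
    return 0;
}
```

For example, 0xFF000000 encodes as 0xFF rotated right by 8 (field 0x4FF), while 0x101 has no valid encoding and would have to be pooled.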
--
Software Manager & Engineer
Tel: 01223 414180
Blog: http://www.ivimey.org/blog
LinkedIn: http://uk.linkedin.com/in/ruthivimeycook/