
RE: XMOS devices + Altera NiosII

The same partially gutted library source compiles OK on the Altera NiosII.

Here is a simple set of matrix operations in a function "matrix":
Source in AplX:


A ← ⍳ 5
' '
B ← ⍳ 9
' '
'C: inner product:'
C ← A ∘.× B
' '
'T: transpose:'
T ← ⍉ C
' '
'O: outer product (matrix multiply):'
O ← C +.× T
' '
'S: solve matrix:'
S ← C ⌹ A

NiosII simple floating point soft processor, on a Terasic DE1 development board :

 1 2 3 4 5
 1 2 3 4 5 6 7 8 9
C: inner product:
 1  2  3  4  5  6  7  8  9
 2  4  6  8 10 12 14 16 18
 3  6  9 12 15 18 21 24 27
 4  8 12 16 20 24 28 32 36
 5 10 15 20 25 30 35 40 45
T: transpose:
 1  2  3  4  5
 2  4  6  8 10
 3  6  9 12 15
 4  8 12 16 20
 5 10 15 20 25
 6 12 18 24 30
 7 14 21 28 35
 8 16 24 32 40
 9 18 27 36 45
O: outer product (matrix multiply):
  285  570  855 1140 1425
  570 1140 1710 2280 2850
  855 1710 2565 3420 4275
 1140 2280 3420 4560 5700
 1425 2850 4275 5700 7125
S: solve matrix:
 1 2 3 4 5 6 7 8 9
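
For readers who do not have an APL interpreter to hand, the four operations in the listing can be cross-checked with a minimal Python sketch. This is not aplc output and not how the translated C is structured; the function names are made up for illustration, and they follow the usual APL terminology, in which `∘.×` is the outer product and `+.×` the inner product (matrix multiply). The solve step assumes that dividing the matrix C by the vector A means a least-squares fit with A treated as a single column.

```python
from fractions import Fraction


def outer_product(a, b):
    """Every element of a times every element of b (APL: a ∘.× b)."""
    return [[x * y for y in b] for x in a]


def transpose(m):
    """Swap rows and columns (APL: ⍉ m)."""
    return [list(col) for col in zip(*m)]


def matmul(m, n):
    """Ordinary matrix multiply (APL: m +.× n)."""
    n_cols = transpose(n)
    return [[sum(x * y for x, y in zip(row, col)) for col in n_cols]
            for row in m]


def solve_single_column(c, a):
    """Least-squares x such that a[i] * x[j] ~ c[i][j].

    Assumption: a is one column, so the normal equations reduce to a
    single division per column of c.
    """
    denom = sum(x * x for x in a)
    return [Fraction(sum(x * row[j] for x, row in zip(a, c)), denom)
            for j in range(len(c[0]))]


A = list(range(1, 6))        # 1..5
B = list(range(1, 10))       # 1..9
C = outer_product(A, B)      # 5 x 9
T = transpose(C)             # 9 x 5
O = matmul(C, T)             # 5 x 5
S = solve_single_column(C, A)

for name, value in (("C", C), ("T", T), ("O", O)):
    print(name)
    for row in value:
        print(" ".join(str(v) for v in row))
print("S", " ".join(str(v) for v in S))
```

Exact fractions are used in the solve step so that the comparison with the integers 1 to 9 does not depend on floating-point rounding.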

I am currently trying to get ./configure to work from inside Eclipse.
Then I will look at communications.

-----Original Message-----
From: J.B.W.Webber 
Sent: 06 October 2012 12:22
To: Peter Welch; occam-com@xxxxxxxxxx; rick.beton@xxxxxxxxx
Subject: RE: XMOS devices

Thanks indeed Peter, for pointing me to XMOS.

There has been some discussion re using XMOS chips as the outer ring of processors, to handle I/O as it happens, rather than having to prioritise.
So I thought I would summarise my experiences over the last few days.

On Wednesday I finally had time to have a proper look at the XMOS site, and decided to download their compiler suite - based on Eclipse (like the Altera NiosII suite but customised significantly differently).
The Flashing-LED example appeared to compile, and it was then quick to create, build and simulate a hello-world program.

So I ordered an XC-1A module from RS - £73 + VAT - which came on Thursday evening. This has 4 cores and 1/4Mb ram.
Flashing-LED and hello world ran on a core fine, and it was quick to change the LED flash rate.

That is where things slowed down. My main wish regarding the XMOS and NiosII devices is to port Apl generated code to them - Apl and Maple are between them my main tools of thought. 

In the 80s I was hand-translating Apl to Basic and 6800 assembler, in the 90s to LabVIEW, and at the moment I am having to do it to VHDL. The aim is to make use of Tim Budd's aplc Apl to C translator, which in the 1990s I ported to SGIs and the Dec Alpha, as well as the lowly Atari TT. This has the advantage that in the 1980s/1990s I also added a minimum set of multi-tasking / multi-processor facilities (Julian Timestamp, pipe, spawn, and detailed I/O control) which I believe give it some of the capacity to form multiprocessing nodes :
	A Pipe has two ends. Using APL in a multiprocess/multiprocessor environment. A proposal for a flexible but easy to use syntax. J.B. Webber. APL Quote Quad 20, 1, 1-2, Sept 1989. http://doi.acm.org/10.1145/379199.379200
	(The proposal and implementation of a software mechanism now used by IBM.) 
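
As a rough illustration of the spawn-and-pipe idea, here is a minimal Python sketch in which a parent spawns a child process, writes a vector down one pipe and reads the answer back on another. It is not the aplc syntax or implementation; `sum_in_child` and `CHILD_SOURCE` are hypothetical names, and the child is simply a second Python interpreter.

```python
import subprocess
import sys

# Program run by the spawned child: read numbers from its end of the
# pipe (stdin), write their sum to the other pipe (stdout).
CHILD_SOURCE = (
    "import sys; "
    "print(sum(int(tok) for tok in sys.stdin.read().split()))"
)


def sum_in_child(values):
    """Spawn a child, send it the values over a pipe, return its reply."""
    text = " ".join(str(v) for v in values)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=text,            # parent writes into the child's stdin pipe
        capture_output=True,   # parent reads the child's stdout pipe
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    print(sum_in_child(range(1, 10)))
```

The parent holds one end of each pipe and the child the other, so neither side needs to know how the other is implemented.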

The XMOS Eclipse set-up has a graphical Makefile, but not all of it seems to work as one would expect, and there is some other magic one has to do - it is described in the documents, but I will put a blog up with the details, or ask if you are stuck. What I have not got to work (which seems to work on the NiosII setup) is running an external script (./configure) to configure the aplc Apl to C compiler for this processor and compiler suite. However the main need is to compile the aplc run-time library, about 50 .c and .h files, some (from the heavyweight LAPACK maths package) themselves translated from the Fortran.

I regret to say that to get this to compile I have had to turn off all the multi-tasking facilities, but for the moment simple code does run. I will have to dig and see what the problem is, because when I look at the library files, it all seems to be there. I am not entirely surprised, since I am porting Unix code.

Anyway, for a very short test, of summing a vector :
Apl running in MicroApl's AplX :
      V ← ⍳ 9
      ⎕ ← V
1 2 3 4 5 6 7 8 9
      S ← +/V
      ⎕ ← S

Translated to c and compiled and running on a single core of the XC-1A module :
1 2 3 4 5 6 7 8 9

I tried a larger test : "ulam.apl", being a simple (overly-simple) 5 function apl demo program, that generates a list of primes and then proceeds to show them as a pattern (the pattern probably says more about the human desire to find patterns). This compiled fine, but was 1/3 too large for a single core.
My interest in compiling this was anyway to use it as a distributed example. 
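
For anyone unfamiliar with the pattern, the following is a minimal Python sketch of the same idea: sieve a list of primes, then lay the integers out on a square spiral and mark the primes. It is not a translation of ulam.apl; the function names, the anticlockwise winding and the `*` / `.` characters are all choices made here for illustration.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes; assumes n >= 2."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def ulam_spiral(size):
    """Return the spiral as a list of strings; size should be odd."""
    last = size * size
    prime_set = set(primes_up_to(last))
    grid = [[" "] * size for _ in range(size)]
    row = col = size // 2          # 1 sits in the centre
    d_row, d_col = 0, 1            # first move is to the right
    n, step = 1, 1
    while n <= last:
        for _ in range(2):         # two legs for each step length
            for _ in range(step):
                if n > last:
                    break
                grid[row][col] = "*" if n in prime_set else "."
                n += 1
                row, col = row + d_row, col + d_col
            d_row, d_col = -d_col, d_row   # turn anticlockwise
        step += 1
    return ["".join(line) for line in grid]


if __name__ == "__main__":
    for line in ulam_spiral(21):
        print(line)
```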

For an example of the sort of thing I want to get working in aplc on the XMOS devices, and also on the NiosII using a core with floating point, see some of the hand-coded VHDL examples :

So my next tasks are to find why I am having to turn off the communications in the XMOS compilation, and get them working again, and also to try and persuade the NiosII compiler suite to load a -l aplc library.
Then I can work on having multiple aplc programs interacting. 
Then I will look at real-world interactions using the XMOS devices.

-----Original Message-----
From: P.H.Welch [mailto:P.H.Welch@xxxxxxxxxx]
Sent: 27 September 2012 18:58
To: J.B.W.Webber; occam-com@xxxxxxxxxx; rick.beton@xxxxxxxxx
Subject: RE: Occam-Tau - the natural successor to Occam-Pi

Hi Beau,

> However for my sins these days I spend quite a bit of my time writing 
> VHDL for the FPGA based instrumentation I design for my experiments.
> And I keep looking at the concept of the multiple soft processors with 
> fast interconnection links that can fit on these chips.
> Is there any possibility of any of these Occam related parallel 
> languages  being of use for sitting inside / generating code for these 
> soft processors, on a reasonable time-scale ?

Take a look at:


This was founded by David May (chief architect for the transputer and occam) and others a few years ago.  They are targeting (I think) customers with needs exactly like those you describe.  Their XC language is a specialised derivative of occam and their chips seem designed as FPGA replacements:
soft hardware re-programmable in an occam-like manner with very sleek hardware support for concurrency (up to 32 processes anyway).