Messages from 139050

Article: 139050
Subject: Re: Documenting a simple CPU
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Thu, 19 Mar 2009 09:57:34 +0000

On Thu, 19 Mar 2009 02:36:17 -0700 (PDT), -jg wrote:

>So you don't connect the students to the silicon at all ?

Yes, obviously; professional training makes no sense
unless it's anchored in the real world.  We have
courses where students download designs to demo
boards, and look closely at the impact of VHDL or
Verilog coding decisions on implementation.  But the
specific course for which the CPU design was written
is not focused on implementation concerns; the
students generally are experienced designers who
want to know more about what SystemVerilog can
do for them.  They're quite grown-up enough to make
their own decisions about implementation issues!

We have subsequently re-used the design on 
verification courses, where it's simply a way to 
get a bunch of interesting and varied activity 
without excessive complexity.  Again, in that 
context the (poor) efficiency of the design is
not relevant to the course content.

If a student on a course actually cared about 
improving the implementation of this design, 
my colleagues and I would be more than happy 
to discuss it.  But it's a digression.
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which 
are not the views of Doulos Ltd., unless specifically stated.

Article: 139051
Subject: camera module microblaze and sdram
From: SUMAN <sumansrb@gmail.com>
Date: Thu, 19 Mar 2009 04:19:48 -0700 (PDT)

Hi, I am interfacing a C3038 camera module (352x288 pixels) to a
MicroBlaze on a Spartan-3A DSP 1800A board. The frame grabber module
works perfectly, and I have tested it in ISE using ChipScope Pro.
The camera sends data every 112ns (8.093MHz). I want to store the data
in SDRAM. I am trying to interface it using the user logic FIFO
service. Is there any alternative way of doing this?

Article: 139052
Subject: Re: Zero operand CPUs
From: Albert van der Horst <albert@spenarnc.xs4all.nl>
Date: 19 Mar 2009 11:20:26 GMT

In article <jdydnR3YLPNte13UnZ2dnUVZ_uCdnZ2d@supernews.com>,
Andrew Haley  <andrew29@littlepinkcloud.invalid> wrote:
>In comp.lang.forth Albert van der Horst <albert@spenarnc.xs4all.nl> wrote:
>
>> The shallow stack of the transputer is lost on context switches for
>> equal priority task. Together with the limitation where context
>> switches could occur (only on conditional jumps) this accounts for a
>> very practical design.
>
>Actually UNconditional jumps and lend (loop end).  This was quite nice
>since you could prevent a context switch simply by
>
>   ldc 0; cj L

Silly mistake, sorry.

(Losing registers on conditional jumps would even prevent conditional
expressions, to give an example. You can turn a jump into a conditional
jump, but not the other way around.)

>
>Andrew.


--
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Article: 139053
Subject: Re: Documenting a simple CPU
From: "Symon" <symon_brewer@hotmail.com>
Date: Thu, 19 Mar 2009 11:24:59 -0000


"Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
news:nd24s4ds3v4ig9hmo49v83iegr26heicqo@4ax.com...
>
> There is an embarrassingly large amount of redundancy
> and overlap in the instruction set.  I'm working on it :-)
> -- 
> Jonathan Bromley, Consultant
>
Right. On the soft processor I did, there are 'free' opcodes that are 
useless, but it would take more fabric to turn them off. By the way, you 
might wanna stop reading here if you're not interested in 'yet another 
processor' design!

In this hobby processor I designed, I started from what the fabric would 
easily support. For example, I used 16 16-bit registers because that fits 
into 16 LUTs. I picked AND/OR/XOR/MOV because that's one LUT per bit. 
Likewise for shift/rotate left/right. I could use parallel data and opcode 
fetch because a blockram has two ports. The stack was also a block ram, with 
one port for registers, one for the PC, but the same pointer connected to 
both ports.

I was particularly pleased to get ADD/SUB/ADC/SBC/INC/DEC/INC C/DEC NC all 
working in one LUT per bit, albeit by instantiating the carry elements. 
Maybe the synthesiser would be better these days?

In the end, I even added interrupts, which was surprisingly easy as I 
already had CALL and RET. I just needed a couple of FFs to store the zero 
and carry flags. The eight to one mux into the register bank used the 
MUXF5/6 things to source data from the ALU, the logic/shift, stack, lsb 
multiplier, msb multiplier, immediate value, RAM or I/O.

This FPGA-centric approach meant that the whole thing fits in about 250 
LUTs, 2 BRAMS and 1 multiplier thingy, and runs at over 100MHz in a V2PRO 
which is over 100MIPS because most instructions take one cycle, even 
relative jumps. Bloody zero flag had the worst timing.

Like everyone says, the assembler took me a morning to write in Perl.
As for documentation, it's in VHDL. That's it, right? ;-)

One interesting thing is that it was easy to let the processor do things 
like READ R0,[R1++] because I could increment the registers at the same time 
as using them as indirect addresses.
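
A minimal Python sketch of that post-increment read, for illustration
only. The function name, the dictionary standing in for RAM, and the
restriction that the destination and pointer registers differ are
assumptions of this sketch, not details taken from the VHDL.

```python
WIDTH_MASK = 0xFFFF  # 16-bit registers, as described above


def read_postinc(regs, mem, dst, ptr):
    """Model READ Rdst,[Rptr++]: load through Rptr, then bump Rptr.

    Both results are derived from the old pointer value, which mimics
    the register file updating both registers on the same clock edge.
    Assumes dst != ptr (a restriction made up for this sketch).
    """
    addr = regs[ptr]
    regs[dst] = mem.get(addr, 0) & WIDTH_MASK  # unwritten memory reads as 0
    regs[ptr] = (addr + 1) & WIDTH_MASK        # wrap at 16 bits


regs = [0] * 16
regs[1] = 0x0010
mem = {0x0010: 0xBEEF, 0x0011: 0x1234}
read_postinc(regs, mem, dst=0, ptr=1)  # R0 = 0xBEEF, R1 = 0x0011
read_postinc(regs, mem, dst=0, ptr=1)  # R0 = 0x1234, R1 = 0x0012
```

Walking a buffer then costs one instruction per word, with no separate
INC for the pointer.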

For me the cool part of the exercise was to get as much out of the processor 
with the least FPGA resource. (Probably because I cut my teeth on XC2064s 
and XC3010s.) It's interesting to compare it with Jonathan's processor which 
had a completely different focus and application. Instantiating carry 
primitives probably isn't his main teaching objective!

Cheers, Syms.

Article: 139054
Subject: Update code in board
From: Potxoka <potxoka@gmail.com>
Date: Thu, 19 Mar 2009 05:09:47 -0700 (PDT)

Hi,

I am currently doing a design with an FPGA that controls bus signals
on an industrial control board. I want the board to be easy for the
customer to update, and adaptable to different equipment. By this I
mean that certain signals can be turned on or off, and that on another
industrial board the signals can be processed differently. I used the
FPGA flash and not ICSP.

Could someone help me and explain how this could be done? For example,
depending on which data is written to the flash, the circuit would
treat the data bus differently. One board might use a signal for one
purpose and another board for something else, depending on the
industrial machinery. I had planned to implement a microprocessor core
in the FPGA and store the program for this microprocessor in the
flash, but I find that laborious and do not know whether the task can
be done more easily. Implement a CPU as I have said? A data structure
in flash and an interpreter in the FPGA? Any ideas? Thank you.

Greetings
Antony

Article: 139055
Subject: Re: How to load an image onto system ace compact flash embedded on
From: Dirk Koch <dirk.koch@cs.fau.de>
Date: Thu, 19 Mar 2009 13:13:15 +0100

mopra wrote:
> Can anyone tell me how I can load an image file like .jpeg or .bmp on
> to the system ace compact flash embedded on virtex 2 Pro device?
> 
> I need to store image onto the compact flash and then do some
> processing on it. I tried using the different functions in
> sysace_stdio.h header file like sysace_fopen  etc but couldn't find a
> way to load an image file onto the compact flash, so please help me in
> this regard

Try to format the CF card once under Linux with
  mkfs -t msdos /dev/sda1
We have had a problem using System ACE with
CF cards formatted under Windows XP (FAT16).
After mkfs, you can use Windows to copy files
on the card. You should then be able to open
the files using sysace_fopen.

Good luck!
Dirk

Article: 139056
Subject: Re: Xilinx XAPP052 LFSR and its understanding
From: gabor <gabor@alacron.com>
Date: Thu, 19 Mar 2009 05:52:01 -0700 (PDT)

On Mar 18, 11:39 pm, Peter Alfke <al...@sbcglobal.net> wrote:
> On Mar 18, 10:04 am, Weng Tianxiang <wtx...@gmail.com> wrote:
>
>
>
> > Hi,
> > I want to generate a random number in a FPGA chip and in a natural way
> > it leads to a famous paper Xilinx XAPP052 authored by Peter Alfke:
> > "Efficient Shift Register, LFSR Counters, and Long Pseudo-Random
> > Sequence Generators".
>
> > I have two problems with the note.
>
> > 1. I don't understand "Divide-by-5 to 16 counter". I appreciate if
> > someone explain the Table 2 and Figure 2 in more details.
>
> > 2. In Figure 5, there is an equation: (Q3 xnor Q4) xor (Q1 and Q2 and
> > Q3 and Q4).
> > (Q3 xnor Q4) is required to generate 4-bit LFSR counter. (Q1 and Q2
> > and Q3 and Q4) is used to avoid a dead-lock situation from happening
> > when Q1-Q4 = '1'.
>
> > Now the 4-bit LFSR counter dead-lock situation should be extended to
> > any bits long LFSR counter if 2 elements XNOR operation is needed.
> > Especially in Figure 5 for 63-bit LFSR counter. When all 63-bits are
> > '1', it would be dead-locked into the all '1' position, because (Q62
> > xnor Q63) = '1' if both Q63 and Q62 are '1'.  But the situation is
> > excluded into the equation in Figure 5.
>
> > In another words, if a seed data is closing or equal to all '1'
> > situation, the LFSR is a shorter random number generator than its
> > claim of a 63-bit length generator. There is no way to exactly know if
> > a seed data is closing to all '1' situation.
>
> > We can add logic equation as 4-bit situation does as follows:
> > (Q62 xnor Q63) xor (Q1 and Q2 and ... and Q63).
>
> > There is a new question: If there is a more clever idea to do the same
> > things to avoid the 63-bit dead-lock situation from happening?
>
> > Weng
>
> Weng, there is no mystery. All the LFSRs that I described count by (2
> exp n)-1, since they naturally will never get into the all-ones state.
> If you want them to include that state, you need to decode the state
> one prior, use one gate to invert the input, and the same gate gets
> you out of it again. The "high" cost is that one very wide gate,
> nothing else. LFSRs are well-documented. I had just dug up some old
> information that I had generated at Fairchild Applications in the
> 'sixties.
> Peter Alfke

Depending on the LFSR construction, you may be able to use a
counter instead of a "wide gate".  For example, if your LFSR
is the type where the XOR gates feed only bit 1, you just need
to detect N successive 1's going into bit 1 (where N is the
LFSR length) to find the state where all bits go high.  For
a very long LFSR this approach generally uses much less
resources than the wide gate.  For a "safe mode" LFSR, you
would detect N-1 1's in a row and inject a 0 into bit 1
on the next cycle to prevent lock-up.
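
To make the "safe mode" idea concrete, here is a small Python model of
a 4-bit XNOR LFSR with a run-length counter in place of the wide gate.
It is only a sketch: the taps (Q4 xnor Q3) follow the 4-bit case quoted
above, while the names `step`, `run_lfsr` and `ones_run` are invented
for illustration. The counter tracks bits shifted in, so after a bad
power-up state it can take up to N clocks to escape.

```python
N = 4
TAPS = (4, 3)          # feedback is Q4 xnor Q3, as in the 4-bit case above
MASK = (1 << N) - 1


def feedback(state):
    """XNOR of the tapped bits; register bit Qk is stored at position k-1."""
    a = (state >> (TAPS[0] - 1)) & 1
    b = (state >> (TAPS[1] - 1)) & 1
    return 1 ^ (a ^ b)


def step(state, ones_run):
    """Advance one clock; ones_run counts consecutive 1s shifted into Q1."""
    bit_in = feedback(state)
    if ones_run >= N - 1:
        bit_in = 0     # N-1 ones in a row: inject a 0 to avoid lock-up
    ones_run = ones_run + 1 if bit_in else 0
    state = ((state << 1) | bit_in) & MASK
    return state, ones_run


def run_lfsr(state, clocks):
    """Return the list of states visited over the given number of clocks."""
    ones_run = 0
    seen = []
    for _ in range(clocks):
        state, ones_run = step(state, ones_run)
        seen.append(state)
    return seen


normal = run_lfsr(0b0000, 15)     # visits 15 distinct states, never 0b1111
recovered = run_lfsr(0b1111, 4)   # stuck for three clocks, then escapes
```

In this 4-bit case the injected 0 coincides with what the feedback
produces anyway during normal counting, so the counter only changes
behaviour when the register starts in the all-ones state. The counter
needs about log2(N) bits, against an N-input gate for the decode.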

Regards,
Gabor

Article: 139057
Subject: Re: Zero operand CPUs
From: Jacko <jackokring@gmail.com>
Date: Thu, 19 Mar 2009 06:26:26 -0700 (PDT)

On 19 Mar, 00:38, rickman <gnu...@gmail.com> wrote:
> On Mar 18, 1:54 pm, Jacko <jackokr...@gmail.com> wrote:
>
>
>
>
>
> > On 18 Mar, 16:59, rickman <gnu...@gmail.com> wrote:
>
> > > On Mar 18, 8:36 am, Jacko <jackokr...@gmail.com> wrote:
>
> > > > \ FORTH Assembler for nibz
> > > > \
> > > > \ Copyright (C) 2006,2007,2009 Free Software Foundation, Inc.
>
> > > > \ This file is part of Gforth.
>
> > > > \ Gforth is free software; you can redistribute it and/or
> > > > \ modify it under the terms of the GNU General Public License
> > > > \ as published by the Free Software Foundation, either version 3
> > > > \ of the License, or (at your option) any later version.
>
> > > > \ This program is distributed in the hope that it will be useful,
> > > > \ but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > \ GNU General Public License for more details.
>
> > > > \ You should have received a copy of the GNU General Public License
> > > > \ along with this program. If not, see http://www.gnu.org/licenses/.
> > > > \
> > > > \ Autor:          Simon Jackson, BEng.
> > > > \
> > > > \ Information:
> > > > \
> > > > \ - Simple Assembler
>
> > > > \ only forth definitions
>
> > > > require asm/basic.fs
>
> > > >  also ASSEMBLER definitions
>
> > > > require asm/target.fs
>
> > > >  HERE                   ( Begin )
>
> > > > \ The assembler is very simple. All 16 opcodes are
> > > > \ defined immediate so they can be inlined into colon defs.
>
> > > > \ primary opcode constant writers
>
> > > > : BA 0 , ; immediate
> > > > : FI 1 , ; immediate
> > > > : RI 2 , ; immediate
> > > > : SI 3 , ; immediate
>
> > > > : DI 4 , ; immediate
> > > > : FA 5 , ; immediate
> > > > : RA 6 , ; immediate
> > > > : SA 7 , ; immediate
>
> > > > : BO 8 , ; immediate
> > > > : FO 9 , ; immediate
> > > > : RO 10 , ; immediate
> > > > : SO 11 , ; immediate
>
> > > > : SU 12 , ; immediate
> > > > : FE 13 , ; immediate
> > > > : RE 14 , ; immediate
> > > > : SE 15 , ; immediate
>
> > > >  HERE  SWAP -
> > > >  CR .( Length of Assembler: ) . .( Bytes ) CR
>
> > > What instruction is a CALL?  How do you specify the address?  How do
> > > you specify literal data?
>
> > > Rick
>
> > $addr ,
>
> How many bits are in an opcode, 4 or 5?  I would say it has to be five
> or there is no way for the machine to distinguish between an opcode
> and an address.  In other words, there *has* to be a CALL instruction,
> even if it is just a one bit opcode with the rest being the address.
>
> Rick

if(instructionRegister<16) doOpcode(instructionRegister) else
doSubroutine(instructionRegister);
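
Read literally, that rule says any word below 16 is one of the sixteen
opcodes from the assembler listing and anything else is a subroutine
address. A small Python sketch of the decode follows; treating a word
as a plain unsigned integer and the name `decode` are assumptions made
here, and the mnemonics are simply those from the listing in order.

```python
# Mnemonics in opcode order 0..15, taken from the assembler listing.
OPCODES = ["BA", "FI", "RI", "SI", "DI", "FA", "RA", "SA",
           "BO", "FO", "RO", "SO", "SU", "FE", "RE", "SE"]


def decode(word):
    """Classify one instruction word as a primitive opcode or a call."""
    if word < 0:
        raise ValueError("instruction words are unsigned")
    if word < 16:
        return ("opcode", OPCODES[word])
    return ("call", word)  # the whole word is the subroutine address


examples = [decode(w) for w in (0, 15, 16, 0x1234)]
```

One consequence of this scheme is that addresses 0 to 15 can never be
call targets, because they decode as opcodes.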

cheers jacko

Article: 139058
Subject: Exporting AccelDSP generated Fixed Point C-Code to MicroSoft Visual
From: Moazzam <moazzamhussain@gmail.com>
Date: Thu, 19 Mar 2009 06:35:10 -0700 (PDT)

Hi,
I am using Xilinx AccelDSP Synthesis tool to rapidly prototype image
processing algorithms. In one of my tasks, I need to incorporate the
code generated from AccelDSP to my existing project in Microsoft
Visual Studio as a function call.

I simply added all the source files (generated by AccelDSP) to my
Visual Studio project, and a few compile-time errors made me add some
extra header files from the AccelDSP and Matlab directories. After
adding all of the header files, Visual Studio gives error messages
within the code.

Am I missing something? Has anyone successfully incorporated
AccelDSP-generated code into an existing Visual Studio project?

Regards
Moazzam

Article: 139059
Subject: Re: Zero operand CPUs
From: "Antti.Lukats@googlemail.com" <Antti.Lukats@googlemail.com>
Date: Thu, 19 Mar 2009 06:40:59 -0700 (PDT)

On Mar 19, 3:26 pm, Jacko <jackokr...@gmail.com> wrote:
> On 19 Mar, 00:38, rickman <gnu...@gmail.com> wrote:
>
>
>
>
>
> > On Mar 18, 1:54 pm, Jacko <jackokr...@gmail.com> wrote:
>
> > > On 18 Mar, 16:59, rickman <gnu...@gmail.com> wrote:
>
> > > > What instruction is a CALL?  How do you specify the address?  How do
> > > > you specify literal data?
>
> > > > Rick
>
> > > $addr ,
>
> > How many bits are in an opcode, 4 or 5?  I would say it has to be five
> > or there is no way for the machine to distinguish between an opcode
> > and an address.  In other words, there *has* to be a CALL instruction,
> > even if it is just a one bit opcode with the rest being the address.
>
> > Rick
>
> if(instructionRegister<16) doOpcode(instructionRegister) else
> doSubroutine(instructionRegister);
>
> cheers jacko

and how wide is instruction register?


Antti

Article: 139060
Subject: Re: Documenting a simple CPU
From: rickman <gnuarm@gmail.com>
Date: Thu, 19 Mar 2009 07:05:52 -0700 (PDT)

On Mar 19, 5:00 am, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com>
wrote:
> On Thu, 19 Mar 2009 01:14:31 -0700 (PDT), Jim Granville wrote:
> >Err, with no assembler, how do you run the compiled cpu ?
>
> Hand-coding.

In octal no doubt...

> Yup, really.  One of the interesting
> benefits of a super-simple instruction set is that
> this can be done, for small programs - and small
> programs is all I've ever run on it (see below).
>
> >Good docs, but missing is a resource report ?
>
> Irrelevant in the target application (teaching
> HDL syntax and techniques.  Around 450-500
> logic cells (4-LUT+FF) in typical FPGAs;
> about 90MHz system clock rate; instructions
> take between 3 and 7 system clocks to execute

If they don't get this machine to run in two clock cycles per
instruction, they aren't trying!

> if the APB-connected memory has no wait states.

Oops... in an FPGA I see no reason to connect memory via APB.
Distributed ram for the registers will allow all 8 registers to be
implemented using 32 LUTs and 48 registers.  32 LUTs and 32 regs are
used as dual port ram for up to 16 regs (I see a way to have fast
context switching when you add the interrupt) and another 16 regs for
a duplicate copy of R7 as PC.

Cycle 1 would use the current content of the IR to read up to two
registers and perform whatever op is specified.  This result would be
held in a temp register.  On cycle 2 the temp data would be written to
the destination, either a register or memory and the next instruction
fetched.  How to fetch the next instruction when the PC is being
updated on this cycle?  Use a mux in the path to memory so it either
uses the current PC or the PC being calculated when the destination is
R7.

Of course I may have missed something, although with the simplicity of
your instruction set, I think this will work.  But then I took out the
APB.  I guess your students don't get to do that...

Another optimization for the real world, I see two inefficiencies.
One is that a mux has to be added to the register address path so that
one of the two port addresses will come from IR(5 downto 3) or IR(11
downto 9).  On memory instructions I would split the offset field and
swap the upper three bits (octal digit) with the register values so
that the registers being read are always in the same bits of the
instruction.  Put the offset bits, IR(5 downto 3) in IR(11 downto 9),
source reg in IR(8 downto 6) and the upper digit of the offset in IR
(11 downto 9).  This lets you leave out all but one multiplexer in the
register address path.

That last mux in the register address path is because you have dual
port memory and three addresses.  Source addresses have to be provided
in cycle 1 and target address in cycle 2.  So get rid of one field,
the target, and that mux goes away too.  Use one register for both
source and target and you get back three bits.  You could have 16
registers and still have a bit left over, which I am sure someone can
find a good use for.  An added bonus of this optimization is that I
think you can reduce the clock cycles to 1 since you no longer need to
mux a register address between source and target.  I guess a read of
memory still requires a second clock cycle...

> The RTL implementation is pretty dumb, and
> could easily be made much faster and tighter.
> The only criterion for the present implementation
> was that the RTL CPU should be synthesisable.
>
> [snip sundry interesting comments]
>
> But the purpose of this design was to create
> a piece of Verilog code that does interesting
> things, could be modified (specifically, have
> SystemVerilog language features grafted on),
> and was small enough for students to find
> the relevant bits easily in a 50-minute
> lab session.  I'm actually working on a
> real-world version, for my own amusement,
> but Ye Olde Original does what it aimed
> to do and I don't plan on fixing it :-)

If it ain't really broke...

Rick

Article: 139061
Subject: Re: Bullshit! - Re: Zero operand CPUs
From: rickman <gnuarm@gmail.com>
Date: Thu, 19 Mar 2009 07:12:29 -0700 (PDT)

On Mar 19, 4:52 am, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
Murray) wrote:
> >I am finding that none of this is truly obvious and may not always be
> >true.  Real world testing and comparisons are in order, real tests
> >that we can all see and understand...
>
> Back in the 70s, Xerox had the Mesa world running on Altos.
> It was a stack based architecture.  The goal was to reduce
> code space.  (That was back before people figured out that
> Moore's law was going to make code size not very interesting.)

Not entirely true.  Code size is still an issue when a CPU is being
built in a small FPGA or there are a number of them in most any FPGA.
In an FPGA, memory is still a limited resource.


> Given the available technology of the time, it worked great.
>
> In addition to the stack (I think it was 5 or 6 registers)
> there was a PC, a module pointer for global variables, and a
> frame pointer for this procedure context.
>
> The opcodes were implemented in microcode rather than gates,
> so there was a lot of flexibility in assigning values.
>
> Calls were fancy, but the simple case allocated a frame
> off the free list, setup the return link and such.
>
> Most of the opcodes were loads.  It was a 16 bit system, but
> there was a lot of support for 32 bit arithmetic and pointers.
>
> Except in rare occasions when you were hacking on the system,
> we didn't care about the details of the architecture.  Code
> is code.  The basic ideas don't change because the architecture
> changes.  You have loads, stores, loops, adds, muls...
>   y = a*x+b
> turns into (handwave)
>   load a
>   load x
>   mul
>   load b
>   add
>   store y
>
> Some people call "load" push.  If a or b are constants,
> the load might be a load immediate...
>
> It might be a little weird if you wanted to write assembly code.
> I think I'd get used to it if I had some good examples to learn
> from.  (I've written quite a bit of microcode back in the old
> days and some Forth recently.)  If you have a good compiler
> you never think about that stuff.
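
The load/mul/add sequence quoted above can be traced with a toy stack
interpreter. This is a generic Python sketch, not a model of the Mesa
instruction set; the tuple encoding and the name `run_stack` are made
up for illustration.

```python
def run_stack(program, variables):
    """Execute (op, operand) pairs on an expression stack."""
    stack = []
    for op, operand in program:
        if op == "load":                 # some people call this "push"
            stack.append(variables[operand])
        elif op == "store":
            variables[operand] = stack.pop()
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unknown op {op!r}")
    return variables


# y = a*x + b, in the same order as the quoted listing
program = [("load", "a"), ("load", "x"), ("mul", None),
           ("load", "b"), ("add", None), ("store", "y")]
result = run_stack(program, {"a": 3, "x": 4, "b": 5})
```

None of the arithmetic ops names an operand, which is why such code is
compact: only the loads and stores carry an address.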

Why write in assembly then?  I seem to recall that HP made some
machines that were stack based.  A friend had a job of digging through
core dumps to figure out why a program crashed.  I don't remember the
details, but he didn't do that for long before he got promoted out of
there.

Rick

Article: 139062
Subject: Re: Zero operand CPUs
From: rickman <gnuarm@gmail.com>
Date: Thu, 19 Mar 2009 07:21:32 -0700 (PDT)

On Mar 19, 9:26 am, Jacko <jackokr...@gmail.com> wrote:
> On 19 Mar, 00:38, rickman <gnu...@gmail.com> wrote:
>
>
>
> > On Mar 18, 1:54 pm, Jacko <jackokr...@gmail.com> wrote:
>
> > > On 18 Mar, 16:59, rickman <gnu...@gmail.com> wrote:
>
> > > > What instruction is a CALL?  How do you specify the address?  How do
> > > > you specify literal data?
>
> > > > Rick
>
> > > $addr ,
>
> > How many bits are in an opcode, 4 or 5?  I would say it has to be five
> > or there is no way for the machine to distinguish between an opcode
> > and an address.  In other words, there *has* to be a CALL instruction,
> > even if it is just a one bit opcode with the rest being the address.
>
> > Rick
>
> if(instructionRegister<16) doOpcode(instructionRegister) else
> doSubroutine(instructionRegister);

Ok, I'll bite, how wide is the instruction register and how does it
get loaded?

Getting straight (and complete) answers out of you is torture.

Rick

Article: 139063
Subject: Re: Zero operand CPUs
From: Jacko <jackokring@gmail.com>
Date: Thu, 19 Mar 2009 07:28:07 -0700 (PDT)

On 19 Mar, 13:40, "Antti.Luk...@googlemail.com"
<Antti.Luk...@googlemail.com> wrote:
> On Mar 19, 3:26 pm, Jacko <jackokr...@gmail.com> wrote:
>
>
>
>
>
> > On 19 Mar, 00:38, rickman <gnu...@gmail.com> wrote:
>
> > > On Mar 18, 1:54 pm, Jacko <jackokr...@gmail.com> wrote:
>
> > > > On 18 Mar, 16:59, rickman <gnu...@gmail.com> wrote:
>
> > > > > What instruction is a CALL?  How do you specify the address?  How do
> > > > > you specify literal data?
>
> > > > > Rick
>
> > > > $addr ,
>
> > > How many bits are in an opcode, 4 or 5?  I would say it has to be five
> > > or there is no way for the machine to distinguish between an opcode
> > > and an address.  In other words, there *has* to be a CALL instruction,
> > > even if it is just a one bit opcode with the rest being the address.
>
> > > Rick
>
> > if(instructionRegister<16) doOpcode(instructionRegister) else
> > doSubroutine(instructionRegister);
>
> > cheers jacko
>
> and how wide is instruction register?
>
> Antti

Hi

The instruction register width, like all register widths, is controlled
by the generic parameter wide. If wide is 16, then the registers,
datapaths, addressable element size, ALU and instruction width are all
16 bits; in fact every std_logic_vector of relevance has this generic
width.

So if the generic wide is set to 4096, then a 4096-bit microprocessor
is rendered. Note the ALU will be slow until a lot of if-generate logic
is written in the VHDL.

As the program and data memory address size is n bits and each
addressable element is n bits, the memory size is n * 2^n bits.

The useful trick of dividing the memory into optimized sections, such
as a microcode section only 4 bits wide, saves on bits and means no
instruction unpack logic is needed (so no delay from it), yet the same
density is achieved, or better!!

This packing of the core 'microcode' to 4 bits is the main reason for
not having literal fetch as one might expect.

Then there is the next code layer, which is a threading list of
subroutine addresses and possibly literal values. This can be
compacted by mapping m-bit entries to n-bit values (m < n), i.e. compression.
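
As a rough illustration of the dispatch rule quoted earlier in this thread (a word below 16 is one of the primary opcodes, anything else is treated as a subroutine address), here is a minimal Python sketch. The mnemonic order is taken from the assembler listing; the name `classify` and the tuples it returns are made up for illustration and are not part of the nibz sources.

```python
# Opcode mnemonics in numeric order, as defined in the assembler listing.
MNEMONICS = ["BA", "FI", "RI", "SI", "DI", "FA", "RA", "SA",
             "BO", "FO", "RO", "SO", "SU", "FE", "RE", "SE"]

def classify(word):
    """Hypothetical helper: decide how one n-bit thread word is treated."""
    if word < 16:
        return ("opcode", MNEMONICS[word])
    return ("call", word)  # any other value is taken as a subroutine address

if __name__ == "__main__":
    for w in (0x0005, 0x000F, 0x0123):
        print(hex(w), classify(w))
```

Note that under this rule the addresses 0 to 15 can never be called, since they are indistinguishable from opcodes.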

Then you have a data space of n bit wide memory.

Any other generics wanted??

cheers jacko

Article: 139064
Subject: Re: Zero operand CPUs
From: rickman <gnuarm@gmail.com>
Date: Thu, 19 Mar 2009 07:36:48 -0700 (PDT)
Links: << >> << T >> << A >>

On Mar 19, 10:21 am, rickman <gnu...@gmail.com> wrote:
> On Mar 19, 9:26 am, Jacko <jackokr...@gmail.com> wrote:
>
>
>
> > On 19 Mar, 00:38, rickman <gnu...@gmail.com> wrote:
>
> > > On Mar 18, 1:54 pm, Jacko <jackokr...@gmail.com> wrote:
>
> > > > On 18 Mar, 16:59, rickman <gnu...@gmail.com> wrote:
>
> > > > > On Mar 18, 8:36 am, Jacko <jackokr...@gmail.com> wrote:
>
> > > > > > \ FORTH Assembler for nibz
> > > > > > \
> > > > > > \ Copyright (C) 2006,2007,2009 Free Software Foundation, Inc.
>
> > > > > > \ This file is part of Gforth.
>
> > > > > > \ Gforth is free software; you can redistribute it and/or
> > > > > > \ modify it under the terms of the GNU General Public License
> > > > > > \ as published by the Free Software Foundation, either version 3
> > > > > > \ of the License, or (at your option) any later version.
>
> > > > > > \ This program is distributed in the hope that it will be useful,
> > > > > > \ but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > > > > \ GNU General Public License for more details.
>
> > > > > > \ You should have received a copy of the GNU General Public License
> > > > > > > \ along with this program. If not, see http://www.gnu.org/licenses/.
> > > > > > \
> > > > > > \ Autor:          Simon Jackson, BEng.
> > > > > > \
> > > > > > \ Information:
> > > > > > \
> > > > > > \ - Simple Assembler
>
> > > > > > \ only forth definitions
>
> > > > > > require asm/basic.fs
>
> > > > > >  also ASSEMBLER definitions
>
> > > > > > require asm/target.fs
>
> > > > > >  HERE                   ( Begin )
>
> > > > > > \ The assembler is very simple. All 16 opcodes are
> > > > > > \ defined immediate so they can be inlined into colon defs.
>
> > > > > > \ primary opcode constant writers
>
> > > > > > : BA 0 , ; immediate
> > > > > > : FI 1 , ; immediate
> > > > > > : RI 2 , ; immediate
> > > > > > : SI 3 , ; immediate
>
> > > > > > : DI 4 , ; immediate
> > > > > > : FA 5 , ; immediate
> > > > > > : RA 6 , ; immediate
> > > > > > : SA 7 , ; immediate
>
> > > > > > : BO 8 , ; immediate
> > > > > > : FO 9 , ; immediate
> > > > > > : RO 10 , ; immediate
> > > > > > : SO 11 , ; immediate
>
> > > > > > : SU 12 , ; immediate
> > > > > > : FE 13 , ; immediate
> > > > > > : RE 14 , ; immediate
> > > > > > : SE 15 , ; immediate
>
> > > > > >  HERE  SWAP -
> > > > > >  CR .( Length of Assembler: ) . .( Bytes ) CR
>
> > > > > What instruction is a CALL?  How do you specify the address?  How do
> > > > > you specify literal data?
>
> > > > > Rick
>
> > > > $addr ,
>
> > > How many bits are in an opcode, 4 or 5?  I would say it has to be five
> > > or there is no way for the machine to distinguish between an opcode
> > > and an address.  In other words, there *has* to be a CALL instruction,
> > > even if it is just a one bit opcode with the rest being the address.
>
> > > > Rick
>
> > if(instructionRegister<16) doOpcode(instructionRegister) else
> > doSubroutine(instructionRegister);
>
> Ok, I'll bite, how wide is the instruction register and how does it
> get loaded?
>
> Getting straight (and complete) answers out of you is torture.
>
> Rick

Ok, I did the digging and found your instruction set doc as well as
the HDL.  I found that the IR is as wide as the rest of the machine.
This means that each instruction is N bits wide.  So each of the 16
instructions that is not a call is encoded as 0x000X on a 16 bit
machine.  That is a pretty durn inefficient use of program memory.
Your code size is going to suffer rather severely.  Not only is each
instruction large for a MISC machine, but because you only have 16
basic ops, you will need a lot of them.
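
To put numbers on this, a small Python sketch follows. It assumes, as described above, that every non-call instruction occupies a full machine word with only the low 4 bits carrying information; `wasted_bits` is a hypothetical helper name.

```python
def wasted_bits(width, opcode_bits=4):
    """Bits of an instruction word that carry no opcode information."""
    return width - opcode_bits

for width in (16, 32):
    unused = wasted_bits(width)
    print(f"{width}-bit word: {unused} unused bits ({unused / width:.1%})")
```

On a 16 bit machine that is 12 of the 16 bits in every primitive instruction.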

So it would seem that this machine is slow (multiple cycles to execute
one instruction and lots of instructions to do anything useful) as
well as inefficient in its use of code space.  Not what I would want to
consider for an ASIC, although if the docs are good enough it could be
worth it...  ;^)

Rick

Article: 139065
Subject: Re: Zero operand CPUs
From: "Antti.Lukats@googlemail.com" <Antti.Lukats@googlemail.com>
Date: Thu, 19 Mar 2009 07:37:30 -0700 (PDT)
Links: << >> << T >> << A >>

On Mar 19, 4:28 pm, Jacko <jackokr...@gmail.com> wrote:
> Hi
>
> The instruction register width, like all register widths, is controlled
> by the generic parameter wide. If wide is 16, then the registers,
> datapaths, addressable element size, ALU and instruction width, in fact
> every std_logic_vector of relevance, are 16 bits wide.
>
> So if the generic wide is set to 4096, then a 4096-bit microprocessor
> is rendered. Note the ALU will be slow until a lot of if-generate logic
> is written in the VHDL.
>
> As the program and data memory address size is n bits and each
> addressable element is n bits, the memory size is n * 2^n bits.
>
> The useful trick of dividing the memory into optimized sections, such
> as a microcode section only 4 bits wide, saves on bits and means no
> instruction unpack logic is needed (so no delay from it), yet the same
> density is achieved, or better!!
>
> This packing of the core 'microcode' to 4 bits is the main reason for
> not having literal fetch as one might expect.
>
> Then there is the next code layer, which is a threading list of
> subroutine addresses and possibly literal values. This can be
> compacted by mapping m-bit entries to n-bit values (m < n), i.e. compression.
>
> Then you have a data space of n bit wide memory.
>
> Any other generics wanted??
>
> cheers jacko

oooo my god

so if you have 32 bit wide datapath
then you use 4 bits as instruction and WASTE 28 bits of the
instruction width?

this is soooo stupid, i could not take that option seriously into account;
that is the reason why i asked how wide the instruction is!

Antti

Article: 139066
Subject: Re: Documenting a simple CPU
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Thu, 19 Mar 2009 14:51:04 +0000
Links: << >> << T >> << A >>

On Thu, 19 Mar 2009 07:05:52 -0700 (PDT), rickman wrote:

>If they don't get this machine to run in two
>clock cycles per instruction, they aren't trying!

I didn't, and you're right - I wasn't trying!

Students don't work on the DESIGN of this CPU.
They take the existing code and make various
point modifications on it, without changing
the overall shape of the design.  Although
the instruction set architecture is derived
from something that's been brewing in my head
for a long time, the implementation was rather
carefully tweaked to provide opportunities for
adding certain SystemVerilog features; I gave
absolutely no priority to making it efficient.

Errors in students' modifications of the design
rather obviously turn up as errors in the CPU's
behaviour, ranging from small arithmetic errors
through to the usual off-into-the-undergrowth.
Our testbench picks up most errors easily enough.

>That last mux in the register address path is because you have dual
>port memory and three addresses.  Source addresses have to be provided
>in cycle 1 and target address in cycle 2.  So get rid of one field,
>the target, and that mux goes away too.  Use one register for both
>source and target and you get back three bits.  You could have 16
>registers and still have a bit left over, which I am sure someone can
>find a good use for.

Funny... this is not the first time you have rather
quickly spotted something that it had taken me 
quite a while to work out for myself.  Yes, you're
right; making it a 2-operand architecture has useful
effects on the instruction format, and going to
16 registers most certainly IS useful.  As I said
before, I'm working on it.... but, as I also said,
only for my own amusement.  It's all been done 
before, better than I ever could.
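
The field arithmetic behind this can be checked with a few lines of Python. This is only a sketch: it assumes the original format has three 3-bit register fields (eight registers), which is inferred from the "get back three bits" remark rather than stated in the thread, and `field_bits` is a made-up helper.

```python
def field_bits(num_registers, num_operands):
    """Bits needed for the register fields of one instruction."""
    bits_per_field = (num_registers - 1).bit_length()
    return bits_per_field * num_operands

three_operand = field_bits(8, 3)    # assumed original: 3 fields x 3 bits
two_operand_8 = field_bits(8, 2)    # same registers, target field dropped
two_operand_16 = field_bits(16, 2)  # 16 registers, two fields

print(three_operand - two_operand_8)   # bits freed by dropping one field
print(three_operand - two_operand_16)  # bits left over with 16 registers
```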

Thanks again
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which 
are not the views of Doulos Ltd., unless specifically stated.

Article: 139067
Subject: Re: Zero operand CPUs
From: Jacko <jackokring@gmail.com>
Date: Thu, 19 Mar 2009 07:56:29 -0700 (PDT)
Links: << >> << T >> << A >>

Hi

> > The useful trick of dividing the memory into optimized sections, such
> > as a microcode section only 4 bits wide, saves on bits and means no
> > instruction unpack logic is needed (so no delay from it), yet the same
> > density is achieved, or better!!
>
> > This packing of the core 'microcode' to 4 bits is the main reason for
> > not having literal fetch as one might expect.

0x000X, stored in the 4 bit memory, is just 0xX

cheers jacko

Article: 139068
Subject: Re: Xilinx XAPP052 LFSR and its understanding
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Thu, 19 Mar 2009 08:52:31 -0700 (PDT)
Links: << >> << T >> << A >>

On Mar 19, 12:46 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Weng Tianxiang <wtx...@gmail.com> wrote:
> > The reason I want to exclude the dead-lock situation is that in my
> > project, I use the random number generator to generate random number
> > to detect design errors. If there is an error, my design will detect
> > it. But if all numbers generated are the same from some point, there
> > is no error generated and my testing is just waiting time, giving a
> > false correct indicator.
> > But many zeros in seed may guarantee that the situation of all '1'
> > will never happen.
>
> Any zeros in the seed will guarantee that it never happens.
>
> The only way to get to the all ones state is to start there.
> (Well, there is also cosmic rays going through and changing
> the bits, but if that happens you have other problems, too.)
>
> That does assume a properly designed LFSR.  If you randomly
> choose taps it is likely that you get one with short cycles.
>
> -- glen

glen,
"Any zeros in the seed will guarantee that it never happens."

Your claim is wrong.

My LFSR is based on the famous paper Xilinx XAPP052 authored by Peter
Alfke:
"Efficient Shift Register, LFSR Counters, and Long Pseudo-Random
Sequence Generators". For a 63-bit LFSR, the first-bit input is the
XNOR of the 62nd and 63rd bits.

Actually, we don't even have to look at the structure of the LFSR.

If an LFSR claims to behave well, it is natural for it to generate the
all-'1' state. The state before it certainly has a '0' in it.

Weng

Article: 139069
Subject: Re: Xilinx XAPP052 LFSR and its understanding
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Thu, 19 Mar 2009 09:32:32 -0700 (PDT)
Links: << >> << T >> << A >>

On Mar 19, 5:52 am, gabor <ga...@alacron.com> wrote:
> On Mar 18, 11:39 pm, Peter Alfke <al...@sbcglobal.net> wrote:
>
> > On Mar 18, 10:04 am, Weng Tianxiang <wtx...@gmail.com> wrote:
>
> > > Hi,
> > > I want to generate a random number in a FPGA chip and in a natural way
> > > it leads to a famous paper Xilinx XAPP052 authored by Peter Alfke:
> > > "Efficient Shift Register, LFSR Counters, and Long Pseudo-Random
> > > Sequence Generators".
>
> > > I have two problems with the note.
>
> > > 1. I don't understand "Divide-by-5 to 16 counter". I appreciate if
> > > someone explain the Table 2 and Figure 2 in more details.
>
> > > 2. In Figure 5, there is an equation: (Q3 xnor Q4) xor (Q1 and Q2 and
> > > Q3 and Q4).
> > > (Q3 xnor Q4) is required to generate 4-bit LFSR counter. (Q1 and Q2
> > > and Q3 and Q4) is used to avoid a dead-lock situation from happening
> > > when Q1-Q4 = '1'.
>
> > > Now the 4-bit LFSR counter dead-lock situation should be extended to
> > > any bits long LFSR counter if 2 elements XNOR operation is needed.
> > > Especially in Figure 5 for 63-bit LFSR counter. When all 63-bits are
> > > '1', it would be dead-locked into the all '1' position, because (Q62
> > > xnor Q63) = '1' if both Q63 and Q62 are '1'.  But the situation is
> > > excluded into the equation in Figure 5.
>
> > > In another words, if a seed data is closing or equal to all '1'
> > > situation, the LFSR is a shorter random number generator than its
> > > claim of a 63-bit length generator. There is no way to exactly know if
> > > a seed data is closing to all '1' situation.
>
> > > We can add logic equation as 4-bit situation does as follows:
> > > (Q62 xnor Q63) xor (Q1 and Q2 and ... and Q63).
>
> > > There is a new question: If there is a more clever idea to do the same
> > > things to avoid the 63-bit dead-lock situation from happening?
>
> > > Weng
>
> > Weng, there is no mystery. All the LFSRs that I described count by (2
> > exp n)-1, since they naturally will never get into the all-ones state.
> > If you want them to include that state, you need to decode the state
> > one prior, use one gate to invert the input, and the same gate gets
> > you out of it again. The "high" cost is that one very wide gate,
> > nothing else. LFSRs are well-documented. I had just dug up some old
> > information that I had generated at Fairchild Applications in the
> > 'sixties.
> > Peter Alfke
>
> Depending on the LFSR construction, you may be able to use a
> counter instead of a "wide gate".  For example, if your LFSR
> is the type where the XOR gates feed only bit 1, you just need
> to detect N successive 1's going into bit 1 (where N is the
> LFSR length) to find the state where all bits go high.  For
> a very long LFSR this approach generally uses much less
> resources than the wide gate.  For a "safe mode" LFSR, you
> would detect N-1 1's in a row and inject a 0 into bit 1
> on the next cycle to prevent lock-up.
>
> Regards,
> Gabor

Hi Gabor,
Your method is what I am interested in and your response is a hit.

Your method generates at most n identical values in a row (n = length of
the LFSR). For my project's purposes, it is a perfect fit.
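
A minimal Python sketch of the run-counter idea, as read from Gabor's description, is shown here on the 4-bit XNOR LFSR (bit 1 fed by Q3 xnor Q4) so that it can be checked by hand. The function name, the counter reset value and the choice to start the counter at zero are assumptions made for illustration, not anyone's posted code.

```python
def safe_lfsr_states(seed, n=4, taps=(4, 3), count=20):
    """Run an XNOR LFSR with a 'safe mode' run counter; return visited states."""
    mask = (1 << n) - 1
    state, run = seed & mask, 0           # run = consecutive 1s fed into bit 1
    out = []
    for _ in range(count):
        a = (state >> (taps[0] - 1)) & 1
        b = (state >> (taps[1] - 1)) & 1
        feedback = 1 ^ a ^ b              # XNOR of the two taps
        if run >= n - 1:                  # N-1 ones in a row already shifted in
            feedback = 0                  # inject a 0 to prevent lock-up
        run = run + 1 if feedback else 0
        state = ((state << 1) & mask) | feedback
        out.append(state)
    return out

print(safe_lfsr_states(0b1111, count=6))  # starts in the lock-up state
print(safe_lfsr_states(0b0000, count=15)) # normal sequence from a good seed
```

Started in the all-ones state, the register repeats that value until the counter has seen N-1 ones and then leaves it; started from a normal seed, the forced 0 coincides with the 0 the feedback would have produced anyway.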

But I will change your idea a little so that it generates no repeated
values.

!!!Thank you very much for your bright idea!!!

Later I will post my code for it; it should use the least amount of
Xilinx FPGA resources for the 63-bit LFSR:
a 63-bit shift register with an initial value.

Weng

Article: 139070
Subject: Re: Xilinx XAPP052 LFSR and its understanding
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Thu, 19 Mar 2009 16:49:21 +0000 (UTC)
Links: << >> << T >> << A >>

Weng Tianxiang <wtxwtx@gmail.com> wrote:
(snip, I wrote)
> "Any zeros in the seed will guarantee that it never happens."

> Your claim is wrong.

> My LFSR is based on the famous paper Xilinx XAPP052 authored by Peter
> Alfke:
> "Efficient Shift Register, LFSR Counters, and Long Pseudo-Random
> Sequence Generators". For a 63-bit LFSR, the first-bit input is the
> XNOR of the 62nd and 63rd bits.

A good N bit LFSR has one cycle of length 2**N-1 and one cycle
of length 1.  As I understand it, the length-1 cycle for these
is supposed to be the all 1 state.  Whichever cycle you start in
you stay in, and that should not be the length-1 cycle for good
random numbers.
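
The two cycles are easy to see on a small case. The sketch below simulates the 4-bit XNOR LFSR discussed in this thread (bit 1 fed by Q3 xnor Q4); the bit ordering (Q1 as the least significant bit) and the function names are choices made for this example only.

```python
def lfsr_step(state, n=4, taps=(4, 3)):
    """One shift of an n-bit LFSR whose bit-1 input is the XNOR of two taps."""
    a = (state >> (taps[0] - 1)) & 1
    b = (state >> (taps[1] - 1)) & 1
    feedback = 1 ^ a ^ b                       # XNOR
    return ((state << 1) & ((1 << n) - 1)) | feedback

def cycle_length(seed, n=4, taps=(4, 3)):
    """Number of steps until the LFSR returns to its seed."""
    state, steps = lfsr_step(seed, n, taps), 1
    while state != seed:
        state = lfsr_step(state, n, taps)
        steps += 1
    return steps

print(cycle_length(0b0000))  # the long cycle
print(cycle_length(0b1111))  # the all-ones lock-up state
```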

> Actually, we don't even have to look at the structure of the LFSR.

> If an LFSR claims to behave well, it is natural for it to generate the
> all-'1' state. The state before it certainly has a '0' in it.

If the all '1' state is a cycle of length one then the state
before it is also all '1's.  It isn't hard to compute what the
state before is, and the state after.

The description in "Numerical Recipes" includes some of
the math, but is reasonably readable, too.  You might look
at that one.

-- glen

Article: 139071
Subject: Re: Documenting a simple CPU
From: Jecel <jecel@merlintec.com>
Date: Thu, 19 Mar 2009 10:11:25 -0700 (PDT)
Links: << >> << T >> << A >>

On Mar 19, 6:03 am, Jonathan Bromley wrote:
> There is an embarrassingly large amount of redundancy
> and overlap in the instruction set.  I'm working on it :-)

I did notice that all 10XX instructions seem to be free, so you have
room to grow (though it is already nice as it is).

Someone had mentioned the Cgen tool in another thread (and now Jon has
done so here too) and then I couldn't remember the name of the tool,
so when searching for it I came across:

http://archc.sourceforge.net/

I have not tested it yet, but it looks like a reasonable way to get a
simulator and assembler from a compact description of an architecture.

About 2 vs 3 addresses, I liked Jan Gray's designs
(http://www.fpgacpu.org/) where he has one 3 address instruction (ADD)
and all others are 2 addresses. This matches actual use much better
than a more uniform design.

-- Jecel

Article: 139072
Subject: Re: Xilinx XAPP052 LFSR and its understanding
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Thu, 19 Mar 2009 18:44:51 +0000 (UTC)
Links: << >> << T >> << A >>

Weng Tianxiang <wtxwtx@gmail.com> wrote:
(snip) 

> Actually, we don't even have to look at the structure of the LFSR.

> If an LFSR claims to behave well, it is natural for it to generate the
> all-'1' state. The state before it certainly has a '0' in it.

One thing to realize about LFSRs, which are a subset of
possible state machines: each state has exactly one successor
and exactly one predecessor.  It isn't hard to figure out what
those states are from the list of taps.  (That is, from the
logic diagram.)  There are no cases where a sequence of
states runs into a small loop that it did not start in.
Whatever state you start in, you will eventually get back
to that state.

If the all '1' state is a cycle of length one then
you know that no other starting state will ever get
to that state.  That isn't true of all state
machines, but it is true of LFSRs.
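
This can be verified exhaustively for a small register. The sketch below, again for the 4-bit XNOR LFSR with taps on bits 4 and 3, checks that the step function is a permutation of the 16 states and splits the states into cycles. The helper names are hypothetical, and taps on bits n and n-1 are only maximal-length for some register lengths, so the default n=4 is the case being demonstrated.

```python
def lfsr_step(state, n=4):
    """Shift left; bit 1 gets the XNOR of bits n and n-1."""
    top = (state >> (n - 1)) & 1
    next_top = (state >> (n - 2)) & 1
    return ((state << 1) & ((1 << n) - 1)) | (1 ^ top ^ next_top)

def cycles(n=4):
    """Split all 2**n states into the cycles of the step function."""
    seen, result = set(), []
    for start in range(1 << n):
        if start in seen:
            continue
        cycle, state = [], start
        while state not in seen:
            seen.add(state)
            cycle.append(state)
            state = lfsr_step(state, n)
        result.append(cycle)
    return result

successors = [lfsr_step(s) for s in range(16)]
print(sorted(successors) == list(range(16)))  # one predecessor per state
print(sorted(len(c) for c in cycles()))       # cycle lengths
```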

-- glen

Article: 139073
Subject: Re: Xilinx XAPP052 LFSR and its understanding
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Thu, 19 Mar 2009 12:13:47 -0700 (PDT)
Links: << >> << T >> << A >>

On Mar 19, 9:49 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Weng Tianxiang <wtx...@gmail.com> wrote:
>
> (snip, I wrote)
>
> > "Any zeros in the seed will guarantee that it never happens."
> > Your claim is wrong.
> > My LFSR is based on the famous paper Xilinx XAPP052 authored by Peter
> > Alfke:
> > "Efficient Shift Register, LFSR Counters, and Long Pseudo-Random
> > Sequence Generators". For a 63-bit LFSR, the first-bit input is the
> > XNOR of the 62nd and 63rd bits.
>
> A good N bit LFSR has one cycle of length 2**N-1 and one cycle
> of length 1.  As I understand it, the length-1 cycle for these
> is supposed to be the all 1 state.  Whichever cycle you start in
> you stay in, and that should not be the length-1 cycle for good
> random numbers.
>
> > Actually, we don't even have to look at the structure of the LFSR.
> > If an LFSR claims to behave well, it is natural for it to generate the
> > all-'1' state. The state before it certainly has a '0' in it.
>
> If the all '1' state is a cycle of length one then the state
> before it is also all '1's.  It isn't hard to compute what the
> state before is, and the state after.
>
> The description in "Numerical Recipes" includes some of
> the math, but is reasonably readable, too.  You might look
> at that one.
>
> -- glen

Hi glen,
You are right !!!

With the 1st-bit input generated by an XNOR of the 63rd and 62nd bits,
the sequence never includes the all-'1' state if its initial value is
not all '1's.

I browsed "Numerical Recipes", 2nd edition (I have it), and found that
its equations, implemented in hardware, are worse than the formulae
written by Peter Alfke.

Based on the formulae, we really can compute the number one cycle
before or one cycle after any given random number.
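
For the 63-bit case described above (bit 1 fed by the XNOR of bits 63 and 62), stepping forwards and backwards can be sketched as follows. The bit ordering (bit 1 as the least significant bit), the function names and the seed value are assumptions made for this illustration.

```python
N = 63
MASK = (1 << N) - 1

def next_state(s):
    """Shift left; bit 1 becomes the XNOR of bits 63 and 62."""
    fb = 1 ^ ((s >> (N - 1)) & 1) ^ ((s >> (N - 2)) & 1)
    return ((s << 1) & MASK) | fb

def prev_state(s):
    """Undo one shift by recovering the bit that was shifted out of bit 63."""
    old_top = 1 ^ (s & 1) ^ ((s >> (N - 1)) & 1)
    return (s >> 1) | (old_top << (N - 1))

seed = 0x0123456789ABCDEF & MASK
print(prev_state(next_state(seed)) == seed)  # the two steps are inverses
print(next_state(MASK) == MASK)              # all ones maps to itself
```

The backward step works because the old bit 62 is still visible as the new bit 63, so the lost bit can be solved for from the feedback that was shifted into bit 1.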

Thank you for your deep and excellent insight.

Weng

Article: 139074
Subject: Re: Zero operand CPUs
From: Jacko <jackokring@gmail.com>
Date: Thu, 19 Mar 2009 15:36:40 -0700 (PDT)
Links: << >> << T >> << A >>

hi

Chuck and the pope are in the design lab,

Chuck says to the pope "Have you got a rubber, my designs are getting big?"
Pope says "That's a bit RISCy!"

So if you had possibly 4 instructions to init the stack pointers and
save both as well, what would you use?

cheers jacko
