Messages from 54975

Article: 54975
Subject: Re: hardware implementation of viterbi decoder
From: Kai Harrekilde-Petersen <khp@vitesse.com>
Date: 23 Apr 2003 17:17:16 +0200

vikas_akalwadi@indiatimes.com (vikas) writes:

> hi all,
> I am presently working on a hardware implementation of a Viterbi decoder
> with constraint length K=7 and a soft decision width of 3. It would be
> very helpful if knowledgeable persons could answer my doubts
> regarding these issues:
> 
> 1. The ACS modules (64 in number in my case) are taking most of the
> area. Some documents mention modified ACS units, but I could not get
> those documents as they were restricted to IEEE members.
> 
> 2. To avoid overflow of the partial path metric values, I am doing
> normalisation, i.e., subtracting the lowest value from all the partial
> path metric values. Some documents mention "localised
> normalisation". How does that work?
> 
> 3. After survival data equal to the traceback depth has been stored, we
> start traceback. If we are to start the traceback with the lowest partial
> path metric, how do we determine that state if we do "localised
> normalisation"?
> 
> 4. What are the different techniques for the traceback operation?
> Since I am implementing it in hardware, functionality, area and timing
> are all very important.

Jens Sparsø and Steen Pedersen at the Technical University of Denmark
did a full-custom implementation of a Viterbi decoder back around
1990. They wrote several articles based on that, and we poor students
had it as a "standard case-study" for several years :-/

Maybe you could find something on their website.  I believe the right
website is http://www.imm.dtu.dk/ [...] Ah, yes.  Go search for
"Viterbi" and you'll get a bunch of references.


Kai
-- 
Kai Harrekilde-Petersen  <khp@vitesse.com>  Opinions are mine...

Article: 54976
Subject: Re: Problem : Simulating SRL16 with webpack 5.2 and modelsim 5.6e starter
From: "John_H" <johnhandwork@mail.com>
Date: Wed, 23 Apr 2003 16:00:00 GMT

There was a time way back in Sept 2000 when the Verilog simprim I was using
had a problem where the notifier would kick in because of the setup/hold
numbers defined in the X_SRL16E.v simprim.  I just added a delay in the
simprim where the assignment was made, from:

  {data[15:0]} <= {data[14:0], d_in};
to:
  #2 {data[15:0]} <= {data[14:0], d_in};

All it took was a couple picoseconds.  I found this out by drilling down
into the simprim in my simulation to find *why* I was getting the "x" values
for my SRL outputs.  The "notifier" was kicking in when it made no sense.
This may have no bearing on your current situation but it struck a familiar
chord.

- John_H


"Frank Hoffmann" <fh215@xxx.yyy.ac.uk> wrote in message
news:b85uui$27k$1@pegasus.csx.cam.ac.uk...
> I hope that somebody can help me with this ?
>
> I have a small design which uses instantiated SRL16 primitives.
>
> The design simulates fine with webpack 4.2 and the 'matching' modelsim
> simulator, but generates loads of timing errors with the latest set of
> tools. The timing errors all seem to be caused by the SRL16 primitives.
>
> Has anybody come across this and can tell me why this happens and how to
> fix it ?
>
> Thanks for your help in advance,
>
> - Frank
>
> PS:
> to send no-spam email, replace "xxx" with "eng" and "yyy" with "cam".
>
>
>
> ==================================================
> Frank Hoffmann
>
> Laboratory for Communication Engineering (LCE)
> University of Cambridge  -  Dept. of Engineering
> William Gates Building, JJ Thomson Avenue,
> Cambridge, CB3 0FD, UK
>
> phone : +44 1223 767031     fax : +44 1223 767010
> ==================================================
>

Article: 54977
Subject: Re: Reason Xess discontinued XSV prototyping boards?
From: rdschwarz@spamcop.net (Zeke)
Date: 23 Apr 2003 09:26:28 -0700

"John Milbanks" <phony@nowhere.cc> wrote in message news:<w61oa.7947$5R6.7215@fed1read01>...
> Out of curiosity, does anyone know the reason Xess discontinued its Virtex
> prototyping XSV boards and now offers only the SpartanII XSA ones? No
> demand? Too expensive?

Check at http://www.associatedpro.com. APS still carries Virtex boards
in a small form factor, PC104 or stand-alone.

It seems like Xilinx has also quit placing the third-party board links on
its website, or the list is incomplete. Neither APS nor Xess is listed,
nor is Annapolis Micro or many others. Maybe that is influencing the
board vendors.

Article: 54978
Subject: Nice VHDL tutorial and POD/PCI Board
From: rdschwarz@spamcop.net (Zeke)
Date: 23 Apr 2003 09:41:36 -0700

I am very impressed with the AMONTEC website, and product line. The
VHDL online reference is also very nice.

http://www.amontec.com 

I understand AMONTEC does consulting work in Europe. Laurent Gauch has
done some work for me in the US and I highly recommend them. Also, the
JTAG/I2C/COOLRUNNER/WIGGLER POD is a really cool way to load your
FPGA and get a COOLRUNNER prototype system all at the same time, and
it's only $150.00 and comes with tons of core and source code. It
does everything but catch fish! Great buy!


Rick
http://www.associatedpro.com

Article: 54979
Subject: Challenge: (n mod 3) in hardware???
From: RISC_taker@alpenjodel.de (RISC taker)
Date: 23 Apr 2003 09:54:22 -0700

Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
10-bit unsigned number and 3 is a constant. This has to be done in the
same cycle (combinatorial!). Now what's a good way to implement that?

I thought of a lookup table (distributed RAM) but this takes quite a
lot of space. Any better ideas? (Ray, the arithmetic guru? :-)

Do you think I can perform this operation at 200 MHz in a Virtex-II?

Thanks!
RISC_taker

Article: 54980
Subject: Re: Xilinx has released SpartanIII
From: mrand@my-deja.com (Marc Randolph)
Date: 23 Apr 2003 10:04:33 -0700

Robert <rpudlik@poczta.onet.pl> wrote in message news:<3EA6A600.5060405@poczta.onet.pl>...
> It was so close: 1.2V core voltage.
> In my current design I'm going to use processor with 1.26V core 
> voltage;) It would be nice to have one regulator less...

Howdy Robert,

Assuming the regulator can handle the current, you should only need
one for voltages that close.

The SP3 datasheet states that 1.26 volts is acceptable.  Or (making an
educated guess) you could probably run your processor 5% low - which
would get you to 1.2V.

Or perhaps the best of both worlds: you could just split the
difference and run that rail at 1.23 Volts.  Then each part would be
only ~2.5% from the spec'ed mid-point voltage, giving you some margin
on both sides.
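
The arithmetic behind splitting the difference can be checked quickly. This is a minimal sketch that assumes the two nominal core voltages quoted in the thread (1.2 V and 1.26 V); the function name is invented for illustration, and actual tolerance limits must come from the datasheets.

```python
def rail_deviation(nominal_a, nominal_b):
    """Return the midpoint rail voltage and its fractional deviation from each nominal."""
    mid = (nominal_a + nominal_b) / 2
    dev_a = abs(mid - nominal_a) / nominal_a
    dev_b = abs(mid - nominal_b) / nominal_b
    return mid, dev_a, dev_b

# Assumed nominals: 1.20 V for the FPGA core, 1.26 V for the processor core.
mid, dev_fpga, dev_cpu = rail_deviation(1.20, 1.26)
print(f"rail = {mid:.2f} V, FPGA off by {dev_fpga:.1%}, processor off by {dev_cpu:.1%}")
```

Both deviations come out at roughly 2.5%, which is the margin argument made above.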

Have fun,

   Marc

Article: 54981
Subject: Re: NIOS 3.0 Spurious Interrupts
From: kempaj@yahoo.com (Jesse Kempa)
Date: 23 Apr 2003 10:05:26 -0700

IRQs 1-15 are 'reserved' for system-level exceptions - check out the
Nios Programmer's Reference Manual (PDF) file. IRQ 1 is for register
window underflow... something that would happen if you return (ret
instruction) from a subroutine call, but without entering (call
instruction) the subroutine.

Are you doing anything fancy during the boot-up process with your own
startup code? If you're doing something such as compiling a
traditional C program (with main(), and a bunch of subroutines), built
with nios-build, then we will link in code that sets up interrupts,
the register window, etc. and prevents this sort of thing.

- Jesse




jim006@att.net (Jim M.) wrote in message news:<6f3fc0f8.0304230440.35a703d9@posting.google.com>...
> The error message prints in hex huh?  Well that's probably worth
> knowing.
> 
> That explains IRQ 17 and 19 (timer and lan respectively)
> 
> Is it possible to receive a spurious interrupt for an IRQ not assigned
> in SOPC Builder.  I recall having a spurious IRQ #1, although I may
> have been mistaken.
> 
> You mention that IRQs 16-64 are for user exceptions.  What about IRQs
> 1-15 ?
> 
>

Article: 54982
Subject: Re: Challenge: (n mod 3) in hardware???
From: "Falk Brunner" <Falk.Brunner@gmx.de>
Date: Wed, 23 Apr 2003 19:22:54 +0200

"RISC taker" <RISC_taker@alpenjodel.de> schrieb im Newsbeitrag
news:18c289aa.0304230854.6897fb3b@posting.google.com...
> Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
> 10-bit unsigned number and 3 is a constant. This has to be done in the
> same cycle (combinatorial!). Now what's a good way to implement that?
>
> I thought of a lookup table (distributed RAM) but this takes quite a
> lot of space. Any better ideas? (Ray, the arithmetic guru? :-)

I am not a guru, but how about a BRAM, used in x4 configuration? Just store
the truth table there, done. OK, it has 1 cycle latency, but hey, it's pretty
easy and fast. 200 MHz should be possible.

Otherwise you could try to use an Excel sheet or something to generate the
truth table. Maybe you will find some clever optimization possibilities.

--
MfG
Falk

Article: 54983
Subject: Re: Xilinx has released SpartanIII
From: rickman <spamgoeshere4@yahoo.com>
Date: Wed, 23 Apr 2003 13:39:06 -0400

Robert wrote:
> 
> It was so close: 1.2V core voltage.
> In my current design I'm going to use processor with 1.26V core
> voltage;) It would be nice to have one regulator less...

You must be talking about the TI C6711C or C6713.  Check the
tolerances.  I bet you can pick one voltage that will suit both.  It is
not like the voltage is *that* critical to these chips.  If it is 50 mV
too high the chip won't *blow*.  It will use about 8% more power by my
estimation.  

The match in power voltage is one reason I would like to use the Spartan
3.  But I can't wait for 9 months.  Besides, this design is not for a
single board.  We plan to use other DSP chips which will not have the
low 1.26 volt power.  Fortunately they *do* make LDOs that will drop the
1.5 volt power to 1.26 volts!  Think about it.  That is 84% efficiency. 
Not bad for an LDO.  You can do a bit better by adding yet another
switcher to your design, but is it worth it?  
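
Both estimates in this post can be reproduced with a few lines. This is a rough sketch under two assumptions that are not stated in the post: dynamic power scales with the square of the supply voltage, and an ideal LDO passes the load current straight through, so its efficiency is simply Vout/Vin. The function names are invented for illustration.

```python
def ldo_efficiency(v_in, v_out):
    """Ideal linear-regulator efficiency (quiescent current ignored): v_out / v_in."""
    return v_out / v_in

def extra_power(v_nominal, v_actual):
    """Fractional increase in dynamic power, assuming power scales with voltage squared."""
    return (v_actual / v_nominal) ** 2 - 1

print(f"LDO from 1.5 V to 1.26 V: {ldo_efficiency(1.5, 1.26):.0%} efficient")
# Assumed case: a 1.26 V part run 50 mV high.
print(f"1.26 V part at 1.31 V: {extra_power(1.26, 1.31):+.1%} power")
```

Taking 1.2 V as the nominal instead gives a slightly larger figure, still close to the 8% quoted.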

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 54984
Subject: Re: DC requirement in FFT
From: Rene Tschaggelar <tschaggelar@dplanet.ch>
Date: Wed, 23 Apr 2003 17:42:09 GMT

Bob wrote:
> Hi,
> 
> When calculating an FFT, how does the inclusion of the DC value affect
> things i.e. I have seen some examples of FFT code where the DC is
> removed and some where it left.
> 
> Is the best strategy to remove the DC or leave it. For example in a
> COFDM system, will its removal or inclusion have any effect? I am
> thinking of its
> implementation in an ASIC, where DC removal adds overhead and I'm
> wondering if it really necessary to remove it. How will adding
> removing the DC level affect the ifft?

What is the problem?
The DC is an additional number and corresponds to a DC bias.
It depends on the problem whether you need it or not.

Rene
-- 
Ing.Buero R.Tschaggelar - http://www.ibrtses.com
& commercial newsgroups - http://www.talkto.net

Article: 54985
Subject: Re: Problem : Simulating SRL16 with webpack 5.2 and modelsim 5.6estarter
From: Ray Andraka <ray@andraka.com>
Date: Wed, 23 Apr 2003 17:48:43 GMT

How much older?  There was a significant slowdown in the SRL16's in the updated
speed files, I think it was v3.3 service pack 8.  The newer version of the tools
will have the most recent speed files.

Frank Hoffmann wrote:

> Hi-
> thanks for that tip. I'll investigate it, but I wonder whether this is
> the reason for the unexplained errors I'm getting ? Mind you, the
> identical design was simulating without any errors in the older version
> of tools ?
>
> - Frank
>
> Ray Andraka wrote:
> > The clock to Y timing of the SRL16 is not very impressive.  You make things
> > work much better if you feed the output of the SRL16 directly to the
> > flip-flop in the same slice before using it.  It adds one more clock delay,
> > but eliminates the extra routing and FF set up.
> >
> > Frank Hoffmann wrote:
> >
> >
> >>I hope that somebody can help me with this ?
> >>
> >>I have a small design which uses instantiated SRL16 primitives.
> >>
> >>The design simulates fine with webpack 4.2 and the 'matching' modelsim
> >>simulator, but generates loads of timing errors with the latest set of
> >>tools. The timing errors all seem to be caused by the SRL16 primitives.
> >>
> >>Has anybody come across this and can tell me why this happens and how to
> >>fix it ?
> >>
> >>Thanks for your help in advance,
> >>
> >>- Frank
> >>
> >>PS:
> >>to send no-spam email, replace "xxx" with "eng" and "yyy" with "cam".
> >>
> >>==================================================
> >>Frank Hoffmann
> >>
> >>Laboratory for Communication Engineering (LCE)
> >>University of Cambridge  -  Dept. of Engineering
> >>William Gates Building, JJ Thomson Avenue,
> >>Cambridge, CB3 0FD, UK
> >>
> >>phone : +44 1223 767031     fax : +44 1223 767010
> >>==================================================
> >
> >
> > --
> > --Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com
> >
> >  "They that give up essential liberty to obtain a little
> >   temporary safety deserve neither liberty nor safety."
> >                                           -Benjamin Franklin, 1759
> >
> >

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 54986
Subject: Re: Challenge: (n mod 3) in hardware???
From: Ray Andraka <ray@andraka.com>
Date: Wed, 23 Apr 2003 17:58:58 GMT

With only 10 bits of input, you only have a 1K address space, and the output
only needs 2 bits (0,1,2), so this can easily fit in a single block
RAM.  That would be your best bet for single cycle operation at 200 MHz.
If you are really averse to using block RAM, there is a shortcut similar
to the one for decimal numbers but I don't recall it offhand.  The
decimal shortcut is to sum the digits, and divide that sum by 3.  The
remainder of that division is the mod.  It would take a fair amount of
area to attempt that combinatorially, and you'd be hard pressed to do it
in a single cycle at 200 MHz....go with the BRAM.
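
The table contents for such a block RAM are easy to generate in software, and the decimal digit-sum shortcut can be checked against the same table. This is a minimal sketch; the function names are invented for illustration, and how the table gets loaded into the memory (init attributes, a memory file, and so on) depends on the tool flow and is not shown.

```python
def mod3_table(width=10):
    """Contents of a 2**width-entry, 2-bit-wide lookup memory: entry n holds n mod 3."""
    return [n % 3 for n in range(1 << width)]

def mod3_by_decimal_digits(n):
    """Decimal shortcut: n and the sum of its decimal digits leave the same remainder mod 3."""
    return sum(int(d) for d in str(n)) % 3

table = mod3_table()
print(len(table), "entries, largest value", max(table))
```

Every entry is 0, 1 or 2, so two data bits per address are enough.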

RISC taker wrote:

> Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
> 10-bit unsigned number and 3 is a constant. This has to be done in the
> same cycle (combinatorial!). Now what's a good way to implement that?
>
> I thought of a lookup table (distributed RAM) but this takes quite a
> lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
>
> Do you think I can perform this operation at 200 MHz in a Virtex-II?
>
> Thanks!
> RISC_taker

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 54987
Subject: Re: How to configure USER1 and USER2 of JTAG on Xilinx Virtex2!!
From: "Frederic Bastenaire" <frederic.bastenaire@wanadoo.fr>
Date: Wed, 23 Apr 2003 20:07:33 +0200

Great! I posted questions about this but never got anything concrete. This
is interesting and partly answers my questions.
I have extra questions (sorry) :
- Is there any VHDL similar example?
- What software do you use to play with the JTAG USER1 commands on the PC
side?

Thanks for your help,

Frederic Bastenaire

"Philip Freidin" <philip@fliptronics.com> a écrit dans le message de news:
n7hv9v438bfif84cg3mjoaeim15b3qkla0@4ax.com...
> You need to create your own data register, and connect it to
> the JTAG primitive.
>
> For your entertainment, here is an example of doing it for
> Virtex-II.
>
> >  (...)
> Philip Freidin
> Fliptronics

Article: 54988
Subject: Re: Challenge: (n mod 3) in hardware???
From: ben@ben.com (Ben Jackson)
Date: Wed, 23 Apr 2003 18:09:19 GMT

In article <18c289aa.0304230854.6897fb3b@posting.google.com>,
RISC taker <RISC_taker@alpenjodel.de> wrote:
>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
>10-bit unsigned number and 3 is a constant. This has to be done in the
>same cycle (combinatorial!). Now what's a good way to implement that?

Off the top of my head, I think so:

Consider any all-1s binary number with an even number of bits.  That
number is a multiple of 3.  3, 15, 63, 255, or in your case, 1023.
For every EVEN bit NOT set in that number, the remainder goes up by 2.
For every ODD bit not set in that number, the remainder goes up by 1.
For example:

1111 = 15 = multiple of 3
0111 = 7 = remainder 1
1011 = 11 = remainder 2
1101 = 13 = remainder 1
1110 = 14 = remainder 2

and any combination:
0101 = 5 = remainder (1+1) => 2
0110 = 6 = remainder (1+2) => 3 => 0

Conveniently each pair of bits makes a 2-bit number which you can add
up.

Now if you add up all of the possible bits that way for a 10 digit number
the most you can get is 15 (1 x 5 odd bits + 2 x 5 even bits = 15).  You
could then handle that with a lookup table or a few well-chosen gates.
Or you could repeat this process again.
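
The cleared-bit rule can be confirmed exhaustively in a small software model. This sketch assumes bit 0 is the least significant bit and counts as "even", which matches the worked examples above; the function name is invented for illustration.

```python
def mod3_cleared_bits(n, width=10):
    """n mod 3 from the cleared bits of n; width must be even so all-ones is a multiple of 3."""
    total = 0
    for k in range(width):
        if not (n >> k) & 1:
            # A cleared even bit removes a weight of 1 (mod 3), raising the remainder by 2;
            # a cleared odd bit removes a weight of 2 (mod 3), raising the remainder by 1.
            total += 2 if k % 2 == 0 else 1
    return total % 3  # total is at most 15 when width is 10

print(mod3_cleared_bits(0b0101, width=4), mod3_cleared_bits(0b0110, width=4))
```

The final reduction of the 0..15 total is the small lookup table (or handful of gates) mentioned above.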

-- 
Ben Jackson
<ben@ben.com>
http://www.ben.com/

Article: 54989
Subject: Re: Challenge: (n mod 3) in hardware???
From: Allan Herriman <allan_herriman.hates.spam@agilent.com>
Date: Thu, 24 Apr 2003 04:20:49 +1000

On 23 Apr 2003 09:54:22 -0700, RISC_taker@alpenjodel.de (RISC  taker)
wrote:

>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
>10-bit unsigned number and 3 is a constant. This has to be done in the
>same cycle (combinatorial!). Now what's a good way to implement that?
>
>I thought of a lookup table (distributed RAM) but this takes quite a
>lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
>
>Do you think I can perform this operation at 200 MHz in a Virtex-II?
>
>Thanks!
>RISC_taker

Hi RISC,

If you break up your 10 bit input word into five 2 bit words, you can
take the modulus of each (using five pairs of 2 input LUTs), then add
the results (to get a four bit number), then take the modulus of that.

This works since

a mod (a - 1) = 1

and 

b mod (a - 1) = (a x b) mod (a - 1)

(Think of a as being an even power of 2, which means we don't change
the modulus if we shift the input by 2 bits.)

We can improve the timing by using pairs of 4 input LUTS to take four
bit slices of your input.  
We then sum the three 2 bit values (using 2 levels of logic) to get a
4 bit result, then take the modulus of that in another pair of LUTs.

This takes a total of four levels of logic, which should work at
200MHz in Virtex-II, depending on the speed grade, your patience, etc.
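
The slice-and-sum idea can be checked with a short software model. This is only a sketch with an invented function name; it models the arithmetic, not the LUT mapping, and relies on the slice width being even so that each slice boundary has a weight of 1 modulo 3.

```python
def mod3_slices(n, slice_bits=4, width=10):
    """Reduce each slice mod 3, add the partial remainders, then reduce once more."""
    mask = (1 << slice_bits) - 1
    partial = [((n >> shift) & mask) % 3 for shift in range(0, width, slice_bits)]
    return sum(partial) % 3

# 2-bit slices give five partial remainders, 4-bit slices give three.
print(mod3_slices(1000, slice_bits=2), mod3_slices(1000, slice_bits=4), 1000 % 3)
```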

Regards,
Allan.

Article: 54990
Subject: Re: how to synthesize Xilinxcorelib in leonardo or ISE 5.1
From: "Gilad Cohen" <gilad_coh@walla.co.il>
Date: Wed, 23 Apr 2003 11:21:37 -0700
Links: << >> << T >> << A >>


Mike is right, you cannot synthesize the hdl code generated by the coregen. 

The core generator outputs HDL code for simulation, and a *.edn for synthesis. 
You need to place this file in the folder of your top level *.edf file. 

You can use the HDL code to write the instantiation of the core. 

I tend not to use the coregen because of the flow it forces.

Article: 54991
Subject: Re: Challenge: (n mod 3) in hardware???
From: Allan Herriman <allan_herriman.hates.spam@agilent.com>
Date: Thu, 24 Apr 2003 04:28:58 +1000

On Thu, 24 Apr 2003 04:20:49 +1000, Allan Herriman
<allan_herriman.hates.spam@agilent.com> wrote:

>On 23 Apr 2003 09:54:22 -0700, RISC_taker@alpenjodel.de (RISC  taker)
>wrote:
>
>>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
>>10-bit unsigned number and 3 is a constant. This has to be done in the
>>same cycle (combinatorial!). Now what's a good way to implement that?
>>
>>I thought of a lookup table (distributed RAM) but this takes quite a
>>lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
>>
>>Do you think I can perform this operation at 200 MHz in a Virtex-II?
>>
>>Thanks!
>>RISC_taker
>
>Hi RISC,
>
>If you break up your 10 bit input word into five 2 bit words, you can
>take the modulus of each (using five pairs of 2 input LUTs), then add
>the results (to get a four bit number), then take the modulus of that.
>
>This works since
>
>a mod (a - 1) = 1
>
>and 
>
>b mod (a - 1) = (a x b) mod (a - 1)
>
>(Think of a as being an even power of 2, which means we don't change
>the modulus if we shift the input by 2 bits.)
>
>
>We can improve the timing by using pairs of 4 input LUTS to take four
>bit slices of your input.  
>We then sum the three 2 bit values (using 2 levels of logic) to get a
>4 bit result, then take the modulus of that in another pair of LUTs.
>
>This takes a total of four levels of logic, which should work at
>200MHz in Virtex-II, depending on the speed grade, your patience, etc.


BTW, the total hardware is 13 LUTs, which might fit into 2 CLBs.

(Or you could use a block ram, as other posters have suggested.)

Regards,
Allan.

Article: 54992
Subject: Re: Challenge: (n mod 3) in hardware???
From: "Avrum" <avrum@REMOVEsympatico.ca>
Date: Wed, 23 Apr 2003 14:34:34 -0400

You are on the right track...

There are three keys to this solution, the first is

(n*4^m mod 3) = n mod 3      for all values of m

The second is

(4n mod 3) = n mod 3

The third is

((a + b) mod 3) = ((a mod 3) + (b mod 3)) mod 3

Using these reductions you can construct a VERY fast mod 3 calculator with
extremely small gate count. I have used this to do 19 bits at over 100MHz
with no problems at all, and I am sure it will do 10 bits at 200 MHz. Its
size is TINY. For 10 bits it will require 3 levels of LUTs and a total of (I
think) 8 LUTs.

The basis of the algorithm is that you can take the input 4 bits at a time,
and generate the mod 3 (which is 2 bits) in two LUTs (one for each output
bit). Then adjacent pairs of these two bit quantities can be concatenated to
generate 4 more bits, which can be reduced to 2. You need to do this
log2(width)-1 times to get to the final 2 bits.

Here it is. You can let the synthesis tool optimize out the LUTs you don't
need (since you are only doing 10 bits), or you can chop out the stages you
don't need by hand.

Avrum

------------------------------------------------------------

/*
 * Module:              syn_mod3_32
 * Creation Date:       Tue Feb 20 2000
 * Author:  Avrum Warshawsky
 * Description:         Synthetic Mod3 calculator
 * Instantiated models: none
 * DEFINE:              WIDTH
 *
 *
 * Description:
 *
 * This module will calculate mod 3 for any number up to 32 bits.
 * The parameter WIDTH determines the width of the input.
 * The width of the output is always 2 bits, which will be
 * 0, 1, or 2 (never 3) - the MOD3 of the input data
 *
*/

module syn_mod3_32(out, in);

//******************************************************************************
// Port Declarations
//******************************************************************************

  parameter WIDTH=19;

  input  [WIDTH-1:0] in;
  output [1:0] out;

  function [1:0] digit_mod;
    input  [3:0] digit;
  case(digit)
    4'h0: digit_mod = 2'd0;
    4'h1: digit_mod = 2'd1;
    4'h2: digit_mod = 2'd2;
    4'h3: digit_mod = 2'd0;
    4'h4: digit_mod = 2'd1;
    4'h5: digit_mod = 2'd2;
    4'h6: digit_mod = 2'd0;
    4'h7: digit_mod = 2'd1;
    4'h8: digit_mod = 2'd2;
    4'h9: digit_mod = 2'd0;
    4'ha: digit_mod = 2'd1;
    4'hb: digit_mod = 2'd2;
    4'hc: digit_mod = 2'd0;
    4'hd: digit_mod = 2'd1;
    4'he: digit_mod = 2'd2;
    4'hf: digit_mod = 2'd0;
  endcase
  endfunction

  wire [1:0] m00, m01, m02, m03,
             m04, m05, m06, m07;

  wire [1:0] m10, m11, m12, m13;

  wire [1:0] m20, m21;

  wire [31:0] my_in = in; // Let it zero extend for us

  assign m00 = digit_mod(my_in[ 3:0 ]);
  assign m01 = digit_mod(my_in[ 7:4 ]);
  assign m02 = digit_mod(my_in[11:8 ]);
  assign m03 = digit_mod(my_in[15:12]);
  assign m04 = digit_mod(my_in[19:16]);
  assign m05 = digit_mod(my_in[23:20]);
  assign m06 = digit_mod(my_in[27:24]);
  assign m07 = digit_mod(my_in[31:28]);

  assign m10 = digit_mod({m01, m00});
  assign m11 = digit_mod({m03, m02});
  assign m12 = digit_mod({m05, m04});
  assign m13 = digit_mod({m07, m06});

  assign m20 = digit_mod({m11, m10});
  assign m21 = digit_mod({m13, m12});

  assign out = digit_mod({m21, m20});

  // synthesis translate_off
     initial
     begin
       if (WIDTH > 32)
       begin
         $display("%t ERROR: Mod3 width must be <= 32 in %m",$realtime);
       end
     end
  // synthesis translate_on

endmodule
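
A software model is a convenient way to convince yourself that the reduction tree in the module above is correct. The sketch below mirrors its structure in Python; it is a hypothetical cross-check written for this purpose, not part of the original module, and the function names are chosen only to echo the Verilog.

```python
def digit_mod(digit):
    """Model of the 4-bit-in, 2-bit-out lookup function."""
    return (digit & 0xF) % 3

def syn_mod3_32_model(value):
    """Model of the reduction tree for unsigned inputs up to 32 bits wide."""
    # First level: one lookup per 4-bit digit (m00..m07 in the module).
    stage = [digit_mod(value >> shift) for shift in range(0, 32, 4)]
    while len(stage) > 1:
        # Concatenating {hi, lo} forms 4*hi + lo, and 4 mod 3 = 1,
        # so the remainder of the pair is preserved.
        stage = [digit_mod((hi << 2) | lo)
                 for lo, hi in zip(stage[0::2], stage[1::2])]
    return stage[0]

print(syn_mod3_32_model(1000), 1000 % 3)
```

Narrower inputs are handled by zero extension, just as the `my_in` wire does in the module.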


"Ben Jackson" <ben@ben.com> wrote in message
news:jFApa.584264$3D1.324134@sccrnsc01...
> In article <18c289aa.0304230854.6897fb3b@posting.google.com>,
> RISC taker <RISC_taker@alpenjodel.de> wrote:
> >Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
> >10-bit unsigned number and 3 is a constant. This has to be done in the
> >same cycle (combinatorial!). Now what's a good way to implement that?
>
> Off the top of my head, I think so:
>
> Consider any all-1s binary number with an even number of bits.  That
> number is a multiple of 3.  3, 15, 63, 255, or in your case, 1023.
> For every EVEN bit NOT set in that number, the remainder goes up by 2.
> For every ODD bit not set in that number, the remainder goes up by 1.
> For example:
>
> 1111 = 15 = multiple of 3
> 0111 = 7 = remainder 1
> 1011 = 11 = remainder 2
> 1101 = 13 = remainder 1
> 1110 = 14 = remainder 2
>
> and any combination:
> 0101 = 5 = remainder (1+1) => 2
> 0110 = 6 = remainder (1+2) => 3 => 0
>
> Conveniently each pair of bits makes a 2-bit number which you can add
> up.
>
> Now if you add up all of the possible bits that way for a 10 digit number
> the most you can get is 15 (1 x 5 odd bits + 2 x 5 even bits = 15).  You
> could then handle that with a lookup table or a few well-chosen gates.
> Or you could repeat this process again.
>
> --
> Ben Jackson
> <ben@ben.com>
> http://www.ben.com/

Article: 54993
Subject: Newbie question
From: Keith Youngblood <lifer@notthistime.invalid>
Date: Wed, 23 Apr 2003 12:10:47 -0700

Hello all,

I am considering experimenting with FPGA's. I was hoping someone here
could point me in a good starting direction. BTW, I would prefer to use
Linux as a development platform although it is not critical. Are there
emulators out there that I can get my feet wet with?

I have been a programmer for some time and have programmed a number of
different microcontrollers. FPGA stuff is new to me. I am well versed in
electronics and am currently doing some EE work.

My goal with FPGA's is to work with multi-agent AI systems in hardware.
Systems such as genetic algorithms, classifier systems, neural nets,
fuzzy logic engines, etc... Is my understanding of FPGA's proper? Can
this type of thing be done?

Any comments would be greatly appreciated. Even if I am completely
whacked... ;-)

Thanks in advance.

--
Keith Youngblood
lifer@notthistime.invalid

To email me, replace domain name with o l y w a DOT n e t (minus the
spaces)

Article: 54994
Subject: Re: Virtex2 and Logic Analyzer
From: Eric Smith <eric-no-spam-for-me@brouhaha.com>
Date: 23 Apr 2003 12:37:12 -0700

"Basuki Endah Priyanto" <EBEPriyanto@ntu.edu.sg> writes:
> I have Xilinx virtex2-1000 and Textronix Logic Analyzer. The problem
> is the TTL output level from my logic analyzer is 3.8 volt and the
> maximum voltage to my Virtex2 is 3.3 volt.

I asked:
> Are you using the Tek as a pattern generator or something?  Normally
> a logic analyzer has *inputs*, not outputs.

"Basuki Endah Priyanto" <EBEPriyanto@ntu.edu.sg> writes:
> yes .. the "Tex" is as a pattern generator.

Well, if you're sure it's a "Tex", I can't help you much.  We only
use "Tek" (Tektronix) gear around here.  I've never even heard of
"Textronix", but if they made logic analyzers I'd expect Tektronix
to sue them for trademark infringement.

Anyhow, if your "Tex" doesn't have settings for 3.3V CMOS output from
the pattern generator, you may need to kludge up some buffers or
quickswitches to do level conversion.

Article: 54995
Subject: Re: Challenge: (n mod 3) in hardware???
From: rickman <spamgoeshere4@yahoo.com>
Date: Wed, 23 Apr 2003 15:52:15 -0400

Allan Herriman wrote:
> 
> On Thu, 24 Apr 2003 04:20:49 +1000, Allan Herriman
> <allan_herriman.hates.spam@agilent.com> wrote:
> 
> >On 23 Apr 2003 09:54:22 -0700, RISC_taker@alpenjodel.de (RISC  taker)
> >wrote:
> >
> >>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
> >>10-bit unsigned number and 3 is a constant. This has to be done in the
> >>same cycle (combinatorial!). Now what's a good way to implement that?
> >>
> >>I thought of a lookup table (distributed RAM) but this takes quite a
> >>lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
> >>
> >>Do you think I can perform this operation at 200 MHz in a Virtex-II?
> >>
> >>Thanks!
> >>RISC_taker
> >
> >Hi RISC,
> >
> >If you break up your 10 bit input word into five 2 bit words, you can
> >take the modulus of each (using five pairs of 2 input LUTs), then add
> >the results (to get a four bit number), then take the modulus of that.
> >
> >This works since
> >
> >a mod (a - 1) = 1
> >
> >and
> >
> >b mod (a - 1) = (a x b) mod (a - 1)
> >
> >(Think of a as being an even power of 2, which means we don't change
> >the modulus if we shift the input by 2 bits.)
> >
> >
> >We can improve the timing by using pairs of 4 input LUTS to take four
> >bit slices of your input.
> >We then sum the three 2 bit values (using 2 levels of logic) to get a
> >4 bit result, then take the modulus of that in another pair of LUTs.
> >
> >This takes a total of four levels of logic, which should work at
> >200MHz in Virtex-II, depending on the speed grade, your patience, etc.
> 
> BTW, the total hardware is 13 LUTs, which might fit into 2 CLBs.
> 
> (Or you could use a block ram, as other posters have suggested.)
> 
> Regards,
> Allan.

If I understand how this algorithm works, I think the logic can be
reduced to 10 LUTs in two levels.  

Instead of adding the values of the pairs of inputs, group them into
even and odd bits.  Use the LUTs with the F5 mux to implement the
modified two bit sum of five equal value inputs.  When I say modified,
you need to produce a two bit result so logically add the result bit 2^2
back in as another input bit.  Five bits in, two bits out.  Or you can
think of this as a 32 entry truth table.  The point is that you only
need two outputs from any of these functions to produce a 0, a 1 or a
2.  

This will use a total of 8 LUTs to give you a two bit even bit sum and a
two bit odd bit sum.  These four signals can then be run through a pair
of LUTs to give you the two bit modulo three result by using a simple
truth table.  
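
One reading of this even/odd grouping can be modelled in a few lines. This sketch makes two assumptions: each five-input first-level function outputs the count of set bits in its group reduced to 0..2, and the final four-input truth table combines the two results with the odd group weighted by 2 (because odd bit positions have a weight of 2 modulo 3). The function name is invented for illustration.

```python
def mod3_even_odd(n, width=10):
    """Two-level model: per-group bit counts reduced to 0..2, then a 16-entry final table."""
    even = sum((n >> k) & 1 for k in range(0, width, 2)) % 3  # bit weights equal to 1 mod 3
    odd = sum((n >> k) & 1 for k in range(1, width, 2)) % 3   # bit weights equal to 2 mod 3
    return (even + 2 * odd) % 3

print(mod3_even_odd(1000), 1000 % 3)
```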

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 54996
Subject: Re: Challenge: (n mod 3) in hardware???
From: "Avrum" <avrum@REMOVEsympatico.ca>
Date: Wed, 23 Apr 2003 16:35:47 -0400
Links: << >> << T >> << A >>

You are right - it can be done with 2 levels of logic and some F5MUXes
(which will be faster than the three levels of logic I proposed above).

Referring to my earlier implementation, you will still need the function
digit_mod, and you will also need an equivalent 5 bit function, let's call it
five_bit_mod - it will be a 5 bit input, 2 bit output case table like
digit_mod (which just lists the mod3 of all 32 5 bit combinations). This
SHOULD be implemented as 2*(2*LUT4+F5MUX), if the synthesis tool does its
job. The digit_mod should be implemented as 2*LUT4.

wire [1:0] m0, m1;

assign m0 = five_bit_mod(in[4:0]);
assign m1 = five_bit_mod(in[9:5]);

assign out = digit_mod({m1,m0});

This will (oddly) take two more LUTs, but should be faster.

Avrum
"rickman" <spamgoeshere4@yahoo.com> wrote in message
news:3EA6EEEF.9EEC9047@yahoo.com...
> Allan Herriman wrote:
> >
> > On Thu, 24 Apr 2003 04:20:49 +1000, Allan Herriman
> > <allan_herriman.hates.spam@agilent.com> wrote:
> >
> > >On 23 Apr 2003 09:54:22 -0700, RISC_taker@alpenjodel.de (RISC  taker)
> > >wrote:
> > >
> > >>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
> > >>10-bit unsigned number and 3 is a constant. This has to be done in the
> > >>same cycle (combinatorial!). Now what's a good way to implement that?
> > >>
> > >>I thought of a lookup table (distributed RAM) but this takes quite a
> > >>lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
> > >>
> > >>Do you think I can perform this operation at 200 MHz in a Virtex-II?
> > >>
> > >>Thanks!
> > >>RISC_taker
> > >
> > >Hi RISC,
> > >
> > >If you break up your 10 bit input word into five 2 bit words, you can
> > >take the modulus of each (using five pairs of 2 input LUTs), then add
> > >the results (to get a four bit number), then take the modulus of that.
> > >
> > >This works since
> > >
> > >a mod (a - 1) = 1
> > >
> > >and
> > >
> > >b mod (a - 1) = (a x b) mod (a - 1)
> > >
> > >(Think of a as being an even power of 2, which means we don't change
> > >the modulus if we shift the input by 2 bits.)
> > >
> > >
> > >We can improve the timing by using pairs of 4 input LUTS to take four
> > >bit slices of your input.
> > >We then sum the three 2 bit values (using 2 levels of logic) to get a
> > >4 bit result, then take the modulus of that in another pair of LUTs.
> > >
> > >This takes a total of four levels of logic, which should work at
> > >200MHz in Virtex-II, depending on the speed grade, your patience, etc.
> >
> > BTW, the total hardware is 13 LUTs, which might fit into 2 CLBs.
> >
> > (Or you could use a block ram, as other posters have suggested.)
> >
> > Regards,
> > Allan.
>
> If I understand how this algorithm works, I think the logic can be
> reduced to 10 LUTs in two levels.
>
> Instead of adding the values of the pairs of inputs, group them into
> even and odd bits.  Use the LUTs with the F5 mux to implement the
> modified two bit sum of five equal value inputs.  When I say modified,
> you need to produce a two bit result so logically add the result bit 2^2
> back in as another input bit.  Five bits in, two bits out.  Or you can
> think of this as a 32 entry truth table.  The point is that you only
> need two outputs from any of these functions to produce a 0, a 1 or a
> 2.
>
> This will use a total of 8 LUTs to give you a two bit even bit sum and a
> two bit odd bit sum.  These four signals can then be run through a pair
> of LUTs to give you the two bit modulo three result by using a simple
> truth table.
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 54997
Subject: Re: Challenge: (n mod 3) in hardware???
From: "Avrum" <avrum@REMOVEsympatico.ca>
Date: Wed, 23 Apr 2003 16:45:26 -0400
Links: << >> << T >> << A >>

I take that back - it won't work (sorry).

32 mod 3 is 2, NOT 1, therefore you cannot use five_bit_mod on bits 9:5. You
can build a different 5 bit input, 2 bit output table for bits 9:5, but it would be

5'd0: out = 0;
5'd1: out = 2; // the mod3 of 2
5'd2: out = 1; // the mod3 of 4
5'd3: out = 0; // the mod3 of 6
5'd4: out = 2; // the mod3 of 8
etc...

This would account for the 1 bit shift.
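
A quick software check of the corrected split, as a Python sketch rather than HDL. `five_bit_mod` and `digit_mod` follow the names used in the post; `five_bit_mod_hi` is a made-up name for the separate table on bits 9:5, which has to apply the weight 32 mod 3 = 2.

```python
def five_bit_mod(x: int) -> int:
    """Table for bits 4:0: plain mod 3 of a 5-bit value."""
    return x % 3


def five_bit_mod_hi(x: int) -> int:
    """Hypothetical table for bits 9:5: each unit is worth 32, and 32 mod 3 = 2."""
    return (2 * x) % 3


def digit_mod(x: int) -> int:
    """Table for the 4-bit value {m1, m0}: plain mod 3."""
    return x % 3


def mod3_two_level(n: int) -> int:
    m0 = five_bit_mod(n & 0x1F)
    m1 = five_bit_mod_hi((n >> 5) & 0x1F)
    # {m1, m0} has the value 4*m1 + m0, and 4 mod 3 = 1, so one more mod 3 finishes it.
    return digit_mod((m1 << 2) | m0)


# The first entries of the upper table match the listing above: 0, 2, 1, 0, 2.
assert [five_bit_mod_hi(x) for x in range(5)] == [0, 2, 1, 0, 2]
# Exhaustive check over every 10-bit input.
assert all(mod3_two_level(n) == n % 3 for n in range(1024))
```

Using the same table on both halves fails for n = 32: both halves would contribute a total of 1, while 32 mod 3 is 2.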

Avrum


"Avrum" <avrum@REMOVEsympatico.ca> wrote in message
news:tPCpa.2998$2g5.436588@news20.bellglobal.com...
> You are right - it can be done with 2 levels of logic and some F5MUXes
> (which will be faster than the three levels of logic I propsed above).
>
> Referring to my earlier implementation, you will still need the function
> digit_mod, and you will also need an equivalent 5 bit function, lets call it
> five_bit_mod - it will be a 5 bit input, 2 bit output case table like
> digit_mod (which just lists the mod3 of all 32 5 bit combinations). This
> SHOULD be implemented as 2*(2*LUT4+F5MUX), if the synthesis tool does its
> job. The digit_mod should be implemented as 2*LUT4
>
> wire [1:0] m0, m1;
>
> m0 = five_bit_mod(in[4:0]);
> m1 = five_bit_mod(in[9:5]);
>
> out = digit_mod({m1,m0});
>
> This will (oddly) take two more LUTs, but should be faster.
>
> Avrum
> "rickman" <spamgoeshere4@yahoo.com> wrote in message
> news:3EA6EEEF.9EEC9047@yahoo.com...
> > Allan Herriman wrote:
> > >
> > > On Thu, 24 Apr 2003 04:20:49 +1000, Allan Herriman
> > > <allan_herriman.hates.spam@agilent.com> wrote:
> > >
> > > >On 23 Apr 2003 09:54:22 -0700, RISC_taker@alpenjodel.de (RISC  taker)
> > > >wrote:
> > > >
> > > >>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
> > > > >>10-bit unsigned number and 3 is a constant. This has to be done in the
> > > > >>same cycle (combinatorial!). Now what's a good way to implement that?
> > > >>
> > > >>I thought of a lookup table (distributed RAM) but this takes quite a
> > > >>lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
> > > >>
> > > >>Do you think I can perform this operation at 200 MHz in a Virtex-II?
> > > >>
> > > >>Thanks!
> > > >>RISC_taker
> > > >
> > > >Hi RISC,
> > > >
> > > >If you break up your 10 bit input word into five 2 bit words, you can
> > > >take the modulus of each (using five pairs of 2 input LUTs), then add
> > > > >the results (to get a four bit number), then take the modulus of that.
> > > >
> > > >This works since
> > > >
> > > >a mod (a - 1) = 1
> > > >
> > > >and
> > > >
> > > >b mod (a - 1) = (a x b) mod (a - 1)
> > > >
> > > >(Think of a as being an even power of 2, which means we don't change
> > > >the modulus if we shift the input by 2 bits.)
> > > >
> > > >
> > > >We can improve the timing by using pairs of 4 input LUTS to take four
> > > >bit slices of your input.
> > > >We then sum the three 2 bit values (using 2 levels of logic) to get a
> > > >4 bit result, then take the modulus of that in another pair of LUTs.
> > > >
> > > >This takes a total of four levels of logic, which should work at
> > > > >200MHz in Virtex-II, depending on the speed grade, your patience, etc.
> > >
> > > BTW, the total hardware is 13 LUTs, which might fit into 2 CLBs.
> > >
> > > (Or you could use a block ram, as other posters have suggested.)
> > >
> > > Regards,
> > > Allan.
> >
> > If I understand how this algorithm works, I think the logic can be
> > reduced to 10 LUTs in two levels.
> >
> > Instead of adding the values of the pairs of inputs, group them into
> > even and odd bits.  Use the LUTs with the F5 mux to implement the
> > modified two bit sum of five equal value inputs.  When I say modified,
> > you need to produce a two bit result so logically add the result bit 2^2
> > back in as another input bit.  Five bits in, two bits out.  Or you can
> > think of this as a 32 entry truth table.  The point is that you only
> > need two outputs from any of these functions to produce a 0, a 1 or a
> > 2.
> >
> > This will use a total of 8 LUTs to give you a two bit even bit sum and a
> > two bit odd bit sum.  These four signals can then be run through a pair
> > of LUTs to give you the two bit modulo three result by using a simple
> > truth table.
> >
> > --
> >
> > Rick "rickman" Collins
> >
> > rick.collins@XYarius.com
> > Ignore the reply address. To email me use the above address with the XY
> > removed.
> >
> > Arius - A Signal Processing Solutions Company
> > Specializing in DSP and FPGA design      URL http://www.arius.com
> > 4 King Ave                               301-682-7772 Voice
> > Frederick, MD 21701-3110                 301-682-7666 FAX
>
>

Article: 54998
Subject: Re: Very low pin count FPGA
From: Jim Granville <jim.granville@designtools.co.nz>
Date: Thu, 24 Apr 2003 08:52:57 +1200
Links: << >> << T >> << A >>

Mike Harrison wrote:
> 
> On Wed, 23 Apr 2003 16:57:29 +0800, "Joshua Yin" <joshuayin@cytecht.com> wrote:
> 
<snip>
> 
> >Do you really need a microcontroller?
> 
> I think the fact that we don't see low pin-count PLDs is that for the vast majority of applications
> that might use one, a micro is a better solution, in terms of cost, flexibility, functionality and
> power consumption.
> 
> In most cases  the argument would be 'do you really need a PLD?'
> Micros are infinitely more flexible and powerful, usually take a lot less power, and the only reason
> to use a PLD is that a micro isn't fast enough.

 You need to be careful to compare like-process devices.
Microcontrollers are highly flexible devices, especially at variable
manipulation and state-variable designs.

 That said, they are NOT lower power (same process), and using software
to 'spin fast' to emulate hardware is inherently inefficient, from a
power viewpoint and also from a time-domain viewpoint.

 Scenix (Ubicom) took the pathway of 'all in SW', and they have very
high Icc levels - this is why most uC have an extensive HW peripheral
array.

 A PLD is inherently parallel, is very fast (eg protection), and it does
not need 'SW refresh'.
 It is also harder to 'crash' a PLD :)

 There are many TimerChain / Peripheral IO expansion / FastPWM /
DataPath / Power Management tasks where a low pin count PLD would
complement a uC very nicely, and we do many designs where a uC is used
with one (or more) SPLD/CPLD. The end result is much better than trying
to get the uC to 'be all things' :)

 One trend we see with uC is that as they get smaller (8/11/14 pin
devices), there is more opening for distributed IO expansion, eg with
lower pin count PLDs.

 The PCF8574 is a good IO expansion example, and the prices of these are 
HIGHER than many CPLDs, and they are slower, and far less flexible.

 Atmel CPLDs have uA region static Icc's, and the newer devices from
Xilinx/Lattice also have uA Icc, but they are following the
speed-dominant path, and there is certainly room for a smaller package
PLD device that follows a uAFrugal-dominant pathway.

 Imagine what you could do with a uA PLD that was a morph of a 
PCF8563(RTC) / PCF8574 (IOexp) / 4060(Counter) / TinyLogic ?

-jg

Article: 54999
Subject: Re: Challenge: (n mod 3) in hardware???
From: Allan Herriman <allan_herriman.hates.spam@agilent.com>
Date: Thu, 24 Apr 2003 07:01:35 +1000
Links: << >> << T >> << A >>

On Wed, 23 Apr 2003 15:52:15 -0400, rickman <spamgoeshere4@yahoo.com>
wrote:

>Allan Herriman wrote:
>> 
>> On Thu, 24 Apr 2003 04:20:49 +1000, Allan Herriman
>> <allan_herriman.hates.spam@agilent.com> wrote:
>> 
>> >On 23 Apr 2003 09:54:22 -0700, RISC_taker@alpenjodel.de (RISC  taker)
>> >wrote:
>> >
>> >>Hey, I need to calculate (n mod 3) in a Virtex-II design. n is a
>> >>10-bit unsigned number and 3 is a constant. This has to be done in the
>> >>same cycle (combinatorial!). Now what's a good way to implement that?
>> >>
>> >>I thought of a lookup table (distributed RAM) but this takes quite a
>> >>lot of space. Any better ideas? (Ray, the arithmetic guru? :-)
>> >>
>> >>Do you think I can perform this operation at 200 MHz in a Virtex-II?
>> >>
>> >>Thanks!
>> >>RISC_taker
>> >
>> >Hi RISC,
>> >
>> >If you break up your 10 bit input word into five 2 bit words, you can
>> >take the modulus of each (using five pairs of 2 input LUTs), then add
>> >the results (to get a four bit number), then take the modulus of that.
>> >
>> >This works since
>> >
>> >a mod (a - 1) = 1
>> >
>> >and
>> >
>> >b mod (a - 1) = (a x b) mod (a - 1)
>> >
>> >(Think of a as being an even power of 2, which means we don't change
>> >the modulus if we shift the input by 2 bits.)
>> >
>> >
>> >We can improve the timing by using pairs of 4 input LUTS to take four
>> >bit slices of your input.
>> >We then sum the three 2 bit values (using 2 levels of logic) to get a
>> >4 bit result, then take the modulus of that in another pair of LUTs.
>> >
>> >This takes a total of four levels of logic, which should work at
>> >200MHz in Virtex-II, depending on the speed grade, your patience, etc.
>> 
>> BTW, the total hardware is 13 LUTs, which might fit into 2 CLBs.
>> 
>> (Or you could use a block ram, as other posters have suggested.)
>> 
>> Regards,
>> Allan.
>
>If I understand how this algorithm works, I think the logic can be
>reduced to 10 LUTs in two levels.  

Yes, I realised how to do it in 10 LUTs in *three* levels just after I
posted, but by that time Avrum had already posted the equivalent
solution in Verilog so I didn't bother with a retraction.

If anyone is interested, I have appended the equivalent code in VHDL
(with "automatic" depth detection, and without the 32 bit limitation)
to this post.

Your solution using the F5 mux is faster of course (if less portable).
I'll think about adding it to the VHDL when I get some time.

>Instead of adding the values of the pairs of inputs, group them into
>even and odd bits.  Use the LUTs with the F5 mux to implement the
>modified two bit sum of five equal value inputs.  When I say modified,
>you need to produce a two bit result so logically add the result bit 2^2
>back in as another input bit.  Five bits in, two bits out.  Or you can
>think of this as a 32 entry truth table.  The point is that you only
>need two outputs from any of these functions to produce a 0, a 1 or a
>2.  
>
>This will use a total of 8 LUTs to give you a two bit even bit sum and a
>two bit odd bit sum.  These four signals can then be run through a pair
>of LUTs to give you the two bit modulo three result by using a simple
>truth table.  

Regards,
Allan.


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mod3 is
  generic (
    width     : positive := 10
  );
  port (
    input     : in  unsigned(width - 1 downto 0);
    output    : out unsigned(1 downto 0)
  );
end entity mod3;

architecture rtl of mod3 is

  pure function digit_mod (arg : unsigned(3 downto 0)) return unsigned is
    type mod3_table is array (0 to 15) of unsigned(1 downto 0);
    constant luts : mod3_table := (
                      "00","01","10","00",
                      "01","10","00","01",
                      "10","00","01","10",
                      "00","01","10","00"
                    );
  begin
    return luts(to_integer(arg));
  end digit_mod;

  pure function work_out_depth (width : positive) return positive is
    variable depth : integer := 1;
    variable my_count : integer := (width - 1) / 4;
  begin
    while my_count > 0 loop
      depth := depth + 1;
      my_count := my_count / 2;
    end loop;
    return depth;
  end work_out_depth;

  constant depth : positive := work_out_depth(width);

  type t_unsigned_array is array (depth downto 0) of
    unsigned(width + 3 downto 0);
  signal unsigned_array : t_unsigned_array := (others => (others => '0'));

begin

  unsigned_array(0)(input'range) <= input;  -- zero extend input

  g1: for d in 1 to depth generate
    g2: for w in 0 to (width - 1) / 4 generate
      unsigned_array(d)(2 * w + 1 downto 2 * w)
        <= digit_mod(unsigned_array(d - 1)(4 * w + 3 downto 4 * w));
    end generate g2;
  end generate g1;

  output <= unsigned_array(depth)(1 downto 0);

end architecture rtl;
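
For anyone who wants to sanity-check the reduction tree without a simulator, here is a rough Python model of the VHDL above. It is a sketch under the assumption that it mirrors the generate loops faithfully: `work_out_depth` is transcribed from the VHDL function, while `mod3_tree` is a made-up name for the model of the architecture body.

```python
def work_out_depth(width: int) -> int:
    """Transcription of the VHDL work_out_depth function."""
    depth = 1
    my_count = (width - 1) // 4
    while my_count > 0:
        depth += 1
        my_count //= 2
    return depth


def mod3_tree(n: int, width: int = 10) -> int:
    """Model of the g1/g2 generate loops: each level maps 4-bit slices to 2 bits."""
    value = n  # level 0 is the zero-extended input
    for _ in range(work_out_depth(width)):
        nxt = 0
        for w in range((width - 1) // 4 + 1):
            nibble = (value >> (4 * w)) & 0xF   # slice 4w+3 downto 4w
            nxt |= (nibble % 3) << (2 * w)      # digit_mod result at 2w+1 downto 2w
        value = nxt
    return value & 0b11


assert work_out_depth(10) == 3
# Exhaustive check for the 10-bit case from the original question.
assert all(mod3_tree(n, 10) == n % 3 for n in range(1024))
```

Each level roughly halves the number of significant bits, and because both 16 and 4 leave remainder 1 when divided by 3, repacking the 2-bit results side by side preserves the residue.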
