Article: 35325
Subject: Re: Active-HDL back annotated simulation and PC memory usage
From: Rick Filipkiewicz <rick@algor.co.uk>
Date: Fri, 28 Sep 2001 23:59:59 +0100

Ray Andraka wrote:

> A thorough static timing analysis is much more proof than a timing
> simulation.  Timing simulation only covers one timing case, typically
> everything at worst case (slowest) delay.  It is extraordinarily easy to put
> in a set of test vectors that won't trip up a timing simulation on a design
> that is not meeting timing.  If you have just one clock, the static timing
> is pretty much a no-brainer.  More than one clock, then you have to be
> careful about your constraints to make sure the right constraints apply to
> the right signals.  As a first cut, set all your clocks to a period
> constraint set at your fastest clock.  If that meets timing you are done.
>
> "Atkins, Kate" wrote:
>
>

I'd like to add another reason why ``STA is god''. It is very difficult to make
sure that a simulation toggles all possible FFs in all possible sequences and
basically impossible [at least computationally] to demonstrate that it
exercises all possible paths.

There is only one real exception to this - the IO timing. Here if you have some
good, timing accurate, bus functional models of the external components you can
do a cross-check of your STA IO constraints.
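
As a rough illustration of why a static check covers every path while a simulation only covers the paths its vectors happen to exercise, here is a toy longest-path calculation in Python. The netlist, the delays (in picoseconds) and the names `NETLIST`, `worst_arrival` and `slack` are all invented for this sketch; it is not how any vendor's timing analyzer is implemented.

```python
# Toy static timing check: worst-case arrival time through a combinational DAG.
# All node names and delays (in ps) are made up for illustration.
from functools import lru_cache

# node -> (delay in ps, list of fan-in nodes); "ff_q" is a register output.
NETLIST = {
    "ff_q": (800, []),                 # clock-to-Q of the launching flip-flop
    "lut_a": (600, ["ff_q"]),
    "lut_b": (600, ["ff_q"]),
    "carry": (1900, ["lut_a", "lut_b"]),
    "lut_c": (600, ["carry", "lut_a"]),
}

@lru_cache(maxsize=None)
def worst_arrival(node):
    """Latest time the node's output can settle, over ALL paths leading to it."""
    delay, fanin = NETLIST[node]
    return delay + max((worst_arrival(n) for n in fanin), default=0)

def slack(endpoint, period_ps, setup_ps):
    """Positive slack means the path into the capturing flip-flop meets timing."""
    return period_ps - setup_ps - worst_arrival(endpoint)

if __name__ == "__main__":
    print(worst_arrival("lut_c"), slack("lut_c", 5000, 500))
```

No stimulus is involved: every path into `lut_c` is accounted for, which is the property a set of test vectors cannot guarantee.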

Article: 35326
Subject: Re: Active-HDL back annotated simulation and PC memory usage
From: "S. Ramirez" <sramirez@cfl.rr.com>
Date: Fri, 28 Sep 2001 23:38:38 GMT


"Atkins, Kate" <Kate.Atkins@siraeo.co.uk> wrote in message
news:A070BF372E81D2119C3900609769B49F06F361@server1...
> Static timing analysis says "no worries" but project manager wants more
> proof :-(
> Any useful advice appreciated.
> Thanks in advance
> Kate

Kate,
     Any big and meaningful back-annotated timing analysis will produce huge
files, and some of them require fast and powerful computers with tons of
system memory.  I have 1 GB of system RAM in my machine and it can handle
some of these analyses, sometimes slowly, sometimes quickly.
     The great thing about it, though, is that I haven't had to do a
back-annotated timing analysis in a long, long time.  The reason:
synchronous design.  If you synchronize everything, you will only have three
FPGA timing analyses to do:  1) clock to clock performance (Altera calls
this registered performance), 2) input setup and hold times, and 3) clock to
out times.  The first one is done automatically in Xilinx with a NET PERIOD
constraint -- it is that simple.  For the latter two, you need to know the
input setup and hold times and clock to out times of the external circuits
in order to specify them in the FPGA constraints file.  For example,
external circuits will have differing clock to out timing.  These times must
be known in order to specify the corresponding inputs' setup timing in the
FPGA constraints file.  Generally speaking, FPGAs have zero hold time so
that usually isn't a problem unless you delete or decrease the built in
delay at the FPGA input.
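
To make the arithmetic behind the last two constraints concrete, here is a small Python sketch. The function names and the example numbers are illustrative assumptions, not values from any data sheet, and the sketch assumes a common clock with negligible skew between the FPGA and the external device.

```python
def fpga_input_setup_budget_ns(period_ns, ext_clk_to_out_ns, board_delay_ns):
    """Longest setup time the FPGA input may require: what is left of the
    clock period after the external device's clock-to-out and the trace delay."""
    return period_ns - ext_clk_to_out_ns - board_delay_ns

def fpga_clk_to_out_budget_ns(period_ns, ext_setup_ns, board_delay_ns):
    """Longest clock-to-out the FPGA output may have: what is left of the
    clock period after the external device's setup time and the trace delay."""
    return period_ns - ext_setup_ns - board_delay_ns

if __name__ == "__main__":
    # Hypothetical 50 MHz bus: 20 ns period, 1 ns of board delay each way.
    print(fpga_input_setup_budget_ns(20, 12, 1))
    print(fpga_clk_to_out_budget_ns(20, 5, 1))
```

The two results are the numbers that would go into the input setup and clock-to-out constraints for that hypothetical interface.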
     Most designs have more than one clock.  You automatically bundle all
logic in a clock domain by specifying a NET PERIOD for each clock.  Also,
the input setup/hold times and clock to output times are a function of a
specific clock and must be specified in the constraints file.  Think of
multiple clock designs as having multiple clock domains, with logic in each
domain dealing only with one clock.   Since logic in one clock domain
usually has to connect to logic in another clock domain, a synchronous
boundary exists between the two clock domains.  In a block diagram, I
usually draw dotted lines separating the clock domains.  Any signals
crossing the dotted lines must be synchronized at the inputting clock
domain, since that clock domain has the correct clock to be synchronized to.
Synchronizing flip-flops, FIFOs, dual-port RAMs and protocol stabilization
are four techniques commonly used to synchronize the signals from one clock
domain to another.  Note:  dual-port RAMs (and FIFOs) are one reason why
Xilinx leaped ahead of Altera back in the XC4K vs. 10K days!
     If your boss is requiring a back-annotated timing simulation, then you
(or the original designer!) have either not followed the synchronous design
rules or (s)he is used to having the SDF file and waveform viewer tell
him/her what is already built into the constraints file and static timing
analysis.  If (s)he has to see an output signal delayed, for example, it
will just verify that the clock to out constraint is working properly.  Once
you prove to him/her that a properly constrained design is doing all the
work automatically, (s)he should accept static timing as very adequate.  If
(s)he doesn't, well, it's his/her budget and schedule!
     If your company doesn't follow synchronous design practices, then you
are stuck with many more analyses and much more work in the constraints
file.  I know of companies that graduated from asynchronous designs in CPLDs
to the same designs/design rules in FPGAs, and the jump was much more
tedious than they bargained for.  I can relate to you example after example,
some of which never made it out the door.
     I hope I haven't bored you with synchronous design.  You are already
probably familiar with it, and if you are, use it to simplify your analyses
down to one -- static timing.  Redesign turnaround times are dramatically
reduced when one uses synchronous design and static timing analysis.  I
think of it as letting the tool do all of the work.
     Good luck.
Simon Ramirez, Consultant and Aerocanard Builder
Synchronous Design, Inc.
Oviedo, FL  USA

Article: 35327
Subject: Re: Forcing a LUT logic function (was Synplicity logic replication)
From: Ray Andraka <ray@andraka.com>
Date: Sat, 29 Sep 2001 01:39:04 GMT

Very nice.  I hadn't taken the time to figure out how to parse a string to get it
into a form usable as a bit vector.  We had been building a library of fmapped
functions, but it has grown to the point where it is unwieldy.  I've been looking
to do something similar, but have not had the time to put my mind to it.  What are
your license terms for use of the VExprEval function?  Is it free for use,
licensed, GPL?

Tim wrote:

> "Don Husby" <husby_d@yahoo.com> wrote
>
> >   This is what I ended up doing.  I guess the Golden Age of high level
> > HDL hasn't arrived yet.  The sad thing is that everyone thinks it has,
> > so when I write something simple, the tool tries to second guess me and
> > make an "optimization".  I have to go to great effort to tell it to
> > just fucking do what I say.  Instantiating a LUT is to HDL programming
> > as instantiating machine op-codes is to C programming.
> >
> >   Here's how to instantiate a LUT in Synplicity/Verilog:
> >
> >     LUT3 #('h2f) shift0(Shift[0], Sending, Stall, Ready);
> >
> >   Unfortunately, not all synthesis/simulation tools accept
> > the same format.
> >
>
> Please excuse the VHDL in the code below.  (A fuller version is
> at http://www.rockylogic.com/freestuff )
>
> It seems that what we really want is something like:
>
>   signal a,b,c,d,x : std_logic;
>   x <= LUT((a and b) xor ( c and (not d)));
>
> but that is beyond the language.  The nearest I have managed is:
>
>    LU1: VLut4 generic map ( ExprString => "((I0*I1)@(I2*~I3))" )
>               port map (I0=>a, I1=>b, I2=>c, I3=>d, O=>x );
>
> which evaluates x <= (a and b) xor ( c and (not d));
>
> VLut4 is this entity:
>
> -- =========================================================== --
> entity VLut4 is generic(ExprString  : string := "(I0*I1*I2*I3)" );
>                 port   (I0,I1,I2,I3 : in  std_logic := '0';
>                         O           : out std_logic);
> end VLut4;
>
> architecture struct of VLut4 is
>     attribute xc_map of struct : architecture is "lut";
>     constant LutBits  : bit_vector(0 to 15) := VExprEval(ExprString);
>     signal   AddrBits : std_logic_vector(3 downto 0);
>     signal   Addr     : integer range 0 to 15;
> begin
>     AddrBits <= (I3, I2, I1, I0);
>     Addr     <= to_integer( AddrBits );
>     O        <= to_stdulogic(LutBits(Addr));
> end struct;
> -- =========================================================== --
>
> Which works like this:
>   1. the 16-bit constant (A.K.A. the LUT contents) is evaluated
>      (once only) via the function VExprEval().
>   2. the incoming signals are catenated and used to index into
>      LutBits, which is pretty efficient for simulation.
>
> The xc_map is probably not needed as just about any reasonable synth
> will map a 16-bit ROM to a LUT.  I hope.
>
> The tricky part is constructing VExprEval().  What I do is this:
>   1. convert ExprString to a reverse polish (RP) equivalent.
>   2. scan through the RP 16 times, with the 16 possible values of the
>      input vector.  Each scan fills in 1 bit of the LUT constant.
>
> Here is the function:
>
> -- =========================================================== --
>     -- calculate the 16-bit INIT string corresponding to an arbitrary
>     -- function of I0..I3.
>     -- the operators are
>     --    op      priority (0=lowest)
>     --    (       0                       left paren
>     --    +       1                       OR
>     --    *,@     2                       AND,XOR
>     --    ~       3                       NOT
>     --    Ix      4                       one of the I0..I3 variables
>     --
>     -- the method is to
>     --   convert the infix expression to a postfix (reverse polish) string
>     --   evaluate the RP for all 16 possible sets of Ix
>     --
> function VExprEval( s : string; DEBUG : boolean := false) return bit_vector is
>       variable r: bit_vector(0 to 15);
>       variable iInputStr: integer;
>       variable I0,I1,I2,I3: boolean;
>       type TRps is array (0 to 100) of character;
>       variable Rps    : TRps;       -- reverse polish (RP) string
>       variable RpsLen : integer;
>
>       type TPri is array (0 to 15) of integer;        -- priority
>       type TStk is array (0 to 15) of character;      -- stack
>       variable Pri    : TPri;
>       variable Stk    : TStk;
>       variable StkLen : integer;
>       variable Priority: integer;
>
>       type TEStack is array (0 to 15) of boolean;
>       variable EStack  : TEStack;
>       variable ELen    : integer;
>       variable EResult : boolean;
>       variable iRps    : integer;
>
>       variable ch      : character;
>       variable Obuff   : string (1 to 80);
>       --synthesis translate_off
>       variable Lout    : line;
>       --synthesis translate_on
>
>   begin
>
>       --synthesis translate_off
>       if DEBUG then
>         write(Lout, "Input string : ");
>         write(Lout, s);
>         writeline(OUTPUT, Lout);
>       end if;
>       --synthesis translate_on
>
>       -- first build the Reverse Polish sequence
>       RpsLen := 0;
>       iInputStr := 1;
>       StkLen := 0;
>
>       SCAN_LOOP: for iInputStr in s'low to s'high loop
>
>         -- crude GetToken() routine
>         ch := s(iInputStr);
>         next SCAN_LOOP when ch=' ';     -- skip spaces
>         next SCAN_LOOP when ch='I';     -- I0/I1/I2/I3
>
>         -- prioritise token
>         case ch is
>           when '('              => Priority := 0;
>           when '+'              => Priority := 1;
>           when '*'|'@'          => Priority := 2;
>           when '~'              => Priority := 3;
>           when '0'|'1'|'2'|'3'  => Priority := 4;
>           when others           => Priority := 99;
>         end case;
>
>         -- evaluate token
>         case ch is
>           when '(' =>
>             Stk(StkLen) := ch;
>             Pri(StkLen) := Priority;
>             StkLen := StkLen+1;
>           when '+'|'*'|'@'|'~' =>
>             while (StkLen /= 0) and (Priority <= Pri(StkLen-1)) loop
>               StkLen := StkLen-1;               -- pop TOS to RP string
>               Rps(RpsLen) := Stk(StkLen);
>               RpsLen := RpsLen+1;
>             end loop;
>             Stk(StkLen) := ch;                  -- then push this operator
>             Pri(StkLen) := Priority;
>             StkLen := StkLen+1;
>           when '0'|'1'|'2'|'3' =>               -- variable
>             Rps(RpsLen) := ch;
>             RpsLen := RpsLen+1;
>           when ')' =>
>             RBLOOP: loop
>               if StkLen=0 then                  -- unexpected all done
>                 report "Unexpected unmatched ')' in input string.";
>                 exit RBLOOP;
>               elsif Stk(StkLen-1)='(' then      -- pop and discard
>                 StkLen := StkLen-1;
>                 exit RBLOOP;
>               else
>                 StkLen := StkLen-1;             -- pop TOS to RP string
>                 Rps(RpsLen) := Stk(StkLen);
>                 RpsLen := RpsLen+1;
>               end if;
>             end loop;
>           when others =>
>             report "Unexpected token in source string: " & ch;
>         end case;
>       end loop;
>
>       if StkLen /= 0 then
>         report "Unexpected end of input string. Unparsed characters remain.";
>       end if;
>       Rps(RpsLen) := '.';                       -- add an 'end' flag
>       RpsLen := RpsLen+1;
>
>       --synthesis translate_off
>       if DEBUG then
>         write(Lout, "RP string is : ");
>         for iRps in 0 to RpsLen-1 loop
>           Obuff(iRps+1) := Rps(iRps);
>         end loop;
>         write(Lout, Obuff(1 to RpsLen));
>         writeline(OUTPUT, Lout);
>       end if;
>       --synthesis translate_on
>
>       -- evaluate the reverse polish for 0..15
>       for i in 0 to 15 loop
>         I0 := ((i  ) rem 2)=1;
>         I1 := ((i/2) rem 2)=1;
>         I2 := ((i/4) rem 2)=1;
>         I3 := ((i/8) rem 2)=1;
>         ELen := 0;
>         iRps := 0;
>         EX_LOOP: loop
>           ch := Rps(iRps);
>           iRps := iRps+1;
>           case ch is
>             when '~' =>
>               EStack(ELen-1) := not EStack(ELen-1);
>             when '+' =>
>               EStack(ELen-2) := EStack(ELen-1) or EStack(ELen-2);
>               ELen := ELen-1;
>             when '*' =>
>               EStack(ELen-2) := EStack(ELen-1) and EStack(ELen-2);
>               ELen := ELen-1;
>             when '@' =>
>               EStack(ELen-2) := EStack(ELen-1) xor EStack(ELen-2);
>               ELen := ELen-1;
>             when '0' => EStack(ELen) := I0; ELen := ELen+1;
>             when '1' => EStack(ELen) := I1; ELen := ELen+1;
>             when '2' => EStack(ELen) := I2; ELen := ELen+1;
>             when '3' => EStack(ELen) := I3; ELen := ELen+1;
>             when '.' =>                         -- all done
>               EResult := EStack(ELen-1);
>               exit EX_LOOP;
>             when others =>
>               report "Unexpected token in RP string: " & ch;
>           end case;
>         end loop;
>
>         if EResult then r(i) := '1';
>                    else r(i) := '0';
>         end if;
>       end loop;
>
>       --synthesis translate_off
>       if DEBUG then
>         write(Lout,"INIT(15..0) is : ");
>         for i in 0 to 15 loop
>           if r(i)='1' then Obuff(16-i) := '1';
>                       else Obuff(16-i) := '0';
>           end if;
>         end loop;
>         write(Lout, Obuff(1 to 16));
>         writeline(OUTPUT, Lout);
>       end if;
>       --synthesis translate_on
>
>       return r;
>
>   end VExprEval;
> -- =========================================================== --
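
For readers who want to experiment with the algorithm outside a VHDL simulator, the two steps of the quoted VExprEval() (infix to reverse polish, then 16 evaluations) can be sketched in Python. This is an illustrative re-implementation, not Tim's code; `to_rpn` and `lut_init` are names made up here, and like the original it assumes the whole expression is wrapped in parentheses.

```python
PRIORITY = {"(": 0, "+": 1, "*": 2, "@": 2, "~": 3}

def to_rpn(expr):
    """Convert a fully parenthesised infix string to a reverse polish token list."""
    rpn, stack = [], []
    for ch in expr:
        if ch in " I":
            continue                        # skip spaces and the 'I' of I0..I3
        if ch in "0123":
            rpn.append(ch)                  # variable index goes straight out
        elif ch == "(":
            stack.append(ch)
        elif ch in "+*@~":
            # pop operators of equal or higher priority, then push this one
            while stack and PRIORITY[ch] <= PRIORITY[stack[-1]]:
                rpn.append(stack.pop())
            stack.append(ch)
        elif ch == ")":
            while stack and stack[-1] != "(":
                rpn.append(stack.pop())
            if not stack:
                raise ValueError("unmatched ')'")
            stack.pop()                     # discard the matching '('
        else:
            raise ValueError(f"unexpected token {ch!r}")
    if stack:
        raise ValueError("unparsed operators remain; wrap the expression in ()")
    return rpn

def lut_init(expr):
    """Return the 16 LUT bits; element i is the output for inputs I3..I0 = i."""
    rpn = to_rpn(expr)
    bits = []
    for i in range(16):
        inputs = [(i >> k) & 1 for k in range(4)]
        stack = []
        for tok in rpn:
            if tok == "~":
                stack.append(1 - stack.pop())
            elif tok in "+*@":
                b, a = stack.pop(), stack.pop()
                stack.append({"+": a | b, "*": a & b, "@": a ^ b}[tok])
            else:
                stack.append(inputs[int(tok)])
        bits.append(stack.pop())
    return bits

if __name__ == "__main__":
    bits = lut_init("((I0*I1)@(I2*~I3))")
    print(hex(sum(b << i for i, b in enumerate(bits))))
```

Element i of the returned list corresponds to r(i) in the VHDL function, so packing it with bit i as bit i of an integer gives the INIT value in the same (15..0) order as the debug printout.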

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 35328
Subject: Re: comparison of performance and advantages for fpga's versus microcontroller+dsp
From: "Martin Euredjian" <0_0_0_0_@pacbell.net>
Date: Sat, 29 Sep 2001 02:13:50 GMT

> > The solution offered by Celoxica is interesting (www.celoxica.com) since
> > I get to stay in my comfort zone of just writing C without having to
> > learn everything about FPGAs.

"Kevin Neilson" <kevin_neilson@removethis-yahoo.com> wrote in message
news:M7Nq7.15942
> That's exactly what Celoxica wants you to think.  If only it were true!

Can you (or anyone else) elaborate on that comment?  I am also considering
the Celoxica approach and would be very interested in the reasons why it
wouldn't work.  I have an urgent project that has very little room for delays
(in the design process) and the idea of using quasi-C to define the required
FPGA functions is very appealing to me, particularly because I don't
normally work with FPGA's.

Thank you,

--
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net     where    "0_0_0_0_"  =  "martineu"

Article: 35329
Subject: Re: Using EABs in Leonardo Spectrum with Flex10K
From: Russell Shaw <rjshaw@iprimus.com.au>
Date: Sat, 29 Sep 2001 16:43:58 +1000

It might. My design had ram that used EABs.

Aldo Romani wrote:
> 
> May this affect the missing synthesis into EAB?
> Leonardo Spectrum calls them 'Memory Bits', and I can't get to use them at
> all. Any design, including memories, uses them at 0%.
> 
> Anybody has any suggestion? I'm quite desperate...
> 
> Thanks for your answer,
> Aldo
> 
> "Russell Shaw" <rjshaw@iprimus.com.au> wrote in message
> news:3BB3F3D8.F67A166E@iprimus.com.au...
> > You should download version 'd' of leonardo. My designs crashed version
> > 'a' due to compiler bugs.
> >
> > Aldo Romani wrote:
> > >
> > > Hello to the newsgroup,
> > > maybe some of you may help me.
> > >
> > > I'm a newbie in FPGA programming, so I apologize from now if I am going
> > > to ask silly questions.
> > >
> > > I have some VHDL code I have to synthesize on Altera Flex10K devices. I
> > > use Leonardo Spectrum, version v20001a2.75.
> >

Article: 35330
Subject: Re: Xilinx 4.1 software
From: hamish@cloud.net.au
Date: Sat, 29 Sep 2001 07:44:23 GMT

Tom Brooks <tbrooks@corepower.com> wrote:
> No, in fact, with 3.3, i was seeing runtimes
> around 30 minutes, with 4.1, it took well over
> an hour.

Did you run with equivalent options? "-l 5" (effort level 5)
implies "-xe 1" (extra effort level 1), which didn't exist on
3.1i. You might need -xe 0 to make the options equivalent.

I have been quite pleased with the 4.1i SP1 results
for a 2V6000 design I'm working on.

Hamish
-- 
Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>

Article: 35331
Subject: Re: Meta-stability
From: Manjunath <manjunathan_s1@yahoo.com>
Date: Sat, 29 Sep 2001 01:39:29 -0700

Thanks to everyone for spending your valuable time.

Now I am clear about the issue with simulating the design.

Thanks once again,

Have a great day,

Regards,

Manjunath

Article: 35332
Subject: Re: Programming flash connected to CPLD via JTAG
From: Dmitry Kuznetsov <dkuzn@orc.ru>
Date: Sat, 29 Sep 2001 13:14:41 +0400

Fri, 28 Sep 2001 09:55:19 +0200  Matthias Fuchs
<matthias.fuchs@esd-electronics.com> wrote:

>This leads to my question: Why do I need separate jtag chains ? My idea
>is to use the PLDs JTAG interface to 
>access the PLD pins. I think this is called boundary scan, isn't it ?
>
>Is this possible ?

 http://www.orc.ru/~dkuzn/v4j_e.htm - VEC4JTAG - Vectors for JTAG
 http://www.orc.ru/~dkuzn/j_jam.htm - JAM (sorry, in russian only)
 

Best regards!
 Dmitry Kuznetsov, Moscow, http://www.orc.ru/~dkuzn/index.htm
[Team RaceTerrapin]  [Team LEXX]
===

Article: 35333
Subject: Re: How does Altera FLEX 10k communicate with PC?
From: m8931612@student.nsysu.edu.tw (Ru-Chin Tsai)
Date: 29 Sep 2001 03:00:39 -0700

Armin Mueller <armin.mueller@stud.uni-karlsruhe.de> wrote in message
> Depends on you. ISA is easy but rather slow. PCI doesn't
> even fit in a 10K10 part, whatever you use.
> 
> Armin

I have obtained the Altera PCI MegaCore function (a PCI interface soft IP).
My core design will be connected to the PCI MegaCore function.
The difficulty is that I don't know the meaning of the PCI bus pins.

Can anyone tell me how to use the PCI MegaCore function easily without
much knowledge of the PCI bus?

The PCI-SIG supplies the PCI specification, but only to members.
Membership costs $3000 (U.S.), which is impossible for
a student. So can someone give me a suggestion? Is any free
PCI data sheet or specification available on the internet?

Article: 35334
Subject: Re: fir filter
From: renaux <renaux.jacky@wanadoo.fr>
Date: 29 Sep 2001 14:50:35 GMT


just posting a message sent to kuldeep 


Hi,
Personally I think this architecture makes two things very easy:
  building the table files
  updating coefficients (during debug and later on),
    as no re-routing is needed: just reload a new set

Serial arithmetic
    For your case I understand that 12 bits * 16 MHz = 192 MHz.
    Why not build 2 RAMs, one for the odd bits and the second for the
    even bits, and add the 2 results? That means running 2 sets in parallel
    at 6 * 16 = 96 MHz, which is not a big deal any more. The 2 tables
    have the same content; it is just a matter of how you feed the addresses.
    If you want to run slower, use 4 RAMs running at 1/4 the speed
    and add the results. You could go all the way down to 1 RAM per bit, each
    running at 16 MHz, but you would need 12 RAMs. That is obviously not what I
    would recommend, as any ASIC today can run a 100 MHz system clock.
Take care
    Be sure to run the least significant bit first. On each cycle you add
    the partial product (from the RAM) to the accumulated product divided
    by 2 (right shift), but on the last cycle you must subtract, because the MSB
    (the sign bit in 2's complement) is addressing the RAM.
    As you right shift the accumulated result you can feed the shifted-out bits
    into a shift register in order to end up with a full-scale result.

Coefficients (5 in this example)

Non-symmetrical
 you need 5 shift registers (which can also be a RAM)
    reg5 <= (reg4(0) & reg5(11 downto 1));  
    reg4 <= (reg3(0) & reg4(11 downto 1));  
    reg3 <= (reg2(0) & reg3(11 downto 1));  
    reg2 <= (reg1(0) & reg2(11 downto 1));  
    reg1 <= (data_in & reg1(11 downto 1));
    address_ram <= (reg5(0) & reg4(0) & reg3(0) & reg2(0) & reg1(0) );

Symmetrical

  usually (coeff1 = coeff5, coeff2 = coeff4)
  you still need 5 shift registers, and you group the taps by
  coefficient value


    reg5 <= (reg4(0) & reg5(11 downto 1));  
    reg4 <= (reg3(0) & reg4(11 downto 1));  
    reg3 <= (reg2(0) & reg3(11 downto 1));  
    reg2 <= (reg1(0) & reg2(11 downto 1));  
    reg1 <= (data_in & reg1(11 downto 1));
    
    add1 <= reg5(0) + reg1(0) + carry1;	
    add2 <= reg4(0) + reg2(0) + carry2;

--   this is a carry-save type adder
--   carry1 and carry2 are saved from the previous cycle; try to
--   configure add1 and add2 as 2 bits, where the LSB is the add result
--   and the MSB is the carry
 
   address_ram <= ( reg3(0) & add2 & add1 );  

-- which requires half the address bits (in the case of an odd number of coefficients)

But you must take into account the 13th bit, the carry out of add1 and add2, which
requires 1 more cycle. You should consider 14 bits, and then run at
14/2 * 16 MHz.

14 bits works well: if you run one more cycle and pad zeros as bits 13 and 14 into
the shift registers, you will not have to reset the carries at each new cycle.
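
The bit-serial pre-adder described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: `serial_add` is a made-up name, the operands are treated as unsigned 12-bit values, and a single padding cycle is modelled (signed samples would need sign extension rather than zero padding, which is left out here).

```python
def serial_add(a, b, width=12):
    """Add two unsigned `width`-bit values one bit per cycle, LSB first.

    Returns (sum, carry left in the carry flip-flop after the last cycle).
    """
    carry = 0
    total = 0
    for cycle in range(width + 1):            # the extra cycle flushes the carry
        a_bit = (a >> cycle) & 1 if cycle < width else 0   # zero padding
        b_bit = (b >> cycle) & 1 if cycle < width else 0
        two_bits = a_bit + b_bit + carry      # 2-bit result, 0..3
        total |= (two_bits & 1) << cycle      # LSB is the sum bit
        carry = two_bits >> 1                 # MSB is saved for the next cycle
    return total, carry

if __name__ == "__main__":
    print(serial_add(1234, 3210))
```

Because the padding cycle adds 0 + 0 + carry, the saved carry is always back at zero when the next word starts, which is the reason given above for not needing an explicit carry reset.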

RAMs

  As you have 65 coefficients you'll need a RAM with 32 address bits, which is very
  large even in an ASIC, so you must split it into smaller blocks (if you only have
  1K * Y RAMs, then you'll need 4 of them).
  How about Y (the RAM output bus)? It depends on the number of coefficient inputs and
  the coefficient size. If you have 11-bit coefficients and 10 address bits,
  the largest value occurs when all address bits are '1'; in theory the output is
  10+11=21 bits.
  BUT is that really the largest partial product? Not necessarily; it depends on the
  values, and some might be negative. Try to calculate the largest value (Excel is fine
  for that) and then optimize the RAM size. Sometimes this is not necessary, as it is
  design dependent.

Speed

  At 100 MHz you must take care of initialisation, and the accumulator must be
  cleared every cycle. There are 3 options (at least for me):
    1- add one more cycle to clear the accumulator
    2- add one more cycle and add the RAM content + zero (this is not an
       accumulation cycle)
    3- use one more register. On the last cycle, instead of storing the adder output
       in the accumulator, you store it in a register that keeps the cycle result
       while you clear the accumulator

Skeleton

    reg5 <= (reg4(0) & reg5(11 downto 1));  
    reg4 <= (reg3(0) & reg4(11 downto 1));  
    reg3 <= (reg2(0) & reg3(11 downto 1));  
    reg2 <= (reg1(0) & reg2(11 downto 1));  
    reg1 <= (data_in & reg1(11 downto 1));
    
    add1 <= reg5(0) + reg1(0) + carry1;	
    add2 <= reg4(0) + reg2(0) + carry2;

    address_ram <= ( reg3(0) & add2 & add1 );  

    from 1 to last_cycle_clock-1
        adder_out   <= accumulator/2 + ram_out;
        accumulator <= adder_out;
    last_cycle_clock 
        adder_out <= accumulator/2 - ram_out;
        result <= adder_out;
        accumulator <= '0'; 
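
The accumulate loop of the skeleton can be checked against an ordinary dot product with a short Python model. It is a sketch, not a hardware description: `build_table` and `da_dot_product` are hypothetical names, the non-symmetrical 5-tap case is modelled, and exact fractions stand in for the shift register that would collect the bits shifted out of the accumulator.

```python
from fractions import Fraction

def build_table(coeffs):
    """RAM contents: entry `addr` holds the sum of the coefficients
    whose address bit is 1."""
    return [sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
            for addr in range(1 << len(coeffs))]

def da_dot_product(samples, coeffs, width=12):
    """Bit-serial distributed arithmetic on signed `width`-bit samples."""
    table = build_table(coeffs)
    mask = (1 << width) - 1
    regs = [s & mask for s in samples]        # two's complement bit patterns
    acc = Fraction(0)                          # exact, so the shift loses nothing
    for cycle in range(width):                 # least significant bit first
        addr = 0
        for j, reg in enumerate(regs):
            addr |= ((reg >> cycle) & 1) << j  # one address bit per tap
        ram_out = table[addr]
        if cycle < width - 1:
            acc = acc / 2 + ram_out            # accumulator/2 + ram_out
        else:
            acc = acc / 2 - ram_out            # sign-bit cycle: subtract
    return int(acc * (1 << (width - 1)))       # undo the right shifts

if __name__ == "__main__":
    samples = [5, -3, 100, -2048, 2047]
    coeffs = [3, -1, 4, 1, -5]
    print(da_dot_product(samples, coeffs),
          sum(s * c for s, c in zip(samples, coeffs)))
```

The final scaling by 2^(width-1) only undoes the right shifts; in hardware the same bits end up in the result shift register instead.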


I hope this helps; do not hesitate to contact me.

   regards    
   


-----Original Message-----
From: Kuldeep [mailto:kkdeep@lycos.com]
Sent: Thursday, 27 September 2001 15:52
To: renaux.jacky@wanadoo.fr
Subject: RE: fir filter


Hi jacky ,
   Thanks for the reply. This seems to be a good architecture, as I can trade off
throughput against hardware. A fully serial approach will not work for me, as my
input data is coming at 16 MHz, 12 bits wide. That means I need a clock of 192 MHz
(16x12), which I can't afford. Correct me if I am wrong. So I will go for some mix
of serial and parallel approaches.
  I have two doubts:
 Quoting a line from your reply:
1. "you better add coefficient before feeding the partial products table"
Here do you mean adding inputs (for which the coefficients happen to be the same)
before feeding the partial product table? Please elaborate further on how I can take
advantage of symmetrical coefficients.
2. I have an odd number (65) of coefficients. Each LUT takes 4 coefficients, so where
will the last coefficient go? Should I use 1 LUT for this single coefficient?

Thanks and regards
Kuldeep

renaux <renaux.jacky@wanadoo.fr> wrote in message
news:<2001925-184146-543341@foorum.com>...
> Hi 
> 
> I would suggest you read an excellent paper on distributed arithmetic 
> where part of the calculation is done before running while the remaining 
> is done during the run 
> 
> go to http://www.andraka.com/ ,  DSP with FPGA and distributed arith 
> 
> This is intended for FPGAs, but using a case statement it can be targeted to
> any technology. In addition, using a RAM as the table would simplify the
> FIR filter implementation: one tap per address line, and an output bus as wide as
> the sum of the coefficient values (with 16 12-bit coefficients => 4+12 bit bus).
> Do not miss the fact that if the coefficients are symmetrical you better add
> coefficient before feeding the partial products table.
> 
> do not hesitate to drop me a mail in case it is not clear enough 
> 
> regards , jacky 
> 
> ------
> User of http://www.foorum.com/. The best tools for usenet searching.





Article: 35335
Subject: Timing on output
From: "Noddy" <g9731642@campus.ru.ac.za>
Date: Sat, 29 Sep 2001 17:06:45 +0200

Sorry, this is somewhat of a newbie question as my knowledge of the use of
Timing Constraints is very limited. How can I check whether data is changing
at the inputs of my output flip-flops before the clock signal? I have a
registered adder going to the output FFs. I am using a CLKDLL. Does this
automatically mean I can be sure that the rising edges at all my clock inputs
happen at exactly the same time?

adrian

Article: 35336
Subject: Re: Timing on output
From: Peter Alfke <palfke@earthlink.net>
Date: Sat, 29 Sep 2001 15:46:53 GMT

Yes, you can be sure that any global clock distribution has very little skew.
In Virtex-II the skew, i.e. the difference in arrival time at thousands of
flip-flop clock inputs, even on a very big global clock net, is less than 200
picoseconds.
The actual clock propagation delay is of course much longer, but you can
eliminate that completely by using the DLL.
Peter Alfke, Xilinx Applications

Noddy wrote:

> Sorry, this is somewhat of a newbie question as my knowledge of the use of
> Timing Constraints is very limited. How can I check whether data is changing
> at the inputs of my output flip-flops before the clock signal? I have a
> registered adder going to the output FFs. I am using a CLKDLL. Does this
> automatically mean I can be sure that the rising edges at all my clock inputs
> happen at exactly the same time?
>
> adrian

Article: 35337
Subject: Re: Using EABs in Leonardo Spectrum with Flex10K
From: "Aldo Romani" <romani@free.mail.it>
Date: Sat, 29 Sep 2001 18:05:31 +0200

Sorry to bother you again, and thanks again for your answers.
Do you have some sample code to post that you were able to synthesize into EABs?
Then I can see whether the bug is in the compiler or in my code.
We are going to buy the latest versions of Leonardo and MaxPlus, but I'm
afraid it will take some weeks.

I've already studied about 30 MB of PDF manuals, but still can't get anything
compiled into EABs.
It's for my degree thesis, and if I don't get out of this I won't get my
degree.

Another silly question: do you know if the synthesis on EABs has to be done
with Leonardo or with MaxPlus?

Thanks and goodbye!
Aldo Romani

--
Aldo Romani
Student at DEIS - University of Bologna - Italy
romani@free.mail.it
For my mail remove the first dot.

"Russell Shaw" <rjshaw@iprimus.com.au> wrote in message
news:3BB56DAE.695C960A@iprimus.com.au...
> It might. My design had ram that used EABs.
>
> Aldo Romani wrote:
> >
> > May this affect the missing synthesis into EAB?
> > Leonardo Spectrum calls them 'Memory Bits', and I can't get to use them
> > at all. Any design, including memories, uses them at 0%.
> >
> > Anybody has any suggestion? I'm quite desperate...
> >
> > Thanks for your answer,
> > Aldo
> >
> > "Russell Shaw" <rjshaw@iprimus.com.au> wrote in message
> > news:3BB3F3D8.F67A166E@iprimus.com.au...
> > > You should download version 'd' of leonardo. My designs crashed
> > > version 'a' due to compiler bugs.
> > >
> > > Aldo Romani wrote:
> > > >
> > > > Hello to the newsgroup,
> > > > maybe some of you may help me.
> > > >
> > > > I'm a newbie in FPGA programming, so I apologize from now if I am
> > > > going to ask silly questions.
> > > >
> > > > I have some VHDL code I have to synthesize on Altera Flex10K
> > > > devices. I use Leonardo Spectrum, version v20001a2.75.
> > >

Article: 35338
Subject: Re: comparison of performance and advantages for fpga's versus microcontroller+dsp
From: "Kevin Neilson" <kevin_neilson@removethis-yahoo.com>
Date: Sat, 29 Sep 2001 17:21:54 GMT

You can find more in the archive, at www.fpga-faq.com, as this has been
discussed before.  And I have never written FPGA code in C.  But I think
Celoxica is selling a dream; they are capitalizing on hopes for a shortcut
that doesn't exist, like pills that make you lose weight or schemes that
make you rich with no blood or sweat.

This dream is sold in many forms, including compilers that optimize
sequential logic for VLIW processors or parallel processing units.  The
hawkers speak of a land of milk and honey which is always on the horizon.

If you can fill out a form in English, then certainly you can write a great
novel, no?  The novel is written in the same language, same syntax.  Yet the
ability to fill out a form doesn't mean you can formulate plot,  develop
characters, and synthesize exciting dialogue.

In the same manner, circuit design, with its inherent concurrency, will
never be the same as sequential code design.  And while you can describe the
circuit in any syntax, including C, the path to proficiency will not be
shortened, and the end result will look eerily like Verilog, with the
addition of unnecessary verbosity.

-Kevin

"Martin Euredjian" <0_0_0_0_@pacbell.net> wrote in message
news:y7at7.1881$KA3.371046508@newssvr13.news.prodigy.com...
> > > The solution offered by Celoxica is interesting (www.celoxica.com)
> > > since I get to stay in my comfort zone of just writing C without
> > > having to learn everything about FPGAs.
>
> "Kevin Neilson" <kevin_neilson@removethis-yahoo.com> wrote in message
> news:M7Nq7.15942
> > That's exactly what Celoxica wants you to think.  If only it were true!
>
> Can you (or anyone else) elaborate on that comment?  I am also considering
> the Celoxica approach and would be very interested in the reasons why it
> wouldn't work.  I have an urgent project that has very little room for
> delays (in the design process), and the idea of using quasi-C to define the
> required FPGA functions is very appealing to me, particularly because I
> don't normally work with FPGAs.
>
> Thank you,
>
> --
> Martin Euredjian
>
> To send private email:
> 0_0_0_0_@pacbell.net     where    "0_0_0_0_"  =  "martineu"

Article: 35339
Subject: Re: Active-HDL back annotated simulation and PC memory usage
From: "Kevin Neilson" <kevin_neilson@removethis-yahoo.com>
Date: Sat, 29 Sep 2001 17:50:08 GMT

This is very true.  However, I often find that something is wrong with my
constraints, so the design passes static timing analysis but still fails.
This happens less with more experience, but it happens nonetheless.

Sometimes the tools don't interpret the constraints properly.  I just had a
design which used the phase-shifted output of the DLL.  Alliance didn't pass
the constraint through the DLL properly, so I met static timing, but in error.

I also had a module which I declared as a 4-cycle path.  Indeed, each flop
changed only once every four cycles, but one flop changed a cycle before the
rest, so its output was really a one-cycle path.  Again I met static timing
but failed gate-level simulation.  This latter case was an error in the way I
set constraints, but an easy error to make.

I try to make static timing my main method of testing, but I usually find it
necessary to run the gate-level simulation just in case, especially when the
constraints are at all complex.  As I get better, I find that the gate-level
sim becomes more of a routine than a necessity.
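
The 4-cycle-path mistake described above can be illustrated with a small
sanity check. This is only a sketch in Python under stated assumptions: the
flop names, the enable phases and the idea that every flop may feed every
other flop are all hypothetical, and no real timing tool works this way.

```python
def cycles_available(launch_phase: int, capture_phase: int, period: int) -> int:
    """Clock cycles between a launch edge and the next capture edge.

    Phases are cycle indices (modulo `period`) on which a flop is enabled.
    """
    gap = (capture_phase - launch_phase) % period
    return gap if gap != 0 else period


def check_multicycle(phases: dict, declared: int, period: int) -> list:
    """Return (src, dst, available) for every path tighter than `declared`."""
    violations = []
    for src, src_phase in phases.items():
        for dst, dst_phase in phases.items():
            if src == dst:
                continue
            avail = cycles_available(src_phase, dst_phase, period)
            if avail < declared:
                violations.append((src, dst, avail))
    return violations


# Hypothetical module: three flops enabled on cycle 0 of every 4,
# and one flop that changes a cycle before the rest (cycle 3).
PHASES = {"ff_a": 0, "ff_b": 0, "ff_c": 0, "ff_early": 3}

VIOLATIONS = check_multicycle(PHASES, declared=4, period=4)
for src, dst, avail in VIOLATIONS:
    print(f"{src} -> {dst}: only {avail} cycle(s), declared 4")
```

Paths between the three aligned flops really do get four cycles, but anything
launched from the early flop gets one, which is the case a blanket 4-cycle
constraint hides from static timing.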

-Kevin

"Ray Andraka" <ray@andraka.com> wrote in message
news:3BB4B550.2E5609AE@andraka.com...
> A thorough static timing analysis is much more proof than a timing
> simulation.  Timing simulation only covers one timing case, typically
> everything at worst case (slowest) delay.  It is extraordinarily easy to
> put in a set of test vectors that won't trip up a timing simulation on a
> design that is not meeting timing.  If you have just one clock, the static
> timing is pretty much a no-brainer.  More than one clock, then you have to
> be careful about your constraints to make sure the right constraints apply
> to the right signals.  As a first cut, set all your clocks to a period
> constraint set at your fastest clock.  If that meets timing you are done.
>
> "Atkins, Kate" wrote:
>
> > Hi
> >
> > Has anybody else noted Active-HDL (VHDL) using huge amounts of memory
> > when doing post place and route simulation with timing info (SDF)?
> >
> > Specifically, it is an Actel RT14100, clocked at <4 MHz. When simulation
> > time gets to 600 ms, memory usage is about 600 MB, and as this is into
> > virtual memory on my machine, simulation slows to less than a snail's pace.
> >
> > Static timing analysis says "no worries" but project manager wants more
> > proof :-(
> >
> > Any useful advice appreciated.
> >
> > Thanks in advance
> >
> > Kate
>
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
>
>  "They that give up essential liberty to obtain a little
>   temporary safety deserve neither liberty nor safety."
>                                           -Benjamin Franklin, 1759
>
>

Article: 35340
Subject: Re: Orcad symbol for a Virtex II
From: rotemg@mysticom.com (Rotem Gazit)
Date: 29 Sep 2001 15:43:06 -0700

Tim,
The .pad file is automatically generated by the Xilinx PAR tool.
I prefer to generate my Capture symbols manually. Here is how it is
done (I posted this method to this group about 4 months ago):

"I use the following technique to generate Orcad symbols for large
FPGAs.

1)
Download the device pinout table from
http://www.xilinx.com/products/virtex/ vepackages.htm or
v2packages.htm.
Alternatively you can use:
> partgen -v YOURDEVICE
Don't count on the printed data sheet; it sometimes contains errors
(for example, I found an error in the 405ebg560 data sheet pinout table).

2)
Read the pinout text file into your favorite spreadsheet software (I use
Microsoft Excel).
You now have to decide how you want your symbol to look.
For parts with more than 100 I/Os I suggest using "heterogeneous
symbols" (meaning that you have more than one symbol for the SAME
device).
I usually use 6 symbols: one for power, one for configuration and
one for every two I/O banks (including the GCLK and the VCCO).
You need to arrange the spreadsheet in groups corresponding to the
symbols you need (I use "Sort By" -> "bank number" and then by
"pin type").

3) Here is the trick:
In Capture select "New Part", then Place -> "Pin Array", and draw as many
pins as you need.
Now select all the pins and press Ctrl+E. You get a little
spreadsheet with the pin names and numbers.
Select the whole spreadsheet and press Ctrl+Insert. Switch to Excel
and paste the data into an empty sheet.
Copy the real pin names from the original sheet you created and paste
them over the automatic pin names.
Select the new data, switch back to Capture and press Shift+Insert.
Repeat step (3) for each of the heterogeneous symbols.

I have used this technique many times on Windows; I don't know if it can
be done on other operating systems.
It takes no more than 30 minutes to create a new symbol for a > 400 I/O
FPGA."
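
The sorting and grouping in step 2 could also be scripted instead of done by
hand in the spreadsheet. A minimal sketch in Python, assuming the pinout has
already been reduced to rows of (pin number, pin name, bank, pin type); the
sample rows, the type labels and the group names are made up for
illustration and do not reflect the actual layout of the Xilinx package files.

```python
from collections import defaultdict

# Hypothetical pinout rows: (pin_number, pin_name, bank, pin_type).
# Real package files are laid out differently; adapt the parsing to them.
PINS = [
    ("A3", "IO_L01N_0", 0, "IO"),
    ("B4", "VCCO_0", 0, "VCCO"),
    ("C5", "IO_L02P_1", 1, "IO"),
    ("D6", "IO_L05N_2", 2, "IO"),
    ("E7", "VCCINT", None, "POWER"),
    ("F8", "GND", None, "POWER"),
    ("G9", "PROG_B", None, "CONFIG"),
]


def group_pins(pins, banks_per_symbol=2):
    """Split pins into one power group, one config group and one per bank pair."""
    groups = defaultdict(list)
    for number, name, bank, pin_type in pins:
        if pin_type == "POWER":
            key = "power"
        elif pin_type == "CONFIG":
            key = "config"
        else:
            first = (bank // banks_per_symbol) * banks_per_symbol
            key = f"banks_{first}_{first + banks_per_symbol - 1}"
        groups[key].append((bank, pin_type, name, number))
    # Within each group, sort by bank number and then by pin type.
    return {key: sorted(rows) for key, rows in groups.items()}


GROUPS = group_pins(PINS)
for symbol, rows in GROUPS.items():
    print(symbol, [name for _, _, name, _ in rows])
```

Each resulting group corresponds to one of the heterogeneous symbols, and its
rows are already in the order in which they would be pasted over the pin
array in step 3.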

-----------------------------
Rotem Gazit
mailto:rotemg@mysticom.com
http://www.mysticom.com
-----------------------------

"Tim Hart" <thart@verilink.com> wrote in message news:<ee71e64.1@WebX.sUN8CHnE>...
> I have a basic question. Where can I get a .pin, .pad or .xnf file to create the part? I am using OrCAD 9.2 and I do have the create part option. I will gladly share the symbol if I can get the necessary file.

Article: 35341
Subject: Re: How does Altera FLEX 10k communicate with PC?
From: "Daniel Lang" <dblx@xtyrvos.caltech.edu>
Date: Sat, 29 Sep 2001 16:03:32 -0700

Take a look at "PCI Handbook" and "PCI System Architecture, 4th Edition"
at Annabooks, http://www.annatechnology.com/annatech/bookBrowseByBusF.asp.
They are not free but MUCH cheaper than $3000!

Daniel Lang


"Ru-Chin Tsai" <m8931612@student.nsysu.edu.tw> wrote in message
news:d22f039b.0109290200.4e741d28@posting.google.com...
> Armin Mueller <armin.mueller@stud.uni-karlsruhe.de> wrote in message
> > Depends on you. ISA is easy but rather slow. PCI doesn't
> > even fit in a 10K10 part, whatever you use.
> >
> > Armin
>
> I have obtained the Altera PCI MegaCore function (a PCI interface soft IP).
> My core design will be connected to the PCI MegaCore function.
> The difficulty is that I don't know the meaning of the PCI bus pins.
>
> Can anyone tell me how to use the PCI MegaCore function easily without
> much knowledge of the PCI bus?
>
> The PCI-SIG supplies the PCI specification, but only to members.
> Membership costs $3000 (U.S.), which is impossible for
> a student. So can someone give me a suggestion? Is any free
> PCI data sheet or specification available on the internet?

Article: 35342
Subject: Re: How to fix the hold time violation (clock skew>data skew) in
From: "A. I. Khan" <aikhan@chat.carleton.ca>
Date: 30 Sep 2001 00:03:46 GMT

I think you can do it in the "constraint editor" of the Foundation tool
when you synthesize your design; there you can force the clock signal(s)
to use global buffer(s).

Noddy wrote:

> Jean-Baptiste Monnard <jmonnard@horizon-tech.fr> wrote in message
> news:9opa87$7je$1@wanadoo.fr...
> >
> > You should enable the Global Signal option for your clock signals.
>
> Sorry, but how do you do this? (I am using Foundation 3.1)
>
> adrian

Article: 35343
Subject: Re: Using EABs in Leonardo Spectrum with Flex10K
From: Russell Shaw <rjshaw@iprimus.com.au>
Date: Sun, 30 Sep 2001 12:08:17 +1000

I'm just as clueless about manipulating the lower levels of
code into specific hardware. My design just had some sram:

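-- Entity declaration not shown; addr_width, data_type and the ports
-- (inclock, we, address, data, q) are declared elsewhere.
architecture arch of ram is
  type mem_type is array((2**addr_width)-1 downto 0) of data_type;
  signal mem_array : mem_type := (others => (others => '0'));
begin
  process(inclock, we, address, mem_array)
  begin
    -- Synchronous write on the rising edge of inclock.
    if (inclock = '1' and inclock'event) then
      if (we = '1') then
        mem_array(address) <= data;
      end if;
    end if;
    -- Asynchronous read: q follows the addressed location.
    q <= mem_array(address);
  end process;
end arch;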

Without any specific directives in the code (though I might have had an
"implement in EAB" box ticked), Leonardo simply reported something like
"implementing xxx into EABs". That was a while ago; I have since moved to
an updated version of Leonardo.


Aldo Romani wrote:
> 
> Sorry to bother you again, and thanks again for your answers,
> do you have some sample code to post you were able to synthesize on EABs?
> So I can see if the bug is in the compiler or in my code.
> We are going to buy the latest version of  Leonardo and MaxPlus, but I'm
> afraid it will be a matter of some week.

For learning vhdl, downloading the *free* maxplus2 and leonardo is all
you need. Beware that writing/debugging vhdl in leonardo is hopeless
because of the cryptic error messages. I compile and simulate using
free vhdl-simili, then synthesize in leonardo as the second-last step,
then compile in maxplus2 as the last step.


Article: 35344
Subject: about JBits
From: "Terrence Mak" <stmak@cuhk.edu.hk>
Date: Sun, 30 Sep 2001 10:43:54 +0800

Hi,
I have downloaded JBits from Xilinx and read the tutorial notes, but I
still don't understand the concept of 'column' and 'row' that has to be
specified in the Java program.

How can I make a simple start with JBits, e.g. just implement a counter on
a Virtex 300?
Thanks

Terrence Mak

Article: 35345
Subject: Re: Programming flash connected to CPLD via JTAG
From: assaf_sarfati@yahoo.com (Assaf Sarfati)
Date: 29 Sep 2001 23:01:30 -0700

Matthias Fuchs <matthias.fuchs@esd-electronics.com> wrote in message news:<3BB2E378.38B42639@esd-electronics.com>...
> Hi,
> 
> I am looking for an application note on how to program a flash that is
> connected to the pins of a small PLD (XC9500).
> Later the PLD/flash combination should boot an FPGA. But first I need a
> way to program the flash through the JTAG interface of the PLD. Is this
> possible? Is there some software available to do this? I think it
> should be something like a scriptable JTAG tool set....
> 
> Everything I found are app notes that describe the booting of the FPGA
> through the PLD/flash combination, but no one cares about programming the
> flash for the first time :-)
> 
> Any ideas or experiences ?
> 
> Matthias

It should be possible if all the Flash pins are connected to pins on
devices in the JTAG chain: the basic operation of JTAG is setting and
reading I/O pins. However, programming a Flash is pretty tricky even
when it's attached directly to a processor; doing it over JTAG will
be much harder and slower (every Flash read/write cycle will require a
large number of JTAG commands). IIRC, there is (was?) a company called
Corelis (?) which had a tool for doing it, but it was very expensive
and required adding a card in the PC (IIRC > $10K a few years back).
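
A rough feel for why it is slow can be had from a back-of-the-envelope
estimate. The sketch below (Python) only counts bits shifted through the
boundary-scan chain; every number in it is an illustrative assumption, not a
figure for any particular device, flash or cable.

```python
def jtag_word_program_time(chain_bits, scans_per_bus_cycle,
                           bus_cycles_per_word, tck_hz):
    """Seconds of pure TCK shifting needed to program one flash word."""
    bits_shifted = chain_bits * scans_per_bus_cycle * bus_cycles_per_word
    return bits_shifted / tck_hz


# Illustrative assumptions only, not measured values:
CHAIN_BITS = 200          # boundary-scan cells in the whole chain
SCANS_PER_BUS_CYCLE = 3   # set up address/data, assert write enable, release it
BUS_CYCLES_PER_WORD = 4   # e.g. a short command sequence plus the data write
TCK_HZ = 1_000_000        # 1 MHz test clock

PER_WORD = jtag_word_program_time(CHAIN_BITS, SCANS_PER_BUS_CYCLE,
                                  BUS_CYCLES_PER_WORD, TCK_HZ)
TOTAL_1MI_WORDS = PER_WORD * 1024 * 1024

print(f"{PER_WORD * 1e3:.1f} ms per word, "
      f"{TOTAL_1MI_WORDS / 60:.0f} min for 1 Mi words")
```

With these made-up numbers each word costs a few milliseconds of shifting
alone, before any flash status polling or software overhead, which is in line
with the "much harder and slower" remark above.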

A better way is to allow your processor access to the Flash after the
FPGA is loaded and then use the processor to program it; if the
processor requires an operational FPGA to access the Flash (chicken
and egg...), then you can download the FPGA through the JTAG the 1st
time a board is fired up and then run the Flash programming algorithm.

Article: 35346
Subject: Xilinx Virtex-II reconfiguration
From: "Patrick Muller" <patrick at scs dot ch>
Date: Sun, 30 Sep 2001 11:57:42 +0200

Hi

To reconfigure a Xilinx Virtex-II, the Prog_B signal has to be held low for
at least 300 ns. Is it sufficient to hold Prog_B low until the Init_B
signal goes low? Or is the delay from the falling edge of Prog_B to the
falling edge of Init_B always longer than 300 ns anyway?

Thanks,

   Patrick

Article: 35347
Subject: MAX Plus Division
From: "yaohan" <engp1590@nus.edu.sg>
Date: Sun, 30 Sep 2001 22:13:20 +0800



Hi, I have a friend who wrote a statement

  Speed <= (1 / ( counter * 100 )) * 100;

Where
port speed : out integer range 0 to 100
variable counter : integer range 0 to 125;

To my surprise, the MAX+plus II compiler synthesized it and even allowed
simulation. However, the result is wrong.
How did the compiler actually interpret the statement?

Thank you!
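
One likely reading, going by VHDL's integer semantics rather than anything
specific to MAX+plus II: integer division truncates, so 1 / (counter * 100)
is 0 for every counter >= 1 and the whole expression collapses to 0, while
counter = 0 is a division by zero. The Python sketch below mimics that
arithmetic (for non-negative integers, `//` truncates the same way). The
second function is only a guess at what was intended, shown to illustrate
scaling before dividing.

```python
def speed_as_written(counter: int) -> int:
    # Mirrors Speed <= (1 / (counter * 100)) * 100 with integer division.
    return (1 // (counter * 100)) * 100


def speed_scaled_first(counter: int) -> int:
    # Hypothetical rewrite: multiply before dividing so that the
    # quotient does not truncate to zero straight away.
    return 100 // counter


# counter = 0 is left out: both forms divide by zero there.
AS_WRITTEN = [speed_as_written(c) for c in range(1, 126)]
SCALED = [speed_scaled_first(c) for c in range(1, 126)]

print(set(AS_WRITTEN))
print(SCALED[:5])
```

Whatever formula was really meant, the general remedy in integer arithmetic
is the same: do the multiplications first and the division last, and guard
the zero case explicitly.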

Article: 35348
Subject: Re: Virtex II current consumption
From: Phil Hays <spampostmaster@home.com>
Date: Sun, 30 Sep 2001 14:46:53 GMT

Jonas Weiss wrote:
> 
> Good morning everybody,
> I'm developing a PC card with 2x XC2V1000 (optionally XC2V3000) on it.
> I extrapolated the results of an old run of Xilinx's power estimator
> Excel sheet, which results in a rather high current for the 1.5V core
> supply of roughly 16A (8 A each).

I suggest trying the new XPower utility.  It seems to produce much better
estimates, at least from the starting point of a routed NCD and a simulation.

-- 
Phil Hays

Article: 35349
Subject: Re: Xilinx 4.1 software
From: Phil Hays <spampostmaster@home.com>
Date: Sun, 30 Sep 2001 15:50:29 GMT

Tom Brooks wrote:

> So, I installed Xilinx 4.1i software today and
> my results were much worse than with Xilinx
> 3.3i.  I have a 5 ns path that was turned into
> a 7 ns path with the new software.  So, I'm
> going back to 3.3i.

I've had a similar experience over the past week.  I went from 15 ns (66MHz)
with margin to the PAR tool exiting without trying.  After I found and fixed an
old bug (3.* didn't handle timing verification on some open drain outputs
correctly!), the best the new PAR could do was 17+ ns.  So I went back
to 3.*, produced a UCF of the problem section of the design using the old
floorplanner to add to my existing UCF (as the new one couldn't open the old
NCD), and now I'm back to where I started.  The other designers in my building
were all singing its praises: it made their stuff fit better and run faster.
And that didn't help my mood this past week at all! :-)

However, before you abandon 4.1i, find out what the real problem is.  If the new
PAR has a bug, please send it to Xilinx.  If you have a section that you now
need to constrain placement for it to work, do so.  If your design has a
weakness, fix it.

-- 
Phil Hays
