Messages from 49475

Article: 49475
Subject: Re: Feedback from a 200 MHz Virtex2 design
From: "Stan" <vze3qgji@verizon.net>
Date: Wed, 13 Nov 2002 02:46:06 GMT

The best luck I've had at that speed comes from a guideline that each
pipeline stage can only be a flop and a simple equation: most of your logic
should have maybe 3 levels of LUTs between flops.  You can probably go up
to 4 or 5 levels of LUTs in A FEW places, but for those make sure you've
constrained the logic so these long paths remain fully within a small
floorplannable block (small meaning under 50 or 100 LUTs).  Another thing to
be cautious about is Block RAMs: they have an access time of almost 50% of
your cycle.  Try to put RAM outputs directly into flops, or maybe a 2-way mux
then a flop; any more and you'll spend your life floorplanning and
re-running P&R.  Also, some of the RAM setup times are 20% of your cycle, so
be careful there, too.
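
The "RAM output straight into a flop" advice can be sketched roughly as below. This is only an illustration: the entity name ram_pipe and the 512 x 16 size are made up, and it assumes the synthesis tool maps a synchronous-read array onto a block RAM (that varies by tool and version, so check the synthesis report). At 200 MHz the cycle is 5 ns, so the extra register keeps the RAM access time from sharing a cycle with any downstream logic, at the cost of one more cycle of read latency.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: a 512 x 16 RAM whose output is
-- registered again before any logic sees it.
entity ram_pipe is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  unsigned(8 downto 0);
    din  : in  std_logic_vector(15 downto 0);
    dout : out std_logic_vector(15 downto 0)
  );
end entity ram_pipe;

architecture rtl of ram_pipe is
  type ram_t is array (0 to 511) of std_logic_vector(15 downto 0);
  signal ram   : ram_t;
  signal ram_q : std_logic_vector(15 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(addr)) <= din;
      end if;
      ram_q <= ram(to_integer(addr));  -- synchronous read (the RAM's own output stage)
      dout  <= ram_q;                  -- flop directly behind the RAM, no logic in between
    end if;
  end process;
end architecture rtl;
```

Read data appears on dout two clock edges after the address is presented.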

Also be careful, don't expect a given signal to drive a lot of loads.  I had
a wide data path so I sliced it (8 or 16 bits per slice) so I could
floorplan to a pretty low level; also I created a copy of the controls for
each slice.

It helps to exploit the technology, such as minimizing the logic myself and
implementing it by directly instantiating primitives.  Doing functional
decomposition of a large equation so that the final stage is a MUXF5 is very
effective, for example.

Good luck!  -Stan

Article: 49476
Subject: Re: C\C++ to VHDL Converter
From: Phil Hays <SpamPostmaster@attbi.com>
Date: Wed, 13 Nov 2002 03:02:48 GMT

Austin Franklin wrote:
> 
> "Phil Hays" <SpamPostmaster@attbi.com> wrote in message
> news:3DD05DC9.1395672B@attbi.com...
> > Mike Treseler wrote:
> > >
> > > Phil Hays wrote:
> > >
> > > > Austin Franklin wrote:
> > >
> > > > >>That's simply not true.  The Alpha CPUs were designed using
> > > > >>schematic capture
> > >
> > > > ... by a large building full of designers.
> > >
> > > Who no longer work for Digital Equipment Corp.
> >
> > Yea.  But fairness requires me to point out that schematic entry wasn't
> > the reason why DEC failed.
> 
> Thanks for the laugh, Phil.  I never even thought of that reply in that way
> ;-)

I didn't the first time either.  Glad to be of service, Austin.

-- 
Phil Hays

Article: 49477
Subject: Re: Feedback from a 200 MHz Virtex2 design
From: Ray Andraka <ray@andraka.com>
Date: Wed, 13 Nov 2002 05:48:26 GMT

Yep, 2V6000 @ 200 MHz.

1) The carry chains are slow.  In a -4 device you'll barely
make 200 MHz with 20 bit carry chains, and that is if the
router is being nice that day (it probably won't make 200 in a
densely packed arithmetic design, not because of the silicon
but because the router is lazy).

2) The LUTs and routing are quite fast.  As long as the logic
is placed close together you can easily do at least 3 or 4
layers of logic between flip-flops.  You may have to do some
floorplanning though, as the placer doesn't place the second
level of LUTs very intelligently.  Unfortunately, if you are
using synthesis, inferred LUT names change from run to run, so
you'll have to work around that.

3) We do a fair amount of building up hierarchical blocks
starting with primitives.  That lets us put RLOCs in the VHDL
and structurally generate the data paths.  Using hierarchy and
doing the placement hierarchically like that saves a ton of
time in the floorplanner.  Too bad the tools can't do
hierarchical floorplanning (no, Xilinx, they don't.  To be
hierarchical you need to be able to nest placement in multiple
levels).  The RLOCs are hierarchical, so you can do
hierarchical floorplanning from within the source.

4) Occasionally you need to use syn_keeps or syn_preserves to
enforce inferred structures.

5) We don't bother with Amplify.  With our structural
construction technique, it doesn't offer much value added.
For someone working strictly from RTL it may be useful.

6) Multi pass PAR helps a little.  Unfortunately, the biggest
problem with the current tools is that the router gives up too
easily.  It used to be that a good placement got you pretty
consistent routing results regardless of the effort level
because the old algorithm found shortest routes for all
connections, and only compromised when there were conflicts.
The new router (4.x and on) nails down a few critical paths
based on estimated slack, then just routes the remaining runs
willy-nilly without considering the obvious shortest routes.
As long as it makes timing, no problem (other than the fact
that you've just increased your power consumption
dramatically, made nearly every net a critical net, and in
dense designs needlessly congested the routing, making timing
closure a dubious proposition).

7)  The cost tables are more or less non-deterministic as soon
as you make any changes to the design.  It helps to run
multi-pass to run through a number of cost tables.  Sometimes
you get lucky.

Amy Mitby wrote:

> Does anyone have any general suggestions or remarks
> from past work on a 200 MHz large Virtex2 design?
> For instance, did you have to do things like:
> -  add input and output flops for each module and pipeline
>    extensively within modules?

>
> -  other RTL tricks?
> -  use a physical synthesis tool like Amplify?
> -  run multi-pass place and route?
> -  use different cost tables?
> -  hand place some or all of the design?
> -  etc...

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 49478
Subject: Re: HDL vs RTL
From: Ray Andraka <ray@andraka.com>
Date: Wed, 13 Nov 2002 05:51:43 GMT

Have you ever tried to push on a rope?  Not too effective,
is it?   You can often get about the same level of
satisfaction trying to get a synthesis tool to generate the
structure you want (and then the next version of the tools
changes everything).

aaron wrote:

> what does 'pushing the rope' mean, ray?
>
> aaron


Article: 49479
Subject: Re: multi-channel filters - how many channels?
From: Ray Andraka <ray@andraka.com>
Date: Wed, 13 Nov 2002 05:54:22 GMT

I bid one job that was to have 160 channels in one filter last year.

Ken Mac wrote:

> Hello folks,
>
> Xilinx coregens DA filter supports up to 8 channels for some of the FIR
> filter types.
>
> Could you please let me know what is the largest number of channels you have
> used/seen used through a single FIR filter of any type (including
> rate-changing) on an FPGA?
>
> Thanks for your time,
>
> Ken


Article: 49480
Subject: Re: Registering inputs or outputs of modules
From: kayrock66@yahoo.com (Jay)
Date: 12 Nov 2002 22:42:26 -0800

You will get higher performance (as measured by minimum clock period)
by putting those registers on your outputs rather than none at all as
your question seems to indicate.  The higher performance synthesis
tools can move those registers back and forth (balancing) in an effort
to minimize clock period.

Not registering your outputs is something you might do for latency
concerns or area optimization when you instruct the tool to optimize
across hierarchical boundaries.

President, Quadrature Peripherals
Altera, Xilinx and Digital Design Consulting
email: kayrock66@yahoo.com
http://fpga.tripod.com
-----------------------------------------------------------------------------

amyks@sgi.com (Amy Mitby) wrote in message news:<2d2a8f5d.0211121059.5eb0a76d@posting.google.com>...
> Are there any major benefits or disadvantages to optimization
> with a synthesis design flow where module boundaries are
> registered either at inputs or outputs? In other words, do the
> tools' optimizations across module boundaries sometimes work 
> better than self-imposed sequential boundaries for reaching 
> better performance, or is it best to put those register boundaries 
> in yourself and floorplan the location of those registers?

Article: 49481
Subject: Re: How to disable IOB register packing?
From: kayrock66@yahoo.com (Jay)
Date: 12 Nov 2002 22:45:42 -0800

You're going to have to look up the UCF syntax yourself but I wanted
to give you a heads up that if you're using Synplicity, it can put
attributes in the netlist that will also turn on IOB registers even if
the UCF says not to use them.

Regards
President, Quadrature Peripherals
Altera, Xilinx and Digital Design Consulting
email: kayrock66@yahoo.com
http://fpga.tripod.com
-----------------------------------------------------------------------------
Shareef Jalloq <sjalloq@arm_removeMe_.com> wrote in message news:<3DD13234.525C7CA6@arm_removeMe_.com>...
> Hi all,
> 
> I'm trying to disable IOB register packing but am having trouble with
> the UCF syntax.  I know I want to put IOB=FALSE; in there somewhere but
> how do I do it?  I need to add the constraint to a number of top level
> registers that are already grouped by a TIMEGRP constraint.  I tried
> using the following but it didn't like it:
> 
> TIMEGRP "SRAMData" = FFS("*WDATABuf*");
> INST "SRAMData" IOB=FALSE;
> 
> Any ideas guys?  Thanks for your help, Shareef.

Article: 49482
Subject: Simulation Modes
From: "Sanjay Patil" <sanjay@cg-coreel.com>
Date: Wed, 13 Nov 2002 13:36:24 +0530

Hi
What are the simulation modes for HyperTransport (HT tunnel, HT slave,
HT bridge), and for the checkers and monitors?
Can anybody clarify this for me?

Sanjay

Article: 49483
Subject: Re: multi-channel filters - how many channels?
From: "Ken Mac" <aeu96186@yahoo.co.uk>
Date: Wed, 13 Nov 2002 09:40:14 -0000

Ray,

That is a lot of channels!

Are you able to elaborate a little more on the specs of the system?
(full-parallel/N clocks per sample, filter type (singlerate/rate-changing)
input sample widths, coefficient widths, clock rate/sampling rate, device
you put it on etc.) - I am just interested to know what sorts of things
people do DSP-wise in the real world (I am in academia just now).

Thanks for your time,

Ken

> I bid one job that was to have 160 channels in one filter last year.

>
> Ken Mac wrote:
>
> > Hello folks,
> >
> > Xilinx coregens DA filter supports up to 8 channels for some of the FIR
> > filter types.
> >
> > Could you please let me know what is the largest number of channels you
> > have used/seen used through a single FIR filter of any type (including
> > rate-changing) on an FPGA?
> >
> > Thanks for your time,
> >
> > Ken

Article: 49484
Subject: Re: How to disable IOB register packing?
From: Shareef Jalloq <sjalloq@arm_removeMe_.com>
Date: Wed, 13 Nov 2002 09:50:31 +0000

Jay wrote:

> You're going to have to look up the UCF syntax yourself but I wanted
> to give you a heads up that if you're using Synplicity, it can put
> attributes in the netlist that will also turn on IOB registers even if
> the UCF says not to use them.

Thanks guys,

I've had to create an instance that contains the flops and I can then use the INST "<inst.name>" IOB=FALSE;
syntax.

Shareef.

Article: 49485
Subject: Re: buffer ports on lower level VHDL modules
From: "Alan Fitch" <alan.fitch@doulos.com>
Date: Wed, 13 Nov 2002 11:04:11 -0000

<FAQ> wrote in message news:ee7a528.-1@WebX.sUN8CHnE...
> Is there anything 'wrong' with specifying an output of a lower
> level VHDL module as a buffer so that you can read the output
> within the module... versus a dummy signal placed between the
> assignment to just an output port.

Buffers have two restrictions

 - any net they drive must have only one driver
   (no tristate busses)
 - if they are bound to another port at a higher level,
   that must also be a buffer.

As a result, we recommend against them
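
The "dummy signal" alternative from the question can be sketched as follows. This is a minimal illustration, not code from the thread: the entity name counter and the signal name count_i are invented for the example. The port stays a plain "out", and an internal signal holds the value that the architecture needs to read back.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: an 8-bit counter whose output must be
-- read inside the architecture.
entity counter is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;
    count : out unsigned(7 downto 0)   -- plain "out", not "buffer"
  );
end entity counter;

architecture rtl of counter is
  signal count_i : unsigned(7 downto 0);  -- internal copy that may be read
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;           -- reads the internal signal, not the port
      end if;
    end if;
  end process;

  count <= count_i;                       -- drive the port from the internal copy
end architecture rtl;
```

Because the port is an ordinary "out", it can be connected to any "out" port or signal at the next level up without the buffer-to-buffer binding restriction.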

regards

Alan

P.S. Both these restrictions are removed in VHDL 2002! But
I don't know any tools that support VHDL 2002 :-(

--
Alan Fitch
[HDL Consultant]

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * Perl * Tcl/Tk * Verification * Project Services

Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK
Tel: +44 (0)1425 471223                          mail: alan.fitch@doulos.com
Fax: +44 (0)1425 471573                           Web: http://www.doulos.com


Article: 49486
Subject: Re: Feedback from a 200 MHz Virtex2 design
From: Utku Ozcan <utku.ozcan@netas.com.tr>
Date: Wed, 13 Nov 2002 14:17:33 +0200

"Nicholas C. Weaver" wrote:

> In article <2d2a8f5d.0211121414.8a292e8@posting.google.com>,
> Amy Mitby <amyks@sgi.com> wrote:
> >Does anyone have any general suggestions or remarks
> >from past work on a 200 MHz large Virtex2 design?
>
> I haven't done that on Virtex2, but I have done >100 MHz on Virtex I
> (non E) and 175 MHz on VirtexE (AES encryption):
>
> >For instance, did you have to do things like:
> >-  add input and output flops for each module and pipeline
> >   extensively within modules?
>
> Yes, lots.
>
> >-  hand place some or all of the design?
>
> Yes, lots.
>
> Hand mapping and placing isn't that bad, if you use a nice modular
> design.  The biggest annoyance is actually the BlockRAMs, as on Virtex
> 1, they can't be relatively placed, only absolute placement, which is
> a pain when everything else is RLOCed modules.

Yes, the P&R engines were very bad at deciding BRAM placement.
I used LOC= constraints to get good results. That was with M2.1i and M3.1i.
The design was in an XCV2000E-8. Operating frequency was 80 MHz.

Utku

Article: 49487
Subject: Re: Registering inputs or outputs of modules
From: Ray Andraka <ray@andraka.com>
Date: Wed, 13 Nov 2002 12:40:48 GMT

That wasn't the question.  The question was referring to modules within the FPGA design, e.g. modular
design.

Putting registers on the I/O of the design alleviates the need for the synthesis tools to try to
optimize across module boundaries.  As we move into larger devices and get into more of a modular design
flow, you'll generally want to at least register the module outputs, just to maintain consistency in
timing.  When you don't have registers, it can be ugly trying to trace a delay path, and the synthesis
tools will not be able to optimize through a module boundary unless both the module and the next level
up are visible at the time of the compilation.
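
As a small sketch of what registering the module outputs looks like in practice (the entity name add_stage and the 16-bit adder are made up for illustration), every path into the module ends at a flop inside it, so the path out of the module starts at a clock-to-Q delay rather than partway through a chain of logic:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example module with a registered output.
entity add_stage is
  port (
    clk  : in  std_logic;
    a, b : in  unsigned(15 downto 0);
    sum  : out unsigned(15 downto 0)
  );
end entity add_stage;

architecture rtl of add_stage is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      sum <= a + b;   -- all combinational logic sits in front of the output flop
    end if;
  end process;
end architecture rtl;
```

With this style the timing of each module can be examined on its own, since no combinational path crosses the module boundary on the output side.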

Jay wrote:

> You will get higher performance (as measured by minimum clock period)
> by putting those registers on your outputs rather than none at all as
> your question seems to indicate.  The higher performance synthesis
> tools can move those registers back and forth (balancing) in an effort
> to minimize clock period.
>
> Not registering your outputs is something you might do for latency
> concerns or area optimization when you instruct the tool to optimize
> across hierarchical boundaries.
>
> President, Quadrature Peripherals
> Altera, Xilinx and Digital Design Consulting
> email: kayrock66@yahoo.com
> http://fpga.tripod.com
> -----------------------------------------------------------------------------
>
> amyks@sgi.com (Amy Mitby) wrote in message news:<2d2a8f5d.0211121059.5eb0a76d@posting.google.com>...
> > Are there any major benefits or disadvantages to optimization
> > with a synthesis design flow where module boundaries are
> > registered either at inputs or outputs? In other words, do the
> > tools' optimizations across module boundaries sometimes work
> > better than self-imposed sequential boundaries for reaching
> > better performance, or is it best to put those register boundaries
> > in yourself and floorplan the location of those registers?


Article: 49488
Subject: Re: new to fpga, what language is better to start with
From: me@127.0.0.1 (Terry Newton)
Date: Wed, 13 Nov 2002 12:43:21 GMT

Phil Hays <SpamPostmaster@attbi.com> wrote:
...
>Webpack XST (5.1) will not target the 4004, so I pointed it to a
>Spartan2.
>
>------------------------------------------
>Number of 4 input LUTs:                26
>------------------------------------------

Ok this is more consistent. Thanks!
I wouldn't have guessed that there'd be that much difference
between 2 synthesizers on the same code.

>> This seems to
>> imply that schematic entry is almost twice as efficient as
>> structured VHDL!
>
>Weak examples imply nothing.

They do show that how you use a language, and what you use
it with, makes a big difference. Given more datapoints anyway...

Terry Newton

Article: 49489
Subject: Re: Tristate buffers + leonardo Spectrum
From: "Martin Schoeberl" <martin.schoeberl@chello.at>
Date: Wed, 13 Nov 2002 12:49:23 GMT

> sig <= A when sel = "11110" else 'Z';

if sel="11110" then
    sig <= A;
else
    sig <= 'Z';
end if;

As I remember, the first statement (when) is called a dataflow statement,
while the if .. then .. else is a sequential statement and can be used in
processes.
Martin
--
JOP - a Java Optimized Processor for FPGAs.
http://www.jopdesign.com


"Anup Raghavan" <anup@itee.uq.edu.au> schrieb im Newsbeitrag
news:9d80c593.0211121820.e951e8d@posting.google.com...
> Hello, when i try to synthesize the following code using Leonardo
> Spectrum for Xilinx FPGAs, I get errors " Syntax Error near 'when' "
> If I dont use a process and then synthesize this code, it works fine.
> But I do need to have a process in my design. Can someone provide me a
> solution for this.
>
> Thanks
> Anup Raghavan
>
> entity mux_tbuf is
>
> port (SEL: in STD_LOGIC_VECTOR (4 downto 0);
> A,B,C,D,E: in STD_LOGIC;
> clk : in std_logic;
> SIG: out STD_LOGIC);
> end mux_tbuf;
>
> architecture RTL of mux_tbuf is
> begin
>
> sync: process (clk) is
>
> begin
>   if clk'event and clk = '1' then
> sig <= B when sel = "11101" else 'Z';
> sig <= C when sel(2)= '1' else 'Z';
> sig <= D when sel(3)= '1' else 'Z';
> sig <= E when sel(4)= '1' else 'Z';
>   end if;
>
> end process sync;
>
> end RTL;

Article: 49490
Subject: Re: How to disable IOB register packing?
From: hamish@cloud.net.au
Date: 13 Nov 2002 12:51:18 GMT

Jay <kayrock66@yahoo.com> wrote:
> You're going to have to look up the UCF syntax yourself but I wanted
> to give you a heads up that if you're using Synplicity, it can put
> attributes in the netlist that will also turn on IOB registers even if
> the UCF says not to use them.

UCF appears to override EDF, in my experience.

INST "*" IOB = FALSE;
is useful for test routes of sub-modules.

Hamish
-- 
Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>

Article: 49491
Subject: Re: EPP slave interface
From: "Martin Schoeberl" <martin.schoeberl@chello.at>
Date: Wed, 13 Nov 2002 12:53:46 GMT

> Does anybody knows about a free EPP (parallel port) slave interface
> module (preferably in VHDL) ? I have checked on opencores, but it seems

If you can go with ECP you can use a version I did some time ago. You can
find it in a larger zip file at the download section on the link below. The
file is ecp.vhd.

Martin
--
JOP - a Java Optimized Processor for FPGAs.
http://www.jopdesign.com

Article: 49492
Subject: Re: Tristate buffers + leonardo Spectrum
From: hamish@cloud.net.au
Date: 13 Nov 2002 12:56:02 GMT

Anup Raghavan <anup@itee.uq.edu.au> wrote:
> begin
>  if clk'event and clk = '1' then
>        sig <= A when sel = "11110" else 'Z';
>        sig <= B when sel = "11101" else 'Z';   
>        sig <= C when sel(2)= '1' else 'Z';     
>        sig <= D when sel(3)= '1' else 'Z';     
>        sig <= E when sel(4)= '1' else 'Z';     
>  end if;

Unfortunately you can't use '... when ... else ...' inside a process in
VHDL, only outside. Annoying, isn't it? Use if/else instead.
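
A minimal sketch of the if/else form, using the entity from the original question. Two things here are assumptions rather than anything stated in the thread: the priority order of the branches (the original conditions overlap, for example "11110" also has sel(2) set, so some order has to be chosen), and the single shared 'Z' default. Note also that the VHDL-2008 revision later allowed "when ... else" inside processes, but tool support varies.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux_tbuf is
  port (
    sel           : in  std_logic_vector(4 downto 0);
    a, b, c, d, e : in  std_logic;
    clk           : in  std_logic;
    sig           : out std_logic
  );
end entity mux_tbuf;

architecture rtl of mux_tbuf is
begin
  sync : process (clk)
  begin
    if rising_edge(clk) then
      -- Priority order is an assumption made for this sketch.
      if sel = "11110" then
        sig <= a;
      elsif sel = "11101" then
        sig <= b;
      elsif sel(2) = '1' then
        sig <= c;
      elsif sel(3) = '1' then
        sig <= d;
      elsif sel(4) = '1' then
        sig <= e;
      else
        sig <= 'Z';   -- nothing selected: release the output
      end if;
    end if;
  end process sync;
end architecture rtl;
```

Because the assignments sit in a clocked process, a synthesis tool will typically register both the data and the tristate enable; if an unregistered tristate is wanted, the same if/elsif chain can go in a combinational process instead.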

Hamish
-- 
Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>

Article: 49493
Subject: Re: EPP slave interface
From: "Falk Brunner" <Falk.Brunner@gmx.de>
Date: Wed, 13 Nov 2002 14:48:51 +0100

"Steven Derrien" <sderrien@irisa.fr> schrieb im Newsbeitrag
news:3DD17371.5C0234D9@irisa.fr...
> Hi folks,
>
> Does anybody knows about a free EPP (parallel port) slave interface
> module (preferably in VHDL) ? I have checked on opencores, but it seems
> that their EPP controler project has no file on the CVS and has not been
> updated for a while.

Have a look at www.beyondlogic.org
They have tons of technical papers, including many about the parallel port / EPP.
Doing a state machine to interface to an EPP is easy. Just sample Data_strobe
and Address_strobe, then make your decision based on them.
See the code snippet below.

--
MfG
Falk

-------------------------------------------------------------------------------
--
-- A basic EPP state machine
--
-------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity epp_fsm is
    Port ( clk              : in std_logic;                 -- clock input >1 MHz
           reset            : in std_logic;

           epp_data         : inout unsigned(7 downto 0);   -- data
           epp_write        : in std_logic;
           epp_wait         : out std_logic;
           epp_data_strobe  : in std_logic;
           epp_adr_strobe   : in std_logic);
end epp_fsm;

architecture Behavioral of epp_fsm is

type state_type is (idle, wait_end, wait_end_read);

signal state            : state_type;
signal data_strobe      : std_logic;              -- synchronized data strobe from EPP
signal adr_strobe       : std_logic;              -- synchronized address strobe from EPP
signal data_register    : unsigned (7 downto 0);  -- data register, just an example
signal address_register : unsigned (7 downto 0);  -- address register, just an example

begin
-- some signal assignments

-- IO MUX for EPP data

  process(state, data_strobe, data_register, address_register)
  begin
    if state=wait_end_read then
      if data_strobe='1' then
        epp_data <= address_register;   -- address read access
      else
        epp_data <= data_register;      -- data read access
      end if;
    else
      epp_data <= (others => 'Z');      -- no read access, release the bus
    end if;
  end process;

-- sample control lines from EPP

  process(clk, reset)
  begin
    if reset='1' then
      data_strobe <= '1';
      adr_strobe  <= '1';
    elsif clk='0' and clk'event then
      data_strobe <= epp_data_strobe;
      adr_strobe  <= epp_adr_strobe;
    end if;
  end process;

-- the state machine

  process(clk, reset)
  begin
    if reset='1' then
      state    <= idle;
      epp_wait <= '0';
    elsif clk='0' and clk'event then
      case state is
        when idle =>
          if data_strobe='0' then           -- beginning of a data access
            epp_wait <= '1';
            if epp_write='0' then           -- it is a write access
              -- place instructions HERE for data write access
              data_register <= epp_data;    -- example
              state <= wait_end;
            else                            -- it is a read access
              -- place instructions HERE for data read access
              state <= wait_end_read;
            end if;
          elsif adr_strobe='0' then         -- address access
            epp_wait <= '1';
            if epp_write='0' then           -- it is a write access
              -- place instructions HERE for address write access
              address_register <= epp_data; -- example
              state <= wait_end;
            else                            -- it is a read access
              -- place instructions HERE for address read access
              state <= wait_end_read;
            end if;
          end if;

        when wait_end =>                    -- wait for the end of a write access
          if data_strobe='1' and adr_strobe='1' then
            epp_wait <= '0';
            state    <= idle;
          end if;

        when wait_end_read =>               -- wait for the end of a read access
          if data_strobe='1' and adr_strobe='1' then
            epp_wait <= '0';
            state    <= idle;
          end if;

        when others => null;
      end case;
    end if;
  end process;

end Behavioral;

Article: 49494
Subject: Costing FPGA design projects
From: edaudio2000@yahoo.co.uk (ted)
Date: 13 Nov 2002 05:51:34 -0800

I know this is like asking how long is a piece of string. But being
able to make a rough guess at how long a project is going to take is
a useful thing to know.

Does anybody have any pointers, rules of thumb or golden rules they
can pass on from their experience?

For example time taken (say as a proportion) for paper design, coding
and simulation.

How much time should be allocated to simulation? 

Any comments will be useful!

Thanks

theo

Article: 49495
(removed)

Article: 49496
Subject: question about booth multipliers
From: mehmetozcelebi@turk.net (mehmeto)
Date: 13 Nov 2002 07:55:33 -0800

Dear Computer Arithmetic Gurus,
 I am currently working on the implementation of an unsigned parallel
multiplier. After reading some articles I found the modified Booth-2
algorithm suitable.
 It was described in Al_Twaijry's thesis "Area and Performance
optimized CMOS multipliers", page 11, 1997.
 I wonder if the figure shown on page 11 of the thesis is still the
state-of-the-art way to produce partial products?  Have more advanced
techniques been discovered since 1997?
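
For readers unfamiliar with the recoding being asked about, here is a behavioral sketch of partial-product selection for an unsigned multiplier. It assumes that "Booth-2" means radix-4 recoding (two multiplier bits retired per partial product); it is not the circuit from the thesis figure, and the entity name booth2_mult and the generic N are invented for the example. The multiplier is padded with two zero bits on top so that the last recoded digit is never negative, which is what makes the scheme work for unsigned operands.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: behavioral radix-4 Booth multiplier.
entity booth2_mult is
  generic (N : positive := 8);             -- operand width, assumed even
  port (
    a : in  unsigned(N - 1 downto 0);      -- multiplicand
    b : in  unsigned(N - 1 downto 0);      -- multiplier (the recoded operand)
    p : out unsigned(2 * N - 1 downto 0)   -- product
  );
end entity booth2_mult;

architecture behav of booth2_mult is
begin
  process (a, b)
    variable y       : std_logic_vector(N + 2 downto 0);
    variable triplet : std_logic_vector(2 downto 0);
    variable x, pp   : signed(2 * N + 1 downto 0);
    variable acc     : signed(2 * N + 1 downto 0);
  begin
    y   := "00" & std_logic_vector(b) & '0';   -- y(k) = b(k-1), zero padded at both ends
    x   := signed(resize(a, 2 * N + 2));       -- zero-extended, so always non-negative
    acc := (others => '0');
    for i in 0 to N / 2 loop
      triplet := y(2 * i + 2 downto 2 * i);    -- overlapping 3-bit window
      case triplet is
        when "001" | "010" => pp := x;                  -- +1 * a
        when "011"         => pp := shift_left(x, 1);   -- +2 * a
        when "100"         => pp := -shift_left(x, 1);  -- -2 * a
        when "101" | "110" => pp := -x;                 -- -1 * a
        when others        => pp := (others => '0');    -- "000", "111": 0
      end case;
      acc := acc + shift_left(pp, 2 * i);      -- weight the digit by 4**i
    end loop;
    p <= unsigned(acc(2 * N - 1 downto 0));
  end process;
end architecture behav;
```

A real implementation would sum the N/2 + 1 partial products in a compressor tree or carry-save array rather than a sequential accumulation loop; the loop here only shows which multiple of the multiplicand each window selects.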

Thanks

Article: 49497
Subject: Re: How to disable IOB register packing?
From: Shareef Jalloq <sjalloq@arm_removeMe_.com>
Date: Wed, 13 Nov 2002 16:08:20 +0000

hamish@cloud.net.au wrote:

> INST "*" IOB = FALSE;
> is useful for test routes of sub-modules.

Hi again guys,

Although I used the above syntax in the UCF file, the Xilinx tools still
packed the flops into the IOB!  What can I do aside from adding a dummy
output so that the fanout of the Q pin is higher than 1?  This would at
least guarantee that the flops could not be packed into the IOB.  I'm using
version 4.2.03i of the Xilinx tools on Solaris.

Shareef.

Article: 49498
Subject: Problem with Xilinx Application 134 "Synthesizable High-Performance SDRAM Controllers"
From: jpnicholls@pwav.com (JP Nicholls)
Date: 13 Nov 2002 08:16:34 -0800

I'm trying to run the Xilinx App 134, "Synthesizable High-Performance
SDRAM Controllers".  I'm using the VHDL version.

When I run "do run_sim.do" in Modelsim, the macro first compiles and
then runs the testbench t_sdrm. This runs with the following warnings
and errors (partial file only):

  ** Warning: NUMERIC_STD."=": metavalue detected, returning FALSE
  #    Time: 88200 ps  Iteration: 1  Instance: /t_sdrm/sdrmc/sdrm_t_int/brst_cntr_inst
  # ** Warning: NUMERIC_STD."=": metavalue detected, returning FALSE
  #    Time: 88200 ps  Iteration: 1  Instance: /t_sdrm/sdrmc/sdrm_t_int/rcd_cntr_inst
  # ** Warning: NUMERIC_STD."=": metavalue detected, returning FALSE
  #    Time: 88200 ps  Iteration: 1  Instance: /t_sdrm/sdrmc/sdrm_t_int/ki_cntr_inst
  # ** Warning: NUMERIC_STD."=": metavalue detected, returning FALSE
  #    Time: 88200 ps  Iteration: 4  Instance: /t_sdrm/sdrmc/sdrm_t_int/ref_cntr_inst
  # ** Error: mt48lc1m16a1.v(781): $hold( posedge Clk:88200 ps, Addr:88300 ps, 1 ns );
  #    Time: 88300 ps  Iteration: 3  Instance: /t_sdrm/sdram0
  # ** Error: mt48lc1m16a1.v(781): $hold( posedge Clk:88200 ps, Addr:88300 ps, 1 ns );
  #    Time: 88300 ps  Iteration: 3  Instance: /t_sdrm/sdram1


I suspect the metavalues are initialisation problems because they don't
recur in the file.  The Errors occur until the end of the file.

What is going wrong?

Also, there are two versions of the source files - one in the directory
vhdl\func_sim\ and another in vhdl\src.  They are different. The
RUN_SIM.DO macro uses the files in vhdl\func_sim.

Which is correct? What is the difference?

Many thanks.

-- 
JP Nicholls  /  jpnicholls@pwav.com

Article: 49499
Subject: Re: LU-decomposition
From: Goran Bilski <Goran.Bilski@Xilinx.com>
Date: Wed, 13 Nov 2002 08:37:17 -0800

Hi Jan,

The area was actually with full pipelining.
The problem with using it is in the processor part, where instructions are
sequential unless you add vector instructions.
You probably can't pipeline everything, since most algorithms have some kind of
intermediate calculations and will therefore stall the pipeline.
A much better approach for the barrel shifter is to use a smaller shift in each
clock cycle; going to SRL16 is extreme and I'm not sure it is actually going to
save any area.

I remember a good report that an FPU core design house did in the 80s on what
the average shift amount is for floating point.
The report said that a shift of up to 8 bits will cover more than 90% of all the
cases, and they implemented an FPU with just that.
That core was quite small and efficient. The smaller shift only lowered the
average performance by a few percent.
BUT at that time I worked for a company that built computers for space
applications and thus needed to calculate everything on maximum latency (not
average). This made the FPU core look bad, since its maximum latency was quite
bad.
The FPU core made its way into the Sun uSPARC processor, which was designed for
desktop applications where average performance is more important.

So with this technique you might get down from 800 to at most 400-500 LUTs,
and that is if we also consider full pipelining all the time.
The new value for floating point would be ((100_000_000/6)*6)/400 = 250000, which
is still 30 times worse than integer operations.
So if the FPGA was full of these operations, you would need an FPGA which is 30
times bigger.

Göran

Jan Gray wrote:

> "Goran Bilski" <Goran.Bilski@Xilinx.com> wrote
> > Quantitative (Number of operations per seconds/ needed area)
> >
> > Floating point : (100_000_000/6)/800 =    20833
> > Integer :             (250_000_000/1)/32 = 7812500
> >
> > Integer operations are roughly 400 times more efficient than floating
> > point.
>
> Thanks for the interesting data, Goran.
>
> Can you pipeline the above FP adder to get a factor of ~6 improvement in
> ops/area efficiency?
>
> Also, if you only care about ops/area cost efficiency, and not pure speed,
> you might be able to use bit or nybble serial approaches, use lots of SRL16s
> for delays, and thereby avoid the big expensive barrel shifters in the
> denormalize and renormalize paths.
>
> Jan Gray, Gray Research LLC
