Messages from 20625

Article: 20625
Subject: Choosing the correct size FPGA
From: Richard Dempster <red@usgmrl.ksu.edu>
Date: Wed, 16 Feb 2000 10:39:44 -0600

I am very new to programming FPGAs but old (by today's standards) in C
programming.  I am trying to find a generic method of determining what size
FPGA I need for a given algorithm, most likely written in C.  I
have some idea of the compiled size of the C program, but can I relate
the compiled size to how many gates will be necessary in an FPGA?

I have four algorithms to code into an FPGA.  Two are data manipulation,
one is pixel compression using a new algorithm, and the last is a
standard Principal Component Analysis algorithm.

I know the software used to program the FPGAs will give an approximate
gate count, but only after development of the VHDL code.

As this will be a proposal, I was wondering if there was a method to
estimate the FPGA size prior to all the encoding.  I know I don't want
to oversize the FPGA area due to unnecessary time delays and power
requirements.

ANY information will be helpful since I haven't found any information
yet and I seem to have run out of places to search on the Web.  None
of the tutorials seem to deal with the algorithm-to-gate mapping.  The
two books I have don't mention this either.

Thanks
Rick Dempster
red@usgmrl.ksu.edu

Article: 20626
Subject: Re: 100% slice utilization in Virtex FPGA
From: Rick Filipkiewicz <rick@algor.co.uk>
Date: Wed, 16 Feb 2000 16:51:42 +0000

gazit@my-deja.com wrote:

> Matt,
> try to use "map -c 1 ..........."  it will improve your device
> utilization.
> My experience shows that the ratio between the real gate count and
> Xilinx numbers is ~1/5 ( 60K ASIC system gates into a "xcv300" device
> sounds reasonable  ratio).
> If your design is going to be changed you should consider immigrating
> to a larger device. I believe you can use the same foot print.
> Good luck.

I think the 1/5 ratio is a little pessimistic, esp. if there's a fair
amount of memory involved. We recently did an ASIC prototype in a Xilinx
XCV300. The FPGA usage was 89% of LUTs, 55% of CLB FFs, 14 out of 16
Block RAMs [each of these only 30-50% used]. The final ASIC gate count was
about 210K pre scan insertion. With scan & boundary scan it went up to
about 240K. The ASIC partition between logic & memory was about 55/45.

However I would agree with using a bigger FPGA. The XCV300-4 was all we
could get back last April and it was really not big enough to allow fast
P&R, and getting decent timing took a lot of work. The best we could get
with a (nearly) pure HDL design & no floorplanning [the F1.5i
floorplanner didn't support Virtex] was 66MHz. The ASIC signed off @
91MHz.
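
As a rough cross-check of the ratios mentioned in this thread, the sketch
below divides each gate count by the 322K "system gates" that Xilinx
markets for the XCV300 (the figure Matt cites in his original post). The
names are illustrative only, and using the marketed number as the
denominator is an assumption about how the ~1/5 figure was arrived at.

```python
# Marketed "system gate" capacity of the XCV300, as quoted in the thread.
XCV300_SYSTEM_GATES = 322_000


def gate_ratio(design_gates, system_gates=XCV300_SYSTEM_GATES):
    """Fraction of the marketed system-gate count that a design represents."""
    return design_gates / system_gates


# Gate counts taken from the posts in this thread.
examples = [
    ("Matt's design (~60K gates)", 60_000),
    ("ASIC prototype, pre scan (210K)", 210_000),
    ("ASIC prototype, with scan (240K)", 240_000),
]

for label, gates in examples:
    print(f"{label}: {gate_ratio(gates):.2f}")
```

The first case comes out near 0.19 (about 1/5), while the memory-heavy
prototype lands at roughly 0.65 to 0.75, which is why a single ratio is
hard to apply across designs.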

Article: 20629
Subject: Re: 100% slice utilization in Virtex FPGA
From: jeffreyzuelch@my-deja.com
Date: Wed, 16 Feb 2000 17:10:24 GMT

Matt,

I would not worry (at least not now...)

My experiences with Virtex and Virtex-E PAR produce similar SLICE
utilization results. Consider the following when interpreting the SLICE
utilization number:

1) If the Xilinx mapper determines that the design requires only a portion
of the device, it will "spread" out the FFs / LUTs into a greater number
of SLICEs than required (the -c 100 MAP option). This "SLICE-spreading"
is performed to simplify routing.

2) When MAP reports X SLICES utilized, we have no way of determining if
the SLICE used 1 of the 2 FFs, 1 of the 2 LUTs, 1 FF and 0 LUTs, 0 FFs
and 1 LUT etc. To see the "real" implementation, you would have to look
at it in FPGA Editor.

3) If you want to see the MINIMAL required SLICES, run MAP with the
-c 1 option.

On the "gate" count issue -

Xilinx "gate" counts are constructed to fit the device's part number.
For example, we are using the 2000E devices which are billed as 2
Million gate devices. Xilinx arrived at the 2 Million gate number by
assuming the average design would use X% of the CLBs as logic, Y% of
the CLBs as distributed memory, and Z% of the Block RAM (where X,Y,Z <<
100%). Consider that the 2000E has 160 BRAMs which are each 4096 bits.
At 4 "gates" per bit, the 2000E has 2.6 Million gates in Block RAM
alone - a number which greatly exceeds the marketed 2 Million gate
number for the entire device.
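
The Block RAM arithmetic above can be reproduced in a few lines. This is
only a sketch of the calculation as stated in this post; the 4 "gates" per
bit weighting is the figure used here, not an official conversion.

```python
# Figures as stated in the post for the 2000E device.
BRAM_COUNT = 160        # number of Block RAMs
BITS_PER_BRAM = 4096    # bits in each Block RAM
GATES_PER_BIT = 4       # "gates" credited per memory bit (as assumed above)


def bram_gate_equivalent(count, bits, gates_per_bit):
    """Marketing-style gate count attributed to Block RAM alone."""
    return count * bits * gates_per_bit


total = bram_gate_equivalent(BRAM_COUNT, BITS_PER_BRAM, GATES_PER_BIT)
print(f"{total:,} gates")  # 2,621,440 gates, i.e. about 2.6 million
```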

Jeff

In article <38A978D6.F2364810@collins.rockwell.com>,
  Matt Gavin <mtgavin@collins.rockwell.com> wrote:
> FPGA gurus,
>
> I am trying to fit a design in a Virtex XCV300 (2.5V part).
> The Xilinx mapper reports 100% slice utilization, which shocked me.
> However, if you do the math on their flop and LUT counts,
> (assuming 2 flops and LUTs per slice), the flop and LUT utilizations
> are 55% and 76%, respectively, which is what I expected.
> The equivalent gate count is ~60K, which isn't that high
> (especially since Xilinx claims that 322K system gates can fit in
> a XCV300.)
>
> Should I be worried that my slice utilization is 100%?
> Why would the mapper choose to use every single slice
> if the flop and LUT counts are so low?
>
> The report is given below for reference.
>
> Thanks for any help,
>
>    Matt Gavin
>    mtgavin@collins.rockwell.com
>
> Design Information
> ------------------
> Command Line   : map -p xcv300-5-pq240 -o map.ncd mimas_fpga.ngd
> mimas_fpga.pcf
> Target Device  : xv300
> Target Package : pq240
> Target Speed   : -5
> Mapper Version : virtex -- C.19
> Mapped Date    : Mon Feb 14 17:08:18 2000
>
> Design Summary
> --------------
>    Number of errors:      0
>    Number of warnings:    4
>    Number of Slices:            3,072 out of  3,072  100%
>       Slice Flip Flops:   3,408
>       4 input LUTs:       4,682 (4 used as a route-thru)
>    Number of Slices containing
>       unrelated logic:            948 out of  3,072   30%
>    Number of bonded IOBs:         120 out of    166   72%
>    Number of GCLKs:                 4 out of      4  100%
>    Number of GCLKIOBs:              3 out of      4   75%
> Total equivalent gate count for design:  60,367
> Additional JTAG gate count for IOBs:  5,904
>
>
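
The flop and LUT percentages Matt derives from the quoted report can be
checked as follows. This is a minimal sketch that assumes, as he does,
2 flip-flops and 2 LUTs per slice; the variable names are illustrative.

```python
# Numbers copied from the quoted MAP design summary.
TOTAL_SLICES = 3072
SLICE_FLIP_FLOPS = 3408
FOUR_INPUT_LUTS = 4682
PER_SLICE = 2  # assumed: 2 flip-flops and 2 LUTs in each slice


def utilization(used, slices=TOTAL_SLICES, per_slice=PER_SLICE):
    """Fraction of the available flip-flops (or LUTs) actually used."""
    return used / (slices * per_slice)


print(f"Flip-flops: {utilization(SLICE_FLIP_FLOPS):.1%}")  # 55.5%
print(f"LUTs:       {utilization(FOUR_INPUT_LUTS):.1%}")   # 76.2%
```

Both figures sit well below the 100% slice number, which is consistent
with the mapper spreading logic across partially filled slices.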

Sent via Deja.com http://www.deja.com/
Before you buy.

Article: 20630
Subject: Re: Xilinx Virtex Reset
From: Peter Alfke <peter@xilinx.com>
Date: Wed, 16 Feb 2000 09:19:31 -0800

Rick Filipkiewicz wrote:

> Looking at the Virtex data sheet there's a timing parameter for the GSR->IOB/CLB FF
> outputs given. For a -4 part it's 12.5nsec. The question is whether this includes GSR
> routing. If it doesn't, it's got to be the slowest async reset since LS TTL.

Of course it includes the max routing delay.
But it's a max delay, and some flip-flops are closer to the source and have a much
shorter delay. So this delay (unlike all other delays in the data sheet) has
an enormous spread; you really should assume anywhere between almost zero and the max
value. That's what causes the problems that Ray and I discussed before.
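
To put the spread in perspective, here is a small sketch comparing the
12.5 ns maximum from the data sheet with the period of the 74 MHz clock
that Mark Luscombe asks about elsewhere in this thread. Treating the
release delay as anywhere from 0 to the maximum follows the advice above;
the function name is made up for illustration.

```python
GSR_MAX_DELAY_NS = 12.5   # data sheet figure for a -4 part, as quoted above
CLOCK_MHZ = 74.0          # clock frequency discussed elsewhere in the thread


def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


period = clock_period_ns(CLOCK_MHZ)
margin = period - GSR_MAX_DELAY_NS

print(f"Clock period:          {period:.2f} ns")
print(f"GSR release window:    0 to {GSR_MAX_DELAY_NS} ns")
print(f"Margin at max delay:   {margin:.2f} ns")
```

With the release window covering nearly the whole clock period, different
flip-flops can plausibly see reset released on either side of a clock edge.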

Peter Alfke

Article: 20631
Subject: How to manage projects with Xilinx?
From: jlamorie@engsoc.carleton.ca
Date: Wed, 16 Feb 2000 17:58:52 GMT

Gidday there,

I'm looking for a better way to manage projects than is available from
Xilinx Foundation.

I want to be able to store the appropriate source files for synthesis
and implementation in a CVS system, and use something like 'make' to
handle compilation of subsystems.

I want to be able to create libraries of VHDL, and correctly make use of
all the hierarchical features of this language.  I am very new to this
(just over a year) and come from an ANSI-C+GNU development background.

All this GUI management stuff is horrible, and the management of source
versions is absolutely non-existent.  Or am I missing something?

Any hints to projects, or products that would be able to help me would
be appreciated.

Thanks

Joshua Lamorie
Systems Designer
Xiphos Technologies Inc.



Article: 20632
Subject: Re: CIC Question
From: flavioas@my-deja.com
Date: Wed, 16 Feb 2000 18:09:07 GMT

In article <38A80F1C.499D85@ids.net>,
  Ray Andraka <randraka@ids.net> wrote:
> You need to use signed arithmetic.  The input and all subsequent stages
> should be sign extended to the width of the adders.

    I made a mistake; I was using STD_LOGIC instead of SIGN, and
my input was in 2's complement (direct from the A/D converter). Now
the sign bit generation is correct!
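
For readers following along, the sign extension Ray describes can be
illustrated with a small bit-level model. This is a sketch only: the
12-bit input and 18-bit register widths come from the parameters in this
thread, and the helper name is hypothetical.

```python
def sign_extend(value, from_bits, to_bits):
    """Return the to_bits-wide two's complement pattern of a from_bits value."""
    value &= (1 << from_bits) - 1            # keep only the original bits
    if value & (1 << (from_bits - 1)):       # sign bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)      # re-encode at the wider width


# -1 from a 12-bit A/D converter, widened to an 18-bit adder input.
print(hex(sign_extend(0xFFF, 12, 18)))  # 0x3ffff (still -1)
# Most negative 12-bit sample.
print(hex(sign_extend(0x800, 12, 18)))  # 0x3f800 (still -2048)
# Positive samples are unchanged.
print(hex(sign_extend(0x7FF, 12, 18)))  # 0x7ff
```

Padding with zeros instead would turn 0xFFF into +4095 at the wider
width, which is the kind of error that unsigned types can hide.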

> Also, keep in mind
> there is a gain through the filter, so you need to either limit the input
> bits or extend the output width to accommodate the gain.  You might try not
> truncating first to get it working then prune the adders.  At least then
> you'll be able to determine if the pruning is causing the problem.

      Now I'm a little bit confused. I didn't truncate any
register except the last one. Should I use more bits for each
intermediate stage than defined by Hogenauer's paper?


     TIA,

      Flávio



>
> flavioas@my-deja.com wrote:
>
> >     Hi,
> >
> >     We are attempting to implement a CIC Interpolation Filter,
> > following Hogenauer's recipe. In a flex10k device.
> >     The parameters are : Bin = 12; Bout = 12; R = 4; M = 1 and N = 4.
> >     So, the minimum register width for each stage is : 13, 14, 15, 15,
> > 15, 16, 16, 18.
> >     We used 2's complement addition rules, i.e., all numbers unsigned,
> > simple binary addition, and carries past the sign bit are ignored.
> >     We did the proper sign extension from one stage to another.
> >     But the overall frequency response is not what we expected; it didn't
> > work.
> >     Looking at freq. resp. at each stage, we found the desired shape
> > till the first ( N+1 ) Interpolator stage. From this point, as we add
> > more stages ( N+2,..., 2N) things get worse. The outputs saturate,
> > we think, and we see nothing useful.
> >     Is this problem familiar to someone? Any hint?
> >
> >     Thanks in Advance,
> >
> >     Flávio
> >
> > Sent via Deja.com http://www.deja.com/
> > Before you buy.
>
> --
> -Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email randraka@ids.net
> http://users.ids.net/~randraka
>
>



Article: 20633
Subject: Spartan-II Pricing - What gives?
From: "Andy Krumel" <andy@krumel.com>
Date: Wed, 16 Feb 2000 10:27:46 -0800

Hi all,

My company is working on a networking product that uses an FPGA for
performing some analysis of Ethernet packets. The algorithms require quick
access to some RAM based tables and dual port on-chip block RAM structures
fit the bill perfectly. The final product is for a price sensitive market so
Xilinx's Spartan-II line looks perfect, but...

I called a distributor to get pricing for 50,000 XC2S100 Spartan-II chips
and received a quote of $58.65 (down from a single chip at $77). Yet
Xilinx's literature claims this chip to cost under $10 in volume.

What constitutes "volume" to get this kind of price?
Is there an FPGA with 30-40K dual port RAM blocks that costs <= $10 in
volumes of 50,000?

Quote from http://www.xilinx.com/products/spartan2/index.htm:

"Say hello to a new level of performance. The Spartan-II family delivers
100,000 system gates for under $10, at speeds of 200 MHz and beyond,
giving you design flexibility that's hard to beat."

Also, I looked and looked and could not find any disclaimers or volume
quotes for these prices. There are plenty of flashing GIFs proclaiming this
price though.

Thanks,
Andy

Article: 20634
Subject: Re: How to manage projects with Xilinx?
From: Dave Vanden Bout <devb@xess.com>
Date: Wed, 16 Feb 2000 13:37:44 -0500

You can take a look at http://www.xess.com/fndmake.pdf.  This document shows you how to implement Xilinx projects using a makefile and batch mode processing.  You can store the makefiles and VHDL files in a CVS tree and recall them to regenerate your project bit files.

The makefile described in the document is a bit simple, but you can probably modify it to make it smarter.

> Gidday there,
>
> I'm looking for a better way to manage projects than is available from
> Xilinx Foundation.
>
> I want to be able to store the appropriate source files for synthesis
> and implementation in a CVS system, and use something like 'make' to
> handle compilation of subsystems.
>
> I want to be able to create libraries of VHDL, and correctly make use of
> all the hierarchical features of this language.  I am very new to this
> (just over a year) and come from an ANSI-C+GNU development background.
>
> All this GUI management stuff is horrible, and the management of source
> versions is absolutely non-existent.  Or am I missing something?
>
> Any hints to projects, or products that would be able to help me would
> be appreciated.
>
> Thanks
>
> Joshua Lamorie
> Systems Designer
> Xiphos Technologies Inc.
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.




Article: 20635
Subject: Re: coregen-bug produces bad blockram > 16 bit
From: fjz001@email.mot.com
Date: Wed, 16 Feb 2000 19:30:10 GMT

Mark,

Why not directly instantiate the RAMB4_S16_S16 in your HDL? In this
case, Coregen just adds a layer of unnecessary complexity.

Jeff

In article <88c4df$4r2@news.Informatik.Uni-Oldenburg.DE>,
  "Mark Hillers" <Mark.Hillers@Informatik.Uni-Oldenburg.DE> wrote:
> Hello,
>
> i think i have found a bug in xilinx-tool coregen 2.1i.
> it happens when creating single-port-blockrams with words larger than 16
> bit.
>
> the resulting ".edn"-file (for synopsys) uses one RAMB4_S16_S16
> component where the
> lower 16 bit of the 24-bit-word are mapped to port A (DOA[15:0]) and the
> upper 8 bit are mapped to port B (DOA[15:8]).
> But - and here comes the bug - the address of the desired word is simply
> mapped to both address-ports (ADDRA and ADDRB (8 bit wide)) the
> following way:
>
> ADDRA(4 downto 0) = myaddress(4 downto 0)
> ADDRA(7 downto 5) = "000"
> ADDRB(4 downto 0) = myaddress(4 downto 0)
> ADDRB(7 downto 5) = "000"
>
> The problem is now that both ports always load the same address and with
> it the same data. The result is an output which has the form CDABCD
> where A,B,C,D are hex digits.
>
> In application-note XAPP130 (V1.1) is a solution to this problem. The
> mapping of the address-ports should be:
>
> ADDRA(4 downto 0) = myaddress(4 downto 0)
> ADDRA(7 downto 5) = "000"
> ADDRB(4 downto 0) = myaddress(4 downto 0)
> ADDRB(7 downto 5) = "100"
>
> Now I am looking for a simple patch. The simplest would be a new version
> of coregen because i am not good at writing ".edn"-files :-(.
>
> greetings
> mark
>
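
The address-mapping problem described in the quoted post can be modelled
roughly as below. This is a behavioural sketch, not Coregen output: the
memory is treated as a flat 256 x 16 array shared by both ports, and it is
assumed (inferred from the CDABCD symptom) that port B's low 8 data bits
supply the upper byte of the 24-bit word. All names are hypothetical.

```python
def read_word(mem, my_address, addrb_high):
    """Read a 24-bit word through two 16-bit ports of one block RAM.

    addrb_high is the 3-bit value driven onto ADDRB(7 downto 5).
    """
    addr_a = my_address & 0x1F                       # ADDRA(7 downto 5) = "000"
    addr_b = (addrb_high << 5) | (my_address & 0x1F)
    low16 = mem[addr_a]                              # port A: lower 16 bits
    high8 = mem[addr_b] & 0xFF                       # port B: upper 8 bits (assumed)
    return (high8 << 16) | low16


mem = [0] * 256          # 4096 bits organised as 256 x 16
mem[0x03] = 0xABCD       # lower 16 bits of word 3
mem[0x83] = 0x0012       # upper 8 bits of word 3, stored in the upper half

buggy = read_word(mem, 3, 0b000)   # both ports hit address 3
fixed = read_word(mem, 3, 0b100)   # XAPP130-style mapping

print(f"buggy: {buggy:06X}")  # CDABCD
print(f"fixed: {fixed:06X}")  # 12ABCD
```

With both ports pointed at the same location, the upper byte is just a
copy of the low byte of the same 16-bit entry, matching the reported
pattern.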



Article: 20636
Subject: Re: Xilinx Virtex Reset
From: mark.luscombe@lineone.net (Mark Luscombe)
Date: Wed, 16 Feb 2000 19:53:29 GMT

Ray,

Yes, if it doesn't meet 74MHz, then the GSR net is useless for us, but
it is worth enquiring.

If a non GSR-net reset is used, then this uses routing (I know Xilinx
say that Virtex has plentiful routing) that could impact timing etc.,
and it uses the SR input of the Virtex CLB, meaning that RAM LUTs can't
coexist with DFFs.

I understand your concern about synchronising the Reset signal with a
CLB DFF (chicken and egg scenario), but there is a technical note by
Peter Alfke (?) which details the construction of a synchronising FF
with a CLB's LUTs (cross coupled NAND gates).

Cheers, Mark.


On Mon, 14 Feb 2000 23:53:14 GMT, Ray Andraka <randraka@ids.net>
wrote:

>It's only free if it meets timing.  I think you'll find that you are past it at
>74MHz.  Even in the 4K parts, GSR was only good up to a fraction of the clock
>rate the part can easily achieve with careful design.  Also, the GSR hits every
>single flip-flop in the design, which in some cases can cause you grief
>(especially when you consider the need to resync the reset).
>
>Mark Luscombe wrote:
>
>> Ray,
>>
>> Thanks for your input, but if the GSR net is used, then routing and
>> CLB resources are not used, as it is a "free" function.
>>
>> The Xilinx Rep is coming to see me Monday, so hopefully he'll be able
>> to say whether i can use the GSR net at 74MHz.
>>
>> Cheers, Mark.
>>
>> On Sun, 13 Feb 2000 17:27:01 GMT, Ray Andraka <randraka@ids.net>
>> wrote:
>>
>> >Not every flip flop in an FPGA design needs to be reset;  You only need to
>> >reset select flip-flops to make sure that 'loops' in the logic reach a known
>> >state after some number of clock cycles.  Data path will self clear, so
>> >there's no need to apply explicit resets.  You may also want to reset the
>> >flip-flops closest to the outputs, and hold them reset for a number of
>> >clocks after reset is released.
>> >
>> >I know that this makes the ASIC guys blood curdle, but the fact of the
>> >matter is that it uses up resources in the FPGA and slows down your design.
>> >
>> >Rickman wrote:
>> >
>> >> Mark Luscombe wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > I am trying to work out a satisfactory method for resetting a
>> >> > synchronously designed Virtex running at 74MHz.
>> >> >
>> >> > Now, the reset signal needs to be synchronised with the 74MHz clock
>> >> > and the propagation delay from this to the CLB and IOB DFFs needs to
>> >> > be less than 13ns to ensure that all registers within the device are
>> >> > reset on the same clock edge.
>> >> >
>> >> > Xilinx seem to have been telling people not to use the GSR net, as it
>> >> > is too slow, but it does seem a pity not to use it, and use extra
>> >> > routing and CLB inputs for a global reset.
>> >> >
>> >> > It seems as though the STARTUP_VIRTEX component can accept a USER_CLK
>> >> > input, i.e. the 74MHz, so is this a good solution ?
>> >> > Also, this component has a GSR input for an external reset signal,
>> >> > does anybody know if this is also synchronised with the USER_CLK input
>> >> > ?
>> >> > The device is configured in 8-bit parallel with CCLK which is related
>> >> > to the 74MHz.
>> >> >
>> >> > What have other designers done in this situation.
>> >> >
>> >> > Cheers, Mark.
>> >>
>> >> This is a subject that is often discussed here. What you describe with
>> >> using a user clock for startup is one way to do it. That should work if
>> >> the GSR net is fast enough to operate within your clock cycle.
>> >>
>> >> Another way to use the GSR which does not depend on synchronized release
>> >> of the GSR is to make sure that all of the inputs to your various FSMs
>> >> or other synchronous logic are in a state that will not cause the
>> >> machines to make a state change. For example if the FSMs reset to an
>> >> IDLE state, then make sure that none of the inputs that let the machine
>> >> leave the IDLE state are asserted. Then even if the GSR is released on
>> >> different clock cycles for different FFs, it will not matter.
>> >>
>> >> Or use a couple of delay FFs to generate (from the GSR) a separate,
>> >> synchronized input to the FSMs which will delay state changes from this
>> >> initial state until it releases. This net will not need to go to all of
>> >> the FFs in your design and can be routed much faster.
>> >>
>> >> Another method which is similar to this last one is to have a separate,
>> >> external reset signal which is controlled by a micro or other logic.
>> >> This will only be released well after the config is complete and is
>> >> synchronized to the clock. As in the last method, this reset will not
>> >> need to go to every FF in the FPGA and so can be routed more quickly.
>> >>
>> >> The GSR is nice in that it puts every FF into a known state and it is
>> >> asynch so it does it NOW! But releasing it can be a problem. A second,
>> >> more limited reset is a good way to get the FPGA started on the right
>> >> foot.
>> >>
>> >> I can't remember other ways that have been described, but I am sure
>> >> there are some.
>> >>
>> >> --
>> >>
>> >> Rick Collins
>> >>
>> >> rick.collins@XYarius.com
>> >>
>> >> remove the XY to email me.
>> >>
>> >> Arius - A Signal Processing Solutions Company
>> >> Specializing in DSP and FPGA design
>> >>
>> >> Arius
>> >> 4 King Ave
>> >> Frederick, MD 21701-3110
>> >> 301-682-7772 Voice
>> >> 301-682-7666 FAX
>> >>
>> >> Internet URL http://www.arius.com
>> >
>> >--
>> >-Ray Andraka, P.E.
>> >President, the Andraka Consulting Group, Inc.
>> >401/884-7930     Fax 401/884-7950
>> >email randraka@ids.net
>> >http://users.ids.net/~randraka
>> >
>> >
>
>--
>-Ray Andraka, P.E.
>President, the Andraka Consulting Group, Inc.
>401/884-7930     Fax 401/884-7950
>email randraka@ids.net
>http://users.ids.net/~randraka
>
>
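
The "couple of delay FFs" idea from Rick Collins' quoted text can be
illustrated with a cycle-level model. This is a sketch under simplifying
assumptions: every flip-flop is held at 0 while GSR is active, each one may
see GSR released on a different clock edge, and the first flip-flop's D
input is tied to 1. The function name and the release pattern are made up.

```python
def simulate_reset_chain(release_cycles, n_cycles=8):
    """Simulate a shift chain whose flip-flops leave GSR reset at different times.

    release_cycles[i] is the first clock edge at which flip-flop i is no
    longer held in reset. Returns the last stage's value after each edge.
    """
    state = [0] * len(release_cycles)
    last_stage = []
    for cycle in range(n_cycles):
        prev = state[:]                       # values just before this edge
        for i, release in enumerate(release_cycles):
            if cycle < release:
                state[i] = 0                  # still held in reset by GSR
            else:
                state[i] = 1 if i == 0 else prev[i - 1]
        last_stage.append(state[-1])
    return last_stage


# All three flip-flops released together.
print(simulate_reset_chain([0, 0, 0]))  # [0, 0, 1, 1, 1, 1, 1, 1]
# Skewed release: the chain output still rises once and stays high.
print(simulate_reset_chain([1, 0, 1]))  # [0, 0, 0, 1, 1, 1, 1, 1]
```

The last stage gives a single clean 0-to-1 transition that can gate the
FSM inputs, and it only needs to be routed to the logic that cares.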

Article: 20637
Subject: Simple (?) Question about FPGA Test/Demo Boards....
From: "Xanatos" <deletemeaoe_londonfog@hotmail.com>
Date: Wed, 16 Feb 2000 19:56:18 GMT

Hey all,

As we are approaching the end of a few projects, we have a need to make a
few demo boards for our FPGAs. We want to have two - a Virtex board for one
design, and an Altera APEX board for another. I have never done a demo board
before, and while I am not responsible for it, it would heighten my learning
to find out a little more about them.

My question - What are some of the basic/general things that you include on
a demo board? For instance, we plan on having a PCI interface on the APEX
board....granted, that is specific to what we want, but a general "Well,
when I start a demo board, the first things I make sure I have on the board
are: " is what I am asking.

Thanks - please post or email (remove the spam block to email)

-Xanatos

Article: 20638
Subject: Re: CIC Question
From: Ray Andraka <randraka@ids.net>
Date: Wed, 16 Feb 2000 19:57:51 GMT

I think Hogenauer's paper took into account the growth bits.  Much of his
paper was dedicated to truncating the LSBs and the effects of doing so.
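
As a quick sanity check on the growth bits, the sketch below computes the
width needed at the final stage of a CIC interpolator. It assumes the
worst-case gain at the last stage is (R*M)^N / R, the expression usually
quoted from Hogenauer's paper; it does not attempt the per-stage widths,
and the function name is illustrative.

```python
def cic_interp_output_bits(b_in, r, m, n):
    """Register width at the last stage of an N-stage CIC interpolator.

    Assumes a worst-case overall gain of (R*M)**N / R with integer R and M.
    """
    gain = (r * m) ** n // r
    growth_bits = (gain - 1).bit_length()   # ceil(log2(gain)) for gain >= 1
    return b_in + growth_bits


# Parameters from the original post: Bin = 12, R = 4, M = 1, N = 4.
print(cic_interp_output_bits(12, 4, 1, 4))  # 18
```

The result agrees with the 18-bit width listed for the last stage in the
original post.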

flavioas@my-deja.com wrote:

> In article <38A80F1C.499D85@ids.net>,
>   Ray Andraka <randraka@ids.net> wrote:
> > You need to use signed arithmetic.  The input and all subsequent stages
> > should be sign extended to the width of the adders.
>
>     I made a mistake; I was using STD_LOGIC instead of SIGN, and
> my input was in 2's complement (direct from the A/D converter). Now
> the sign bit generation is correct!
>
> > Also, keep in mind
> > there is a gain through the filter, so you need to either limit the input
> > bits or extend the output width to accommodate the gain.  You might try not
> > truncating first to get it working then prune the adders.  At least then
> > you'll be able to determine if the pruning is causing the problem.
>
>       Now I'm a little bit confused. I didn't truncate any
> register except the last one. Should I use more bits for each
> intermediate stage than defined by Hogenauer's paper?
>
>      TIA,
>
>       Flávio
>
> >
> > flavioas@my-deja.com wrote:
> >
> > >     Hi,
> > >
> > >     We are attempting to implement a CIC Interpolation Filter,
> > > following Hogenauer's recipe. In a flex10k device.
> > >     The parameters are : Bin = 12; Bout = 12; R = 4; M = 1 and N = 4.
> > >     So, the minimum register width for each stage is : 13, 14, 15, 15,
> > > 15, 16, 16, 18.
> > >     We used 2's complement addition rules, i.e., all numbers unsigned,
> > > simple binary addition, and carries past the sign bit are ignored.
> > >     We did the proper sign extension from one stage to another.
> > >     But the overall frequency response is not what we expected; it didn't
> > > work.
> > >     Looking at freq. resp. at each stage, we found the desired shape
> > > till the first ( N+1 ) Interpolator stage. From this point, as we add
> > > more stages ( N+2,..., 2N) things get worse. The outputs saturate,
> > > we think, and we see nothing useful.
> > >     Is this problem familiar to someone? Any hint?
> > >
> > >     Thanks in Advance,
> > >
> > >     Flávio
> > >
> > > Sent via Deja.com http://www.deja.com/
> > > Before you buy.
> >
> > --
> > -Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email randraka@ids.net
> > http://users.ids.net/~randraka
> >
> >
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 20639
Subject: Re: FPGA Express/XC4KXLA annoyance
From: "Andy Peters" <apeters.Nospam@nospam.noao.edu.nospam>
Date: Wed, 16 Feb 2000 13:25:17 -0700
Links: << >> << T >> << A >>

John Fielden wrote in message <88cs39$dkp$1@schbbs.mot.com>...
>Which OBUFT do you have instantiated?  There are two different kinds,
>active-high and active-low enabled.  Maybe you need the opposite one.

Not instantiated; inferred.  The IOB has a mux that selects the proper
polarity, so it shouldn't have mattered.

One of the Xilinx apps guys was kind enough to send me a note about this.
Apparently, it's a known issue that Synopsys dropped the ball on.  It's
something that would only become obvious when you aren't meeting timing and
you start peeking into what's going on.


-- a
-----------------------------------------
Andy Peters
Sr Electrical Engineer
National Optical Astronomy Observatories
950 N Cherry Ave
Tucson, AZ 85719
apeters (at) noao \dot\ edu

"Money is property; it is not speech."
            -- Justice John Paul Stevens

Article: 20640
Subject: Logiblox and virtex
From: "Federico Silla" <fsilla@gap.upv.es>
Date: Wed, 16 Feb 2000 20:35:24 -0000
Links: << >> << T >> << A >>

Hi everybody,

I am trying to get started with FPGA design on a Virtex. Up to now, I
have designed my circuits with the Xilinx 4000 family, using LogiBLOX when
needed. However, now that I have moved to Virtex, I have realized that the
Foundation software (version 2.1i) does not allow LogiBLOX modules to be
designed for Virtex. Does anybody know where LogiBLOX has gone? Does anyone
know of another design tool to use instead of LogiBLOX?

Thanks in advance

Federico Silla

Article: 20641
Subject: Xilinx hold time problems...
From: jhallen@world.std.com (Joseph H Allen)
Date: Wed, 16 Feb 2000 20:46:34 GMT
Links: << >> << T >> << A >>

The M2.1i software now reports hold times on input pads in the data sheet
timing report file, and, of course, I have some significant (up to 2.5 ns)
hold times relative to the system clock.

This does not happen when using the IOB flip flop, with its delay line.  It
does happen when there is a small amount of logic between the input and the
first flip flop (so that the IOB flip flop can not be used), and when both
are placed together in a CLB near the pad.

What is the best (easy+automatic) way to eliminate these hold times?  Has
anyone else noticed this?

-- 
/*  jhallen@world.std.com (192.74.137.5) */               /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}

Article: 20642
Subject: Re: Xilinx hold time problems...
From: Magnus Homann <d0asta@mis.dtek.chalmers.se>
Date: 16 Feb 2000 22:04:14 +0100
Links: << >> << T >> << A >>

jhallen@world.std.com (Joseph H Allen) writes:

> The M2.1i software now reports hold times on input pads in the data sheet
> timing report file, and, of course, I have some significant (up to 2.5 ns)
> hold times relative to the system clock.
> 
> This does not happen when using the IOB flip flop, with its delay line.  It
> does happen when there is small amount of logic between the input and the
> first flip flop (so that the IOB flip flop can not be used), and when both
> are placed together in a CLB near the pad.
> 
> What is the best (easy+automatic) way to eliminate these hold times?  Has
> anyone else noticed this?

No, I haven't noticed that they had hold times. I wonder if you can
specify it in the timing constraints. Might be a good idea.

I'm not sure why you want to eliminate hold times on the output. I
thought one normally wanted to eliminate clock to out. They might of
course be related, but not always. Remember, the hold times are
probably minimum, with the (implied) maximum at Tco.

Homann
-- 
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se

Article: 20643
Subject: Re: Xilinx hold time problems...
From: jhallen@world.std.com (Joseph H Allen)
Date: Wed, 16 Feb 2000 21:16:27 GMT
Links: << >> << T >> << A >>

In article <ltr9ecamgh.fsf@mis.dtek.chalmers.se>,
Magnus Homann  <d0asta@mis.dtek.chalmers.se> wrote:

>No, I haven't noticed that they had hold times. I wonder if you can
>specify it in the timing constraints. Might be a good idea.

I don't see how.

>I'm not sure why you want to eliminate hold times on the output. I
>thought one normally wanted to eliminate clock to out. They might of
>course be related, but not always. Remember, the hold times are
>probably minimum, with the (implied) maximum at Tco.

I'm working on a project which may actually be subjected to the entire
temperature range, plus the chips driving the pins with the hold times are
both fast, subject to clock-skew and close by.  The problem is simplified
(but of course, not eliminated) if the hold times are zero (the timing
window is the same size, but the windows of all the pins are more likely to
maximally overlap if they have the same hold time spec., which gives a
greater overall window size).
-- 
/*  jhallen@world.std.com (192.74.137.5) */               /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
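
[Editor's sketch, not part of the original post.] The overlap argument above can be put in numbers. The sketch assumes each input pin needs its data stable from `setup` ns before the clock edge to `hold` ns after it, so a source driving several pins must cover the worst setup and the worst hold together. The pin values are hypothetical; only the 2.5 ns hold figure comes from the thread.

```python
def combined_window(pins):
    """Total time (ns) the data must stay stable to satisfy every pin.

    pins is a list of (setup_ns, hold_ns) tuples, one per input pin.
    """
    worst_setup = max(setup for setup, _ in pins)
    worst_hold = max(hold for _, hold in pins)
    return worst_setup + worst_hold

# Two pins, each with a 3 ns window of its own (hypothetical numbers).
aligned = [(3.0, 0.0), (3.0, 0.0)]   # both pins have zero hold time
skewed = [(3.0, 0.0), (0.5, 2.5)]    # same window size, shifted by 2.5 ns

print(combined_window(aligned))  # 3.0
print(combined_window(skewed))   # 5.5
```

Each pin's own window is 3 ns in both cases, but the shifted pin makes the driver hold data stable for 5.5 ns instead of 3 ns, which leaves less of the clock period for everything else.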

Article: 20644
Subject: Runtime Conditionals?
From: "Gary Spivey" <spivey@rincon.com>
Date: Wed, 16 Feb 2000 14:25:26 -0700
Links: << >> << T >> << A >>

I just went through the recent discussion in the group about conditional
compilation, and I need to take it to a different level. I want to have a
test bench that does:
procedure ...

if (a) {}
else {}

end proc ...

where a is a value passed in from the command line.
In Verilog, I can simply pass in a plusarg and then use the $test$plusargs
task. Is there a similar function in VHDL to do runtime decisions based on
command line parameters?

Cheers,
Gary
spivey@ieee.com

Article: 20645
Subject: Re: Xilinx hold time problems...
From: "Xanatos" <deletemeaoe_londonfog@hotmail.com>
Date: Wed, 16 Feb 2000 21:28:30 GMT
Links: << >> << T >> << A >>

"Joseph H Allen" <jhallen@world.std.com> wrote in message
news:Fq1J1q.BDA@world.std.com...
> The M2.1i software now reports hold times on input pads in the data sheet
> timing report file, and, of course, I have some significant (up to 2.5 ns)
> hold times relative to the system clock.
>
> This does not happen when using the IOB flip flop, with its delay line.  It
> does happen when there is small amount of logic between the input and the
> first flip flop (so that the IOB flip flop can not be used), and when both
> are placed together in a CLB near the pad.
>
> What is the best (easy+automatic) way to eliminate these hold times?  Has
> anyone else noticed this?
>

I have noticed this, but I have not really come up with a great solution.
The solutions I used:

1) Put a layer of flops in after the input. You might not be able to
accommodate this change, depending on your design, but the extra layer of
flops significantly lowered the hold times for me.
2) Move the nearest flop as close as possible to the IOB. This is a horrid
solution (using RLOC etc).
3) Check, using EPIC or the FPGA Editor in 2.1i, that the inputs are not
going thru the DELAY element. This is turned ON by default in 2.1i, and was
off in 1.5. Go figure. In the UCF file, tag all input pins with NODELAY
if you haven't done so already. This one seemed to save the most in the
timing department for me.

Hope it helps....and if not, sorry, but I tried.
-Xanatos

Article: 20646
Subject: Writing to STDOUT?
From: "Gary Spivey" <spivey@rincon.com>
Date: Wed, 16 Feb 2000 14:38:20 -0700
Links: << >> << T >> << A >>


Here's another (hopefully) easy one ...

I am trying to write a debug message to stdout. It appears that the only way
to do this is with the assert command (which also gives me a bunch of other
stuff). And if I want to view variables, it gets even more arcane.

So, in Verilog, I would type
$display ("mem[%d] = %d", i, mem[i]);

and in VHDL I get something like
assert (1=0) report "mem[" & integer'image(i) & "] = " &
integer'image(mem(i)) severity note;

Is this the only way to do this? Does textio only work on a file or can it
be used on stdout?

Cheers,
Gary
spivey@ieee.org

Article: 20647
Subject: Re: Choosing the correct size FPGA
From: Peter Alfke <peter@xilinx.com>
Date: Wed, 16 Feb 2000 13:40:04 -0800
Links: << >> << T >> << A >>

This is a non-trivial problem.
My first approach would be to estimate the number of flip-flops, and then
select a part that has twice as many as required. All vendors tell you how
many flip-flops they have per block (XC4000 has 2, Virtex and Spartan2
have 4 ffs per CLB).
But that can be wrong in either direction. If you have a lot of complex
combinatorial logic, the estimated chip might be too small. If you can
"hide" some of what you think of as ffs in the RAMs, BlockRAMs or even the
16-bit shift registers available in each Virtex LUT, then you can be far
more compact.

I think you have to invest a bit of time in studying the architectures.
Hell, there are only two or three contenders. I obviously prefer the "end of
the alphabet"...

Peter Alfke, Xilinx Applications
===============================
Richard Dempster wrote:

> I am very new to programming FPGA's but old (today's standards) in C
> programing.  I am trying to find a generic method of defining what size
> FPGA I need given a certain algorithm most likely written in C.  Thus, I
> have some idea what the compiled size of the C program but can I relate
> the compiled size to how many gates will be necessary in a FPGA.
>
> I have four algorithms to code into a FPGA.  Two are data manipulation,
> one is pixel compression using a new algorithm, and the last is a
> standard Principal Component Analysis algorithm.
>
> I know the software used to program the FPGAs will give an apporximate
> gate count but only after development of the VHDL code.
>
> As this will be a proposal, I was wondering if there was a method to
> estimate the FPGA size prior to all the encoding.  I know I don't want
> to oversize the FPGA area due to unnecessary time delays and power
> requirements.
>
> ANY information will be helpful since I haven't found any information
> yet and I seemed to have run out of places to search on the WEB.  None
> of the tutorials seem to deal with the algoritm to gate mapping.  The
> two books I have don't mention this either.
>
> Thanks
> Rick Dempster
> red@usgmrl.ksu.edu
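
[Editor's sketch, not part of the original post.] The flip-flop rule of thumb in the reply above is easy to turn into a first-pass calculation. The flip-flops-per-CLB figures are the ones given in the reply; the function name, the 3000 flip-flop estimate and the default headroom factor of 2 are illustrative assumptions. As the reply warns, heavy combinatorial logic or use of RAM and shift-register LUTs can push the real answer either way.

```python
import math

# Flip-flops per CLB as stated in the reply above.
FFS_PER_CLB = {"XC4000": 2, "Virtex": 4, "Spartan2": 4}

def clbs_for(flip_flops, family, headroom=2.0):
    """CLBs needed so the part has `headroom` times the estimated flip-flops."""
    target_ffs = flip_flops * headroom
    return math.ceil(target_ffs / FFS_PER_CLB[family])

# Hypothetical design estimated at 3000 flip-flops.
print(clbs_for(3000, "XC4000"))  # 3000
print(clbs_for(3000, "Virtex"))  # 1500
```

The CLB count can then be compared against the CLB totals in the vendor's data sheet to pick a candidate device for the proposal.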

Article: 20648
Subject: Re: Xilinx hold time problems...
From: Peter Alfke <peter@xilinx.com>
Date: Wed, 16 Feb 2000 13:44:04 -0800
Links: << >> << T >> << A >>

The classical solution to this old problem is to utilize the input flip-flop with
its input delay, but configured as a latch, and hold it permanently transparent.

Peter Alfke
==============================
Joseph H Allen wrote:

> The M2.1i software now reports hold times on input pads in the data sheet
> timing report file, and, of course, I have some significant (up to 2.5 ns)
> hold times relative to the system clock.
>
> This does not happen when using the IOB flip flop, with its delay line.  It
> does happen when there is small amount of logic between the input and the
> first flip flop (so that the IOB flip flop can not be used), and when both
> are placed together in a CLB near the pad.
>
> What is the best (easy+automatic) way to eliminate these hold times?  Has
> anyone else noticed this?
>
> --
> /*  jhallen@world.std.com (192.74.137.5) */               /* Joseph H. Allen */
> int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
> +r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
> ]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}

Article: 20649
Subject: Re: Xilinx hold time problems...
From: Magnus Homann <d0asta@mis.dtek.chalmers.se>
Date: 16 Feb 2000 22:55:55 +0100
Links: << >> << T >> << A >>

jhallen@world.std.com (Joseph H Allen) writes:

> In article <ltr9ecamgh.fsf@mis.dtek.chalmers.se>,
> Magnus Homann  <d0asta@mis.dtek.chalmers.se> wrote:
> 
> >No, I haven't noticed that they had hold times. I wonder if you can
> >specify it in the timing constraints. Might be a good idea.
> 
> I don't see how.

*blush* You clearly said hold times on the INPUT. As you can see
below, I read it as output. That's why I'm confused, and you are right.
 
> >I'm not sure why you want to eliminate hold times on the output. I
> >thought one normally wanted to eliminate clock to out. They might of
> >course be related, but not always. 

Homann
-- 
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se
