Messages from 38800

Article: 38800
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 12:25:32 -0600
Links: << >> << T >> << A >>

To decrease output delays, use IOB output FF and tri-state control FF if
that is possible, and adjust output drive strength if you are using
LVTTL signaling (Which you are using fortunately.).
To use IOB output FF and tri-state control FF, you must not have
ANYTHING after a tri-state buffer, or the FFs will not be pushed inside
the IOB.
When I say, "ANYTHING," what I mean is, ABSOLUTELY no LUTs after the
tri-state buffer, and certainly no feedback from the either of the FFs
to the LUTs.
To explain further, no LUTs after a tri-state buffer means that
absolutely no multiplexers, no logic gates (AND, OR, XOR, etc.), or not
even an inverter (NOT gate) after the tri-state buffer.
No feedback means that you cannot use the IOB FFs as inputs of LUTs.
In HDL, that means that you won't be allowed to use FFs inside an IOB
for right side of a equation.
        My guess is that very likely your design requires those two FFs
to retain their values from time to time, and you may think if feedback
is not allowed, those FFs pushed into IOB will not be able to retain
their values.
To overcome that problem, several synthesis tools have an option to
generate a duplicate FF to emulate feedback and value retention.
If you happened to use XST (Xilinx's synthesis tool) Ver. E or later
(ISE Foundation or WebPACK 4.1 or later), set Processes for Current
Source -> Synthesize -> (right click) Properties -> Xilinx Specific
Options -> Pack I/O Registers into IOBs to "Yes."
In general, don't use "Auto" for this option because some FF may not get
pushed into the IOB.
If you need to have some output and tri-state control FFs pushed into
IOB, but not for all of the output pins, use a synthesis constraint file
to specify which ones you want pushed in.
To learn more about a synthesis constraint file, download an XST User
Guide from Xilinx.
If you happened not to use XST, then whatever synthesis tool you use
should have a option similar to Pack I/O Registers into IOBs (I believe
Synplify or LeonardoSpectrum supports the feature I am talking about).
Also, in MAP, Pack I/O Registers/Latches into IOBs (Processes for
Current Source -> Map -> (right click) Properties -> Pack I/O
Registers/Latches into IOBs) has to be "For Inputs and Outputs" or "For
Outputs Only."
I recommend using "For Outputs Only" because I have observed that the
use of IOB input FF sometimes worsens setup time (probably due to wire
congestion near the input pin).
        One pitfall I noticed once is that when the global fanout is
limited to a certain value, some IOB FF may not get pushed even though
you followed all the things I wrote so far.
The culprit I discovered by viewing the logic generated through
Floorplanner is that when the global fanout is restricted, the fanout of
an asynchronous reset signal also gets restricted, and when that
happens, an asynchronous preset/reset signal seems to get routed through
CLB cells.
When some LUTs get used as a route-through LUT to limit fanout, that
seems to break a rule (or a condition) that allows FFs to get pushed
into IOBs
To avoid this pitfall, set a very high fanout number for the
asynchronous reset signal through a synthesis constraint file.
        Also, use of IOB output and tri-state control FF tends to make
setup time worse, so you may now have to struggle with decreasing setup
time.
To do so, reducing the levels of logic (levels of LUT) seems to be the
only way to do it, and in a LUT-based FPGAs, reducing the number of
inputs used to determine an output is very important.
        I don't know if you are using IOB FFs from what you wrote (I am
guessing you are not.), but a few months ago I really struggled with
this issue for weeks, and in general, Xilinx support page or their
manuals don't go far enough to explain the rules and the pitfalls I
mentioned here.
I think the stuff I wrote here is far more helpful than whatever I have
seen at their website, but if I have to explain some things in more
detail, I can do that.
        To decrease output delay when using LVTTL signaling, use 24mA
current drive strength option because that is the faster than 12mA you
are using currently.
However, that might make your design more susceptible to ground bounce,
so if using IOB FFs is enough to meet your 6ns timing, it is probably
better to stick with a 12mA or lower current drive strength.
If I remember correctly, the current drive strength option can be
adjusted only if you are using LVTTL signaling, but not when you use PCI
or other signaling standards.




Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)



Markus Meng wrote:
> 
> Hi all,
> 
> actually I need to 'fight' with the UCF File in order to decrease the delay
> until the databus buffers do open from tristate to drive. I have the
> following:
> 
> When dsp_csn AND dsp_rdn are low, then open the buffers of the data bus.
> In the UCF-File I did the following setting, which seems impossible to meet.
> Each time I get 7.5..8.3ns until the buffers are open and driving.
> 
> Can anybody give me an advice how I can improve it, without changing the
> pinning or the layout...
> 
> -- Snip ucf-file
> 
> ############################################################################
> ###
> ## Trying to make to Main DSP output enable faster. 24.01.2002 -mm-
> TIMEGRP DSPRdDataPath = PADS(dsp_data(*));
> ############################################################################
> ###
> 
> TIMEGRP DSPRdCtrlPath = PADS(dsp_csn : dsp_rdn);
> 
> ############################################################################
> ###
> 
> TIMESPEC TSP2P = FROM : DSPRdCtrlPath: TO : DSPRdDataPath : 6ns;
> ############################################################################
> ###
> 
> ############################################################################
> ###
> ## Increase Driver Strength 25.01.2002 -mm-
> NET "dsp_data(*)" IOSTANDARD = LVTTL;
> NET "dsp_data(*)" DRIVE = 12;
> -- Snip ucf-file
> 
> markus
> 
> --
> ********************************************************************
> ** Meng Engineering        Telefon    056 222 44 10               **
> ** Markus Meng             Natel      079 230 93 86               **
> ** Bruggerstr. 21          Telefax    056 222 44 10               **
> ** CH-5400 Baden           Email      meng.engineering@bluewin.ch **
> ********************************************************************
> ** Theory may inform, but Practice convinces. -- George Bain      **

Article: 38801
Subject: Re: Dynamic Reconfiguration of single Xilinx FPGA
From: Philip Freidin <philip@fliptronics.com>
Date: Fri, 25 Jan 2002 18:30:02 GMT
Links: << >> << T >> << A >>

Good news, I know exactly what is going wrong.

During config, normal I/O pins are tri-state. In other email
you said you are not using a special programming pin like
LDC or HDC to connect to your /program pin. Even if you did
use HDC, (which then becomes a normal I/O after config, you
would still have a problem.

When config finishes, the I/O pins go active, and the output
will switch to whatever your circuit in the FPGA has as its
initial value.

You need to do two things:

1) A pull up resistor on the net between the general purpose
output pin and the /program pin. This is the net that you will
be driving low to start reconfiguration.

2) You need to make sure that the output value as the chip
changes from configuration to active is logic HIGH. (I am sure
this is your basic problem). Here's how I have done this:

Use an IOB flipflop to drive you control signal out and place
an INIT=S attribute on it. This will make the initial state
a logic HIGH. You also need to have the logic that feeds this
flipflop also initially supply a logic HIGH to its D input,
as the chip starts up, and continues high till you want to
reconfigure.

You can check that you have this right by putting a high speed
scope on the signal, and trigger the scope (and view it as well)
the DONE pin. If you dont have things right, you will see the
DONE pin go high, then as your I/O goes active, if it switches
from pulled up high, to LOW, then you need to fix your logic.
It should stay high.

This can work, I have done it (about 9 years ago  :-)

Philip.

On Thu, 24 Jan 2002 22:00:02 +0800, "Fong Chii Biao" <ericfcb@tm.net.my> wrote:
>Hi, i really need help bout this, as i worked for this for a week, no result
>turns out.
>first, any one ever try reconfigure Xilinx FPGA using the chip itself? (i'm
>using a single XC4010XL)
>
>the problem is.. when i connect the user I/O to the /program pin, the
>configuration can't even complete at power start up..
>when i disconnect the I/O .. the configuration working pretty well.. whats
>the solution for this?
>
>anyone, anyone at all, who has any idea, ple reply to me, thnaks
>
>chiibiao
>

Philip Freidin
Fliptronics

Article: 38802
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: Chris <chris@cgschneider.com>
Date: 25 Jan 2002 19:40:30 +0100
Links: << >> << T >> << A >>

Just put the fast attribute at that pin:

ucf-file entry:

NET <your output signal> FAST;

This improves timing by 2.5 ns, according to the datasheet;
using 24 mA current in additionally gives you a total improvement of 3.0 ns.

Fast enough?

No? Then do floorplaning and put the LUT near the pad.

Best Regards,

Chris

Article: 38803
Subject: Re: Xilinx webpack
From: "Falk Brunner" <Falk.Brunner@gmx.de>
Date: Fri, 25 Jan 2002 19:43:39 +0100
Links: << >> << T >> << A >>

"Russell Shaw" <rjshaw@iprimus.com.au> schrieb im Newsbeitrag
news:3C514E29.481AF4D8@iprimus.com.au...

> Does the free xilinx webpack come with a technology viewer?
> (shows a schematic equivalent of your vhdl)

AFAIK, no (I havnt found it)

> What is used for the vhdl compiler? Is it VHDL-93?

XST, which is AFAIK VHDL 93 compliant.

> How optimal is the routing generation, is it recommended
> to 'adjust' the output manually? (for sram based fpgas)

This is not a matter of synthesizer, its the Place & Route. These toola are
identical to the professional Software (Foundation/Alliance)

> Does it come with any timing analysis tools?

Yep.

> Is there anything major omitted compared to a 'real' tool?

Hmm, some people miss the FPGA editor. But you still have floorplanner.

> What chip is smallish and sram based like an acex-1k30?

Spartan-II 30 XC2S30

--
MfG
Falk

Article: 38804
Subject: Re: Intel vs. AMD
From: Rick Filipkiewicz <rick@algor.co.uk>
Date: Fri, 25 Jan 2002 20:19:05 +0000
Links: << >> << T >> << A >>

Mark Kinsley wrote:

> Has anybody done any benchmarking of EDA software under different hardware
> platforms...  I'm a brand-name sucker and tend to buy Intel - but how does
> AMD compare.  And how does a Celeron compare to a similar speed P3.  I'm
> using ModelSim, Leonardo Spectrum & Quartus (all under Windows).
>
>  I've seen loads of benchmarks on the web which talk about games, office
> apps and perhaps DTP - which of these applications is most simliar to EDA
> software ?
>
> Mark

My 1.3GHz Athlon with  512MB of 266MHz DDR memory scorches along.  We paid
about £1500 for it about a year ago and it looks like I could get it for much
less now., or the same price but 1GB of memory.

I'd love to do a comparison with a P4+RAMBUS.

Article: 38805
Subject: Pin assignment on ACEX1K
From: "Jeroen Van den Keybus" <vdkeybus@esat.kuleuven.ac.be>
Date: Fri, 25 Jan 2002 22:24:34 +0100
Links: << >> << T >> << A >>

Hello,

I want to connect an EP1K100 ACEX to a 32-bit host which will access it
asynchronously. So there is a 32 bit data bus and a 16 bit address bus and
some control signals (nWE, nRD, nCS). A colleague of mine will be designing
the PCB for it and he would like to start routing asap even while the FPGA
software is still being written. So he wants to have a complete pin
assignment already. As a matter of fact, the FPGA software will probably
often be rewritten on the same PCB to accomodate different lab setups.

My question: are there any guidelines regarding the pin position or should I
rather have him (my colleague) define the pinout for easiest layout. More
precisely, should we rather put D[0..31] and A[0..15] on column or row
interconnects (the ACEX will be written to and read from). We'd hate to see
the Max+2 fitter fail at the end of the month just because of stupid pin
positions we can't change anymore.

Because of this issue we have already oversized the FPGA, normally max.
60-70% of the LE's will be used. But apart from EMC guidelines stating that
'bunching' of large groups af signals should be avoided, we have found
nothing more on this topic.

Jeroen.

Article: 38806
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: meng.engineering@bluewin.ch (Markus Meng)
Date: 25 Jan 2002 13:47:01 -0800
Links: << >> << T >> << A >>

meng.engineering@bluewin.ch (Markus Meng) wrote in message news:<aaaee51b.0201250717.3a326ebb@posting.google.com>...
> Hi all,
> 
> actually I need to 'fight' with the UCF File in order to decrease the delay
> until the databus buffers do open from tristate to drive. I have the
> following:
> 
> When dsp_csn AND dsp_rdn are low, then open the buffers of the data bus.
> In the UCF-File I did the following setting, which seems impossible to meet.
> Each time I get 7.5..8.3ns until the buffers are open and driving.
> 
> Can anybody give me an advice how I can improve it, without changing the
> pinning or the layout...
> 
> -- Snip ucf-file
> 
> ############################################################################
> ###
> ## Trying to make to Main DSP output enable faster. 24.01.2002 -mm-
> TIMEGRP DSPRdDataPath = PADS(dsp_data(*));
> ############################################################################
> ###
> 
> TIMEGRP DSPRdCtrlPath = PADS(dsp_csn : dsp_rdn);
> 
> ############################################################################
> ###
> 
> TIMESPEC TSP2P = FROM : DSPRdCtrlPath: TO : DSPRdDataPath : 6ns;
> ############################################################################
> ###
> 
> ############################################################################
> ###
> ## Increase Driver Strength 25.01.2002 -mm-
> NET "dsp_data(*)" IOSTANDARD = LVTTL;
> NET "dsp_data(*)" DRIVE = 12;
> -- Snip ucf-file
> 
> markus


After doing various zenarios together with a design engineer
at memec design service. We got the following. Using SP3 of 4.1
we get 7.5ns at 12mA driver strength using the fast option.

The one CLB containing the logic to open the buffers during DSP
read sequence is 'HandPlaced' through the UCF-File. Low Skew Net
has been used, however this is in fact more contra-productive. Ok
as the same says it has low skew, but more delay ... it would be
somehow nice if the place and route tools from Xilix would allow
several runs of only small part of the logic until those logic
parts do meet the timings ...

What is theoretically the fastest possible output time for this device?
Question goes to Xilinx people overthere ...

markus

Article: 38807
Subject: Re: Intel vs. AMD
From: "Mark Kinsley" <Mark.Kinsley@blueyonder.co.uk>
Date: Fri, 25 Jan 2002 22:04:46 GMT
Links: << >> << T >> << A >>

>
> I benchmarked P3 vs P4 recently ...
> http://groups.google.com/groups?hl=en&selm=uu1xqjbud.fsf%40trw.com
>

Interesting results - will be nice to see an AMD comparrison!  I did some
similar benchmarks running Quartus - comparing a PIII-733MHz to a P4-1500MHz
and saw a relatively small difference... my guess was that a PIII-1000MHz
would possible match or better the performance of my P4-1500MHz (SDR133).

Article: 38808
Subject: Re: Does Xilinx Spartan-II have reserved pin for PCI?
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 16:45:56 -0600
Links: << >> << T >> << A >>

What you are talking about is the PCILOGIC or some people call it
"Special IRDY and TRDY pin."
You may want to do a newsgroup search of news:comp.arch.fpga and enter
"special IRDY and TRDY" or "Xilinx IRDY TRDY", and find a past articles
people were discussing about this special feature Xilinx doesn't want
people to know how to use it.
For this "Special IRDY and TRDY pin," you have to use certain pin
location of Spartan-II, or you won't be able to use the feature.
Supposedly, this "Special IRDY and TRDY pin" was a secret for some time,
but recent Spartan-II pinout PDF file mentions which pin has the
feature, so you should not have to worry about it too much.
As Eric Crabill of Xilinx has said, this "Special IRDY and TRDY pin" is
important for 66MHz PCI, and Xilinx further more cheats by adding delay
on the global clock buffer to get more setup time margin, but won't tell
you how to add the delay on the global clock buffer if the user doesn't
pay for 66MHz LogiCORE PCI license.
I know how it can be done (through their bitstream generation program),
but the values that's entered to delay the global clock buffer is a
secret.
I will personally will like to know what the values mean.
        I don't know how you are going to develop your own PCI card, but
if you don't want to risk too much money for prototype board
development, you may want to buy Insight Electronics Spartan-II PCI
Development Kit from Insight Electronics.


http://208.129.228.206/solutions/kits/xilinx/ (Product Introduction
Page)

http://208.129.228.206/solutions/kits/xilinx/spartan-iipci.html
(Insight Electronics Spartan-II PCI Development Kit)



        Unlike some companies (i.e., Altera) who tries to sell you
software or a license for a PCI IP core with a PCI development card,
Insight Electronics lets you buy just only the PCI card only for only
$145, but it is possible that the price may have changed by now.
Of course, the $145 price doesn't include a license for Xilinx LogiCORE
PCI or any other PCI IP core, or device drive development software, so
if you only want the hardware, you will have to do rest of it by
yourself.
You may want to pick up two Spartan-II PCI Development Kit option boards
because it is probably nice to have them, and a parallel port download
cable if you don't own a JTAG or MutiLINX cable.
        Like some else said in this newsgroup in the past, Xilinx's
LogiCORE PCI license for Spartan-II costs only $2,000.
Insight Electronics also seems to resell the same Xilinx LogiCORE 
PCI, but Insight Electronics one seems to also come with the same PCI
Development card I am talking about, so Insight Electronics one seems
like a better deal.
If you work for a company, $2,000 is perhaps about two weeks worth of
salary of a newbie engineer.
From my experience developing my own PCI IP core which I will talk a
little later, developing a PCI IP core can take several months to do
logic design, testbench the design, and get 33MHz PCI timings right (Tsu
< 7ns and Tval < 11ns).
So, if you consider the trouble, $2,000 is not too much money to pay for
a license.
For example, if you will sell a few hundred cards based on Spartan-II
for $300 to $500, the $2,000 license won't be too much per board.
I will have to say that Xilinx didn't pay me to say what I just said,
but if you consider the trouble, it is definitely worth paying $2,000.
        An alternative to Xilinx LogiCORE PCI IP core perhaps will be a
free PCI IP core from opencores.org (http://www.opencores.org), but I am
not sure if it is done yet, and the website isn't clear if their PCI IP
core can meet 33MHz PCI timings (especially Tsu < 7ns).
        If you happened to be really poor like me who cannot afford
anything other than the Insight Electronics Spartan-II PCI Development
Kit and a computer, it is for one of a kind prototype so paying $2,000
for license is too much, for a personal project, or you don't care about
time-to-market, then it might make sense to develop your own PCI IP
core.
If you have no clue what a PCI IP core is like to design, you may want
to download PCI IP core user manuals from companies like Xilinx or
Altera which posts pretty detailed manuals on their websites without
signing an NDA.
Their manuals should let you know what a PCI IP core is like, and
perhaps you may want to develop your own PCI IP core based on their
specification (sort of cloning their PCI IP cores).
Of course, their manuals won't tell you the pitfalls of designing a PCI
state machine, so for that, you will have to come up by yourself (PCI
Specification Appendix B gives you an example state machine, but I found
it easier to come up with my own.).
For tools, you can use Xilinx ISE WebPACK 4.1 for free which comes with
XST synthesis tool and ModelSim XE-Starter.
Technically, ModelSim XE-Starter has a 500-lines of code limit, but it
has a loophole which lets you still simulate a design even if the design
exceeds 500-lines of code although it REALLY slows down after the
500-lines of code (That's why ModelSim XE-Starter is free.).
When I developed my PCI IP core, I took advantage of this loophole, and
was able to do RTL (functional) simulation which exceeded the limit by
4000-lines, and post-route simulation which exceeded the limit by
20,000-lines (Post-route simulation took something like 40 minutes.).
I think XST is an okay synthesis tool for 33MHz PCI design, and the
skill of the designer matters more than the synthesis tool that is used.
        From my experience, the trick to meet 33MHz PCI timings with an
HDL-based PCI IP core will be to design the logic very carefully and
have a good understanding of the target architecture.
Because HDL offers higher level of abstraction than schematics, if the
designer is not being careful in logic design, the designer will likely
write sloppy code with too many "if" statements.
All those "if" statements will eventually come back to haunt the
designer when a synthesis tool creates too many levels of logic for
timing critical non-registered signals like FRAME# and IRDY# in target
mode, and DEVSEL#, TRDY# and STOP# in bus master (initiator) mode
propagating towards AD[31:0] pins.
Any non-registered signals that reach AD[31:0] pins should (or must)
have 3 levels or less LUT if you expect to meet 33MHz PCI Tsu < 7ns
requirement for Spartan-II speed grade -5 (I will say 4 levels or less
for Spartan-II speed grade -6).
In PCI, especially during a no-wait burst cycle, the PCI IP core will
not be able to use a registered version of the inputs, and must use
non-registered (raw) inputs, and still meet 33MHz PCI Tsu < 7ns
requirement.
In addition to careful logic design, manual floorplanning is pretty much
a must because automatic P&R tool just doesn't place the LUTs in the
correct location.
Good understanding of IOB FF rules is also essential especially for 5V
PCI I/O buffers with Spartan-II speed grade -5 (do a Google Groups
search of news:comp.arch.fpga for "[Spartan-II] Fastest Possible Output
Enable Time for -5 Devices ??". My reply to that question should mention
how to get IOB FF part right.).
Assuming you use the Insight Electronics Spartan-II PCI Development Kit 
+ all the option boards + parallel cable + ISE WebPACK, that shouldn't
cost you more than $400.
In addition to the stuff I already mentioned, you may want to buy books
like PCI Local Bus Specification Revision 2.2 from PCISIG
(http://www.pcisig.com) ($40), PCI System Architecture ($40), and PCI
Hardware and Software ($100).
I personally think the PCI Local Bus Specification Revision 2.2 explains
the bus pretty well.
PCI System Architecture is okay, and gives you some additional examples
of possible PCI waveforms to expect that is not emphasized in the
specification.
PCI Hardware and Software is very detailed, but it is very difficult to
understand.
Don't get discouraged even if you don't like PCI Hardware and Software
after reading it because it is too detailed.
All the stuff I mentioned shouldn't cost anything more than $700, so if
you have the time and are willing to design your own PCI IP core, you
can do it very cheap.
        Assuming you are going to eventually develop your own PCB, even
if you decide to write your own PCI IP core, it might still be a good
idea for your PCB to use the exact same pinout as Xilinx LogiCORE PCI's
Spartan-II PQ208 package so that even if you decided to suddenly switch
back to Xilinx LogiCORE PCI, you won't have to change the PCB.
Insight Electronics Spartan-II PCI Development Kit's schematic which you
can download from the above mentioned URL tells you that without getting
a license from Xilinx.
For higher pin count Fine-Pitch BGA package, you may have to license
Xilinx LogiCORE PCI to figure out the pinout (unless someone can
secretly tell you the package pinout).
        I don't know how cost sensitive your design will be, but
Spartan-IIs are really cheap, even if the volume is low.
Last time I checked at Insight Electronics, just buying one Spartan-II
PQ208 speed grade -5 cost only $23, although the minimum order was $50.
Assuming a prototype volume of buying one chip, the price difference
between 150K system gate part and 200K system gate part should only be
about $5, and speed grade difference (-5 and -6) should only be another
$5.
200K system gate part will require a higher density Configuration PROM
(XC18V02) than for 150K system gate part (XC18V01), but that difference
should be about $10.
Fine-Pitch BGA package is much more expensive ($10 to $15 extra) than a
cheaper 208-pin PQFP package, so if 208-pin PQFP package is adequate,
you should stick with that one.
I will say stick with 200K system gate part speed grade -6 part so that
you will have a lot of room, and the chip will be fast.
        The thing Xilinx always lies about is the actual logic density
of the chip because they count Block RAM as gates, so typically the user
has only 1/4 to 1/6 worth of gates (LUT) of what they advertise (150,000
system gate / (4 to 6) = realistic gate count).




Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)





"Deli Geng (David)" wrote:
> 
> Hi, there,
> 
> I'm new to Spartan-II FPGA chip. However, I need to use it in a PCI design.
> So I was wondering if some pins are dedicated for PCI or any pin can be used
> for PCI connection? BTW, can you also provide some advice on how to use
> Spartan FPGA as PCI controller?
> 
> Thanks a lot.
> 
> Regards,
> 
> David

Article: 38809
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: newman5382@aol.com (newman)
Date: 25 Jan 2002 16:05:46 -0800
Links: << >> << T >> << A >>

Some ideas that you may want to research further are:

I believe there is a default delay on input buffers to guarantee a
zero hold time on inputs.  You may see if you can set it to no delay
for your combinatorial logic.

Check the fanout from your critical path input buffers.  Perhaps you
can instantiate a buffer to your noncritical path to reduce the
fanout.

Determine the intrinsic delay of your path (i.e. with no routing
delays).  If you are already close to it, there maybe little else that
can be done without going to a faster speed grade.  Beware of ground
bounce if you decide to go with fast, high drive strength outputs.

If there is a lot of routing delay, consider using the MPPR tool, and
run it through all the cost tables with the placer level at 5.  Using
the best run, retry that run with a routing effort of 5.

You may want to try to placing the LUT driving the tristate inputs
close to the IOB's in question via the floorplanner or FPGA editor,
and using the MPPR again.

You may want to hand route trying to maximize use of long lines.

Things may go quicker if you can stub out the rest of your design, and
see what the tools can do with just the critical path in question.

Good luck,

Newman

Newman

Article: 38810
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 18:06:28 -0600
Links: << >> << T >> << A >>

I have to say that in my reply to your original posting, I forgot to
mention to enable fast slew rate (someone else mentioned that), but
partially because I don't use fast slew rate (Or, more accurately, I
cannot use fast slew rate option for Spartan-II PCI I/O buffers.).
Fast skew rate will help Tco (Clock-to-Output), but again, you may have
to worry more about ground bounce.
Regardless, you should take advantage of IOB FFs, assuming that you
don't have anything (any LUTs) after a tri-state buffer, and my previous
posting should explain how to do that.
Doing so will definitely make your setup time worse, so you may now have
to "fight" to reduce setup time (Tsu).
In my opinion from my own experience struggling to reduce Tco for a
33MHz PCI IP core, a designer who is having problems with Tco isn't
getting something right.
I think most of the times, people have problems trying to meet certain
clock frequency or setup time, and in the case of 66MHz PCI, the setup
time (Tsu < 3ns) is what makes 66MHz PCI hard to achieve in most FPGAs.
Even my synthesizable PCI IP core can meet Tval (Tco) < 6ns for 66MHz
without sweating because I now know how to constrain output and
tri-state control FFs into IOBs.
I get Tval (Tco) = 4.6ns for Spartan-II speed grade -6, and Tval (Tco) =
5.6ns for Spartan-II speed grade -5 when PCI66_3 I/O buffer is used. 
PCI66_3 is likely as fast as LVTTL 24mA with fast slew rate.
But again, Tsu < 3ns is sooooo hard to meet, and even Xilinx has to use
the fastest Virtex/Spartan-II speed grade -6, undisclosed special IRDY
and TRDY pins, and an undisclosed way to put delay on the global clock
buffer to meet 66MHz PCI timings.
        I don't think I should make up my conclusion based on what you
said so far, but I guess I cannot resist writing more about this issue.
Are you paying this engineer from memec something like US$50 per hour
for "consultation"?
Did the engineer from memec spotted the fact that your UCF file didn't
use 24mA LVTTL current drive strength or fast slew rate option? (I know
I initially missed the fast slew rate option.)
Does the engineer from memec understand IOB FF rules and pitfalls I
mentioned previously?
Perhaps you may want to ask memec for a refund assuming that engineer
didn't spot the slower 12mA current strength option you were using, or
the slow slew rate (default setting) option you were using
unintentionally.
Maybe memec should fire such an employee who cannot spot things that are
fairly basic.
If memec is going to get paid, they should be able to spot those two
things instantly by reading a UCF file.
I am not making fun of you in anyway since I had similar problems trying
to reduce Tco a few months ago, but I think an employee working for a
company that likely charges something for consultation should know
better.




Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)





> After doing various zenarios together with a design engineer
> at memec design service. We got the following. Using SP3 of 4.1
> we get 7.5ns at 12mA driver strength using the fast option.
> 
> The one CLB containing the logic to open the buffers during DSP
> read sequence is 'HandPlaced' through the UCF-File. Low Skew Net
> has been used, however this is in fact more contra-productive. Ok
> as the same says it has low skew, but more delay ... it would be
> somehow nice if the place and route tools from Xilix would allow
> several runs of only small part of the logic until those logic
> parts do meet the timings ...
> 
> What is theoretically the fastest possible output time for this device?
> Question goes to Xilinx people overthere ...
> 
> markus

Article: 38811
Subject: Re: Xilinx webpack
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 18:45:59 -0600
Links: << >> << T >> << A >>

Russell Shaw wrote:
> 
> Hi all,
> 
> Does the free xilinx webpack come with a technology viewer?
> (shows a schematic equivalent of your vhdl)
> 

        Doesn't LeonardoSpectrum-Altera have that feature killed?
Who supports such a feature for Altera-based free tools?
It will be nice if I can see the schematic equivalent or the LUT
connection of a design after synthesis in ISE WebPACK, but it doesn't
seem to have such a feature, so I always have to P&R and use
Floorplanner to look at the LUT mapping.



> What is used for the vhdl compiler? Is it VHDL-93?
> 

        I don't use VHDL, so I should not answer this question, but the
synthesis tool will be XST (Xilinx Synthesis Technology)
I find that XST uses far less numbers of LUT than Altera's in-house
synthesis tool for Quartus II 1.1 Web Edition for the same design (a
synthesizable Verilog-based PCI IP core I developed.).
LeonardoSpectrum-Altera seems to generate similar number of LUTs as XST,
but I don't really like it because it is fairly unstable.



> How optimal is the routing generation, is it recommended
> to 'adjust' the output manually? (for sram based fpgas)
> 

        I guess I am much more used to Xilinx tools/devices than Altera
devices, but can you clarify your question?
Regardless, comparing Altera's FLEX10KE to Xilinx Spartan-II, I can say
from my experience that FLEX10KE's output buffer for 5V PCI (actually
3.3V PCI I/O buffer without PCI clamping diode enabled to make the
inputs 5V tolerant.) is a lot faster than Spartan-II's PCI33_5 (5V PCI
I/O buffer), but FLEX10KE's input buffers are much slower than
Spartan-II's input buffers.
In Xilinx devices, use of IOB (IOE in Altera's lingo) output and
tri-state control I think is more important than in Altera devices
because they seem slower (especially PCI33_5).
For LVTTL signaling, current drive and slew rate can be adjusted, and
gives you far more options for adjustment than Altera.

 


> Does it come with any timing analysis tools?
> 

        Yes, it does.
I prefer Xilinx's timing error report than Altera's.



> Is there anything major omitted compared to a 'real' tool?
> 

        Like Falk said, FPGA Editor!
Also, COREGEN is omitted if I am correct.
ModelSim-XE Starter comes with 500 lines code limitation, but the
simulation only gets slower, and doesn't stop even after 500 lines of
code (I have once exceeded the limit by 20,000 lines, but the thing ran
okay. It took a lot of time to do simulation though.).
Because Altera doesn't offer a free version of ModelSim with their
tools, I mainly use Xilinx tools for development, and port it to
Altera's chips later.
I won't mind using Altera's tools more often if they offered a free
version of ModelSim like Xilinx did.



> What chip is smallish and sram based like an acex-1k30?

        Falk said Spartan-II 30K system gate part, but it will be better
to download ACEX1K and Spartan-II datasheet, and compare the number of
4-input LUTs.
In general, Xilinx seems to inflate "system" gate count more than
Altera.
I will guess than Spartan-II 50K system gate part is comparable in terms
of gate density, but the nice thing is that similar density Spartan-IIs
are sold at half or less of the price of ACEX1K.
For example, Spartan-II 150K system gate part 208 pin PQFP package speed
grade -5 is sold at about $23 for just one at various Xilinx
distributors like Insight Electronics or NuHorizons (Minimum order is
$50 though).
Configuration PROM should cost something like $23 (XC18V01) or $35
(XC18V02).
Regarding ACEX1K pricing, I once saw ACEX1K-100K speed grade -1 208 pin
PQFP package for $90 at Arrow.
The comparable density part Spartan-II 200K system gate part speed grade
-6 208 pin PQFP package shouldn't cost more than $35.




Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)

Article: 38812
Subject: Re: Pin assignment on ACEX1K
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 18:52:29 -0600
Links: << >> << T >> << A >>

I have to admit my question has nothing to do with ACEX 1K, but why not
use Spartan-II 200K part instead?
LUT density should be similar, but Spartan-IIs seem much cheaper than
ACEX 1K.
You can use ISE WebPACK 4.1 + ModelSim-XE Starter which are both free
for development.



Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)

Article: 38813
Subject: Re: Synthsis Tools for Xilinx
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 19:18:59 -0600
Links: << >> << T >> << A >>

I have to admit I have a very limited experience using synthesis tools
for Xilinx devices (I have used XST only that came with ISE WebPACK 3.3
and 4.1), but from my experience using a free version of
LeonardoSpectrum-Altera (I know no one is talking about Altera stuff
here. Ver. 2001-1a-28), LeonardoSpectrum-Altera tends to crash pretty
frequently for no apparent reason.
What I mean is when I am changing synthesis options, lets say when I go
from one synthesis category to another category (clicking another tab),
the tool crashes suddenly.
Crashing I am talking here is a general protection fault.
This thing doesn't always happen, but seems to happen occasionally.
Another crashing behavior I was able to reproduce 100% of the time is
when a "defparam" keyword is declared in Verilog to pass a parameter
down to an instantiated module before the instantiated module is
instantiated in the text ("defparam" comes first in the text before the
instantiated module is instantiated.).
When I do this, LeonardoSpectrum-Altera crashes 100% of the time in the
middle of the synthesis.
To workaround this bug, I had to move the "defparam" to the end of the
text file, and the tool no longer crashes during synthesis.
Altera's in-house synthesis tool for Quartus II (sorry for more Altera
stuff), also doesn't like declaring "defparam" before the related module
gets instantiated, but in Altera's in-house synthesis tool's case, the
synthesis tool just simply gives me an error message, and doesn't crash
miserably like LeonardoSpectrum-Altera.
ModelSim (Also a Mentor Grphics product.) or XST never crashed when I
declared defparam before a module gets instantiated, so I guess things
depend on the synthesis tool.
So, from the general instability I have seen with
LeonardoSpectrum-Altera, I will still expect the paid version that
supports Xilinx devices to be somewhat similar, so if I were in a
position purchasing a synthesis tool, I will never even consider paying
for LeonardoSpectrum.
        That being said, is Synplicity's Synplify that better than XST?
People who use Synplify always seem to mention that it can achieve
higher frequency than other synthesis tools, but how does it fare as far
as setup time of a non-registered signal path going into IOB FFs is
concerned?
Will it generate fewer levels of LUT than what XST can?




Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)




Shawn wrote:
> 
>     I have been using FPGA express with the Xilinx foundation software.  I
> found it to be a good tool.  Unfortunately, Xilinx will no longer be
> supplying FPGA express with its other software. Does anyone know if there
> the Xilinx synthesis tool (XST) is comparable in performance to FPGA
> Express?  Is there an article comparing the performance of the major
> synthesis tools for Xilinx (FPGA Express, Exemplar, Synplicity, XST, etc)?
> The majority of the people I have spoken with seem to prefer Synplicity, but
> the majority of them have very limited or no exposure to the other tools.
> 
> Thank you in advance,
> Shawn Hineline
> 
> --
> *******************************
> Shawn Hineline
> Optimum Engineering, Inc.

Article: 38814
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: John_H <johnhandwork@mail.com>
Date: Sat, 26 Jan 2002 01:36:11 GMT
Links: << >> << T >> << A >>

If you want to jump through some hoops *and* if you can guarantee that csn disables
before rdn (or vice-versa) I can give you latch tristates that should give you 6.3nS
and better at the 12mA drive in an xc2s100-5.  If you want to increase to 24mA drive
you should be able to get this to 10s of picoseconds over the 6ns target as long as
your data pins are near the control pins.

Interested?
Using Synplify?
Using Verilog?
Comfortable checking the timing as an exception (editing the pcf or running timing
analyzer with path tracing)?

If you want to push the bounds you have to push the code.

- John



John_H wrote:

> You're looking for input-logic-tristate_out times that are extremely low.
> Admirable.  Tough.  Can't get there with just a UCF constraint.
>
> I'm basing the info below on recent Spartan-II design efforts - different
> devices may have different characteristics:
>
> If there's no way your enables can work properly with registered control logic
> (Tristate register in each IOB) then the only way to get your times is to
> "trick" the IOB elements into giving you the routing without the intermediate
> logic.  If you use IOB registers for the data and need the fast clock-to-out
> times you may not have a choice.  If your outputs are combinatorial or if you
> can push the output registers out of the IOB you have a chance.
>
> By using the IOB tristate register as a latch (which means the output must be a
> latch or combinatorial) you can work with the control signals for the data, the
> enable, and/or the reset to "effectively" provide the logic you're looking for,
> replicated in every output as a separate tristate latch.  The result is that the
> signals don't have to get from the pads to the LUTs and back out to the pads but
> can go straight from pads to pads.  This is especially helpful if your control
> and data signals are on the same side of the device.
>
> Any other attempts at improvement that I can come up with involve pin changes.
>
> - John
>
> Markus Meng wrote:
>
> > Hi all,
> >
> > actually I need to 'fight' with the UCF File in order to decrease the delay
> > until the databus buffers do open from tristate to drive. I have the
> > following:
> >
> > When dsp_csn AND dsp_rdn are low, then open the buffers of the data bus.
> > In the UCF-File I did the following setting, which seems impossible to meet.
> > Each time I get 7.5..8.3ns until the buffers are open and driving.
> >
> > Can anybody give me an advice how I can improve it, without changing the
> > pinning or the layout...
> >
> > -- Snip ucf-file
> >
> > ############################################################################
> > ###
> > ## Trying to make to Main DSP output enable faster. 24.01.2002 -mm-
> > TIMEGRP DSPRdDataPath = PADS(dsp_data(*));
> > ############################################################################
> > ###
> >
> > TIMEGRP DSPRdCtrlPath = PADS(dsp_csn : dsp_rdn);
> >
> > ############################################################################
> > ###
> >
> > TIMESPEC TSP2P = FROM : DSPRdCtrlPath: TO : DSPRdDataPath : 6ns;
> > ############################################################################
> > ###
> >
> > ############################################################################
> > ###
> > ## Increase Driver Strength 25.01.2002 -mm-
> > NET "dsp_data(*)" IOSTANDARD = LVTTL;
> > NET "dsp_data(*)" DRIVE = 12;
> > -- Snip ucf-file
> >
> > markus
> >
> > --
> > ********************************************************************
> > ** Meng Engineering        Telefon    056 222 44 10               **
> > ** Markus Meng             Natel      079 230 93 86               **
> > ** Bruggerstr. 21          Telefax    056 222 44 10               **
> > ** CH-5400 Baden           Email      meng.engineering@bluewin.ch **
> > ********************************************************************
> > ** Theory may inform, but Practice convinces. -- George Bain      **

Article: 38815
Subject: Re: Coregen Half-Band FIR filter implemenation does not work
From: "Clark Pope" <cepope@mindspring.com>
Date: Fri, 25 Jan 2002 20:51:46 -0500
Links: << >> << T >> << A >>

I am having the same problem, except I am using a VirtexII part.

If I implement a single rate FIR it simulates and works great. As soon as I
implement a filter with 2 cycles/output it all goes to trash. (Still
simulates fine.)

I have sent my stuff into Xilinx but they claim(so far) that their are no
known defects with coregen halfband filters.

Let me know if you get resolution, I'll do the same.

-Clark

"Andy Peters" <andy@exponentmedia.nospam.com> wrote in message
news:3C50A16E.30008@exponentmedia.nospam.com...
> Yury wrote:
>
> > Implemented 1/2 Band 51-tap FIR using Coregen + Foundation.
> > Pre-synthesis simulation looks excellent, however when the filter is
> > loaded into Spartan-II the output looks like complete random junk. All
>
> > timing is met.
>
>
> You might want to look at the model and verify if it's correct.  Watch
> out for timing information in the model (e.g.,   a <= b after 1 ns; etc)
> -- you may find that your state machines are off by one state, and it's
> the model's fault.
>
> --a
>
>
>

Article: 38816
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: Ray Andraka <ray@andraka.com>
Date: Sat, 26 Jan 2002 02:44:50 GMT
Links: << >> << T >> << A >>



John_H wrote:

> If you want to jump through some hoops *and* if you can guarantee that csn disables
> before rdn (or vice-versa) I can give you latch tristates that should give you 6.3nS
> and better at the 12mA drive in an xc2s100-5.  If you want to increase to 24mA drive
> you should be able to get this to 10s of picoseconds over the 6ns target as long as
> your data pins are near the control pins.
>
> Interested?
> Using Synplify?
> Using Verilog?
> Comfortable checking the timing as an exception (editing the pcf or running timing
> analyzer with path tracing)?
>
> If you want to push the bounds you have to push the code.

and trust the timing files, and have no clock jitter.  10ps is no margin, and with the
typical jitter on the CLKDLL is probably actually short of the 6ns goal.

>
>
> - John
>
> John_H wrote:
>
> > You're looking for input-logic-tristate_out times that are extremely low.
> > Admirable.  Tough.  Can't get there with just a UCF constraint.
> >
> > I'm basing the info below on recent Spartan-II design efforts - different
> > devices may have different characteristics:
> >
> > If there's no way your enables can work properly with registered control logic
> > (Tristate register in each IOB) then the only way to get your times is to
> > "trick" the IOB elements into giving you the routing without the intermediate
> > logic.  If you use IOB registers for the data and need the fast clock-to-out
> > times you may not have a choice.  If your outputs are combinatorial or if you
> > can push the output registers out of the IOB you have a chance.
> >
> > By using the IOB tristate register as a latch (which means the output must be a
> > latch or combinatorial) you can work with the control signals for the data, the
> > enable, and/or the reset to "effectively" provide the logic you're looking for,
> > replicated in every output as a separate tristate latch.  The result is that the
> > signals don't have to get from the pads to the LUTs and back out to the pads but
> > can go straight from pads to pads.  This is especially helpful if your control
> > and data signals are on the same side of the device.
> >
> > Any other attempts at improvement that I can come up with involve pin changes.
> >
> > - John
> >
> > Markus Meng wrote:
> >
> > > Hi all,
> > >
> > > actually I need to 'fight' with the UCF File in order to decrease the delay
> > > until the databus buffers do open from tristate to drive. I have the
> > > following:
> > >
> > > When dsp_csn AND dsp_rdn are low, then open the buffers of the data bus.
> > > In the UCF-File I did the following setting, which seems impossible to meet.
> > > Each time I get 7.5..8.3ns until the buffers are open and driving.
> > >
> > > Can anybody give me an advice how I can improve it, without changing the
> > > pinning or the layout...
> > >
> > > -- Snip ucf-file
> > >
> > > ############################################################################
> > > ###
> > > ## Trying to make to Main DSP output enable faster. 24.01.2002 -mm-
> > > TIMEGRP DSPRdDataPath = PADS(dsp_data(*));
> > > ############################################################################
> > > ###
> > >
> > > TIMEGRP DSPRdCtrlPath = PADS(dsp_csn : dsp_rdn);
> > >
> > > ############################################################################
> > > ###
> > >
> > > TIMESPEC TSP2P = FROM : DSPRdCtrlPath: TO : DSPRdDataPath : 6ns;
> > > ############################################################################
> > > ###
> > >
> > > ############################################################################
> > > ###
> > > ## Increase Driver Strength 25.01.2002 -mm-
> > > NET "dsp_data(*)" IOSTANDARD = LVTTL;
> > > NET "dsp_data(*)" DRIVE = 12;
> > > -- Snip ucf-file
> > >
> > > markus
> > >
> > > --
> > > ********************************************************************
> > > ** Meng Engineering        Telefon    056 222 44 10               **
> > > ** Markus Meng             Natel      079 230 93 86               **
> > > ** Bruggerstr. 21          Telefax    056 222 44 10               **
> > > ** CH-5400 Baden           Email      meng.engineering@bluewin.ch **
> > > ********************************************************************
> > > ** Theory may inform, but Practice convinces. -- George Bain      **

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 38817
Subject: Xilinx PCI logicore: clarification on nature of COMPLETE
From: David Miller <spam@quartz.net.nz>
Date: Sat, 26 Jan 2002 15:50:56 +1300
Links: << >> << T >> << A >>

Hi,

I wanted to clarify whether my understanding of how the COMPLETE signal 
is interpreted in xilinx's PCI logicore is accurate, and that I won't 
come to grief.

As near as I can tell, when doing master writes in the PCI64 logicore:

Some time after REQUEST64 is asserted, M_DATA and M_SRC_EN go high.  In 
that cycle, ADIO and COMPLETE are sampled.  (Notwithstanding retries 
etc,) data continues to be written into the target until COMPLETE is 
asserted.

In other words, COMPLETE tags the last word the core will write into the 
target.

Is this right?  I've found the PCIcore user guide rather nebulous and 
doesn't define the behaviour of this signal in plain language.  It 
provides a relatively complex example of how to drive COMPLETE, but 
doesn't really explain what it is doing or, more to the point, why each 
bit of the example is necessary.






-- 
David Miller, BCMS (Hons)  | When something disturbs you, it isn't the
Endace Measurement Systems | thing that disturbs you; rather, it is
Mobile: +64-21-704-djm     | your judgement of it, and you have the
Fax:    +64-21-304-djm     | power to change that.  -- Marcus Aurelius

Article: 38818
Subject: Re: Coregen Half-Band FIR filter implemenation does not work
From: yuryws@optonline.com (Yury)
Date: 25 Jan 2002 18:52:47 -0800
Links: << >> << T >> << A >>

Andy Peters <andy@exponentmedia.nospam.com> wrote in message news:<3C50A16E.30008@exponentmedia.nospam.com>...
> Yury wrote:
> 
> > Implemented 1/2 Band 51-tap FIR using Coregen + Foundation.
> > Pre-synthesis simulation looks excellent, however when the filter is
> > loaded into Spartan-II the output looks like complete random junk. All
>  
> > timing is met.
> 
> 
> You might want to look at the model and verify if it's correct.  Watch 
> out for timing information in the model (e.g.,   a <= b after 1 ns; etc) 
> -- you may find that your state machines are off by one state, and it's 
> the model's fault.
> 
> --a

I was able to find the problem, which appears to be related to Xilinx
tools.
Here it is:
1) 3 sets of 14 outputs each when checked with a logic analyzer have
the correct data, but all th bits are inverted.

2) The design was simulated functionally before routing -- all is well
(the outputs are not inverted)

3) The design was simulated functionally after the routing (.ncd -->
.vhd)
-- all is well (the outputs are not inverted)

4) Same as 3) but with timing info (.sdf file) -- all is well (the
outputs are not inverted)

5) Manually verified (by using FPGA Editor) that at least 1 of the
outputs does not get inverted.

The only other step that was performed was the generation of .bit file
from .ncd file. I do not know of a way to go fro .bit file to .ncd
file (or .vhd) to verify the functionality after bit file generation.
I believe that this is where the outputs become inverted. The problem
was fixed by channeling all the desired outputs through the inverted
input to each IOB via FPGA Editor, and then regenerating the new bit
file.

Thanks for all help.

Yury

Article: 38819
Subject: Re: [Spartan-II] Fastest Possibel Output Enable Time for -5 Devices ??
From: John Handwork <johnhandwork@mail.com>
Date: Fri, 25 Jan 2002 19:06:46 -0800
Links: << >> << T >> << A >>

Clock jitter is a serious issue in most systems and should never be
ignored.  It looks like - in this case - the goal is the go from input
pads (csn, rdn) to asynchronous output data.  No DLL involved.  Not
here, at least.  Are you concerned about the DSP clocking?

I have this bad habit of trusting the timing files to a decent degree. 
At least lab-type conditions make the timing numbers more reliable than
full temp/voltage operation.  Ya think?

- John_H



Ray Andraka wrote:
> 
> John_H wrote:
> 
> > If you want to jump through some hoops *and* if you can guarantee that csn disables
> > before rdn (or vice-versa) I can give you latch tristates that should give you 6.3nS
> > and better at the 12mA drive in an xc2s100-5.  If you want to increase to 24mA drive
> > you should be able to get this to 10s of picoseconds over the 6ns target as long as
> > your data pins are near the control pins.
> >
> > Interested?
> > Using Synplify?
> > Using Verilog?
> > Comfortable checking the timing as an exception (editing the pcf or running timing
> > analyzer with path tracing)?
> >
> > If you want to push the bounds you have to push the code.
> 
> and trust the timing files, and have no clock jitter.  10ps is no margin, and with the
> typical jitter on the CLKDLL is probably actually short of the 6ns goal.
> 
> >
> >
> > - John
> >
> > John_H wrote:
> >
> > > You're looking for input-logic-tristate_out times that are extremely low.
> > > Admirable.  Tough.  Can't get there with just a UCF constraint.
> > >
> > > I'm basing the info below on recent Spartan-II design efforts - different
> > > devices may have different characteristics:
> > >
> > > If there's no way your enables can work properly with registered control logic
> > > (Tristate register in each IOB) then the only way to get your times is to
> > > "trick" the IOB elements into giving you the routing without the intermediate
> > > logic.  If you use IOB registers for the data and need the fast clock-to-out
> > > times you may not have a choice.  If your outputs are combinatorial or if you
> > > can push the output registers out of the IOB you have a chance.
> > >
> > > By using the IOB tristate register as a latch (which means the output must be a
> > > latch or combinatorial) you can work with the control signals for the data, the
> > > enable, and/or the reset to "effectively" provide the logic you're looking for,
> > > replicated in every output as a separate tristate latch.  The result is that the
> > > signals don't have to get from the pads to the LUTs and back out to the pads but
> > > can go straight from pads to pads.  This is especially helpful if your control
> > > and data signals are on the same side of the device.
> > >
> > > Any other attempts at improvement that I can come up with involve pin changes.
> > >
> > > - John
> > >
> > > Markus Meng wrote:
> > >
> > > > Hi all,
> > > >
> > > > actually I need to 'fight' with the UCF File in order to decrease the delay
> > > > until the databus buffers do open from tristate to drive. I have the
> > > > following:
> > > >
> > > > When dsp_csn AND dsp_rdn are low, then open the buffers of the data bus.
> > > > In the UCF-File I did the following setting, which seems impossible to meet.
> > > > Each time I get 7.5..8.3ns until the buffers are open and driving.
> > > >
> > > > Can anybody give me an advice how I can improve it, without changing the
> > > > pinning or the layout...
> > > >
> > > > -- Snip ucf-file
> > > >
> > > > ############################################################################
> > > > ###
> > > > ## Trying to make to Main DSP output enable faster. 24.01.2002 -mm-
> > > > TIMEGRP DSPRdDataPath = PADS(dsp_data(*));
> > > > ############################################################################
> > > > ###
> > > >
> > > > TIMEGRP DSPRdCtrlPath = PADS(dsp_csn : dsp_rdn);
> > > >
> > > > ############################################################################
> > > > ###
> > > >
> > > > TIMESPEC TSP2P = FROM : DSPRdCtrlPath: TO : DSPRdDataPath : 6ns;
> > > > ############################################################################
> > > > ###
> > > >
> > > > ############################################################################
> > > > ###
> > > > ## Increase Driver Strength 25.01.2002 -mm-
> > > > NET "dsp_data(*)" IOSTANDARD = LVTTL;
> > > > NET "dsp_data(*)" DRIVE = 12;
> > > > -- Snip ucf-file
> > > >
> > > > markus
> > > >
> > > > --
> > > > ********************************************************************
> > > > ** Meng Engineering        Telefon    056 222 44 10               **
> > > > ** Markus Meng             Natel      079 230 93 86               **
> > > > ** Bruggerstr. 21          Telefax    056 222 44 10               **
> > > > ** CH-5400 Baden           Email      meng.engineering@bluewin.ch **
> > > > ********************************************************************
> > > > ** Theory may inform, but Practice convinces. -- George Bain      **
> 
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
> 
>  "They that give up essential liberty to obtain a little
>   temporary safety deserve neither liberty nor safety."
>                                           -Benjamin Franklin, 1759

Article: 38820
Subject: Re: Xilinx PCI logicore: clarification on nature of COMPLETE
From: Kevin Brace <ihatespamkevinbraceusenet@ihatespamhotmail.com>
Date: Fri, 25 Jan 2002 21:32:25 -0600
Links: << >> << T >> << A >>

First of all, I have to admit that I have never designed anything with
Xilinx LogiCORE PCI.
However, from what I see in an initiator write waveforms in LogiCORE PCI
Design Guide Page 13-8 (Figure 13-3), Page 13-10 (Figure 13-5), and Page
14-19 (Figure 14-6), COMPLETE is asserted on the user side at the same
the user side loads the last data on ADIO.
Okay, those examples were for 32-bit PCI, but from looking at 64-bit PCI
waveforms on Page 15-4 (Figure 15-2), things don't look any different.
Looking at the waveforms, COMPLETE seems to be asserted until M_DATA is
deasserted, like the way you worded.
If the initiator is doing only a single cycle transfer, COMPLETE seems
like it has to be asserted on the next cycle REQUEST is asserted (Figure
12-4, 12-5, 12-6, 12-7, and 12-8), and has to be kept asserted until
M_DATA is deasserted.
So, from what I see, it looks like COMPLETE assertion timing will be
different depending on whether or not the transfer is a single or a
burst.
In practice, initiator transfer can be interrupted by a target
disconnect or a target abort, and when that happens, COMPLETE seems to
get ignored (FRAME# will be deasserted if already asserted because STOP#
was deasserted.).

Kevin Brace (Don't respond to me directly, respond within the
newsgroup.)

David Miller wrote:
> 
> Hi,
> 
> I wanted to clarify whether my understanding of how the COMPLETE signal
> is interpreted in xilinx's PCI logicore is accurate, and that I won't
> come to grief.
> 
> As near as I can tell, when doing master writes in the PCI64 logicore:
> 
> Some time after REQUEST64 is asserted, M_DATA and M_SRC_EN go high.  In
> that cycle, ADIO and COMPLETE are sampled.  (Notwithstanding retries
> etc,) data continues to be written into the target until COMPLETE is
> asserted.
> 
> In other words, COMPLETE tags the last word the core will write into the
> target.
> 
> Is this right?  I've found the PCIcore user guide rather nebulous and
> doesn't define the behaviour of this signal in plain language.  It
> provides a relatively complex example of how to drive COMPLETE, but
> doesn't really explain what it is doing or, more to the point, why each
> bit of the example is necessary.
> 
> --
> David Miller, BCMS (Hons)  | When something disturbs you, it isn't the
> Endace Measurement Systems | thing that disturbs you; rather, it is
> Mobile: +64-21-704-djm     | your judgement of it, and you have the
> Fax:    +64-21-304-djm     | power to change that.  -- Marcus Aurelius

Article: 38821
Subject: Re: Xilinx webpack
From: Russell Shaw <rjshaw@iprimus.com.au>
Date: Sat, 26 Jan 2002 04:51:04 GMT
Links: << >> << T >> << A >>



Kevin Brace wrote:
> 
> Russell Shaw wrote:
> >
> > Hi all,
> >
> > Does the free xilinx webpack come with a technology viewer?
> > (shows a schematic equivalent of your vhdl)
> >
> 
>         Doesn't LeonardoSpectrum-Altera have that feature killed?

Yes. A licence costs quite a bit, and i've got a problem
i can't narrow down or fix over the net. I think its logic
related rather than just timing, because it's running at
low clock rate.

> Who supports such a feature for Altera-based free tools?
> It will be nice if I can see the schematic equivalent or the LUT
> connection of a design after synthesis in ISE WebPACK, but it doesn't
> seem to have such a feature, so I always have to P&R and use
> Floorplanner to look at the LUT mapping.
> 
> > What is used for the vhdl compiler? Is it VHDL-93?
> >
> 
>         I don't use VHDL, so I should not answer this question, but the
> synthesis tool will be XST (Xilinx Synthesis Technology)
> I find that XST uses far less numbers of LUT than Altera's in-house
> synthesis tool for Quartus II 1.1 Web Edition for the same design (a
> synthesizable Verilog-based PCI IP core I developed.).
> LeonardoSpectrum-Altera seems to generate similar number of LUTs as XST,
> but I don't really like it because it is fairly unstable.
> 
> > How optimal is the routing generation, is it recommended
> > to 'adjust' the output manually? (for sram based fpgas)
> >
> 
>         I guess I am much more used to Xilinx tools/devices than Altera
> devices, but can you clarify your question?

The devices i use don't have much in the way of segmented
row and column lines, so it seems that the altera device
can be routed without user intervention. I've read that
xilinx devices have a lot more segments that are hard to
route well in an automated way, but maybe that's only for
larger devices.

> Regardless, comparing Altera's FLEX10KE to Xilinx Spartan-II, I can say
> from my experience that FLEX10KE's output buffer for 5V PCI (actually
> 3.3V PCI I/O buffer without PCI clamping diode enabled to make the
> inputs 5V tolerant.) is a lot faster than Spartan-II's PCI33_5 (5V PCI
> I/O buffer), but FLEX10KE's input buffers are much slower than
> Spartan-II's input buffers.
> In Xilinx devices, use of IOB (IOE in Altera's lingo) output and
> tri-state control I think is more important than in Altera devices
> because they seem slower (especially PCI33_5).
> For LVTTL signaling, current drive and slew rate can be adjusted, and
> gives you far more options for adjustment than Altera.
> 
> 
> 
> > Does it come with any timing analysis tools?
> >
> 
>         Yes, it does.
> I prefer Xilinx's timing error report than Altera's.
> 
> > Is there anything major omitted compared to a 'real' tool?
> >
> 
>         Like Falk said, FPGA Editor!
> Also, COREGEN is omitted if I am correct.
> ModelSim-XE Starter comes with 500 lines code limitation, but the
> simulation only gets slower, and doesn't stop even after 500 lines of
> code (I have once exceeded the limit by 20,000 lines, but the thing ran
> okay. It took a lot of time to do simulation though.).
> Because Altera doesn't offer a free version of ModelSim with their
> tools, I mainly use Xilinx tools for development, and port it to
> Altera's chips later.
> I won't mind using Altera's tools more often if they offered a free
> version of ModelSim like Xilinx did.
> 
> > What chip is smallish and sram based like an acex-1k30?
> 
>         Falk said Spartan-II 30K system gate part, but it will be better
> to download ACEX1K and Spartan-II datasheet, and compare the number of
> 4-input LUTs.
> In general, Xilinx seems to inflate "system" gate count more than
> Altera.
> I will guess than Spartan-II 50K system gate part is comparable in terms
> of gate density, but the nice thing is that similar density Spartan-IIs
> are sold at half or less of the price of ACEX1K.
> For example, Spartan-II 150K system gate part 208 pin PQFP package speed
> grade -5 is sold at about $23 for just one at various Xilinx
> distributors like Insight Electronics or NuHorizons (Minimum order is
> $50 though).
> Configuration PROM should cost something like $23 (XC18V01) or $35
> (XC18V02).
> Regarding ACEX1K pricing, I once saw ACEX1K-100K speed grade -1 208 pin
> PQFP package for $90 at Arrow.
> The comparable density part Spartan-II 200K system gate part speed grade
> -6 208 pin PQFP package shouldn't cost more than $35.
> 
> Kevin Brace (Don't respond to me directly, respond within the
> newsgroup.)

thanks:)

Article: 38822
Subject: Re: Xilinx webpack
From: Russell Shaw <rjshaw@iprimus.com.au>
Date: Sat, 26 Jan 2002 04:52:58 GMT
Links: << >> << T >> << A >>



Falk Brunner wrote:
> 
> "Russell Shaw" <rjshaw@iprimus.com.au> schrieb im Newsbeitrag
> news:3C514E29.481AF4D8@iprimus.com.au...
> 
> > Does the free xilinx webpack come with a technology viewer?
> > (shows a schematic equivalent of your vhdl)
> 
> AFAIK, no (I havnt found it)
> 
> > What is used for the vhdl compiler? Is it VHDL-93?
> 
> XST, which is AFAIK VHDL 93 compliant.
> 
> > How optimal is the routing generation, is it recommended
> > to 'adjust' the output manually? (for sram based fpgas)
> 
> This is not a matter of synthesizer, its the Place & Route. These toola are
> identical to the professional Software (Foundation/Alliance)
> 
> > Does it come with any timing analysis tools?
> 
> Yep.
> 
> > Is there anything major omitted compared to a 'real' tool?
> 
> Hmm, some people miss the FPGA editor. But you still have floorplanner.

What does the fpga editor show, and what do you need it for?

> > What chip is smallish and sram based like an acex-1k30?
> 
> Spartan-II 30 XC2S30
> 
> --
> MfG
> Falk

Article: 38823
Subject: Mapping between Xlinx 4K and Spartan-II
From: "Chatpapon Prasartsee" <g42cpp@ku.ac.th>
Date: Sat, 26 Jan 2002 11:57:42 +0700
Links: << >> << T >> << A >>

I used Xilinx XC4010E in the last project and now I am going to use
Spartan-II FPGA. I don't understand something about the number of CLB and
gates in the chip and how to compare between two chip families.
-----------------------------
XC4010E
Logic cell = 950
Gate range= 7,000 - 20,000
Total CLB = 400
-----------------------------
Spartan-II XC2S30
Logic cell = 972
System Gates = 30,000
Total CLBs = 216
-----------------------------
Obviously, the XC2S30 has more gates, but its CLBs is less than the XC4010E.
My question is:
I think that the CLB numbers are different because of the difference in the
architecture of these two chip families. Am I right?
Which one is bigger? What is the criteria to use when choosing Xilinx chips:
CLBs or number of gates?
Which FPGA in XC4000 family is equivalent to XCS30?

Best Regards,
Chatpapon

Article: 38824
Subject: Re: Synthsis Tools for Xilinx
From: Russell Shaw <rjshaw@iprimus.com.au>
Date: Sat, 26 Jan 2002 04:59:45 GMT
Links: << >> << T >> << A >>



Kevin Brace wrote:
> 
> I have to admit I have a very limited experience using synthesis tools
> for Xilinx devices (I have used XST only that came with ISE WebPACK 3.3
> and 4.1), but from my experience using a free version of
> LeonardoSpectrum-Altera (I know no one is talking about Altera stuff
> here. Ver. 2001-1a-28), LeonardoSpectrum-Altera tends to crash pretty
> frequently for no apparent reason.
> What I mean is when I am changing synthesis options, lets say when I go
> from one synthesis category to another category (clicking another tab),
> the tool crashes suddenly.
> Crashing I am talking here is a general protection fault...

It can be windows related. I've had problems with win95 which
went away with win2k. Are you using the latest maxplus2/leonardo?

Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources

Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Custom Search