Steven, Good post. Thanks for the Triscend info. I thought I'd just point out a few omissions about the XA10 device since I have a bit of knowledge about the part... First of all, the device is currently shipping to customers. Demand is pretty high right now, so they're a bit hard to come by, but they are shipping today. Second, the XA10 is the largest device in the Excalibur ARM family. Compared to the Triscend devices, the Excalibur XA10 is much more "logic-heavy", having around 38,000 FFs/LUTs compared to less than 4,000 in the largest Triscend ARM device (I'm assuming that a CSL is something like a FF/LUT). There are two other Excalibur ARM devices coming soon, one with about 16,000 FFs/LUTs (XA4) and another with somewhere around 4,000 (XA1). These will probably be much closer in price to the Triscend parts than the XA10 is. The last thing to point out is that the Excalibur ARM devices are of the ARM 9 family (rather than the ARM 7) and that system frequencies are significantly higher than in the Triscend parts. -Pete- Steven K. Knapp <sknapp@triscend.com> wrote in message news:d2f86928.0111021647.56104ffb@posting.google.com... > There is an article from EE Times that provides quite a bit of data on > both parts that you mentioned. > http://www.eet.com/story/industry/semiconductor_news/OEG20011031S0025 > http://www.eetimes.com/story/OEG20011016S0097 > > Altera has demonstrated samples of the XA10 at trade shows and there > are parts listed at distribution, starting at over US$2,000 each > (single quantity). Looks like Altera's MIPS project is on the back > burner, though. All information on the MIPS-based product was removed > from their web site. > > I haven't seen a physical Virtex 2 Pro device yet but I know of at > least one company that is an "alpha" site. There doesn't appear to be > any datasheet on the Xilinx web site and I don't see any parts listed > at distribution. > > Both companies presented their solutions at the Microprocessor Forum > in October. These are exciting parts, assuming that you can afford > them. > > For comparison, there are other companies already shipping > similar, more cost-effective devices in production. Triscend, for > example, has been shipping 32-bit ARM-based devices with embedded > programmable logic for over a year now. The Triscend A7 family is in > production, available through distribution, and is supported by > compilers, synthesizers, simulators, development boards, etc. The > largest family member starts at about US$40, single piece. The A7 > configurable system includes cache, DMAs, UARTs, on-chip SRAM, SDRAM > controller, and lots of I/O pins. > http://www.triscend.com/products/indexa7.html > > For smaller systems, our 8-bit accelerated 8051-based E5 family has > parts starting as low as US$9 in single-piece quantities, or below > US$4 in high volumes. Like the A7 family, the E5 family is well > supported by development tools and boards. The E5 family has been in > production since 1999. > http://www.triscend.com/products/IndexE5.html > > Both product families are supported by the Triscend FastChip > development system and the FastChip drag-and-drop IP library, complete > with standard peripheral functions. > http://www.triscend.com/products/IndexFCintro.html > http://www.triscend.com/products/indexfc_ip.html > > > "Jae-cheol Lee" <jchlee@lge.com> wrote in message news:<dvHD7.1657$cI6.685978@news.bora.net>... > > There were some news about Virtex II with PowerPC cores. > > Altera announced Excalibur series with ARM or MIPS cores. 
> > > > What is the schedule of production of the first one? > > > > Is there any good story on the use of the second one? > > > > Please let me know...Article: 36251
I have never used Lattice's devices, so I don't know much about them. However, regarding PCI, I don't know how things will go with a CPLD, but at least with an FPGA, from my experience trying to get my own PCI IP core to meet timing with a Xilinx Spartan-II 150K system gate part, it is very hard for a synthesizable (HDL-based) PCI IP core to meet the timing of even 33MHz PCI (Tsu = 7ns and Tval (Tco) = 11ns) with only automatic Place & Route (in other words, without using a floorplanner tool). I know that Xilinx and Altera both have PCI IP cores that support 66MHz and 64-bit PCI, but it looks like Xilinx supports many more devices than Altera (Altera seems to provide constraint files for only a few devices, whereas Xilinx provides them for many more). Some people may be concerned that an SRAM-based FPGA may not be configured by the time the BIOS starts executing POST (Power On Self Test) code, but I am told that an FPGA that can handle PCI can program itself within 100ms, before RST# (PCI's reset signal) gets asserted to reset all PCI devices. POST code will get executed after that. I am not sure if your application really requires more than 133MB/s of bandwidth, but if it doesn't (in other words, 32-bit 33MHz PCI is adequate), you can prototype your design with the Insight Electronics Spartan-II PCI development board, which costs only $145 for the board itself (the same one I use). Although the Spartan-II part on the Insight Electronics Spartan-II PCI development board is called 150K system gates, the realistically usable gate count will be much lower. One problem with 64-bit 66MHz PCI (3V) is that very few systems support it compared to ubiquitous 5V 32-bit 33MHz PCI, but if you don't care about running your PCI card on regular desktop computers, that shouldn't be an issue. Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup) arast@inficom.com (Alex Rast) wrote in message news:<ANEE7.153$FU5.365042@news.uswest.net>... > Lattice has a core on their site for 32-bit PCI, but I'm wondering if there is > available from Lattice or third parties a 64-bit core. It would be ideal if it > can run at 66MHz to boot. I'm looking to target an ispLSI8600VE. Such a core > would be helpful because it would let me save an additional chip on our > circuit board. Right now we're using a different (much smaller) CPLD on our > board for other, non-PCI functions. I'm getting ready to design the second > revision. I'm leaning towards the Lattice chip because I've been unsatisfied > with the s/w tools for the chip I have now, and because I really could use the > internal tristate busses on the 8000 series. The 8600VE is way, way overkill > in terms of macrocell density as a direct replacement for our current CPLD, > but if I could integrate the PCI core onto the chip as well then I think it's > justifiable. > > BTW, anybody out there have any experience with Lattice's tools? What are your > thoughts on them? > > Alex Rast > arast@qwest.net > arast@inficom.comArticle: 36252
It looked to me that the Spartan CLB does not include any dedicated carry logic. Each LUT has one output, so at least two LUT's would be required per bit... one to generate the cout bit, and one to generate the pc bit. Have you tried looking at the design after P&R with the FPGA editor? This may shed more light on the situation. Newman "Tim Boescke" <t.boescke@tu-harburg.de> wrote in message news:<9rvn5u$10hndj$1@ID-107613.news.dfncis.de>... > I am currently trying to synthesize a loadable > accumulator with synopsis. The target architecture > is a spartan. (not 2) > > In my opinion the code below should fit into one 4 LUT > per bit. (inputs to each 4 LUT: pc, cin, load, inp) > However, after synthesis the design requires no less > than 16 4-luts. > > Did I miss something ? Is there any way to infer a > combined add/load structure ? I already tried > lots of combinations without success and unfortunately > it seems that the xilinx libs dont allow direct > access to the LUTs and the carry logic for spartan.. > (They do for spartan 2) > > ------------------------------------------------------ > > architecture synth of counter is > signal pc: std_logic_vector(7 downto 0); > begin > process(clk) > begin > if (res ='1') then > pc <= "00000000"; > elsif rising_edge(clk) then > if (load = '1') then > pc <= inp; > else > pc <= pc + inp; > end if; > end if; > end process; > > outp <= pc; > end synth;Article: 36253
Spartan is based on XC4000, and SpartanXL is based on XC4000XL, and all of them have the same carry structure. Take a look at the XC4000 and XC4000XL documentation, it may be clearer. But it describes the identical architecture. Peter Alfke ==================================== newman wrote: > It looked to me that the Spartan CLB does not include any > dedicated carry logic. Each LUT has one output, so > at least two LUT's would be required per bit... one to > generate the cout bit, and one to generate the pc bit. > > Have you tried looking at the design after P&R with the > FPGA editor? This may shed more light on the situation. > > Newman > > "Tim Boescke" <t.boescke@tu-harburg.de> wrote in message news:<9rvn5u$10hndj$1@ID-107613.news.dfncis.de>... > > I am currently trying to synthesize a loadable > > accumulator with synopsis. The target architecture > > is a spartan. (not 2) > > > > In my opinion the code below should fit into one 4 LUT > > per bit. (inputs to each 4 LUT: pc, cin, load, inp) > > However, after synthesis the design requires no less > > than 16 4-luts. > > > > Did I miss something ? Is there any way to infer a > > combined add/load structure ? I already tried > > lots of combinations without success and unfortunately > > it seems that the xilinx libs dont allow direct > > access to the LUTs and the carry logic for spartan.. > > (They do for spartan 2) > > > > ------------------------------------------------------ > > > > architecture synth of counter is > > signal pc: std_logic_vector(7 downto 0); > > begin > > process(clk) > > begin > > if (res ='1') then > > pc <= "00000000"; > > elsif rising_edge(clk) then > > if (load = '1') then > > pc <= inp; > > else > > pc <= pc + inp; > > end if; > > end if; > > end process; > > > > outp <= pc; > > end synth;Article: 36254
Hi... Well I was very much asking from the viewpoint of the FLEX10K and APEX20K architectures. You see, in 10K they had one output of the LE going to the feedback matrix of the LLI and one output to the RFTs and CFTs. So that meant only either the registered or the unregistered output going to the LLI feedback matrix and also to the RFTs and CFTs. But that changed with the APEX20K devices, where both outputs could go to all routing resources. Now is that because they wanted both registered and unregistered outputs at the same time...? What was the need that made them make this change in the architecture...? I presumed there might be circuits where one might need that. Well, some more discussion on this will help a lot. Nitin. Ray Andraka <ray@andraka.com> wrote in message news:<3BE298CA.B67829D6@andraka.com>... > Depends entirely on your design style. For highest performance, you'll > generally want to avoid using the unregistered output, but that also > limits your design options. > > nitin wrote: > > > Hi... > > > > Can anyone tell me how frequently and where both registered and > > unregistered outputs from an LE are required...? > > > > Ciao, > > nitin. > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759Article: 36255
Hi... Well actually I was also pondering over the Mercury architecture, and questions do come to mind. Well, I still can't understand why not have 5 top LEs drive to the right and 5 bottom LEs to the left, or something like that. And this odd and even LEs in a LAB driving different resources carries on even to the RAPID LAB INTERCONNECT, and also the LEAP LINES... why...? Does this give some bigger region of a crudely defined cluster with faster routing within it? If yes, then what motivates that kind of decision, and what sort of shape & size does that crudely defined cluster have...? Well, I also spotted an interesting thing in the Mercury architecture... Up till now, horizontal resources used to drive vertical ones and vertical ones used to drive horizontal ones. But in Mercury we see Leap lines driving columns, columns driving columns, and priority columns driving columns as well as priority columns. Well, first of all the question that comes to mind is: when a column drives a column, is it driving a segment within itself...? I certainly hope not, because that does not make too much sense to me. But if it drives adjacent columns, then which ones? To the right or to the left or both? And what motivates such a decision? Well, I hope someone has the answers... Ciao, Nitin. Ray Andraka <ray@andraka.com> wrote in message news:<3BE30B9B.FD358F22@andraka.com>... > Right, but the real reason is to provide fast connections for logic using the > carry/cascade chains. Those chains run across the LE's in a LAB, which prevents > the LE's from connecting to another LE in the same LAB. The inter-lab connects > give you a way to connect arithmetic logic with a reasonable delay. > > Steve Fair wrote: > > > Digari - > > > > It's all about speed . . . > > > > The interleaving of the labs gives the router more flexibility. Each lab > > has its own local routes, which is the fastest non-dedicated route (as > > opposed to carry or cascade chains). If all the logic between two flops can > > fit into a lab, you will achieve the best performance possible. If the > > logic can't fit into the lab, you go to a megalab route in the apex II > > architecture, which adds delay AND uses another routing resource. By > > interleaving the labs, an LE can be connected to many more LE's for making > > those fast, local connections. The area expense isn't that great (a single > > line and mux to the next lab's local interconnect), so it's a very efficient > > way to increase routing and performance. Put another way, a lab goes from > > having 9 possible local connections to 19 with very little overhead. With > > the further interleaving available (remember the left & right drives), you > > can do some pretty deep equations with very small routing delays. > > > > Hope that helps. > > > > Steve > > > > "digari" <digari@dacafe.com> wrote in message > > news:e0855517.0111010034.375d9328@posting.google.com... > > > Why there is interleaved routing from an LE to adjecent LLIs in > > > Mercury and ApexII architectures. > > > > > > "APEX II devices use an interleaved LAB structure, so that each LAB > > > can > > > drive two local interconnect areas. Every other LE drives to either > > > the left > > > or right local interconnect area, alternating by LE." > > > > > > Can anyone shed some light on the alternate routing structure of LEs > > > within a LAB. > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. 
> 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759Article: 36256
What worked for me: I explicitly inserted BUFGs on all signals that I wanted a BUFG on, and then set in the Synplicity constraint file (project.SDC) the attribute xc_global_buffers to the number of BUFGs in the design. Synplicity generated warnings that some signals appeared to be clocks, but didn't add any unwanted BUFGs. Note that the xc_global_buffers attribute can be specified only in the SDC file. As far as I could see, Synplicity simply ignored syn_noclockbuf in the HDL code, since it knows better than I do what was and what wasn't a clock... "Jason T. Wright" <Jason.T.Wright@Boeing.com> wrote in message news:<3BE183E6.204EA0C9@Boeing.com>... > How does one prohibit Synplify from inferring a global buffer? I read > the on-line help, and it mentions using an attribute in the VHDL (or > verilog) code to FORCE a global buffer, but negating that attribute did > not stop its insertion. LeonardoSpectrum and FPGAExpress each has a > command to stop such an undesired action (or, correspondingly, to force > such an insertion.) Left to their own devices, the tools can create > wonderfully efficient, or wonderfully bloated, results from a user's > code. I've seen a little bit of each.Article: 36257
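For readers looking for the mechanics of the workaround described above, here is a minimal sketch of what explicit BUFG instantiation looks like in VHDL for a Xilinx target. The entity, port and signal names are invented for illustration only, and the exact spelling of the xc_global_buffers constraint (set in the .sdc as described in the post) should be checked against your Synplify version's documentation.

------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity bufg_example is
  port (
    clk_pad : in  std_logic;  -- the one real clock
    slow_en : in  std_logic;  -- high-fanout enable a tool might mistake for a clock
    d       : in  std_logic;
    q       : out std_logic
  );
end bufg_example;

architecture rtl of bufg_example is
  -- Xilinx primitive, declared here as a component
  component BUFG
    port (I : in std_logic; O : out std_logic);
  end component;
  signal clk_g : std_logic;
begin
  -- Explicitly place the one BUFG we actually want, on the real clock.
  u_bufg : BUFG port map (I => clk_pad, O => clk_g);

  -- slow_en is used as ordinary logic; with xc_global_buffers set to the
  -- number of BUFGs instantiated here (1), no extra global buffer should
  -- be inferred for it.
  process (clk_g)
  begin
    if rising_edge(clk_g) then
      if slow_en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end rtl;

------------------------------------------------------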
Why on earth does the Altera web site not allow downloads to be resumed? Every other web site seems to allow this. I tried emailing their webmaster about it, but didn't get a reply. Leon -- Leon Heller, G1HSM leon_heller@hotmail.con http://www.geocities.com/leon_heller Low-cost Altera Flex design kit: http://www.leonheller.comArticle: 36258
I am guessing that you are trying to download their free tools. I tried to do the same thing, but because they put their files on an FTP server, various download utilities like Download Accelerator cannot resume the download. I guess Altera thinks everyone has access to T1 lines. I gave up and ordered their latest Altera Digital Library, and the latest version (October 2001) had Quartus II 1.1 Web Edition, but it looks like they got rid of MAX+PLUS II-BASELINE from the CD-ROM. I don't like Quartus II 1.1 Web Edition's TalkBack feature (of course, I disabled it), and for some reason MegaWizard Plug-In Manager doesn't seem to work properly (Java-based, and somehow can't find the right file). Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup) "Leon Heller" <leon_heller@hotmail.com> wrote in message news:<9s1g4d$2lg$1@plutonium.btinternet.com>... > Why on earth does the Altera web site not allow downloads to be resumed? > Every other web site seems to allow this. I tried emailing their webmaster > about it, but didn't get a reply. > > LeonArticle: 36259
Alex Rast wrote: > Lattice has a core on their site for 32-bit PCI, but I'm wondering if there is > available from Lattice or third parties a 64-bit core. It would be ideal if it > can run at 66MHz to boot. The PCI core from www.opencores.org should run at 66 Mhz. They've been working on a 64-bit version, but I don't know, if it's done yet.Article: 36260
Hi, I am looking for stand-alone FPGA development boards. Must have: 1. ~ 800 K (more or less) gates (prefer Virtex or Virtex-E, although Altera will do :)) 2. ~ 32MBytes or more SDRAM (or DDR) (prefer a 64-bit DIMM) 3. SSRAM (any type), independent of the SDRAM. 4. some Flash 5. digitized video input (expansion card acceptable) 6. RAMDAC (reasonable quality, 800x600x24 will do) for VGA output Wish list: 1. RS232 2. Ethernet 3. Parallel port 4. USB 5. digitized Audio in/out. I have found one which is close but lacks SDRAM !! http://www.xess.com/ XSV-800 boards Any pointers will be much appreciated. btw. I am a poor student, so I can't afford huge sums of money :)Article: 36261
Hello, We have a problem configuring an Altera APEX EP20K100 device on our prototype video codec card. The first option for our card is when the DAC chip on it supplies two clock signals to the APEX (through I/O pins). The second option is when the DAC chip accepts two clock signals from the APEX. The rest of the external APEX-related signals are tri-stated or left unconnected (mezzanine ADC card). We are using a JTAG downloader (Quartus II Programmer Tool). The second option works well. The APEX begins its operation after successful configuration via JTAG. But the first option does not work at all. We cannot get the APEX configured, and Quartus reports Configuration Failed at the 98% progress bar indicator. In this particular project we can enforce the second option (totally tri-stated environment) at start-up so as to get the APEX configured, but our next large project (now in progress) is built around several multi-FPGA cards (APEX, mostly), and we cannot impose this restriction there. Thank you. Igor Kauranen P.S. I would like to say thank you very much to Peter Ormsby and to Carl Schlehaus for their help with Nios.Article: 36262
Patently untrue. The Spartan architecture is basically the xilinx 4000E architecture. It has a dedicated carry chain _in_front_ of the LUTs. To instantiate it, you use the CY4 primitive plus one of the CY4 mode select primitives, which is connected to the CY4 with an 8 bit bus, selecting the right CY4 mode set for the function you wish to instantiate. You'll have to refer to the carry section of the libraries guide for the details. That said, synthesis should instantiate the carry chain. IIRC (It has been a little while since I last used the 4K architecture), synplicity doesn't infer the carry chain for 4K devices when there are less than about 6-8 bits in the arithmetic/count function. You may have to massage your description to make it into something the synthesizer can successfully infer. Try putting the mux and add outside the process as a concurrent statement. newman wrote: > It looked to me that the Spartan CLB does not include any > dedicated carry logic. Each LUT has one output, so > at least two LUT's would be required per bit... one to > generate the cout bit, and one to generate the pc bit. > > Have you tried looking at the design after P&R with the > FPGA editor? This may shed more light on the situation. > > Newman > > "Tim Boescke" <t.boescke@tu-harburg.de> wrote in message news:<9rvn5u$10hndj$1@ID-107613.news.dfncis.de>... > > I am currently trying to synthesize a loadable > > accumulator with synopsis. The target architecture > > is a spartan. (not 2) > > > > In my opinion the code below should fit into one 4 LUT > > per bit. (inputs to each 4 LUT: pc, cin, load, inp) > > However, after synthesis the design requires no less > > than 16 4-luts. > > > > Did I miss something ? Is there any way to infer a > > combined add/load structure ? I already tried > > lots of combinations without success and unfortunately > > it seems that the xilinx libs dont allow direct > > access to the LUTs and the carry logic for spartan.. > > (They do for spartan 2) > > > > ------------------------------------------------------ > > > > architecture synth of counter is > > signal pc: std_logic_vector(7 downto 0); > > begin > > process(clk) > > begin > > if (res ='1') then > > pc <= "00000000"; > > elsif rising_edge(clk) then > > if (load = '1') then > > pc <= inp; > > else > > pc <= pc + inp; > > end if; > > end if; > > end process; > > > > outp <= pc; > > end synth; -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 36263
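To make Ray's last suggestion concrete, here is a minimal sketch of the accumulator with the load mux and the adder pulled out of the clocked process as a concurrent statement. It reuses Tim's entity, ports and std_logic_unsigned use clause, adds res to the sensitivity list, and is only an illustration; whether a given Synplicity version then infers the Spartan carry chain from it still has to be checked in the placed design.

------------------------------------------------------

-- Sketch only: same behaviour as Tim's counter, restructured so the
-- adder plus load mux are one combinational block feeding a plain register.
architecture synth2 of counter is
  signal pc      : std_logic_vector(7 downto 0);
  signal next_pc : std_logic_vector(7 downto 0);
begin
  -- add/load mux outside the process, as a concurrent statement
  next_pc <= inp when load = '1' else pc + inp;

  process (clk, res)
  begin
    if res = '1' then
      pc <= "00000000";
    elsif rising_edge(clk) then
      pc <= next_pc;
    end if;
  end process;

  outp <= pc;
end synth2;

------------------------------------------------------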
We produce DSP modules which support small daughtercards for various types of IO. Each of these daughtercards requires a different interface to the DSP module. The obvious way to handle this is to use FPGAs which can be downloaded with the design unique to each daughtercard installed. On our first design we used one FPGA as a main board controller and used a separate FPGA for each daughtercard interface. This allowed us to load the FPGAs at boot-up time after determining the identity of the daughtercards installed. The problem with this approach is that we are increasing the number of daughtercards supported from 2 to 4 and we are using a very small form factor main board (PC/104). So we are very cramped for space for two more FPGAs. This becomes a bigger problem as we move to newer FPGA families which do not support the small TQ100 package. Of course there are the uBGA packages, but they make board routing much harder with so many in such a small space. We can get a significant cost and size savings using a single, larger FPGA. But this complicates the download process since the daughtercard interfaces are then all in a single download. If we want to handle all possible configurations, we will have to provide literally thousands of possible bitstreams. Can we achieve the same real-time configuration as in the multi-FPGA approach, or at least allow a user a way to build the appropriate download for the main module FPGA using JBITS? We don't want to require a user to download FPGA tools and perform a place and route just to reconfigure their boards. Can we do that with JBITS, use separate modules for the interfaces and combine them without using the place and route tools? -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 36264
Tim wrote: > > "Peter Alfke" <palfke@earthlink.net> wrote > > Irwin Kennedy wrote: > > > * Use less "leaky" transistors! (?) > > > > Until recently, there was extremely little junction leakage or > > sub-threshold leakage current. XC3000L could run on 50 microamps. > > Unfortunately, as we approach 100 nm technology, subthreshold leakage > > current becomes significant ( for every IC manufacturer. Ask Intel! ) > > I should not have to ask, but what is sub-threshold leakage? A CMOS transistor that is turned off still conducts current. This is only a picoampere or so per transistor, but with 100 million transistors in a large FPGA this can become a concern. The leakage current depends heavily on the threshold voltage at which the transistor is turned off. The speed of the transistor - the amount of current it conducts if turned on - also depends on the threshold voltage. This means that there is a tradeoff between speed and leakage current. The threshold voltage can be controlled by the voltage on the fourth "bulk" connection of the transistor. In non-SOI technologies this costs a lot of area or might even be impossible, but in principle in an FPGA each slice could be configured for a different speed/leakage tradeoff. This way the speed for a given leakage current could be improved by budgeting more leakage to the critical path. Kolja SulimmaArticle: 36265
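A rough quantitative footnote to the above (standard textbook subthreshold model, not from the original post): the off-state current of a MOSFET is approximately I_off ~ I_0 * exp(-Vth / (n*kT/q)), where n is typically around 1.2 to 1.5, so the subthreshold swing n*(kT/q)*ln(10) works out to roughly 70-100 mV per decade at room temperature. In other words, every ~100 mV taken off the threshold voltage to gain speed buys roughly a 10x increase in leakage, which is why per-region Vth or body-bias budgeting of the kind described above is attractive.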
rickman <spamgoeshere4@yahoo.com> writes: > On our first design we used one FPGA as a main board controller and used > a separate FPGA for each daughtercard interface. This allowed us to load > > The problem with this approach is that we are increasing the number of > daughtercards supported from 2 to 4 and we are using a very small form > factor main board (PC/104). So we are very cramped for space for two > > We can get a significant cost and size savings using a single, larger > FPGA. But this complicates the download process since the daughtercard > > Can we achieve the same real-time configuration as in the multi-FPGA > approach or at least allow a user a way to build the appropriate > download for the main module FPGA using JBITS? We don't want to require > a user to download FPGA tools and perform a place and route just to > reconfigure their boards. Can we do that with JBITS, use separate > modules for the interfaces and combine them without using the place and > route tools? Should be possible with JBits. Basically take the large FPGA (JBits demands that it must be a Virtex (no -E or -EM !) or a Spartan-II of a size also existing in Virtex) and "cut" the FPGA space into one "master" section and 4 "expansion" sections. Cutting should preferably be vertical if run-time reloading is needed (because Virtex reloads columns), else use any 4 identically sized sections. Something like this:

.-----------------------------------.
|        .--..--..--..--.           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|  base  |ex||ex||ex||ex|           |
|        | 1|| 2|| 3|| 4|           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        `--'`--'`--'`--'           |
`-----------------------------------'

or like this (if not run-time reconfig):

.-----------------------------------.
|   .-------.       .-------.       |
|   |       |       |       |       |
|   |  ex1  |       |  ex2  |       |
|   |       |       |       |       |
|   `-------'       `-------'       |
|                                   |
|               base                |
|                                   |
|   .-------.       .-------.       |
|   |       |       |       |       |
|   |  ex3  |       |  ex4  |       |
|   |       |       |       |       |
|   `-------'       `-------'       |
`-----------------------------------'

Then generate a "base" .bit bitstream for the master stuff. This will have to offer 4 "ports" (basically a set of routing lines and signals on them leading into the master part) for attaching the expansion parts. The "ports" represent a) connection to "base" stuff and b) connection to the appropriate slot. From here on, the implementation differs depending on how much you want the user of the board to do: 1. user has to install JBits on their system Simply program the expansions as .java files that use a standard call interface and then have the user edit a small config file that says which of these files is to be "included" onto which port of the master. Then let JBits compile. I suspect you do not want to go this way, as it requires users to install JBits software, but JBits compiles small programs in minutes, so it may be OK if you are just worried about config time. 2. User has JBits, but faster way Use JBits in the base, in a way that a standard "pinout" is made that interfaces the group of expansion CLBs to "base". "Pins" in this sense being actual FPGA routing wires (the 24 north/east/south/west wires). Just for the base->expansion wires you will have to use "raw" jbits.set() functions on both sides of the "divide". Then make one .bit with the base and with 4 "holes" (CLBs left unconfigured) where the expansions would go. Then have JBits load this pre-compiled .bit and the expansions in .java files and just add the expansion parts. 
This still requires the users to have JBits, but will take under a minute to compile. And no "base" source, if that is relevant to you. 3. user has no JBits at all on their system Take the same .bit with holes in it from above. Same "pins" interface. Then have one precompiled .bitx file for each expansion module (made by a module-specific .java), containing just the CLBs for one subsection (always fitted to, say, the first hole). Then use the information in XAPP155 to write your own program that "cuts out" the 4 user-selected .bitx sections and "merges" them into the base .bit file. XAPP155 documents the arrangement of the basic 48x18 bits that configure one CLB. -- Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/ Hacker, Unix Guru, El Eng HTL/BSc, Sysadmin, Archer, Roleplayer - Intellectual Property is Intellectual RobberyArticle: 36266
>Hmm, dont know so much about the 3k series in the good old times. But I >think all these 32 3k parts can be combined into 1 or two uptodate parts >(Virtex-E, Virtex-II) I expect so but it would be a huge redesign. Interesting though. >> The 3k series device is now obsolete. I would also not like to risk >> rebuilding this board with some newer XC3064s because I know the more >> recent devices are much faster and some structures don't work anymore, > >Nasty asynchronous tricks ?? ;-) Not so much "nasty" as OK with the older parts - it was OK for example to use a "long line" for clock distribution. One used the "L" attribute, together with perhaps "SC=1" on the clock net to force its allocation to a long line. I was very specifically told by a Xilinx engineer (c. 1991) that this was OK. And it worked faultlessly. This was because the worst-case clock net skew was less than the clock-Q delays in the D-types. What happened later, even going from slow to faster XC3000 devices, was that the clock-Q timing improved a lot faster than the local interconnect timing. This IMHO is to be expected when you speed up your gates; speeding up the interconnect by the same factor is a lot harder. So a lot of previously solid clock gating schemes got broken. Obviously, nowadays, you would design everything to work off the global clock nets, and use clock-enable to do the clock gating. Unfortunately, I also used these parts for ASIC prototyping, and when you are doing a low power design, the most important method by far to reduce dynamic power is to gate clocks. Using a global clock net and a clock enable does not reduce dynamic power, well hardly anyway. On those designs I got my fingers burnt a bit :) Peter. -- Return address is invalid to help stop junk mail. E-mail replies to zX80@digiYserve.com but remove the X and the Y. Please do NOT copy usenet posts to email - it is NOT necessary.Article: 36267
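For readers who have not seen the two styles side by side, here is a small generic VHDL sketch (not from the original posts; entity and signal names are invented) contrasting the gated-clock idiom that broke on faster XC3000-class parts with the clock-enable idiom that is safe for timing but, as noted above, saves little dynamic power on the clock tree.

------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity gate_vs_ce is
  port (clk, clk_gate, d : in  std_logic;
        q_gated, q_ce    : out std_logic);
end gate_vs_ce;

architecture sketch of gate_vs_ce is
  signal gated_clk : std_logic;
begin
  -- Style 1: gated clock. Saves dynamic power on the gated net, but on
  -- an FPGA that net is ordinary routing, so skew grows as clock-to-Q
  -- delays shrink, which is exactly the failure mode described above.
  gated_clk <= clk and clk_gate;
  process (gated_clk)
  begin
    if rising_edge(gated_clk) then
      q_gated <= d;
    end if;
  end process;

  -- Style 2: clock enable on the global clock net. Safe for timing, but
  -- the clock tree keeps toggling, so dynamic power is barely reduced.
  process (clk)
  begin
    if rising_edge(clk) then
      if clk_gate = '1' then
        q_ce <= d;
      end if;
    end if;
  end process;
end sketch;

------------------------------------------------------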
They range from a few dozen FFs = a few hundred gates to several millions. There are EEPROM-based chips and SRAM-based chips. The boundary is somewhere at 128 FFs. The higher density ones use SRAM then. For small-scale development, the chip costs are nil compared to the time and perhaps money spent on the software, e.g. $20 for 30k gates. Yes, you can build a CPU or a 3D engine, but these are not the projects to start with. A thick chip with 10 million gates also requires a thick machine with 1G RAM and software for a few k$. Rene -- Ing.Buero R.Tschaggelar - http://www.ibrtses.com bazaillion@yahoo.com wrote: > > Hello, > > I am new to FPGA/PLA/CPLD technology. > > I have a couple questions about them. > > (1) How dense can you get these (How many gates)? > > (2) Hiw much are they typically if you are buy small quantities 1-5 > for prototyping? > > (3) Are they dense enough to build a CPU or a 3d VGA display chip > with these? or maybe multiple chips. > > Thanks, > -M. BazaillionArticle: 36268
On 31 Oct 2001 09:02:55 -0800, husby_d@yahoo.com (Don Husby) wrote: >Mike Treseler <mike.treseler@flukenetworks.com> wrote in message news:<3BDF4799.A379AA16@flukenetworks.com>... >> Russell Shaw wrote: >> > ... >> > most of the >> > device settings you do in the flow tabs don't get >> > saved with the project. >> >> Haven't seen anything like that. > >Leonardo is full of things like that: project settings >flip around like a bad politician. Constraints evaporate. >Constraints are added. Flags appear to be set, but really >aren't. File names are changed. My favorite is when >each source file is added twice to the compile list. >The most consistently annoying is when the output file >name is silently changed - sometimes you don't discover >it until you've done a complete place and route, and find >that nothing changed. > >At some point, you just have to delete all of the Leonardo >project files (*.lsp *.scr *.xdb ...) and start over. > >It's amazing to me that the Leonardo user interface has >been so crappy for so many years. At some point, you >just have to delete Leonardo and start over. Current favourite is the "stop" button you press when you notice one of these absurdities, which puts you on ignore, then finally aborts the current phase of processing several minutes later, then continues with all the other phases anyway. Dumb. Just dumb. Or the "exit" button that politely tells you to press the "stop" button before exiting Leonardo. Or the way it picks the last file in the input file list as a suitable name for the design output ( reasonable) but the first in the list for the constraint file name! I get the impression they test the synthesis engine, but never bother about the UI because everybody uses the command line interface. Wonder if it ever occurred to them, that was because their UI is so bug-ridden. - BrianArticle: 36269
As I have posted before, old XC3000 parts are still around, and a switch to XC3000L might solve the excessive logic speed problem. Peter Alfke, Xilinx Applications ==================================== Peter wrote: > >Hmm, dont know so much about the 3k series in the good old times. But I > >think all these 32 3k parts can be combined into 1 or two uptodate parts > >(Virtex-E, Virtex-II) > > I expect so but it would be a huge redesign. Interesting though. > > >> The 3k series device is now obsolete. I would also not like to risk > >> rebuilding this board with some newer XC3064s because I know the more > >> recent devices are much faster and some structures don't work anymore, > > > >Nasty asynchronous tricks ?? ;-) > > Not so much "nasty" as OK with the older parts - it was OK for example > to use a "long line" for clock distribution. One used the "L" > attribute, together with perhaps "SC=1" on the clock net to force its > allocation to a long line. I was very specifically told by a Xilinx > engineer (c. 1991) that this was OK. And it worked faultlessly. This > was because the worst-case clock net skew was less than the clock-Q > delays in the D-types. > > What happened later, even going from slow to faster XC3000 devices, > was that the clock-Q timing improved a lot faster than the local > interconnect timing. This IMHO is to be expected when you speed up > your gates; speeding up the interconnect by the same factor is a lot > harder. So a lot of previously solid clock gating schemes got broken. > > Obviously, nowadays, you would design everything to work off the > global clock nets, and use clock-enable to do the clock gating. > > Unfortunately, I also used these parts for ASIC prototyping, and when > you are doing a low power design, the most important method by far to > reduce dynamic power is to gate clocks. Using a global clock net and a > clock enable does not reduce dynamic power, well hardly anyway. On > those designs I got my fingers burnt a bit :) > > Peter. > -- > Return address is invalid to help stop junk mail. > E-mail replies to zX80@digiYserve.com but remove the X and the Y. > Please do NOT copy usenet posts to email - it is NOT necessary.Article: 36270
Hi, I have checked the Xilinx datasheet, and the speed of an adder in an XCV1000E-6 (1.8V) is very fast: 16 bit 4.x ns, 64 bit 6.x ns. But the timing analyzer reports something different: CLB_R27C37.S1.G3 net (fanout=1) 2.084R jk<1> CLB_R27C37.S1.COUT Topcyg 1.000R j<0> ijk_add/C2/C3/C0 ijk_add/C2/C3/C2 CLB_R26C37.S1.CIN net (fanout=1) 0.000R ijk_add/C2/C3/C2/O CLB_R26C37.S1.COUT Tbyp 0.149R j<2> ijk_add/C2/C4/C2 ijk_add/C2/C5/C2 CLB_R25C37.S1.CIN net (fanout=1) 0.000R ijk_add/C2/C5/C2/O CLB_R25C37.S1.Y Tciny 0.677R j<4> ijk_add/C2/C6/C2 ijk_add/C2/C7/C1 CLB_R30C32.S0.F4 net (fanout=2) 1.758R j<5> CLB_R30C32.S0.X Tilo 0.468R C8/N49 C741 where Tbyp is the Cin-to-Cout delay, which is quite fast, but Topcyg is the delay from the G input to Cout. This should be the carry-out generated in the upper LUT. If this were the per-bit delay, a 64-bit adder would never be 6ns but at least 64 ns. So why does the report count only one of these and not all of them in the path? What I do is: jk <= j + k; ijk <= jk + i; help me pls ---- BrittleArticle: 36271
bazaillion@yahoo.com wrote in message news:<3be44c04.109339469@news.charter.net>... > Hello, > > I am new to FPGA/PLA/CPLD technology. > > I have a couple questions about them. > > > (1) How dense can you get these (How many gates)? > Although I don't use a CPLD myself, a fairly large CPLD can fit about 10,000 gates. Most CPLDs are EEPROM-based, which means that they can be reprogrammed a lot of times, and they are active immediately upon power-on. Manufacturers of CPLDs include Altera (http://www.altera.com), Cypress (http://www.cypress.com), Lattice (http://www.latticesemi.com), and Xilinx (http://www.xilinx.com). Most FPGAs are much larger than CPLDs. However, manufacturers of FPGAs really have a bad habit of inflating the gate count the chip can realistically fit. For example, take an FPGA I use, the Xilinx Spartan-II 150,000 "system gate" part (XC2S150): although Xilinx (http://www.xilinx.com) claims that part can fit 150,000 "system gates", the realistic gate count (not using vendor proprietary features) is about 30,000 to 35,000 gates. Yes, Xilinx's "system gate" inflates the realistically achievable gate count by about 5 times. Xilinx's rival Altera (http://www.altera.com) also inflates the gate count, but in my opinion the inflation seems to be about 2 to 3 times. Spartan-IIs are low-end FPGAs, but a high-end FPGA like the Xilinx Virtex-II 6M system gate part (XC2V6000) should be able to fit more than 1M realistic gates, if my assumption is correct. Xilinx and Altera FPGAs are based on SRAM, so a Configuration PROM (EEPROM based) has to be attached to program the FPGA when the power is turned on. There are other FPGA manufacturers that make FPGAs based on antifuse, like Actel (http://www.actel.com) and Quicklogic (http://www.quicklogic.com), but antifuse FPGAs, I think, are hard to use for prototyping because once you program them, they cannot be programmed again. That's because in antifuse FPGAs, you burn off fuses inside the FPGA to program it. Nowadays, antifuse FPGAs are not as popular as SRAM FPGAs, because their density is relatively small compared to SRAM FPGAs, and they are hard to use for prototyping. > (2) Hiw much are they typically if you are buy small quantities 1-5 > for prototyping? Let's say that you want to purchase one Xilinx Spartan-II 150,000 "system gate" part (XC2S150) from a Xilinx distributor. At the Insight Electronics order website (http://www.insight-electronics.com/order/index.html), you can type in a part number. If you type in XC2S150, it gives you a list of various XC2S150 parts with different package options and speed grades. Yes, because Spartan-IIs are geared as low-cost FPGAs, they only cost between $20 and $40 per chip. You also have to purchase a Configuration PROM (XC18V01), which costs about $23. However, if you try to buy a high-end FPGA like the Xilinx Virtex-II 6M system gate part (type in XC2V6000), a single chip will cost you between $4,000 and $6,000. Yes, $4,000 to $6,000 may sound like a lot, but for some applications where the production volume is extremely small (let's say specialized communications equipment), buying a couple of Virtex-IIs will likely be cheaper than fabricating a custom chip (an ASIC). Recent FPGAs use packages like PQFP or BGA, which require specialized equipment to solder onto a PCB (especially the BGA package). Therefore, I think it will be much more practical to purchase a prototype board with an FPGA on it. 
My recommendation is that you pick up one from Insight Electronics (http://www.insight-electronics.com/solutions/kits/xilinx/index.shtml) with a Spartan-II on it. One of them is a standalone board with a Spartan-II on it, and another has a Spartan-II on a PCI card (the one I use). For design tools, you can use Xilinx ISE WebPack, which is free, supports all Spartan-II devices, and comes with a simulator called ModelSim XE-Starter. I think you should not initially consider paying for tools, because if you paid $1,000 for tools and you didn't like designing hardware with an FPGA after all, you would still be holding the bag with useless software. At least a free tool won't have such a risk. Other CPLD and FPGA vendors also offer free design tools, but Xilinx is the only one that offers a free simulator. It is far easier to find bugs running a simulator than firing up the actual FPGA, especially as the design gets bigger and more complex. To design circuits for an FPGA, you should learn languages like Verilog or VHDL. I would recommend learning Verilog, but I am sure some people will say VHDL. It really doesn't matter which one you learn. One thing I can say from my experience is that a lot of books about Verilog and VHDL available out there are not well written. Examples inside the books are too trivial, too boring, or too difficult to understand. > > (3) Are they dense enough to build a CPU or a 3d VGA display chip > with these? or maybe multiple chips. > > Thanks, > -M. Bazaillion You can obtain some homegrown CPUs from websites like http://www.fpgacpu.org, http://www.free-ip.com, and http://www.opencores.org. Whether or not the CPUs you get from those websites are worth playing around with is another question. Those CPUs are pretty small, so they should be able to fit inside most Spartan-II based prototype boards. However, my guess is that you are thinking of whether or not an x86 processor might fit inside an FPGA. Probably a 4 to 5 year old x86 processor might fit inside one Xilinx Virtex-II 6M system gate part, and if one is not enough, you can use multiple of them. In fact, because of the high cost of fabricating an ASIC, more designers use multiple high-density FPGAs to debug a design initially before committing to an ASIC. That way you have less chance of making a mistake when creating the ASIC. At the 0.18u process node, it costs $300,000 in NRE (Non-Recurring Engineering) charges to fabricate an ASIC for volume production. At 0.13u, the NRE charge will approach $1 million . . . Even one small mistake can be fatal, because you are talking about paying another $300,000 (in 0.18u) to fabricate an ASIC for volume production. For a 3D graphics chip, again a 4 to 5 year old one should fit inside a Xilinx Virtex-II 6M system gate part, but you will have to put a RAMDAC outside of the chip because an FPGA doesn't contain one (TI and IBM used to make high-end RAMDACs about 4 to 5 years ago). Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup)Article: 36272
Good morning, I use Synplify 6 to synthesize my project; it also performs the mapping and produces an EDIF file. I want to send this to Xilinx 3.3 for Place and Route; how can I do this? There is an option to use Synplify directly from Xilinx, but this does not seem to work on my computer. Thanks for your help. BananaArticle: 36273
> Unfortunately you cannot use both the rising and the falling edge in a single process. What you can do is 1. double the clock (by using a dll) CAN YOU EXPLAIN TO ME BETTER HOW I COULD DO THIS ??? THANKS BANANA P.S. : MY IDEA IS TO COUNT FROM 0 TO 2 AND WITH THE SAME CIRCUIT ALSO DIVIDE A CLOCK BY THREE. HERE IS THE CODE, REARRANGED:

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity counter_divider_3 is
  port (
    clk       : in  STD_LOGIC;
    reset     : in  STD_LOGIC;
    count_3   : out STD_LOGIC_VECTOR (2 downto 0);
    clk_div_3 : out STD_LOGIC
  );
end counter_divider_3;

architecture counter_divider_3_arch of counter_divider_3 is
begin
  process (clk, reset)
    variable count_3_internal   : STD_LOGIC_VECTOR (2 downto 0);
    variable clk_div_3_internal : std_logic;
  begin
    if reset = '1' then
      count_3_internal := "000";
    elsif falling_edge(clk) then
      if (count_3_internal = "010") then
        -- the counter has to be reset to zero
        count_3_internal   := "000";
        clk_div_3_internal := '0';
      else
        count_3_internal := count_3_internal + 1;
      end if;
    elsif count_3_internal = "001" then
      clk_div_3_internal := '1';
    end if;
    count_3   <= count_3_internal;
    clk_div_3 <= clk_div_3_internal;
  end process;
end counter_divider_3_arch;
Article: 36274
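Since the question above is how to double the clock with a DLL, here is a minimal sketch of the usual approach on Virtex/Spartan-II class parts, using the CLK2X output of the Xilinx CLKDLL primitive. The entity and signal names are invented for illustration, and the primitive's exact ports and attributes should be checked against the Xilinx libraries guide for your device family (older families such as the original Spartan have no DLL at all).

------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity clk_doubler is
  port (
    clk_in : in  std_logic;   -- board clock pad
    rst    : in  std_logic;
    clk2x  : out std_logic;   -- doubled clock for the divide-by-3 counter
    locked : out std_logic
  );
end clk_doubler;

architecture rtl of clk_doubler is
  -- Xilinx primitives, declared here as components
  component IBUFG port (I : in std_logic; O : out std_logic); end component;
  component BUFG  port (I : in std_logic; O : out std_logic); end component;
  component CLKDLL
    port (CLKIN, CLKFB, RST : in std_logic;
          CLK0, CLK90, CLK180, CLK270, CLK2X, CLKDV, LOCKED : out std_logic);
  end component;
  signal clk_ibuf, clk0_int, clk0_buf, clk2x_int : std_logic;
begin
  u_ibufg : IBUFG port map (I => clk_in, O => clk_ibuf);

  u_dll : CLKDLL
    port map (CLKIN => clk_ibuf, CLKFB => clk0_buf, RST => rst,
              CLK0  => clk0_int, CLK90 => open, CLK180 => open, CLK270 => open,
              CLK2X => clk2x_int, CLKDV => open, LOCKED => locked);

  -- CLK0 must be fed back through a global buffer for deskew.
  u_bufg0  : BUFG port map (I => clk0_int,  O => clk0_buf);
  u_bufg2x : BUFG port map (I => clk2x_int, O => clk2x);
end rtl;

------------------------------------------------------

Once the doubled clock is available, the divide-by-3 counter can run entirely on rising_edge of clk2x, so the falling-edge branch in the code above is no longer needed.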
I would like to know how effective the Floorplanner software that comes with the Xilinx ISE series of software is. The design I am working on is a PCI IP core whose Tsu (setup time) has to be less than 7ns, and whose Tval (clock to output valid) has to be less than 11ns. Currently, the worst Tsu I have is 12.974ns, and the worst Tval I have is 16.594ns. I synthesized my design with XST Verilog, and I used only automatic P&R with user constraints (Pad to Setup = 7ns, Clock to Pad = 11ns). The software I am currently using is ISE WebPack 4.1. Is it realistic to expect that I will get Tsu and Tval within 7ns and 11ns respectively if I use the Floorplanner? Will reducing fan-out during synthesis help? If so, what number (default is 100) is appropriate? Are there any other helpful synthesis/P&R options that will improve the timings? The part I am using is the Spartan-II 150K system gate part, speed grade -5, which comes with the Insight Electronics Spartan-II PCI Development Kit, so the use of speed grade -6 is not an option. I already synthesized my design with the speed grade -6, and that improved the worst timings by 20%, but that still wasn't enough by about 15%. I already tried "Pack I/O Registers/Latches into IOBs" in MAP. If I packed IOBs for input by selecting "For Inputs Only" or "For Inputs or Outputs", it created a positive hold time, a no-no in PCI (hold time has to be 0ns in PCI). Selecting "For Outputs Only" didn't seem to improve Tval that much. If the Floorplanner is going to help, what kinds of strategies should I use to hand-place the design? Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup)