Messages from 90675

Article: 90675
Subject: Re: CPLD design software under WINE?
From: troy.scott@latticesemi.com
Date: 18 Oct 2005 13:52:23 -0700

Mika,

It might be worth trying a recent release of ispLEVER software for
Linux. Lattice qualifies it under Red Hat Enterprise v3 so it may run
fine under your Linux distribution. I'd expect the Linux version to run
a bit faster than under WINE; however, for the little CPLDs it may not
make a difference either way.

Troy Scott
Lattice Semiconductor
troy.sc...@latticesemi.com

Article: 90676
Subject: Re: Newbie question: XC3S400 Gate Count
From: "Paul Marciano" <pm940@yahoo.com>
Date: 18 Oct 2005 13:52:51 -0700

Ed McGettigan wrote:
> The FPGA gate counts include not only the logic that can be built in the
> LUTs, but also the registers, the carry logic, the CLB muxes, the BlockRAMs,
> the clock DCMs, the IOB registers, DSP logic (not in the XC3S400).

Ed, just out of curiosity, roughly how many transistors are there in
an XC3S400?  That is, total in the package - including RAMs,
configuration, routing, redundancy, everything?

I'm curious to know how many transistors you pack into a package
compared to, say, a 55 million transistor Pentium4.

I know it doesn't actually mean anything to the user - again... just
curious.

Thanks,
Paul.

Article: 90677
Subject: Re: Carry Chain Design
From: "Gabor" <gabor@alacron.com>
Date: 18 Oct 2005 13:55:36 -0700

Brannon wrote:
> You have a good point. I should have clarified. The subtraction only
> uses two inputs per LUT. The same is the case with what you're
> proposing for the compare. It seems to me you could double the density
> by adding an extra switch bit to the MUXCY because the LUT already
> supports four inputs.

Might be good for compare, but since the LUT still has only one
output you couldn't build an adder/subtracter, for which the carry
chain was optimized.

Article: 90678
Subject: Re: Carry Chain Design
From: "John_H" <johnhandwork@mail.com>
Date: Tue, 18 Oct 2005 21:07:17 GMT

So to clarify:
You want 2 bits of "a" and 2 bits of "b" to drive the 16-entry LUT for the 
carry chain select AND you want an 8-entry LUT for the Direct Input of the 
carry chain mux?

"Brannon" <brannonking@yahoo.com> wrote in message 
news:1129659685.077987.171750@g14g2000cwa.googlegroups.com...
> You have a good point. I should have clarified. The subtraction only
> uses two inputs per LUT. The same is the case with what you're
> proposing for the compare. It seems to me you could double the density
> by adding an extra switch bit to the MUXCY because the LUT already
> supports four inputs.
>

Article: 90679
Subject: Re: Newbie question: XC3S400 Gate Count
From: jerryzy@gmail.com
Date: 18 Oct 2005 14:53:17 -0700

Thanks a lot Ed!

Does that mean that if I want to roughly estimate the ASIC gates in a
general design, I can just multiply the LUT count by 10-15?

Article: 90680
Subject: Re: Newbie question: XC3S400 Gate Count
From: Ed McGettigan <ed.mcgettigan@xilinx.com>
Date: Tue, 18 Oct 2005 14:53:37 -0700

Paul Marciano wrote:
> Ed McGettigan wrote:
> 
>>The FPGA gate counts include not only the logic that can be built in the
>>LUTs, but also the registers, the carry logic, the CLB muxes, the BlockRAMs,
>>the clock DCMs, the IOB registers, DSP logic (not in the XC3S400).
> 
> 
> Ed, just out of curiosity how many transistors, roughly, are there in
> an XC3S400?  That is, total in the package - including RAMs,
> configuration, routing, redundancy, everything?
> 
> I'm curious to know how many transistors you pack into a package
> compared to, say, a 55 million transistor Pentium4.
> 
> I know it doesn't actually mean anything to the user - again... just
> curious.
> 

We counted transistors for a while internally as we thought that it
was an interesting statistic.  But, it's actually a very hard problem to
handle as our devices are almost 100% full custom and sometimes the legs
of a transistor may not be clearly defined or split between different
submodules. The auto reporting functions from the CAD tools were also
spitting out numbers that were way too high.  We stopped doing this in
detail for Virtex-4 as there was no real benefit in knowing the exact
number except for bragging rights.

We had a press release for the Virtex-II Pro 2VP100 part that stated
430 Million transistors back in 2003
http://www.xilinx.com/prs_rls/xil_corp/03133taiwan.htm

I think that we were estimating about 1 Billion transistors for the
Virtex-4 LX200 parts that we are shipping now. I'm not as familiar with
the Spartan-3 line, but I think that you are looking at about 30-35 Million
for the XC3S400, which has a configuration size of 1.7 Mbit.

Ed

Article: 90681
Subject: Re: Newbie question: XC3S400 Gate Count
From: Austin Lesea <austin@xilinx.com>
Date: Tue, 18 Oct 2005 15:12:11 -0700

Check out:

http://www.fpgajournal.com/articles_2005/20051018_hardi.htm

And the other articles this week all on using FPGAs to prototype ASICs.

Austin

jerryzy@gmail.com wrote:

> Thanks a lot Ed!
> 
> Does that mean if I want to roughly estimate the ASIC gates in a
> general design, I can just multiply the LUT number by 10-15?
>

Article: 90682
Subject: Webpack install yields "299" error
From: Eboy <Eboy@netscape.ca>
Date: Tue, 18 Oct 2005 16:03:20 -0700

Neophyte looking for help installing WebPACK 7.1; the install yields a "299" error. Being a student has denied me access to WebCase. Any insight would be much appreciated.

Article: 90683
Subject: Re: clock timing
From: =?ISO-8859-15?Q?Benjamin_Menk=FCc?= <benjamin@menkuec.de>
Date: Wed, 19 Oct 2005 01:06:38 +0200

Hi Symon,

fig. 11 in XAPP622 is about direct LVDS output. I am doing the parallel 
to lvds conversion with an external IC.

My VHDL Latch looks like this:
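-- Output register for the four 7-bit LVDS channels:
-- asynchronous reset, data captured on the rising edge of pixel_clk.
process (pixel_clk, screen_reset)
begin
	if screen_reset = '1' then
		-- clear all channels while in reset
		lvds1 <= "0000000";
		lvds2 <= "0000000";
		lvds3 <= "0000000";
		lvds4 <= "0000000";
	elsif rising_edge(pixel_clk) then
		-- register the pre-computed channel data
		lvds1 <= lvds1pre;
		lvds2 <= lvds2pre;
		lvds3 <= lvds3pre;
		lvds4 <= lvds4pre;
	end if;
end process;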

pixel_clk is my master clock. The lvds signals contain pixel color + 
control signals for the LCD. Somehow the results are worse now than before.
I am running pixel_clk at 66 MHz at the moment. Is it necessary to take 
special care of the pixel_clk output? My pixel_clk output is on a GCLK pin.

I generate my pixel_clk like this:

dcm2_1 : dcm2 port map (
	CLKIN_IN => clk_ibufg,
	RST_IN => RESET,
	CLKFX_OUT => clk_180m,
	CLKDV_OUT => pixel_clk,
	LOCKED_OUT => lvds_locked,
	CLK0_OUT => open);

where clk_ibufg is a 100 MHz input clock that gets divided by 1.5. I have 
connected the output "pixel_clk" of my entity directly to the DCM.

What is wrong or what should I improve?

regards,
Benjamin
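
The clock figures quoted in this post can be checked with a few lines of Python. This is only a sketch: the 100 MHz input and the divide-by-1.5 on CLKDV come from the post, while the CLKFX multiply/divide pair of 9/5 is an assumption inferred from the signal name clk_180m, not something the post states.

```python
from fractions import Fraction

CLKIN_MHZ = Fraction(100)            # input clock, from the post
CLKDV_DIVIDE = Fraction(3, 2)        # "divided by 1.5", from the post
CLKFX_MULTIPLY, CLKFX_DIVIDE = 9, 5  # assumed: chosen only because 100 * 9 / 5 = 180


def dcm_outputs(clkin_mhz):
    """Return (CLKDV, CLKFX) frequencies in MHz for the settings above."""
    clkdv = clkin_mhz / CLKDV_DIVIDE
    clkfx = clkin_mhz * CLKFX_MULTIPLY / CLKFX_DIVIDE
    return clkdv, clkfx


pixel_clk_mhz, clk_180m_mhz = dcm_outputs(CLKIN_MHZ)
pixel_period_ns = 1000 / pixel_clk_mhz  # time budget per pixel clock cycle

print(f"pixel_clk = {float(pixel_clk_mhz):.2f} MHz, period = {float(pixel_period_ns):.1f} ns")
print(f"clk_180m  = {float(clk_180m_mhz):.1f} MHz")
```

With these numbers the "66 MHz" pixel clock is really 66.67 MHz, which leaves a 15 ns period to share between clock-to-out, board delay and the setup/hold window of the external LVDS serializer.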

Article: 90684
Subject: Re: clock timing
From: =?ISO-8859-15?Q?Benjamin_Menk=FCc?= <benjamin@menkuec.de>
Date: Wed, 19 Oct 2005 01:31:07 +0200

Hi,

I have looked in the FPGA Editor. My VHDL code really does use the FFs 
inside the IOBs.

However, I saw that my clock output goes directly to the pad inside the 
IOB, and the routing inside the FPGA doesn't look good. The lvds_clk 
signal that reaches the IOB where the clk pin is located is named 
"lvds_clk_OBUF", so I guess the compiler put an OBUF on it :) I remember 
that there was an OBUFG for clocks... Is it better to use this instead? 
If yes, how do I implement that in VHDL?

Antti said "use DCM to adjust phy clock phase". How do I do that in my case?

I made a mistake in my last post: the output defined in the entity of 
my design is lvds_clk.

Somewhere in the code I just do lvds_clk <= pixel_clk.

Thank You :)

regards,
Benjamin

Article: 90685
Subject: Re: Newbie question: XC3S400 Gate Count
From: Ed McGettigan <ed.mcgettigan@xilinx.com>
Date: Tue, 18 Oct 2005 16:44:44 -0700

jerryzy@gmail.com wrote:
> Thanks a lot Ed!
> 
> Does that mean if I want to roughly estimate the ASIC gates in a
> general design, I can just multiply the LUT number by 10-15?
> 

It really depends on what your logic is actually doing in the LUT.
If it's an XOR it would be accurate; if it's an AND it wouldn't.  We
have an old XAPP on this subject that you can look through for more
info: http://direct.xilinx.com/bvdocs/appnotes/xapp059.pdf

In my original analysis of your HDL code I didn't include any area
optimization for sharing logic functions which is likely happening.

The right way to do this is to resynthesize your design to a target
ASIC library and then to use the reported gate counts.

Ed
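
For a first-pass number only, the 10-15 gates-per-LUT rule of thumb from the question can be written down as below. This is a rough sketch: the function name and the example LUT count are hypothetical, and as noted above the real ratio depends on what each LUT implements, so resynthesizing to the target ASIC library remains the reliable method.

```python
def asic_gate_estimate(lut_count, gates_per_lut=(10, 15)):
    """Return a (low, high) ASIC gate range for a given number of used LUTs.

    gates_per_lut is the rule-of-thumb multiplier range from the thread;
    it is not a calibrated figure.
    """
    low, high = gates_per_lut
    return lut_count * low, lut_count * high


luts_used = 2000  # hypothetical LUT count, not taken from the thread
low, high = asic_gate_estimate(luts_used)
print(f"{luts_used} LUTs -> roughly {low} to {high} ASIC gates")
```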

Article: 90686
Subject: re:How to Reduce Interconnects (VDD and VSS)
From: fahadislam2002@hotmail-dot-com.no-spam.invalid (fahadislam2002)
Date: Tue, 18 Oct 2005 19:16:20 -0500

Hi again :)

First, thanks for such a nice response.

Everyone suggested that I should not try to use a 1- or 2-layer board but
go for four layers instead. The problem is that 4-layer boards are not
available in my country (Pakistan), and since the RAM is just one part of
my final-year project (designing a gaming console using an FPGA), I
obviously do not have much time to get a 4-layer PCB from abroad.

MY NEED

I need only 1 MB to 4 MB of RAM with a better speed (as I also want to
use it as video RAM). So I also tried to get chips, but here only 32 KB
parts are available, with a poor speed of 85 ns.

For chips I contacted Micron's office in China (as they are not present
in my country), but they responded that they only ship orders of at
least $500, and I need only one or two 16 MB SDRAM chips.

Please suggest:

Query_1:-
    Where can I get a cheap SDRAM chip (preferably Micron, as I have
    worked with it before), and also video RAM? More importantly, I need
    just one or two :(
Query_2:-
    Where can I get a cheap 4-layer PCB, or some alternative to it?
Query_3:-
    The major area where I need more RAM (and more speed) is video RAM.
    Please suggest how I can reduce this need :) If I can reduce it and
    manage with much less, then I can try to use a series of smaller RAM
    chips.

Waiting for a response.
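
To put numbers on Query_3, the frame buffer size follows directly from resolution and colour depth. The sketch below is only an illustration: the display modes and the 60 Hz refresh rate are assumptions (the post names none), and the bandwidth figure ignores blanking intervals.

```python
def frame_buffer_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one full frame."""
    return width * height * bits_per_pixel // 8


def chips_needed(total_bytes, chip_bytes):
    """Number of RAM chips of chip_bytes each needed to hold total_bytes."""
    return -(-total_bytes // chip_bytes)  # ceiling division


CHIP_BYTES = 32 * 1024  # the 32 KB parts mentioned in the post
REFRESH_HZ = 60         # assumed refresh rate

# (width, height, bits per pixel): assumed example modes
for width, height, bpp in [(320, 240, 8), (640, 480, 8), (640, 480, 16)]:
    size = frame_buffer_bytes(width, height, bpp)
    read_rate = size * REFRESH_HZ  # bytes per second just to refresh the display
    print(f"{width}x{height} @ {bpp} bpp: {size} bytes, "
          f"{chips_needed(size, CHIP_BYTES)} x 32 KB chips, "
          f"{read_rate / 1e6:.1f} MB/s read")
```

Lowering the resolution or colour depth shrinks the buffer proportionally, which is the main lever for managing with less video RAM.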

Article: 90687
Subject: WANTED: Contract Verilog Designer
From: Greg Neff <>
Date: Tue, 18 Oct 2005 20:28:59 -0400

We have an immediate requirement for a Verilog designer for a
short-term (approx. 3 month) period.  This job involves completing two
projects already underway.  

Both projects are embedded communications systems.  In both cases the
targets are Xilinx Spartan 3 FPGAs.  In one case the design includes
the Xilinx Microblaze soft processor.  The tool chain is Synplify,
Xilinx EDK and Xilinx ISE.  

The location is Scarborough (north-east Toronto in Ontario, Canada).
For more information contact Greg Neff at (416) 293-8263.


================================

Greg Neff
VP Engineering
*Microsym* Computers Inc.
greg@guesswhichwordgoeshere.com

Article: 90688
Subject: Re: How to Reduce Interconnects (VDD and VSS)
From: Bevan Weiss <kaizen__@NOSPAMhotmail.com>
Date: Wed, 19 Oct 2005 14:30:06 +1300

fahadislam2002 wrote:
> Hi again :)
>
> First, thanks for such a nice response.
>
> Everyone suggested that I should not try to use a 1- or 2-layer board but
> go for four layers instead. The problem is that 4-layer boards are not
> available in my country (Pakistan), and since the RAM is just one part of
> my final-year project (designing a gaming console using an FPGA), I
> obviously do not have much time to get a 4-layer PCB from abroad.
>
> MY NEED
>
> I need only 1 MB to 4 MB of RAM with a better speed (as I also want to
> use it as video RAM). So I also tried to get chips, but here only 32 KB
> parts are available, with a poor speed of 85 ns.
>
> For chips I contacted Micron's office in China (as they are not present
> in my country), but they responded that they only ship orders of at
> least $500, and I need only one or two 16 MB SDRAM chips.
>
> Please suggest:
>
> Query_1:-
>     Where can I get a cheap SDRAM chip (preferably Micron, as I have
>     worked with it before), and also video RAM? More importantly, I need
>     just one or two :(
> Query_2:-
>     Where can I get a cheap 4-layer PCB, or some alternative to it?
> Query_3:-
>     The major area where I need more RAM (and more speed) is video RAM.
>     Please suggest how I can reduce this need :) If I can reduce it and
>     manage with much less, then I can try to use a series of smaller RAM
>     chips.
>
> Waiting for a response.
> 

I haven't been following this thread; however, it looks like you started 
out by designing for RAM modules (i.e. PCBs containing several DRAM 
chips).  It then looks like you decided that routing this on a 1- or 
2-layer PCB would be too complex, so you'd like to go to a 16MB DRAM chip.

I'd recommend that in all cases you will really need a 4 layer board. 
This is especially true if you've only done a few board designs, and 
certainly if you've never successfully completed one involving high 
speed interconnects.

There are many good fast prototype board houses in the US; many will 
produce 4 layer boards in under a week.  So, including shipping, you could 
have one in less than 2 weeks.  The cost is also quite reasonable, maybe 
a couple of hundred dollars for a one-off.

You could then choose whether you'd prefer to route for a single chip 
(or maybe a couple) or whether you'd prefer to include a RAM module slot 
(which is what I'd do based on what the design seems to be for).

If you request some academic samples you may be able to obtain a couple. 
I've had good experience with Micron in this regard before.  Try 
mentioning what you're after the samples for when you talk to your local 
distributor.  If they're not helpful, try a distributor slightly further 
afield.

I'm surprised that you only mention issues with routing for your memory 
modules.  Have you had no other routing problems, such as the FPGA -> 
host etc?  2 layers is really not that much when you've got to include 
power and ground on them...
How are you getting your bitstream to the FPGA?

Article: 90689
Subject: Re: LSI RapidChip
From: "Jerry" <nospam@nowhere.com>
Date: Tue, 18 Oct 2005 22:38:25 -0400

John, this is exactly the information I'm looking for: a user with hands-on
experience.

I tend to agree with you that Arrow may earn their money, since this is a
new tool flow with new devices and new bugs. While I have done several ASICs
in the past, the flow was: toss the netlist over the fence after a few
checks, run the netlist with extracted delays from place and route, and
maybe fix a few setup/hold problems. They always worked. With these
submicron geometries and all of their second-order effects, the process is
much more involved. Yes, the NRE is high, but for a full custom it's even
higher. I'm preaching to the choir on that.

Any idea what the hourly rate for Synopsys PrimeTime would be? Also, any
idea how long it would take to do a timing analysis on a 2 million gate,
single clock, totally synchronous design? NO GATED CLOCKS!

> From experience I'd rather deal with Arrow any day. LSI ARE sticklers
> and can be a real pain to work with.
LSI has a methodology that works.
I hated procmon. My LSI chips always worked.

> - Detailed place and route time - very dependent on the size /
> complexity of the chip. But very fast compared to an ASIC 6-8 weeks is
> a good rule of thumb
I'm somewhat surprised it takes that long for a structured ASIC. I need to
understand better why it takes so long.

Another question: What is your typical gate utilization?

Again thanks for your time John.

Regards
Jerry




"John B" <ctlmdjb@gmail.com> wrote in message
news:1129648958.692373.274090@o13g2000cwo.googlegroups.com...
> Jerry - we've designed (are designing)  several LSI RapidChips.
> Here's some comments on your posting:
>
> - Costs are pretty out there. You need to get to a Rapid Ready Netlist
> using the LSI tool flow (LSI RapidWorx). Depends where you are coming
> from on how hard that is. If it's an FPGA design, expect to re-map
> the memories, re-do the I/Os, maybe change some IP, re-verify. If
> it's a clean design it's a case of designing for the technology.
> - You'll pay: LSI NRE ($50-200K depending on chip chosen) for Netlist
> to chip. Synplicity tool costs ($20K or so) and that should be it.
> RapidWorx is free. You'll need Synopsys PrimeTime for STA - if you
> haven't got it that can be pricey, or you can pay someone to do it
> for you on an hourly rate
> - Layout is easy as long as you follow the rules - so, one iteration.
> The rules cover things like max speed (typically 250MHz) and getting
> the RTL through the design rule checker (Tera Systems, Teraform tool).
> The rules can be broken, but that's when physical becomes trickier
> and you can expect to pay some additional NRE
> - Device delivery schedule.....won't lie to you here, LSI is still
> putting chips through a new tool flow and we've run into several
> gotchas that delayed things. By now hopefully things are cleaned up
> (they've done about 50 chips). But don't believe their 'as easy
> as an FPGA' hype - it isn't. Learning the tools takes some
> time.....the tools still have some bugs. Working with someone
> who's done it before is a huge benefit.
> - Synplicity does a 'placed netlist' synthesis with pretty accurate
> timing.....except, one design we did had a massive fan out on the AMBA
> bus. We warned the end customer but he went ahead anyway. The final
> layout tool threw in a whole bunch of buffers and that screwed things
> up compared to Synplicity results. Otherwise the match has been very
> accurate.
> - Detailed place and route time - very dependent on the size /
> complexity of the chip. But very fast compared to an ASIC 6-8 weeks is
> a good rule of thumb
> - Test scan is built into the chip by LSI so you should have no worries
> there. You hand off RTL, they add clocks, scan chains etc. No need to
> worry about test coverage, signal integrity (as long as you follow the
> rules)
> - Paying a distributor - LSI uses Arrow (in North America) to deal
> with most customers who aren't Cisco etc. Arrow has a design center
> and will do most (not all) of the post Netlist engineering. So yes, you
> pay them. But they are highly professional and experienced. Unless
> you're a high volume big guy in which case you might deal direct.
> From experience I'd rather deal with Arrow any day. LSI ARE sticklers
> and can be a real pain to work with.
>
> Feel free to Email me with any questions or you can talk directly to
> one of our engineers (within reason, we have to make a living too!)
>

Article: 90690
Subject: Re: LSI RAPIDCHIP
From: "Jerry" <nospam@nowhere.com>
Date: Tue, 18 Oct 2005 22:40:16 -0400

I think you did a fine job.

"John B" <ctlmdjb@gmail.com> wrote in message
news:1129649524.280207.212790@g49g2000cwa.googlegroups.com...
> Sorry just read through that and realised I didn't do a very good job
> on the flow:
> 1) Enter the design in LSI RapidWorx
> 2) Check the design in TeraSystems Teraform
> 3) Synthesize the design in Synplicity Amplify
> 4) Check STA in Synopsys PrimeTime
> 5) Hand off netlist to LSI or 3rd party.
> John (john.burton@octera.com)
>

Article: 90691
Subject: Re: using i2c core
From: "CMOS" <manusha@millenniumit.com>
Date: 18 Oct 2005 20:18:33 -0700

I've done that already. I'm using an IOBUF at the port, but at the
translation stage it gives an error, "ERROR:924", which says the port is
connected to a non-buffered input.
In detail, this is what I have.
I have six port signals coming out of the i2c interface to generate
the open-drain "scl" and "sda" outputs, namely:
sda_pad_o - this is internally connected to ground, as you said (port
type - out)
scl_pad_o - this too is internally connected to ground, as you said (port
type - out)
sda_pad_oen - this is the buffer enable/disable output for sda (port
type - out)
scl_pad_oen - this is the buffer enable/disable output for scl (port
type - out)
sda_pad_i - this is the input from the sda buffer (port
type - in)
scl_pad_i - this is the input from the scl buffer (port
type - in)

What I'm doing in the top-level design is connecting those signals to
two IOBUFs, where the bidirectional pins of those two IOBUFs are used as
the final sda and scl. I'm sure the connections are correct. When I do
so, the translation phase gives the error "ERROR:924".

Please help me with this.

P.S.:
I'm new to VHDL and FPGAs and I have a lot of questions, some of which are
very silly. But symon knew VHDL since his birth and I'm proud of him.
Let me know a better forum if I'm too inexperienced here.

http://groups.google.com/group/comp.arch.fpga/browse_thread/thread/7bf06cf65c8e6a76/d33a8d735c04af66?q=clk+clock+coffee&rnum=1&hl=en#d33a8d735c04af66
or search for "clk clock coffee cocoa" in this forum.

Thank you.
CMOS

Article: 90692
Subject: Re: using i2c core
From: John_H <johnhandwork@mail.com>
Date: Wed, 19 Oct 2005 04:10:36 GMT

"im sure the connections are correct"

Is the IOBUF's input hooked up to the output from the core (which you 
found is always ground) and is the IOBUF's output hooked up to the 
input to your core?  I've mentally stumbled on the .O feeding my inputs 
and not being what I drive and vice-versa on the other side.

The .I is the input *to* the IOBUF and the .O is the output *from* the 
IOBUF.

It's a shot.

- John_H


CMOS wrote:
> I've done that already. I'm using an IOBUF at the port, but at the
> translation stage it gives an error, "ERROR:924", which says the port is
> connected to a non-buffered input.
> In detail, this is what I have.
> I have six port signals coming out of the i2c interface to generate
> the open-drain "scl" and "sda" outputs, namely:
> sda_pad_o - this is internally connected to ground, as you said (port
> type - out)
> scl_pad_o - this too is internally connected to ground, as you said (port
> type - out)
> sda_pad_oen - this is the buffer enable/disable output for sda (port
> type - out)
> scl_pad_oen - this is the buffer enable/disable output for scl (port
> type - out)
> sda_pad_i - this is the input from the sda buffer (port
> type - in)
> scl_pad_i - this is the input from the scl buffer (port
> type - in)
> 
> What I'm doing in the top-level design is connecting those signals to
> two IOBUFs, where the bidirectional pins of those two IOBUFs are used as
> the final sda and scl. I'm sure the connections are correct. When I do
> so, the translation phase gives the error "ERROR:924".
> 
> Please help me with this.
> 
> P.S.:
> I'm new to VHDL and FPGAs and I have a lot of questions, some of which are
> very silly. But symon knew VHDL since his birth and I'm proud of him.
> Let me know a better forum if I'm too inexperienced here.
> 
> http://groups.google.com/group/comp.arch.fpga/browse_thread/thread/7bf06cf65c8e6a76/d33a8d735c04af66?q=clk+clock+coffee&rnum=1&hl=en#d33a8d735c04af66
> or search for "clk clock coffee cocoa" in this forum.
> 
> Thank you.
> CMOS

Article: 90693
Subject: Re: using i2c core
From: "CMOS" <manusha@millenniumit.com>
Date: 18 Oct 2005 22:49:16 -0700

hi,
The IOBUF I'm using has 4 pins.
It is made of one tristate output buffer and one input buffer. The
input buffer's input and the tristate output buffer's output are connected
together and function as the IO port for the entity. The input of the
tristate output buffer is connected to the output from the core, which
is always grounded. Its enable/disable pin is also controlled by the
core. The output of the input buffer is connected to an input of the
core.

CMOS
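
The wiring described in this thread can be sanity-checked with a small behavioural model. This is a sketch in Python, not synthesizable code, and it rests on two assumptions: that sda_pad_oen is active low (0 means "drive the line") so it can feed the IOBUF T pin directly, and that the line has an external pull-up. The function names are made up for illustration.

```python
def iobuf_drive(i, t):
    """Value an IOBUF puts on its IO pin; None means high impedance (T = 1)."""
    return None if t else i


def open_drain_line(drivers):
    """Resolve a pulled-up line: low if any device drives 0, otherwise high."""
    return 0 if any(d == 0 for d in drivers) else 1


def sda_seen_by_core(sda_pad_o, sda_pad_oen, other_devices):
    """Return the value that appears on sda_pad_i.

    Mapping modelled here: I <= sda_pad_o, T <= sda_pad_oen, O => sda_pad_i.
    other_devices lists what each other bus device drives (0 or None).
    """
    fpga = iobuf_drive(i=sda_pad_o, t=sda_pad_oen)
    return open_drain_line([fpga] + list(other_devices))


# Core releases the line and nobody else pulls it low: pull-up wins.
print(sda_seen_by_core(sda_pad_o=0, sda_pad_oen=1, other_devices=[None]))  # 1
# Core enables its driver; sda_pad_o is tied to ground, so the line goes low.
print(sda_seen_by_core(sda_pad_o=0, sda_pad_oen=0, other_devices=[None]))  # 0
# Core releases the line but a slave holds it low (e.g. an ACK).
print(sda_seen_by_core(sda_pad_o=0, sda_pad_oen=1, other_devices=[0]))     # 0
```

If the .I and .O connections are swapped, the core's input ends up tied to a signal that is not the buffered pad value, which is the kind of mistake the reply above warns about.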

Article: 90694
Subject: Re: Data2Mem usage - help required
From: backhus <nix@nirgends.xyz>
Date: Wed, 19 Oct 2005 07:51:10 +0200

Hi Robert,
sorry, I forgot to mention that.
I fell into that very same trap when I did my first steps with data2mem.
  :-)

Have a nice synthesis
   Eilert


Robert schrieb:
> Just to add:
> 
> I thought the syntax for data2mem in the documentation was misleading.
> Or maybe it's me. There it says:
> 
> data2mem -bm|bd infile.[bmm|elf|mem] [options]
> 
> This led me to think that I should give EITHER a bmm or a mem file as
> input, as opposed to both, which are required.
>

Article: 90695
Subject: How to speed up the critical path (Xilinx)
From: "starbugs" <starbugs@gmx.net>
Date: 19 Oct 2005 01:39:32 -0700

Hi there,

I would be happy about some suggestions on how I could start to make my
design faster. My design is a processor (18 bit datapath) and the
critical path looks like this:

1. Instruction register (containing number of register)
2. Register file (distributed RAM)
3. Mux (2-way, selects either register or RAM)
4. Mult18x18 within the ALU
5. Mux (ALU output selector)
6. Register file (distributed RAM)

Target is a Spartan3 speed grade 4. I ran PAR at highest effort.
Timing-driven mapping makes no difference. The report (after PAR) says:

    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tiockiq               0.259   EX_Instr_adr1_1
    net (fanout=18)       2.114   EX_Instr_adr1<1>
    Tilo                  0.608   regs_a10_Mram_RAM_inst_ramx_0.F
    net (fanout=2)        0.693   EX_Regs1do<10>
    Tilo                  0.608   data1mux_Mmux_q_Result<10>1
    net (fanout=5)        2.617   EX_Data1<10>
    Tmult                 3.493   alu_Mmult_prod_inst_mult_0
    net (fanout=1)        2.378   alu_prod<14>
    Tilo                  0.550   alu_result<14>16
    net (fanout=3)        1.061   EX_Data3<14>
    Tds                   0.519   regs_a14_Mram_RAM_inst_ramx_0.F
    ----------------------------  ---------------------------
    Total                14.900ns (6.037ns logic, 8.863ns route)
                                  (40.5% logic, 59.5% route)

Now how could I start improving the design? I don't want to split this
up into two cycles (because instruction level parallelism is low and I
need one result to compute the next).

I notice that the net delay of the instruction register is quite high.
Does this have to do with the fanout? Fanout is 18 (because the value
is used as an address to 18 parallel distributed RAM LUTs). I've heard
of duplicated registers. Would that help? And then, how would I achieve
it? Automatically through a setting? Manually? Is there an elegant way
to do it?

Another thing I've heard about is RLOC constraints. I never dared try
them so far. Do you think I could improve the design, and by how much?

Of course, I highly appreciate any (other?) suggestions on how to speed
up my design. I might also consider changing the architecture, if it
doesn't mean I have to change the whole concept of my processor.

Also, I am looking for good literature on FPGA implementation.

Thanks in advance!
K.B.
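
The timing report in this post can be broken down mechanically to see where the 14.9 ns goes. The sketch below copies the delays from the report; the variable names are illustrative, and the maximum clock rate it prints is just the reciprocal of this one path, ignoring clock skew and uncertainty.

```python
# (kind, resource or net name, delay in ns), copied from the report above
PATH = [
    ("logic", "Tiockiq", 0.259),
    ("route", "EX_Instr_adr1<1>", 2.114),
    ("logic", "Tilo", 0.608),
    ("route", "EX_Regs1do<10>", 0.693),
    ("logic", "Tilo", 0.608),
    ("route", "EX_Data1<10>", 2.617),
    ("logic", "Tmult", 3.493),
    ("route", "alu_prod<14>", 2.378),
    ("logic", "Tilo", 0.550),
    ("route", "EX_Data3<14>", 1.061),
    ("logic", "Tds", 0.519),
]

logic_ns = sum(delay for kind, _, delay in PATH if kind == "logic")
route_ns = sum(delay for kind, _, delay in PATH if kind == "route")
total_ns = logic_ns + route_ns
fmax_mhz = 1000.0 / total_ns

# Nets sorted by delay, worst first: the best candidates for placement work.
nets_by_delay = sorted(
    (entry for entry in PATH if entry[0] == "route"),
    key=lambda entry: entry[2],
    reverse=True,
)

print(f"logic {logic_ns:.3f} ns ({100 * logic_ns / total_ns:.1f}%), "
      f"route {route_ns:.3f} ns ({100 * route_ns / total_ns:.1f}%)")
print(f"total {total_ns:.3f} ns -> about {fmax_mhz:.1f} MHz")
for _, net, delay in nets_by_delay:
    print(f"  {net:<20} {delay:.3f} ns")
```

Three nets (EX_Data1<10>, alu_prod<14> and EX_Instr_adr1<1>) account for about 7.1 ns of the 8.9 ns of routing, so the routing into and out of the multiplier matters at least as much as the fanout-18 address net.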

Article: 90696
Subject: Re: How to speed up the critical path (Xilinx)
From: "starbugs" <starbugs@gmx.net>
Date: 19 Oct 2005 01:49:18 -0700

What I forgot to say:
1. In case you wonder what the multiplexer (3rd thing in the path) is
for: operands can come either from registers or from a RAM (pipelined
with a register before the mux), that's why I need the mux here.

2. The scarce resource in my design is block RAM, so it doesn't matter
if it uses more area as long as that makes it faster. For some reason,
if I set the mapper's optimization goal to "speed" instead of "area",
the design gets even _slower_.

Article: 90697
Subject: Re: How to speed up the critical path (Xilinx)
From: Zara <nospam.yozara@terra.es>
Date: Wed, 19 Oct 2005 09:40:06 GMT

On 19 Oct 2005 01:39:32 -0700, "starbugs" <starbugs@gmx.net> wrote:

>Hi there,
>
>I would be happy about some suggestions on how I could start to make my
>design faster. My design is a processor (18 bit datapath) and the
>critical path looks like this:
>
(...)
>I notice that the net delay of the instruction register is quite high.
>Does this have to do with the fanout? Fanout is 18 (because the value
>is used as an address to 18 parallel distributed RAM LUTs). I've heard
>of duplicated registers. Would that help? And then, how would I achieve
>it? Automatically through a setting? Manually? Is there an elegant way
>to do it?
>

Yes, your problem is fanout. Duplicating registers could be the
solution, but it has two main drawbacks:
- It usually uses more FPGA registers
- It may only move the fanout problem to the routing *before* the
register. You could kill this problem only to create a new one.

Try experimenting with the synthesis/implementation options. For
instance, you could try setting the maximum fanout to 10. In general, don't
try to optimise by hand; let the Xilinx tools try to do it.

Set the optimization goal to speed, use timing constraints... try
everything if possible.

BTW, I would register the multiplier output, call it
"productResultRegister" and use this register as another one from the
register file. This way, the instruction set might grow a little but
timing might improve. (Test before making any radical changes!)
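
As a rough picture of what a fanout limit of 10 would mean for the fanout-18 address register, the arithmetic is sketched below. This only models the idea of register duplication with an even split of loads; how a given synthesis tool actually replicates and distributes loads is not specified here, and the function names are hypothetical.

```python
import math


def replicas_needed(fanout, max_fanout):
    """Copies of a register needed so that no copy drives more than max_fanout loads."""
    return math.ceil(fanout / max_fanout)


def loads_per_replica(fanout, max_fanout):
    """Split the loads as evenly as possible across the required copies."""
    copies = replicas_needed(fanout, max_fanout)
    base, extra = divmod(fanout, copies)
    return [base + 1] * extra + [base] * (copies - extra)


print(replicas_needed(18, 10))    # 2 copies of the address register bit
print(loads_per_replica(18, 10))  # [9, 9]
```

Each extra copy is also one more load on whatever drives the register's input, which is the second drawback listed above.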

Article: 90698
Subject: Implementation of 1024 point FFT in Actel FPGA
From: "cisivakumar" <cisivakumar@gmail.com>
Date: 19 Oct 2005 02:55:50 -0700

Hi,

   My main project is the implementation of a 1024-point FFT in an
Actel FPGA. I also have to find a new frequency identification algorithm
other than the Fast Fourier Transform. Please give valuable notes, code and
suggestions for successfully completing this project.

Thanking you.
I.Sivakumar
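
One commonly cited alternative when only a few known frequencies need to be detected is the Goertzel algorithm, which evaluates a single DFT bin with one multiply-accumulate recurrence per sample. Whether it suits this project is an assumption, since the post does not say how many frequencies are of interest. The sketch below is a floating-point Python reference model of the kind that could be used to check a hardware implementation; the test signal and bin numbers are made up.

```python
import math


def goertzel_power(samples, k):
    """Return |X[k]|^2, the squared magnitude of DFT bin k of samples."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0  # the two previous states of the recurrence
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


N = 1024
TONE_BIN = 50  # hypothetical test tone, exactly on bin 50
signal = [math.sin(2.0 * math.pi * TONE_BIN * i / N) for i in range(N)]

on_bin = goertzel_power(signal, TONE_BIN)       # expect about (N / 2) ** 2
off_bin = goertzel_power(signal, TONE_BIN + 1)  # expect about 0
print(f"bin {TONE_BIN}: {on_bin:.1f}, bin {TONE_BIN + 1}: {off_bin:.6f}")
```

Each bin needs only one real multiplier and two state registers, so for a handful of target frequencies it may use far less logic than a full 1024-point FFT. For many bins the FFT is more efficient.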

Article: 90699
Subject: which is Low power FPGA?
From: "himassk" <himassk@gmail.com>
Date: 19 Oct 2005 02:58:35 -0700

Hi,

Please advise me on the low-power FPGAs available in the market.
Which has the lowest power consumption among Xilinx Spartan-3L,
Altera Cyclone II, Actel ProASIC3 and Lattice? Is there any review
available on this?

In my opinion ProASIC3 has low power, but it's a flash-based FPGA.
Are there any disadvantages to flash-based FPGAs?

Thanks in advance.

Best regards,
HimaSSK.
