Mika,

It might be worth trying a recent release of ispLEVER software for Linux. Lattice qualifies it under Red Hat Enterprise v3, so it may run fine under your Linux distribution. I'd expect the Linux version to run a bit faster than WINE; however, for the little CPLDs it may not make a difference either way.

Troy Scott
Lattice Semiconductor
troy.sc...@latticesemi.com

Article: 90676
Ed McGettigan wrote:
> The FPGA gate counts include not only the logic that can be built in the
> LUTs, but also the registers, the carry logic, the CLB muxes, the BlockRAMs,
> the clock DCMs, the IOB registers, DSP logic (not in the XC3S400).

Ed, just out of curiosity, how many transistors, roughly, are there in an XC3S400? That is, total in the package, including RAMs, configuration, routing, redundancy, everything?

I'm curious to know how many transistors you pack into a package compared to, say, a 55 million transistor Pentium 4.

I know it doesn't actually mean anything to the user. Again, just curious.

Thanks,
Paul.

Article: 90677
Brannon wrote: > You have a good point. I should have clarified. The subtraction only > uses two inputs per LUT. The same is the case with what you're > proposing for the compare. It seems to me you could double the density > by adding an extra switch bit to the MUXCY because the LUT already > supports four inputs. Might be good for compare, but since the LUT still has only one output you couldn't build an adder/subtracter, for which the carry chain was optimized.Article: 90678
So to clarify: You want 2 bits of "a" and 2 bits of "b" to drive the 16-entry LUT for the carry chain select AND you want an 8-entry LUT for the Direct Input of the carry chain mux? "Brannon" <brannonking@yahoo.com> wrote in message news:1129659685.077987.171750@g14g2000cwa.googlegroups.com... > You have a good point. I should have clarified. The subtraction only > uses two inputs per LUT. The same is the case with what you're > proposing for the compare. It seems to me you could double the density > by adding an extra switch bit to the MUXCY because the LUT already > supports four inputs. >Article: 90679
Thanks a lot Ed!

Does that mean that if I want to roughly estimate the ASIC gates in a general design, I can just multiply the number of LUTs by 10-15?

Article: 90680
Paul Marciano wrote:
> Ed McGettigan wrote:
>
>> The FPGA gate counts include not only the logic that can be built in the
>> LUTs, but also the registers, the carry logic, the CLB muxes, the BlockRAMs,
>> the clock DCMs, the IOB registers, DSP logic (not in the XC3S400).
>
> Ed, just out of curiosity how many transistors, roughly, are there in
> an XC3S400? That is, total in the package - including RAMs,
> configuration, routing, redundancy, everything?
>
> I'm curious to know how many transistors you pack into a package
> compared to, say, a 55 million transistor Pentium4.
>
> I know it doesn't actually mean anything to the user - again... just
> curious.

We counted transistors internally for a while, as we thought it was an interesting statistic. But it's actually a very hard problem to handle: our devices are almost 100% full custom, and sometimes the legs of a transistor may not be clearly defined, or may be split between different submodules. The auto-reporting functions of the CAD tools were also spitting out numbers that were way too high. We stopped doing this in detail for Virtex-4, as there was no real benefit in knowing the exact number except for bragging rights.

We had a press release for the Virtex-II Pro 2VP100 part that stated 430 million transistors back in 2003:
http://www.xilinx.com/prs_rls/xil_corp/03133taiwan.htm

I think that we were estimating about 1 billion transistors for the Virtex-4 LX200 parts that we are shipping now.

I'm not as familiar with the Spartan-3 line, but I think that you are looking at about 30-35 million for the XC3S400, which has a configuration size of 1.7 Mbit.

Ed

Article: 90681
Check out: http://www.fpgajournal.com/articles_2005/20051018_hardi.htm And the other articles this week all on using FPGAs to prototype ASICs. Austin jerryzy@gmail.com wrote: > Thanks a lot Ed! > > Does that mean if I want to roughly estimate the ASIC gates in a > general design, I can just multiple the LUT number by 10-15? >Article: 90682
Neophyte looking for help installing WebPACK 7.1. Being a student has denied me access to WebCase. Any insight would be mucho appreciated.

Article: 90683
Hi Symon,

fig. 11 in XAPP622 is about direct LVDS output. I am doing the parallel-to-LVDS conversion with an external IC. My VHDL latch looks like this:

    process (pixel_clk, screen_reset)
    begin
      if screen_reset = '1' then
        lvds1 <= "0000000";
        lvds2 <= "0000000";
        lvds3 <= "0000000";
        lvds4 <= "0000000";
      elsif rising_edge(pixel_clk) then
        lvds1 <= lvds1pre;
        lvds2 <= lvds2pre;
        lvds3 <= lvds3pre;
        lvds4 <= lvds4pre;
      end if;
    end process;

pixel_clk is my master clock. The lvds signals contain pixel color + control signals for the LCD.

Somehow the results are worse now than before. I am running a 66 MHz pixel_clk at the moment. Is it necessary to take special care of the pixel_clk output? My pixel_clk output is on a GCLK pin. I generate my pixel_clk like this:

    dcm2_1 : dcm2
      port map (
        CLKIN_IN   => clk_ibufg,
        RST_IN     => RESET,
        CLKFX_OUT  => clk_180m,
        CLKDV_OUT  => pixel_clk,
        LOCKED_OUT => lvds_locked,
        CLK0_OUT   => open);

where clk_ibufg is a 100 MHz input clock that gets divided by 1.5. I have connected the output "pixel_clk" of my entity directly to the DCM.

What is wrong, or what should I improve?

regards,
Benjamin

Article: 90684
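As a cross-check on the DCM clocking Benjamin describes, the frequencies can be worked out with a few lines of Python. This is only a sketch: the divide-by-1.5 on CLKDV comes from the post, while the CLKFX multiply/divide pair of 9/5 is a hypothetical choice that would give the 180 MHz implied by the signal name clk_180m; the post does not state the actual settings.

```python
from fractions import Fraction

def dcm_outputs(clkin_mhz, clkdv_divide, clkfx_multiply, clkfx_divide):
    """Return (CLKDV, CLKFX) frequencies in MHz as exact fractions."""
    clkin = Fraction(clkin_mhz)
    clkdv = clkin / Fraction(clkdv_divide)
    clkfx = clkin * clkfx_multiply / clkfx_divide
    return clkdv, clkfx

def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000 / Fraction(freq_mhz)

# 100 MHz input, CLKDV divide of 1.5 (from the post),
# CLKFX multiply 9 / divide 5 (assumed, not from the post).
pixel_clk, clk_fx = dcm_outputs(100, Fraction(3, 2), 9, 5)
print(float(pixel_clk), float(clk_fx), float(period_ns(pixel_clk)))
```

With these numbers the pixel clock is 200/3 MHz (about 66.7 MHz), so every output register has a 15 ns period to share between clock-to-out, board delay and the external serializer's setup time.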
Hi,

I have looked in the FPGA Editor. My VHDL code really does use the FFs inside the IOBs. However, I saw that my clock output goes directly to the pad inside the IOB; furthermore, the routing inside the FPGA doesn't look good. The lvds_clk signal that reaches the IOB where the clk pin is located is named "lvds_clk_OBUF", so I guess the compiler put an OBUF on it :)

I remember that there was an OBUFG for clocks... Is it better to use this instead? If yes, how do I implement that in VHDL?

Antti said "use DCM to adjust phy clock phase". How do I do that in my case?

I made a mistake in my last post: the output defined in the entity of my design is lvds_clk. Somewhere in the code I just do lvds_clk <= pixel_clk.

Thank you :)

regards,
Benjamin

Article: 90685
jerryzy@gmail.com wrote:
> Thanks a lot Ed!
>
> Does that mean if I want to roughly estimate the ASIC gates in a
> general design, I can just multiple the LUT number by 10-15?
>

It really depends on what your logic is actually doing in the LUT. If it's an XOR it would be accurate; if it's an AND it wouldn't. We have an old XAPP on this subject that you can look through for more info:

http://direct.xilinx.com/bvdocs/appnotes/xapp059.pdf

In my original analysis of your HDL code I didn't include any area optimization for sharing logic functions, which is likely happening.

The right way to do this is to resynthesize your design to a target ASIC library and then use the reported gate counts.

Ed

Article: 90686
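The 10-15 gates-per-LUT rule of thumb discussed in this exchange can be written down as a small range estimate. This is a sketch only: the multipliers are the figures quoted in the thread, the LUT count in the example is made up, and, as Ed points out, the real answer depends on what each LUT implements, so resynthesis to an ASIC library remains the reliable method.

```python
def asic_gate_estimate(lut_count, gates_per_lut_low=10, gates_per_lut_high=15):
    """Return a (low, high) ASIC gate estimate for a given number of LUTs.

    The default multipliers are the rough 10-15 range from the thread,
    not a vendor-specified conversion factor.
    """
    if lut_count < 0:
        raise ValueError("lut_count must be non-negative")
    return lut_count * gates_per_lut_low, lut_count * gates_per_lut_high

# Hypothetical design using 1000 LUTs.
low, high = asic_gate_estimate(1000)
print(f"roughly {low} to {high} ASIC gates")
```

Reporting a range rather than a single number keeps the uncertainty visible; an AND-heavy design would sit well below the low end.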
Hi... again :)
First, thanks for such a nice response.

As everyone suggested, I should not try to use a 1- or 2-layer board but go for four layers instead. The problem is that 4-layer PCBs are not available in my country (Pakistan), and as the RAM is just one part of my final year project (designing a gaming console using an FPGA), I obviously do not have much time to get a 4-layer PCB from abroad.

MY NEED
I need only 1 MB to 4 MB of RAM with a better speed (as I also want to use it as video RAM). I also tried to get chips, but only 32 KB parts are available here, with a slow speed of 85 ns. So I contacted Micron's office in China (as they are not present in my country), but they responded that they only send shipments of at least $500, and I need only one or two 16 MB SDRAM chips.

Please suggest:

Query_1: Where can I get a cheap SDRAM chip (preferably Micron, as I have worked with it) and also video RAM? More importantly, I need just one or two :(

Query_2: Where can I get a cheap 4-layer PCB, or some alternative to it?

Query_3: The major area where I need more RAM (and more speed) is video RAM. Please suggest how I can reduce this need :) If I can reduce it and manage with much less, then I can try to use a series of the smaller RAM chips.

Waiting for a response.

Article: 90687
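On Query_3 in the post above, the video RAM requirement follows directly from resolution and colour depth, so it can be estimated before choosing parts. The sketch below is illustrative: the resolutions and bit depths are hypothetical examples, not values from the post; only the 32 KB chip size comes from it.

```python
def framebuffer_bytes(width, height, bits_per_pixel, buffers=1):
    """Bytes needed to hold `buffers` full frames, rounded up to whole bytes."""
    bits = width * height * bits_per_pixel
    return ((bits + 7) // 8) * buffers

def chips_needed(total_bytes, chip_bytes):
    """Number of equal-sized memory chips needed to hold total_bytes."""
    return -(-total_bytes // chip_bytes)  # ceiling division

# Hypothetical display modes.
vga_16bpp = framebuffer_bytes(640, 480, 16)   # 614400 bytes
qvga_8bpp = framebuffer_bytes(320, 240, 8)    # 76800 bytes

# The 32 KB parts mentioned in the post.
print(vga_16bpp, qvga_8bpp, chips_needed(qvga_8bpp, 32 * 1024))
```

Lowering the resolution or using a palette (fewer bits per pixel) shrinks the requirement quickly; whether 85 ns parts can also meet the pixel rate is a separate bandwidth question that this sketch does not cover.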
We have an immediate requirement for a Verilog designer for a short-term (approx. 3 month) period. This job involves completing two projects already underway. Both projects are embedded communications systems. In both cases the targets are Xilinx Spartan 3 FPGAs. In one case the design includes the Xilinx Microblaze soft processor. The tool chain is Synplify, Xilinx EDK and Xilinx ISE. The location is Scarborough (north-east Toronto in Ontario, Canada). For more information contact Greg Neff at (416) 293-8263. ================================ Greg Neff VP Engineering *Microsym* Computers Inc. greg@guesswhichwordgoeshere.comArticle: 90688
fahadislam2002 wrote:
> Hi...again :)
> First thanks for such a nice response...
>
> As everyone suggested that donot try to use 1 or 2 layer but instead
> go for four layer ...
> but the problem is ... in my country (Pakistan) 4-layer is not
> available... and as RAM is just a part of my final year project
> (Designing of Gaming Console using FPGA) so obviously i donot have
> much time to get 4-layer PCB from Abroad.
> MY NEED
> As I need only 1MB to 4MB RAM with a better speed (as also want
> to use it as Video RAM :? )
> So I also tried to get chips.But here only 32KB is available
> with a worst speed of 85 ns.
> So for chips I contacted to Micron,s office in China (as not
> present in my country).But they responded that they send a shipment
> of atleast of $500...and I need only one or two SDRAM Chips of 16MB.
> :D
> Please Suggest
>
> Query_1:-
> Where to Get a Cheap SDRAM chip (Better if micron
> as i have done work with it) and also Video RAM .And more
> importantly that i need just one or two :(
> Query_2:-
> Where to get cheap 4-layer PCB or some alternative
> of it.
> Query_3:-
> Major area where i need more RAM (and also its
> speed) is for Video RAM ... suggest how can reduce this
> need ... :)
> ... as if reduce it and get manage it on very less ... then can
> try to use series of chips ... of RAM .
>
> Waiting for Response :x
>

I haven't been following this thread; however, it looks like you started out by designing for RAM modules (i.e. PCBs containing several DRAM chips). It then looks like you decided that routing this on a 1- or 2-layer PCB would be too complex, so you'd like to go to a 16 MB DRAM chip.

I'd recommend that in all cases you will really need a 4-layer board. This is especially true if you've only done a few board designs, and certainly if you've never successfully completed one involving high-speed interconnects.

There are many good fast-turn prototype board houses in the US; many will produce 4-layer boards in under a week. So, including shipping, you could have one in less than 2 weeks. The cost is also quite reasonable, maybe a couple of hundred dollars for a one-off.

You could then choose whether you'd prefer to route for a single chip (or maybe a couple) or whether you'd prefer to include a RAM module slot (which is what I'd do, based on what the design seems to be for).

If you request some academic samples you may be able to obtain a couple. I've had good experience with Micron in this regard before. Try mentioning what you want the samples for when you talk to your local distributor. If they're not helpful, try a distributor slightly further afield.

I'm surprised that you only mention issues with routing for your memory modules. Have you had no other routing problems, such as the FPGA -> host etc.? 2 layers is really not that much when you've got to include power and ground on them... How are you getting your bitstream to the FPGA?

Article: 90689
John, this is exactly the information I'm looking for: a user with hands-on experience. I tend to agree with you that Arrow may earn their money, since this is a new tool flow with new devices and new bugs. While I have done several ASICs in the past, the flow was to toss the netlist over the fence after a few checks, run the netlist with extracted delays from place and route, and maybe fix a few setup/hold problems. They always worked. With these submicron geometries and all of their second-order effects, the process is much more involved.

Yes, the NRE is high, but for a full custom it's even higher. I'm preaching to the choir on that.

Any idea what the hourly rate for Synopsys PrimeTime would be? Also, any idea how long it would take to do a timing analysis on 2 million gates, single clock, totally synchronous design? NO GATED CLOCKS!

> From experience I'd rather deal with Arrow any day. LSI ARE sticklers
> and can be a real pain to work with.

LSI has a methodology that works. I hated procmon. My LSI chips always worked.

> - Detailed place and route time - very dependent on the size /
> complexity of the chip. But very fast compared to an ASIC 6-8 weeks is
> a good rule of thumb

I'm somewhat surprised it takes that long for a structured ASIC. I need to understand better why it takes so long.

Another question: what is your typical gate utilization?

Again, thanks for your time, John.

Regards
Jerry

"John B" <ctlmdjb@gmail.com> wrote in message news:1129648958.692373.274090@o13g2000cwo.googlegroups.com...
> Jerry - we've designed (are designing) several LSI RapidChips.
> Here's some comments on your posting:
>
> - Costs are pretty out there. You need to get to a Rapid Ready Netlist
> using the LSI tool flow (LSI RapidWorx). Depends where you are coming
> from on how hard that is. If it's an FPGA design, expect to re-map
> the memories, re-do the I/Os, maybe change some IP, re-verify. If
> it's a clean design it's a case of designing for the technology.
> - You'll pay; LSI NRE ($50-200K depending on chip chosen) for Netlist
> to chip. Synplicity tool costs ($20K or so) and that should be it.
> RapidWorx is free. You'll need Synopsys PrimeTime for STA - if you
> haven't got it that can be pricey, or you can pay someone to do it
> for you on an hourly rate
> - Layout is easy as long as you follow the rules - so, one iteration.
> The rules cover things like max speed (typically 250MHz) and getting
> the RTL through the design rule checker (Tera Systems, Teraform tool).
> The rules can be broken, but that's when physical becomes trickier
> and you can expect to pay some additional NRE
> - Device delivery schedule.....won't lie to you here, LSI is still
> putting chips through a new tool flow and we've run into several
> gotchas that delayed things. By now hopefully things are cleaned up
> (they've done about 50 chips). But don't believe their 'as easy
> as an FPGA' hype - it isn't. Learning the tools takes
> some time.....the tools still have some bugs. Working with someone
> who's done it before is a huge benefit.
> - Synplicity does a 'placed netlist' synthesis with pretty accurate
> timing.....except, one design we did had a massive fan out on the AMBA
> bus. We warned the end customer but he went ahead anyway. The final
> layout tool threw in a whole bunch of buffers and that screwed things
> up compared to Synplicity results. Otherwise the match has been very
> accurate.
> - Detailed place and route time - very dependent on the size /
> complexity of the chip. But very fast compared to an ASIC 6-8 weeks is
> a good rule of thumb
> - Test scan is built into the chip by LSI so you should have no worries
> there. You hand off RTL, they add clocks, scan chains etc. No need to
> worry about test coverage, signal integrity (as long as you follow the
> rules)
> - Paying distributor - LSI uses Arrow (in North America) to deal
> with most customers who aren't Cisco etc. Arrow has a design center
> and will do most (not all) of the post Netlist engineering. So yes, you
> pay them. But they are highly professional and experienced. Unless
> you're a high volume big guy in which case you might deal direct.
> From experience I'd rather deal with Arrow any day. LSI ARE sticklers
> and can be a real pain to work with.
>
> Feel free to Email me with any questions or you can talk directly to
> one of our engineers (within reason, we have to make a living too!)
>

Article: 90690
I think you did a fine job.

"John B" <ctlmdjb@gmail.com> wrote in message news:1129649524.280207.212790@g49g2000cwa.googlegroups.com...
> Sorry just read through that and realised I didn't do a very good job
> on the flow:
> 1) Enter the design in LSI RapidWorx
> 2) Check the design in TeraSystems Teraform
> 3) Synthesize the design in Synplicity Amplify
> 4) Check STA in Synopsys PrimeTime
> 5) Hand off netlist to LSI or 3rd party.
> John (john.burton@octera.com)
>

Article: 90691
I've done that already. I'm using an IOBUF at the port, but at the translation stage it gives an error "ERROR:924", which says the port is connected to a non-buffered input.

In detail, this is what I have. I've got six port signals coming out of the I2C interface to generate the open-drain "scl" and "sda" outputs, namely:

sda_pad_o - this is internally connected to ground, as you said. (port type - out)
scl_pad_o - this too is internally connected to ground, as you said. (port type - out)
sda_pad_oen - this is the buffer enable/disable output for sda. (port type - out)
scl_pad_oen - this is the buffer enable/disable output for scl. (port type - out)
sda_pad_i - this is the buffer enable/disable output for sda. (port type - in)
scl_pad_i - this is the buffer enable/disable output for scl. (port type - in)

What I'm doing in the top-level design is connecting those signals to two IOBUFs, where the bidirectional pins of those two IOBUFs are used as the final sda and scl. I'm sure the connections are correct. When I do so, the translation phase gives the error "ERROR:924".

Please help me with this.

P.S.:
I'm new to VHDL and FPGAs and I've got a lot of questions, some of which are very silly. But Symon has known VHDL since his birth and I'm proud of him. Let me know of a better forum if I'm too inexperienced here.

http://groups.google.com/group/comp.arch.fpga/browse_thread/thread/7bf06cf65c8e6a76/d33a8d735c04af66?q=clk+clock+coffee&rnum=1&hl=en#d33a8d735c04af66
or search for "clk clock coffee cocoa" in this forum.

Thank you.
CMOS

Article: 90692

"im sure the connections are correct"

Is the IOBUF's input hooked up to the output from the core (which you found is always ground), and is the IOBUF's output hooked up to the input of your core? I've mentally stumbled on the .O feeding my inputs and not being what I drive, and vice versa on the other side. The .I is the input *to* the IOBUF and the .O is the output *from* the IOBUF.

It's a shot.

- John_H

CMOS wrote:
> i've done that already. im using IOBUF at the port. but at the
> translation stage it gives an error "ERROR:924" which says the port is
> connected to a non-buffered input.
> in detail this is what i have.
> i got six port signals coming out from the i2c interface to generate
> that open drain "scl" and "sda" outputs. namely,
> sda_pad_o - this is internally connected to ground as you said. (port
> type -out)
> scl_pad_o - this too is internally connected to ground as you said.(port
> type -out)
> sda_pad_oen - this is the buffer enable/disable output for sda.(port
> type -out)
> scl_pad_oen - this is the buffer enable/disable output for scl.(port
> type -out)
> sda_pad_i - this is the buffer enable/disable output for sda.(port
> type - in)
> scl_pad_i - this is the buffer enable/disable output for scl.(port
> type - in)
>
> what im doing in the top level design is to connect those signals to
> two IOBUF's, where bidirectional pin of those two IOBUF's are used as
> the final sda and scl. im sure the connections are correct. when i do
> so, in the translation phase it gives the error "ERROR:924".
>
> Please help me on this.
>
> p.s:
> Im new to vhdl and FPGA and i got lot of questions, some of which are
> very silly. But symon knew VHDL since his birth and im proud of him.
> let me know a better forum if im too in-experienced here.
>
> http://groups.google.com/group/comp.arch.fpga/browse_thread/thread/7bf06cf65c8e6a76/d33a8d735c04af66?q=clk+clock+coffee&rnum=1&hl=en#d33a8d735c04af66
> or search for "clk clock coffee cocoa" in this forum.
>
> Thank you.
> CMOS

Article: 90693
Hi,

The IOBUF I'm using has 4 pins. It is made of one tristate output buffer and one input buffer. The input buffer's input and the tristate output buffer's output are connected together and function as the I/O port of the entity. The input of the tristate output buffer is connected to the output from the core, which is always grounded. Its enable/disable pin is also controlled by the core. The output of the input buffer is connected to an input of the core.

CMOS

Article: 90694
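The open-drain arrangement CMOS describes (data input tied to ground, the enable deciding whether the pad is pulled low or released) can be modelled behaviourally in a few lines. This is a sketch under assumptions: it treats the enable as active-low (0 means "drive"), which the posts do not state, and the function names are invented for illustration rather than taken from any core or library.

```python
def pad_level(pad_o, pad_oen):
    """Level a tristate buffer puts on the pad: pad_o when enabled, None when released.

    Assumes an active-low enable (pad_oen == 0 drives), which is an assumption
    about the core, not something confirmed in the thread.
    """
    return pad_o if pad_oen == 0 else None

def bus_level(driver_levels):
    """Wired-AND bus with a pull-up: any driver at 0 wins, otherwise the line reads 1."""
    return 0 if any(level == 0 for level in driver_levels) else 1

# Two devices on SDA, both with pad_o tied to ground (0).
released = bus_level([pad_level(0, 1), pad_level(0, 1)])  # nobody drives
pulled = bus_level([pad_level(0, 0), pad_level(0, 1)])    # first device pulls low
print(released, pulled)
```

The value returned by bus_level is what the input half of each IOBUF would feed back to its core, which is why the pad must be read through the buffer's output rather than connected straight to the core's input port.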
Hi Robert,

sorry, I forgot to mention that. I fell into that very same trap when I took my first steps with data2mem. :-)

Have a nice synthesis
Eilert

Robert wrote:
> Just to add:
>
> I thought the syntax for data2mem in the documentation was misleading.
> Or maybe it's me. There it says:
>
> data2mem -bm|bd infile.[bmm|elf|mem] [options]
>
> This led me to think that I should give EITHER a bmm or a mem file as
> input, as opposed to both, which are required.
>

Article: 90695
Hi there,

I would be happy about some suggestions on how I could start to make my design faster. My design is a processor (18-bit datapath) and the critical path looks like this:

1. Instruction register (containing number of register)
2. Register file (distributed RAM)
3. Mux (2-way, selects either register or RAM)
4. Mult18x18 within the ALU
5. Mux (ALU output selector)
6. Register file (distributed RAM)

Target is a Spartan3, speed grade 4. I ran PAR at highest effort. Timing-driven mapping makes no difference. The report (after PAR) says:

Delay type         Delay(ns)  Logical Resource(s)
----------------------------  -------------------
Tiockiq              0.259    EX_Instr_adr1_1
net (fanout=18)      2.114    EX_Instr_adr1<1>
Tilo                 0.608    regs_a10_Mram_RAM_inst_ramx_0.F
net (fanout=2)       0.693    EX_Regs1do<10>
Tilo                 0.608    data1mux_Mmux_q_Result<10>1
net (fanout=5)       2.617    EX_Data1<10>
Tmult                3.493    alu_Mmult_prod_inst_mult_0
net (fanout=1)       2.378    alu_prod<14>
Tilo                 0.550    alu_result<14>16
net (fanout=3)       1.061    EX_Data3<14>
Tds                  0.519    regs_a14_Mram_RAM_inst_ramx_0.F
----------------------------  ---------------------------
Total               14.900ns  (6.037ns logic, 8.863ns route)
                              (40.5% logic, 59.5% route)

Now how could I start improving the design? I don't want to split this up into two cycles (because instruction-level parallelism is low and I need one result to compute the next).

I notice that the net delay of the instruction register is quite high. Does this have to do with the fanout? Fanout is 18 (because the value is used as an address to 18 parallel distributed RAM LUTs). I've heard of duplicated registers. Would that help? And then, how would I achieve it? Automatically through a setting? Manually? Is there an elegant way to do it?

Another thing I've heard about is RLOC constraints. I never dared try them so far. Do you think I could improve the design, and by how much?

Of course, I highly appreciate any (other?) suggestions on how to speed up my design. I might also consider changing the architecture, if it doesn't mean I have to change the whole concept of my processor. Also, I am looking for good literature on FPGA implementation.

Thanks in advance!
K.B.

Article: 90696
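The post-PAR report in K.B.'s post can be tallied mechanically to see where the 14.9 ns goes. The sketch below copies the delay values from that report (with the resource names shortened for readability) and recomputes the logic/route split, the implied maximum clock rate, and the slowest nets. The frequency figure ignores clock skew and any other margins, which the post does not give.

```python
# (kind, label, delay in ns), in path order, taken from the quoted report.
PATH = [
    ("logic", "Tiockiq", 0.259),
    ("route", "EX_Instr_adr1<1>", 2.114),
    ("logic", "Tilo regs_a10", 0.608),
    ("route", "EX_Regs1do<10>", 0.693),
    ("logic", "Tilo data1mux", 0.608),
    ("route", "EX_Data1<10>", 2.617),
    ("logic", "Tmult", 3.493),
    ("route", "alu_prod<14>", 2.378),
    ("logic", "Tilo alu_result", 0.550),
    ("route", "EX_Data3<14>", 1.061),
    ("logic", "Tds", 0.519),
]

def summarize(path):
    """Total the logic and routing delays and derive the implied clock rate."""
    logic = sum(ns for kind, _, ns in path if kind == "logic")
    route = sum(ns for kind, _, ns in path if kind == "route")
    total = logic + route
    return {
        "logic_ns": round(logic, 3),
        "route_ns": round(route, 3),
        "total_ns": round(total, 3),
        "route_share": round(100 * route / total, 1),
        "fmax_mhz": round(1000 / total, 2),
    }

def worst_nets(path, count=3):
    """Return the `count` slowest routing segments as (delay, label) pairs."""
    nets = [(ns, label) for kind, label, ns in path if kind == "route"]
    return sorted(nets, reverse=True)[:count]

print(summarize(PATH))
print(worst_nets(PATH))
```

Sorting the nets this way shows that the instruction-register net K.B. asks about is only the third slowest; the nets into and out of the multiplier cost more, which suggests placement around the multiplier matters at least as much as the fanout of 18.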
What I forgot to say:

1. In case you wonder what the multiplexer (3rd thing in the path) is for: operands can come either from registers or from a RAM (pipelined with a register before the mux), that's why I need the mux here.

2. The scarce resource in my design is block RAM, so it doesn't matter if it uses more area as long as that makes it faster. For some reason, if I set the mapper's optimization goal to "speed" instead of "area", the design gets even _slower_.

Article: 90697
On 19 Oct 2005 01:39:32 -0700, "starbugs" <starbugs@gmx.net> wrote:

>Hi there,
>
>I would be happy about some suggestions on how I could start to make my
>design faster. My design is a processor (18 bit datapath)and the
>critical path looks like this:
>
(...)
>I notice that the net delay of the instruction register is quite high.
>Does this have to do with the fanout? Fanout is 18 (because the value
>is used as an address to 18 parallel distributed RAM LUTs). I've heard
>of duplicated registers. Would that help? And then, how would I achieve
>it? Automatically through a setting? Manually? Is there an elegant way
>to do it?
>

Yes, your problem is fanout. Duplicating registers could be the solution, but it has two main drawbacks:

- It usually uses more FPGA registers.
- It may only move the fanout problem to the routing *before* the register. You could kill this problem by creating a new one.

Try experimenting with the synthesis/implementation options. For instance, you could try setting the maximum fanout to 10. In general, don't try to optimise by hand; let the Xilinx tools try to do it. Set the optimization goal to speed, use timing constraints... try everything if possible.

BTW, I would register the multiplier output, call it "productResultRegister", and use this register as another one from the register file. This way, the instruction set might grow a little but timing might improve. (Test before making too radical changes!)

Article: 90698
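The "maximum fanout of 10" suggestion in the reply above amounts to simple arithmetic: how many copies of the register are needed so that no copy drives more than the limit. The sketch below shows only that arithmetic; how a given synthesis tool actually duplicates registers and assigns loads is tool-specific and not modelled here.

```python
import math

def replicas_needed(fanout, max_fanout):
    """Minimum number of register copies so that each drives at most max_fanout loads."""
    if fanout < 0 or max_fanout <= 0:
        raise ValueError("fanout must be >= 0 and max_fanout must be > 0")
    return max(1, math.ceil(fanout / max_fanout))

def split_loads(fanout, copies):
    """Distribute the loads as evenly as possible across the copies."""
    base, extra = divmod(fanout, copies)
    return [base + 1] * extra + [base] * (copies - extra)

# The address bit from the thread drives 18 distributed-RAM LUTs.
copies = replicas_needed(18, 10)
print(copies, split_loads(18, copies))
```

For the 18-load address net this gives two copies with nine loads each. Each extra copy also adds a load on whatever feeds the register, which is the second drawback listed in the reply.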
Hi,

For my main project I want to implement a 1024-point FFT in an Actel FPGA. I also have to find a new frequency identification algorithm other than the Fast Fourier Transform. Please give valuable notes, code and suggestions for successfully completing this project.

Thanking you.
I. Sivakumar

Article: 90699
Hi,

Please advise me on the low-power FPGAs available in the market. Which has the lowest power consumption among Xilinx Spartan3L, Altera CycloneII, Actel proASIC3 and Lattice? Is there any review available on this?

In my opinion proASIC3 has the lowest power, but it's a flash-based FPGA. Are there any disadvantages to flash-based FPGAs?

Thanks in advance.

Best regards,
HimaSSK.