> Actually, I work in an IC design company. My boss wants to develop a low-end DSP chip. However, we have little experience in this.
> We think that one of the important building blocks is a small 16x16 single-cycle multiplier.
> I wrote simple Verilog and synthesized it with the Xilinx Web Pack tools. It seems to work.
> Assuming it works, I want to open some output files to see what "circuit" is synthesized, because I will design a DSP chip. But I do not know which output files contain the gate-level netlist of the synthesized design.

It has been a while since I used Xilinx webpack. To support gate-level simulations, webpack can write the synthesized netlist to a standard Verilog or VHDL file. (To run a Verilog simulation, all you need are the included Xilinx primitives library and your gate-level netlist.)

> I guess that the Verilog code will be synthesized according to the synthesis tool's library. Am I correct? Can I force the synthesis tool to synthesize the Verilog code without using a library? (I mean the design is synthesized at gate level: AND, OR, XOR, ...)

Webpack will only target Xilinx's FPGA parts, which means it'll always target some kind of Xilinx primitives library. (That's mostly the LUT4 cell primitive.) Someone correct me if I'm wrong. If your ASIC vendor truly offers an 'FPGA->ASIC conversion flow', they surely will accept the Xilinx netlist 'as is.' The ASIC vendor will worry about the logical remapping between the FPGA library and the ASIC library.

> Then, can I see the netlist at gate level so that I can study the design synthesized by the synthesis tool?

Why do you care what it looks like? You can export the netlist to a structural Verilog (ASCII text) file. This is usually done if you want to run the netlist through Verilog simulations. If you want a 'graphical view' breaking down the netlist into intelligible AND, OR, NOT functions, then I don't know. Someone else needs to answer this question.
> You say that:
> > To make sure the synthesized design was synthesized correctly, do a gate-level simulation of the synthesized design.
> > You should be able to run the same testbench code you used for an RTL simulation.
>
> I do not really understand, because I am a beginner in IC design.
> What is the meaning of gate-level simulation? With what kind of tools? Modelsim? Xilinx? Or other?

Forgive me if I'm treating you like a novice. Let's start from the beginning.

"RTL-simulation" - your source RTL code (that's the Verilog code you used to synthesize your FPGA-DSP circuit) is instantiated in a top-level 'testbench file.' Then you have some waveform 'stimulus', i.e. you set some inputs on your DSP-model, advance the 'clock' waveform, then look at the DSP-model's outputs!

A gate-level simulation works the same way. The difference is the device under test -- instead of the RTL-source code, here you instantiate the synthesized-netlist in your testbench. Once again, you drive the netlist's inputs, advance the clock, then check the outputs. If the netlist is functionally identical to the RTL-source, then the outputs should agree 100%. If they don't agree, you get to do some "fun" debugging!

The Verilog-simulator isn't part of webpack. You have to buy that separately. (I think Modeltech is popular for this kind of thing.)

Article: 45451
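The RTL versus gate-level simulation flow described in the reply above can be sketched with a small self-checking testbench. This is only an illustration under stated assumptions: the module name `mult16`, its ports, and the registered output are hypothetical and do not come from the original poster's design. For a gate-level run, the behavioral `mult16` below would be replaced by the structural netlist written out by the tools (compiled together with the vendor primitives library), while `tb_mult16` stays unchanged.

```verilog
`timescale 1ns/1ps

// Behavioral stand-in for the device under test (hypothetical name and ports).
// For gate-level simulation, compile the exported netlist instead of this module.
module mult16 (
    input  wire        clk,
    input  wire [15:0] a,
    input  wire [15:0] b,
    output reg  [31:0] p
);
    always @(posedge clk)
        p <= a * b;              // registered 16x16 product (assumed interface)
endmodule

module tb_mult16;
    reg         clk = 1'b0;
    reg  [15:0] a, b;
    wire [31:0] p;
    integer     i, errors;

    mult16 dut (.clk(clk), .a(a), .b(b), .p(p));

    always #5 clk = ~clk;        // free-running stimulus clock

    initial begin
        errors = 0;
        for (i = 0; i < 100; i = i + 1) begin
            a = $random;         // drive the inputs
            b = $random;
            @(posedge clk);      // advance the clock; inputs are sampled here
            #1;                  // let the registered output update
            if (p !== a * b) begin
                errors = errors + 1;
                $display("MISMATCH: %0d * %0d gave %0d", a, b, p);
            end
        end
        $display("done, %0d mismatches", errors);
        $finish;
    end
endmodule
```

With a timing-annotated netlist, the fixed `#1` settling delay would probably need to be lengthened to cover the real clock-to-output delay.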
I'm not quite sure if that works. I think if you are writing to port A, the output on port A isn't the data that was in that address, but rather the data that you are writing into the A input. I think it gets routed directly through. Check the timing diagrams in the datasheet. I think the blockRAM on Virtex2 allows you to circumvent this mode, but not V-E/Spartan-IIe.

"John_H" <johnhandwork@mail.com> wrote in message news:3D3DFC80.43C0EF93@mail.com...
> I like the BlockRAM idea. But - to get 32 bit mode - use the same read and write address (cycle through n addresses) and use port A as the upper 16 bits and port B as the lower 16 bits. The output data is the registered version of the memory at that address so there aren't any timing problems. The memory can look like a 32 wide by 128 deep in this mode. Only 12 memories needed which would fit in a Spartan-IIE 150! Lots of extra logic to develop a "pong" game as well.
>
> I tried compiling an array of 384x64 regs in Synplify and I quit after 8 minutes of compiling. Synplify usually goes much faster! The SRL16s with arrays of instantiations would probably work much faster in Synplify but I don't know if Webpack supports the "arrays of instances" which - I believe - was Verilog 1995, just not supported by many.
>
> The BlockRAM approach would work out so very nice.
>
> Kevin Neilson wrote:
>
> > John,
> > You definitely don't want to use flops for this. (I don't think there are even enough; you need 24500 flops and that part only has about 6000 I think.) You could use the SRLs, which are 16-bit shift registers. You would need about 1530 of these, and the part has about 6000 of these.
> > Better yet would be to use the dual-port blockRAMs. You could use 24 of these in a 16wide x 256deep configuration. Then you just set the write and read addresses 16 apart to get a 16-deep pipe. If your clock is slow enough, you split up the RAM and use only six RAMs. This part has about 32 blockRAMs I think.
> > -Kevin
> >
> > "John Hovell" <jhovell@yahoo.com> wrote in message news:9402973.0207231559.2030b94a@posting.google.com...
> > > Hello all --
> > >
> > > I am trying to implement a delay pipe that is 384 bits wide and 64 bits long in Verilog.
> > >
> > > I was trying to build one out of fairly simple D-flops, but my design has been "synthesizing" in Xilinx Web Pack for nearly 2 hours now, so I think I must have done something wrong.
> > >
> > > Is there an efficient or correct way to implement such a pipe on a 300K gate Spartan IIe? I think the size should be OK since a Spartan IIe 300K gate could theoretically make a 98kbit distributed memory... but maybe I am missing something here.
> > >
> > > TiA for any help, pointers, etc.
> > >
> > > Cheers,
> > > John

Article: 45452
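The block RAM delay line suggested earlier in this thread (write and read addresses kept a fixed distance apart) can be sketched behaviorally as below. The module name `ram_delay` and its parameters are hypothetical, and whether a given synthesis tool maps this description onto a dual-port block RAM is an assumption that depends on the tool and the target part.

```verilog
// Sketch of a RAM-based delay line (hypothetical module, not from the thread).
// Data is written at wr_addr while older data is read at a trailing address.
// Counting the registered read, dout is din delayed by DELAY clocks,
// valid for 2 <= DELAY <= 2**ADDR_BITS.
module ram_delay #(
    parameter WIDTH     = 16,
    parameter ADDR_BITS = 8,     // 256 locations, as in a 16 x 256 configuration
    parameter DELAY     = 16
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] din,
    output reg  [WIDTH-1:0] dout
);
    reg  [WIDTH-1:0]     mem [0:(1 << ADDR_BITS) - 1];
    reg  [ADDR_BITS-1:0] wr_addr = 0;

    // Read address trails the write address; the subtraction wraps modulo
    // the memory depth when it is truncated to ADDR_BITS.
    wire [ADDR_BITS-1:0] rd_addr = wr_addr - (DELAY - 1);

    always @(posedge clk) begin
        mem[wr_addr] <= din;             // write port
        dout         <= mem[rd_addr];    // synchronous read port
        wr_addr      <= wr_addr + 1'b1;  // both addresses advance together
    end
endmodule
```

Because the read and write addresses always differ here, the question raised in this thread about what a port returns when it reads the location being written does not come up in this arrangement.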
You're trying to @(posedge clk) increment the counter and provide a comparison value on... the new value? The old value?

In the telecom stuff I worked with, there were typically frame counters to track the bytes and provide gates for various operations. If you only need one gate signal, things are too simple. If you need a separate gate for each of 20 bit positions, it's a little tougher but your speeds should be extreme with a little care.

If you're doing an equality compare for each gate, there are two ways to do it, with a tree or a carry chain. I'll be playing with my first Virtex-II in a week or two but I've heard the carry chains aren't as effective as they were in the Virtex-E parts but they should still provide excellent results.

A 14 bit *constant* equality compare in a tree would require 3.5 LUTs for the first level of comparison and another LUT to assemble all those together. Since there are 4 slices (8 LUTs) in one Virtex-II CLB, this should scream! If it's a variable equality compare, the 7 LUTs feeding 2 LUTs feeding 1 final LUT isn't as clean but you should still get great speed. One of the key factors is that the *registered* count value needs to be compared to a constant or a *registered* comparison value.

The carry chain is probably better for a 14 bit equality compare since the 7 LUTs can cascade into one carry chain. If you want to do a 98 bit equality compare, you could assemble the 7 bit carry chains into a series of (horizontal) cascade ORs (if that's what they're called - I won't look it up now).

The point is, things should scream in either format compared to the speeds you're getting. Check out your logic and routing delays to see how your timing goes from source register to destination. Ask yourself if some of the stages can be pipelined. One of the beautiful things about counters is that they increment predictably! (Unless they decrement) You could assemble a huge comparison tree and register each level to attain outrageous pipelined speeds. Look at your requirements and figure out what you can back into a previous pipeline stage. Very good things should come together with nice design work.

An example of a counter with a single compare output (apologies if you're VHDL):

    always @(posedge clk)
      if( count == max_count ) count <= 0 + ena;
      else                     count <= count + ena;
    assign out_gate = (count == max_count);

The structure above isn't very efficient because a wide compare is needed in the logic while it isn't needed in the design. The logic may not synthesize into a simple counter, either, requiring two stages of logic for the counter to add to the compare. You could use a registered compare of

    out_gate <= (count == max_count - 1) & ena;

which (in the always block) has the gate go active when you want it. But you could do better by resetting your counter with a different value:

    always @(posedge clock)
      if( out_gate ) {out_gate,count} <= {1'b0,-max_count} + ena;
      else           {out_gate,count} <= count + ena;

Note that the gate is now synchronous and there is NO compare required. (Apologies that things look a little strange... the constant "max_count" should be dimensioned the same as the "count" vector so the out_gate initializes properly false)

The structure can be made "synthesis friendly" to use one level of synthesized logic (if it doesn't already) by using an equation that's more friendly to the Xilinx carry chain configuration:

    always @(posedge clock)
      {out_gate,count} <= (out_gate ? {1'b0,-max_count} : count) + ena;

The conditional operator works in place of the if/else construct and "fits" in the carry structure.

Many things to do. Happy coding!

- John_H

Sniper Daryl wrote:
>
> Here,
>
> I am Daryl and I have to trouble you. :-)
>
> When I design a chip used for optical network, a lot of effort must be made to increase the clock speed and reduce the chip resource cost.
> In a timing interface module, there is a counter with 14-bit width to provide timing to the outgoing frame. So, a comparator is used to compare the counter word with a series of registers set by the controller. I've noticed that the slice cost increases seriously and the maximum clock speed decreases a lot when the counter and the comparator get wider.
>
> Troubled with it, I first tried a wider counter (14-bit) and a narrower comparator (4-bit) and got a 20MHz upgrade of speed and more than 20 slices saved. Then, a 4-bit counter and 14-bit comparator with a result of 10MHz upgrade and about 10 slices saved. So, I think the critical factor is the wide comparator. This is proved by studying the report and schematics from the synthesis tools (FCII3.6.1 and Synplify Pro with Amplify).
>
> To improve the performance, I've tried to use the CoreGen tool to generate a comparator core. But after implementation, the result is no better than from my own code.
>
> The synthesis tool I used is FCII 3.6.1, the device is Virtex-II 1000, implemented by ISE4.2SP3. Here is the result of my trials:
>
> 14-bit counter, 14-bit comparator and other logic: 63 slices used (36 FFs and 105 LUTs); 95MHz
>
> 4-bit counter, 14-bit comparator and other logic: 50 slices used (26 FFs and 85 LUTs); 115MHz
>
> 14-bit counter, 4-bit comparator and other logic: 41 slices used (26 FFs and 62 LUTs); 127MHz
>
> Would you give me some advice about it from your experience? Or some resource to study?
>
> Thanks in advance for your time!
>
> Daryl

Article: 45453
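The compare-free counter from the reply above (reload the counter with the negated terminal count and use the carry out as the gate) can be wrapped into a complete module roughly as follows. The module name `frame_gate`, the parameter names, and the use of `initial` values for power-up state are assumptions made for this sketch; they are not part of the original post.

```verilog
// Sketch of a counter whose carry out serves as the gate, so no wide
// equality compare is needed (hypothetical wrapper around the posted idea).
module frame_gate #(
    parameter             WIDTH     = 14,
    parameter [WIDTH-1:0] MAX_COUNT = 1000   // same width as the counter
) (
    input  wire clk,
    input  wire ena,
    output reg  out_gate
);
    // Two's complement of the terminal count, truncated to WIDTH bits.
    wire [WIDTH-1:0] reload = -MAX_COUNT;
    reg  [WIDTH-1:0] count;

    // Assumed power-up state; a synchronous reset could be used instead.
    initial begin
        out_gate = 1'b0;
        count    = -MAX_COUNT;
    end

    // The adder's carry out lands in out_gate. When out_gate is set, the
    // next clock reloads the counter, which also clears the gate.
    always @(posedge clk)
        {out_gate, count} <= (out_gate ? {1'b0, reload} : {1'b0, count}) + ena;
endmodule
```

In this form, for `MAX_COUNT` of 2 or more, `out_gate` is high for one clock after every `MAX_COUNT` enabled clocks. Note that this period differs by one from the first compare-based example in the post, which counts from 0 through `max_count` inclusive.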
Kevin --

Thanks very much for the suggestion! SRLs sound like a great bet. As many others have very helpfully pointed out, for a simple delay pipe Block RAMs are an even better choice, but I am already using all 16 block RAMs for another part of my design :-(... and my delay pipe isn't quite a simple delay pipe -- specifically I need to read a few values in the middle.

I'm hunting around right now for the instantiation syntax for SRLs on the net... Is there a primitive that I can call so one of these is inferred? It seems the PRNG (Xilinx app 211) just uses some fancy compiler ifdef statements to get the right piece of hardware inferred. I'm sure I can find this info on the 'net so I don't want to bother anyone with simple questions... however if someone feels compelled to clue me in, I certainly won't mind ;-).

The only reason SRLs might not work is that I need to read *some* values in the delay pipe (i.e. the first column and a sort of diagonal row through the first half of it: Total bits: 384*2 = 768). Hopefully I can read values that are in the shift registers....

Thanks everyone for your help!

Cheers,
John

"Kevin Neilson" <kevin-neilson@removethistextattbi.com> wrote in message news:sPm%8.142580$uw.86229@rwcrnsc51.ops.asp.att.net...
> John,
> You definitely don't want to use flops for this. (I don't think there are even enough; you need 24500 flops and that part only has about 6000 I think.) You could use the SRLs, which are 16-bit shift registers. You would need about 1530 of these, and the part has about 6000 of these.
> Better yet would be to use the dual-port blockRAMs. You could use 24 of these in a 16wide x 256deep configuration. Then you just set the write and read addresses 16 apart to get a 16-deep pipe. If your clock is slow enough, you split up the RAM and use only six RAMs. This part has about 32 blockRAMs I think.
> -Kevin
>
> "John Hovell" <jhovell@yahoo.com> wrote in message news:9402973.0207231559.2030b94a@posting.google.com...
> > Hello all --
> >
> > I am trying to implement a delay pipe that is 384 bits wide and 64 bits long in Verilog.
> >
> > I was trying to build one out of fairly simple D-flops, but my design has been "synthesizing" in Xilinx Web Pack for nearly 2 hours now, so I think I must have done something wrong.
> >
> > Is there an efficient or correct way to implement such a pipe on a 300K gate Spartan IIe? I think the size should be OK since a Spartan IIe 300K gate could theoretically make a 98kbit distributed memory... but maybe I am missing something here.
> >
> > TiA for any help, pointers, etc.
> >
> > Cheers,
> > John

Article: 45454
I'm not sure I understand your problem well, but to divide a clock source you can try using the LUT RAM. You have to configure the LUT as a shift register and use it as follows:

    U_SRL16_1 : SRL16
      -- synopsys translate_off
      generic map (INIT => X"0001")
      -- synopsys translate_on
      port map (
        D   => srl_out,
        CLK => CLK25M,
        A0  => one,  -- 16 division
        A1  => one,
        A2  => one,
        A3  => one,
        Q   => srl_out);

To increase the division ratio you can use more LUTs in sequence. This uses only one slice.

Hope this is useful.

Regards
Giuseppe

"Sniper Daryl" <e-engineer@eastday.com> wrote in message news:289dc5a9.0207232014.5ca2f487@posting.google.com...
> In a timing interface module, there is a counter with 14-bit width to provide timing to the outgoing frame. So, a comparator is used to compare the counter word with a series of registers set by the controller. I've noticed that the slice cost increases seriously and the maximum clock speed decreases a lot when the counter and the comparator get wider.
> <cut>

Article: 45455
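What the SRL16 arrangement above does can be seen in a small behavioral model: a single '1' circulates around a 16-bit ring, so the output is high for one clock in every 16. The module name `srl16_div16` is hypothetical, and whether a synthesis tool maps this description onto a single SRL16 is an assumption that depends on the tool.

```verilog
// Behavioral model of an SRL16 with INIT = 0001, address 1111 and the
// output fed back to the data input (hypothetical module name).
module srl16_div16 (
    input  wire clk,
    output wire tick
);
    reg [15:0] ring = 16'h0001;           // corresponds to INIT => X"0001"

    // Shift by one position per clock; the last bit re-enters at bit 0.
    always @(posedge clk)
        ring <= {ring[14:0], ring[15]};

    assign tick = ring[15];               // tap 15 selected by A3..A0 = 1111
endmodule
```

The output is a one-clock-wide pulse at 1/16 of the clock rate rather than a 50% duty cycle square wave, so it is probably best used as a clock enable. A wider initial pattern such as `16'h00FF` would give a 50% duty cycle from the same structure.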
Dear B S,

Thank you for your help. Your answer is very detailed. It taught me a lot about ASIC design.

In your post, you say:

> Webpack will only target Xilinx's FPGA parts, which means it'll always target some kind of Xilinx primitives library. (That's mostly the LUT4 cell primitive.) Someone correct me if I'm wrong. If your ASIC vendor truly offers an 'FPGA->ASIC conversion flow', they surely will accept the Xilinx netlist 'as is.' The ASIC vendor will worry about the logical remapping between the FPGA library and the ASIC library.

If the netlist includes LUT4 cells, then how is this changed into a "circuit" when I implement it in an ASIC? Is it done by the ASIC vendor? You say that the ASIC vendor will worry about remapping? So, what is the normal development flow for an ASIC starting from Verilog? As I am a beginner in IC design, I am not sure about this.

Thanks again ^_^

Reala

"B__ S_______" <B___S_____@nonoonnoo.com> wrote in message news:3D3E3756.20AA602B@nonoonnoo.com...
> > Actually, I work in an IC design company. My boss wants to develop a low-end DSP chip. However, we have little experience in this.
> > We think that one of the important building blocks is a small 16x16 single-cycle multiplier.
> > I wrote simple Verilog and synthesized it with the Xilinx Web Pack tools. It seems to work.
> > Assuming it works, I want to open some output files to see what "circuit" is synthesized, because I will design a DSP chip. But I do not know which output files contain the gate-level netlist of the synthesized design.
>
> It has been a while since I used Xilinx webpack. To support gate-level simulations, webpack can write the synthesized netlist to a standard Verilog or VHDL file. (To run a Verilog simulation, all you need are the included Xilinx primitives library and your gate-level netlist.)
>
> > I guess that the Verilog code will be synthesized according to the synthesis tool's library. Am I correct? Can I force the synthesis tool to synthesize the Verilog code without using a library? (I mean the design is synthesized at gate level: AND, OR, XOR, ...)
>
> Webpack will only target Xilinx's FPGA parts, which means it'll always target some kind of Xilinx primitives library. (That's mostly the LUT4 cell primitive.) Someone correct me if I'm wrong. If your ASIC vendor truly offers an 'FPGA->ASIC conversion flow', they surely will accept the Xilinx netlist 'as is.' The ASIC vendor will worry about the logical remapping between the FPGA library and the ASIC library.
>
> > Then, can I see the netlist at gate level so that I can study the design synthesized by the synthesis tool?
>
> Why do you care what it looks like? You can export the netlist to a structural Verilog (ASCII text) file. This is usually done if you want to run the netlist through Verilog simulations. If you want a 'graphical view' breaking down the netlist into intelligible AND, OR, NOT functions, then I don't know. Someone else needs to answer this question.
>
> > You say that:
> > > To make sure the synthesized design was synthesized correctly, do a gate-level simulation of the synthesized design.
> > > You should be able to run the same testbench code you used for an RTL simulation.
> >
> > I do not really understand, because I am a beginner in IC design.
> > What is the meaning of gate-level simulation? With what kind of tools? Modelsim? Xilinx? Or other?
>
> Forgive me if I'm treating you like a novice. Let's start from the beginning. "RTL-simulation" - your source RTL code (that's the Verilog code you used to synthesize your FPGA-DSP circuit) is instantiated in a top-level 'testbench file.' Then you have some waveform 'stimulus', i.e. you set some inputs on your DSP-model, advance the 'clock' waveform, then look at the DSP-model's outputs!
>
> A gate-level simulation works the same way. The difference is the device under test -- instead of the RTL-source code, here you instantiate the synthesized-netlist in your testbench. Once again, you drive the netlist's inputs, advance the clock, then check the outputs. If the netlist is functionally identical to the RTL-source, then the outputs should agree 100%. If they don't agree, you get to do some "fun" debugging!
>
> The Verilog-simulator isn't part of webpack. You have to buy that separately. (I think Modeltech is popular for this kind of thing.)

Article: 45456
I'm looking at the possibility of doing a new design with a programmable (or "configurable") System-on-chip (SoC) device, which is basically a CPU, memory, FPGA and some peripheral devices on a single chip.

(If this is the wrong newsgroup for this question, I apologize and ask for direction.)

So far, I've located the Cypress PSoC, Atmel FPSLIC and Triscend E5 - all with 8-bit MCUs on board.

Can anyone who has used any of these chips comment on their performance, reliability, ease of use, etc.? How about the development systems?

From what I've seen so far, I like the Atmel parts but I'm leaning toward the Cypress because of lower development system cost.

Thanks,
Mike

Article: 45457
This is a reply to Michael Cozza's posting.

Try the STRATIX series of programmable SOCs from Altera.

Regards.
Jaideep

Article: 45458
Try http://www.opencores.org/projects/pci/ . You'll find an open source PCI IP core there. I succeeded in compiling it for a Spartan2 2S200 and a Spartan2 2S150.

Regards,
BROTO Laurent

"Jeff Reeve" <teamreeve@attbi.com> wrote in message news: _jq%8.642847$cQ3.104066@sccrnsc01...
> I'm looking for a synthesizable 32-bit 33MHz PCI Target only design to be placed into an FPGA or large CPLD. Minimal implementation is fine. Does anybody know if such a thing is available in VHDL or Verilog and is open sourced? I seem to recall Xilinx publishing a target only design quite some time ago but I can no longer find it on their web site.
>
> Any help is much appreciated!
> Jeff

Article: 45459
Hi Ray,

I have just tested the RLOC_ORIGIN attribute in 4.2.03i and it works for me. I was using XST and Synplify. I agree I just put the attribute onto FFs but it worked. I know that the RLOC_RANGE attribute only gets picked up via the ucf file but the RLOC_ORIGIN should work.

Regards,
Stephan

Ray Andraka wrote:

> Fellow experts:
>
> I'm hoping someone has fallen on this one and found a workable solution. So far the Xilinx hotline has been unusually unresponsive in handling this case (they've had it since 7/12 and haven't even acknowledged whether or not they see the problem with the test case I sent). Seems the hotline is not as responsive and helpful as it once was, which is a shame. Anyway, this is the case:
>
> RLOC_ORIGIN being ignored by placer. The macro shows up in the correct position in the floorplanner but with the G and Y elements missing (that is a separate issue which is still an open case). The RLOC origin is being ignored by the place and route, so the macro is not landing in the specified position. It is critical I be able to specify the RLOC origins in this design (2V6000, 200 MHz, high utilization). Is this a known problem? (I don't see it in the answers database.) I tried adjusting the RLOC origin by single slice steps as suggested in answer record 12192 to no avail.
>
> The RPMs are not created under FPGA editor, rather they are RLOC'd in the VHDL source. That gets me relative locations for each BEL in the design which in turn give me a macro that I should be able to place or let the tools place. I am trying to put an RLOC_ORIGIN on it by adding an RLOC_ORIGIN attribute to the UCF file. The syntax is correct, and indeed the part of the RPM that is not decimated by the floorplanner shows up in the floorplanner editable window in the correct position (there is a previously reported bug in the floorplanner that prevents the RPM from showing correctly unless you first go through auto PAR, then constrain from placement, then unbind and bind the RPM). However, when the design is run through the PAR, the macro is placed at some location other than that indicated by the RLOC_ORIGIN.
>
> Thanks in advance for any info you may have seen on this problem.
>
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930 Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
>
> "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
> -Benjamin Franklin, 1759

Article: 45460
Michael Cozza <Michael.Cozza@verizon.net> wrote:
: I'm looking at the possibility of doing a new design with a programmable (or "configurable") System-on-chip (SoC) device, which is basically a CPU, memory, FPGA and some peripheral devices on a single chip.

: (If this is the wrong newsgroup for this question, I apologize and ask for direction.)

: So far, I've located the Cypress PSoC, Atmel FPSLIC and Triscend E5 - all with 8-bit MCU's on board.

: Can anyone who has used any of these chips comment on their performance, reliability, ease of use, etc.? How about the development systems?

: From what I've seen so far, I like the Atmel parts but I'm leaning toward the Cypress because of lower development system cost.

There are several synthesizable CPU cores available from the different manufacturers. Whether you can use them depends on your processing power needs.

Bye
--
Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

Article: 45461
"Børge Strand" <borge.strand.remove.if.not.spamming@sintef.no> wrote: > errors when parsing the .ucf-file. What is the recommended way to maintain > constraints? Use your favorite text editor and edit the .ucf file directly. (i.e. UltraEdit/Win32 is a quite fine thing) WD --Article: 45462
Michael Cozza wrote:
>
> I'm looking at the possibility of doing a new design with a programmable (or "configurable") System-on-chip (SoC) device, which is basically a CPU, memory, FPGA and some peripheral devices on a single chip.
>
> (If this is the wrong newsgroup for this question, I apologize and ask for direction.)
>
> So far, I've located the Cypress PSoC, Atmel FPSLIC and Triscend E5 - all with 8-bit MCU's on board.
>
> Can anyone who has used any of these chips comment on their performance, reliability, ease of use, etc.? How about the development systems?
>
> From what I've seen so far, I like the Atmel parts but I'm leaning toward the Cypress because of lower development system cost.
>

You should give more info on what you want the SoC device to do, both core and peripherals.

Cypress has the lowest 'configurable' quotient, but it does have Analog capability that the others lack, and is more a Flash-uC with Quasi-Smart peripherals. If that level of peripheral is enough for your task, this will be the cheapest solution. Remember this family has only ONE variant per package, so don't over-run the code budget.

The E5 and FPSlic are FPGA with RAM based uC alongside. ( no analog ) Both need a Boot loader, so are actually two chip solutions [So2C] - as such, you should include a Std FLASH uC + CPLD on the PSoC map.

This two chip pathway offers multi sourcing, and much more chance to better fit the uC and Logic to the task at hand, and the consequences of missing the code budget are not so drop-dead.

- jg

Article: 45463
You can gate the clock using the enable pin on the Virtex-E global clock buffer.

You would need to create a hard macro in Fpga Editor containing a GCLKBUF with the I, O and CE i/o's as external pins.

I suggest driving the CE pin from a FF clocked off the falling edge of the ungated clock.

Beware: there seems to be a bug with the GCLKBUF primitive due to swapped configuration bits; the workaround is to set the CEMUX option to '1', not CE.

I have tried this. It works.

-- Edward Moore

Jason Crawford <jace@cisco.com> wrote in message news:<3D3BAC69.A03017A@cisco.com>...
> Hi,
>
> Apart from using clock-enables, does anyone know of any way to use clock-gating in Virtex-E parts?
>
> We have a design that is partially written for an ASIC target and expects to see a gated clock. Rather than have to get the designers to pore through the code and add clock enables to all flip flops (I can hear teeth gnashing already) I am hoping against hope that someone has an alternate answer to this rather difficult problem.
>
> yours in hope,
> Jason.

Article: 45464
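The reason for driving the enable from a flip-flop clocked on the falling edge, as suggested in the reply above, can be shown with a behavioral model. This is only a simulation sketch with a hypothetical module name; in the real part the gating is done inside the global clock buffer macro described in the post, not with an AND gate built from fabric logic.

```verilog
// Behavioral model of glitch-free clock gating (hypothetical module name).
// The enable is re-timed on the falling edge, so it can only change while
// the clock is low and the gated clock never shows a shortened pulse.
module clk_gate_model (
    input  wire clk,
    input  wire enable,
    output wire gclk
);
    reg ce_q = 1'b0;

    always @(negedge clk)
        ce_q <= enable;          // corresponds to the FF driving the CE pin

    assign gclk = clk & ce_q;    // models the gated output of the buffer
endmodule
```

If the enable came straight from logic clocked on the rising edge, it could change while the clock is high and cut a pulse short, which is the hazard the falling-edge flip-flop avoids.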
The key to consulting is to offer more value than the cost of using your services. Generally, that means developing an expertise that is too difficult or too expensive to cultivate in-house. A good consultant can offer a breadth of experience that just isn't available anywhere else.

B__ S_______ wrote:

> My impression working in Southern California (so I have a small view of the EE world!) is US companies prefer to keep project development in-house. If an outside vendor already has a finished, working IP-block, the company will consider buying it. In essence trading money for time (the idea being they can 'buy' the IP block and just drop it in their current project.)
>
> Otherwise, the company will have to invest money AND time (since the contractor's development-cycle would be non 0-day.) In that case, for a little additional expenditure, the company can keep the development expertise in-house, so why pay someone else to learn your project?
>
> I know there are some success stories when it comes to design services companies. But given the size of the fabless semiconductor industry, wouldn't you expect there to be *more* design services consulting?

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 45465
I tried the RLOC origin on a smaller macro in the design, and it works fine in that case. The failing macro has tiles which come from a different edif netlist. It is a fairly large macro as well (24x80 slices). The size and/or the fact that its netlist comes from nested edif files may be contributing factors. The odd thing is that the RLOC_ORIGIN is recognized in the editable view of the floorplanner (which pulls its info from the ucf and ngd files), but is being ignored in place and route.

Stephan Neuhold wrote:

> Hi Ray,
>
> I have just tested the RLOC_ORIGIN attribute in 4.2.03i and it works for me. I was using XST and Synplify. I agree I just put the attribute onto FFs but it worked. I know that the RLOC_RANGE attribute only gets picked up via the ucf file but the RLOC_ORIGIN should work.
>
> Regards,
> Stephan
>
> Ray Andraka wrote:
>
> > Fellow experts:
> >
> > I'm hoping someone has fallen on this one and found a workable solution. So far the Xilinx hotline has been unusually unresponsive in handling this case (they've had it since 7/12 and haven't even acknowledged whether or not they see the problem with the test case I sent). Seems the hotline is not as responsive and helpful as it once was, which is a shame. Anyway, this is the case:
> >
> > RLOC_ORIGIN being ignored by placer. The macro shows up in the correct position in the floorplanner but with the G and Y elements missing (that is a separate issue which is still an open case). The RLOC origin is being ignored by the place and route, so the macro is not landing in the specified position. It is critical I be able to specify the RLOC origins in this design (2V6000, 200 MHz, high utilization). Is this a known problem? (I don't see it in the answers database.) I tried adjusting the RLOC origin by single slice steps as suggested in answer record 12192 to no avail.
> >
> > The RPMs are not created under FPGA editor, rather they are RLOC'd in the VHDL source. That gets me relative locations for each BEL in the design which in turn give me a macro that I should be able to place or let the tools place. I am trying to put an RLOC_ORIGIN on it by adding an RLOC_ORIGIN attribute to the UCF file. The syntax is correct, and indeed the part of the RPM that is not decimated by the floorplanner shows up in the floorplanner editable window in the correct position (there is a previously reported bug in the floorplanner that prevents the RPM from showing correctly unless you first go through auto PAR, then constrain from placement, then unbind and bind the RPM). However, when the design is run through the PAR, the macro is placed at some location other than that indicated by the RLOC_ORIGIN.
> >
> > Thanks in advance for any info you may have seen on this problem.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 45466
I found it. It's under Implement Design->Properties, then on the Translate Properties tab, it's called Macro Search Path. Hmm, I wish the Project Navigator documentation correlated better with the command line interface documents. Oh well.

Dave

"D Brown" <dbrown123@shaw.ca> wrote in message news:ahhr1c$dfn$1@pallas.novatel.ca...
> Is there any way to specify the -sd (search directory) option for NGDBuild from the Project Navigator for Xilinx 4.2i software? Or am I limited to only being able to do this at the command line?
> Thanks,
> Dave

Article: 45467
Thanks, Kevin, I stand corrected. Identical read and write addresses pass the input data to the output according to the libraries guide,

http://toolbox.xilinx.com/docsan/xilinx4/data/docs/lib/dsgnelpr30.html

Since the Spartan-IIE 300 only has 16 block rams,

http://www.xilinx.com/partinfo/ds077_1.pdf

the way to fully implement the pipeline would be to use bunches and bunches of SRL16 elements. (Half could be SRLs and half be memory but this gets weird)

I'd recommend putting together a small submodule that's 4 cascaded SRL elements for the 64 bit delay (or make it less to account for input/output registers, for instance) and instantiate that module 384 times. The arrays of instances are wonderful to do this; if you don't have that capability, another module instantiating 16 of those modules could, itself, be instantiated 24 times for a total of 384 delay elements.

- John_H

Kevin Neilson wrote:

> I'm not quite sure if that works. I think if you are writing to port A, the output on port A isn't the data that was in that address, but rather the data that you are writing into the A input. I think it gets routed directly through. Check the timing diagrams in the datasheet. I think the blockRAM on Virtex2 allows you to circumvent this mode, but not V-E/Spartan-IIe.

Article: 45468
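As a rough illustration of John_H's suggestion above (four cascaded SRL16s per bit, replicated 384 times as 16 x 24), here is a minimal behavioral sketch in Python. It is a software model only, not HDL and not code from the thread; the names `SRL16`, `DelayLine`, and `measure_delay` are hypothetical, and the model assumes the usual SRL16 behavior where address A selects a delay of A + 1 clocks.

```python
class SRL16:
    """Behavioral model of one SRL16: a 16-stage shift register whose
    output is read from stage `addr`, giving a delay of addr + 1 clocks."""

    def __init__(self, addr=15):
        if not 0 <= addr <= 15:
            raise ValueError("addr must be in 0..15")
        self.addr = addr
        self.stages = [0] * 16  # stages[0] holds the newest bit

    @property
    def q(self):
        return self.stages[self.addr]

    def clock(self, d):
        """Shift bit d in on a clock edge."""
        self.stages = [d & 1] + self.stages[:-1]


class DelayLine:
    """Cascade of SRL16 models; the default is four full-length ones."""

    def __init__(self, addrs=(15, 15, 15, 15)):
        self.srls = [SRL16(a) for a in addrs]

    @property
    def q(self):
        return self.srls[-1].q

    @property
    def delay(self):
        return sum(s.addr + 1 for s in self.srls)

    def clock(self, d):
        # Sample every output before shifting, as clocked hardware would.
        inputs = [d] + [s.q for s in self.srls[:-1]]
        for srl, bit in zip(self.srls, inputs):
            srl.clock(bit)


def measure_delay(line, max_clocks=1000):
    """Clock a single 1 in, then count clocks until it shows up at q."""
    line.clock(1)
    clocks = 1
    while line.q != 1:
        if clocks >= max_clocks:
            raise RuntimeError("impulse never reached the output")
        line.clock(0)
        clocks += 1
    return clocks


# One delay line per bit of the 384-bit pipeline: 16 lines per group, 24 groups.
bank = [DelayLine() for _ in range(16 * 24)]
```

Calling `measure_delay(DelayLine())` gives the delay of the full four-SRL cascade in this model; shortening one of the `addrs` entries is the software equivalent of "making it less" to account for input/output registers.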
Go to http://toolbox.xilinx.com/docsan/xilinx4/manuals.htm and look at the "Libraries Guide" which has all the primitives listed. You can use an SRL16 with or without enable.

If you need to tap off a fixed diagonal, the task is pretty straightforward but the coding (3 levels of module hierarchy) isn't as clean. Do you know about parameterized modules?

You can have a chain of 4 SRL16s to get your 64 bit delay. To "tap" an element in the middle at a fixed address, you can either daisy-chain two shorter SRL16s together (you can program them for delays of 1 to 16, inclusive) or you can tap off the feed between the 16 long delays and feed an SRL16 in parallel with the fixed chain. You can hard code the address or select which of those 16 taps you want dynamically if necessary (but the bits that make the selection could have a HUGE fanout! 384 bits?!).

John Hovell wrote:
<excerpt>

> I'm hunting around right now for the instantiation syntax for SRLs on the net... Is there a primitive that I can call so one of these is inferred? It seems the PRNG (Xilinx app 211) just uses some fancy compiler ifdef statements to get the right piece of hardware inferred. I'm sure I can find this info on the 'net so I don't want to bother anyone with simple questions... however if someone feels compelled to clue me in, I certainly won't mind ;-).
>
> The only reason SRLs might not work is that I need to read *some* values in the delay pipe (i.e. the first column and a sort of diagonal row through the first half of it: Total bits: 384*2 = 768). Hopefully I can read values that are in the shift registers....

Article: 45469
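The first tapping option John_H describes (splitting one 16-long delay into two shorter SRL16s so the junction between them becomes a fixed tap) can be sketched with a small Python model. This is only an illustration under the assumption that each SRL16 behaves as a plain delay of 1 to 16 clocks; `impulse_arrival` and the example tap position of 10 are hypothetical and not taken from the thread.

```python
from collections import deque


def impulse_arrival(delays, n_clocks=128):
    """Clock a single 1 through cascaded SRL16 models with the given
    delays and return, per SRL output, the clock on which the 1 appears."""
    for d in delays:
        if not 1 <= d <= 16:
            raise ValueError("each SRL16 delay must be in 1..16")
    # Each deque holds one SRL's contents; index -1 is its output.
    regs = [deque([0] * d, maxlen=d) for d in delays]
    seen = [None] * len(regs)
    for clk in range(1, n_clocks + 1):
        din = 1 if clk == 1 else 0
        # Sample all outputs before the edge, then shift every SRL.
        inputs = [din] + [r[-1] for r in regs[:-1]]
        for reg, bit in zip(regs, inputs):
            reg.appendleft(bit)  # maxlen drops the oldest bit
        for i, reg in enumerate(regs):
            if seen[i] is None and reg[-1] == 1:
                seen[i] = clk
    return seen


# Split the first 16-long delay into 10 + 6 to expose a tap at 10 clocks,
# then keep three full-length SRL16s for the rest of the 64-clock chain.
arrivals = impulse_arrival([10, 6, 16, 16, 16])
tap_delay, total_delay = arrivals[0], arrivals[-1]
```

In this model the split costs one extra SRL16 per tapped bit while the end-to-end delay stays the same, which matches the trade-off described in the post; the second option (a parallel SRL16 fed from between two 16-long delays) is not modeled here.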
Hi,

I'm looking to prototype my design on an evaluation board, but I have some questions.

Hardware requirements: one Apex20KE 1500E or an equivalent Xilinx device

1. What evaluation or prototyping boards would the experienced recommend?
2. I was looking at Altera's DSP development board. How would you write the output from the FPGA into a file on a PC, for this or any other board?
3. Is it possible to give inputs in real time from a PC to an FPGA on a dev board? I wouldn't think so, but let me know if the technology allows.

Thanks,
Prashant

PS: Prototype board companies, feel free to contact me at the email address: pjain@tensorcomm.com

Article: 45470
What exactly does the Wind River Diab XE compile down to? Does it target the Virtex-II Pro PowerPC core, or does it actually manipulate the FPGA fabric? Any help understanding this will be most appreciated.

Thanks!

Article: 45471
Erik wrote:
>
> Hi Kevin,
>
> I looked first at the table in the chip description PDF,
> "Table 2: Performance for Common Circuit Functions".
> In this table you can find some differences between the FPGA families.
>
> I hope these tables are correct.
>

Where did you download such a PDF file?

>
> Why not? I will check the real voltage levels on the signal lines with an oscilloscope, and if I see the highest level is 3.3V I can use the Virtex-E.
>
> Bye
> Erik

It's up to you if you want to risk a Virtex-E. I would rather live within the specification than risk a PCI card with a Virtex-E (worth at least several hundred dollars).

Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.)

Article: 45472
My 2 cents worth on this thread:

First, some tools such as Synplify will infer the SRL16s, and will even put a register at the output, which improves the clock to Q considerably. What it does not do well is adding a register between each SRL16 in the chain. Personally, I prefer to instantiate them using the SRL16 primitive in the unisim library. That way I can put the registers between where they belong, and if I want I can add RLOCs as well as non-zero initial values. Set the timingcheckson generic to false (it defaults to true) to avoid problems in functional simulation, and put the generics inside a syn_translate pragma to avoid possible problems with inference as a black box.

Only the output of a shift register, or of the flip-flop following it (if you put them in there, which I advise), is visible. If you need to get somewhere in between, then you'll need to adjust your delays to get to the tap you desire. VirtexII has a nice feature that adds an always-available output out of the last tap, useful for cascade chains. If you dynamically control the shift length, you'll probably want to split up and duplicate the address drivers.

If you were not using the block RAMs for something else, you could use them in 16 bit wide mode as a delay queue by using one port for read and one for write. The read address and write address have to be offset for it to work correctly. Depending on your data rate frequency, you may also be able to run the BRAM on a 2x or even 4x clock in order to get 2 accesses per clock in order to double the available width. At 1x, and with a 64 deep queue, you can only use 1/4 of the memory per block RAM, so a 4x memory clock would be ideal provided it does not exceed the capabilities of the BRAM.

John_H wrote:

> Go to http://toolbox.xilinx.com/docsan/xilinx4/manuals.htm and look at the "Libraries Guide" which has all the primitives listed. You can use an SRL16 with or without enable.
>
> If you need to tap off a fixed diagonal, the task is pretty straightforward but the coding (3 levels of module hierarchy) isn't as clean. Do you know about parameterized modules?
>
> You can have a chain of 4 SRL16s to get your 64 bit delay. To "tap" an element in the middle at a fixed address, you can either daisy-chain two shorter SRL16s together (you can program them for delays of 1 to 16, inclusive) or you can tap off the feed between the 16 long delays and feed an SRL16 in parallel with the fixed chain. You can hard code the address or select which of those 16 taps you want dynamically if necessary (but the bits that make the selection could have a HUGE fanout! 384 bits?!).
>
> John Hovell wrote:
> <excerpt>
>
> > I'm hunting around right now for the instantiation syntax for SRLs on the net... Is there a primitive that I can call so one of these is inferred? It seems the PRNG (Xilinx app 211) just uses some fancy compiler ifdef statements to get the right piece of hardware inferred. I'm sure I can find this info on the 'net so I don't want to bother anyone with simple questions... however if someone feels compelled to clue me in, I certainly won't mind ;-).
> >
> > The only reason SRLs might not work is that I need to read *some* values in the delay pipe (i.e. the first column and a sort of diagonal row through the first half of it: Total bits: 384*2 = 768). Hopefully I can read values that are in the shift registers....

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 45473
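Ray's block RAM delay queue (one port writing, one port reading, with the two addresses held a fixed distance apart) can be sketched in Python as follows. This is a simplified software model, not HDL from the thread: `BramDelay` is a hypothetical name, the read is modeled as registered on the same clock edge as the write, and the 4096-bit block size is an assumption consistent with the post's "1/4 of the memory" figure for a 64-deep, 16-bit-wide queue.

```python
BRAM_BITS = 4096               # assumed block RAM size in bits
WIDTH = 16                     # 16-bit-wide mode, as in the post
DEPTH = BRAM_BITS // WIDTH     # words available in that mode


class BramDelay:
    """Dual-port RAM used as a delay queue: the write pointer leads the
    read pointer by a fixed offset, and both advance every clock."""

    def __init__(self, delay, depth=DEPTH):
        if not 2 <= delay <= depth:
            raise ValueError("delay must be in 2..depth")
        self.depth = depth
        self.mem = [0] * depth
        self.rd = 0
        self.wr = delay - 1    # address offset that sets the delay
        self.dout = 0

    def clock(self, din):
        # Both ports act on the same edge; the offset keeps the read and
        # write addresses apart, so they never touch the same word.
        self.dout = self.mem[self.rd]
        self.mem[self.wr] = din
        self.rd = (self.rd + 1) % self.depth
        self.wr = (self.wr + 1) % self.depth
        return self.dout


# Push a ramp of 16-bit words through a 64-deep queue.
queue = BramDelay(delay=64)
inputs = list(range(1, 201))
outputs = [queue.clock(word) for word in inputs]

# Fraction of one block RAM a 64-deep queue occupies at the 1x clock rate.
fraction_used = 64 / DEPTH
```

In this model the first input word emerges on the 64th clock, and the queue touches only 64 of the 256 available words, which is the unused capacity Ray proposes to recover by time-multiplexing the RAM on a faster clock.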
John,

I don't know about the other synthesizers, but if you are using Synplify you can write a single 'for' loop that will infer all the SRLs. You could still use the BRAMs if you can run them at twice the data rate and write half the bus on one cycle and the other half on the next.

-Kevin

"John_H" <johnhandwork@mail.com> wrote in message news:3D3ED543.95AE563B@mail.com...
> Thanks, Kevin, I stand corrected. Identical read and write addresses pass the input data to the output according to the libraries guide,
>
> http://toolbox.xilinx.com/docsan/xilinx4/data/docs/lib/dsgnelpr30.html
>
> Since the Spartan-IIE 300 only has 16 block rams,
>
> http://www.xilinx.com/partinfo/ds077_1.pdf
>
> the way to fully implement the pipeline would be to use bunches and bunches of SRL16 elements. (Half could be SRLs and half be memory but this gets weird)
>
> I'd recommend putting together a small submodule that's 4 cascaded SRL elements for the 64 bit delay (or make it less to account for input/output registers, for instance) and instantiate that module 384 times. The arrays of instances are wonderful to do this; if you don't have that capability, another module instantiating 16 of those modules could, itself, be instantiated 24 times for a total of 384 delay elements.
>
> - John_H
>
> Kevin Neilson wrote:
>
> > I'm not quite sure if that works. I think if you are writing to port A, the output on port A isn't the data that was in that address, but rather the data that you are writing into the A input. I think it gets routed directly through. Check the timing diagrams in the datasheet. I think the blockRAM on Virtex2 allows you to circumvent this mode, but not V-E/Spartan-IIe.

Article: 45474
Jeff Reeve wrote:
>
> I'm looking for a synthesizable 32-bit 33MHz PCI target-only design to be placed into an FPGA or large CPLD. Minimal implementation is fine. Does anybody know if such a thing is available in VHDL or Verilog and is open sourced? I seem to recall Xilinx publishing a target-only design quite some time ago but I can no longer find it on their web site.
>
> Any help is much appreciated!
> Jeff

This is what you are probably talking about.

ftp://ftp.xilinx.com/pub/applications/pci/
ftp://ftp.xilinx.com/pub/applications/pci/00_index.htm

For some reason, a Verilog version of the reference design is missing, but if you want it I can e-mail it to you (some kind, long-time Xilinx user sent it to me). I believe Lattice Semiconductor and Quicklogic also have their own PCI reference designs (I know the Lattice one is written in Verilog, but I am not sure about the Quicklogic one).

However, here is a caveat of using reference designs offered by device manufacturers. Even if the design is written in a device-independent form (uses generic Verilog or VHDL statements, and no vendor-specific primitives), you are often legally required to use the reference design only on that manufacturer's devices.

Opencores.org also has a free PCI IP core, but it is a lot more complex (it supports initiator and target transfers) than any of the above-mentioned reference designs, so I feel you will likely have a hard time modifying it to suit your own needs.

When modifying a PCI interface, the state machine examples in Appendix B of the PCI specification and the following article may be helpful.

http://www.eedesign.com/editorial/1995/fpgafeature9502.html

Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.)