Small CPLDs have some advantages over FPGAs: more predictable performance, faster pin-to-pin delays, non-volatility, and most importantly, conceptually easier-to-grasp design methods and simpler software. As CPLDs get bigger, these advantages become less important, and the disadvantages (high static power consumption and a severely limited number of flip-flops) become more annoying. So SRAM-based FPGAs become a more attractive alternative beyond a certain size. FPGA pin-to-pin delays are now the same as CPLDs', and there are far more flip-flops, while the static power is almost zero. (Nobody can avoid the dynamic power.) And volatility has become a non-issue in almost all cases. CoolRunner is the only CPLD that avoids the static power; that's why it can implement more macrocells in a meaningful way. CPLDs and FPGAs really serve different applications, with little overlap. Peter Alfke, Xilinx Applications Tim Tuan wrote: > Hi, why doesn't Xilinx make their 9500 CPLDs any larger. What's the > constraining factor? > > Thanks, > -T > mailto:timtuan@yahoo.comArticle: 20726
You need 1. an adjustable oscillator at 64 MHz, 2. a divide-by-64 counter, and 3. a phase comparator at 1 MHz. 2 and 3 are very easy to implement. #1 is the problem. You could build a ring oscillator out of a chain of reasonably fast delay elements. Let's assume each element has a delay of 32 ps (makes the math easier). You would cascade 250 to achieve the half-period delay of roughly 8 ns at 64 MHz. The delay drifts with temperature and supply voltage, and you must adjust it from the phase comparator. But you can only adjust the half-period with a granularity of 32 ps, i.e. you will have an uncontrollable sporadic frequency error of up to plus/minus 120 kHz. Can you tolerate that amount of uncertainty and jitter? Next question is, where do you find, and how do you multiplex, these fine-grained delay elements? Ray suggested the best programmable element there is, the carry chain. But that is still too coarse for you (I think). Xilinx uses dedicated circuitry in the digital DLL in all Virtex devices. And we think that is still too coarse for a PLL, where frequency errors are cumulative and must be fixed all the time. (In a DLL the error is not cumulative, and a 35 ps error is usually acceptable.) That's why people use an analog PLL, in spite of all the awful headaches it creates. Peter Alfke Nestor wrote: > Hi everyone. > > I am interested in building a clock synthesizer using an FPGA. My aim is to > generate a higher frequency clock from a lower frequency reference using an > FPGA. For instance, a 64MHz could be generated from a 1MHz reference. In > traditional analog phase-locked loops (PLL) this is possible. My intent is > to use a digital PLL (DPLL) or an analog-digital hybrid version of the DPLL > (everything digital except the VCO) to synthesize my higher frequency from > the lower reference. > > From what I have read, a DPLL approaches its analog equivalent if the loop > is oversampled. 
Does this mean that, in order to generate my 64MHz from the > 1MHz, I would need to use a sampling frequency higher than 64MHz? > > If this is true, then the analog PLL would be the better choice to > synthesize the 64MHz frequency since no frequency higher than 1MHz would be > required. > > Thanks in advance for any suggestions or other comments. > > Nestor > nestor@stansync.com > nestor@ece.concordia.caArticle: 20727
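The ring-oscillator numbers in Peter Alfke's reply above can be checked with a few lines of Python. This is only a rough sketch: the 32 ps element delay is the post's own assumption, and the function names are made up for illustration. It suggests that the ideal half period at 64 MHz is 7812.5 ps (the post rounds to 8 ns, and 250 elements would actually give 62.5 MHz), and that one 32 ps step near 64 MHz moves the frequency by roughly 260 kHz, which is in line with the quoted error of about plus/minus 120 kHz.

```python
# Sketch of the ring-oscillator arithmetic from the reply above.
# All names are illustrative; the 32 ps element delay is the post's assumption.

TAP_DELAY_PS = 32      # delay of one element, as assumed in the post
TARGET_HZ = 64e6       # desired oscillator frequency


def half_period_ps(freq_hz: float) -> float:
    """Half period of a clock, in picoseconds."""
    return 0.5 / freq_hz * 1e12


def ring_frequency_hz(taps: int, tap_delay_ps: float = TAP_DELAY_PS) -> float:
    """Frequency of a ring whose half period is taps * tap_delay_ps."""
    return 1.0 / (2 * taps * tap_delay_ps * 1e-12)


if __name__ == "__main__":
    print(f"ideal half period: {half_period_ps(TARGET_HZ):.1f} ps")
    for taps in (244, 245, 250):
        print(f"{taps} elements -> {ring_frequency_hz(taps) / 1e6:.3f} MHz")
    # Frequency change caused by adding or removing a single element.
    step_hz = ring_frequency_hz(244) - ring_frequency_hz(245)
    print(f"one-element step near 64 MHz: {step_hz / 1e3:.0f} kHz")
```

Halving the step gives the worst-case distance from the target frequency, which is where the "can you tolerate that jitter?" question comes from.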
Hello, Does any one know of any way to implement an 18 bit wide FIFO in Virtex without utilizing 100% of two Block RAMs? Regards, K. IraniArticle: 20728
If you are synthesizing, you should be using RTL code rather than a behavioral description. If the behavioral description is even synthesizable, it likely does not give the synthesizer enough information to guess at an implementation. Synplicity will infer RAMs and ROMs from an RTL integer array. ritchie99_uk@my-deja.com wrote: > HI ALL, > > what's the performance of the actual synthesisers for a behavioural > vhdl input > i am looking to target the VIRTEX-E with a behavioural vhdl input, and > here i don't know which synthesiser(s) is (are) good especially for > inferring Block Rams and distributed RAMs > > i read that "exemplar" infers automatically the BRAM , what about FPGA > express that come with F2.1i, is it good ??? > same question for synplify and symplicity ( i hope that it's the right > spelling ...) > > thanks in anticipation > > --ritchie > > Sent via Deja.com http://www.deja.com/ > Before you buy. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20729
How deep? You can use the CLB Ram for relatively small FIFOs, same way as with 4K designs. Keyvan Irani wrote: > Hello, > > Does any one know of any way to implement an 18 bit wide FIFO in > Virtex without utilizing 100% of two Block RAMs? > > Regards, > K. Irani -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20730
I still have the above tools installed, but I haven't done any FPGA design for about 2 years, and before that it was mostly ASIC prototyping with large 3k devices. The supported devices are 2k, 3k and 4k and I know the 2k are obsolete, and the 3k are still available but pricing isn't too good. The 4k *appear* to still be a reasonably good choice for a long-life project, because it seems to me that the newer Xilinx families get "churned" a lot more quickly in today's super fast moving marketplace. I have some production-volume (hundreds) FPGA projects coming up and would be doing only small designs, say 5k gates or less. Am I right about the 4k range being a good choice for that? Or should I get rid of all this old (although just about totally bug-free and solid) stuff and get one of the starter kits which Xilinx offer? Are they any good? I know XACT6 is very hard to beat. Any advice appreciated. Peter. -- Return address is invalid to help stop junk mail. E-mail replies to zX80@digiYserve.com but remove the X and the Y. Please do NOT copy usenet posts to email - it is NOT necessary.Article: 20731
You won't be able to target much anything other than 4000E and 4000EX series parts with that. The new tools do run through place and route faster, and do a little better with automatic placement than Xact6, and the whole thing will run under NT. In the process though, you gain lots of bugs and less functionality in the floorplanner. Peter wrote: > I still have the above tools installed, but I haven't done any FPGA > design for about 2 years, and before that it was mostly ASIC > prototyping with large 3k devices. > > The supported devices are 2k, 3k and 4k and I know the 2k are > obsolete, and the 3k are still available but pricing isn't too good. > > The 4k *appear* to still be a reasonably good choice for a long-life > project, because it seems to me that the newer Xilinx families get > "churned" a lot more quickly in today's super fast moving marketplace. > > I have some production-volume (hundreds) FPGA projects coming up and > would be doing only small designs, say 5k gates or less. Am I right > about the 4k range being a good choice for that? Or should I get rid > of all this old (although just about totally bug-free and solid) stuff > and get one of the starter kits which Xilinx offer? Are they any good? > I know XACT6 is very hard to beat. > > Any advice appreciated. > > Peter. > -- > Return address is invalid to help stop junk mail. > E-mail replies to zX80@digiYserve.com but remove the X and the Y. > Please do NOT copy usenet posts to email - it is NOT necessary. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20732
Is not RTL level description a behavioural one?? I am a little bit confused by the jargon in use... In my understanding: - Structural VHDL is any VHDL design that is based on composing sub-designs (use Hierarchy). - Behavioural VHDL is any VHDL that is target independent be it DATA FLOW (boolean equations) or RTL (register transfer with IF, WHILE, CASE etc. statements). So a VHDL design can be structural and behavioural at the same time if the sub-designs are coded behaviourally. There is another type of VHDL: the one that takes advantage of the target HW structure (infer special blocks like BlockRAMs in Virtex for example). For me, this is a subset of VHDL, specific to a particular HW platform (not standard) . Any comments????Article: 20733
Hi everyone! I need to reprogram an old Lattice ispLSI 1016 and don't have the download cable at hand, so I hope one of you can help me get around the problem. My questions are: a) Can I program a single device from the PC printer port without any kind of buffering? b) If so, which pin of the LPT port connects to which programming line? Thanks in advance AageArticle: 20734
Tuesday, June 6, 2000, 7-9pm Los Angeles Convention Center *** Submission Deadline: March 10, 2000 -What is the Ph.D. Forum at DAC? The Ph.D. Forum at the Design Automation Conference is an annual event for Ph.D. students to present their work in front of a poster and have interactive discussions with attendees. The forum is hosted by SIGDA and is OPEN to all members of the DA community. It will take place during the SIGDA member meeting immediately after and adjacent to the DAC Cocktail party (Tuesday, 7-9pm), in Room 502A at the Los Angeles Convention Center. The motivation for this forum started from a 1996 NSF workshop entitled "Future Research Directions in CAD for Electronic Systems: Putting the D back in CAD." Since its debut at DAC 1998, we have had many outstanding Ph.D. candidates. They had some very positive comments about the forum! Please participate and make this a continuing success. -Goals The goals of the Ph.D. Forum are * for graduate students to get feedback on their thesis work from other researchers. * for the industry (CAD, system companies) to preview academic work-in-progress. * to provide a structured way for increasing interaction between academia and industry. -Eligibility Eligible students are those who expect to complete their thesis within 1-2 years, and those who have completed their theses in the 1999-2000 academic year. Pre-completion students must have a university-approved thesis proposal or at least one published conference paper. -What to Submit * A one-page abstract of the thesis in PDF, not including figures or references, and not to exceed 750 words. * A university-approved thesis proposal, or a published paper; this is required of ALL students. * Names of five reviewers whom the student would like to review the abstract. The submission will be reviewed to ensure that the abstract is supported via the accompanying paper/proposal. 
-Travel Grants (Notification April 30, 2000) Some funding will be made available to students to present their work. The criteria will be based on the quality of the presentation and the potential benefit for both the students and Forum attendees. -Forum Presentation (Tuesday, June 6, 2000, 7:00 - 9:00 p.m.) Students will present their work in a poster session, hosted by the SIGDA during their member meeting. http://www.eng.uci.edu/~daforum/ -- Pai H. Chou, Assistant Professor of ECE email chou@ece.uci.edu Henry Samueli School of Engineering, UC Irvine phone (949) 824-3229 444F Engineering Tower, Irvine, CA 92697-2625, USA fax (949) 824-3203Article: 20735
For all of you who have been asking, and those who wanted to know but were afraid to ask, I have finally gotten a page explaining distributed arithmetic up on my website. And for those who don't have a clue what I'm talking about, distributed arithmetic is a hardware technique that lets us hide lots of multipliers in an FPGA. Take a look and let me know what y'all think. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20736
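For readers who have not met distributed arithmetic before, here is a small software model of the general idea. It is a generic textbook-style sketch, not taken from Ray's page: it assumes constant coefficients and unsigned input samples, and the function names are invented for illustration. The point it tries to show is that a sum of products with fixed coefficients can be computed with one lookup table and a shift-and-add accumulator, with no multiplier anywhere.

```python
from typing import List


def build_lut(coeffs: List[int]) -> List[int]:
    """Entry 'addr' holds the sum of the coefficients whose bit is set in addr."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(coeffs: List[int], samples: List[int], bits: int) -> int:
    """Sum of coeffs[k] * samples[k] for unsigned 'bits'-wide samples.

    One bit plane of the samples is processed per step, as a bit-serial
    hardware implementation would do once per clock.
    """
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into a LUT address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        # Weight the looked-up partial sum by the bit position and accumulate.
        acc += lut[addr] << b
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 4, 2]
    samples = [5, 7, 1, 12]
    print(da_dot(coeffs, samples, bits=4))
    print(sum(c * x for c, x in zip(coeffs, samples)))
```

Both printed values should agree. In an FPGA the table would typically live in LUTs or CLB RAM, which is why the multipliers seem to disappear.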
My understanding of the terminology is that behavioral just emulates the function, while RTL also carries some information about the structure (namely the locations of the registers and the logic between them). For example, one could do a behavioral model of a pipelined multiplier by using the * operator plus a series of clock delays to match the pipeline length. An RTL description would describe, at least in high level terms, the logic between each of the pipeline stages. "J.R." wrote: > Is not RTL level description a behavioural one?? I am a little bit confused > by the jargon in use... > In my understanding: > - Structural VHDL is any VHDL design that is based on composing sub-designs > (use Hierarchy). > - Behavioural VHDL is any VHDL that is target independent be it DATA FLOW > (boolean equations) > or RTL (register transfer with IF, WHILE, CASE etc. statements). > > So a VHDL design can be structural and behavioural at the same time if the > sub-designs are coded behaviourally. > > There is another type of VHDL: the one that takes advantage of the target HW > structure (infer special blocks like BlockRAMs in Virtex for example). For > me, this is a subset of VHDL, specific to a particular HW platform (not > standard) . > > Any comments???? -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20737
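The pipelined-multiplier example in the reply above can be mimicked in software to make the distinction concrete. The following is a loose Python analogy, not VHDL and not anyone's actual design: the class names, the 8-bit operand width and the two-stage split are all assumptions chosen for illustration. The first model is "the * operator plus clock delays"; the second names each register and spells out the logic that sits between them.

```python
from collections import deque


class BehavioralMult:
    """The '*' operator followed by a delay line matching the pipeline depth."""

    def __init__(self, latency: int = 2):
        self.pipe = deque([0] * latency, maxlen=latency)

    def clock(self, a: int, b: int) -> int:
        out = self.pipe[0]        # oldest value falls out of the delay line
        self.pipe.append(a * b)   # maxlen discards the oldest entry
        return out


class RtlMult:
    """Two-stage 8x8 unsigned multiply with the inter-register logic explicit."""

    def __init__(self):
        self.pp_lo = 0   # stage-1 register: a * b[3:0]
        self.pp_hi = 0   # stage-1 register: a * b[7:4]
        self.prod = 0    # stage-2 register: pp_lo + (pp_hi << 4)

    def clock(self, a: int, b: int) -> int:
        out = self.prod
        # Next-state values are computed from the current registers,
        # then all registers update together, as on a clock edge.
        next_prod = self.pp_lo + (self.pp_hi << 4)
        next_lo = a * (b & 0xF)
        next_hi = a * (b >> 4)
        self.pp_lo, self.pp_hi, self.prod = next_lo, next_hi, next_prod
        return out


if __name__ == "__main__":
    beh, rtl = BehavioralMult(), RtlMult()
    for a, b in [(3, 5), (200, 17), (255, 255), (0, 0), (0, 0)]:
        print(beh.clock(a, b), rtl.clock(a, b))
```

Both models produce the same output stream, but only the second one tells a synthesizer where the registers go and what logic lies between them.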
> From what I have read, a DPLL approaches its analog equivalent if the loop > is oversampled. Does this mean that, in order to generate my 64MHz from the > 1MHz, I would need to use a sampling frequency higher than 64MHz? I think the answer to that one is "yes". Doing everything with digital logic works great if you want to make a slower clock - that is you get to divide by a big number. The divide ratio determines the jitter on individual clock edges. If you are dividing by 100, you can get the clock edge within 1% of a cycle of where you want it. You can make the long term frequency as accurate as you want by dividing by N on some cycles and N+1 on others. (Think of it as dividing by N plus a fraction.) I don't know how to multiply up with digital logic. -- These are my opinions, not necessarily my employers.Article: 20738
For a DPLL, you have a master clock that is several times higher than the clock you wish to synthesize. How high is determined by the amount of jitter you can allow in the generated clock. Typically you want that to be at least 16x. You can multiply a clock digitally with a delay lock loop, but you'll need access to some fairly small incremental delays and equal routing delays to make it work. Not an easy task in an FPGA. If I were going to do it, I'd probably look at using the carry chain for the delay line because it gives you the finest incremental delays available to the user. Hal Murray wrote: > > From what I have read, a DPLL approaches its analog equivalent if the loop > > is oversampled. Does this mean that, in order to generate my 64MHz from the > > 1MHz, I would need to use a sampling frequency higher than 64MHz? > > I think the answer to that one is "yes". > > Doing everything with digital logic works great if you want > to make a slower clock - that is you get to divide by a big number. > The divide ratio determines the jitter on individual clock edges. > If you are dividing by 100, you can get the clock edge within 1% of > a cycle of where you want it. > > You can make the long term frequency as accurate as you want by > dividing by N on some cycles and N+1 on others. (Think of it as > dividing by N plus a fraction.) > > I don't know how to multiply up with digital logic. > > -- > These are my opinions, not necessarily my employers. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20739
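Hal Murray's "divide by N on some cycles and N+1 on others" idea, quoted above, is easy to model. The sketch below is one common way to do it (an accumulator that overflows on the fractional part); the function names and the 100.25 example ratio are assumptions for illustration, not something from the thread. The second helper just restates the rule of thumb that edge placement error is about one master-clock period, i.e. 1/ratio of an output cycle.

```python
from typing import Iterator


def fractional_divider(n: int, num: int, den: int, cycles: int) -> Iterator[int]:
    """Yield the divide count used for each output cycle.

    The count is n + 1 on 'num' out of every 'den' cycles and n otherwise,
    so the long-term average divide ratio is n + num / den.
    """
    if not 0 <= num < den:
        raise ValueError("need 0 <= num < den")
    acc = 0
    for _ in range(cycles):
        acc += num
        if acc >= den:
            acc -= den
            yield n + 1
        else:
            yield n


def edge_jitter_fraction(ratio: int) -> float:
    """Worst-case edge placement error as a fraction of one output cycle."""
    return 1.0 / ratio


if __name__ == "__main__":
    counts = list(fractional_divider(100, 1, 4, 400))
    print("average divide ratio:", sum(counts) / len(counts))
    print("edge error at /100:", edge_jitter_fraction(100))
    print("edge error at /16: ", edge_jitter_fraction(16))
```

With a 16x master clock the edge error is already over 6% of a cycle, which gives a feel for why Ray calls 16x a minimum.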
On Fri, 18 Feb 2000 03:55:49 GMT, Ray Andraka <randraka@ids.net> wrote: >Was the movement limited to within a CLB? If not check the placement >report to see what and why it happened. > To which placement report are you referring? I've looked in the .par file, and other than a number of statements of the form "Resolved that CLB <DRAM_ADR_8> must be placed at site CLB_R20C11," I don't see anything that would help me figure out why things are moving. And placed logic is moving between CLBs, not just within CLBs. Bob Perlman >Bob Perlman wrote: > >> Hi - >> >> I don't know how many of you use the Xilinx M2.1 floorplanner. If you >> do, I have a question for you. >> >> Yesterday I used the floorplanner to place portions of a >> schematic-based XCS30XL design, and managed to go from a design that >> failed route after 1-1/2 hours (didn't complete route and didn't meet >> timing on the routed nets) to a design that routed and met all timing >> constraints in 40 minutes. So, I'm happy with the results, but was >> puzzled by the fact that the Xilinx tools moved some of the cells that >> I'd placed. Any RPMs that I placed stayed put, but cells that I'd >> moved individually into the placement window were sometimes in new >> places after routing. You could see that the place and route tools >> had kept the cells more or less where I'd placed them, but moved some >> cells around. >> >> Is this expected behavior when using the floorplanner? If so, what's >> to keep I/O pin assignments from moving? 
>> >> Thanks, >> Bob Perlman >> >> >> ----------------------------------------------------- >> Bob Perlman >> Cambrian Design Works >> Digital Design, Signal Integrity >> http://www.best.com/~bobperl/cdw.htm >> Send e-mail replies to best<dot>com, username bobperl >> ----------------------------------------------------- ----------------------------------------------------- Bob Perlman Cambrian Design Works Digital Design, Signal Integrity http://www.best.com/~bobperl/cdw.htm Send e-mail replies to best<dot>com, username bobperl -----------------------------------------------------Article: 20740
It can be done with careful use of opposite clock edges if the 40M and 80M clocks are phase locked but with some amount of unknown skew as long as the skew doesn't push you too close to the opposite edge. I know it can be done in a 4025E device if you are careful with placement (the register to register timing is tight, so the registers in opposite clock domains have to be placed in horizontally adjacent CLBs using the direct connects - that is the fastest register to register connect in a 4K device). Spend a lot of time on those clock domain crossings to make sure the skew doesn't kill ya. Rickman wrote: > Andy Peters wrote: > > > > Tom Burgess wrote in message <38A35E12.DE1CC4F8@hia.nrc.ca>... > > >The newer parts are amazing all right. Even a slow XLA should give > > >12.5 - (1.5 Tcko + 1.5? route + 3.0 Tgls + 0.7 Tecck) = 5.8 ns margin. > > >If we were talking about ye olde 4000 series of 5+ years ago, then "worry" > > >might have been the right word. > > > > Well, the part I'm using is a Spartan XL-4. I've decided to not worry about > > it! > > Maybe I am missing something. If you are generating a slower clock (40 > MHz) from a faster clock using a divide by 2 FF, then you will have skew > between your clock domains. Signals moving from the 40 MHz domain to the > 80 MHz domain will have a reduced setup time (by the amount of the > skew). This will not be easy to deal with since you are starting with > only 12.5 nS. > > But signals going from 80 MHz domain to the 40 MHz domain will have a > setup time based on the skew time, not the clock cycle time. The 40 MHz > clock edge is delayed from the 80 MHz clock edge. If you can't guaranty > that the minimum delay time for the signal is greater than the skew > time, which you can't, then you must use the skew time as your clock > cycle time for this signal! > > Am I missing something in the design? > > -- > > Rick Collins > > rick.collins@XYarius.com > > remove the XY to email me. 
> > Arius - A Signal Processing Solutions Company > Specializing in DSP and FPGA design > > Arius > 4 King Ave > Frederick, MD 21701-3110 > 301-682-7772 Voice > 301-682-7666 FAX > > Internet URL http://www.arius.com -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20741
Mathew Wojko wrote: > In article <38ACBDDF.CA310F0A@ids.net> you wrote: > : Mathew Wojko wrote: > > : > Ray Andraka (randraka@ids.net) wrote: > : > : Wallace trees are not generally the fastest multipliers in FPGAs. See the > : > > : > If you pipeline them they generally are. > : > > > : No, they are not. A wallace tree produces a sum vector and a carry vector. > : Those have to be added together to obtain the full sum. > > : However, that final adder determines the maximum clock rate of the > multiplier. > > Precisely. The Wallace tree is a carry-save architecture. When pipelined, > carry values only ever propagate one bit-position within each stage of > processing (no carry propagation latencies are experienced). Thus fast > clocking rates for this 'tree-part' of the multiplier can be achieved. > > However, when combining the carry and sum vectors, you do not want to > compromise the performance obtained thus-far from the 'tree-part' of > the multiplier. A simple ripple adder implemented using fast carry logic > will not yield the same performance as achieved by the wallace tree. > Thus overall performance will be affected. > > : Now fade to the FPGA. The fast carry chain logic in modern FPGAs is a highly > : optimized dedicated path that is about an order of magnitude faster than logic > : implemented in the LUT logic and connected via the general routing resources. > : That fact makes it extremely difficult to improve upon the performance of the > : carry chain ripple carry adder. > > This is the point that I don't necessarily agree on. I agree that you > cannot improve on the performance of a ripple carry adder. Using the > fast-carry logic provides unparalleled results for their implementation. > However, there exist other addition techniques that will provide better > pipeline performance when implemented on an FPGA. The trick is not > to ripple or propagate the carry great lengths between successive > pipeline stages. That carry has to be tightly pipelined. 
In the extreme case, you can pipeline the carry out of each bit, but you'll need to add skew and deskew registers to the design to compensate for the pipeline latency. If you do this, you will find that you can get a shorter clock period with a pipelined array multiply than you can with a wallace tree because the array multiplier's routing is all to nearest neighbors. The limiting factor in that case is the routing time required to get the multiplicands distributed across the array. Note the array can be either a row ripple or column ripple array and you still get the same tight routing. The pipeline latency is 2n for the array multiplier instead of n+logn for the wallace tree. The thing that slows down the wallace tree when compared to this extreme case is the length of the routes - a wallace tree has a quite complicated routing pattern compared to the very simple routing of an array multiplier. Note that a wallace tree is to a column ripple array as a row ripple tree is to a row ripple array. Both the former are tree implementations of the latter to reduce the transport delay. The resources for the wallace tree with a pipelined ripple carry output adder are the same as for an array multiplier. In either case, the improvement over a partial products tree is not as great as you might expect because the array multiplier/wallace tree multiplier is more than twice the area of the partial products multiplier (area translates to longer routes for inputs). The inputs also have to fan-out to more loads, which requires additional buffering to keep the speed up. I found some years ago (and I recall being somewhat surprised by the result at first) that at least in Xilinx 4K, you get a shorter clock period and lower latency out of two partial products type multipliers than out of a single pipelined array multiplier, and the area, without even considering the skew/deskew registers needed for the array, is about the same for both. 
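For anyone following the carry-save argument, here is a small bit-level model of what a Wallace-style tree does. It is only a functional sketch under simple assumptions (unsigned operands, rows grouped three at a time, hypothetical function names); it says nothing about routing or clock speed, which is what this thread is really arguing about. It does show the two points both sides agree on: each 3:2 stage moves a carry by only one bit position, and a carry-propagate add of the last two vectors is still needed at the end.

```python
from typing import List, Tuple


def partial_products(a: int, b: int, width: int) -> List[int]:
    """One shifted copy of a for every set bit of b (unsigned operands)."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """Bitwise 3:2 compressor: returns (sum vector, carry vector).

    Every carry bit moves exactly one position left; nothing ripples.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def wallace_reduce(rows: List[int]) -> List[int]:
    """Compress rows three at a time until at most two vectors remain."""
    rows = list(rows)
    while len(rows) > 2:
        nxt: List[int] = []
        for i in range(0, len(rows) - 2, 3):
            nxt.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fill a group of three pass through unchanged.
        nxt.extend(rows[len(rows) - len(rows) % 3:])
        rows = nxt
    return rows


def wallace_multiply(a: int, b: int, width: int = 8) -> int:
    """Tree reduction followed by the final carry-propagate addition."""
    return sum(wallace_reduce(partial_products(a, b, width)))


if __name__ == "__main__":
    print(wallace_reduce(partial_products(13, 11, 4)))
    print(wallace_multiply(13, 11, 4), 13 * 11)
```

The final `sum` of the two vectors stands in for the adder whose implementation (ripple carry chain versus a lookahead scheme) is the point of disagreement.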
Throw in the skew and deskew registers needed to pipeline the carries, and you go way over on the area. I expect Virtex to come out even more favorable for the partial products multiplier because its carry structure lets you do a 2xN partial product in one layer of logic. > > > : This non-homogenous mix of logic means that the > : cheap ripple carry adder is about as fast as you're gonna get in the FPGA (short > : of pipelining the carry) for word widths up to around 24-32 bits. > > Exactly. If you pipeline the carry then you can achieve a matching > performance result to that of the wallace tree. Remember that the > Wallace tree pipelines the carry result at every stage of processing. > That's why it's called a carry-save technique. Why you would want to > use a carry ripple adder after expending the extra logic to implement > a Wallace tree to reduce partial products is beyond me. > > : The result is > : a wallace tree buys you nothing in terms of area, and in fact is twice as big as > : a row-ripple tree because the ripple carry adders use one LUT per bit (the > : carry is in dedicated logic in xilinx or splits the lut in altera) where the full > : adders in the wallace tree need two luts per bit (one for sum, one for carry). > > I agree that the wallace requires more area than the row-ripple tree. As > you have pointed out, that's true because you do not pipeline the carry > values in a row-ripple tree (what I call vector based computation), > whereas in the wallace tree you do. As such, the wallace tree *does* give > you added performance for area. The clocking speed is substantially faster > since carry values only propagate one bit position between pipeline stages > rather than up to 2n bits as in the row-ripple technique. > > : The larger area costs clock cycle time since the routing in FPGAs has substantial > : delay. 
Now pipelining will get back the performance (requires a register > : immediately in front of the final adder for best clock speed), but the fact of > : the matter is you are still limited by the speed of that final adder. > > But that's my point. Why include a carry ripple adder at the final > stage? This is the obvious performance limiting factor. By using carry > lookahead techniques you can obtain better performance results than > the carry ripple adder. Regardless of the carry ripple adder implemented > by the fast-carry logic. > > So a > : wallace tree gets you at best, the same performance as a row-ripple tree with > : double the area (more if you use partial product techniques at the front layer). > : This is why a wallace tree multiplier is not appropriate for an FPGA. > > Sorry, but I disagree. A wallace tree multiplier is appropriate for > an FPGA *if* you use the appropriate adder to combine the sum and carry > results. The BCLA adder is a perfect addition technique to combine with the > wallace tree. Using this, (implemented correctly) the pipeline latency > at every stage of processing will only be from one 4-input LUT output to a > register. Thus this technique matches well to both ALTERA and Xilinx FPGA > architectures. See my comments above. I still stand by my assertion that a wallace tree multiplier is rarely appropriate for an FPGA with a fast carry chain. The more than doubled area does not give doubled performance because of routing delays inside the array (which for a wallace tree are not necessarily to nearest neighbors) and more importantly in the distribution of the multiplicands to the array. > > > : That said, the column route delay penalty in Altera 10K devices does make a > : wallace tree a little more attractive for pipelined trees that cannot fit in one > : row. 
The reason for that is the clock period is limited by the delay from the > : output register on one level of the tree through the carry chain to the msb > : output register of the next level. If the levels cross a row boundary, there is > : a significant delay hit which will reduce the clock frequency unless additional > : registers are added ahead of and in the same row of the carry chain. If the tree > : extends across several rows, several layers of pipeline registers are needed if > : the tree is all ripple carry adds. A wallace tree can reduce the hit, but again > : at the expense of a considerable amount of area...and that is only true for trees > : that extend across more than two rows. You get the same clock cycle performance > : in less area by simply adding the extra pipeline registers instead of doing a > : wallace tree, but at the expense of a little clock latency. Note that this is a > : special case. The other special case occurs in FPGAs without carry chains, where > : in order to get an advantage by using a wallace tree, your final adder should use > : a fast carry scheme. > > : > However, it depends on how you define speed. If you are referring to the > : > clocking rate, then a fully pipelined Wallace tree multiplier will provide > : > the best results - over vector and array based techniques. However, > : > Wallace trees require a large amount of device resource to do so (CLB count). A wallace tree is a vector approach. It is the tree implementation of a column ripple array. A column ripple array uses carry save adders just like the wallace tree except they are connected to the next row instead of in a tree. > > : > > : > If you are interested in pipelined structures and associated clocking > : > rates, be prepared to experience an area/time tradeoff for multiplication > : > implementations. That is, the faster you wish to clock the implementation, > : > the more area you will have to use. 
> : > > : > If you are interested in the functional density of the implementation, > : > I'd say that vector based approaches (which add partial products in parallel > : > - using fast carry logic) provide best utilisation results. > > : For FPGAs with fast carry chains, these partial product techniques also provide > : the fastest multipliers short of pipelining the carries. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 20742
If the placement found conflicts it would be in the PAR report. Sounds to me like your floorplan file for the rev you are working on didn't get updated with your floorplan changes. The only times I've seen stuff move from a CLB I floorplanned are 1) RLOCs/LOCs in the source or constraints override the floorplan or 2) I specified the wrong floorplan file or 3) I didn't set the floorplan file. In 2.1 if you specify a floorplan file outside of the rev, it copies that into that rev when you specify the floorplan. If you then go and modify the original floorplan file, it doesn't get copied into the rev again unless you go back and respecify the floorplan file. It's a little backasswards I think, but I've gotten used to it. Hope that's what's wrong...sure wouldn't want that as a bug waiting to bite. Bob Perlman wrote: > On Fri, 18 Feb 2000 03:55:49 GMT, Ray Andraka <randraka@ids.net> > wrote: > > >Was the movement limited to within a CLB? If not check the placement > >report to see what and why it happened. > > > > To which placement report are you referring? I've looked in the .par > file, and other than a number of statements of the form "Resolved that > CLB <DRAM_ADR_8> must be placed at site CLB_R20C11," I don't see > anything that would help me figure out why things are moving. And > placed logic is moving between CLBs, not just within CLBs. > > Bob Perlman > > >Bob Perlman wrote: > > > >> Hi - > >> > >> I don't know how many of you use the Xilinx M2.1 floorplanner. If you > >> do, I have a question for you. > >> > >> Yesterday I used the floorplanner to place portions of a > >> schematic-based XCS30XL design, and managed to go from a design that > >> failed route after 1-1/2 hours (didn't complete route and didn't meet > >> timing on the routed nets) to a design that routed and met all timing > >> constraints in 40 minutes. So, I'm happy with the results, but was > >> puzzled by the fact that the Xilinx tools moved some of the cells that > >> I'd placed. 
Any RPMs that I placed stayed put, but cells that I'd > >> moved individually into the placement window were sometimes in new > >> places after routing. You could see that the place and route tools > >> had kept the cells more or less where I'd placed them, but moved some > >> cells around. > >> > >> Is this expected behavior when using the floorplanner? If so, what's > >> to keep I/O pin assignments from moving? > >> > >> Thanks, > >> Bob Perlman > >> > >> > >> ----------------------------------------------------- > >> Bob Perlman > >> Cambrian Design Works > >> Digital Design, Signal Integrity > >> http://www.best.com/~bobperl/cdw.htm > >> Send e-mail replies to best<dot>com, username bobperl > >> ----------------------------------------------------- > > ----------------------------------------------------- > Bob Perlman > Cambrian Design Works > Digital Design, Signal Integrity > http://www.best.com/~bobperl/cdw.htm > Send e-mail replies to best<dot>com, username bobperl > ----------------------------------------------------- -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka Article: 20743
I suppose two bits are parity. You could FIFO only 16 bits and recreate parity at the output (takes one half of a Virtex CLB). Otherwise, 18 is just an awkward number... Peter Alfke Ray Andraka wrote: > How deep? You can use the CLB Ram for relatively small FIFOs, same way > as with 4K designs. > > Keyvan Irani wrote: > > > Hello, > > > > Does any one know of any way to implement an 18 bit wide FIFO in > > Virtex without utilizing 100% of two Block RAMs? > > > > Regards, > > K. Irani > > -- > -Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email randraka@ids.net > http://users.ids.net/~randraka Article: 20744
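Editor's sketch for the 18-bit FIFO reply just above: a minimal Python model of the "store 16 bits, recreate parity at the output" idea. It assumes, as the reply only supposes, that the two extra bits are one parity bit per data byte. The even-parity convention, the bit layout (parity in bits 17:16, data in bits 15:0) and the function names are illustrative assumptions, not something stated in the thread.

```python
def byte_parity(byte):
    """Even-parity bit for one byte: 1 when the byte holds an odd number of ones."""
    return bin(byte & 0xFF).count("1") & 1


def strip_parity(word18):
    """Keep only the 16 data bits before writing to the FIFO.

    Assumes bits 17:16 of the incoming word carry the parity bits.
    """
    return word18 & 0xFFFF


def restore_parity(data16):
    """Rebuild the 18-bit word at the FIFO output from the 16 stored data bits."""
    data16 &= 0xFFFF
    p_lo = byte_parity(data16)       # parity over data bits 7:0
    p_hi = byte_parity(data16 >> 8)  # parity over data bits 15:8
    return (p_hi << 17) | (p_lo << 16) | data16
```

This only works when the extra bits are a pure function of the data. If they carry independent information such as framing flags, they have to travel through the FIFO with the data.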
>You won't be able to target much anything other than 4000E and 4000EX >series parts with that. The new tools do run through place and route >faster, and do a little better with automatic placement than Xact6, and >the whole thing will run under NT. In the process though, you gain lots >of bugs and less functionality in the floorplanner. Looks like I should go for the new tools. I paid some $15k for the stuff I have, and recently someone posted a crack for both the dongles, so I was quite happy to have the investment protected. :) How do the bugs get found? Do you mean that the design does not work and there is no indication of why, or does post-route DRC find them? Are the 4000E and EX parts obsolete or just too expensive for what they do? I did like one particular feature in the old tools which was great for multiple clock signals: if you placed the L and SC=1 (skew of 1 ns max) attributes on a clock line, it would use a long line to run that clock. APR worked just fine with this. However, PPR broke this, and one was back to doing it "properly" using the 1 or 2 global clock nets. This really hammers dynamic power consumption, often increasing it several times over. Peter. -- Return address is invalid to help stop junk mail. E-mail replies to zX80@digiYserve.com but remove the X and the Y. Please do NOT copy usenet posts to email - it is NOT necessary. Article: 20745
Thanks Peter, Ray and Hal for your input. Since creating a completely digital DPLL in an FPGA looks to be quite difficult, what about creating a hybrid PLL where only the voltage-controlled oscillator would be external (analog) and the rest (phase detector, loop filter and divide-by-N) would be designed in the FPGA? In most previous cases I have seen, only the phase detector was designed digitally and the rest was still analog. I wouldn't mind building a hybrid PLL solution as long as I could be guaranteed that clock multiplication would be possible. Otherwise, I would have to resort to a completely analog design... Thanks in advance. Nestor nestor@stansync.com nestor@ece.concordia.ca Article: 20746
Peter wrote: > >You won't be able to target much anything other than 4000E and 4000EX > >series parts with that. The new tools do run through place and route > >faster, and do a little better with automatic placement than Xact6, and > >the whole thing will run under NT. In the process though, you gain lots > >of bugs and less functionality in the floorplanner. > > Looks like I should go for the new tools. I paid some $15k for the > stuff I have, and recently someone posted a crack for both the > dongles, so I was quite happy to have the investment protected. :) > > How do the bugs get found? Do you mean that the design does not work > and there is no indication of why, or does post-route DRC find them? > Most of the bugs I've seen so far are in the mapper and floorplanner. When they occur, mapper exits with errors. The 'pushbutton' flow seems to be pretty bug free. Bugs are things like not allowing a legal combination in a CLB, floorplanner not dealing with RLOC correctly and the like. > > Are the 4000E and EX parts obsolete or just too expensive for what > they do? Spartan parts have the same functionality, are faster and are cheaper. The E and EX are not obsolete...yet. They are the oldest families still sold though. > > > I did like one particular feature in the old tools which was great for > multiple clock signals: if you placed an L and SC=1 (skew of 1ns max) > attributes onto a clock line, it would use a long line to run that > clock. APR worked just fine with this. However PPR broke this, and one > was back to doing it "properly" using the 1 or 2 global clock nets. > This really hammers dynamic power consumption, increasing it often > several times. This is the type of thing I meant when referring to less functionality. > > > Peter. > -- > Return address is invalid to help stop junk mail. > E-mail replies to zX80@digiYserve.com but remove the X and the Y. > Please do NOT copy usenet posts to email - it is NOT necessary. -- -Ray Andraka, P.E. 
President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka Article: 20747
You can use a run-of-the-mill analog PLL chip such as the widely available 74FCT88915 for the PLL and use the FPGA or a CPLD to do the reference and feedback divides. I like the 88915 because it is available from several vendors (Motorola, IDT, Cypress), has a simple external loop filter, and has 2x, x/2 and multiple 1x outputs that are skew controlled. By itself, it is a PLL-based low-skew clock driver, but with the addition of external dividers it is quite a capable clock synthesizer. Another PLL I've used is a National CGS410 pixel clock generator. It has the dividers inside and also has a simple loop filter. nestor@ece.concordia.ca wrote: > Thanks Peter, Ray and Hal for your input. > > Since creating a completely digital DPLL in an FPGA looks to be > quite difficult, what about creating a hybrid PLL where only the > voltage-controlled oscillator would be external (analog) and the rest > (phase detector, loop filter and divide-by-N) would be designed in the > FPGA? > In most previous cases I have seen, only the phase detector was designed > digitally and the rest was still analog. I wouldn't mind building a > hybrid PLL solution as long as I could be guaranteed that clock > multiplication would be possible. Otherwise, I would have to resort > to a completely analog design... > > Thanks in advance. > > Nestor > nestor@stansync.com > nestor@ece.concordia.ca -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka Article: 20748
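Editor's sketch for the clock-synthesizer reply just above: the arithmetic behind putting the reference and feedback divides in the FPGA or CPLD. In lock, the phase detector sees equal frequencies on both inputs, so f_ref / R = f_vco / N and therefore f_vco = f_ref * N / R. The Python below works that out for the 1 MHz to 64 MHz case from the thread. The function names and the `max_div` limit are hypothetical, and the model ignores the VCO tuning range and any dividers inside the PLL chip itself.

```python
from fractions import Fraction


def synthesized_frequency(f_ref_hz, ref_div, fb_div):
    """Locked output frequency for reference divider R and feedback divider N.

    In lock f_ref / R == f_vco / N, hence f_vco = f_ref * N / R.
    """
    if ref_div < 1 or fb_div < 1:
        raise ValueError("dividers must be positive integers")
    return Fraction(f_ref_hz) * fb_div / ref_div


def find_dividers(f_ref_hz, f_target_hz, max_div=1024):
    """Smallest integer (R, N) pair that hits f_target exactly, or None.

    max_div is an assumed limit on counter length, not a device parameter.
    """
    ratio = Fraction(f_target_hz) / Fraction(f_ref_hz)  # reduced N / R
    ref_div, fb_div = ratio.denominator, ratio.numerator
    if ref_div > max_div or fb_div > max_div:
        return None
    return ref_div, fb_div


# Example from the thread: 64 MHz from a 1 MHz reference.
R, N = find_dividers(1_000_000, 64_000_000)
```

For the thread's example this gives R = 1 and N = 64, i.e. the divide-by-64 counter with a 1 MHz phase comparison mentioned earlier in the discussion. Non-integer ratios come out as a reduced R and N pair, at the cost of a lower comparison frequency.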
Hi, Does anybody have HDL source for a 32- or 16-bit divider? Smaller but parameterized dividers are also welcome. Regards, Dave. Sent via Deja.com http://www.deja.com/ Before you buy.