Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Hi, I am trying to implement a 16-bit RISC processor in an XC4005XL FPGA. I have written structural code for the RISC processor, but when I try to synthesize it, a window pops up saying "synopsys internal error". The error occurs when the software tries to "map combinational logic". I am not sure whether it is an error in the software or whether I have made a mistake. Please reply ASAP.

santosh
Article: 35276
I'm with Noddy on this one. The Block Rams are, in most cases anyway, remotely located with respect to the logic you want to use them with. In order to get any respectable performance out of them, the signals to and from them have to be pipelined too. Additionally, the Block RAMs are pretty much the slowest thing in the FPGA...at least I find that in 90% of the designs I do that use Block RAM, the block RAM is by far the limiting factor on clock rate. If you count up the CLBs needed for pipelining the signals to/from the block RAM, then many times using just CLBs doesn't look so expensive, especially if you start having to clock the BRAM at 1/2 the clock and accessing 2 words at a time. Sounds like what he has is a limited set multiplier (see my web page) with the encoded phase angle as one input, and the input signal as the other. In practice this isn't much different than the partial products multiplier except it uses less luts and less adders than a full parallel multiplier. BRAMs are not the way to do that in most cases. Peter Alfke wrote: > There is one think I forgot to tell you: > The CLB-LUT is a combinatorial device: You give it a new address input, and you > get the new output "immediately", i.e. within a nanosecond or two. > The BlockRAM ( or BlockROM) is a clocked device, even in its read operation. You > get the new output after applying the next clock edge (You decide on rising or > falling). This sequential behavior has its advantages and disadvantages, but in > any case, you must be aware of it. > Sorry for not telling you before. > "Better placement" is puzzling. If you have an unused BlockRAM, it takes no > extra space at all, while your CLB implementation takes a fair number of CLBs > and routing resources. > > Peter Alfke, Xilinx Applications. > ======================== > Noddy wrote: > > > Thanks for the suggestion... did it, but it appears I get better placer > > scores using the distrbted arithmetic instead. 
Also, (I'm using the LUT's in > > a digital mixer), I get timing problems using the Block RAM, but not using > > the DA. > > > > adrian > > > > > I would suggest putting this big LUT in one of the available BlockRAMs, > > each of > > > which is 4096 bits, organized any way you want ( and dual-ported, although > > that > > > feature may not be any advantage in your case). > > > Saves area and routing, and will most likely be faster. > > > > > > Remember, you can obviously use any Virtex/SpartanII 4096 bit dual-ported > > RAM or > > > ROM=LUT also as two totally independent single-port 2048 bit RAMs or ROMs. > > > > > > Peter Alfke, Xilinx Applications -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 35277
In article <3BB349B2.712D1072@gmx.de>, Falk Brunner <Falk.Brunner@gmx.de> writes >yaohan schrieb: >> >> However, when I simulate using MAX+PLUS II, the results show that Y is >> update to A input values after short delay. As if the sensitivity list does >> not have any effect .. ( Or may be the compiler has included all the signal >> into sensivity list ...).. > >yes. From my point of view, the sensitivity list is just some kind of >dinosaur from the good, old days when VHDL was defined and the guys >wantet to make the language designed according to theoretical standards. >Nowaday (and even in the good old days) a compiler should have no >problem to find the input signals of a process (clocks, data) by itself. > Sorry Falk, I can't agree with you about that. There are plenty of situations in test benches where I really NEED to control what's going on by means of a sensitivity list. There is no "a priori" way of inferring, from the process code, what should be in the sensitivity list. Of course, if you are restricted to the standard synthesisable subset, then it IS quite easy to infer the sensitivity list from the procedural code; many synthesis tools do exactly that, and some (to their great shame IMHO) don't even report it as a warning if their inferred list differs from your explicit one. But it seems to me to be very important that we keep as much control as possible in our simulations, and that synthesis should agree with simulation behaviour in a "correct by construction" way. To achieve that requires that synthesis and simulation interpret sensitivity lists in the same way. Just my opinion, of course. -- Jonathan Bromley DOULOS Ltd. 
Church Hatch, 22 Market Place, Ringwood, Hampshire BH24 1AW, United Kingdom Tel: +44 1425 471223 Email: jonathan.bromley@doulos.com Fax: +44 1425 471573 Web: http://www.doulos.com ********************************** ** Developing design know-how ** ********************************** This e-mail and any attachments are confidential and Doulos Ltd. reserves all rights of privilege in respect thereof. It is intended for the use of the addressee only. If you are not the intended recipient please delete it from your system, any use, disclosure, or copying of this document is unauthorised. The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.Article: 35278
> 1/2 the clock and accessing 2 words at a time. Sounds like what he has is a limited > set multiplier (see my web page) with the encoded phase angle as one input, and the > input signal as the other. In practice this isn't much different than the partial > products multiplier except it uses less luts and less adders than a full parallel > multiplier. BRAMs are not the way to do that in most cases. Thanks for the advice. This is pretty much what I am doing, with I/Q signals going in different LUT's, followed by bus multiplexers which are driven by the phasor angle input. ( got the 90 degree rotation working fine, but can you give any advice as to the optimal configuration for a 45 degree rotation. Having one large LUT seems very inefficient - I tried splitting into two LUTs, followed by two adders to sort out the real and imaginary terms. Thanks AdrianArticle: 35279
Which synthesis tool? Synplify has no problem duplicating signals like you need, and the P+R tools do not seem to merge them back... --a Jens-Christian Lache wrote: > > John_H wrote: > > > Please let us know > > 1) synthesis tool, > > 2) coding language > > > > Jens-Christian Lache wrote: > > > > > Hi! > > > To reduce the fanout of a tristate signal leading to 64 iobs I > > > tried dublicate this signal. How do I tell > > > the synthesis tool now not to remove my dublicated logic? > > > ( I tried to use a BUFG as well, but that didn't work at all) > > > thanks for your help, > > > -jc- > > Hi! > 1) foundation 3.1i > 2) vhdl > > In the "Libraries Guide 3.3.06i -- Online" under Design Elements, > BUF is a comment about this problem: > > .. the buffer is preserved by attaching an X (explicit) attribute to > both the input and output nets of the BUF." > > I tried it, but it didn't work. > > This is the code: > http://d6.design.chalmers.se/jctmp/jctmp/specache.vhd > > I would like to tell the synthesis tool a max fanout for the > > readBuffer0Pipeline1: FDCE FF and to have > it generate several nets with smaller fan out > automatically. > > Thanks a lot!!!!!!!!!!!!! > > -jc-Article: 35280
Has anyone had success using Xilinx's System Generator for FPGA DSP design? Is the tool relatively stable and mature? Thanks.Article: 35281
"Mike R." <mrandelzhofer@uumail.de> skrev i meddelandet news:3bb334e7$0$192$4d4ebb8e@read.news.de.uu.net... > Hi Matthias, > > The only big disadvantage of SRAM fpgas is: > there is no single chip solution.You always need a configuration memory on > startup. > The AT94Sxx Secure FPSLIC contains the FPGA and Configurator in the same package. Should be OK for anyone. -- Best regards, ulf at atmel dot com The contents of this message is intended to be my private opinion and may or may not be shared by my employer Atmel SwedenArticle: 35282
It depends on how many angles you wish to have in your rotation table, and on the sample rate desired. If it is only a few angles, then the partial product multiplies work fine, especially if augmented with two's complementers so that the mixer works in one quadrant only and the other quadrants are obtained by selected negation. Rather than using LUTs, however, I often use a CORDIC rotator as described in my article in the Xilinx XCELL journal last december (there is a link to the article on my website under the publications page). This is especially true when you are dealing with complex signals, as the CORDIC often winds up being less logic, and is capable of arbitrarily fine phase resolution, and done right has less error than a multiplier solution. Noddy wrote: > > 1/2 the clock and accessing 2 words at a time. Sounds like what he has is > a limited > > set multiplier (see my web page) with the encoded phase angle as one > input, and the > > input signal as the other. In practice this isn't much different than the > partial > > products multiplier except it uses less luts and less adders than a full > parallel > > multiplier. BRAMs are not the way to do that in most cases. > > Thanks for the advice. This is pretty much what I am doing, with I/Q signals > going in different LUT's, followed by bus multiplexers which are driven by > the phasor angle input. ( got the 90 degree rotation working fine, but can > you give any advice as to the optimal configuration for a 45 degree > rotation. Having one large LUT seems very inefficient - I tried splitting > into two LUTs, followed by two adders to sort out the real and imaginary > terms. > > Thanks > Adrian -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 35283
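A rough behavioural sketch of the kind of CORDIC rotator suggested in the post above, combined with the "other quadrants by selected negation" idea. It is not taken from the XCELL article: the function name `cordic_rotate`, the floating-point arithmetic, the 16 iterations and the final gain division are all assumptions made for illustration. A hardware version would use fixed-point shifts and adds and would usually absorb the gain elsewhere.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians (assumed within [-pi, pi]) using CORDIC."""
    # Coarse step: a +/-90 degree rotation is only a swap and a negation,
    # which brings the remaining angle into the range CORDIC converges over.
    if angle > math.pi / 2:
        x, y, angle = -y, x, angle - math.pi / 2
    elif angle < -math.pi / 2:
        x, y, angle = y, -x, angle + math.pi / 2

    gain = 1.0
    for i in range(iterations):
        d = 1.0 if angle >= 0 else -1.0      # steer the residual angle toward zero
        shift = 2.0 ** -i                    # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        angle -= d * math.atan(shift)        # small table of arctangents in hardware
        gain *= math.sqrt(1.0 + shift * shift)

    # Each micro-rotation stretches the vector; divide the accumulated gain out.
    return x / gain, y / gain

# Example: a 45 degree rotation of the unit vector on the I axis.
i_out, q_out = cordic_rotate(1.0, 0.0, math.pi / 4)
```

Phase resolution in this model is set by the number of iterations rather than by table size, which is the property the post points to when comparing against a LUT-based mixer.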
Russell Shaw wrote: > Hi all, > > How's the cypress warp 6.1 ($99) compare to other cheap/free > VHDL editors/synthesizers? > > I'm interested in the delta39k devices which have internal ram. > > A particularly attractive feature of the delta39k parts (which > seem to be in a similar area to the altera acex 1k parts), is > that they have internal flash and internal (as well as external) > boot option. The acex 1k parts don't have flash, so you always > need an external configuration device, which takes up space, > inventory, and is more exposed to copying... > > http://www.cypress.com/pld/warp.html?homeadwarp > > -- > ___ ___ > / /\ / /\ > / /__\ / /\/\ > /__/ / Russell Shaw, B.Eng, M.Eng(Research) /__/\/\/ > \ \ / Victoria, Australia, Down-Under \ \/\/ > \__\/ \__\/ I don't have experience using the actual Cypress PARTS but I've been using Warp for about 4 years. I find it to give better error messages than most VHDL compilers, but some things that I synthesize fairly easily using Synplify I don't synthesize quite so easily in Cypress. One frustrating error message you will get however is the mysterious "Can't handle xxxx in final equations". I usually find that it is some way I've coded something and Cypress doesn't like the way I did it. In general, however, I am pleased with my investment and look forward to actually using the PARTS in a design. What is holding me back is the expensive cable for programming the devices...Article: 35284
Jonathan Bromley wrote:
>
> Sorry Falk, I can't agree with you about that. There are plenty
> of situations in test benches where I really NEED to control what's
> going on by means of a sensitivity list. There is no "a priori" way

Hmm. Can you explain this in more detail or give a simple example? I don't get it at all. I don't have much experience with VHDL testbenches (we only use ...... ;-)

> of inferring, from the process code, what should be in the sensitivity
> list.
> Of course, if you are restricted to the standard synthesisable

I do ;-)

> subset, then it IS quite easy to infer the sensitivity list from the
> procedural code; many synthesis tools do exactly that, and some
> (to their great shame IMHO) don't even report it as a warning if
> their inferred list differs from your explicit one. But it seems
> to me to be very important that we keep as much control as possible
> in our simulations, and that synthesis should agree with simulation
> behaviour in a "correct by construction" way. To achieve that
> requires that synthesis and simulation interpret sensitivity lists
> in the same way.

Yes. 100% ACK.

--
Regards
Falk
Article: 35285
"Mike R." wrote: > Hi Matthias, > > The only big disadvantage of SRAM fpgas is: > there is no single chip solution.You always need a configuration memory on > startup. > > The ideal configuration memory is available from XILINX: > The XC18V00 series of jtag in system programmable fpga proms - see > http://www.xilinx.com/partinfo/ds026.pdf > They work well, but are difficult to get and expensive. > > There is often a byte or word wide flash memory in the system which could be > used for the fpga configuration. > You can built your own cpld prom loader with a jtag interface, but you must > use a separate jtag chain for the cpld and the prom loader. > The jtag interface is well documented and simple, but the spi interface > needs less logic to implement. > > The spi interface is a simple 4 wire interface for data transfer from a > master to a slave device: Mike, I've often thought of doing this and, as you say, the h/w is relatively straightforward. What's always put me off is figuring out how to do an NT driver to run the parallel port in raw mode. Do you have any advice where I could at least find the low level IO stubs. With that I might grit my teeth and figure out how to use MS VC/C++.Article: 35286
The tool is stable and works well as long as your design can be done entirely using the Xilinx coregen components. If you want to do your own DSP element, say a CORDIC rotator for example, there is no easy way to get it into the System Generator tool. If it is not in there, the tool becomes more or less useless for system simulation, so in that case it is not much different from a block editor similar to Renoir or Aldec. You'll also find that doing control logic, state machines and other "random" stuff is exceedingly difficult. That said, the tool is a big step in the right direction; it just needs some more time to grow up to be truly useful for many designs.

Tony Kirke wrote:
> Has anyone had success using Xilinx's System Generator for FPGA DSP design?
> Is the tool relatively stable and mature?
> Thanks.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 35287
I have a basic question. Where can I get a .pin, .pad or .xnf file to create the part? I am using OrCAD 9.2 and I do have the create part option. I will gladly share the symbol if I can get the necessary file.
Article: 35288
Rick Filipkiewicz <rick@algor.co.uk> wrote: >> The spi interface is a simple 4 wire interface for data transfer from a >> master to a slave device: > >Mike, > >I've often thought of doing this and, as you say, the h/w is relatively >straightforward. What's always put me off is figuring out how to do an NT driver >to run the parallel port in raw mode. Do you have any advice where I could at >least find the low level IO stubs. With that I might grit my teeth and figure >out how to use MS VC/C++. > I have an NT kernel driver which I have developed for this purpose. I can license it to interested people. I also have a user level program to drive it. It would cost less than a week of your time. Muzaffer DSP implementations using FPGA http://www.dspia.comArticle: 35289
rafael plonka wrote:
> [...]
> I'm using the Quartus II 1.1 for designing with a APEX 20KE. The VHDL
> code (i.e. State-Machines) I generated with Mentor Graphics HDL Designer
> Series Pro. Now I tried to simulate my target device within the Quartus
> Software. The result was kind of frustrating: Signals I asserted in the
> same state of a state machine clocked with 133MHz had an output skew of
> more than 3ns (and these signals should have a setup and hold
> relationship to the corresponding clock signal of 1 ns as they are the
> command bits of a DDR SDRAM). After looking at the (in my opinion quite
> poor) online documentation, I know that there are different ways of
> "grouping" signals or logic together to reduce such skew or to
> accelerate the whole design, but so far I tried and had no real success.
> Does anyone know how to correct these problems? Also I detected strange
> flickerings at some signals with a frequency of more than 1.1GHz and a
> duty cycle of 25%. Where do these come from? I use signal frequencies of
> 33MHz and multiples of 2x, 4x and 8x (only on internal signals), but I
> don't think they have a relation to these flickering. Or can I ignore
> that problem? But it seems to me as if the simulation is quite exactly
> and the problems would occur in the chip also.

Hello Rafael,

I am working on an EP20K300E design myself right now, although with "only" a 100 MHz system clock. I also found some problematic timing behaviour, making all setup times hard to meet. I tried to switch to the new copper parts. Those simulation results were even worse. When I phoned Altera applications, they told me that the simulation timing parameters were not "finalized" yet and are rather pessimistic. When you look at the data sheets (at least when I started the design a couple of months ago), most (all?) timing parameters still showed a tbd (better tbm). By the way, I am also using Quartus II 1.1.

I find that a 100+ MHz design which does some "real" work is quite hard to implement. At cycle times of 10 ns or less all the delays are quite nasty. Anyway, try to group logic which belongs together. Use timing constraints where necessary.

Best Regards + good luck,
Henning Trispel
Article: 35290
You have two options:

1. If the ASIC clock rate is much higher than the sample rate, just use a few sufficiently wide multiply-accumulators in a time-sharing architecture. I.e. if you ran the ASIC at 4 times the sample rate you could use the same multiplier for 4 coefficients.

2. The other option is to use adders and shifters instead of a real multiplier. Find a set of coefficients that approximates your desired set and minimizes the number of adders. CSD (canonical signed digit) is a good representation for the coefficients in this case (something like x000111100000 can be handled using just two adders).

Of course, if the filter is symmetric you can use that to your advantage. If it is an interpolation or decimation filter, that will help too.

-- Nimrod.

kkdeep@mailcity.com (kuldeep) wrote in message news:<a0f016a9.0109252112.767f04b1@posting.google.com>...
> thanx robert but unfortunately i cannot change the coefficients to
> suit this method :-( so i have to look for some other method. any more
> pointers will be appriciated.
> --kuldeep
>
>
> "RM" <yeren@gmx.de> wrote in message news:<9oq2pu$i9q$1@f40-3.zfn.uni-bremen.de>...
> > the following reference (you can get more by google'ing with FIR BIT SERIAL
> > ftp://ftp.ittc.ukans.edu/pub/projects/DSP/FPGA/Bit_Serial.pdf
> > is intended for FPGA use, but it _might_ be suitable for asic, too, because it
> > is on designs that only include bit shift operations and additions instead of
> > multiplications. Bad side effect: you will have to calculate a new filter that
> > meets the restrictions of the method..
> > robert
> >
> > "kuldeep" <kkdeep@mailcity.com> wrote in message
> > news:a0f016a9.0109250458.7bb874f1@posting.google.com...
> > > i have to implement a 64 tap FIR filter with fixed coefficients in
> > > hardware. I have found some architecture suitable for fpga (using LUTs
> > > of fpga) which don't use multipliers. can somebody point me to
> > > architecture suitable for ASIC?. Since the coefficients are fixed, i
> > > want to optimize or avoid the multipliers.The input sample rate is 16
> > > MHz with 12 bits (in 2s complement).
> > > thanx
> > > kuldeep
Article: 35291
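On option 2 in the reply above: a small sketch of how a fixed coefficient can be recoded into canonical signed digit form and then applied with shifts and adds only. The helper names `to_csd` and `csd_multiply` are made up for this illustration, and the recoding shown is the standard non-adjacent-form algorithm rather than anything specific to the thread.

```python
def to_csd(value):
    """Return the CSD digits of an integer, least significant first.

    Every digit is -1, 0 or +1 and no two adjacent digits are non-zero.
    """
    digits = []
    while value != 0:
        if value & 1:
            # value mod 4 == 1 -> digit +1; value mod 4 == 3 -> digit -1,
            # which turns a run of ones into a single subtraction.
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits

def csd_multiply(sample, digits):
    """Multiply `sample` by the coefficient using only shifts, adds and subtracts."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit:
            acc += digit * (sample << shift)
    return acc

# The bit pattern quoted in the post: 000111100000 binary (480 decimal).
pattern_digits = to_csd(0b000111100000)
nonzero_terms = sum(1 for d in pattern_digits if d)
```

For that pattern the recoding leaves two non-zero digits (+512 and -32), so the product reduces to two shifted copies of the sample combined once. The fewer non-zero digits a coefficient set has, the fewer adders the filter needs.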
Every VHDL simulator I've used respects the sensitivity list for a process; if a simulator didn't do this, it wouldn't be VHDL compliant. And it's a subtle simulation bug to figure out when you've inadvertently left something important off the sensitivity list, and I've dealt with this a couple of times...

In a sense, this is exactly what you want in the "usual" model for an edge-triggered DFF:

    process(clk, rst)
    begin
      if rst = '1' then
        q <= '0';
      elsif clk'event and clk = '1' then
        q <= d;
      end if;
    end process;

In this case, d is left out of the sensitivity list on purpose: when it changes, it won't change the register value. This speeds up simulation somewhat.

Bill

Falk Brunner wrote:
>
> yaohan wrote:
> >
> > However, when I simulate using MAX+PLUS II, the results show that Y is
> > update to A input values after short delay. As if the sensitivity list does
> > not have any effect .. ( Or may be the compiler has included all the signal
> > into sensivity list ...)..
>
> yes. From my point of view, the sensitivity list is just some kind of
> dinosaur from the good, old days when VHDL was defined and the guys
> wantet to make the language designed according to theoretical standards.
> Nowaday (and even in the good old days) a compiler should have no
> problem to find the input signals of a process (clocks, data) by itself.
>
> --
> Regards
> Falk

--
Bill McDermith Staff Engineer
Mentor Graphics Corp (719) 265-9827
4920 Sapphire Drive Fax: 719-593-1653
Colorado Springs, CO 80918 e-mail: bill_mcdermith@mentor.com
Article: 35292
Hi, I'm developing a design for a 2V6000 using Xilinx 4.1i and Synplicity.

I have much experience in s/w at all levels, as well as digital and analog h/w, and even FPGA design years ago using schematics.

What is the fastest way to become a Verilog expert these days? Classic books like the Thomas/Moorby "The Verilog Hardware Description Language" seem dated, and Smith's "HDL Chip Design" is cluttered (IMHO) with VHDL, which we can't use here.

Anyway, advice appreciated!

Thanks -Mike
Article: 35293
Hi Rick, see: Programming Tools for Port I/O and Interrupts in http://www.lvr.com/parport.htm or download port95nt.exe from www.xess.com Mike "Rick Filipkiewicz" <rick@algor.co.uk> schrieb im Newsbeitrag news:3BB38638.700F93CB@algor.co.uk... > > > "Mike R." wrote: > > > Hi Matthias, > > > > The only big disadvantage of SRAM fpgas is: > > there is no single chip solution.You always need a configuration memory on > > startup. > > > > The ideal configuration memory is available from XILINX: > > The XC18V00 series of jtag in system programmable fpga proms - see > > http://www.xilinx.com/partinfo/ds026.pdf > > They work well, but are difficult to get and expensive. > > > > There is often a byte or word wide flash memory in the system which could be > > used for the fpga configuration. > > You can built your own cpld prom loader with a jtag interface, but you must > > use a separate jtag chain for the cpld and the prom loader. > > The jtag interface is well documented and simple, but the spi interface > > needs less logic to implement. > > > > The spi interface is a simple 4 wire interface for data transfer from a > > master to a slave device: > > Mike, > > I've often thought of doing this and, as you say, the h/w is relatively > straightforward. What's always put me off is figuring out how to do an NT driver > to run the parallel port in raw mode. Do you have any advice where I could at > least find the low level IO stubs. With that I might grit my teeth and figure > out how to use MS VC/C++. > >Article: 35294
Thank you guys.....

"yaohan" <engp1590@nus.edu.sg> wrote in message news:191C91BDFE8ED411B84400805FBE794C1884252F@pfs21.ex.nus.edu.sg...
> hi,
> I have a problem which I feel difficult to explain.
>
> library ieee;
> use ieee.std_logic_1164.all;
>
> entity test is
> port(
> A: in integer range 0 to 127;
> Y: out integer range 0 to 127
> );
> end test;
>
> architecture test_bev of test is
> signal B: integer range 0 to 127;
> begin
> process (A) --process A
> begin
> B <= A;
> Y <= B;
> end process;
> end test_bev;
>
> As for my understanding, the process A will only execute when change on
> signal A. So Let say, I only change the value A for once, this value should
> be assigned to B; Mean while Y is assigned with OLD B value. As long as no
> change on signal A, this process will be suspended, i.e Y == value of OLD B.
>
> However, when I simulate using MAX+PLUS II, the results show that Y is
> update to A input values after short delay. As if the sensitivity list does
> not have any effect .. ( Or may be the compiler has included all the signal
> into sensivity list ...)..
>
Article: 35295
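For the process quoted in the post above, the difference between "sensitive to A only" and "sensitive to every signal read" can be reproduced with a toy event-driven model. This is a simplified sketch, not a VHDL simulator: the function `simulate`, the dictionary of signals and the delta-cycle loop are assumptions made for illustration, and the initial run of the process at time zero is left out because all signals start at 0 and nothing would change.

```python
def simulate(sensitivity, a_values):
    """Model `B <= A; Y <= B;` inside one process with the given sensitivity set.

    Returns the settled (A, B, Y) values after each new value driven onto A.
    """
    sig = {"A": 0, "B": 0, "Y": 0}          # integer signals start at 0
    trace = []
    for a in a_values:
        changed = {"A"} if sig["A"] != a else set()
        sig["A"] = a
        # The process wakes only if a signal in its sensitivity list changed.
        while changed & sensitivity:
            # Signal assignments read the CURRENT values and take effect
            # only after the process suspends (one delta cycle later).
            pending = {"B": sig["A"], "Y": sig["B"]}
            changed = {name for name, value in pending.items() if sig[name] != value}
            sig.update(pending)
        trace.append((sig["A"], sig["B"], sig["Y"]))
    return trace

as_written = simulate({"A"}, [5, 9])          # sensitivity list taken literally
all_inputs = simulate({"A", "B"}, [5, 9])     # list extended to every signal read
```

With the list taken literally, Y lags one change of A behind (it holds the old B). With B added to the list, the process reruns one delta cycle later and Y settles to the new value of A, which matches the behaviour reported from MAX+PLUS II and is consistent with the thread's remark that synthesis tools infer the list from the code.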
I wrote about Spartan II-E: > Is it just a cheaper Virtex-E Austin Lesea <austin.lesea@xilinx.com> writes: > Less expensive, or cost reduced, or low cost specific application > targeted, or ASIC/ASSP replacement device I believe are the politically > correct descriptors. Sorry, I should have been more precise and said "less expensive". I was only using "cheap" in the sense of inexpensive, without any intended negative connotation.Article: 35296
Jon Parker <jparker@dnaent.com> writes about Cypress Warp: > In general, however, I am pleased with my investment and look forward to > actually using the PARTS in a design. What is holding me back is the > expensive cable for programming the devices... You can build the cable yourself. It is nearly trivial, and the schematics are readily available. I'm pretty sure they are in one of the apnotes on the Cypress web site.Article: 35297
Use it. M wrote: > Hi, I'd developing a design for a 2V6000 using Xilinx 4.1i and Synplicity. > > I have much experience in s/w at all levels, as well as digital and analog h/w, > and even fpga design years ago using schematics. > > What is the fastest way to become a Verilog expert these days? Classic books > like the Thomas/Moorby "The Verilog Hardware Description Language" seem dated, > and Smith's "HDL Chip Design" is cluttered (IMHO) with VHDL, which we can't use > here. > > Anyway, advice appreciated! > > Thanks -Mike -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 35298
If it were for an FPGA, I'd immediately say use distributed arithmetic. For an ASIC, the answer is less certain, especially if you don't have many clocks per sample to work with. Distributed arithmetic can be paralleled up to handle more than one bit at a time, and in the extreme it is all bits in parallel. It maps nicely to FPGAs because it efficiently uses the 4-input look-up tables the FPGA is constructed from. In an ASIC, these LUTs would have to be implemented as small ROMs (for fixed coefficients), or RAM if the coefficients had to be programmable. In the case of a fully parallel distributed arithmetic, the savings over a more conventional set of multiply-accumulators are greatly diminished, and if you include the gate complexity of the small ROMs, the total may actually exceed that of a more conventional multiplier-accumulator approach.

Now to answer your doubts:

1) This referred to folding the filter if the coefficients are symmetric (a common occurrence for FIR filters). It uses less hardware to wrap the tapped delay line back on itself, adding the samples that get multiplied by the same (symmetric) coefficients before the multiply. That reduces the number of multiplies that need to be done by a factor of 2.

2) Assuming your filter is symmetric, as your earlier questions suggest, you have 33 unique coefficients. If your LUTs are 4-input LUTs, then you have 8 with 4 taps and 1 with 1 tap for a bit-serial implementation. For a digit-serial (more than 1 bit per clock) implementation, you have 33 taps per bit. There is nothing stopping you from combining taps for different bits, provided you scale the addends for the partial sums with the appropriate weighting for the bit, as all the partial products are eventually summed anyway. So for a 2-bit digit-serial implementation, you have 2 bits times 33 unique coefficients. Using 4-input LUTs, you need a total of 66/4, rounded up, or 17 LUTs. The last one has zeros associated with the unused inputs.
In the case you mention below, where there is only 1 tap attached to the LUT, the LUT can be replaced with just an AND gate, which can generally be absorbed into the adder tree.

kuldeep wrote:
> Hi jacky ,
> Thanx for reply. This seems to be good architecture as i can
> tradeoff throughput with hardware . Fully serial approach will not
> work for me as my input data is coming at 16Mhz, 12-bit wide. That
> means i need clock of 192MHz (16x12) which i can't afford .correct me
> ..so i will go for some mix of serial -parllel approach.
> i have two doubts:
> quoting a line from ur reply :
> 1."you better add coefficient before feeding the partial products
> table"
> Here do u mean adding inputs (for which coeffcient happen to be same)
> before feeding the partial product table? Plese elaborate further how
> can i take advantage of symmetrical coeffcients.
> 2. i have odd number (65) coeffcients. Each LUT take 4 coeff. so where
> will the last coeff go?? should i use 1 LUT for this single coeff.
>
> thanx and regds
> Kuldeep

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 35299
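The folding and distributed-arithmetic points in the reply above can be checked with a small bit-level model. This is a sketch under stated assumptions, not the poster's design: the names `build_lut`, `fold` and `da_dot` are invented here, the filter length is assumed odd and symmetric, and the model processes one bit plane per loop pass (the bit-serial case), with the sign bit of the two's complement samples given negative weight.

```python
def build_lut(coeffs):
    """Table of every partial sum of `coeffs`; address bit j selects coeffs[j]."""
    return [sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
            for addr in range(1 << len(coeffs))]

def fold(samples):
    """Pre-add the sample pairs that share a coefficient (odd-length symmetric filter)."""
    n = len(samples)
    half = n // 2
    return [samples[i] + samples[n - 1 - i] for i in range(half)] + [samples[half]]

def da_dot(values, coeffs, width, lut_inputs=4):
    """Dot product of `width`-bit two's complement values with fixed coefficients."""
    luts = [build_lut(coeffs[s:s + lut_inputs])
            for s in range(0, len(coeffs), lut_inputs)]
    acc = 0
    for bit in range(width):                     # one bit plane per clock
        plane = 0
        for k, lut in enumerate(luts):
            group = values[k * lut_inputs:(k + 1) * lut_inputs]
            # The same bit of each value in the group forms the LUT address.
            addr = sum(((v >> bit) & 1) << j for j, v in enumerate(group))
            plane += lut[addr]
        # The most significant bit carries negative weight in two's complement.
        weight = -(1 << bit) if bit == width - 1 else (1 << bit)
        acc += weight * plane
    return acc

# Hypothetical 5-tap symmetric filter and 12-bit samples, for illustration only.
taps = [3, -5, 7, -5, 3]
samples = [100, -200, 50, 2047, -2048]
folded_result = da_dot(fold(samples), taps[:3], width=13)
```

Folding halves the number of coefficient inputs at the cost of one extra bit of width, since the sum of two 12-bit samples needs 13 bits. A group with a single coefficient produces a two-entry table (0 or the coefficient), which is the case the reply says can be reduced to an AND gate.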
Are you sure you have all the signals you need connected? At first blush, it sounds like something you need is unconnected and is causing much of the logic to be optimized out by the mapper. You could try disabling the trim unconnected logic option to see if it works then; it might at least help you pinpoint why it is getting stripped out.

timnicolson wrote:
> Help
>
> Can anyone experienced with fpga design suggest what I maybe doing wrong.
>
> I have quite a large design for an 8-bit space vector motor controller, for
> implementation on a SpartanII device.
>
> I have design the system using Xilinx's foundation software.
>
> I have used a mixture of coregen modules and VHDL macros stuck together in
> the schematic editor.
>
> The design simulates as intended.
>
> Problems arise when implementing, I get some warnings because some of my
> coregen signals do not connect to anything (do I need them too?), and I get
> some other warnings saying 100% back annotation not possible for some of the
> blocks. It finishes and says its implemented OK.
>
> Now I go to verify the design. It doesn't work as intended, I've searched
> around, and found the problem, a CoreGen "pipelined divider" module is
> producing rubbish, not remotely what it should do.... but only when I verify
> the whole design! if I enter verification and then select "verify single
> component" from the file menu, and select the next hierarchy down, which
> contains the divider, the module verifies correctly.
>
> This problem is really beyond me at the moment, can someone give me any
> pointers to what the problem maybe.
> Thanks for your time/effort
>
> Tim Nicolson
> University of Sheffield
> UK

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759