rickman wrote:
<snip>
> So, are coarse grained architectures the way of FPGA... opps FPxA
> devices in the near future? Will the lowly LUT and FF be pushed into
> the dark corners of the die in coming years? I think it is not a
> matter of if, just a matter of when and I think the when is soon!

The main drive seems to be MHz, as hard IP is always faster than soft logic. Pretty much all FPGAs now have 'DSP blocks', and those blocks get ever more complex. Some have GHz links hardwired. The latest Altera device uses more of this 'hard IP', but the soft-logic speeds have not increased much.

There will always be LUT/FF areas, as they handle the state engines etc., but perhaps the next iteration will be wide-path bus routing.

-jg
Article: 132501
Hi Jared,

synthesis tools like ISE XST try to optimize the design as well as possible. If you leave the output of a flip-flop open, it is of no use to the design anymore, and the flip-flop is deleted. Then the gates driving the former flip-flop's input have open outputs and are deleted as well. That goes on and on until the input from your switch is reached.

This behavior may produce no warnings, only info messages, so read those as well. Anyway, if you have an idea of what elements your design should contain and the synthesis result is suspiciously small, you should check the reports for deleted flip-flops and combinatorial circuits.

A similar thing happens when inputs have a fixed connection to high or low. The synthesis tool calculates the optimised logic, and the result may be fixed outputs, so the design is reduced to some fixed connections to high and low at the outputs. Sometimes whole designs vanish this way. :-)

Have a nice synthesis
Eilert

jared.pierce@gmail.com wrote:
> Thanks for the help guys. I made several changes and the problem
> finally went away. It works! I still don't know why it wasn't
> working so I plan to back up what I have and insert one test problem
> at a time to see what happens(of the problems I fixed). I still don't
> get why the RTL schematic would bother to include a DFF where the Q
> wasn't connected to anything. It would have been just as good not to
> include it. Any hints on that one?
Article: 132502
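The pruning described in Eilert's reply can be pictured as a backward sweep from the primary outputs: anything that cannot reach an output is dropped. The sketch below is only an illustration of that idea on a made-up netlist; the function name, the cell names and the dictionary format are all hypothetical, and real synthesis tools such as XST do considerably more (constant propagation, equivalence merging and so on).

```python
def live_cells(drivers, outputs):
    """Return every cell or port that can influence a primary output.

    drivers maps a cell name to the names of the cells/ports feeding it.
    """
    live = set()
    stack = list(outputs)
    while stack:
        cell = stack.pop()
        if cell in live:
            continue
        live.add(cell)
        stack.extend(drivers.get(cell, []))
    return live


# Hypothetical toy design: ff_b's Q output is connected to nothing.
netlist = {
    "led_out": ["ff_a"],
    "ff_a": ["and_1"],
    "and_1": ["btn_in", "ff_a"],
    "ff_b": ["or_1"],
    "or_1": ["sw_in", "ff_b"],
}

every_node = set(netlist) | {d for srcs in netlist.values() for d in srcs}
kept = live_cells(netlist, ["led_out"])
removed = sorted(every_node - kept)
print("removed:", removed)
```

In this toy case the unconnected flip-flop, the gate in front of it and the switch input all disappear, which mirrors the "goes on and on until the input from your switch is reached" behaviour.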
"rickman" <gnuarm@gmail.com> wrote in message news:5d1e4ddd-969f-47a4-9959-b017f9fd8c0f@m3g2000hsc.googlegroups.com... > On May 28, 12:10 pm, KJ <kkjenni...@sbcglobal.net> wrote: >> On May 28, 11:24 am, Brian Philofsky >> >> <brian.philofsky@no_xilinx_span.com> wrote: >> > KJ wrote: >> > > On May 28, 6:50 am, "MikeWhy" <boat042-nos...@yahoo.com> wrote: >> >> > This very well could be a timing issue but another possible cause of >> > this could be the writing of simulatable code but not synthesizable >> > code. Looking at the code, I see the signals a and b in the >> > sensitivity >> > list of the process which could cause the else statement to get >> > evaluated asynchronously in simulation however for synthesis, the >> > sensitivity list is likely ignored (generally with a warning) and thus >> > processed differently. >> >> Having 'a' and 'b' in the sensitivity list is not the problem. The >> structure of the code is >> process(...) >> begin >> if (rst = '1') then >> ...assignements here >> elsif (clk'event and clk = '1') then >> ...assignements here >> end if; >> end process; >> >> There are no assignments to anything outside of the if statement. >> That if statement is of the form for flops with async presets/resets. >> Having 'extra' signals in the sensitivity list will not result in any >> difference between simulation and synthesis. If it does, then contact >> the synthesis tool provider and submit a bug report. > > Not only is the a and b not a problem, the a in the sensitivity list > is *required* for proper simulation unless the simulator has the > smarts to fix problems with the sensitivity list. XST complained that they were missing from the sensitivity list and warned that it was proceeding as though they were there. So I added them. > Notice that there are assignments in the if (rst ='1') clause that > have a on the right hand side. 
> Since this is not in the clocked
> portion of the process, this is a concurrent assignment and the
> process must run when either rst or a change.

Aha. Thanks. That was lucid enough to understand.

>> > >> Also, is there a way to tell XST to not treat reset as a clock? I
>> > >> haven't fully read up on configuration, having spent way too much
>> > >> time on this little time waster.
>>
>> > I imagine you are referring to XST using a global buffer for the reset
>> > signal. In general this should not cause any issues and many times can
>> > be the right thing to do but if you want to go to prevent that
>> > behavior, tell XST you want an IBUF on the reset signal by adding the
>> > following attribute:

I missed the import of this the first time through. I'll make note of it for the future. It added the clock buffer even when I used a GPIO line, not just the sys_rst_pin which has special attributes in the config file.

> If the reset signal is being run on a global clock line, it is because
> your code does not allow the GSR to be used (a global net dedicated to
> the reset function). You are resetting to a signal value instead of a
> fixed value, so the set/reset signals have to be brought out to the
> routing matrix to accommodate that. It is actually preferred to use
> the clock nets for such a reset since your reset likely has a high
> fanout and it will not be able to meet a fast timing spec any other
> way. This also saves a lot of routing resources if you have spare
> clock lines.
>
>> > >> Last, .... is this really worth pursuing? I've been programming for
>> > >> 25 years, and know that the greatest leassons come after the
>> > >> greatest pain. But there's also good pain, and just senseless
>> > >> injury. Is this not a suitable first Zen parable to contemplate?
>> > >> I'm goaded forward by the belief that there's a good lesson on
>> > >> synchronous systems lurking as the punchline.
>> >> > > The punchline might be static timing analysis. Signals don't just >> > > 'happen' when you want them to, you need to guarantee by design that >> > > they arrive at the proper time relative to the clock. > > Can you say what your clock rate is? For the most part, speeds of > below 25 MHz are pretty easy to meet. Speeds of 100 MHz are a lot > tougher. Speeds in the middle depend on the logic and the density. > As you get above about 70% or 80% full, it gets harder to meet faster > timing. 125 MHz. One of the timing reports said the circuit should be able to get 300+ MHz. Alas. It wasn't circuit I intended. I understand race conditions as they apply to multithreaded software. I imagine the root problem I have isn't too very different here. Slowing down one side to avoid contention isn't the same as preventing it. Reading the simulation traces was very educational. Debounce is a ff. Upstream logic gates the clock enable to cause it to flop state. I can see how the brief strobe can be missed if the actual gate timing differs from the sim by just a tiny amount. About all I can say at this point is I learned 10 times more this way than if it had all just worked when I typed it in. > The one problem that most newbies have, especially if they come from a > software orientation, is thinking of HDL as software. HDL stands for > Hardware Description Language and that is how it works. It describes > hardware. If you try to write it like software, you most likely won't > care for the hardware that results, if it produces hardware at all. > Every construct I write I picture in terms of the hardware it will > generate as well as the behavior it has. I took time away from the keyboard tonight to re-read code from Pong Chu's newbie VHDL book, to find where I had gone astray. There's light at the end of this tunnel. I absolutely have it backwards and inside out on how to protect the synchronous states. It is, as you say, a software habit that doesn't apply.Article: 132503
There was a similar discussion about device support on this list, for EDK 9.2:
http://groups.google.de/group/comp.arch.fpga/browse_frm/thread/d91828e0a0528747/fdb2b5b939566021?hl=de&lnk=st&q=edk+9.2+virtex+2+group%3Acomp.arch.fpga#fdb2b5b939566021

I remember that Virtex-II devices were supposed to be supported properly again in version 10.1, which is obviously not the case.

If you want to hack your EDK, change the *.mpd file: add VirtexII to the supported families and see what happens then. (And tell us.)

-Markus

rmeiche wrote:
> Hi,
>
> I'm trying to build a system for a XC2V6000 FPGA. The problem I have
> is that I have to implement a PLB. But the PLB shipped with the EDK
> 10.1 is the PLB_v46 which doesn't support the virtex II, only V2Pro
> and V4.
>
> At the datasheets on the xilinx website I found that the PLB_v34
> should support the V2 (see this link: http://www.xilinx.com/products/ipcenter/plb_v34.htm
> ).
>
> But there exist two versions of the datasheet. One on the xilinx
> website and one at the pcore directory. I got the pcore from the EDK
> 8.2 which includes the PLB_v34 in version 1.01a. The datasheet says
> that this version only supports V2Pro and V4.
>
> I just took that core and copied it to the pcore directory of my EDK
> 10.1 project and added it to the system. I ignored the warnings and
> started the build process but this was aborted with the reason that
> the Virtex2 isn't supported.
>
> Does that mean that the datasheet on the xilinx website is wrong? Did
> I something wrong?
>
> Has anyone tried to implement a PLB on a Virtex II system?
>
> Thanks.
Article: 132504
"rickman" <gnuarm@gmail.com> wrote in message news:1e1cedff-2d7c-4d5a-b477-ee48b163eaba@p25g2000hsf.googlegroups.com... > Now that I have looked at it, this seems so frigging simple that I am > going to try to write it on the fly! :) I've said that more than once, on just this alone. I didn't give much thought to actually taking best advantage of the encoder resolution. The encoder is a hand knob with detents. Reading just one transition per detent serves this exercise better, to demonstrate proper debouncing. However, I can see room for a third try, but likely not for a few weeks or months. At which point, it should be entirely trivial. ;) > > right_way : process (clk, rst) begin > if (rst = '1') then > old_a <= '0'; ... Thanks to all who wrote with their thoughts.Article: 132505
I have just included my article in Spain. Thanks for your reply.

P.S.: I work at the University of Extremadura in Spain.
Article: 132506
I have a 10MHz clock but needed a 20MHz clock speed. I used two asynchronous-clear flip-flops with a series of buffers to add delay to the signal. Is this a bad practice? Will it fail with time or temperature? It works fine on a PCB, but I am concerned! It does exactly what I want: increment the counter on both rising and falling edges.

http://www.stockly.com/images4/080529-Clock_Doubler.jpg

Above is a link to a picture of the Xilinx schematic.

Thanks
Grant
Article: 132507
Hi everyone,

I am trying to display video on RGB LEDs. The LEDs would be connected to the FPGA. What I am concerned about is which video format can be displayed most easily on the LEDs. Waiting for your replies.

Regards
Ankit Anand
Article: 132508
If this is NOT recommended, then would 2 two bit counters (one with an inverted clock) and a 4 bit adder be the best solution? I'd like to keep my clock at 10MHz.Article: 132509
"Grant Stockly" <grant@stockly.com> wrote in message news:f7276c42-5e33-4b6e-96e1-d16afeae08cb@p25g2000pri.googlegroups.com... > If this is NOT recommended, then would 2 two bit counters (one with an > inverted clock) and a 4 bit adder be the best solution? That's almost certainly not the 'best' solution. A better solution is to use a DCM to double the frequency. At 10MHz input frequency, you'll need to use its CLKFX output. > I'd like to keep my clock at 10MHz. No you wouldn't. You'd like to keep your logic clock _enabled_ at 10MHz, but clocked by your newly DCMed 20MHz clock. HTH., Syms. p.s. Designing with schematics? How quaint! ;-)Article: 132510
Hi,

1. Can I set the clock frequency of the FIR filter to any frequency I want, as long as it is well above the sample rate? For example, with Fc = 3.5 kHz and Fs = 8 kHz, can I clock it at any value above 8 kHz, say 1 MHz (considering the max clock for the target device)?

2. Can a direct-form non-symmetric filter structure support symmetric coefficients? Is the response computed in the non-symmetric structure the same as in the symmetric filter structure? I know that, in terms of resource utilization, the symmetric structure needs more adders in exchange for fewer multipliers.

3. In addition to the impulse test (the basic test to check FIR filter operation before implementing it in the FPGA), the step test and the sine wave test, what other tests must be performed in the time domain to check that the filter works properly before giving it arbitrary input?

regards,
faz
Article: 132511
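On the first question, one common arrangement (an assumption here, not something stated in the post) is to run the filter registers on the fast clock and let them update only when a clock-enable strobe is high, once per sample period. The Python sketch below only models the bookkeeping for the 1 MHz and 8 kHz figures from the question; `enable_pattern` is a hypothetical helper, not part of any tool.

```python
def enable_pattern(f_clk_hz, f_sample_hz, n_cycles):
    """Model a clock-enable strobe: True on one fast-clock cycle per sample."""
    if f_clk_hz % f_sample_hz != 0:
        raise ValueError("this simple model assumes an integer clock/sample ratio")
    ratio = f_clk_hz // f_sample_hz
    return [cycle % ratio == 0 for cycle in range(n_cycles)]


F_CLK = 1_000_000  # 1 MHz logic clock, from the question
F_S = 8_000        # 8 kHz sample rate, from the question

cycles_per_sample = F_CLK // F_S
strobe = enable_pattern(F_CLK, F_S, n_cycles=1000)
print("fast-clock cycles per sample:", cycles_per_sample)
print("enable pulses in 1000 cycles:", sum(strobe))
```

With these numbers the registers would be enabled on one cycle in every 125, so the filter still processes data at the 8 kHz sample rate while being clocked at 1 MHz.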
On Wed, 28 May 2008 11:24:03 -0700 (PDT), "vijayant.rutgers@gmail.com" <vijayant.rutgers@gmail.com> wrote: >I have a design on FPGA that is ready. However, we need to have some >mapping from fpga design to asic. I know that this will not be >accurate. But accuracy is not our concern right now. We just need >upper bound. Also, we are also looking for some IP Core for ASIC so >that we can rough estimate. > >Regards, >Vijayant > One approach is to run it through the Xilinx tools and review the map report (.mrp file). If you take this approach, I suggest eliminating memory blocks (PPC if used) and DSP/multiplier blocks and re-running, to understand how much of the gate count comes from these blocks. - BrianArticle: 132512
On Wed, 28 May 2008 19:45:53 -0400, krw <krw@att.bizzzzzzzzzz> wrote:

>In article <OqSdnacY_bKfqqHVnZ2dnUVZ_tvinZ2d@lmi.net>,
>rgaddi@technologyhighland.com says...
>> krw wrote:
>>
>> Any reason to not just infer the comparator? VHDL generics make this
>> sort of thing a breeze.
>
>Two things... First, I want to walk before running. I also need to
>manually instantiate BRAMs and it would be nice to ditch the GUI
>altogether. It would make managing mu libraries through various
>core releases much simpler.
>
>Perhaps it's no longer true, but I found that the LogicCore devices
>were better optimized than the ones that were inferred from HDL.

IMO it is good practice to infer first, while bearing this statement in mind. Then you have a portable design which may be largely good enough; you only need to pay attention to instantiation where the inferred design fails size or timing (which does still happen sometimes).

Sounds as if the comparators are good enough now.

- Brian
Article: 132513
On Wed, 28 May 2008 04:44:40 -0700 (PDT), "fatfpga@googlemail.com" <fatfpga@googlemail.com> wrote:

>hi,
>
>does anyone know how to solve this error when selecting 'generate
>simulation hdl files' in xps (xilinx edk 9.1):
>Running Data2Mem with the following command:
>data2mem -bm system_sim.bmm -bd /pl/hardware/user-platforms/MySystemV5/fs-boot/executable.elf tag microblaze_0 -u -o u tmpucf.ucf
>ERROR:MDT - Ucf2Vhdl Conversion Generated Errors.

What does this command do on its own? (from a shell)
Find out why that isn't working, fix it and try again.

- Brian
Article: 132514
On Thu, 29 May 2008 01:28:49 -0700 (PDT), Ankit <ankitanand1986@gmail.com> wrote: >Hi everyone,i am trying to display a video on a rgb leds.The leds >would be connected to the fpga what i was concerned about is that >which video format can be displayed easily on the leds..Waiting for >your replies.. Unless you have a very large budget for LEDs, you might want to use Baird Televisor format. HTH, - BrianArticle: 132515
On 29 May, 07:59, Jim Granville <no.s...@designtools.maps.co.nz> wrote:
> rickman wrote:
>
> <snip>
>
> > So, are coarse grained architectures the way of FPGA... opps FPxA
> > devices in the near future? Will the lowly LUT and FF be pushed into
> > the dark corners of the die in coming years? I think it is not a
> > matter of if, just a matter of when and I think the when is soon!
>
> The main drive seems to be MHz, as hard IP is always faster than
> Soft Logic.

Special purpose hardware is always faster than general purpose hardware, except in the general case ;-)

Coarser granularity makes the implementation of what you are building a lot more efficient, but at the same time it is less likely to match the desires of the designer.

Take the DSP block as an example (let's forget the multiplier for now, as it has an additional advantage: the existence of very clever hardware structures for multipliers). The muxes and adders use far fewer configuration bits and low-level muxes, because 18 to 48 elements are always configured together to implement the same function, and the data lines always run in parallel and can't be permuted as they could be in the FPGA fabric. This is a huge gain for a 48+18 bit accumulator. But if you need 49 bits, you lose a factor of two immediately. (The fabric implementation grows by only 2%, the DSP48 implementation by 100%.)

André DeHon analyzed this in a chapter of his PhD thesis many years ago:
http://www.seas.upenn.edu/~andre/abstracts/dehon_phd.html
There are graphs showing the efficiency as a function of application word length and hardware granularity.

It should be noted that in FPGAs both delay and area are dominated by the routing resources. Therefore it is mainly the granularity of the routing that should be optimized.

No design has millions of gates of random logic. Large designs are dominated by arithmetic function blocks. Therefore it is likely that an FPGA with a granularity of 2, for example, will have a much better efficiency than current FPGAs. For random control logic half of the LUTs would remain unused, but for datapaths the utilization would approach 100%, and the device could save as much as 75% of the switches and configuration bits.

This is old knowledge for FPGA architecture folks, but there are two strong arguments against it:

1.) It is hard to quantify routing utilization, but the competitors' marketing will immediately target the lower LUT utilization as a disadvantage. (But hey, if a LUT costs 75% less, who cares if I can only use 80% of the LUTs? Especially if the clock frequency is better?)

2.) Granularity 1 FPGAs make use of the huge body of knowledge about ASIC EDA algorithms. For higher granularities you need to redevelop most of the software toolflow from scratch.

There is a small FPGA vendor that has high speed global routing with 10-bit granularity. Maybe this is a start. The area savings are marginal, as most of the switches are in local routing, but the speed improvement for long connections is significant.

Kolja Sulimma
Article: 132516
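The percentages in Kolja's post can be reproduced with a few lines of arithmetic. The sketch below is a rough illustration only: it assumes that fabric cost scales with one LUT/FF per accumulator bit and that a coarse-grained accumulator comes in whole 48-bit blocks. The helper names are hypothetical.

```python
import math

DSP_WIDTH = 48  # accumulator width of one DSP48 block, as cited in the post


def dsp_blocks(bits):
    """Whole DSP blocks needed for an accumulator of the given width."""
    return math.ceil(bits / DSP_WIDTH)


def growth(before, after):
    """Relative growth when going from 'before' to 'after' resources."""
    return (after - before) / before


# Assumption: fabric cost is proportional to the number of bits.
fabric_growth = growth(48, 49)
dsp_growth = growth(dsp_blocks(48), dsp_blocks(49))
print(f"fabric: +{fabric_growth:.1%}, DSP48: +{dsp_growth:.0%}")

# Marketing argument from the post: a LUT that costs 75% less,
# of which only 80% can be used.
relative_cost_per_used_lut = (1 - 0.75) / 0.80
print(f"cost per usable LUT vs. today: {relative_cost_per_used_lut:.2f}")
```

Under these assumptions the 49th bit costs about 2% more fabric but a whole second DSP block, and the cheaper, less utilized LUT still ends up at roughly a third of the cost per usable LUT.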
On Thu, 29 May 2008 04:25:50 -0700, fazulu deen wrote:

Let me try to respond with some questions of my own.

>1.Can i set the clock frequency of the FIR filter at any frequency i
>want... but pretty much higher than the sample rate?
> for example :Fc=3.5khz
> Fs=8khz
> can i clock as any value >8khz say 1Mhz(considering the max
>clock for target device)

Are you familiar with the concept of "clock enable"?

>2.Whether direct form non-symmetric filter structure can support
>symmetric coefficients??

I assume you understand the phrases you have used in the question. Can you please explain what you find hard or non-obvious about your question?

>whether the response computed in non-symmetric
>structure is same as symmetric filter structure??

Have you considered the effects of numeric overflow in your filter?

> I knew resource utlization wise symmetric need more adder at the
>cost of multpliers..

Are you sure it needs more adders than an equivalent canonical implementation? What leads you to believe that?

>3.In addition to impulse test(basic test to check FIR filter
>operation before implementing to FPGA),step test,sine wave test.What
>are other test that has to be compulsorily performed in time domain
>to check the proper working
>filter operation before giving any arbitary input to the filter??

Do you understand the concept of linearity? Can you think of anything that might make your filter non-linear? (Clue: my previous question about overflow). Do you trust your adder, multiplier and register building blocks?

~~~~~~~~~~~~~~~

If you are simply reproducing homework problems, please do us the courtesy of trying to solve them yourself before asking for help. If these are real problems of understanding, then please give us a clue about what you already know and what you find difficult.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
Article: 132517
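The questions above about equivalence, adder count and linearity can be explored with a small integer model before any HDL is written. The sketch below is an illustration under assumptions, not anyone's actual design: the function names are hypothetical, the arithmetic is unbounded Python integers (so there is no overflow), and the folded version uses the usual pre-add of mirrored taps. For N taps the direct form uses N multiplies and N-1 adds per output; the folded form uses ceil(N/2) multiplies and, counting pre-adders plus accumulation, still N-1 adds.

```python
def fir_direct(coeffs, x):
    """Direct-form FIR: y[i] = sum over k of coeffs[k] * x[i - k]."""
    n = len(coeffs)
    padded = [0] * (n - 1) + list(x)  # zeros model the empty delay line
    return [sum(coeffs[k] * padded[i + n - 1 - k] for k in range(n))
            for i in range(len(x))]


def fir_symmetric(half, n_taps, x):
    """Folded FIR for symmetric coefficients.

    half holds the first ceil(n_taps / 2) coefficients; mirrored samples are
    added first, so each coefficient is multiplied only once.
    """
    padded = [0] * (n_taps - 1) + list(x)
    out = []
    for i in range(len(x)):
        window = [padded[i + n_taps - 1 - k] for k in range(n_taps)]
        acc = 0
        for k in range(n_taps // 2):
            acc += half[k] * (window[k] + window[n_taps - 1 - k])
        if n_taps % 2:  # odd length: the centre tap has no mirror partner
            acc += half[n_taps // 2] * window[n_taps // 2]
        out.append(acc)
    return out


coeffs = [1, 3, 5, 3, 1]          # symmetric 5-tap example
half = coeffs[:3]
impulse = [1, 0, 0, 0, 0, 0, 0]
a = [4, -2, 7, 0, 1, 3, -5]
b = [1, 1, -3, 2, 0, 6, 2]

print(fir_direct(coeffs, impulse))                       # impulse response
print(fir_direct(coeffs, a) == fir_symmetric(half, 5, a))
summed = [p + q for p, q in zip(a, b)]
print(fir_direct(coeffs, summed) ==
      [p + q for p, q in zip(fir_direct(coeffs, a), fir_direct(coeffs, b))])
```

In this idealised model both structures give identical outputs and superposition holds; fixed-width hardware arithmetic is where overflow could break that, which is the point of the linearity question.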
Hi,

> Are you familiar with the concept of "clock enable"?

Yes, I am. I was asking what maximum clock frequency I can set during implementation for the example I gave.

> Can you please explain what you find hard or non-obvious about your
> question?

I mean the structural difference, and the difference in coefficient support, between the symmetric and non-symmetric structures.

> Have you considered the effects of numeric overflow in your filter?

I will consider that when checking the response during testing.

> Are you sure it needs more adders than an equivalent canonical
> implementation? What leads you to believe that?

Have you ever seen the symmetric and non-symmetric structures? Once you see them, you will believe it too.

> Do you understand the concept of linearity? Can you think of
> anything that might make your filter non-linear? (Clue:
> my previous question about overflow). Do you trust your
> adder, multiplier and register building blocks?

Yes, I do. The critical path might make it non-linear.

> If these are real problems of understanding, then please give us a
> clue about what you already know and what you find difficult.

My problems are stated as questions, together with a few comments about my current progress, in order to get answers from the group.

regards,
faz

On May 29, 5:10 pm, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com> wrote:
<snip>
Article: 132518
Kolja Sulimma wrote:
<snip>
> No design has millions of gates of random logic. Large designs are
> dominated by arithmetic function blocks. Therefore it is likely that
> an FPGA with a granularity of 2 for example will have a much better
> efficiency than current FPGAs. For random control logic half of the
> LUTs would remain unused, but for datapathes the utilization would
> approach 100% and the device coud save as much as 75% of the switches
> and configuration bits.
<snip>

Forgive my possible ignorance here (my fairly limited fpga experience is only with smaller Cyclones and PLDs, not big devices), but isn't "granularity 2" pretty much what the Stratix II, III (and IV, when it's available) have in their "adaptive logic module" ? And as far as I can see from the following recent white paper, this is exactly what Altera are saying - using the ALM they get much more into a Stratix with roughly the same number of logic elements / slices / LUTs / flip-flops than into a Virtex. Obviously all such marketing information must be taken with a large handful of salt.

<http://www.altera.com/products/devices/stratix-fpgas/stratix-iii/overview/architecture/performance/st3-opencores.html>
Article: 132519
On Thu, 29 May 2008 06:04:36 PDT, fazulu deen wrote:

>Are you familiar with the concept of "clock enable"?
>yes i do..I asked during implementation wat is max clock frequency can
>i set for the example pointed out by me...

FIR filters can (in almost all applications) be pipelined as deeply as necessary, so the upper limit on clock frequency is much the same as you would get for any other logic in the same technology. All that's needed is to enable all the FIR's registers for one clock cycle on each sample (i.e. at the appropriate sample rate). Not hard.

I asked...
>>Are you sure it [symmetric FIR structure]
>>needs more adders than an equivalent canonical
>>implementation? What leads you to believe that?
>
>Have you ever seen the symmetric and non symmetric structure structure
>before once u see u will also believe in it..

I know how to build an N-tap non-symmetric FIR using N multipliers and N-1 adders. And I know how to build a symmetric or antisymmetric N-tap FIR using ceil(N/2) multipliers and N-1 adders (some are subtractors, if it's antisymmetric). So I don't see why you think you need extra adders in the symmetric/antisymmetric case. Do you know some additional tricks that I don't?

>Do you understand the concept of linearity? Can you think of
>anything that might make your filter non-linear? (Clue:
>my previous question about overflow). Do you trust your adder,
>multiplier and register building blocks?
>yes i do..critical path might make it non-linear....

My point is this: if it's truly linear, and it gives the correct impulse response, then it's correct; no further testing is needed. However, nonlinearity could easily be introduced by...
- buggy multiplier or adder blocks
- arithmetic overflow
- improperly connected input bits that were not exercised
  by the impulse test

>if these are real problems of understanding, then please give us a
>clue about what you already know and
>what you find difficult.
>my problems are mentioned as questions and few comments about the
>currents progress to get the answers from the group..

I'm none the wiser.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
Article: 132520
On 29 May, 15:55, David Brown <da...@westcontrol.removethisbit.com> wrote:
<snip>
> Forgive my possible ignorance here (my fairly limited fpga experience is
> only with smaller Cyclones and PLDs, not big devices), but isn't
> "granularity 2" pretty much what the Stratix II, III (and IV, when it's
> available) have in their "adaptive logic module" ? And as far as I can
> see from the following recent white paper, this is exactly what Altera
> are saying - using the ALM they get much more into a Stratix with
> roughly the same number of logic elements / slices / LUTs / flip-flops
> than into a Virtex. Obviously all such marketing information must be
> taken with a large handful of salt.

No.
What I was saying is that with granularity two, you get slightly less logic into the same number of LUTs, at greatly reduced cost. Altera's (probably correct) claim is that because they are more flexible in how the inputs to a LUT pair can be routed, you can better utilize the LUTs. This added flexibility probably increases the area cost for the input routing significantly. Granularity 2 would mean that a pair of elements (most importantly routing switches) share a configuration. Each output of a LUT can only reach half of the inputs that it could reach in a granularity 1 FPGA (or must take a detour). Useful logic per LUT would go down (because some LUTs can't be used), useful logic per chip area would go up (because each LUT with its associated routing resources would get less expensive). Altera is doing the opposite: Paying extra area for added flexibility. It is achieving two goals by this: a) It sounds better for marketing, because chip area is kept secret anyway and sales prices are interpreted creatively. LUT count and utilization OTOH are easily measured. b) The device is easier to use because you can accurately estimate whether your design will fit into the device. This is valuable and might be worth the price. I do not know much about Altera, but I know that starting with Virtex-4 Xilinx decided to spend a lot of extra area on routing to make the delays more predictable. This helps the XST software people and the users. But another design would have a better cost/performance ratio. Kolja SulimmaArticle: 132521
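The two numeric claims in this exchange (a 49-bit accumulator costing 2% more in fabric but 100% more in DSP48 blocks, and a LUT that costs 75% less being worthwhile even at 80% utilization) can be checked with a few lines of arithmetic. This is only a sketch: it assumes fabric cost scales linearly with word width and that a DSP48 block is used whole or not at all, and the function names are made up for illustration.

```python
import math

DSP_WIDTH = 48  # accumulator width of one DSP48 block, as quoted in the post


def dsp_blocks(width_bits: int) -> int:
    """Number of fixed-width DSP blocks needed for an accumulator (assumed: no partial blocks)."""
    return math.ceil(width_bits / DSP_WIDTH)


def growth(before: float, after: float) -> float:
    """Relative growth in percent."""
    return (after - before) / before * 100.0


# Going from a 48-bit to a 49-bit accumulator.
fabric_growth = growth(48, 49)                       # assumed: fabric cost proportional to width
dsp_growth = growth(dsp_blocks(48), dsp_blocks(49))  # 1 block -> 2 blocks

# "A LUT costs 75% less, but only 80% of the LUTs are usable":
# cost per usable LUT relative to the granularity-1 baseline.
relative_cost_per_used_lut = (1 - 0.75) / 0.80

print(f"fabric: +{fabric_growth:.1f}%  DSP48: +{dsp_growth:.0f}%")
print(f"cost per usable LUT: {relative_cost_per_used_lut:.4f} of baseline")
```

Under these assumptions the usable LUT still costs less than a third of the baseline, which is the point of the "who cares" remark.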
On May 29, 1:23 am, Grant Stockly <gr...@stockly.com> wrote: > I have a 10MHz clock but needed a 20MHz clock speed. I used two > asynchronous clear flip flops with a series of buffers to add delay to > the signal. > > Is this a bad practice? Will it fail with time or temperature? It > works fine on a PCB, but I am concerned! It does exactly what I want, > increment the counter on both rising and falling edges. > > http://www.stockly.com/images4/080529-Clock_Doubler.jpg > > Above is a link to a picture of the Xilinx schematic. > > Thanks > > Grant Grant, years ago I published a reliable clock doubler circuit, part of the "six easy pieces" that seem to be lost. In words: Run your 10 MHz clock through a 2-input XOR. Generate a toggling flip-flop by feeding Q back through an inverting LUT to the D input. Route the signal driving D also to the second XOR input. Use the XOR output to clock the flip-flop, and also use it as your 20 MHz clock. Disadvantage: If your 10 MHz doesn't have 50/50 duty cycle, your 20 MHz will have frequency modulation. And the High (or Low depending on XOR or XNOR) time of your 20 MHz clock will be short but you can lengthen it by adding delay to the Q-to-D path. Anyhow, it's self-adaptive to the device speed. Use this trick only when no PLL or DLL is available. Peter AlfkeArticle: 132522
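The behaviour of the XOR doubler described in words above can be illustrated with a small time-stepped model. This is a behavioural sketch in Python, not HDL and not the published "six easy pieces" circuit itself: the integer time steps, the `loop_delay` parameter (standing in for clock-to-Q plus the inverting LUT), and the function name are all assumptions chosen for illustration.

```python
from collections import deque


def simulate_doubler(low_time=50, high_time=50, loop_delay=3, periods=4):
    """Time-stepped model of the XOR clock doubler.

    Returns (rising_edges, pulse_widths): the times at which the XOR output
    rises, and the width of each completed output pulse, in time steps.
    """
    period = low_time + high_time
    q = 1                                   # toggle flip-flop output
    d_line = deque([1 - q] * loop_delay)    # delay line: clock-to-Q plus inverter
    prev_x = 0
    rising, widths, rise_t = [], [], None

    for t in range(periods * period):
        clk_in = 1 if (t % period) >= low_time else 0   # input clock, starts low
        d = d_line.popleft()                # delayed NOT Q, seen at D and at the XOR
        x = clk_in ^ d                      # XOR output = doubled clock
        if x and not prev_x:                # rising edge clocks the flip-flop
            q = d                           # D is stable here, so Q simply toggles
            rising.append(t)
            rise_t = t
        elif prev_x and not x and rise_t is not None:
            widths.append(t - rise_t)       # pulse ends once NOT Q reaches the XOR
        d_line.append(1 - q)                # inverter feeds NOT Q back
        prev_x = x
    return rising, widths


edges, widths = simulate_doubler()
print(edges, widths)
```

In this model every input edge produces one output pulse whose width equals the loop delay, which matches the remark that the high time is short and can be lengthened by adding delay in the Q-to-D path. Calling `simulate_doubler(low_time=60, high_time=40)` shows the output edges alternating between 40 and 60 steps apart, the "frequency modulation" caused by a non-50/50 input duty cycle.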
On May 29, 8:51 am, Kolja Sulimma <ksuli...@googlemail.com> wrote: > On 29 May, 15:55, David Brown <da...@westcontrol.removethisbit.com> > wrote: > > > > > Kolja Sulimma wrote: > > > On 29 May, 07:59, Jim Granville <no.s...@designtools.maps.co.nz> > > > wrote: > > >> rickman wrote: > > > >> <snip> > > > >>> So, are coarse grained architectures the way of FPGA... oops FPxA > > >>> devices in the near future? Will the lowly LUT and FF be pushed into > > >>> the dark corners of the die in coming years? I think it is not a > > >>> matter of if, just a matter of when and I think the when is soon! > > >> The main drive seems to be MHz, as hard IP is always faster than > > >> Soft Logic. > > > > Special purpose hardware is always faster than general purpose > > > hardware, except in the general case ;-) > > > > Coarser granularity makes the implementation of what you are building > > > a lot more efficient, but at the same time it is less likely to match > > > the desires of the designer. > > > > Take the DSP block as an example (let's forget the multiplier for now, > > > as this uses an additional advantage: The existence of very clever > > > hardware structures for multipliers) > > > The muxes and adders use far fewer configuration bits and low-level > > > muxes, because 18 to 48 elements are always configured together to implement the same > > > function, and the data lines always run in parallel and can't be > > > permuted as they could be in the FPGA fabric. This is a huge gain for > > > a 48+18 bit accumulator. > > > But, if you need 49 bits you lose a factor of two immediately. (The > > > fabric implementation grows by only 2%, the DSP48 implementation by > > > 100%) > > > > André DeHon analyzed this in a chapter of his PhD thesis many years > > > ago: > > > http://www.seas.upenn.edu/~andre/abstracts/dehon_phd.html > > > There are graphs showing the efficiency as a function of application > > > word length and hardware granularity. 
> > > > It should be noted that in FPGAs both delay and area are dominated by > > > the routing resources. Therefore mainly the granularity of the > > > routing should be optimized. > > > > No design has millions of gates of random logic. Large designs are > > > dominated by arithmetic function blocks. Therefore it is likely that > > > an FPGA with a granularity of 2 for example will have a much better > > > efficiency than current FPGAs. For random control logic half of the > > > LUTs would remain unused, but for datapaths the utilization would > > > approach 100% and the device could save as much as 75% of the switches > > > and configuration bits. > > > > This is old knowledge for FPGA architecture folks, but there are two > > > strong arguments against it: > > > > 1.) > > > It is hard to quantify routing utilization, but the competitors' > > > marketing will immediately target the lower LUT utilization as a > > > disadvantage. (But hey, if a LUT costs 75% less, who cares if I can > > > only use 80% of the LUTs? Especially if the clock frequency is > > > better?) > > > > 2) > > > Granularity 1 FPGAs make use of the huge knowledge about ASIC EDA > > > algorithms. For higher granularities you need to redevelop most of the > > > software toolflow from scratch. > > > > There is a small FPGA vendor that has high speed global routing with > > > 10bit granularity. Maybe this is a start. The area savings are > > > marginal, as most of the switches are in local routing, but the speed > > > improvement for long connections is significant. > > > > Kolja Sulimma > > > Forgive my possible ignorance here (my fairly limited fpga experience is > > only with smaller Cyclones and PLDs, not big devices), but isn't > > "granularity 2" pretty much what the Stratix II, III (and IV, when it's > > available) have in their "adaptive logic module" ? 
And as far as I can > > see from the following recent white paper, this is exactly what Altera > > are saying - using the ALM they get much more into a Stratix with > > roughly the same number of logic elements / slices / LUTs / flip-flops > > than into a Virtex. Obviously all such marketing information must be > > taken with a large handful of salt. > > No. What I was saying is that with granularity two, you get slightly > less logic into the same number of LUTs, at greatly reduced cost. > > Altera's (probably correct) claim is that because they are more > flexible in how the inputs to a LUT pair can be routed, you can better > utilize the LUTs. This added flexibility probably increases the area > cost for the input routing significantly. > Granularity 2 would mean that a pair of elements (most importantly > routing switches) share a configuration. Each output of a LUT can only > reach half of the inputs that it could reach in a granularity 1 FPGA > (or must take a detour). Useful logic per LUT would go down (because > some LUTs can't be used), useful logic per chip area would go up > (because each LUT with its associated routing resources would get > less expensive). > > Altera is doing the opposite: Paying extra area for added flexibility. > It is achieving two goals by this: > a) It sounds better for marketing, because chip area is kept secret > anyway and sales prices are interpreted creatively. LUT count and > utilization OTOH are easily measured. > b) The device is easier to use because you can accurately estimate > whether your design will fit into the device. This is valuable and > might be worth the price. > I do not know much about Altera, but I know that starting with > Virtex-4 Xilinx decided to spend a lot of extra area on routing to make > the delays more predictable. This helps the XST software people and > the users. But another design would have a better cost/performance > ratio. 
> > Kolja Sulimma Let me add my 2 cents worth here, as a personal opinion (not official Xilinx position): In the distant past, each process generation gave us smaller and thus cheaper die, and higher speed, while leakage current was a non-issue. From now on, the next process generation will still give us smaller size, and eventually lower cost, but hardly any raw speed improvement. And leakage current is the big concern... Speed improvement will predominantly come from architectural (granularity) changes. That's why Virtex-5 quadrupled the logic size of the LUTs (from 16 bits to 64 bits) to pack logic more tightly, and to reduce routing. That's also why we added many hard-coded functions, multipliers, ALUs, FIFOs, SerDes in each I/O, PCIexpress, Ethernet, and multi-gigabit transceivers in all Virtex-5 LXT/SXT/FXT devices. In the 'FXT subfamily we also include one or two hard-coded PPC microprocessors with attached crosspoint and DMA. So we are increasing efficiency and speed and reducing power not only in the general-purpose fabric, but more importantly through larger hard-coded blocks. But we always make sure that our FPGAs remain general-purpose devices. The art of engineering is forever a compromise between conflicting demands... Peter AlfkeArticle: 132523
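The "16 bits to 64 bits" figure follows from the number of LUT inputs: a k-input LUT stores one output bit per input combination, so 4 inputs need 2^4 = 16 bits and 6 inputs need 2^6 = 64. The sketch below shows this and a generic truth-table lookup; the function names and the packed-integer representation are illustrative assumptions, not a description of how the silicon or the Xilinx tools store LUT contents.

```python
def lut_config_bits(inputs: int) -> int:
    """A k-input LUT stores one output bit per input combination."""
    return 2 ** inputs


def lut_eval(truth_table: int, *inputs: int) -> int:
    """Evaluate a LUT whose contents are packed into an integer.

    Bit i of truth_table is the output for the input combination whose
    binary encoding is i (inputs[0] is the least significant bit).
    """
    index = sum(bit << pos for pos, bit in enumerate(inputs))
    return (truth_table >> index) & 1


# 6-input AND: only the all-ones combination (index 63) yields 1.
AND6 = 1 << 63

print(lut_config_bits(4), lut_config_bits(6))   # storage for 4-input vs 6-input LUT
print(lut_eval(AND6, 1, 1, 1, 1, 1, 1), lut_eval(AND6, 1, 1, 1, 1, 1, 0))
```

A single 6-input LUT covers functions such as the 6-input AND above in one level, where 4-input LUTs would need two levels and the routing between them.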
On May 28, 4:07 am, Bryan <bryan.fletc...@avnet.com> wrote: > On May 27, 12:42 am, "MikeWhy" <boat042-nos...@yahoo.com> wrote: > > > > > "bish" <bishes...@gmail.com> wrote in message > > >news:5df586c0-0126-48cc-9ff1-ee382f505221@u6g2000prc.googlegroups.com... > > > > We have just bought a new Spartan 3a 1800a dsp board from Xilinx. We > > > needed i/o pins to control motors and use various sensors and camera. > > > The board contains EXP expansion slots (somewhere I found it is > > > called QTE connector?). > > > > We are confused as to how to easily connect our sensors like optical > > > encoder, camera and output for our motor drivers using these EXP > > > slots? > > > Somewhere we found that we need to use QSE connector but we are not > > > clear about it. We need a low cost solution !!!!! Can we find the > > > connectors to match with EXP slot at one end and have simple wires at > > > the other end?? > > > The S3ADSP starter kit board comes with Samtec QTE connectors. The > > corresponding connectors are their QSE series. The user guide documents the > > specific type. Samtec was nice enough to send a couple of the required > > connectors as samples. These are SMT components; you'll have to build a > > board to bring out the required signals. They also sell QSE terminated > > cables, although I doubt these are cheap. They are shielded differential > > pairs. You might look at daughterboards sold for the S3ADSP3400 kits. I > > don't have experience with the 3400 board. Its accessories might or might > > not fit the QTE connectors on the 1800 board. > > The Avnet EXP Proto module brings the I/Os out to headers. www.em.avnet.com/exp-prototype Thanks for the link to the EXP Proto module. But it's a bit EXPENSIVE for us, having just bought the board. We don't need a complete protoboard. It'd be ok if we can just get the I/O of the FPGA from the EXP either simply in wires or in headers. AnyArticle: 132524
Peter Alfke wrote: > Grant, years ago I published a reliable clock doubler circuit, part of > the "six easy pieces" that seem to be lost. I repeat my request that the Xilinx marketing and/or web people put all the old stuff that they unceremoniously removed back into an archive section of the web or FTP site. The "six easy pieces" article is exactly the sort of thing that I was worried would be lost. :-( Just because application notes and white papers are old does NOT mean that they aren't of any use to Xilinx customers. Eric