Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Hal Murray wrote:
>
> >I proposed some time ago that injecting a high frequency signal into the FF
> >feedback path would limit metastability maximum duration.

With a virtual aperture of 0.07 fs, that may have to be an especially high frequency :)

Illuminating the die with light could test your theory. Another test would be to create a placement matrix of 3x3 FFs and take the output only from the very centre one; the others are there to disturb (vibrate) the lead inductances. If the virtual aperture does not change with or without the extra 8 FFs, then the theory does not look good.

> >You drop a pin onto a table now and then it will land perfectly balanced on
> >its point, how long it stays upright depends on how well it was balanced.
> >
> >Vibrate the table and it will stay upright for less time.
>
> I don't think so.
>
> This is another example of a "fix". They don't work. The only question
> is can you see why they don't work?
>
> In this case, the vibrations will kick some pins that have started to
> fall back to the balancing point.

If you take this physical analogy, then there will be a vibration that will speed the collapse of the most perfectly balanced pin. Taking an almost ideal balance, the movement from balance is VERY slow initially, and a skew movement will accelerate that portion of the curve. If it DOES move to balance the pin, then it also moves to unbalance it in the next half period. At some time in the period, the pin fall becomes regenerative, and accelerates away from any vibration effect.

Key question: Does a metastable FF follow this model?

- jg

Article: 59726
Hi,

In the Virtex2pro user guide,
http://direct.xilinx.com/bvdocs/userguides/ug012.pdf
the section on "bufg exclusivity" (p69) says:

"Each DCM has a restriction on the number of BUFGs it can drive on its (top or bottom) edge. Pairs of buffers with shared dedicated routing resources exist such that only one buffer from each dedicated pair can be driven by a single DCM. The exclusive pairs for each edge are: 1:5, 2:6, 3:7, and 4:8."

BUFGs are actually numbered 0 to 7, so I assume the above should have read "0:4, 1:5, 2:6 and 3:7."

Can someone from Xilinx please confirm that this is the case?

Thanks,
Allan.

Article: 59727
Hi,

I have implemented a PCI core in a Xilinx FPGA. The clock input to the core comes from the motherboard. Should I put a DCM on the clock input before feeding it to the internal design? Won't that affect compliance with the specification, because the DCM may introduce some shift in the clock? PCI is not source synchronous; it operates with a common clock. Correct me if I am wrong.

Thanks and Regards,
Muthu S

Article: 59728
Ray, thanks for the answer! Searching this forum I also found your past answers to this question.

> You are missing some key pieces of information: the available clock frequency
> (this is different than the sample rate), and the target FPGA family.

The device is a Virtex2 Pro. Its system clock is 108 MHz or 125 MHz, not decided yet. The filter coefficients are constant (not reconfigurable).

I have an additional question about the FIR filter and its decimation capability. In my DDC I would like to place this FIR filter after a CIC filter. The question is: is it better to use an FIR with decimation capability, or to use a higher decimation rate in the CIC filter and use the FIR only as a low-pass filter?

CIC decimation is 32
FIR decimation is 4

The second way would allow me to time-multiplex this FIR, because its sample rate would be 1/128 Fs instead of 1/32 Fs, and the number of taps would also be reduced. Is this correct? That rate is also low enough to use a MAC FIR.

Best regards,
Sasa

Article: 59729
On Tue, 26 Aug 2003 07:13:55 -0000, hmurray@suespammers.org (Hal Murray) wrote:

>>As long as the voting circuit output is consistently LOW or HIGH, it
>>doesn't really matter, at least in the context given. If the voting
>>circuit output is metastable, then that's problematic.
>
>Everything I know about metastability says that looking for things
>like voting circuits is barking up the wrong tree. If anybody has
>a good example of some "fix", "kludge" or "hack" that actually works,
>please tell me/us and explain why.
>
>We all agree that waiting longer helps. Right? (Helps lots!)
>
>Consider a voting circuit. It adds a few gates between the first
>bank of 3 FFs and the next FF that captures the output of the voting
>logic. Compare that with the clean/simple case of FF-FF with minimal
>routing delays. The delay through the gates is working against an
>exponential in increased settling time.
>
>Even if the normal no-logic path used a LUT so the voting logic is free,
>there is still the extra delay of routing from nearby FFs over to the
>LUT. (The other two FFs won't be quite so close.)

How about sampling the data with multiple clocks which are slightly apart and detecting a transition? If you have enough clocks and the rise/fall time of the input is limited, you can safely decide whether a transition has happened and whether any flop has the potential of going metastable, and then not even use the output of that flop.

Muzaffer Kal

http://www.dspia.com
ASIC/FPGA design/verification consulting specializing in DSP algorithm implementations

Article: 59730
In common with some other software I use, WebPack ISE is screwed up by Norton Anti-virus software; for instance, the User Constraints Chip Viewer hangs. I'm still using 4.2, but 5.2 might have the same problem. I found this rather puzzling, until I realised what was wrong.

Leon
--
Leon Heller, G1HSM
leon_heller@hotmail.com
http://www.geocities.com/leon_heller

Article: 59732
Hi all,

Just wanted to delve into the group's accumulated knowledge.

I've got an app where I must find the index of the maximum of 40 or so (unsorted, of course) 20-bit numbers. --the specifics aren't that important, just helps me think. It's pretty time critical, with throughput more important than latency, unless latency becomes absurd, and not area critical.

The "brute force" solution of cycling through the 40 numbers and doing a compare-store on each is pretty slow for FPGA implementation, and cascading comparators is pretty ugly.

I did some googling and literature searching and came up with plenty of CS-style results, but very few hardware results. This was one...
http://www.fpgacpu.org/usenet/max.html
which seemed to be the FPGA equivalent of a very clever IEE paper entitled "Efficient Parallel Pipelineable VLSI architecture for finding the maximum binary number" by F. Daneshgaran and K. Yao, which solved the problem using tristates.

The bitwise parallel-max approach is pretty attractive and reduces the clock cycles to the bit width, e.g. 20 (though I guess you could look at m bits at a time, but it becomes more complex), but I think it will get somewhat hung up on the "check for no candidates" logic.

I was wondering if there were any other "clever" approaches around to this sort of thing from an FPGA synthesis viewpoint?

I'll be the first to admit to it being a while since I've looked at an algorithms book, so feel free to club me over the head with any idiocies.

--Josh Model
Asst. Technical Staff
MIT Lincoln Laboratory

Article: 59733
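The bitwise parallel-max idea mentioned in the post above can be checked in software before committing to hardware. The following is only a behavioral sketch in Python, not the circuit from the linked page or the IEE paper; the function name `bit_serial_argmax` and the `width` parameter are made up for illustration. Each loop iteration stands for one clock, and the `any(...)` test plays the role of the "check for no candidates" logic.

```python
def bit_serial_argmax(values, width=20):
    """Scan bits MSB-first, pruning inputs that cannot be the maximum.

    Returns (max_value, list_of_indices_holding_that_value).
    Assumes `values` is non-empty and every value fits in `width` bits.
    """
    alive = [True] * len(values)            # one "still a candidate" flag per input
    for bit in range(width - 1, -1, -1):    # one iteration models one clock cycle
        has_one = [a and ((v >> bit) & 1) == 1 for v, a in zip(values, alive)]
        # "No candidates" check: prune only if some survivor has a 1 in this bit,
        # otherwise every survivor has a 0 here and nobody can be eliminated.
        if any(has_one):
            alive = has_one
    winners = [i for i, a in enumerate(alive) if a]
    return values[winners[0]], winners


if __name__ == "__main__":
    print(bit_serial_argmax([5, 17, 3, 17, 9], width=5))
```

In hardware the `any(...)` term is a wide OR across all 40 candidates per bit, which is probably the part that limits clock rate.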
Ray Andraka wrote:
>
> The trick is to make your interface look like the intended device, which can be
> tricky with an FPGA. For writes, you can use write strobe as a clock signal
> into the FPGA, and use that then to clock the address/data coming in from the
> CPU. I use an additional toggle flip-flop to indicate when data has been written by
> changing state. Sync the toggle signal to your local clock, then use a state
> machine to produce a 1 clock pulse in response to the toggle changing state, and
> that 1 clock pulse then enables registers that copy the input registers to
> locally clocked registers. Works fine like that if the input rate is lower than
> the local clock by a factor of 3 or more. If less, then you need to be a bit
> more clever about catching multiple writes then transferring more than one word
> on one local clock.

This is all true. But my point is that even once you do all this, you have not reduced the metastability problem to near zero, as you can in the case of async clocks. Since I am using the *same* clock as the MCU, I have a relatively fixed timing relationship between my signals and my clock.

The standard metastability model assumes two independent clocks with a random distribution of significant edge timing. So you can analyze the likelihood of the two edges being close enough to cause a failure and allow enough delay in the settling circuit to reduce the failure rate to acceptable levels (since we all know you can't *eliminate* failures). But with a fixed timing relationship, if the timing is just right (or wrong), the sync circuit does not reduce the failure rate enough. In fact you cannot analyze the problem without a characterization of the random elements in the timing.

I have allowed a lot more delay than needed and will be counting on the FFs being able to resolve the metastable state even in the worst case conditions. I don't see where I can do much else.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX

Article: 59734
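For readers unfamiliar with the toggle handshake Ray describes in the quoted text, here is a cycle-level Python sketch of the local-clock side: a two-flop synchronizer followed by a change detector that emits a one-clock enable pulse. It is an idealized model with assumed names (`toggle_to_pulses`, `sync1`, `sync2`, `prev`); it does not model metastability at all, which is exactly the part Rick's reply is worried about.

```python
def toggle_to_pulses(toggle_samples):
    """Model the local-clock domain of a toggle-based handshake.

    toggle_samples[k] is the value of the write-side toggle flip-flop as
    sampled at local clock edge k. Returns the enable pulse seen in each
    local clock cycle (1 for exactly one cycle per toggle change).
    """
    sync1 = sync2 = prev = 0
    pulses = []
    for t in toggle_samples:
        # Output is combinational on the registered state: synchronized
        # toggle differs from its one-cycle-old copy -> a write happened.
        pulses.append(sync2 ^ prev)
        # Clock edge: shift the sample through the register chain.
        prev, sync2, sync1 = sync2, sync1, t
    return pulses


if __name__ == "__main__":
    # Two writes: toggle goes 0 -> 1, then 1 -> 0 a few local clocks later.
    print(toggle_to_pulses([0, 0, 1, 1, 1, 0, 0, 0]))
```

The model also makes the rate limit visible: if the toggle changes twice between local clock edges, the two changes cancel and no pulse is produced, consistent with Ray's note that the input rate needs to be well below the local clock.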
Jon Elson wrote:
>
> Well, just because the manufacturer isn't putting the info in the data
> sheets, and doesn't test for it, I suspect that you could get a rough
> description that could be quite helpful.
> Knowing, for instance, that the strobes will always follow a CPU clock
> by at least 1 ns, and never change more than 5 ns after the clock, would
> make designing a synchronous memory/peripheral controller running from
> the same clock a lot simpler.

Except that you don't know that this won't change if they tweak the chip design. It could very easily be 1 to 5 ns delay now and then change to 5 to 10 ns when they switch to a new process for cost savings. The 5 to 10 ns delay will put it right at the falling edge where I would likely be clocking it in with the old numbers.

> If you can just get the roughest of descriptions of the clock vs.
> strobes circuitry, you can improve the system performance greatly,
> because you don't have to have ranked FFs on every strobe.
>
> Now, on the high speed stuff, with clock multipliers, etc. it can get
> quite complex, and a CPU swap to the next higher speed can throw
> everything off due to a change in clock multiplier. But, I gathered
> from the initial post that this wasn't a clock multiplied CPU.

Actually, yes, the MCU runs at 8x or 4x the input clock. But the output is at the MCU rate and I will be clocking the FPGA with the 8x MCU output clock.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX

Article: 59735
"Josh Model" <model@ll.mit.edu> wrote in message news:Mg13b.61$Y5.21@llslave.llan.ll.mit.edu... > I must find the index of the maximum of 40 or so (unsorted, of course) > 20-bit numbers. --the specifics aren't that important, just helps me think. > It's pretty time critical, with throughput more important than latency, > unless latency becomes absurd, and not area critical. Do you get all 40 numbers on every clock? Or do you get them one at a time? Or are they all sitting in a memory somewhere? If the numbers arrive one by one, then keeping track of the largest (and its index) is pretty simple. If you need the max of the 40 numbers that you saw most recently, it's a bit harder because you must keep the numbers in sorted order, but you also need to know about their arrival order so that you can throw away the correct value when it reaches the "stale" end of the list. I've done similar things (on a smaller scale) for median filters, and it gets kinda messy. > The "brute force" solution of cycling through the 40 numbers and doing a > compare-store on each is pretty slow for FPGA implementation, and cascading > comparators is pretty ugly. I don't think it's that ugly, really. A tree of 40 comparator/mux modules will do it, in a fairly systematic way. (It would be horrific to code in Verilog, but easy in VHDL.) > I did some googling and literature searching and came up with plenty of > CS-style results, but very few hardware results. Somewhere in the archives here we have a heapsorter design which does the right kind of thing... I'll seek it out, if you are interested. -- Jonathan Bromley, Consultant DOULOS - Developing Design Know-how VHDL * Verilog * SystemC * Perl * Tcl/Tk * Verification * Project Services Doulos Ltd. 
Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK
Tel: +44 (0)1425 471223 mail: jonathan.bromley@doulos.com
Fax: +44 (0)1425 471573 Web: http://www.doulos.com

The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.

Article: 59736
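Jonathan's two streaming cases (numbers arriving one at a time, and "the max of the 40 most recent") can be sketched in Python as follows. The first function is the simple running maximum he calls "pretty simple". The second deliberately swaps in a different technique from the sorted list he describes: a monotonic queue, which is a common software approach and is not claimed to be his design or a good hardware structure. Function names are illustrative.

```python
from collections import deque


def running_argmax(stream):
    """Yield the index of the largest value seen so far (first one wins ties)."""
    best_i, best_v = None, None
    for i, v in enumerate(stream):
        if best_v is None or v > best_v:
            best_i, best_v = i, v
        yield best_i


def sliding_argmax(stream, window=40):
    """Yield the index of the maximum among the most recent `window` values.

    Keeps a queue of (index, value) pairs whose values strictly decrease
    from front to back, so the front is always the current maximum.
    """
    q = deque()
    for i, v in enumerate(stream):
        while q and q[-1][1] <= v:      # older, smaller-or-equal values can never win again
            q.pop()
        q.append((i, v))
        if q[0][0] <= i - window:       # front has reached the "stale" end of the window
            q.popleft()
        yield q[0][0]


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    print(list(running_argmax(data)))
    print(list(sliding_argmax(data, window=3)))
```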
Hello,

I have a large number division to perform and need to keep 2 to 3 digits of accuracy beyond the decimal point. The result will always be some fractional number between 1.000 and 192.000.

For example, a typical set of divisions will be:

40189861/1947555 = 20.6360595721302 (need 20.636)
139399957/1948400 = 71.5458617327038 (need 71.546)
238466182/1947387 = 122.454438691436 (need 122.454)
337495643/1946746 = 173.363984310228 (need 173.364)

I have found in Simulink that this requires 31 bits for the numerator and 24 bits for the denominator. I also found that keeping the answer accurate to about 3 places requires a 46 bit number consisting of 8 bits for the integer portion and the remaining 38 bits for the 3 decimal place fractional portion. I tried using the pipelined divider core from Xilinx Core Generator, but that only gives me an 8 bit integer with at most a 32 bit fractional remainder. That does not cut it. What can I do? Any help is greatly appreciated.

Salman

"Even a smile is charity."

Article: 59737
Isn't 3 decimal places about 10 bits? 1/1024?

If you want your result to have n bits of fractional portion, just multiply the numerator by 2^n and your integer result will be shifted by n bits, giving you the fixed number of "fraction bits" that you want.

Salman Sheikh wrote:

>Hello,
>
>I have a large number division to perform and need to keep 2 to 3 digits of
>accuracy beyond the decimal point. The result will always be some fraction
>number between 1.000 and 192.000.
>
>For example a typical set of division will be:
>
>40189861/1947555 = 20.6360595721302 (need 20.636)
>
>139399957/1948400 = 71.5458617327038 (need 71.546)
>
>238466182/1947387 = 122.454438691436 (need 122.454)
>
>337495643/1946746 = 173.363984310228 (need 173.364)
>
>I have found in Simulink that this requires 31 bit for the numerator and 24
>bits for the denominator. I also found that to keep the answer accurate to
>about 3 places requires a 46 bit number consisting of 8 bits for the
>integer portion and the remaining 38 bits to keep the 3 decimal place
>fractional portion. I tried using the pipelined divider core from Xilinx
>Core Generator but that only gives me an 8 bit integer with at most 32 bit
>fractional remainder. That does not cut it. What can I do? Any help is
>greatly appreciated.
>
>Salman
>
>"Even a smile is charity."

Article: 59738
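The scaling trick in the reply above (multiply the numerator by 2^n, then do an integer divide) is easy to try on the numbers from the question. This Python sketch is only an arithmetic check, not a divider core configuration; `scaled_divide` and `FRAC_BITS` are names invented here, and the round-to-nearest step (adding half the denominator) is an addition of mine rather than something the reply specifies.

```python
def scaled_divide(numerator, denominator, frac_bits=10):
    """Integer quotient carrying `frac_bits` fractional bits, rounded to nearest."""
    scaled = (numerator << frac_bits) + denominator // 2   # pre-shift, then round
    return scaled // denominator


FRAC_BITS = 10   # resolution 1/1024, roughly three decimal digits

for num, den in [(40189861, 1947555), (238466182, 1947387), (337495643, 1946746)]:
    q = scaled_divide(num, den, FRAC_BITS)
    print(f"{num}/{den} ~= {q / 2**FRAC_BITS:.3f}  (raw {q}, {q.bit_length()} bits)")
```

With 10 fractional bits the rounded result is within about 0.0005 of the true quotient, so the third decimal digit can still be off by one; a few more fractional bits (say 12 to 14) would give margin, at the cost of a wider dividend.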
Responses *'d

> > I must find the index of the maximum of 40 or so (unsorted, of course)
> > 20-bit numbers. --the specifics aren't that important, just helps me
> > think.
> > It's pretty time critical, with throughput more important than latency,
> > unless latency becomes absurd, and not area critical.
>
> Do you get all 40 numbers on every clock? Or do you get them one at
> a time? Or are they all sitting in a memory somewhere?
>
> If the numbers arrive one by one, then keeping track of the largest
> (and its index) is pretty simple.

*Right. We get all 40 numbers (outputs of multiply-accumulates) once every N clock cycles, with N probably being determined by the update rate at which we can find the maximum. In the ideal situation, N = 1, and we update the maximum index every clock cycle.

> If you need the max of the 40 numbers that you saw most recently,
> it's a bit harder because you must keep the numbers in sorted order,
> but you also need to know about their arrival order so that you can
> throw away the correct value when it reaches the "stale" end of the
> list. I've done similar things (on a smaller scale) for median
> filters, and it gets kinda messy.
>
> > The "brute force" solution of cycling through the 40 numbers and doing a
> > compare-store on each is pretty slow for FPGA implementation, and
> > cascading comparators is pretty ugly.
>
> I don't think it's that ugly, really. A tree of 40 comparator/mux modules
> will do it, in a fairly systematic way. (It would be horrific to code in
> Verilog, but easy in VHDL.)

* I think you're right, might not be so bad. If pipelined it would reduce latency to Ceil(log2(40)) = 6, which is pretty good, and keep throughput at maximum. The way I see it, each module (using 20-bit numbers & 6-bit addresses) would be made up of a 2-input, 20-bit comparator which serves as the select for this module, a 20-bit 2:1 mux to pass the greater value, and a 6-bit 2:1 mux to pass the correct index. Cascade these guys in levels and you'd need 20 + 10 + 5 + 2 + 1 + 1 = 39 modules. Still seems sort of inelegant to me, but it gets the job done, which is the point.

> > I did some googling and literature searching and came up with plenty of
> > CS-style results, but very few hardware results.
>
> Somewhere in the archives here we have a heapsorter design which does
> the right kind of thing... I'll seek it out, if you are interested.

* I'd be interested if it's easy for you to find -- for my app, I think most of the overhead of building the heap would be wasted, since I only want the max.

*Thanks,
*--Josh

Article: 59739
I am using a Spartan II with Foundation 3.1. Sometimes, after I make a design change, implementation does not rebuild the affected module. This results in a bit file that is identical to the previous version. It's like Xilinx needs a "make -clean." Does anyone have an explanation or fix for this problem?

Thanks

Article: 59740
Jonathan Bromley wrote:

>"Josh Model" <model@ll.mit.edu> wrote in message
>news:Mg13b.61$Y5.21@llslave.llan.ll.mit.edu...
>
>>The "brute force" solution of cycling through the 40 numbers and doing a
>>compare-store on each is pretty slow for FPGA implementation, and cascading
>>comparators is pretty ugly.
>
>I don't think it's that ugly, really. A tree of 40 comparator/mux modules
>will do it, in a fairly systematic way. (It would be horrific to code in
>Verilog, but easy in VHDL.)

Aw, heck... it's just a bunch of for loops.

40 values, 20 compares, 20 results.
20 values, 10 compares, 10 results.
10 values, 5 compares, 5 results.
5 values, 2 compares, 3 results.
3 values, 1 compare, 2 results.
2 values, 1 compare, final result.

41 20-bit registers in the pipeline and a result in about 6 clocks.

One stage would look like:

    // Stage 3: 20 candidates in, 10 out; each winner's index travels with it.
    // Goes inside a clocked always block.
    for (i = 0; i < 10; i = i + 1)
    begin
        result_stage3[i] <= result_stage2[2*i] > result_stage2[2*i+1] ?
                            result_stage2[2*i] : result_stage2[2*i+1];
        index_stage3[i]  <= result_stage2[2*i] > result_stage2[2*i+1] ?
                            index_stage2[2*i] : index_stage2[2*i+1];
    end

Article: 59741
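The staged reduction listed above can be modelled in a few lines of Python to sanity-check the stage widths, the comparator count, and the index bookkeeping before writing the HDL. This is a behavioral sketch only; `argmax_tree` is a name made up here, and the tie rule (the second input of a pair wins when values are equal) simply mirrors the `>` test in the Verilog fragment.

```python
def argmax_tree(values):
    """Pairwise compare/select tree.

    Returns (max_value, index, stage_widths, comparator_count), where
    stage_widths[k] is the number of registers after pipeline stage k.
    Assumes `values` is non-empty.
    """
    level = [(v, i) for i, v in enumerate(values)]   # carry each index with its value
    stage_widths, comparators = [], 0
    while len(level) > 1:
        nxt = []
        for k in range(0, len(level) - 1, 2):
            a, b = level[k], level[k + 1]
            nxt.append(a if a[0] > b[0] else b)      # one comparator plus two muxes
            comparators += 1
        if len(level) % 2:
            nxt.append(level[-1])                    # odd one out is just registered
        level = nxt
        stage_widths.append(len(level))
    value, index = level[0]
    return value, index, stage_widths, comparators


if __name__ == "__main__":
    vals = [(i * 31) % 41 for i in range(40)]        # 40 distinct test values
    print(argmax_tree(vals))
```

For 40 inputs the model reports stage widths of 20, 10, 5, 3, 2 and 1, which is where the 41 pipeline registers come from.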
On Tue, 26 Aug 2003 02:07:57 -0700, Jean Nicolle wrote:

> my manager said it couldn't be done. So just to prove him wrong :-)
> http://www.fpga4fun.com/PWM_DAC.html
>
> Well, pretty simple stuff anyway.
> Have fun.
> Jean

Actually I don't think that you have a PWM generator, but a phase accumulator. (The output will _toggle_ at a rate determined by your clock and the input data.)

Using a magnitude comparator for PWM generation has the nice option of bit reversing a set of the reference count MSBs to select the number of transitions per cycle. A low number of transitions per cycle (no bits reversed) is best for highest accuracy: less dependence on output switching time asymmetries. The highest number of transitions (fully bit reversed reference count) gives you a very easy to filter signal...

Peter Wallace

Article: 59743
On Wed, 27 Aug 2003 08:12:56 -0700, Peter Wallace wrote:

> On Tue, 26 Aug 2003 02:07:57 -0700, Jean Nicolle wrote:
>
>> my manager said it couldn't be done. So just to prove him wrong :-)
>> http://www.fpga4fun.com/PWM_DAC.html
>>
>> Well, pretty simple stuff anyway.
>> Have fun.
>> Jean
>
> Actually I dont think that you have a PWM generator, but a phase
> accumulator (The output will _toggle_ at a rate determined by your clock
> and the input data)

Oops, it will work, I see it now! I missed that you were just adding the lower 8 bits to the result.

Peter Wallace

Article: 59744
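The two schemes discussed in this exchange can be compared with a small Python model. The accumulator version follows my reading of Peter's correction (add the input to the lower 8 bits of the previous result and output the carry); the comparator version uses either a plain or a fully bit-reversed reference count, which is a simplification of his "reverse a set of the MSBs" option. All function names are illustrative, and nothing here is taken from the linked page's source.

```python
def accumulator_dac(data, cycles=256, width=8):
    """Add `data` to the low `width` bits each clock; the carry bit is the output."""
    mask = (1 << width) - 1
    acc, out = 0, []
    for _ in range(cycles):
        acc = (acc & mask) + data        # (width + 1)-bit sum
        out.append(acc >> width)         # carry-out drives the pin
    return out


def bit_reverse(x, width=8):
    """Reverse the order of the low `width` bits of x."""
    r = 0
    for _ in range(width):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r


def comparator_pwm(data, width=8, reversed_bits=False):
    """Magnitude-comparator PWM over one full period of the reference counter."""
    out = []
    for count in range(1 << width):
        ref = bit_reverse(count, width) if reversed_bits else count
        out.append(1 if data > ref else 0)
    return out


def transitions(bits):
    """Number of output edges within the sequence."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


if __name__ == "__main__":
    d = 77
    for name, bits in [("accumulator", accumulator_dac(d)),
                       ("plain PWM", comparator_pwm(d)),
                       ("bit-reversed PWM", comparator_pwm(d, reversed_bits=True))]:
        print(name, "ones:", sum(bits), "edges:", transitions(bits))
```

All three produce the same duty cycle over 256 clocks; they differ only in how the ones are spread out, which is the accuracy versus ease-of-filtering trade-off Peter describes.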
Well, the "pin is falling over" in 2 ns anyhow, so how fast can you "shake the table"? We are not worried about microseconds of metastability anymore...

Peter Alfke

nospam wrote:
>
> Peter Alfke <peter@xilinx.com> wrote:
>
> >I am embarrassed. I fell into the "fix metastability trap".
> >To quote a German proverb:
> > "Alter schuetzt vor Torheit nicht" (Age is no protection against stupidity).
> >
> >So, before anybody starts experimenting:
> >Three or more flip-flops with majority voting are no cure for
> >metastability, they just protract the agony. And so do all sorts of
> >other schemes. All of them!
>
> I proposed some time ago that injecting a high frequency signal into the FF
> feedback path would limit metastability maximum duration.
>
> You drop a pin onto a table now and then it will land perfectly balanced on
> its point, how long it stays upright depends on how well it was balanced.
>
> Vibrate the table and it will stay upright for less time.

Article: 59745
Unfortunately, there is a difference. The BUFGMUX has a set-up time requirement for S, so it is not really asynchronous.

Both types of circuit have a common flaw: they do not work properly with a clock of zero frequency... For that you have to use the Enable option. This is all being fixed in the next generation.

Peter Alfke
================
Jay wrote:
>
> Hi Peter,
>
> I've used your Asynchronous clock switching circuits (Six Easy Pieces) in my
> design. It works really well.
>
> What I'm wondering is whether there's the same circuit in the BUFGMUX. Any
> difference?

Article: 59746
Rick: as I mentioned, the metastability-catching window is so small that you may not hit it consistently even if you try (0.07 femtoseconds width for a 1.5 ns metastable delay). If you can accommodate 3 ns of delay, the probability gets so small, you can forget about it. Every extra half nanosecond increases the MTBF a million times.

If you are still not convinced, you could develop a training (adaptive) search for a safe clock phase. Virtex-II DCMs do this exceptionally well with their fine phase control, but there are also other, less elegant methods.

Metastably yours
Peter Alfke

Article: 59747
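To put numbers on "every extra half nanosecond increases the MTBF a million times", the short Python calculation below assumes the usual exponential model, in which MTBF grows as exp(t / tau) with the allowed settling time t. The model and the names `tau_ns` and `mtbf_gain` are assumptions for illustration; only the factor of a million per 0.5 ns comes from the post.

```python
import math

MTBF_GAIN_PER_STEP = 1e6   # from the post: a million times ...
STEP_NS = 0.5              # ... for every extra half nanosecond of settling time

# Resolution time constant implied by those two numbers, assuming
# MTBF is proportional to exp(settling_time / tau).
tau_ns = STEP_NS / math.log(MTBF_GAIN_PER_STEP)


def mtbf_gain(extra_delay_ns):
    """Factor by which MTBF improves for a given amount of extra settling time."""
    return math.exp(extra_delay_ns / tau_ns)


if __name__ == "__main__":
    print(f"implied tau = {tau_ns * 1000:.1f} ps")
    print(f"gain going from 1.5 ns to 3 ns of delay = {mtbf_gain(1.5):.3g}")
```

Under that assumption the implied time constant is a few tens of picoseconds, and going from 1.5 ns to 3 ns of allowed delay multiplies the MTBF by three successive factors of a million.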
Hi.

I wonder if the Spartan-IIE in the PQ208 package can talk to 3.3 V and 1.8 V logic at the same time? The PQ208 option requires me to connect all VCCO pins to a single voltage.

Marc

Article: 59748
Hello!

I am a newbie trying to perform a simulation with ModelSim...

I did a (very small) project with the downloadable Xilinx WebPack, and when I wanted to simulate it with ModelSim, ModelSim closed after the following line:

#vsim -lib work -t 1ps -L xilinxcorelib testbench

The transcript file shows the following:

# Reading D:/Programme/Modeltech_5.5d/win32/../tcl/vsim/pref.tcl

My ModelSim version is 5.5d, but I also tried 5.5e. It didn't work! They always crash after the same line. I already tried reinstalling everything, with no change.

I don't know what to do anymore... please help me!

Thanks,
Simone

Article: 59749
Hi,

Is it possible to convert a JEDEC file to logic equations? I've got a .jed file for a Xilinx CPLD (XC9536XL) and I'm trying to recover a job done a long time ago.

Or is it possible to discover the pinout based on the .jed file? That would be quite useful too.

Thanks in advance,
yusuke

PS: Sorry for my poor English skills.