This bloke seems to know what he's doing: http://www.andraka.com/cores.htm
Some stuff here too: http://www.opencores.org/browse.cgi/by_category
Also, Google gave me over 44k hits for FPGA FFT.

"bart" <larsonbr@gmail.com> wrote in message news:1115143492.028339.55630@f14g2000cwb.googlegroups.com...
> I have been tasked with trying to implement an FFT algorithm in an
> FPGA/DSP architecture. The algorithm would be an N-point FFT with 1000
> frequency bins. Each frequency bin would require a multiply, by the
> constant e^jx, and then an accumulate every 1 microsecond. This works out
> to 1000 multiply-accumulates happening in parallel every microsecond.
> Does anyone have experience doing something similar in an FPGA/DSP, and
> can they point me in the right direction as far as choosing an FPGA/DSP
> development board? Any help would be appreciated.
Article: 83601
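As an illustration of the per-bin arithmetic bart describes (not from the original thread): a minimal VHDL sketch of a single complex multiply-accumulate bin. The entity name, the 16-bit widths, and the Q1.15 coefficient scaling are assumptions chosen for the example; a real design would size the accumulator for the number of samples in the 1 microsecond window and replicate or time-multiplex this across the 1000 bins.

-- Hypothetical single-bin complex multiply-accumulate (one of the 1000 bins).
-- Widths, names, and coefficient scaling are illustrative only.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bin_mac is
  generic (
    COEF_RE : integer := 23170;    -- Re{e^jx} scaled to Q1.15 (assumed format)
    COEF_IM : integer := -23170    -- Im{e^jx} scaled to Q1.15
  );
  port (
    clk       : in  std_logic;
    rst       : in  std_logic;
    sample_re : in  signed(15 downto 0);
    sample_im : in  signed(15 downto 0);
    acc_re    : out signed(47 downto 0);
    acc_im    : out signed(47 downto 0)
  );
end entity;

architecture rtl of bin_mac is
  signal sum_re, sum_im : signed(47 downto 0) := (others => '0');
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        sum_re <= (others => '0');
        sum_im <= (others => '0');
      else
        -- (a+jb)*(c+jd) = (ac-bd) + j(ad+bc), accumulated every sample
        sum_re <= sum_re + resize(sample_re * to_signed(COEF_RE, 16), 48)
                         - resize(sample_im * to_signed(COEF_IM, 16), 48);
        sum_im <= sum_im + resize(sample_re * to_signed(COEF_IM, 16), 48)
                         + resize(sample_im * to_signed(COEF_RE, 16), 48);
      end if;
    end if;
  end process;
  acc_re <= sum_re;
  acc_im <= sum_im;
end architecture;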
Hi Jim,

JD_Design wrote:
> Austin,
> One more thing; I noticed that the same is true for the DCM (no
> variance over temperature; I did not have the data for the DCM
> previously so I had not entered it).
> Since it is about 38 mW (3 for VCCINT, 5 for VCCAUX dynamic and 30 for
> VCCAUX standby) it could be a factor at 85 degrees C if it does indeed
> vary over temperature.

Following on from Austin's input, I'd just like to add that our V4 concentration (in terms of what the Web Power Tool models) has been on where we have seen the greatest variation, i.e., VCCINT quiescent.

Brendan

> Thanks for any help!
> JD
Article: 83602
Austin,

I guess I would expect VCCAUX power to scale at least somewhat with VCCINT power over temp, since at room temp they only differ by a factor of two in current draw (they have the same power at room temperature but different voltages).

Understood about the leakage current on the PPC; of course, if we didn't want to use it we could have looked at the LX :)

How about the DCM (I added that question in a later post)? It also doesn't vary over temperature in the calculator.

Thanks for the info,
JD
Article: 83603
Peter Alfke wrote:
> Here are some lines I have used in presentations:
>
> The maturing FPGA market
> Dominated by two players, Xilinx and Altera
> With 52% and 34% market share = 86% combined
> Remaining players scramble for niches
> All non-dedicated players have given up:
> Intel, T.I., Motorola, NSC, AMD, Cypress, Philips, ST...
> Late-comers have been absorbed or failed:
> Dynachip, PlusLogic, Triscend, Siliconspice (absorbed)
> Chameleon, Quicksilver, Morphics, Adaptive Silicon ...
>
> It's not just because of patents
> The big guys lack the focus, the small ones lack the resources.

A good 'presentation party line', but there is still room for innovation, and this work looks interesting: http://www.elixent.com/products/array.htm

As processes shrink, the resource split that made sense 10+ years ago can be sub-optimal today - but there is SW and tools inertia that wants to keep the 'same old' LUT structures...

A cell between the newer DSP blocks and the older LUT could make sense - the 4-bit ALU used by Elixent? 4-bit ALUs do go a long way back - IIRC the AMD 2901?

-jg
Article: 83604
Preben, it is best to look at the I/O from a hardware point of view (and not talk about "writing a process", etc.). You point out correctly that the min clock delay can cost you performance. Here is something I wrote a month ago, and that is also covered in XAPP702. It allows you to compensate for many variable delays.

Capturing the Input Data Valid Window.

Let's assume a continuously running clock and a 16-wide data input bus. Let's assume the clock is source-synchronous, i.e. its rising transition is aligned with the data transitions, and all these transitions have little skew. (Preben, in your case you know that the data is valid 3 ns before to 2 ns after the clock edge, so we might call it "aligned with the clock". If you do not want to make that assumption, you are in for a more complex training data pattern, but it could be done if absolutely necessary - not needed in your case.)

The user faces the problem of aligning the clock with respect to the data in such a way that set-up and hold-time specs are obeyed and (hopefully) data is captured close to the center of the data valid window. Given the fairly wide spread between worst-case set-up and hold-time as specified by the IC manufacturer, a carefully worst-cased design will achieve only modest performance, since the designer is forced to accommodate the specified extreme set-up and hold time values of the input capture flip-flops. Typical values are positive 300 ps set-up time, negative 100 ps hold time, which implies a 200 ps window. The actual capture window is only a small fraction of a picosecond, but, depending on temperature, supply voltage or device processing, it might be positioned anywhere inside the specified wide window.

Here is a self-calibrating design approach that achieves much better performance by largely eliminating the uncertainty of the flip-flop characteristics. This approach assumes reasonable tracking of the input flip-flops driven by the data and clock inputs, and assumes programmable delay elements at each input buffer. The incoming clock is buffered and used to clock all data input flip-flops. The incoming clock is also used as if it were data, run through its own delay element X, then driving the D input of a clocked flip-flop. Its output is then used to control a state machine that manipulates X to find the two edges of the valid window, where the flip-flop output changes. Note that changing X has no impact on the bus data capture operation; it only affects the control flip-flop. Once both edges are found, the state machine calculates the center value, and applies this in common to all data input delays.

This auto-calibration circuit can run continuously (or non-continuously), since it does not interfere with normal operation. It means that the user can completely ignore the flip-flop set-up and hold time specifications, the spread between set-up and hold-times, and their possible variation with temperature and Vcc. This circuit does not compensate for skew between data lines, or any skew between data and clock; it assumes good tracking between all input flip-flops, and relies on a reasonably fine granularity in the delay adjustments. Fundamentally, this auto-calibration reduces the data capture uncertainty from a first-order problem to a second-order issue, thus permitting substantially higher data rates and/or higher reliability of operation. Virtex-4 programmable input delays have 75 picosecond granularity. A low-skew data bus can thus be captured at bus data rates in excess of 1 Gbps, even when the data valid window is smaller than 200 ps.

Peter Alfke, 3-31-05
Article: 83605
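Not part of Peter's post: a minimal VHDL sketch of the edge-finding state machine described above, for illustration only. It assumes a programmable delay line with a simple tap-select input; the 6-bit tap width, the port names, and the idea of stepping one tap per clock are assumptions, and a real implementation would wait for the delay to settle and filter metastable samples.

-- Hypothetical auto-calibration sweep: step the control-path delay tap,
-- watch the flip-flop that samples the delayed clock, record the two taps
-- where its output changes, and apply the midpoint to the data-input delays.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity capture_cal is
  port (
    clk         : in  std_logic;              -- buffered incoming clock
    sampled_clk : in  std_logic;              -- FF output sampling the delayed clock
    cal_tap     : out unsigned(5 downto 0);   -- tap applied to the control-path delay
    data_tap    : out unsigned(5 downto 0);   -- tap applied to all data-input delays
    done        : out std_logic
  );
end entity;

architecture rtl of capture_cal is
  type state_t is (FIND_FIRST, FIND_SECOND, APPLY);
  signal state       : state_t := FIND_FIRST;
  signal tap         : unsigned(5 downto 0) := (others => '0');
  signal first_edge  : unsigned(5 downto 0) := (others => '0');
  signal prev_sample : std_logic := '0';
begin
  cal_tap <= tap;

  process(clk)
  begin
    if rising_edge(clk) then
      prev_sample <= sampled_clk;
      case state is
        when FIND_FIRST =>
          done <= '0';
          if sampled_clk /= prev_sample then   -- first edge of the valid window
            first_edge <= tap;
            state      <= FIND_SECOND;
          end if;
          tap <= tap + 1;
        when FIND_SECOND =>
          if sampled_clk /= prev_sample then   -- second edge of the valid window
            data_tap <= first_edge + (tap - first_edge) / 2;  -- centre of window
            state    <= APPLY;
          end if;
          tap <= tap + 1;
        when APPLY =>
          done <= '1';                         -- data-input delays now hold the centre tap
      end case;
    end if;
  end process;
end architecture;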
Comments below,
Austin

> I guess I would expect VCCAUX power to scale at least somewhat with
> VCCINT power over temp since at room temp they only differ by a factor
> of two for current draw (since they have the same power at room
> temperature but different voltage).

Vccaux uses all thick oxide transistors. Leakage varies imperceptibly with temperature compared to all the analog stuff which has a fixed bias always flowing.

> Understood about the leakage current on the PPC; of course, if we
> didn't want to use it we could have looked at the LX :)
>
> How about the DCM (I added that question in a later post)? It also
> doesn't vary over temperature in the calculator.

It varies A LOT with frequency. It runs (partially) from a regulator supplied by Vccaux, at a lower voltage than the core, so leakage is again something of little consequence. The control logic runs from Vccint, but is so small in the overall scheme of things on Vccint that, again, it is lost in the Vccint leakage with temperature.
Article: 83606
gja,

I replied back to you personally. Xilinx stands behind the app note.

I am aware of, and have stated, that the quick switch folks do not support our application .... we do. Fine with me; they told me that if anyone calls them about it, they will refer the call to us (and take your order).

Austin

gja wrote:
Article: 83607
Paul has a point when he says that changing Vccint should cause a change in Iccint. This is in our plans for the Web Power Tool (WPT). When we have sufficient silicon-based data we will add the appropriate modelling to the WPT.

Changing Vccint (in V4) does also affect the sub-threshold leakage current - though our measurements and analysis to date suggest it is not the simple square relationship that Paul suggests. Again - we are continuing with our analysis - which will be reflected in time in the WPT if (and only if) the effect is significant. (It's of no benefit to the WPT user for us to model current variations due to certain small effects if the normal variation from part to part will swamp such tiny improvements.)

Brendan

"Paul Leventis (at home)" wrote:
> Hi Brendan,
>
> > In terms of "PVT" - Process, Voltage & Temperature: In the WPT currently
> > you can, for V4 FX devices, vary Vccint and the ambient temperature.
>
> When I enter 1.25V vs. 1.20V in WPT 4.1, I'm given 687 mW vs. 660 mW of
> VccInt static power for a LX80, in addition to the 234 mW of VccAux power.
>
> Similarly, for some random amount of logic utilization, I get 2727 mW vs.
> 2618 mW of dynamic power.
>
> It seems to me that all the tool is doing is increasing the V in P = VI.
> However, increasing V should (a) increase dynamic current draw roughly
> linearly and (b) increase sub-threshold leakage by the square of 1.25/1.2.
> Neither of these effects appear to be modeled.
>
> Regards,
>
> Paul Leventis
> Altera Corp.
Article: 83608
Walter,

Based on your description, it sounds like bitreservoir is propagating 'U' via one of its outputs that is fed to the synchronizer. You should start there, making sure all the signals in bitreservoir have reset states and are driven. Also, note that if you had contention, the signals would be 'X', and never 'U'. I wasn't sure you understood that from your description.

John
Article: 83609
Antony <ascgroup_nospam@tiscalinet.it> wrote in message news:<_YTbe.13230$ms1.5857@tornado.fastwebnet.it>...
> Duane Clark ha scritto:
> > By the way, I should mention one more subtle gotcha. The addresses to
> > the DIMM need to be reversed, because this determines the DDR/DIMM
> > commands.
>
> Modified the core, but it didn't work... Unfortunately I discovered that
> I hadn't the Service Pack installed, so I had to modify the cores manually
> (two of them were older than the ones you used for the diff files...) and
> had to do some fine tuning... Tomorrow I'll ask to install the SP 2 on the
> lab's machine and check what I can do with it installed.

Hi Duane, how are you? OK, I had the SP2, patched the files and tied the external ports to 0 to use only the PLB connection. I even imported the core in EDK, but it seems not to work correctly. I'm wondering if it depends on how I clocked the DDR and the system, with the TWO classical DCMs tied to 100 MHz, both for the bus and for the DDR DCM... How did you connect the DCMs?

Thank you very much! Bye!!!
Article: 83610
Hi,
I seem to have a problem talking to a MAC chip that is connected as a memory-mapped device on the EBI bus of an EPXA1-672 chip (EBI2 for CS, no split reads and no prefetch). EBI1 is connected to a flash chip. I am using the GNU toolset to develop code and no OS (as yet).

Apparently, if I read a single register (in my code), a series of 16 read accesses are made by the chip and cached. Subsequent reads do not access the MAC, but rather return values from the cache - I do not see any CS transitions at the chip pins. The write operations function perfectly well if I do not perform a read - one CS for every write request. Once a read is performed, the writes also cease to be "executed" and change the register value in the cache only.

Has anyone seen a similar problem? The Altera folks have not responded to my trouble tickets - their support is not what it used to be.

Thanks, bta3
Article: 83611
Hi all:

I am trying to add a customized IP to MicroBlaze using an FSL channel. The customized IP includes a block RAM generated by the coregen tool and a wrapper VHDL file to use this block RAM. Under the pcores directory, the files are arranged as follows:

pcores/dual_ram_v1_00_a/
  data/dual_ram_v2_1_0.bdd
  data/dual_ram_v2_1_0.pao
  data/dual_ram_v2_1_0.mpd
  hdl/vhdl/input_ram_wrapper.vhd (wrapper vhd file)
  netlist/input_ram.edn (file created by coregen)

The BDD file contains the following:

FILES
input_ram.edn

When I try to generate the bitstream, I receive the following message:

ERROR: NgdBuild:604 - logical block 'dual_ram_0/dual_ram_0/inst_input_ram/ram_name' with type 'input_ram' could not be solved. A pin name misspelling can cause this, a missing edif or ngc file, or the misspelling of a type name. Symbol 'input_ram' is not supported in target 'virtex2p'.

How can I overcome the above error? I think that I made a mistake in the *.bbd file and the netlist directory. If I added the wrong file in the netlist directory, please let me know the correct file that I need to add there, and how I can create the relevant file. Please help me to overcome this problem.

Thanks
Article: 83612
Have you set the STYLE in the mpd file to "MIX" so EDK knows to use the bbd and associated netlists?

http://www.xilinx.com/ise/embedded/psf_rm.pdf

Paul

Mayil wrote:
> Hi all:
>
> I am trying to add customized IP to microblaze using FSL channel. The
> customized IP includes block ram generated by coregen tool and wrapper
> vhdl file to use this block-ram. Under pcores directory, the files are
> arranged as follows:
>
> pcores/dual_ram_v1_00_a/
>   data/dual_ram_v2_1_0.bdd
>   data/dual_ram_v2_1_0.pao
>   data/dual_ram_v2_1_0.mpd
>   hdl/vhdl/input_ram_wrapper.vhd (wrapper vhd file)
>   netlist/input_ram.edn (file created by coregen)
>
> BDD file contains the following:
>
> FILES
> input_ram.edn
>
> When I tried to generate bitstream, I receive the following message:
>
> ERROR: NgdBuild:604 - logical block
> 'dual_ram_0/dual_ram_0/inst_input_ram/ram_name' with type 'input_ram'
> could not be solved. A pin name misspelling can cause this, a missing
> edif or ngc file, or the misspelling of a type name. Symbol 'input_ram'
> is not supported in target 'virtex2p'.
>
> How can I overcome the above error? I think that I made mistake in
> *.bbd file and netlist directory. If I added wrong file in the netlist
> directory, please let me know the correct file that I needed to add
> there and let me know how I can create those relevant file. Please help
> me to overcome this problem.
>
> Thanks
Article: 83613
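As an aside (not from the original exchange): a hedged sketch of the MPD fragment Paul is referring to, following the PSF document he links. The STYLE option is the relevant line; the other entries shown are illustrative placeholders for this core, not the poster's actual file.

# dual_ram_v2_1_0.mpd (fragment, illustrative only)
BEGIN dual_ram
  OPTION IPTYPE = PERIPHERAL
  OPTION STYLE = MIX        # use the HDL sources (PAO) plus the netlists listed in the BBD
  # bus interfaces, parameters, and ports go here
END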
John,

Thanks for your comment. I had a previous design for which I followed XAPP623. It did work, and it was for a Virtex part, not V2P. Your recommendations are in sync with XAPP623 and ISE 7.1 (XPower), not the ML320 or the Virtex-II Pro user's guide, and they also contradict Symon's point of view. My main worry is why Xilinx is not following their own guidelines on their own designs. Or, I wish they would make a revision to their documents when there is a change. Maybe it doesn't matter either way! I do not have access to expensive board simulation tools, or time to do it. Usually, I like following guidelines to be safe.

By the way, your board looks great. But I prefer the 0402 cap for now, where I have more placement flexibility. And I could place 0402 caps (at least 2 per bank) right under the BGA, just to be safe.

Thomas

"John Adair" <loseitintheblackhole@blackholesextreme.co.uk> wrote in message news:d565q4$pu8$1@newsg2.svr.pol.co.uk...
> Generally if you can follow the pyramid of values and numbers it is great.
> You will also find recommendations on power plane structure and via
> structures if you dig. If you have not seen it yet, have a look at Xilinx
> application note XAPP623 on power distribution as a starting point.
>
> I would agree that reaching the numbers required is very difficult. You can
> get to the stage where the vias for the power capacitors effectively
> block the fanout of signals from the FPGA very significantly. You can
> help a bit by using capacitor arrays, but even then the board area of the
> packages and the vias is significant. We use a 0508 package containing 4
> capacitors on our Broaddown2 and MINI-CAN products to help in this
> respect. You might be able to see them if you look at our website pictures
> of these products and how the capacitor layout forces the routing on the
> top surface tracking.
>
> I would recommend at least roughly following the pyramid of values, with
> smaller values closest to power pins and larger values further away. We
> did this on the products mentioned above with groupings at corners and
> centre row positions of the FPGA. If your product is already double sided
> then you can fit capacitors underneath the FPGA to ease the routing
> blockage, but check that your board assembler is happy to do this. Some
> don't like parts straight underneath a BGA or will claim a cost/yield impact
> on your assembly.
>
> John Adair
> Enterpoint Ltd. - Home Of Low Cost FPGA Development Board MINI-CAN.
> http://www.enterpoint.co.uk
>
> "Thomas" <res0rsef@verizon.net> wrote in message
> news:TRTce.1736$db7.1382@trnddc01...
>> I am a little confused with the V2P decoupling guidelines. I am using a
>> V2P7-FF896 part. It has 32 VCCINT pins, and XST 7.1 suggests using a
>> power rail scheme:
>>   .001uf - 34%
>>   .01uf  - 31%
>>   .04uf  - 18%
>>   .47uf  - 9%
>>   4.7uf  - 3%
>>   470uf  - 3%
>>
>> The user guide and the Xilinx ML320 reference design use a .1uf for each
>> pin and a bulk cap near the regulator. Similar with VCCAUX and VCCO.
>> Which is the best way to follow?
>>
>> Thanks for any help!
Article: 83614
Hi Marc,
Yes, I asked something similar before, but now I have the same doubt again. Sorry, I could not understand how putting FFs in IOBs reduces delay - can you explain further? Thanks in advance.
Article: 83615
Hi Walter,
I think John is right. If you do not initialize your signals properly you do get 'U' in simulation. This is a problem that all of us face. Make sure you initialize all the signals to some known value; then, in the testbench, start with a reset. Do something like this in your module:

signal a, b : std_logic;
signal c    : std_logic_vector(7 downto 0);

process(clk, reset)
begin
  if reset = '1' then
    a <= '0';
    b <= '0';
    c <= "00000000";
  elsif clk'event and clk = '1' then
    a <= ... ;  -- your assignments
    b <= ... ;  -- "
    c <= ... ;  -- "
  end if;
end process;

So all your signals will be initialized and your problem will be solved.

puneet
Article: 83616
Thomas,

You bring up a valid point: why do we (Xilinx) sometimes not even follow our own best practices?

Well, for one, the design may be for a limited application where we do not feel the need to be as conservative as we would normally recommend. Secondly, we may have designed that pcb using advanced SI tools which enabled us to minimize the power distribution system to just meet the requirements of that application. And, sometimes we goof, and we do not do as good a job as we should have. After all, our pcb designers are human (big surprise?), and they can make a mistake if the design, requirements, and application are not clearly stated (sound familiar?). Maybe the review was skipped due to schedule issues ....

However, when we have a less than ideal pcb SI layout, it isn't long before reality sets in, and we have to go and deal with it. If you have any issues with any of our boards, please email me directly.

The new sparse chevron packages in V4, with their demonstrated 8X advantage over other 90nm FPGA solutions, have made some of these issues a whole lot easier to deal with!

Austin
Article: 83617
CODE_IS_BAD wrote:
> Hi Marc,
> Yes, I asked something similar before, but now I have the same doubt again.
> Sorry, I could not understand how putting FFs in IOBs reduces delay -
> can you explain further? Thanks in advance.

I really meant to say "pushing FF's into the IOB", not putting *more* FF's there (although that might work as well, at the expense of an additional pipeline delay).

Here is a simplistic example of what we're talking about:

FF -> routing -> I/O pin

The delay figure in your report includes not just the I/O pin delay, but the routing delay all the way back to the FF that launches that signal. This routing delay can be significant (as your report indicates), and it can vary considerably from run to run of the tools.

Now move the FF to the I/O pin:

routing -> FF -> I/O pin

Here, the routing delay is constrained by your clock period constraint. That means the only delay you have to worry about is the I/O pin delay, which is relatively small, and relatively fixed.

Peter just got finished trying to explain the input delay and FF's in another thread... perhaps reading that would help you understand why you want to do the same for your output: http://groups.google.com/groups?q=inputFF

Marc
Article: 83618
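As an aside (not part of Marc's reply): a minimal VHDL sketch of one way to ask the Xilinx tools to pack the output register into the IOB, using the IOB attribute on the register signal. The entity and signal names are made up for illustration; the same packing can also be requested with an IOB=TRUE constraint in the UCF.

library ieee;
use ieee.std_logic_1164.all;

entity iob_reg_out is
  port (
    clk   : in  std_logic;
    d     : in  std_logic;
    q_pin : out std_logic
  );
end entity;

architecture rtl of iob_reg_out is
  signal q_reg : std_logic := '0';
  -- Ask the tools to place this register in the IOB, so only the small,
  -- fixed pad delay remains after the flip-flop (no variable routing delay).
  attribute IOB : string;
  attribute IOB of q_reg : signal is "TRUE";
begin
  process(clk)
  begin
    if rising_edge(clk) then
      q_reg <= d;
    end if;
  end process;
  q_pin <= q_reg;
end architecture;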
I just want to ask how you will enter your 1000 frequency points, and how many bits you are using to represent your frequency points. I mean, if you have 8 bits per point then you need 8000 pins, which I think is too much for any FPGA available. I think you mean 1000 analog inputs, which also requires some form of ADC, which leads to the same problem. Maybe you will enter them sequentially, which will take time to get them into the FPGA.

Best regards
Article: 83619
> If latency is not an issue, I would register the control signal at the
> IOB to make sure you don't have setup timing issues. Then add another
> delay stage (register) for the DATA[7..0] bus so you can use the
> control signal a cycle later.
>
> If you need to reduce latency into your FIFO, you need to create a
> timing spec in the ucf file for the control signal like:
> OFFSET = IN 7.2 ns BEFORE "RXCLK";
> to make sure your state machine does not exceed the input setup
> time available.
>
> If you haven't assigned pins yet, I would suggest grouping the control
> pin near the data pins so your state machine can be placed easily
> near the control input.

How can I make this timing constraint, "OFFSET X BEFORE Clk", in Quartus II?

Thank you for your help.

Rgds
Andre
Article: 83620
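Not from the thread, but as a hedged sketch of the same intent in another notation: in an SDC-based flow (e.g. Quartus's later TimeQuest analyzer, or any tool that accepts SDC) the quoted OFFSET constraint maps onto set_input_delay. The 10 ns period and the port names DATA[*] and CTRL below are assumptions for illustration; "OFFSET = IN 7.2 ns BEFORE" a clock of period T corresponds roughly to an external input delay of T - 7.2 ns.

# Illustrative SDC, assuming a 10 ns RXCLK period and example port names
create_clock -name RXCLK -period 10.0 [get_ports RXCLK]
# Data/control launched by the source arrive at most 10 - 7.2 = 2.8 ns after the clock edge
set_input_delay -clock RXCLK -max 2.8 [get_ports {DATA[*]}]
set_input_delay -clock RXCLK -max 2.8 [get_ports {CTRL}]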
That's right. In simulation your input signals must have a defined state! For real use they can have one, but they don't have to.
Article: 83621
Do you have the OPB Intc as a core, or do you have it within the IPIF as the Device Interrupt Controller?
Article: 83622
Hello Joseph,

If you only have ModelSim XE (Xilinx Edition) there is no simple way to simulate the bus signals. You must create your own testbench and manually drive the PLB signals to stimulate your core. Or you can simulate at the IPIC interface, because it is much simpler to simulate. If you have ModelSim PE or SE you can build a simple environment which simulates the bus signals (BFM simulation).
Article: 83623
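As an aside (not part of the reply above): a minimal VHDL sketch of the hand-written stimulus approach, driving an IPIC-style write instead of the full PLB. The signal names follow the usual Bus2IP_*/IP2Bus_* pattern but should be checked against the IPIF version in use; the register width, clock period, and the commented-out user_logic instance are assumptions.

library ieee;
use ieee.std_logic_1164.all;

entity tb_ipic_write is
end entity;

architecture sim of tb_ipic_write is
  signal Bus2IP_Clk  : std_logic := '0';
  signal Bus2IP_Data : std_logic_vector(0 to 31) := (others => '0');
  signal Bus2IP_WrCE : std_logic_vector(0 to 0)  := (others => '0');
begin
  Bus2IP_Clk <= not Bus2IP_Clk after 5 ns;   -- 100 MHz bus clock (assumed)

  -- The core under test would be instantiated here, e.g.:
  -- uut : entity work.user_logic
  --   port map (Bus2IP_Clk => Bus2IP_Clk, Bus2IP_Data => Bus2IP_Data, ...);

  stim : process
  begin
    wait for 20 ns;
    wait until rising_edge(Bus2IP_Clk);
    Bus2IP_Data <= x"DEADBEEF";              -- value to write
    Bus2IP_WrCE <= "1";                      -- write-enable for register 0
    wait until rising_edge(Bus2IP_Clk);
    Bus2IP_WrCE <= "0";
    wait;                                    -- done
  end process;
end architecture;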
I posted the following message, but nobody responded (I don't know the reason; maybe it is too naive). I post it here again and hope someone can help me.

I am using a Virtex-E to communicate with an ADI chip. The interface includes write, read, data, and address. I want the FPGA to communicate with the chip on the FPGA main clock, which is up to 65 MHz. I used a synchronized signal gated with the clock to generate the write and read signals. The data and address signals are synchronous. The problems are:

1) The write/read signal often generates one more period than I need. Although I could work around it by adjusting the control signal's edge sensitivity, it may reappear when I resynthesize the design. The reason is that the delays of the two LUT4 inputs (one is the clock, one is the control signal) vary greatly. Can I limit the delay difference between the two inputs to an acceptable level? If so, how?

2) The timing of address, data, and write/read is inconsistent with the timing required by the ADI chip. I could delay some signals by adding buffers or inverters, but I am afraid it won't work well once I add this module to the top design. Is there a better way?

If I generated the write/read signal synchronously with the clock, the problem might not exist, but then I would have to drive the clock twice as fast, and I don't know if the Virtex-E can work well at 125 MHz. Thank you for your advice.
Article: 83624
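Not part of the original post: a minimal VHDL sketch of the fully synchronous alternative hinted at in the last paragraph - generate the strobe as a registered signal, one clock wide, instead of gating the clock, so LUT input-delay mismatch no longer matters. The entity, signal names, and the one-cycle setup/hold spacing are illustrative and would have to be checked against the ADI part's actual timing requirements; note this does not require doubling the clock, only accepting a full clock of setup and hold around the strobe.

library ieee;
use ieee.std_logic_1164.all;

entity adi_wr_if is
  port (
    clk     : in  std_logic;                      -- FPGA main clock (e.g. 65 MHz)
    start   : in  std_logic;                      -- request one write
    addr_in : in  std_logic_vector(7 downto 0);
    data_in : in  std_logic_vector(15 downto 0);
    wr_n    : out std_logic;                      -- registered, glitch-free strobe
    addr    : out std_logic_vector(7 downto 0);
    data    : out std_logic_vector(15 downto 0)
  );
end entity;

architecture rtl of adi_wr_if is
  type state_t is (IDLE, SETUP, STROBE, HOLD);
  signal state : state_t := IDLE;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      case state is
        when IDLE =>
          wr_n <= '1';
          if start = '1' then
            addr  <= addr_in;                     -- address/data registered one cycle early
            data  <= data_in;
            state <= SETUP;
          end if;
        when SETUP =>
          wr_n  <= '0';                           -- strobe goes low for exactly one clock
          state <= STROBE;
        when STROBE =>
          wr_n  <= '1';
          state <= HOLD;                          -- address/data held one cycle after the strobe
        when HOLD =>
          state <= IDLE;
      end case;
    end if;
  end process;
end architecture;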
You have that area of the memory map cached ... you are seeing a cache line burst for the first access and nothing after that because it is manipulating the cache memory ... turn off the cache for this memory region.

Mike

"bta3" <bta3@iname.com> wrote in message news:4JTde.11307$o32.1391@fe09.lga...
> Hi,
> I seem to have a problem talking to a MAC chip that is connected as a
> memory-mapped device on the EBI bus of an EPXA1-672 chip (EBI2 for CS, no
> split reads and no prefetch). EBI1 is connected to a flash chip. I am using
> the GNU toolset to develop code and no OS (as yet).
>
> Apparently, if I read a single register (in my code), a series of 16 read
> accesses are made by the chip and cached. Subsequent reads do not access the
> MAC, but rather return values from the cache - I do not see any CS transitions
> at the chip pins. The write operations function perfectly well if I do not
> perform a read - one CS for every write request. Once a read is performed,
> the writes also cease to be "executed" and change the register value in the
> cache only.
>
> Has anyone seen a similar problem? The Altera folks have not responded to
> my trouble tickets - their support is not what it used to be.
>
> Thanks, bta3