I am currently tasked with recommending the largest, fastest FPGA with the most memory that is readily available in the first half of this year for an FPGA array card.

The choices have been narrowed down to two families: Altera's APEX-II (EP2A70) and Xilinx Virtex-II (XC2V6000).

Which can operate at the highest speed?

Steve

Article: 38326
Hi, I am having problems trying to constrain the inputs going into a multi-level parity generator in XST Verilog.
Here I am trying to generate the parity of 36 inputs for my PCI IP core, and, of course, Xilinx and Altera FPGAs are 4-input LUT-based, so the input signals go through multiple levels of LUTs to calculate the parity.
In the first level, the parity generator uses 9 LUTs to calculate parity.
In the second level, 2 LUTs take in 8 of the 9 outputs of the first-level LUTs, and the one remaining first-level output is used at the third level.
At the final third level, 2 inputs from the second-level LUTs and one input from the first-level LUTs are used to calculate the final parity result.
Here is the partial Verilog code for the top module where I instantiate the parity generator, and for the parity generator itself.

___________________________ Top Module _____________________________

Parity_Generator Parity_Generator_Generator_Instance(
    .clk(clk),
    .Parity_Input({c_be_n[3:0], ad_Port[31:0]}),
    .XORed_Result(Parity_Generated)
    );

____________________________________________________________________

__________________________ Parity Generator ________________________

module Parity_Generator(
    clk,
    Parity_Input,
    XORed_Result
    );

input        clk;
input [35:0] Parity_Input;
output       XORed_Result;

reg          XORed_Result;

wire [8:0]   First_Intermediate_Parity;
wire [2:0]   Second_Intermediate_Parity;
wire         Final_Parity;

// First level
assign First_Intermediate_Parity[8] = Parity_Input[35] ^ Parity_Input[34] ^ Parity_Input[33] ^ Parity_Input[32];
assign First_Intermediate_Parity[7] = Parity_Input[31] ^ Parity_Input[30] ^ Parity_Input[29] ^ Parity_Input[28];
assign First_Intermediate_Parity[6] = Parity_Input[27] ^ Parity_Input[26] ^ Parity_Input[25] ^ Parity_Input[24];
assign First_Intermediate_Parity[5] = Parity_Input[23] ^ Parity_Input[22] ^ Parity_Input[21] ^ Parity_Input[20];
assign First_Intermediate_Parity[4] = Parity_Input[19] ^ Parity_Input[18] ^ Parity_Input[17] ^ Parity_Input[16];
assign First_Intermediate_Parity[3] = Parity_Input[15] ^ Parity_Input[14] ^ Parity_Input[13] ^ Parity_Input[12];
assign First_Intermediate_Parity[2] = Parity_Input[11] ^ Parity_Input[10] ^ Parity_Input[ 9] ^ Parity_Input[ 8];
assign First_Intermediate_Parity[1] = Parity_Input[ 7] ^ Parity_Input[ 6] ^ Parity_Input[ 5] ^ Parity_Input[ 4];
assign First_Intermediate_Parity[0] = Parity_Input[ 3] ^ Parity_Input[ 2] ^ Parity_Input[ 1] ^ Parity_Input[ 0];

// Second level
assign Second_Intermediate_Parity[2] = First_Intermediate_Parity[8];
assign Second_Intermediate_Parity[1] = First_Intermediate_Parity[7] ^ First_Intermediate_Parity[6] ^ First_Intermediate_Parity[5] ^ First_Intermediate_Parity[4];
assign Second_Intermediate_Parity[0] = First_Intermediate_Parity[3] ^ First_Intermediate_Parity[2] ^ First_Intermediate_Parity[1] ^ First_Intermediate_Parity[0];

// Final level
assign Final_Parity = Second_Intermediate_Parity[2] ^ Second_Intermediate_Parity[1] ^ Second_Intermediate_Parity[0];

always @ (posedge clk) begin
    XORed_Result <= Final_Parity;
end

endmodule
____________________________________________________________________

From what I see, c_be_n[3:0] should go through

assign First_Intermediate_Parity[8] = Parity_Input[35] ^ Parity_Input[34] ^ Parity_Input[33] ^ Parity_Input[32];

But the problem I have here is that when I synthesize the code, XST Verilog (ISE WebPACK's synthesis tool) or Xilinx MAP somehow automatically chooses which inputs go into which LUTs, and I have a problem with that.
I want "c_be_n[3:0]", which is an unregistered bus signal of the PCI bus, to go through as few LUTs as possible to reduce setup time requirements.
For "ad_Port[31:0]," that signal comes from inside the chip (from DFFs), so I don't have to worry too much about how many levels of LUTs it passes through.
I tried disabling (unchecking) XST's option called XOR collapsing, but it didn't seem to make any difference.
I recently upgraded to the latest ISE WebPACK 4.1WP2.0 from 4.1WP0.0, but that didn't seem to make any difference, either.
For MAP, setting the Map to Inputs option to 4 or 5 didn't seem to make a difference.
I first noticed this problem when I synthesized my PCI IP core trying to meet 66MHz PCI timings (Tsu < 3ns, Tval(Tco) < 6ns) just out of curiosity.
In 33MHz PCI, this whole issue of which signals go through how many LUTs for calculating parity was not a big issue because Tsu only has to be < 7ns.
I found someone else discussing a better way of calculating 36-bit parity than the method shown above for Virtex architecture devices, so I modified my code to take advantage of that idea.
Here is the new partial Verilog code for the top module where I instantiate the parity generator, and for the parity generator itself.

___________________________ Top Module _____________________________

Parity_Generator Parity_Generator_Instance(
    .clk(clk),
    .Fast_Path_Parity_Input(cben[3:0]),
    .Parity_Input_1(ad_Port[3:0]),
    .Parity_Input_2(ad_Port[7:4]),
    .Parity_Input_3(ad_Port[11:8]),
    .Parity_Input_4(ad_Port[15:12]),
    .Parity_Input_5(ad_Port[19:16]),
    .Parity_Input_6(ad_Port[23:20]),
    .Parity_Input_7(ad_Port[27:24]),
    .Parity_Input_8(ad_Port[31:28]),
    .XORed_Result(Parity_Generated)
    );

____________________________________________________________________

__________________________ Parity Generator ________________________

module Parity_Generator(
    clk,
    Fast_Path_Parity_Input,
    Parity_Input_1,
    Parity_Input_2,
    Parity_Input_3,
    Parity_Input_4,
    Parity_Input_5,
    Parity_Input_6,
    Parity_Input_7,
    Parity_Input_8,
    XORed_Result
    );

input       clk;
input [3:0] Fast_Path_Parity_Input;
input [3:0] Parity_Input_1;
input [3:0] Parity_Input_2;
input [3:0] Parity_Input_3;
input [3:0] Parity_Input_4;
input [3:0] Parity_Input_5;
input [3:0] Parity_Input_6;
input [3:0] Parity_Input_7;
input [3:0] Parity_Input_8;
output      XORed_Result;

reg         XORed_Result;

wire [7:0]  First_Intermediate_Parity;
wire [1:0]  Second_Intermediate_Parity;
wire        Third_Intermediate_Parity;
wire        Final_Parity;

// First level
assign First_Intermediate_Parity[7] = Parity_Input_1[3] ^ Parity_Input_1[2] ^ Parity_Input_1[1] ^ Parity_Input_1[0];
assign First_Intermediate_Parity[6] = Parity_Input_2[3] ^ Parity_Input_2[2] ^ Parity_Input_2[1] ^ Parity_Input_2[0];
assign First_Intermediate_Parity[5] = Parity_Input_3[3] ^ Parity_Input_3[2] ^ Parity_Input_3[1] ^ Parity_Input_3[0];
assign First_Intermediate_Parity[4] = Parity_Input_4[3] ^ Parity_Input_4[2] ^ Parity_Input_4[1] ^ Parity_Input_4[0];
assign First_Intermediate_Parity[3] = Parity_Input_5[3] ^ Parity_Input_5[2] ^ Parity_Input_5[1] ^ Parity_Input_5[0];
assign First_Intermediate_Parity[2] = Parity_Input_6[3] ^ Parity_Input_6[2] ^ Parity_Input_6[1] ^ Parity_Input_6[0];
assign First_Intermediate_Parity[1] = Parity_Input_7[3] ^ Parity_Input_7[2] ^ Parity_Input_7[1] ^ Parity_Input_7[0];
assign First_Intermediate_Parity[0] = Parity_Input_8[3] ^ Parity_Input_8[2] ^ Parity_Input_8[1] ^ Parity_Input_8[0];

// Second level
assign Second_Intermediate_Parity[1] = First_Intermediate_Parity[7] ^ First_Intermediate_Parity[6] ^ First_Intermediate_Parity[5] ^ First_Intermediate_Parity[4];
assign Second_Intermediate_Parity[0] = First_Intermediate_Parity[3] ^ First_Intermediate_Parity[2] ^ First_Intermediate_Parity[1] ^ First_Intermediate_Parity[0];

// Third level
assign Third_Intermediate_Parity = Second_Intermediate_Parity[1] ^ Second_Intermediate_Parity[0];

// Final level
assign Final_Parity = Fast_Path_Parity_Input[3] ^ Fast_Path_Parity_Input[2] ^ Fast_Path_Parity_Input[1] ^ Fast_Path_Parity_Input[0] ^ Third_Intermediate_Parity;

always @ (posedge clk) begin
    XORed_Result <= Final_Parity;
end

endmodule
____________________________________________________________________

In the code shown above, "ad_Port[31:0]" has to go through 4 levels of LUTs, but like the previous version, that signal comes from inside the chip (from DFFs), so I don't have to worry too much about how many levels of LUTs it passes through.
The nice part of this method is that "c_be_n[3:0]" only has to go through 1 level of LUT.
Yes, a 5-input LUT's gate delay is larger than a 4-input LUT's gate delay, but the 5-input LUT's gate delay is far better than two 4-input LUTs connected in series with the routing delay between them.
In theory XST should use the Virtex architecture's 5-input LUT, but when I synthesize this code with the same XOR Collapse option disabled, XST still seems to collapse the XOR structure of the HDL code, and synthesizes it with 3 levels of LUTs using only 4-input LUTs.
How can I work around this problem and instruct XST not to collapse the XOR gate structure?
The "XOR Collapse" option I unchecked was under Project Navigator -> Processes for Current Source -> Synthesize -> Properties -> HDL Options.
I am absolutely willing to use an XST synthesis constraint file, which I already use to constrain the fanout of individual input signals, if that is possible, but I don't want to insert vendor-specific synthesis directives into my code because my PCI IP code will have to be synthesizable with other synthesis tools.
I am using ISE WebPACK 4.1WP2.0 to develop my PCI IP core, and the device I am currently targeting is Xilinx Spartan-II XC2S150-5CPQ208 or 6CPQ208.

Thanks,

Kevin Brace (Don't respond to me directly, respond within the newsgroup.)

Article: 38327
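Before fighting the synthesizer over LUT mapping, it can help to confirm that the restructured tree in Kevin Brace's second Parity_Generator computes the same 36-bit parity as a flat XOR. The following is only a software sketch of that structure, not synthesizable code; the function names are made up for illustration, and it says nothing about how XST will actually map the logic.

```python
import random
from functools import reduce
from operator import xor


def xor_bits(value, width):
    """XOR-reduce the low `width` bits of `value`."""
    return reduce(xor, ((value >> i) & 1 for i in range(width)), 0)


def parity_fast_path(cbe, ad):
    """Model of the second Parity_Generator structure.

    ad[31:0] passes through three XOR levels; cbe[3:0] joins only at
    the final 5-input XOR, so it sees a single level of logic.
    """
    first = [xor_bits(ad >> (4 * i), 4) for i in range(8)]        # eight 4-input XORs
    second = [reduce(xor, first[0:4]), reduce(xor, first[4:8])]   # two 4-input XORs
    third = second[0] ^ second[1]                                 # one 2-input XOR
    return xor_bits(cbe, 4) ^ third                               # final 5-input XOR


def parity_flat(cbe, ad):
    """Reference: plain XOR of all 36 bits of {cbe[3:0], ad[31:0]}."""
    return xor_bits((cbe << 32) | ad, 36)


def random_check(trials=1000, seed=0):
    """Compare both models on random inputs; True if they always agree."""
    rng = random.Random(seed)
    return all(
        parity_fast_path(c, a) == parity_flat(c, a)
        for c, a in ((rng.getrandbits(4), rng.getrandbits(32)) for _ in range(trials))
    )
```

Calling `random_check()` compares the two models on random vectors. Because XOR is associative, any grouping gives the same result, which is exactly why a synthesizer feels free to regroup the inputs unless it is told to keep the intermediate nets.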
rickman wrote:
>
> > After rethinking this I realized that there is no need remove the macro using
> > JBits. Simply delete the macro in FPGA Editor after place and route.
>
> Bret, are you saying that a macro like this can be treated as a single
> object and removed with a single command in the FPGA Editor?

Yes, set the list window to "all macros", pick the macro from the list and delete.

> > Regarding automatic macro creation, since any .ncd file can be converted to an
> > .nmc, it would be possible to use JBits to create an .ncd and then convert the
> > .ncd to an .nmc using FPGA Editor (File-->Save as macro). Since no external pins
> > would be needed for an interface, the only remaining operation is to set a
> > reference component (Select component, Edit-->Set Macro Reference Comp).
>
> What if external pins are needed? In my application, I will be defining
> five blocks of logic all connected to a common bus and to external pins.
> One block would have a lot of IO. The remaining four blocks are all
> equivalent with only about 30 IOs. These four blocks are each loaded
> with a design for 1 of N possible interfaces that match the HW that is
> plugged into the board.

I was using the term "external pins" in the context of macro creation. These are physical component pins (usually slice pins) that are defined as ports in/out of the macro and given names matching the ports in the corresponding symbol in the logical design. What I meant was that no external pins are needed in the dummy macro since it does not need to interface to real design. I wasn't making any statement regarding I/O components which can indeed be part of the macro.

Bret

> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 38328
Your shifter is apparently not being constructed as a merged tree. If it were a merged tree, a 32 bit shift would be 5 layers, each layer being a 32 bit 2:1 mux. The first layer shifts by either 0 or 16 bits, the next by 0 or 8 bits, the next by 0 or 4 and so on. With this construction you get a composite shift of 0 to 31 bits. Each layer consists of 32 2:1 muxes, each of which occupies an LE. 5 Layers is 160 LEs.

I suspect the shifter you are dealing with is instead a set of 32 32:1 muxes. Each of those is a tree containing the logic equivalent of 31 2:1 muxes, or about 6 times the logic. I further suspect that the logic has been reduced using the cascade chains to make 4:1 muxes from pairs of LE's, resulting in approximately 3 times the LE's of the merged tree.

Routing usually is not part of the macro, however if it is a placed macro, then the routing can be more or less forced onto certain paths. You have to know what the connection matrix is, however if you are going to congest the routing. In that respect, the xilinx devices are easier to tweak for performance because you have access to the routing matrix when doing the floorplanning, and routes are generally more local so there is less chance of having other stuff upset the routing in your floorplanned logic.

ssy wrote:
> Hi Ray
>
> Thanks for your help first
>
> but I have some different idea from you
>
> everytime before I compile in quartus, I assign the shifter to a
> custom region that hold three Megalab, it actually took almost every
> le in that 3 megalab(about 450 le), this is my first question, you say
> it ocuppy only 160 le, how to achive this? BTW, my shifter is 32 bit
> rotate left shifter,
>
> and every compilation get different route result, so I think the place
> and route information is not contain in the lib from altera, the P&R
> of the shifter is perform on the fly with the other logic of the
> design, is that right?
>
> hope for further help from you
>
> Ray Andraka <ray@andraka.com> wrote in message news:<3C3E8F95.C4CD51B2@andraka.com>...
> > A 32 bit barrel shift with 32 bits in and out should occupy 160 LEs.
> > Since the proper construction does not use the cascade/carry chains, it
> > can be laid out with 2 bits per LAB, so that it takes up 16 LABs.
> > Depending on how they are laid out, There may not be enough row routes
> > to squeeze it all into a single megalab. Unfortunately, Altera does not
> > provide information on the row route connections available at each LAB
> > (it is a sparse connection matrix), so doing hand placement can actually
> > hurt performance and density by forcing wires to go through an
> > intermediate lab to make connections. That said, the routing time is
> > fairly uniform at each level of hierarchy in Altera, so you may find
> > that you get little additional performance trying to do the placement
> > yourself.
> >
> > ssy wrote:
> >
> > > Hi everyone
> > >
> > > I am looking for a fast 32 bit barrel shifter for APEX20K400E, I use
> > > the LPM from Altera, but after P&R, I found it ocuppy three MegaLAB,
> > > and many wire run between them.
> > >
> > > so I think if somebody have hand place the shifter?
> >
> > --
> > --Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930 Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com
> >
> > "They that give up essential liberty to obtain a little
> > temporary safety deserve neither liberty nor safety."
> > -Benjamin Franklin, 1759

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 38329
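The merged-tree construction Ray describes in the shifter post above can be sketched in software to make the layer structure and the LE arithmetic concrete. This is a behavioral model only, assuming the 32-bit rotate-left case ssy mentions; the function names are invented for the example, and the LE counts simply restate the figures from the post (one LE per 2:1 mux).

```python
WIDTH = 32
STAGES = 5                      # log2(32) layers of 2:1 muxes
MASK = (1 << WIDTH) - 1


def rotl_merged_tree(x, amount):
    """Rotate left using 5 layers, each rotating by 0 or 16/8/4/2/1 bits.

    Each layer corresponds to a row of 32 2:1 muxes whose common select
    line is one bit of the shift amount.
    """
    x &= MASK
    for stage in reversed(range(STAGES)):       # 16 first, then 8, 4, 2, 1
        step = 1 << stage
        if (amount >> stage) & 1:               # this layer's mux select
            x = ((x << step) | (x >> (WIDTH - step))) & MASK
    return x


def rotl_reference(x, amount):
    """Straightforward rotate-left used to check the layered version."""
    amount %= WIDTH
    x &= MASK
    return ((x << amount) | (x >> (WIDTH - amount))) & MASK


def merged_tree_le_count(width=WIDTH, stages=STAGES):
    """One LE per 2:1 mux: `stages` layers of `width` muxes."""
    return stages * width


def flat_mux_equivalent_count(width=WIDTH):
    """`width` separate width:1 muxes, each equal to (width - 1) 2:1 muxes."""
    return width * (width - 1)
```

With these definitions the merged tree comes to 5 x 32 = 160 mux cells, while 32 independent 32:1 muxes come to 32 x 31 = 992 2:1-mux equivalents, which is the "about 6 times the logic" figure quoted in the post.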
Thanks so much, you are a great help.

"Ray Andraka" <ray@andraka.com> wrote in message news:3C3DED27.69F8054C@andraka.com...
> If your bus mux is connected directly from the BRAMs and you only have latency
> of 1, then that is a problem. If you look at the Tcko of the BRAMs, you'll find
> that it is pretty long compared to other delays in the chip. Add to that the
> fact that routing around the BRAMs is usually pretty congested (especially if
> you are using the BRAMs as 32 bits wide), plus your routing through several
> layers of logic before hitting a flip-flop. You need to put an additional
> pipeline register between the BRAM and your bus mux. You will also have to
> floorplan those flip-flops to be immediately adjacent to the BRAM. With 32 bits
> wide, You are going to have to avoid using the BRAMs on the edges of the chips,
> as it gets too congested there to put the registers close enough to the BRAMs.
> Even then, you may find that you need to use narrower data paths to the BRAMs to
> get enough fast connections from the BRAM. If you are using the BRAM ENA or WE,
> be aware that those inputs have a long set up time too, so the driving flip-flop
> has to be right there. Those inputs are actually even more critical than the
> data read path. In high performance stuff, I try to keep those inputs wired to
> constants and control the read/write address counters instead.
>
> "H.L" wrote:
>
> > "Ray Andraka" <ray@andraka.com> wrote in message
> > news:3C3DB8F1.48F10CD1@andraka.com...
> > > 155 MHz is not hard to achieve in a VirtexE (any speed grade), but you do have
> > > to be careful about how the design is implemented, particularly making sure that
> > > you don't have lots of levels of logic.
> >
> > In the timing errors the max levels of logic is 6, is this good?
> >
> > > You will likely need to do some
> > > floorplanning to get the speed, especially when reading from the BRAMs. If you
> > > are accessing the BRAMs at 155 MHz, you will need registers immediately adjacent
> > > to the BRAM with no LUTs between the BRAM and the registers, and you will have
> > > to floorplan those to place them there. Depending on how wide the BRAMs are,
> > > you may not be able to read them at 155MHz in a -6 or if 16 bits wide, even a -7
> > > part. One solution may be to run the BRAM at half the clock rate and read/write
> > > two locations per clock by using a set of staging registers.
> >
> > I use 3 BRAMS (128x32) at 155MHz , I have 3 modules that access them so I
> > use BUS MUXs for the memory arbitration (the BUS MUXs is LUT based and
> > registered with latency 1), all 3 modules use a fsm to read and write to
> > the BRAMs. Do you think that the logic is total wrong? In the functional
> > simulation all seem well (if that counts :)) )
> > In the timing errors I get that the total delay is mostly owing to route
> > (70% route - 30% logic), do you think that with floorplanning I will be able
> > to decrease the delays?
> >
> > Thank you very much for the help , I am new into these :)
> >
> > > The first step, of course, is to look at the timing report to see where your
> > > design is not meeting the timing. Once you do that, you'll know where you need
> > > to focus your attention.
> > >
> > > "H.L" wrote:
> > >
> > > > Hello all,
> > > >
> > > > I have to program a Virtex-E FPGA at 155MHz. For this purpose I use 8 vhdl
> > > > entities,a MUX BUS and a Block RAM from the CORE GENERATOR (I use XILINX ISE
> > > > 4.1 with SP2). I use for synthesis FPGA EXPRESS 3.6.1 , so I create a fpga
> > > > express project where I add the vhdl sources and the 2 edn files (the one
> > > > for the mux bus and the one for the block ram), is this the correct
> > > > procedure? I manage to export the netlist for my design but in the PAR
> > > > process I get too many timing errors!!!
> > > >
> > > > Thanks a lot
> > >
> > > --
> > > --Ray Andraka, P.E.
> > > President, the Andraka Consulting Group, Inc.
> > > 401/884-7930 Fax 401/884-7950
> > > email ray@andraka.com
> > > http://www.andraka.com
> > >
> > > "They that give up essential liberty to obtain a little
> > > temporary safety deserve neither liberty nor safety."
> > > -Benjamin Franklin, 1759
>
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930 Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
>
> "They that give up essential liberty to obtain a little
> temporary safety deserve neither liberty nor safety."
> -Benjamin Franklin, 1759

Article: 38330
Rick,

I have responded before on the power-on current issue, but for the group, I will respond here again.

For existing designs in Virtex, Virtex E, Spartan II, and Spartan IIE, consult the datasheets and the app notes:

http://www.support.xilinx.com/xapp/xapp450.pdf
http://www.support.xilinx.com/xapp/xapp451.pdf

These demonstrate ways to start up with as little as 80 mA at -40 C with the industrial grade parts by adding three components that cost pennies.

In Virtex II, there is no startup current at all on any of the three supplies. The startup current equals the operational current, and there is no extra current to be supplied. If all you can supply is the operational currents, it powers on cleanly and configures.

4KXLA and Spartan XL also have no startup current.

Austin

rickman wrote:
> I am finalizing my FPGA selection for a line of DSP boards that we will
> be making for a number of years. I have always been more familiar with
> Xilinx but had a chance to work with the Altera 10K parts this past
> year. They seem ok, but the nearly identical ACEX 1K family is better in
> most regards. But the gate size is limited if we are looking at having
> future growth and I am not finding as good a price as with the Xilinx
> SpartanII parts. The only vendors I can find are Arrow and Newark, and
> Newark does not show much on their web site. Anyone know how to get good
> price numbers on the Altera parts without having a handfull of specific
> parts? If I call the vendors, they always want me to give the a few part
> numbers and I am window shopping and need pricing on all the parts so I
> can make my choices.
>
> The other problem I have with the Altera FPGAs is the lack of LUT RAM.
> There are only a small number of RAM (EAB/ESB) in these parts and I need
> a lot more blocks of it than are available. They don't have to be big,
> the 16 words available in a bank of Xilinx LUTs is perfect. For example,
> I will need 64 blocks of RAM if four modules of the 8 channel ADC/DAC
> are on board. This is not hard using the Xilinx LUTs. Anyone know of a
> way to do something similar in an Altera FPGA?
>
> BTW, just to mention why I like the Altera parts... THEY DON'T HAVE A
> STARTUP CURRENT SURGE!!! Was I at all unclear about that? :)
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 38331
Rick,

I think you are intending to use these in a signal processing (read arithmetic and filtering) application. If that is the case, be very careful. The SpartanII/Virtex offer more advantages for DSP than just having distributed RAM.

If you look at the Altera carry chain, it breaks the 4LUTs into a pair of 3LUTs, one for the carry and one for the 'sum'. One input to those 3LUTs is your carry from the adjacent bit. That means, at best, you get a two input arithmetic function in each level of logic. Things like adder/subtractors wind up using two or more times the LUTs of the equivalent function in Xilinx.

Also, the carry chain runs through the LAB, so your data flow has to run from LAB to LAB. In the 10K and I believe (correct me if I am wrong) in the ACEX families, there are no direct connections between LABs, so the data path has to go onto the row routing. There are 6 row routes for every 8 LEs, so in a heavily arithmetic design you run out of row routes when the row gets 3/4 full. It is actually worse than that: since the row routing connections are a sparse matrix, you need to route through an intermediate LAB if there is not a direct connection between your source and destination. As the row fills up, the number of no connects goes up sharply, accelerating the saturation. In a heavily arithmetic design, you hit a pretty hard limit at about 50% device utilization because of this. The 20K family greatly improves the situation by the addition of direct connects between adjacent LABs.

Also, don't discount the utility of the SRL16s in Xilinx. Not only do those make very compact delay queues, which are extensively used in filters, but they also give you a way to reload LUT contents without having to reconfigure the device. This is valuable for DA filters, since the coefficients are stored as partial sums in LUTs. In Altera, a reprogrammable filter is much harder to build because the LUTs don't help you there. The SRL16s are also great for doing small reorder queues. Reordering comes up frequently in signal processing in such operations as FFTs, channel multiplexing, etc.

Before you abandon the Xilinx offerings, I would look long and hard at what you are giving up. The cost may not be worth the small gains you get.

rickman wrote:
> I am finalizing my FPGA selection for a line of DSP boards that we will
> be making for a number of years. I have always been more familiar with
> Xilinx but had a chance to work with the Altera 10K parts this past
> year. They seem ok, but the nearly identical ACEX 1K family is better in
> most regards. But the gate size is limited if we are looking at having
> future growth and I am not finding as good a price as with the Xilinx
> SpartanII parts. The only vendors I can find are Arrow and Newark, and
> Newark does not show much on their web site. Anyone know how to get good
> price numbers on the Altera parts without having a handfull of specific
> parts? If I call the vendors, they always want me to give the a few part
> numbers and I am window shopping and need pricing on all the parts so I
> can make my choices.
>
> The other problem I have with the Altera FPGAs is the lack of LUT RAM.
> There are only a small number of RAM (EAB/ESB) in these parts and I need
> a lot more blocks of it than are available. They don't have to be big,
> the 16 words available in a bank of Xilinx LUTs is perfect. For example,
> I will need 64 blocks of RAM if four modules of the 8 channel ADC/DAC
> are on board. This is not hard using the Xilinx LUTs. Anyone know of a
> way to do something similar in an Altera FPGA?
>
> BTW, just to mention why I like the Altera parts... THEY DON'T HAVE A
> STARTUP CURRENT SURGE!!! Was I at all unclear about that? :)
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 38332
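Ray's point about SRL16s in the DSP post above, that one LUT can serve as a compact delay queue whose contents are reloadable by shifting, can be illustrated with a small behavioral model. This is a rough sketch under the assumption that the primitive behaves as a 16-deep, 1-bit shift register with a 4-bit address selecting the output tap; the class and function names are hypothetical and this is not vendor simulation code.

```python
class SRL16Model:
    """Behavioral sketch of a 16-deep, 1-bit addressable shift register."""

    DEPTH = 16

    def __init__(self, init=0):
        # bits[0] is the most recently shifted-in bit.
        self.bits = [(init >> i) & 1 for i in range(self.DEPTH)]

    def shift(self, d):
        """One enabled clock edge: new bit enters, oldest bit falls off."""
        self.bits = [d & 1] + self.bits[:-1]

    def read(self, addr):
        """Combinational output: the bit at tap `addr` (0..15)."""
        return self.bits[addr & 0xF]


def delay_line(samples, taps):
    """Delay a bit stream by `taps` clocks (1..16) using one model instance."""
    if not 1 <= taps <= SRL16Model.DEPTH:
        raise ValueError("taps must be between 1 and 16")
    srl = SRL16Model()
    out = []
    for d in samples:
        out.append(srl.read(taps - 1))   # output seen just before this clock edge
        srl.shift(d)
    return out


def reload_contents(srl, new_bits):
    """Replace all 16 stored bits by shifting, with no reconfiguration."""
    for b in new_bits:                   # expects exactly 16 bits
        srl.shift(b)
    return srl
```

Changing `taps` changes the delay without touching the stored data, which is the property that makes the structure handy for the delay and reorder queues mentioned in the post. `reload_contents` mirrors the idea of rewriting LUT-stored values by shifting in sixteen new bits.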
Depends very heavily on what you are putting in there.

Steve Holroyd wrote:
> I am currently task of recommending the largest, fastest and most
> memory FPGA that's readily available the first half of this year for a
> FPGA Array Card.
>
> The choices have been narrowed down to two families Altera's APEX-II
> (EP2A70) and XILINX Virtex-II (XC2V6000).
>
> Which can operate at the highest speed?
>
> Steve

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 38333
Because of a separate project I am doing, I had a Linux box set up to emulate Windows using Wine. The interesting thing is that we set up two identical PCs. Both had the same motherboard with dual 1 GHz P3 processors and 1 GB of RAM. Both have 7200 rpm IBM hard drives. We took the same EDIF and UCF files and routed them on both. The design was in an XC2V3000 part with utilization around 70%.

The Linux box showed almost a 2 to 1 speedup in PAR. The Linux box also shows huge speedups using fpga_editor. The Windows box (running Windows 2000 Pro) takes over 10 minutes just to open the routed NCD. The Linux box takes just a little over 3 minutes. If you add a net with autoroute on, the Windows box takes several minutes; the Linux box takes less than 10 seconds.

Just wanted to see if anyone else has experimented with this. I am now doing all of my routes and fpga_editor work on the Linux box and saving a lot of time. BTW, the time for PAR on this particular design was over 2 hours on the Windows box and just over an hour on the Linux box.

Bryan

Article: 38334
BTW, nothing was running under windows, not even a screen saver. Task Manager showed that no other native apps were using any processor cycles.

Bryan

"Bryan" <bryan@srccomp.com> wrote in message news:3c3f4b2e$0$26669$724ebb72@reader2.ash.ops.us.uu.net...
> Because of a separate project I am doing, I had a Linux box set up to
> emulate windows using wine. The interesting thing is that we set up two
> identical PCs. Both had same motherboard with dual P3 1Ghz processors and
> 1Gbyte of ram. Both have 7200rpm IBM hard drives. We took the same edif
> and ucf files and routed them on both. The design was in a XC2V3000 part
> with utilization around 70%. The Linux box showed almost a 2 to 1 speedup
> in PAR. Also the Linux box shows huge speedups using fpga_editor. The
> windows box(running windows 2000pro) takes over 10 minutes just to open the
> routed ncd. The Linux box takes just a little over 3 minutes. If you add a
> net with autoroute on, windows box is several minutes, Linux box is less
> than 10 seconds. Just wanted to see if anyone else has experimented with
> this. I am now doing all of my routes and fpga_editor work on the linux box
> and saving a lot of time. BTW, the time for PAR on this particular design
> was over 2 hours on the windows box and just over an hour on the linux box.
>
> Bryan

Article: 38335
You are kind of stuck if you are unwilling to use any vendor-specific coding. You could use keep buffers in your synthesis to force the synthesizer to assign logic to specific LUTs. The mapper should pretty much leave those alone as long as the synthesizer outputs LUT primitives. Unfortunately, the keep buffers have different syntaxes between vendors.

An option, if this part is not changed from instance to instance, is to compile the parity component separately to an EDIF using vendor-specific code, then distribute it and instantiate it as a black box in the design. The PAR tools will look for the EDIF to merge the black box component with the rest of the design. You can include the code when compiling for simulation so that the simulation is correct, and remove it from the synthesis compile script so that it gets black-boxed.

Kevin Brace wrote:
> Hi, I am having problems with trying to constrain the inputs
> going into a multi-level parity generator in XST Verilog.
> Here I am trying to generate a parity of 36 inputs for my PCI IP core,
> and, of course, Xilinx and Altera FPGAs are 4-input LUT-based, so the
> input signals go through multiple levels of LUTs to calculate the
> parity.
> In the first level, the parity generator uses 9 LUTs to calculate
> parity.
> In the second level, 2 LUTs take in 8 of the 9 outputs of the first
> level LUTs, and the remaining one output from the first level LUT will
> be used at the third level.
> At the final third level, 2 inputs from the second level LUTs and one
> input from the first level LUTs will be used to calculate the final
> parity calculation result.
> Here are the partial Verilog codes for the top module where I
> instantiate the parity generator, and the parity generator.
>
> ___________________________ Top Module _____________________________
>
> Parity_Generator Parity_Generator_Generator_Instance(
> .clk(clk),
> .Parity_Input({c_be_n[3:0], ad_Port[31:0]}),
> .XORed_Result(Parity_Generated)
> );
>
> ____________________________________________________________________
>
> __________________________ Parity Generator ________________________
>
> module Parity_Generator(
> clk,
> Parity_Input,
> XORed_Result
> );
>
> input clk;
> input[35:0] Parity_Input;
> output XORed_Result;
>
> reg XORed_Result;
>
> wire[8:0] First_Intermediate_Parity;
> wire[2:0] Second_Intermediate_Parity;
> wire Final_Parity;
>
> // First level
> assign First_Intermediate_Parity[8] = Parity_Input[35] ^
> Parity_Input[34] ^ Parity_Input[33] ^ Parity_Input[32];
> assign First_Intermediate_Parity[7] = Parity_Input[31] ^
> Parity_Input[30] ^ Parity_Input[29] ^ Parity_Input[28];
> assign First_Intermediate_Parity[6] = Parity_Input[27] ^
> Parity_Input[26] ^ Parity_Input[25] ^ Parity_Input[24];
> assign First_Intermediate_Parity[5] = Parity_Input[23] ^
> Parity_Input[22] ^ Parity_Input[21] ^ Parity_Input[20];
> assign First_Intermediate_Parity[4] = Parity_Input[19] ^
> Parity_Input[18] ^ Parity_Input[17] ^ Parity_Input[16];
> assign First_Intermediate_Parity[3] = Parity_Input[15] ^
> Parity_Input[14] ^ Parity_Input[13] ^ Parity_Input[12];
> assign First_Intermediate_Parity[2] = Parity_Input[11] ^
> Parity_Input[10] ^ Parity_Input[ 9] ^ Parity_Input[ 8];
> assign First_Intermediate_Parity[1] = Parity_Input[ 7] ^ Parity_Input[
> 6] ^ Parity_Input[ 5] ^ Parity_Input[ 4];
> assign First_Intermediate_Parity[0] = Parity_Input[ 3] ^ Parity_Input[
> 2] ^ Parity_Input[ 1] ^ Parity_Input[ 0];
>
> // Second level
> assign Second_Intermediate_Parity[2] = First_Intermediate_Parity[8];
> assign Second_Intermediate_Parity[1] = First_Intermediate_Parity[7] ^
> First_Intermediate_Parity[6] ^ First_Intermediate_Parity[5] ^
> First_Intermediate_Parity[4];
> assign Second_Intermediate_Parity[0] = First_Intermediate_Parity[3] ^
> First_Intermediate_Parity[2] ^ First_Intermediate_Parity[1] ^
> First_Intermediate_Parity[0];
>
> // Final level
> assign Final_Parity = Second_Intermediate_Parity[2] ^
> Second_Intermediate_Parity[1] ^ Second_Intermediate_Parity[0];
>
> always @ (posedge clk) begin
>
> XORed_Result <= Final_Parity;
>
> end
>
> endmodule
> ____________________________________________________________________
>
> From what I see, the c_be_n[3:0] should go through,
>
> assign First_Intermediate_Parity[8] = Parity_Input[35] ^
> Parity_Input[34] ^ Parity_Input[33] ^ Parity_Input[32];
>
> But the problem I have here is that when I synthesize the code, XST
> Verilog (ISE WebPACK's synthesis tool) or Xilinx MAP somehow
> automatically chooses which inputs goes into which LUTs, and I have a
> problem with that.
> I want "c_be_n[3:0]" which is an unregistered bus signal of PCI bus to
> go through as fewer LUTs as possible to reduce setup time requirements.
> For "ad_Port[31:0]," that signal comes from inside of the chip (from
> DFFs), so I don't have to worry too much about how many levels of LUTs
> it passes through.
> I tried disabling (unchecking) XST's option called XOR collapsing, but
> it didn't seem to make any difference.
> I recently upgraded to the latest ISE WebPACK 4.1WP2.0 from 4.1WP0.0,
> but that didn't seem to make any difference, either.
> For MAP, setting Map to Inputs option to 4 or 5 didn't seem to make
> difference.
> I first noticed this problem when I synthesized my PCI IP core
> trying to meet 66MHz PCI timings (Tsu < 3ns, Tval(Tco) < 6ns) just for
> curiosity.
> In 33MHz PCI, this whole issue of which signals go through how many LUTs
> for calculating parity was not a big issue because Tsu only has to be <
> 7ns.
> I found someone else discussing a better way of calculating > 36-bit parity than the method shown above for Virtex architecture > devices, so I modified my code to take advantage of that idea. > Here are the new partial Verilog codes for the top module where I > instantiate the parity generator, and the parity generator. > > ___________________________ Top Module _____________________________ > > Parity_Generator Parity_Generator_Instance( > .clk(clk), > .Fast_Path_Parity_Input(cben[3:0]), > .Parity_Input_1(ad_Port[3:0]), > .Parity_Input_2(ad_Port[7:4]), > .Parity_Input_3(ad_Port[11:8]), > .Parity_Input_4(ad_Port[15:12]), > .Parity_Input_5(ad_Port[19:16]), > .Parity_Input_6(ad_Port[23:20]), > .Parity_Input_7(ad_Port[27:24]), > .Parity_Input_8(ad_Port[31:28]), > .XORed_Result(Parity_Generated) > ); > > ____________________________________________________________________ > > __________________________ Parity Generator ________________________ > > module Parity_Generator( > clk, > Fast_Path_Parity_Input, > Parity_Input_1, > Parity_Input_2, > Parity_Input_3, > Parity_Input_4, > Parity_Input_5, > Parity_Input_6, > Parity_Input_7, > Parity_Input_8, > XORed_Result > ); > > input clk; > input[3:0] Fast_Path_Parity_Input; > input[3:0] Parity_Input_1; > input[3:0] Parity_Input_2; > input[3:0] Parity_Input_3; > input[3:0] Parity_Input_4; > input[3:0] Parity_Input_5; > input[3:0] Parity_Input_6; > input[3:0] Parity_Input_7; > input[3:0] Parity_Input_8; > output XORed_Result; > > reg XORed_Result; > > wire[7:0] First_Intermediate_Parity; > wire[1:0] Second_Intermediate_Parity; > wire Third_Intermediate_Parity; > wire Final_Parity; > > // First level > assign First_Intermediate_Parity[7] = Parity_Input_1[3] ^ > Parity_Input_1[2] ^ Parity_Input_1[1] ^ Parity_Input_1[0]; > assign First_Intermediate_Parity[6] = Parity_Input_2[3] ^ > Parity_Input_2[2] ^ Parity_Input_2[1] ^ Parity_Input_2[0]; > assign First_Intermediate_Parity[5] = Parity_Input_3[3] ^ > Parity_Input_3[2] ^ 
Parity_Input_3[1] ^ Parity_Input_3[0]; > assign First_Intermediate_Parity[4] = Parity_Input_4[3] ^ > Parity_Input_4[2] ^ Parity_Input_4[1] ^ Parity_Input_4[0]; > assign First_Intermediate_Parity[3] = Parity_Input_5[3] ^ > Parity_Input_5[2] ^ Parity_Input_5[1] ^ Parity_Input_5[0]; > assign First_Intermediate_Parity[2] = Parity_Input_6[3] ^ > Parity_Input_6[2] ^ Parity_Input_6[1] ^ Parity_Input_6[0]; > assign First_Intermediate_Parity[1] = Parity_Input_7[3] ^ > Parity_Input_7[2] ^ Parity_Input_7[1] ^ Parity_Input_7[0]; > assign First_Intermediate_Parity[0] = Parity_Input_8[3] ^ > Parity_Input_8[2] ^ Parity_Input_8[1] ^ Parity_Input_8[0]; > > // Second level > assign Second_Intermediate_Parity[1] = First_Intermediate_Parity[7] ^ > First_Intermediate_Parity[6] ^ First_Intermediate_Parity[5] ^ > First_Intermediate_Parity[4]; > assign Second_Intermediate_Parity[0] = First_Intermediate_Parity[3] ^ > First_Intermediate_Parity[2] ^ First_Intermediate_Parity[1] ^ > First_Intermediate_Parity[0]; > > // Third level > assign Third_Intermediate_Parity = Second_Intermediate_Parity[1] ^ > Second_Intermediate_Parity[0]; > > // Final level > assign Final_Parity = Fast_Path_Parity_Input[3] ^ > Fast_Path_Parity_Input[2] ^ Fast_Path_Parity_Input[1] ^ > Fast_Path_Parity_Input[0] ^ Third_Intermediate_Parity; > > always @ (posedge clk) begin > > XORed_Result <= Final_Parity; > > end > > endmodule > > ____________________________________________________________________ > > In the above shown code, "ad_Port[31:0]" has to go through 4 > levels of LUTs, but like the previous version, that signal comes from > inside of the chip (from DFFs), so I don't have to worry too much about > how many levels of LUTs it passes through. > The nice part of this method is that "c_be_n[3:0]" only has to go > through 1 level of LUT. 
> Yes, a 5-input LUT's gate delay is larger than a 4-input LUT's gate > delay, but the 5-input LUT's gate delay is far better than two 4-input > LUTs connected in series with the routing delay between two 4-input > LUTs. > In theory XST should use Virtex architecture's 5-input LUT, but when I > synthesize this code with the same XOR Collapse option disabled, XST > still seems to collapse the XOR structure of the HDL code, and > synthesizes with 3 levels of LUTs using only 4-input LUTs. > How can I work around this problem to instruct XST not to collapse the > XOR gate structure? > The "XOR Collapse" option I unchecked was through Project Navigator -> > Processes for Current Source -> Synthesize -> Properties -> HDL Options. > I am absolutely willing to use XST synthesis constraint file which I > already use to constraint fanouts of individual inputs signals, if that > is possible, but I don't want to insert vendor specific synthesis > directives into my code because my PCI IP code will have to be > synthesizable with other synthesis tools. > I am using ISE WebPACK 4.1WP2.0 to develop my PCI IP core, and the > device I am currently targeting is Xilinx Spartan-II XC2S150-5CPQ208 or > 6CPQ208. > > Thanks, > > Kevin Brace (Don't respond to me directly, respond within the > newsgroup.) -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 38336
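Editor's sketch (not part of the original posts): the restructured generator quoted above only helps if it still computes the same 36-bit parity as the flat XOR. The Python model below mirrors the second Parity_Generator level by level and compares it with a flat XOR reduction over random inputs. It is a behavioural check only, written in Python rather than Verilog; the function names are made up for illustration, and it says nothing about how XST or MAP will actually pack the LUTs.

```python
from functools import reduce
from operator import xor
import random


def xor_bits(bits):
    """XOR-reduce an iterable of 0/1 values."""
    return reduce(xor, bits, 0)


def parity_flat(c_be_n, ad):
    """Reference: parity over all 36 bits, with no structure implied."""
    return xor_bits(c_be_n + ad)


def parity_fast_path(c_be_n, ad):
    """Level-by-level mirror of the second Parity_Generator in the post.

    ad passes through two levels of 4-input XORs and one 2-input XOR,
    then the final XOR; c_be_n only enters that final 5-input XOR.
    """
    first = [xor_bits(ad[i:i + 4]) for i in range(0, 32, 4)]  # 8 four-input XORs
    second = [xor_bits(first[0:4]), xor_bits(first[4:8])]     # 2 four-input XORs
    third = second[0] ^ second[1]                             # 1 two-input XOR
    return xor_bits(c_be_n) ^ third                           # final 5-input XOR


def check_equivalence(trials=1000, seed=0):
    """Return True if both models agree on `trials` random input vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        c_be_n = [rng.randint(0, 1) for _ in range(4)]
        ad = [rng.randint(0, 1) for _ in range(32)]
        if parity_flat(c_be_n, ad) != parity_fast_path(c_be_n, ad):
            return False
    return True
```

Calling `check_equivalence()` exercises both models on the same vectors; because XOR is associative and commutative, any regrouping of the inputs must agree with the flat version, which is why the synthesizer is free to regroup them unless it is constrained.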
The other big difference that separates FPGA and ASIC designs is the ability of ASIC designs to be mixed mode. --jayArticle: 38337
Matthias Weber <msweber@onlinehome.de> wrote in message news:<1103_1010490929@news.online.de>...
> hi,
>
> do i understand right that latches consist of simple flipflops without being clocked, so that the circuit stores immediately every change of signal?
> is the difference between latches and registers that the latter are clocked (constructed by D-, RS- or JK-FlipFlops)?
>
> thanks for information,
>
> matthias weber

Unfortunately, the terms latch, register and flip-flop do not have universally accepted meanings that everyone adheres to. You can have memory elements with no clock input, memory elements that do have a clock, elements that have an enable, and collections of memory elements.

Take two two-input NOR gates and join the output of each to one input of the other, in a cross-coupled circuit. This is a bistable circuit that can be changed from one state to the other by asserting the right signals on the two spare inputs. It is normally referred to as an SR latch, but also as an RS latch, an RS flip-flop, an SR bistable, etc. Perhaps this is the circuit you have in mind when you mention 'simple flipflops'.

If you add a third input (a clock) and some circuitry that means that the output can only change in response to a change of state on the clock, then you get an edge-triggered device. Most people call this a flip-flop, although you do find references to edge-triggered latches.

If, instead of the clock, you add a third input (an enable) and some (different) circuitry that means that the output follows the input when the enable is in one state, but stores the latest input when the enable is in its other state, then you get a level-triggered device. Most people call this a latch and, to emphasise the way the output follows the input when the enable is active, qualify the latch as transparent.

If you string some flip-flops (edge-triggered) together, with all the clocks combined, you get what most people call a register.
On the other hand, the term registered when describing a digital output, probably just means that you need to apply a clock signal to get the output to change, and may apply to just one output. So, be careful what you read into these terms. Make sure you know whether the device is level triggered or edge triggered, and what polarity of level or edge makes the device output change. Martin RiceArticle: 38338
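Editor's sketch (not part of the original posts): the cross-coupled NOR circuit described in the post above can be modelled in a few lines of Python. The function `sr_latch` and its settle loop are illustrative assumptions; real gates settle in continuous time, and the model simply iterates the two gate equations until the outputs stop changing.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return 0 if (a or b) else 1


def sr_latch(s, r, q=0, qn=1):
    """Settle a cross-coupled NOR pair and return (q, qn).

    q and qn are the stored outputs before the inputs are applied.
    s = 1 sets q, r = 1 resets q, s = r = 0 holds the stored state.
    """
    for _ in range(4):  # a few passes are enough for the valid input combinations
        q_new = nor(r, qn)
        qn_new = nor(s, q_new)
        if (q_new, qn_new) == (q, qn):
            break
        q, qn = q_new, qn_new
    return q, qn
```

For example, `sr_latch(1, 0)` sets the latch and `sr_latch(0, 0, 1, 0)` then holds the set state. Driving both inputs high forces both outputs low in this model, which is the usual reason that combination is treated as invalid.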
Bret Wade wrote:
> rickman wrote:
> >
> > > After rethinking this I realized that there is no need to remove the macro using
> > > JBits. Simply delete the macro in FPGA Editor after place and route.
> >
> > Bret, are you saying that a macro like this can be treated as a single
> > object and removed with a single command in the FPGA Editor?
>
> Yes, set the list window to "all macros", pick the macro from the list and delete.
>

I should also point out that hard macros can be imported into an .ncd using FPGA Editor (Edit --> Add Macro), so there is no need to represent the anti-core macro in the logical front end and "compile" it into the design. This also means that a library of anti-cores can be developed and used interchangeably.

Bret
Article: 38339
Thanks for the advice Ray. I finally figured out a way to do most of what I want. My real problem is that I am trying to design a board that can be everything to everyone depending on the chips you add. This means low power if that is what you need or high performance if that is what you need. And of course, it always has to be low cost. So I keep running into walls with this approach. The one that I really hate is the problems that the Virtex/SpartanII startup current causes. But if I only use one chip per voltage, I can get around that. So for now I am looking at using a CoolRunner as the PC/104 interface. This is not too expensive and it keeps that part of the power down. Then everything else will go into an XC2S150E. This has a little room for growth if there is future need for a bigger part. Too bad they didn't keep the pinout compatible with the Virtex E parts, d..m! The other problem this causes me is the lack of reconfiguration for different combinations of IO. There is just not room on the board (or power supply) to give each IO module its own FPGA like the current board has. The new board has 4 IO sites rather than just 2. So I will be looking hard at Jbits and various methods of modularizing bitstreams in the future. For now this will get us off the ground and allow us to design a workable board. Maybe when the Spartan version of the XC2V parts is available I will work with that if I am still working then :) BTW, I am aware of the limitations of the Altera architecture for doing math. The 10K/1K family has other limitations as well since not all of the inputs to the LUT can be used when you are using all the inputs to the FFs. But the real kicker is the MAX+PLUS II tool. We found that in a dense design it will just plain lie about timing. The analyzer says you have a good design that meets timing and because of the complex routing that can be required, it miscalculates and the chip will fail at temperature. 
At least they have moved most of the 10K/1K family to Quartus in the paid versions. The free versions still only support them under MAX+PLUS II. Ray Andraka wrote: > > Rick, > > I think you are intending to use these in a signal processing (read > arithmetic and filtering) application. If that is the case, be very > careful. The SpartanII/Virtex offer more advantages for DSP than just > having distributed RAM. If you look at the Altera carry chain, it breaks > the 4LUTs into a pair of 3LUTs, one for the carry and one for the 'sum'. > One input to those 3LUTs is your carry from the adjacent bit. That means, > at best, you get a two input arithmetic function in each level of logic. > Things like adder/subtractors wind up using two or more times the LUTs as > the equivalent function in Xilinx. Also, the carry chain runs through the > LAB, so your data flow has to run from LAB to LAB. In the 10K and I believe > (correct me if I am wrong) in the Acex familes, there are no direct > connections between LABs, so the data path has to go onto the row routing. > There are 6 row routes for every 8 LE's, so in a heavily arithmetic design > you run out of row routes when the row gets 3/4 full. It is actually worse > than that: since the row routing connections are a sparse matrix, you need > to route through an intermediate LAB if there is not a direct connection > between your source and destination. As the row fills up, the number of no > connects goes up sharply, accelerating the saturation. In a heavily > arithmetic design, you hit a pretty hard limit at about 50% device > utilization because of this. The 20K family greatly improves the situation > by the addition of direct connects between adjacent LABs. Also, don't > discount the utility of the SRL16's in Xilinx. Not only do those make very > compact delay queues, which are extensively used in filters, but they also > give you a way to reload LUT contents without having to reconfigure the > device. 
This is valuable for DA filters, since the coefficients are stored > as partial sums in LUTs. In altera, a reprogrammable filter is much harder > to build because the LUTs don't help you there. The SRL16's are also great > for doing small reorder queues. Reordering comes up frequently in signal > processing in such operations as FFTs, channel multiplexing etc. > > Before you abandon the Xilinx offerings, I would look long and hard at what > you are giving up. The cost may not be worth the small gains you get. > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759 -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 38340
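Editor's sketch (not part of the original posts): the two SRL16 uses mentioned in the quoted text, a compact delay queue and a LUT whose contents can be reloaded by shifting, can be illustrated with a small Python model. This is a behavioural approximation of a 16-deep shift register with a 4-bit addressable tap; the class and helper names are hypothetical, and timing, initial contents and cascading are not modelled.

```python
class SRL16:
    """Behavioural model: 16-bit shift register with an addressable output tap."""

    def __init__(self):
        self.bits = [0] * 16  # bits[0] holds the most recently shifted-in bit

    def clock(self, d, ce=1):
        """One clock edge: shift d in when the clock enable is active."""
        if ce:
            self.bits = [d & 1] + self.bits[:15]

    def q(self, addr):
        """Output tap selected by a 4-bit address."""
        return self.bits[addr & 0xF]


def delay_line(stream, addr):
    """Feed a bit stream through the model and read the tap after each clock."""
    srl = SRL16()
    out = []
    for bit in stream:
        srl.clock(bit)
        out.append(srl.q(addr))
    return out


def load_table(srl, table):
    """Shift a 16-entry table in serially so that q(a) == table[a] afterwards."""
    for bit in reversed(table):
        srl.clock(bit)
```

With `addr` held constant the model behaves as a fixed delay; after `load_table` the same structure behaves as a 16x1 lookup table addressed by `addr`, which is the property the post relies on for reloading filter coefficients without reconfiguring the device.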
Hey Rick, I see you are in MD. Strong suggestion here - call your local Altera office 410-203-1245 and ask for Jeff Wills. He can meet with you and tell you all about Altera's DSP capabilities in architecture, tools, and IP. Much more significant than 10K. Do yourself a favor and get this in person or on the phone. Guy Altera Corp. rickman <spamgoeshere4@yahoo.com> wrote in message news:<3C3F3029.5E555144@yahoo.com>... > I am finalizing my FPGA selection for a line of DSP boards that we will > be making for a number of years. I have always been more familiar with > Xilinx but had a chance to work with the Altera 10K parts this past > year. They seem ok, but the nearly identical ACEX 1K family is better in > most regards. But the gate size is limited if we are looking at having > future growth and I am not finding as good a price as with the Xilinx > SpartanII parts. The only vendors I can find are Arrow and Newark, and > Newark does not show much on their web site. Anyone know how to get good > price numbers on the Altera parts without having a handful of specific > parts? If I call the vendors, they always want me to give them a few part > numbers and I am window shopping and need pricing on all the parts so I > can make my choices. > > The other problem I have with the Altera FPGAs is the lack of LUT RAM. > There are only a small number of RAM (EAB/ESB) in these parts and I need > a lot more blocks of it than are available. They don't have to be big, > the 16 words available in a bank of Xilinx LUTs is perfect. For example, > I will need 64 blocks of RAM if four modules of the 8 channel ADC/DAC > are on board. This is not hard using the Xilinx LUTs. Anyone know of a > way to do something similar in an Altera FPGA? > > BTW, just to mention why I like the Altera parts... THEY DON'T HAVE A > STARTUP CURRENT SURGE!!! Was I at all unclear about that? :) > > -- > > Rick "rickman" Collins > > rick.collins@XYarius.com > Ignore the reply address. 
To email me use the above address with the XY > removed. > > Arius - A Signal Processing Solutions Company > Specializing in DSP and FPGA design URL http://www.arius.com > 4 King Ave 301-682-7772 Voice > Frederick, MD 21701-3110 301-682-7666 FAXArticle: 38341
Yes Austin, we did discuss the power surge issue, and there are ways around the problem. But none of the proposed solutions are workable for me. I looked at xapp451, "SpartanII poweron assist" again just today to make sure I did not miss anything. The problem is that the added circuitry is not as cheap or as simple as you would like to think. I noticed that the example used in this xapp uses a 2600 uF capacitor! The AVX datasheet only goes as high as 1000 uF at 6.3 volts with a .32" x .17" footprint including pads. This would require three of these or something around .32" x .6". I don't consider that to be small when being used with a .7" square chip. Actually the largest cap I could find in a quick search was a 470 uF device which would require 5 at $2.50 each. That makes the cost of the POS circuit nearly as much as the FPGA!!! It would actually be easier for me to increase the size of my DC-DC converter and supply the extra amps directly. I believe you (or Peter) offered in the newsgroup to provide more accurately qualified data on the POS (power on surge), but all I ever got was a phone call or email indicating that the numbers "could" be reduced. Rather than being told what the reduced numbers were, I was asked what my target was. I guess I was hoping that the POS was actually very overstated, especially in the smaller part. But it seems that it could only be reduced by some 25% or so. In any event, I am about ready to go with a XC2Ve part along with a CoolRunner for the 5 volt interface. I may not like some of the limitations of the Spartan II parts, but I will love them if I can find a way to design for partial reconfiguration. Austin Lesea wrote: > > Rick, > > I have responded before on the power on current issue, but for the group, I > will respond here again. 
> > For existing designs in Virtex, Virtex E, Spartan II, and Spartan IIE, > consult the datasheets, and the app notes: > > http://www.support.xilinx.com/xapp/xapp450.pdf > > http://www.support.xilinx.com/xapp/xapp451.pdf > > which demonstrate ways to start up with as little as 80 mA by adding three > components that cost pennies at -40 C with the industrial grade parts. > > In Virtex II, there is no startup current at all on any of the three > supplies. The startup current equals the operational current, and there is > no extra current to be supplied. If all you can supply is the operational > currents, it powers on cleanly and configures. > > 4KXLA, and Spartan XL also have no startup current. > > Austin > > rickman wrote: > > > I am finalizing my FPGA selection for a line of DSP boards that we will > > be making for a number of years. I have always been more familiar with > > Xilinx but had a chance to work with the Altera 10K parts this past > > year. They seem ok, but the nearly identical ACEX 1K family is better in > > most regards. But the gate size is limited if we are looking at having > > future growth and I am not finding as good a price as with the Xilinx > > SpartanII parts. The only vendors I can find are Arrow and Newark, and > > Newark does not show much on their web site. Anyone know how to get good > > price numbers on the Altera parts without having a handful of specific > > parts? If I call the vendors, they always want me to give them a few part > > numbers and I am window shopping and need pricing on all the parts so I > > can make my choices. > > > > The other problem I have with the Altera FPGAs is the lack of LUT RAM. > > There are only a small number of RAM (EAB/ESB) in these parts and I need > > a lot more blocks of it than are available. They don't have to be big, > > the 16 words available in a bank of Xilinx LUTs is perfect. For example, > > I will need 64 blocks of RAM if four modules of the 8 channel ADC/DAC > > are on board. 
This is not hard using the Xilinx LUTs. Anyone know of a > > way to do something similar in an Altera FPGA? > > > > BTW, just to mention why I like the Altera parts... THEY DON'T HAVE A > > STARTUP CURRENT SURGE!!! Was I at all unclear about that? :) > > > > -- > > > > Rick "rickman" Collins > > > > rick.collins@XYarius.com > > Ignore the reply address. To email me use the above address with the XY > > removed. > > > > Arius - A Signal Processing Solutions Company > > Specializing in DSP and FPGA design URL http://www.arius.com > > 4 King Ave 301-682-7772 Voice > > Frederick, MD 21701-3110 301-682-7666 FAX -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 38342
Martin Rice wrote: > > Unfortunately, the terms latch, register and flip-flop do not have > universally accepted meanings, that are adhered to. Let me suggest that in this newsgroup a latch is a bistable storage element that has only one rank, i.e. is transparent ( input directly affecting the output) while enabled. A flip-flop is more complex, dual-rank, and is thus never transparent, and the data input never affects the output directly. Usually, a register is a collection of flip-flops with a common clock. If we adhere to these definitions, we are in synch with most of the industry. Peter AlfkeArticle: 38343
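Editor's sketch (not part of the original posts): the single-rank versus dual-rank distinction in the post above can be shown with two small Python models driven by the same stimulus. The class names and the step-based simulation are illustrative assumptions, and the flip-flop is modelled as a simple master/slave pair sampled once per step.

```python
class TransparentLatch:
    """Single rank: q follows d while enable is high, holds while it is low."""

    def __init__(self):
        self.q = 0

    def step(self, d, enable):
        if enable:
            self.q = d
        return self.q


class DFlipFlop:
    """Dual rank (master/slave): q changes only on a rising clock edge."""

    def __init__(self):
        self.master = 0
        self.q = 0
        self.prev_clk = 0

    def step(self, d, clk):
        if clk == 0:
            self.master = d        # master rank is open while the clock is low
        elif self.prev_clk == 0:
            self.q = self.master   # rising edge: slave rank takes the master value
        self.prev_clk = clk
        return self.q


def trace(element, stimulus):
    """Apply (d, clk_or_enable) pairs in order and collect the output after each."""
    return [element.step(d, c) for d, c in stimulus]
```

Driving both with `[(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]` shows the difference at the third step: d falls while the clock/enable is still high, so the latch output follows it, while the flip-flop output stays put until the next rising edge.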
i am trying to do a speech recognition application and i need a clean input voice signal to a microphone. my problem is that i need to get rid of ambient noise in a room without affecting the voice signal at all. i have thought about doing an adaptive filter like application, but i cannot think of a way to isolate just the ambient noise without touching the voice. my main goal would be to come up with some kind of active noise control setup that would be able to phase-cancel the ambient noise while keeping the voice signal clean. does anyone know how this can be done? i have seen something similar from andreas electronics, but their product doesn't exactly fit my requirement. any help would be appreciated. thanks. chris wangArticle: 38344
Bryan wrote: > ... I am now doing all of my routes and fpga_editor work on the linux box > and saving a lot of time. Just out of curiosity, do the toolbar buttons in the main window of fpga_editor work for you? If they do, what version of wine and Linux are you using? For me (current wine cvs and RH 6.2), the toolbar buttons in the main window do not work, and there does not appear to be any other way to select the particular pips and lines etc that are desired. Oddly enough, the toolbar buttons do work in the popup windows for the slices and iobs. Everything else in fpga_editor seems to work fine. I had not tried a run time comparison, but since I can dual boot the same machine, maybe I will try that one of these days. DuaneArticle: 38345
In Synplify, we have a special multiply by constant mode that does some recoding of the constant to reduce the number of adders. I think you will get a good result just doing the multiply. For more complex constants we will probably get fewer adders than you would expect. Try multiplying by 7 and see what happens. Ken McElvain CTO, Synplicity Ray Andraka wrote: > Because the synthesizer may not recognize that it can be done with an adder. > Often a template is used which in turn instantiates the vendor core for the > multiplier. I believe if you do this in synplicity, you'll get a LUT based sum > of partial products construction based on the Xilinx coregen constant coefficient > multiplier. The synthesis is not smart enough to distill that down to an adder > (it would if the multiplier template produced a full array multiplier, but that > is usually a very inefficient construct in an FPGA). > > Jay wrote: > > >>What about just typing a "*" and let your synthesizer turn it into 2 >>adders? This way nobody has to try to figure out why you're adding 2 >>shifted numbers when they're reading the code. >> >>Kenily <aiurh@iuehr.erug> wrote in message news:<ee74130.-1@WebX.sUN8CHnE>... >> >>>i want to implement a multiplier.one >>>multiply 0x600(Hex).how do i implement? >>> > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759 > > >Article: 38346
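Editor's sketch (not part of the original posts): one well-known way to recode a constant so that fewer adders are needed is canonical signed-digit (CSD) recoding, where runs of ones are replaced by a subtraction. The Python below illustrates that general technique; it is not a description of what Synplify does internally, and the function names are hypothetical.

```python
def csd_digits(c):
    """Recode a positive integer constant as (sign, shift) terms.

    The constant equals the sum of sign * 2**shift over the returned terms,
    with no two adjacent shifts both non-zero.
    """
    terms = []
    shift = 0
    while c != 0:
        if c & 1:
            digit = 2 - (c & 3)  # +1 when c % 4 == 1, -1 when c % 4 == 3
            terms.append((digit, shift))
            c -= digit
        c >>= 1
        shift += 1
    return terms


def multiply_by_constant(x, c):
    """Multiply x by c using only shifts, adds and subtracts."""
    return sum(sign * (x << shift) for sign, shift in csd_digits(c))
```

For the constant 7 the plain binary form has three non-zero bits (two adders), while `csd_digits(7)` gives two terms, 8 - 1, which needs a single subtractor. For 0x600 both forms have two terms, so (x << 10) + (x << 9) is already a one-adder implementation.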
I think the trick is to have a unit with two microphones. One is highly directional toward the speaker. The other is more omnidirectional but with a null in the direction of the speaker. The data from the omnidirectional mic is the noise that you subtract from the directional mic.

"chris" <cjwang_1225@hotmail.com> wrote in message news:24a13eb0.0201111713.7f9af7b5@posting.google.com...
> i am trying to do a speech recognition application and i need a clean
> input voice signal to a microphone. my problem is that i need to get
> rid of ambient noise in a room without affecting the voice signal at
> all. i have thought about doing an adaptive filter like application,
> but i cannot think of a way to isolate just the ambient noise without
> touching the voice. my main goal would be to come up with some kind of
> active noise control setup that would be able to phase-cancel the
> ambient noise while keeping the voice signal clean. does anyone know
> how this can be done? i have seen something similar from andreas
> electronics, but their product doesn't exactly fit my requirement. any
> help would be appreciated. thanks.
> chris wang
Article: 38347
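Editor's sketch (not part of the original posts): the two-microphone idea above is usually paired with an adaptive filter, because the noise reaching the voice microphone is a filtered version of what the reference microphone hears, so a straight subtraction rarely lines up. The Python below is a minimal LMS noise canceller under stated assumptions: the reference contains only noise, the room path is a made-up two-tap filter, and the "voice" is a sine wave. Step size, tap count and signal models are all illustrative.

```python
import math
import random


def lms_cancel(primary, reference, taps=4, mu=0.01):
    """Adaptive noise canceller: returns e[n] = primary[n] - w . reference history."""
    w = [0.0] * taps
    hist = [0.0] * taps
    out = []
    for d, x in zip(primary, reference):
        hist = [x] + hist[:-1]                       # newest reference sample first
        y = sum(wi * hi for wi, hi in zip(w, hist))  # current noise estimate
        e = d - y                                    # cleaned output sample
        w = [wi + 2 * mu * e * hi for wi, hi in zip(w, hist)]  # LMS weight update
        out.append(e)
    return out


def demo(n=5000, seed=1):
    """Return (noise power before, residual power after) over the last 1000 samples."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    voice = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
    # Hypothetical room path from the noise source to the voice microphone.
    leak = [0.6 * noise[i] + 0.3 * (noise[i - 1] if i else 0.0) for i in range(n)]
    primary = [v + l for v, l in zip(voice, leak)]
    cleaned = lms_cancel(primary, noise)
    tail = range(n - 1000, n)
    before = sum((primary[i] - voice[i]) ** 2 for i in tail) / 1000
    after = sum((cleaned[i] - voice[i]) ** 2 for i in tail) / 1000
    return before, after
```

`demo()` compares the noise power in the voice channel before and after cancellation once the weights have had time to converge. In a real room the reference microphone also picks up some voice, and whatever voice leaks into it will be partly cancelled too, which is why the null toward the speaker matters.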
Kevin Neilson wrote: > > I think the trick is to have a unit with two microphones. One is highly > directional toward the speaker. The other is more omnidirectional but with > a null in the direction of the speaker. The data from the omnidirectional > mic is the noise that you subract from the directional mic. > > "chris" <cjwang_1225@hotmail.com> wrote in message > news:24a13eb0.0201111713.7f9af7b5@posting.google.com... > > i am trying to do a speech recognition application and i need a clean > > input voice signal to a microphone. my problem is that i need to get > > rid of ambient noise in a room without affecting the voice signal at > > all. i have thought about doing an adaptive filter like application, > > but i cannot think of a way to isolate just the ambient noise without > > touching the voice. my main goal would be to come up with some kind of > > active noise control setup that would be able to phase-cancel the > > ambient noise while keeping the voice signal clean. does anyone know > > how this can be done? i have seen something similar from andreas > > electronics, but their product doesn't exactly fit my requirement. any > > help would be appreciated. thanks. > > chris wang Contact throat microphones were used in fighter planes 50 years ago. They discriminate against ambient noise very nicely indeed. 50 dB SNR improvements were cited, but I didn't make measurements myself. Jerry -- Engineering is the art of making what you want from things you can get. -----------------------------------------------------------------------Article: 38348
In Apex 20KE and later devices an external feedback can be used as well. In this case you need a connection between the PLL output and the PLL feedback input. kayrock66@yahoo.com (Jay) wrote in message news:<d049f91b.0201101743.44b19f31@posting.google.com>... > I'm not sure I understand your question but I'll try my hand at it > because nobody else has. The Altera clock multiplication is done > using hard macro PLL's on the die. The feedback circuitry is > encapsulated inside the function, you just tell it what multiplication > factor you want, and let Altera do the rest. > > "Dimitry Yegorov 1598864168" <dmyegorov@geolink-group.com> wrote in message news:<a0kbgd$97h$1@josh.sovintel.ru>... > > I know that it must be an easy question, but I can't find the answer - it is > > not published in the Help. Seems that some loop should be added to max2plus > > PLL megafunction to implement the F*N multiplier. > > Any comments are welcome, I am the beginner. > > Thanks!Article: 38349
> I'm a little lost. You say SPI, but then mention UARTS. > These are all SLAVE devices ? They are each banging away at about 150 kbps, not synchronous to each other (i.e. they all have their own clock that they are supplying and are all completely uncoordinated). So the worst case is that my board has to listen to all 5 going at once. However, the traffic is short bursts so that the overall throughput from any one input is more like 2000 bps. Anyway thanks to all who have replied. I now have a lot more insight into what direction to go in, or should I say, what directions to avoid. - Stout