5) use a 2K deep initialized RAM (BlockROM?) and use the 2 upper address bits to select the 512 element coefficient set you want. Let the RAM address circuitry do the multiplexing for you.

regards, tom

Antonio wrote:
>
> I'm a little confused and hope you can help clear something up for me.
> This is the problem:
>
> I have 3 blocks, each 512x12 bits (each one stores the polyphase sums
> for a different interpolation rate). Depending on the rate selected, I
> need to connect only one of them to the rest of the circuit, but this
> switching does not need to be fast. I have some ideas:
>
> 1) Put each block in a different BlockRAM initialized using CORE
> GENERATOR and then switch between them with a case statement.
>
> 2) Use only one uninitialized RAM and three constants; depending on the
> rate, I load the RAM with the values stored in three different ROMs or
> defined as constants (I don't know if that is the same thing).
>
> 3) Put each block in a ROM using the case construct, which is
> recognized by XST and mapped onto a ROM. By the way, which is faster,
> BlockRAM or this kind of ROM? To select the ROM depending on the rate
> I could use CASE again.
>
> 4) Use a constant, which I think produces a big multiplexer.
>
> 5) Your suggestion.
>
> I'm interested in speed and hope to reach at least 165MHz on a VirtexE 600
> -6, so that is the key to reading this post.
>
> Thanks
>
> Antonio

--
Tom Burgess
Digital Engineer
Dominion Radio Astrophysical Observatory
P.O. Box 248, Penticton, B.C.
Canada V2A 6K3
Article: 39001
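Tom's suggestion amounts to concatenating a 2-bit set select with the 9-bit coefficient index to form an 11-bit ROM address. The following is only a behavioural Python model of that addressing, assuming the three sets are stored back to back and the fourth 512-word quarter is left unused. The names `build_rom` and `rom_address` and the dummy coefficient values are hypothetical, not taken from the posts.

```python
SET_DEPTH = 512   # coefficients per interpolation rate
ADDR_BITS = 9     # log2(SET_DEPTH)

def build_rom(coefficient_sets):
    """Pack up to four 512-entry sets into one 2K-deep ROM image."""
    rom = [0] * (4 * SET_DEPTH)
    for select, coeffs in enumerate(coefficient_sets):
        assert len(coeffs) == SET_DEPTH
        rom[select * SET_DEPTH:(select + 1) * SET_DEPTH] = coeffs
    return rom

def rom_address(rate_select, index):
    """Upper 2 bits pick the set, lower 9 bits pick the coefficient."""
    assert 0 <= rate_select < 4 and 0 <= index < SET_DEPTH
    return (rate_select << ADDR_BITS) | index

# Dummy 12-bit data standing in for the real polyphase sums.
sets = [[(s * 1000 + i) & 0xFFF for i in range(SET_DEPTH)] for s in range(3)]
rom = build_rom(sets)
```

In hardware the shift-and-OR is just wiring: the rate select drives the two upper address pins, so no multiplexer is built in fabric.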
Falk Brunner wrote: > <snip> > 65536 functions total, with 32+32+224+224+1680+1667=3859 different logic > functions, this includes all functions using 0,1,2,3,4 inputs. <snip> > Comments are very appeciated. Impressive. Part of this discussion hinges on the semantics, and also the FPGA is not quite real-world. In a real discrete-logic design, the IPs would be swapped as needed, but in a FPGA I don't think the routes have this level of freedom - so they use the 'for free' functions, to include the IP-SWAP operations. (This will be faster) Thus Peter claims the pedantic 65536 number, when a classic logic analysis can give a smaller number of distinct 'functions'. Both are right, depending on how you phrase, and understand the question :-) From the opposite end, the Fairchild Single Gate S57/S58 can be thought of as 3 IP LUTs, hard-wired to choose one of the MAX 256 and they cover (NOT) AND.OR.XOR with this. They also 'miss' one logic function from their data sheet, which is a 2:1 MUX, that selects A/!B -jgArticle: 39002
Jim,

Detection of errors in the configuration is by readback and checking checksums. Many hi-rel (military) users have found that continually reprogramming is easier and accomplishes the same thing (in high altitude or space applications, SEUs are 10,000X to 100,000X more frequent, and often occur daily).

Detection of transient SEU errors in the logic requires the logic to have its own checks. Triple Modular Redundancy (TMR) with redundant voting is used in critical cases. Block RAMs are used with ECC logic: errors found on readback are corrected, and the corrected data is written back in through the other port.

To take this to another domain, when I designed systems for the telcos, I had to anticipate failures, both soft and hard, and provide recovery for both (error free for both....). At the system level, I had redundant elements, and check and correct circuits. Now that the systems are on a chip, it is a natural progression to use the same techniques inside the FPGA that used to be used with collections of FPGAs.

.22->.18->.15->.13->.10->.07 is Moore's 'Law', and the effects of SEUs must be taken into account by the IC designers. Designs can be completely hardened at the expense of area, but how much larger area, and hence larger cost, will a customer want to pay for a "feature" that may not apply to their market? As it is all programmable logic, and if SEUs can be obviated by programming more gates, then those who require that level of availability can 'pay' for it by their usage of their gates. Those with non-critical applications that don't care can benefit from the increased density and reduced costs.

Many DSP processes (e.g. voice phone calls) are able to tolerate a huge level of bad bits, as human speech is incredibly robust.
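The voting step of TMR can be illustrated with a bitwise two-out-of-three majority function. This is only a behavioural sketch in Python, not one of the Xilinx cores mentioned in this post; the function names and the 8-bit default width are assumptions made for the example.

```python
def majority_vote(a, b, c, width=8):
    """Bitwise 2-of-3 vote over three redundant copies of a word."""
    mask = (1 << width) - 1
    return ((a & b) | (a & c) | (b & c)) & mask

def copies_disagree(a, b, c):
    """True when at least one copy differs, i.e. an upset was masked."""
    return not (a == b == c)

good = 0xA5
upset = good ^ 0x10          # one copy with a single flipped bit
voted = majority_vote(good, good, upset)
```

A single upset in any one copy is outvoted by the other two; the disagreement flag is what a scrubbing or logging mechanism would act on.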
Banking and aerodynamic controls are applications where TMR is the only solution, even if SEU's were not an issue! Telecom switching is in-between: if the level of errors is below the noise floor of all of the interconnected elements, then no one can tell, or even know what is going on (all service objectives and tariffs are met). We by no means have all of the answers, and are out there looking at all options, and providing TMR, ECC, and other programmable cores today to our customers for their critical applications. Austin Jim Granville wrote: > This looks interesting : > > http://www.ebnonline.com/story/OEG20020128S0079 > > It has many spins (as expected :) on the problem, but it > seems very relevant to fast-shrink-path SRAM FPGAs. > > Is there any information on the 'disturbance energy' for the > various SRAM components of FPGAs :- > > - The Config Cells ( Slower/larger, but not zero FIT ? ) > - The Fast SRAM Blocks > - The sea of registers > - The LUT array store > > The SRAM notes do not mention register errors, but since a SRAM > cell is effectively a strobed latch, is it possible to disturb > the BIT state in a latch, just as in a SRAM cell ? > > Error recovery in data blocks is possible with correction, but > how does a system detect config or latch errors ? > > -jgArticle: 39003
Tim wrote:

> If you look (via fpga_editor) at the clock buffers you will see
> that there is a clock gate circuit in SpartanII. At least there
> is a control pin which looks like a gate input (comp.pin = K.BUFn.CE).
>
> There must be some problem with using it and we will have to wait
> for Peter's memoirs before we get the story.

Talking about memoirs, I just finished reading "Spin-Off" by Charlie Sporck of NSC fame, and "Swimming Across" by Andy Grove of Intel fame. Both very different, and both very good reading. I can never compete with that kind of substance.

Clock gating:
I remember once having a "brilliant" idea and publishing a clock multiplexer circuit in XCell
http://www.xilinx.com/xcell/xl24/xl24_20.pdf
(Got me some flak from the high priests of metastability, but I think I managed to defend myself.)
I then conned our circuit designers to implement it in XC4000XL, but it could not be supported in software, not even in fpga editor. Thus it effectively was not there, even though it was there. It was less than virtual, more like a ghost...
We legitimized the circuit for Virtex and Spartan-II, although I have a hard time finding the data book description.
For Virtex-II, the circuit was significantly revamped and also described in the data sheet and the manual.

Yes, you can use this circuit to gate the clock. It is really an overkill, and I still prefer the simple one-bit prescaler. But both approaches will work.

Peter Alfke, Xilinx Applications
Article: 39004
Falk Brunner wrote: > > "Russell Shaw" <rjshaw@iprimus.com.au> schrieb im Newsbeitrag > news:3C566A0E.BE6990C@iprimus.com.au... > > > > > > Falk Brunner wrote: > > > > > > I think Peter triggerd some inner gost of mine. > > > I want to compute the EXACT number of unique functions possible with a 4 > > > input LUT... > > > > It depends on the exact definition of a function. > > To get 64k functions from a 4-bit lut, one could > > assume that all 4 input bits are dedicated to one > > function, and that all combinations of inputs are > > valid. Then by defining function as "the set of > > Hmm, I see what you mean. Like > > Y = !A and B > > is also a valid function for a 4I-LUT. > > > input->output maps", there'd by 2^16. However, > > NO!! Why? Just look. (Fixed font) > > DCBA Y Y2 > 0000 0 0 > 0001 1 0 > 0010 0 0 > 0011 1 0 > 0100 0 0 > 0101 1 0 > 0110 0 0 > 0111 1 0 > 1000 0 1 > 1001 1 1 > 1010 0 1 > 1011 1 1 > 1100 0 1 > 1101 1 1 > 1110 0 1 > 1111 1 1 > > The function Y and Y2 are IDENTICAL. > > Y = A > Y2 = D Yes, but i said if all 4 inputs were dedicated to the one function. For Y=A, DCB isn't used, and for Y2=D, CBA isn't used. Therefore, you're comparing two *separate* sub-LUTs, which is what i described later. > This is just a permutation of inputs. > > > the LUT inputs can be divided between smaller > > sub-LUTs, so that for two 2-input LUTs, each > > has 16 functions, so the total in this case > > No, see my last posting. For a 2 input LUT it can be easyly proven on a > sheet of paper by writing down all functions as I did. > > > is 2x16=32 functions. Its easy to give a > > seemingly wrong answer if the question > > doesn't have enough restrictions. > > Yes. > > So, now to the 4I-LUT. > For 4 unique elements (the inuts of the LUT), there are 4! = 24 possible > permutations, like ABCD, BCDA etc. > I hacked a small program to compute for every of the 2^16 codes all 24 > permutations and check how much are equal. 
> If they all are the same, the code is a unique one (like the very obvious case
> "0000000000000000").
> I am not 100% sure that my theory is correct and that there is no bug in my
> small program, but here are my first results.
> For a 4I-LUT there are
>
>    32 unique functions (all permutations identical, this includes
>       "0000000000000000" and "1111111111111111")
>    96 functions with 3 permutations  -> giving 96/3 = 32 individual functions
>   896 functions with 4 permutations  -> giving 896/4 = 224 individual functions
>  1344 functions with 6 permutations  -> giving 1344/6 = 224 individual functions
> 20160 functions with 12 permutations -> giving 20160/12 = 1680 individual functions
> 40008 functions with 24 permutations -> giving 40008/24 = 1667 individual functions
>
> ----------------------------------------------------------------------------
> 65536 functions total, with 32+32+224+224+1680+1667=3859 different logic
> functions, this includes all functions using 0,1,2,3,4 inputs.
>
> Here is the source code. It's good old TurboPascal. Takes about 1 minute to
> compute on my Duron 850.
>
> program permutation;
>
> { A program to compute the number of unique logic functions
>   that can be programmed into a 16 bit ROM (LUT) }
>
> uses crt,dos;
>
> const A=3;
>       B=2;
>       C=1;
>       D=0;
>       debug=false;
>
> const per: array[0..23,0..3] of byte=((A,D,B,C),
>                                       (A,D,C,B),
>                                       (A,B,D,C),
>                                       (A,B,C,D),
>                                       (A,C,D,B),
>                                       (A,C,B,D),
>                                       (B,D,A,C),
>                                       (B,D,C,A),
>                                       (B,A,D,C),
>                                       (B,A,C,D),
>                                       (B,C,D,A),
>                                       (B,C,A,D),
>                                       (C,D,B,A),
>                                       (C,D,A,B),
>                                       (C,B,D,A),
>                                       (C,B,A,D),
>                                       (C,A,D,B),
>                                       (C,A,B,D),
>                                       (D,A,B,C),
>                                       (D,A,C,B),
>                                       (D,B,A,C),
>                                       (D,B,C,A),
>                                       (D,C,A,B),
>                                       (D,C,B,A));
>
> type t_bytearray= array [1..65534] of byte; {64 kbyte buffer}
>      p_b = ^t_bytearray;
>
> var i,j,k,starttime,endtime,code: longint;
>     h,m,s,s100: word;
>
>     basic_LUT: array[0..3,0..15] of byte;
>     perm_LUT: array[0..3,0..15] of byte;
>     perm_pos: array [0..23,0..15] of byte;
>     perm_data: array [0..23,0..15] of byte;
>     perm_uni: array [0..23,0..15] of byte;
>     stat: p_b;
>     final_stat: array[1..24] of longint;
>     num_perm: integer;
>     cond: boolean;
>
>     taste:char;
>
> function get_my_time: longint;
> begin
>   gettime(h,m,s,s100); { get system time }
>   get_my_time:=s100+s*100+m*6000+h*360000;
> end;
>
> function cmp_perm: boolean; { compare two permutations for permutation table }
> begin
>   if (perm_LUT[0,j]=basic_LUT[0,k] ) and
>      (perm_LUT[1,j]=basic_LUT[1,k] ) and
>      (perm_LUT[2,j]=basic_LUT[2,k] ) and
>      (perm_LUT[3,j]=basic_LUT[3,k] ) then cmp_perm:=true else cmp_perm:=false;
> end;
>
> function cmp_perm_code: boolean;
> var l_i: integer;
> begin
>   l_i:=0;
>   while ( (perm_data[j,l_i]=perm_uni[k,l_i]) and (l_i<15)) do inc(l_i);
>   if l_i<15 then cmp_perm_code:=false
>   else
>   begin
>     if perm_data[j,15]=perm_uni[k,15] then cmp_perm_code:=true else cmp_perm_code:=false;
>   end;
> end;
>
> begin
>   while keypressed do taste:=readkey; { clear keyboard buffer }
>   clrscr;
>   starttime:=get_my_time;
>   writeln('Hello World');
>   writeln('Just kidding ;-)');
>   writeln('A brute force attempt to solve the magic LUT puzzle');
>   if debug then writeln('Debug mode OFF');
>   new(stat);
>   for i:=1 to 65534 do stat^[i]:=0; { clear statistics }
>   for i:=2 to 24 do final_stat[i]:=0;
>   final_stat[1]:=2; { account for LOW and HIGH }
>
>   writeln(' Initialze the MOTHER table ');
>   if debug then
>   begin
>     writeln(' DCBA');
>     taste:=readkey;
>   end;
>
>   For i:=0 to 15 do
>   begin
>     basic_LUT[3,i]:=i and 1;
>     basic_LUT[2,i]:=(i shr 1) and 1;
>     basic_LUT[1,i]:=(i shr 2) and 1;
>     basic_LUT[0,i]:=(i shr 3) and 1;
>     if debug then
>     begin
>       writeln(i:2,' ',basic_LUT[0,i],basic_LUT[1,i],basic_LUT[2,i],basic_LUT[3,i]);
>     end;
>   end;
>   if debug then
>   begin
>     writeln('Hit any key to continue');
>     taste:=readkey;
>   end;
>
>   writeln(' Initializing Permutation tables ');
>
>   { generate all 24 LUT permutations }
>   for i:=0 to 23 do
>   begin
>     { generate permutated truth table }
>     for j:=0 to 15 do
>     begin
>       perm_LUT[A,j]:=basic_LUT[per[i,A],j];
>       perm_LUT[B,j]:=basic_LUT[per[i,B],j];
>       perm_LUT[C,j]:=basic_LUT[per[i,C],j];
>       perm_LUT[D,j]:=basic_LUT[per[i,D],j];
>       if debug then
>         writeln(perm_LUT[0,j],perm_LUT[1,j],perm_LUT[2,j],perm_LUT[3,j]);
>     end;
>     if debug then taste:=readkey;
>     for j:=0 to 15 do
>     begin
>       k:=-1;
>       repeat
>         inc(k);
>       until cmp_perm;
>       perm_pos[i,j]:=k;
>     end;
>   end;
>
>   writeln('Done');
>   if debug then
>   begin
>     writeln(' Permutation table');
>     for i:=0 to 15 do
>     begin
>       for j:=0 to 23 do
>       begin
>         write(perm_pos[j,i]:2,' ');
>       end;
>       writeln;
>     end;
>     taste:=readkey;
>   end;
>
>   writeln('Calculation permutations of all codes');
>   writeln('from "0000000000000001" to "1111111111111110"');
>
>   for i:= 1 to 65534 do { code loop }
>   begin
>     if stat^[i]=0 then
>     begin
>       for j:=0 to 23 do { permutation loop }
>       begin
>         for k:=0 to 15 do { truth table loop }
>         begin
>           perm_data[j,k]:=(i shr perm_pos[j,k]) and 1;
>           perm_uni[j,k]:=0; { clear table in parallel }
>           if debug then write(perm_data[j,k]);
>         end;
>         if debug then writeln;
>       end;
>       if debug then taste:=readkey;
>       { check number of unique permutations }
>       num_perm:=0;
>       for j:=0 to 23 do
>       begin
>         cond:=false;
>         for k:=0 to num_perm do
>         begin
>           if cmp_perm_code then cond:=true; { when equal, set flag }
>         end;
>         if not cond then { a new permutation??}
>         begin
>           for k:=0 to 15 do { copy permutation into database }
>           begin
>             perm_uni[num_perm,k]:=perm_data[j,k];
>           end;
>           inc(num_perm);
>         end;
>       end;
>       if debug then
>       begin
>         for j:=0 to 23 do
>         begin
>           write(j,' ');
>           for k:=0 to 15 do
>           begin
>             write(perm_uni[j,k]);
>           end;
>           writeln;
>         end;
>         taste:=readkey;
>         writeln(i,' ', num_perm);
>         taste:=readkey;
>       end;
>       stat^[i]:=num_perm; { write down number of permutations }
>     end;
>     if i mod 256=0 then write('.');
>   end;
>
>   write('Calculating statistic . . .');
>
>   for i:=1 to 65534 do
>   begin
>     if debug then writeln(i,' ',stat^[i]);
>     if stat^[i]=0 then writeln ('Statistic ERROR') else inc(final_stat[stat^[i]]);
>   end;
>
>   writeln('finished');
>   { dispose(stat);}
>
>   J:=0;
>   writeln('4 input LUT statistic');writeln;
>   for i:=1 to 24 do
>   begin
>     if i=1 then
>     begin
>       writeln(final_stat[1]:5,' unique functions (including "1111" and "0000") ');
>     end
>     else
>     begin
>       writeln(final_stat[i]:5,' functions with ',i,' permutations');
>     end;
>     J:=J+final_stat[i];
>   end;
>   taste:=readkey;
>   writeln('Sum over all functions = ',j);
>   Writeln('Finished');
>   endtime:=get_my_time;
>   i:=(endtime-starttime) div (100*60*60);
>   j:=((endtime-starttime) - i*(100*60*60)) div (100*60);
>   k:=((endtime-starttime) - i*(100*60*60)-j*(100*60)) div (100);
>   writeln('Computed in ',i:2,':',j:2,':',k:2);
>   taste:=readkey;
> end.
>
> Comments are very appreciated.
>
> --
> MfG
> Falk
Article: 39005
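As a cross-check on the brute-force idea in the quoted Pascal program, the same orbit count can be sketched in Python. This is an independent sketch, not a translation of Falk's code: it treats two 16-bit truth tables as the same function when one can be obtained from the other by reordering the four inputs, and the name `lut4_class_histogram` is hypothetical. Burnside's lemma gives (2^16 + 6*2^12 + 3*2^10 + 8*2^8 + 6*2^6)/24 = 3984 classes under that definition. The per-size counts quoted above add up to 62536 rather than 65536, so the 3859 total is worth re-checking.

```python
from itertools import permutations

def lut4_class_histogram():
    """Map orbit size -> number of input-permutation classes of 4-LUT codes."""
    # For each of the 24 input orderings, tables[p][addr] is the truth-table
    # row that row `addr` reads from after the inputs are reordered.
    tables = [
        [sum(((addr >> perm[i]) & 1) << i for i in range(4)) for addr in range(16)]
        for perm in permutations(range(4))
    ]
    seen = bytearray(1 << 16)
    histogram = {}
    for code in range(1 << 16):
        if seen[code]:
            continue                      # already counted in an earlier orbit
        orbit = set()
        for table in tables:
            image = 0
            for addr in range(16):
                image |= ((code >> table[addr]) & 1) << addr
            orbit.add(image)
        for member in orbit:
            seen[member] = 1
        histogram[len(orbit)] = histogram.get(len(orbit), 0) + 1
    return histogram

if __name__ == "__main__":
    hist = lut4_class_histogram()
    for size in sorted(hist):
        print(size, hist[size])
    print("classes:", sum(hist.values()))
```

Only one representative per orbit is expanded, so the run is short. Counting complemented inputs or outputs as equivalent as well would be a different (smaller) number and is not what this sketch does.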
I've been trying to get an FPGA keyboard interface working, using the details here: http://www.howell1964.freeserve.co.uk/logic/burched/b5_kbd.htm

After running for a while it gets out of sync and the data bytes get shifted. I don't have a scope; has anyone got a description of how bad the keyboard signals are? I know what they are supposed to look like from web info, but after travelling 3 metres through cheap cable they are distorted enough to cause errors. What I'd like to see is how they are distorted. Excessive crosstalk, ringing or noise?

Is it possible to connect the keyboard so that it delivers signals robustly, or are some errors inevitable? If so, how are they spotted and discarded?

TIA, K.
Article: 39006
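On the question of spotting bad bytes: assuming the keyboard is a PS/2 device (the post does not say so explicitly), each byte arrives as an 11-bit frame of start bit 0, eight data bits LSB first, an odd parity bit, and stop bit 1. A receiver can reject any frame that fails those checks. The sketch below is a behavioural Python model with a hypothetical function name, not the logic from the linked page.

```python
def decode_ps2_frame(bits):
    """Return the data byte of an 11-bit PS/2 frame, or None if it is invalid.

    `bits` holds the data-line samples in arrival order:
    start, d0..d7 (LSB first), parity, stop.
    """
    if len(bits) != 11:
        return None
    start, data, parity, stop = bits[0], bits[1:9], bits[9], bits[10]
    if start != 0 or stop != 1:
        return None                      # framing error, probably out of sync
    if (sum(data) + parity) % 2 != 1:
        return None                      # odd parity violated
    return sum(bit << i for i, bit in enumerate(data))

# Frame carrying 0x1C: data bits LSB first, three ones, so parity bit is 0.
frame = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1]
shifted = frame[1:] + [0]                # same stream sampled one bit late
```

A common companion measure, again an assumption rather than something from the post, is to reset the bit counter whenever the clock line has been idle for much longer than one bit period, so that a single glitch cannot leave every later byte shifted.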
Peter - Many thanks. "Peter Alfke" wrote > > > Tim wrote: > > > If you look (via fpga_editor) at the clock buffers you will see > > that there is a clock gate circuit in SpartanII. At least there > > is a control pin which looks like a gate input (comp.pin = K.BUFn.CE). > > > > There must be some problem with using it and we will have to wait > > for Peter's memoirs before we get the story. > > Talking about memoirs, I just finished reading "Spin-Off" by Charlie Sporck of > NSC fame, and "Swimming Across" by Andy Grove of Intel fame. Both very different, > and both very good reading. I can never compete with that kind of substance. > > Clock gating: > I remember once having a "brilliant" idea and publishing a clock multiplexer > circuit in XCell > http://www.xilinx.com/xcell/xl24/xl24_20.pdf > (Got me some flak from the high priests of metastability, but I think I managed > to defend myself.) > I then conned our circuit designers to implement it in XC4000XL, but it could not > be supported in software, not even in fpga editor. Thus it effectively was not > there, even though it was there. It was less than virtual, more like a ghost... > We legitimized the circuit for Virtex and Spartan-II, although I have a hard time > finding the data book description. > For Virtex-II, the circuit was significantly revamped and also described in the > data sheet and the manual. > > Yes, you can use this circuit to gate the clock. It is really an overkill, and I > still prefer the simple one-bit prescaler. But both approaches will work. > > Peter Alfke, Xilinx Applications > >Article: 39007
Jim Granville wrote:

> Thus Peter claims the pedantic

ouch!

> 65536 number, when a classic logic
> analysis can give a smaller number of distinct 'functions'.
>
> Both are right, depending on how you phrase, and understand the
> question :-)
>

Whether it is 65,000 or only a few thousand, a LUT is a very versatile tool. I still remember my enthusiasm when, arriving here at Xilinx, I could draw a circle around any piece of logic with 4 inputs + one output (and no flip-flop inside) and I could say: "Fits into one LUT, I don't care how, let's go on".
Gets your mind off the nitty-gritty :-)

Peter Alfke
Article: 39008
Arash Salarian wrote: > > First of all, thank you all for your answers and help. > > "Ulf Samuelsson" <ulf@atmel.REMOVE.com> wrote in message > news:CYu58.6963$O5.17299@nntpserver.swip.net... > > > > I'm starting a new design in which I'm using a multi-channel A/D with > a > > low > > > > sampling-rate and Flash memory for the storage and the system is going > > to be > > > > powered by battery. In this stage, I'm not yet sure if using a FPGA > > would be > > > > wise, as I'm very concerned with the power consumption. The gate count > > of > > > > You need to tighter specify what your requirements > > How much data? > I'm going to use 4 channels of A/D at 200Hz, but only to store 3 of them > (one is used to monitor the battery). Data is stored on a Flash, MultiMedia > Card (i.e. SPI interface...) with 64+Mbytes. That means the the system > should be able to function over 15 hours by using a small Litium-Ion > battery. I don't remember the output voltage of a Li-ion battery. Is it higher than 3.3 volts so that a simple LDO regulator is sufficient? I seem to recall that the MSP430 will operate down to 1.8 volts but needs more like 2.7 volts to program it's internal flash. BTW, the high end MSP430F148/9 have 48/60 KB of Flash on chip. If your program only takes 1 or 2 kBytes, you might be able to use the internal Flash for your data. But I assume you need it to be removable. Just for my own benifit, how many mAHours can you get from a Li-ion cell running a device at 2.7 or 3.3 volts? You might also consider using a small switcher to optimize your efficiency and make the unit run when the cell drops below (or starts below) the operating voltage of your chips. They can be very small, but of course they cost a bit more than an LDO. > > How often do you want to store it. > Interface to the MultiMedia card is packet based, so data is going to be > soted in packets of 512 bytes.... > > > Resolution/speed of A/D. > 12 bit A/D with 200Hz sampling rate. 
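The figures quoted so far (3 stored channels, 12 bits, 200 Hz, 512-byte packets, 15 hours) are enough for a rough storage budget. The sketch below assumes each 12-bit sample is either padded to 2 bytes or packed into 1.5 bytes, and it ignores any file-system or card overhead; those choices and the function name are assumptions, not details from the posts.

```python
CHANNELS_STORED = 3      # the fourth channel only monitors the battery
SAMPLE_RATE_HZ = 200
RUN_HOURS = 15
PACKET_BYTES = 512       # MultiMedia Card write block

def logging_budget(bytes_per_sample):
    """Return (bytes per second, total bytes, seconds between packet writes)."""
    bytes_per_second = CHANNELS_STORED * SAMPLE_RATE_HZ * bytes_per_sample
    total_bytes = bytes_per_second * RUN_HOURS * 3600
    seconds_per_packet = PACKET_BYTES / bytes_per_second
    return bytes_per_second, total_bytes, seconds_per_packet

padded = logging_budget(2)      # 12-bit sample stored in a 16-bit word
packed = logging_budget(1.5)    # two samples packed into three bytes
```

With padded samples this comes to 1200 bytes/s and 64.8 million bytes over 15 hours, with one 512-byte packet roughly every 0.43 s; packing brings the total down to 48.6 million bytes. Either way the card is written only about twice a second, which is why powering it down between writes is attractive.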
This is a very low performance task. If you don't need high timing resolution, you can get away with a 32 kHz clock on either the MCU or the FPGA. But Jim's idea of running at a higher clock speed while writing to the Flash card is good. The Flash card can be powered down between writes saving a lot of power. If you use an FPGA, you will need an external ADC (which can be as simple as some high tolerance resistors and a comparator depending on how consistent the output drive voltage is) . But at a 200 Hz sample rate, this can be powered down between samples as well. You will also need an external oscillator. I don't have a part number for you, but I am sure there are small, very low power 32 kHz devices available either with or without a crystal included. But this is starting to make the MCU look cheap in comparison. > > An FPGA does not have an ADC internally, so you need to add power for > that. > > > > The Atmel AVR will draw less power than the Atmel 8051. > > I'd try out with an ATmega8, which has internal R/C oscillator and power > > down mode. This sounds like a good idea. The MSP430 has that as well as do many chips. But certainly the Atmel devices have a lot to offer. > > -- > > Best Regards > > Ulf at atmel dot com > > These comments are intended to be my own opinion and they > > may, or may not be shared by my employer, Atmel Sweden. > > > > Btw Ulf, What's your idea about MSP430 series suggested by Jim Granville? Ulf is an Atmel representative, so it is not likely he will say much about a TI product. He would be ill advised to say anything good for the sake of his job and won't say anything bad for being thought of as badmouthing the competition. :) Even if we ask... -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. 
Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 39009
Tim wrote: > > Sorry to return to this topic yet again... > > XAPP450/451 show the SpartanII power-up current requirement as > 500mA for _all_ parts, provided the temperature requirements > are met. > > However, I have a distant recollection that someone (Austin?) > has at one time posted that the smaller parts need less than > this? Any ideas on the slope of the curve from the 2S15 > to the 2S200? > > I am particularly interested in the 2S30. I'm not a Xilinx rep, but I have had a couple of conversations with Kim Goldblatt about this. Basically, they are very confident that they will be able to reduce the curve somewhat, but I seem to recall that the improvement was not large. But that may be due to the chip I was asking about, the XC2S50, IIRC. This device was said to be expected to pass 1.5 Amp at industrial temps vs. 2 Amps in the data sheet. Better for sure, but not a large improvement. But you should contact Kim directly about it. He is the expert. :) -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 39010
Is that right? Amazon has a link to A1books with $6.30 for new copies ($6.00 for used ones). That is less than 10% of the original price.
Article: 39011
Kevin Brace wrote: > > heyho wrote: > > > > Hi, > > > > > > BTW, Kevin, why should he use Spartan II´s? ACEX devices are great! This should be a newsgroup which discusses technical questions and not stupid stuff what device is a few cents cheaper than another one. This kind of information hich cannot be proven anyway!! > > The reasons why I think the original poster should try out > Spartan-IIs are because, ...snip... > > 2) Regardless of what you say about saving money, for similar > performance, I will rather pay less for something similar. Again, > comparing Arrow's pricing for buying one ACEX 1K and Insight > Electronics' pricing for buying one Spartan-II, Spartan-II is much > cheaper ($30 to $40 for Spartan-II -5 or -6 vs. over $90 for ACEX 1K-1) > The prices here are what I saw on their websites a few months ago. Which part are to refering to? 1K is a family, not a part. Also you should be aware that web pricing is not very good compared to what you can get by discussing this with your reps directly. I am pretty sure, based on other parts that multiple disti's sell, that Arrow's web prices are very high compared to others. For example, Arrow has the highest price on Intel Flash, E28F320J3A110, I found at $15 vs. $9 at Avnet. Likewise their web pricing is very high for many other parts. So you need to get direct quotes from the disti's rather than count on web pricing. ...snip... My experience with the MAX+PlusII tool was not good. We worked on a design that was rather full, but we paired it down to 83% in a 10K100A and it still would not meet timing except with the greatest pains of trial and error giving the tool "help". Then it would fail on the bench at slightly higher temps. We lost 4 months dealing with this problem. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. 
Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 39012
Hi, I want to know about the synthesis of a function package. I have a design (in VHDL) which includes a file where I have declared three functions. In the design I am using only two of them. When I synthesize, I get a certain percentage of device utilization. What I want to know is: does this include the unused function as well, or is it optimized away since I never call this function in the design? Waiting for a reply.
Article: 39013
On Mon, 28 Jan 2002 21:55:25 GMT, Ray Andraka <ray@andraka.com> wrote:
>I remember that too, and I'm pretty sure it was Phil Freidin that posted
>the number. I couldn't find it in the FAQ or the archive either.
>
>Falk Brunner wrote:
>
>> I remember that there was a small file discussing the number of different
>> functions possible with a 4 input LUT, 2^16-permutations of inputs
>> But I can't find it anymore
>> www.fpga-faq.com ?? NO.
>>

The great thing about FAQ's is they keep coming back!

www.fpga-faq.com >> The FAQ >> 0022 >> http://www.fpga-faq.com/archives/23500.html#23505

As definitive as it gets. :-)

Philip Freidin
Fliptronics
Article: 39014
Chris Cowdery wrote: > > I've got a cut down PCI design which works nicely in a > MAX7256SQC208-7. I have changed device to a FLEX10KA30-1 to port it to > CardBus. > Why use FLEX 10KA? If you want to stick with Altera parts, why not use FLEX 10KE or ACEX 1K? They are likely faster and cheaper for the same amount of LEs. If you are able to port your design to Xilinx platform assuming that your design doesn't use any Altear specific features, you may want to try newly released Xilinx Spartan-IIE which tends to be sold for much less than ACEX 1K. One interesting thing I noticed about CardBus with Xilinx devices is that Xilinx doesn't recommend people using Virtex/Spartan-II for CardBus applications because of a issue to do with a clamping diode. Instead, they recommend Spartan-IIE for CardBus. > What I do not understand is why the simulation shows that the design > on the FLEX10K part has delays of about twice that of the 7256, even > though the flex part is a -1, and the max is a -7? Thus the timing > becomes marginal to say the least, yet the datasheet claims PCI 33MHz > compliance. > > Any ideas? > > -- > > Chris Cowdery > e-mail c.cowdery@nxtsound.TAKE-THIS-BIT-OUT.com It looks like for Altera FLEX FPGAs (Altera ridiculously calls it CPLDs), a device with a smaller speed grade number is a faster part. You may want to download FLEX 10KA datasheet to verify that. PCI compliance means that it is electrically compatible with PCI, but it doesn't in any way guarantee that your design (your PCI IP core) will meet 33MHz PCI timings. It sounds like you didn't buy your PCI IP core from Altera, so you can make modifications to it. I think the hardest part of 33MHz PCI design is meeting Tsu < 7ns setup time requirement, and since FLEX 10KE or ACEX 1K are faster devices, they will fare better in terms of meeting Tsu, but even on those devices, you may still have to do manual floorplanning. 
What I learned from my experience of trying to get my PCI IP core to meet 33MHz PCI setup timings of Tsu < 7ns is that a signal path starting from an unregistered input (a raw input) going through several levels of LUT to an output FF or a tri-state buffer control FF often tends to be the part that doesn't meet the setup timings. Yes, such a signal path I described is not taking advantage of registering inputs, but in PCI, it is not always possible to use registered inputs. For example, when a PCI target is asserting STOP#, but waiting for the initiator to deassert FRAME# so that AD[31:0] can be tri-stated immediately during a target read cycle. Registered version of FRAME# cannot be used in this case. For 4-input LUT-based FPGAs, from my own experience, the levels of LUT should be kept at or below 3 levels as much as possible. In target part of a PCI IP core, signal paths starting from FRAME# and IRDY# going towards AD[31:0] often tends to be timing critical. That path is the most I had problems with my own design because the routing distance was fairly long, so I couldn't have too much LUT delays. Keeping the levels of LUT to 3 for such a path was pretty tough, but I did it through simplifying my Verilog RTL code. If the code is simple, it might be possible to implement such a path in 2 levels of LUT. If your PCI IP core's parity generator was designed for a CPLD (a wide product term-based device) in mind, you may need to redesign it a little bit with a 4-input LUT-based FPGA's architecture in mind. I am not an expert in arithmetic, but I heard that a parity of 36-bit can be computed with carry chain logic or with combinational logic (LUT). I never got Xilinx's synthesis tool (XST) to do a 36-bit parity generator in the carry chain method, so I used the combinational method. 
If you are going to use the combinational method to calculate parity during a target read cycle, the parity tree should be constructed in a way that takes inputs from raw C/BE#[3:0] near the final output (near the top of the parity tree) of the parity so that raw C/BE#[3:0] will meet Tsu. If you enter C/BE#[3:0] at the bottom of the tree, C/BE#[3:0] will propagate through more levels of LUT, which might fail to meet Tsu. I can post the Verilog RTL code if you are interested (I didn't come up with the method I described. I just copied someone else's idea.). For FLEX 10KA, parity generation for a target read cycle may not be a big issue considering that a 4-input LUT delay for FLEX 10K30KA-1 is only 0.8ns. One more thing I heard which I am not sure because I don't deal with CardBus is that parity checking for data parity is a requirement. Unfortunately, a lot of PCI devices don't bother to check for data parity because of a loophole in the specification (Basically the specification says, who cares if data parity error occur for video data being written to a video card via PCI bus.). If you didn't have this feature previously, you may have to add it. To check for a data parity error, you will likely have to design another state machine independent from the existing PCI state machine because PERR# in PCI bus has to be asserted two cycles after the data is presented, and one cycle after PAR presented if a data parity error occurred. Kevin Brace (Don't respond to me directly, respond within the newsgroup.)Article: 39015
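The parity-tree ordering described in this post can be modelled behaviourally. In the Python sketch below, each call to `xor4_stage` stands for one level of 4-input LUTs. The AD bits are reduced first, and the raw C/BE# bits are collapsed separately and joined at the final stage, so they pass through two LUT levels instead of three. This is a sketch of the idea under those assumptions, not the Verilog the author offers to post; the function names are hypothetical.

```python
def xor4_stage(bits):
    """One LUT level: XOR each group of up to four bits into one bit."""
    return [sum(bits[i:i + 4]) & 1 for i in range(0, len(bits), 4)]

def par_with_late_cbe(ad, cbe):
    """Even parity over AD[31:0] and C/BE#[3:0], with C/BE# entering late."""
    ad_bits = [(ad >> i) & 1 for i in range(32)]
    cbe_bits = [(cbe >> i) & 1 for i in range(4)]
    level1 = xor4_stage(ad_bits)     # AD level 1: 32 bits -> 8 partials
    level2 = xor4_stage(level1)      # AD level 2: 8 partials -> 2 partials
    cbe_par = xor4_stage(cbe_bits)   # C/BE# level 1: 4 raw bits -> 1 bit
    # Final LUT (AD level 3, C/BE# level 2): two AD partials plus C/BE# parity.
    return xor4_stage(level2 + cbe_par)[0]

def par_reference(ad, cbe):
    """Flat reference: XOR of all 36 bits."""
    return bin(((ad & 0xFFFFFFFF) << 4) | (cbe & 0xF)).count("1") & 1
```

A flat 36-to-9-to-3-to-1 reduction would give the same PAR value but would put C/BE# through all three levels, which is the case the post warns may miss Tsu.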
Jeroen Van den Keybus wrote:
>
> First of all, thanks for the ideas. As to the connection of the buses to the same row/column, I'm afraid that will be difficult. We're using the 208-pin PQFP package (EP1K100QC208-3) and there are maximally only 4 (2) pins that share the same row (column) (e.g. pins 147, 144, 143, 142 on row B, unless I am misreading the information), so I would follow the advice of Martin, suggesting I should go for routability. I'll keep you posted on the outcome of that.
>

I am not that familiar with Altera devices, but I don't believe Spartan-II has this kind of routing issue.

> Second point, concerning the idea of Kevin Brace: I must say our experience with Xilinx FPGA's (XC4000 family) is not that good. We've been using it in a students' lab and they had lots of problems with the software (crashes, wrong output that required awkward workarounds), and we did not have any problems using the free Max+2. We program mostly in VHDL and the Leonardo compiler is functionally good although its Windows interface _badly_ needs a good Windows programmer.
>

If Quartus II 1.1 Web Edition (the free version, like MAX+PLUS II-BASELINE) supported ACEX 1K, I would recommend using it instead of MAX+PLUS II-BASELINE, but it only supports FLEX 10KE up to 100K gates (the paid version of Quartus II 1.1 does support ACEX 1K, though). Before I started using Xilinx ISE WebPACK to develop my PCI IP core, I started off with Cypress' $99 Warp 2 development tool, but I wasn't happy at all with the limited density of Cypress CPLDs (512 macrocells maximum for Ultra37K) or with its severely crippled HDL simulator (Aldec Active-HDL): the only way I could simulate Verilog was by entering waveforms manually, and it didn't let me do HDL-based testbenching.
After having invested $250 in the Cypress Ultra 37K development kit (Warp 2 + Ultra 37K development board + printed manuals), and getting frustrated with having to enter waveforms each time just to simulate my PCI IP core, I decided to look around for free EDA tools from other vendors. I saw that Altera and Xilinx were giving out free EDA tools, MAX+PLUS II-BASELINE 9.x and WebPACK ISE 3.2 respectively, so I tried out both because, unlike the Cypress Warp 2 software, there was nothing to lose. I wasn't at all impressed with MAX+PLUS II-BASELINE because of its totally outdated GUI, and because Altera didn't let BASELINE users use its in-house integrated synthesis tool; for a real beginner, an integrated synthesis tool is nicer to use than an external third-party synthesis tool like Synopsys FPGA-Express or LeonardoSpectrum. But the real killer was that the only simulator Altera provided for free was a waveform-based simulator similar to Cypress' junk, and not an HDL simulator.

Xilinx WebPACK ISE 3.2, on the other hand, provided everything I wanted for free. It had an integrated synthesis tool (XST), and the GUI looked far better than anything I had seen from either Cypress or Altera. Plus, Xilinx offered a free HDL-based simulator, ModelSim XE-Starter. I have to admit that ModelSim has some learning curve, but after I got through the tutorial, I was able to do RTL simulation of my PCI IP core with it. Since then I have upgraded to ISE WebPACK 4.1, and I still use Xilinx tools primarily. So, basically, I have no experience with the Xilinx tools/devices of 2 to 3 years ago, but I certainly haven't seen anything like what you described with ISE WebPACK 3.x or 4.1 + Spartan-II. I have never had ISE WebPACK fail to completely route a design in Spartan-II, but I have to admit that so far the largest design I have had occupied only about 45% of the chip.
Certainly, Xilinx tools are not bug free, but in my own experience they tend to crash far less than the Altera tools I have used (MAX+PLUS II-BASELINE, LeonardoSpectrum-Altera, Quartus II 1.1 Web Edition). I do wonder whether you simulate your design with a simulator before generating a bitstream file. Again, I think ModelSim XE-Starter's HDL-based testbench capability is a far more efficient way of simulating a design than MAX+PLUS II-BASELINE's waveform simulator, but you may have access to a paid version of ModelSim or a comparable simulator, so perhaps that is not an issue for you (it is for me). Assuming that your VHDL-based design doesn't take advantage of Altera-specific features like LPM or EAB, it should still synthesize fine with XST VHDL. If you have some time and a fast Internet connection (I will assume that a university has one), you may want to try ISE WebPACK out. I think it is pretty risk free.

> Furthermore: we do not need the ultimate speed ! We use the FPGA in motor control applications to generate tens of 20-100 kHz high resolution (12-16 bit) PWM signals, decode quadrature encoder signals and acquire current waveforms from serial ADC's. These measurements must be checked for interferences and overloads and the innermost high bandwidth control loop, implemented on the FPGA, is based on them. The actual system control is very complex and performed with a powerful DSP. Modern DSP's however, depend heavily on their pipeline architecture and we noticed that the best way to kill their computing power is to have them interrupted a few 10k times per second to write PWM update values and read currents. (This I/O actually burdened our researchers/programmers too...)
>
> It seemed to us that we would need a lot of 'random' logic. The quick compile/fit times with Leonardo/Max+2 and the immediate availability of the device made us decide for the ACEX 1K. And while we were at it, we went for the largest device we could get our hands on.
>
> BTW, we paid about $30 for the EP1K100-3 (next day delivery) and for us (university research) time to market and the lowest price are not (ahem...) very important.
>
> Jeroen.
>

I don't believe ISE WebPACK 4.1's P&R time is any slower than MAX+PLUS II/Quartus II 1.1's fit time. I guess your application doesn't require the fastest ACEX 1K speed grade, so you can get away with the slowest one, but for things like the 5V 33MHz PCI I deal with, Altera guarantees 5V PCI current drive capability only for the fastest ACEX 1K-1 (according to the ACEX 1K datasheet). Also, for the slower -2 and -3 speed grade devices, the 4-input LUT delay is a lot more than for the slowest Spartan-II speed grade, -5 (1.0ns and 1.5ns respectively vs. 0.7ns), which will make it harder for a PCI IP core to meet Tsu < 7ns. I checked the ACEX 1K pricing at Arrow, and the price for ACEX 1K100K-3QC208 was $27, which is very similar to what you paid. But the fastest ACEX 1K100KQC208-1 was $57, and I believe that is twice as much as Spartan-II XC2S200-5CPQ208.

Kevin Brace (Don't respond to me directly, respond within the newsgroup.)

Article: 39016
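As a rough worked example of why the LUT delay matters for the PCI setup budget discussed above, the following takes the per-LUT delays quoted in the post and assumes a path of three LUT levels, the limit suggested earlier in the thread. Routing, input buffer, and flip-flop setup delays are not known here and are lumped into the remainder, so this is only an illustrative budget, not a timing analysis.

```latex
% Setup budget: T_su < 7 ns, path of 3 LUT levels (assumed)
\begin{align*}
\text{Spartan-II -5:} \quad & 3 \times 0.7\,\text{ns} = 2.1\,\text{ns}, & 7 - 2.1 &= 4.9\,\text{ns left} \\
\text{ACEX 1K -2:}    \quad & 3 \times 1.0\,\text{ns} = 3.0\,\text{ns}, & 7 - 3.0 &= 4.0\,\text{ns left} \\
\text{ACEX 1K -3:}    \quad & 3 \times 1.5\,\text{ns} = 4.5\,\text{ns}, & 7 - 4.5 &= 2.5\,\text{ns left}
\end{align*}
```

The remainder in each case has to cover the input buffer, all routing, and the flip-flop setup time.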
rickman wrote:
>
> Kevin Brace wrote:
> >
> > 2) Regardless of what you say about saving money, for similar performance, I will rather pay less for something similar. Again, comparing Arrow's pricing for buying one ACEX 1K and Insight Electronics' pricing for buying one Spartan-II, Spartan-II is much cheaper ($30 to $40 for Spartan-II -5 or -6 vs. over $90 for ACEX 1K-1) The prices here are what I saw on their websites a few months ago.
>
> Which part are you referring to? 1K is a family, not a part. Also you should be aware that web pricing is not very good compared to what you can get by discussing this with your reps directly. I am pretty sure, based on other parts that multiple disti's sell, that Arrow's web prices are very high compared to others. For example, Arrow has the highest price on Intel Flash, E28F320J3A110, I found at $15 vs. $9 at Avnet. Likewise their web pricing is very high for many other parts.
>
> So you need to get direct quotes from the disti's rather than count on web pricing.
>

Who else is a distributor of Altera? The only one I know of is Arrow. Regarding ACEX 1K, I made something of a mistake when comparing devices. The $90 price tag I quoted was actually for the ACEX 1K100K-1 in the 484-pin FPBGA package, not the 208-pin PQFP package. For the ACEX 1K100K-1 in the 208-pin PQFP package, the price is something like $58, but that is still twice as much as Spartan-II XC2S200-5CPQ208 (around $25 to $30). I suppose that if someone buys a lot, the distributor may give a volume discount, but will the same discount apply when buying only one or two for prototyping?

> ...snip...
>
> My experience with the MAX+PlusII tool was not good. We worked on a design that was rather full, but we pared it down to 83% in a 10K100A and it still would not meet timing except with the greatest pains of trial and error giving the tool "help". Then it would fail on the bench at slightly higher temps.
> We lost 4 months dealing with this problem.
>

Like you, I never liked MAX+PLUS II, so I started off with ISE WebPACK instead. I must say that Quartus II 1.1 Web Edition is a major improvement over MAX+PLUS II-BASELINE, but for some reason ACEX 1K is not supported in Quartus II 1.1 Web Edition, while MAX+PLUS II-BASELINE (the free version) somehow does support it. Since Quartus II 1.1 Web Edition supports FLEX 10KE, I don't see why Altera doesn't support ACEX 1K in it as well.

Kevin Brace (Don't respond to me directly, respond within the newsgroup.)

Article: 39017
Hi,

Try:

    reg [31:0] rand;

    initial begin
      rand = $random;
    end

HTH,
Srinivasan

--
Srinivasan Venkataramanan
ASIC Design Engineer
Software & Silicon Systems India Pvt. Ltd. (An Intel company)
Bangalore, India, Visit: http://www.simputer.org)

"I don't Speak for Intel"

"piiszo" <fogi@isfh.jdk> wrote in message news:ee74832.-1@WebX.sUN8CHnE...
> who can post a verilog code or link
> about random?

Article: 39018
Can anyone tell me why, when I use this demux in a project, I get a maximum frequency of 50MHz, while if I take the clock away from the demux the maximum frequency of the project rises to 155MHz? Is there a well-known reason?

By the way, here is XST's report as well. The 30 inferred D-type flip-flops are due to "when others => null;". Is there an alternative that does not use these flip-flops, or is it better to have them?

    Synthesizing Unit <demux_3x10>.
    Related source file is C:/SRRCx4ROM_26_01_2002/PolyphaseBlockROM_29_01_2002/Xilinx/demux_3x10.vhd.
    Found 10-bit register for signal <out_0>.
    Found 10-bit register for signal <out_1>.
    Found 10-bit register for signal <out_2>.
    Summary: inferred 30 D-type flip-flop(s).
    Unit <demux_3x10> synthesized.

    library ieee;
    use ieee.std_logic_1164.all;

    entity demux_3x10 is
      port(in_mux : in std_logic_vector(9 downto 0);
           clk    : in std_logic;
           sel    : in std_logic_vector(2 downto 0);
           out_0, out_1, out_2 : out std_logic_vector(9 downto 0)
      );
    end demux_3x10;

    architecture demux_3x10_arch of demux_3x10 is
    begin
      process (sel, in_mux, clk)
      begin
        if falling_edge(clk) then
          case sel is
            when "011" => out_0 <= in_mux ;
            when "100" => out_1 <= in_mux ;
            when "110" => out_2 <= in_mux ;
            when others => null;
          end case;
        end if;
      end process;
    end demux_3x10_arch ;

Article: 39019
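On the demux question above: in a clocked process every signal assigned inside the clock-edge test becomes a register, so the 30 flip-flops come from the `falling_edge(clk)` test itself; `when others => null` only tells those registers to hold their value. One plausible cause of the frequency drop, assuming the rest of the design is clocked on the rising edge (the post does not say), is that paths into and out of falling-edge registers get only half a clock period. A minimal sketch of a rising-edge variant follows; the entity name `demux_3x10_re` is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical rising-edge variant of the posted demux.
entity demux_3x10_re is
  port (
    clk    : in  std_logic;
    sel    : in  std_logic_vector(2 downto 0);
    in_mux : in  std_logic_vector(9 downto 0);
    out_0, out_1, out_2 : out std_logic_vector(9 downto 0)
  );
end demux_3x10_re;

architecture rtl of demux_3x10_re is
begin
  -- A clocked process only needs the clock in its sensitivity list.
  process (clk)
  begin
    if rising_edge(clk) then
      case sel is
        when "011"  => out_0 <= in_mux;
        when "100"  => out_1 <= in_mux;
        when "110"  => out_2 <= in_mux;
        when others => null;  -- no assignment: all three registers hold
      end case;
    end if;
  end process;
end rtl;
```

If the outputs must hold their values while another output is being loaded, the registers are needed; a purely combinational demux would instead have to drive every output for every value of `sel`.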
XST always infers this ROM as 512x12 bits, although with the subtype addr_reduced I meant to reduce the size of the ROM. What is wrong, and what can I do to obtain the right result? Or does XST simply round the size of the ROM up to the next power of two?

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity ROMx3 is
      port(address  : in  STD_LOGIC_vector(9 downto 0);
           SRRC_out : out STD_LOGIC_VECTOR(11 downto 0)
      );
    end;

    architecture ROMx3_arch of ROMx3 is
    begin
      process(address)
        subtype addr_reduced is integer range 0 to 383;
        variable addr : addr_reduced ;
      begin
        addr := conv_integer(address) ;
        case addr is
          -- sums for FIR 0
          when 0 => SRRC_out <= X"52B";
          when 1 => SRRC_out <= X"52B";
          when 2 => SRRC_out <= X"517";
          ........................
          when 382 => SRRC_out <= X"ABF";
          when 383 => SRRC_out <= X"B00";
          when OTHERS => SRRC_out <= X"000";
        end case;
      end process;
    end ROMx3_arch;

Article: 39020
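On the ROM-size question above: 384 entries need 9 address bits, since 2^8 = 256 < 384 <= 512 = 2^9, so a memory addressed by a binary bus would be expected to have 512 locations regardless of the integer subtype. The sketch below shows the same situation on a small scale, with 5 used entries padded to 8 locations. The entity name and depth are hypothetical, and the contents are placeholders reusing a few values quoted in the post, not the real coefficient table.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical small ROM: 5 used entries in an 8-location address space.
entity rom_sketch is
  port (
    address : in  std_logic_vector(2 downto 0);
    data    : out std_logic_vector(11 downto 0)
  );
end rom_sketch;

architecture rtl of rom_sketch is
  -- A 3-bit address always spans 8 locations, whatever number is used.
  type rom_t is array (0 to 7) of std_logic_vector(11 downto 0);

  -- Placeholder contents for illustration only.
  constant ROM : rom_t := (
    0 => X"52B",
    1 => X"52B",
    2 => X"517",
    3 => X"ABF",
    4 => X"B00",
    others => X"000"  -- unused locations 5 to 7
  );
begin
  -- Asynchronous read: the address indexes the constant table directly.
  data <= ROM(to_integer(unsigned(address)));
end rtl;
```

Writing the table as a constant array makes the padding explicit; whether a given synthesizer maps it to LUTs or to block memory is tool and device dependent.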
With the Aldec 5.1 schematic editor I am connecting

    out_0 : out STD_LOGIC_VECTOR(9 downto 0);

to the bus signal

    to_romx3 : STD_LOGIC_VECTOR (8 downto 0);

The result is

    out_0(0) => DANGLING_U2_out_0_0,
    out_0(1) => to_romx3(0),
    out_0(2) => to_romx3(1),
    out_0(3) => to_romx3(2),
    out_0(4) => to_romx3(3),
    out_0(5) => to_romx3(4),
    out_0(6) => to_romx3(5),
    out_0(7) => to_romx3(6),
    out_0(8) => to_romx3(7),
    out_0(9) => to_romx3(8),

while I would like

    out_0(0) => to_romx3(0),
    out_0(1) => to_romx3(1),
    out_0(2) => to_romx3(2),
    out_0(3) => to_romx3(3),
    out_0(4) => to_romx3(4),
    out_0(5) => to_romx3(5),
    out_0(6) => to_romx3(6),
    out_0(7) => to_romx3(7),
    out_0(8) => to_romx3(8),
    out_0(9) => DANGLING_U2_out_0_0,

How can I do this?

Article: 39021
Good morning Cher,

I hope you will excuse me, but I have some other questions:

1) When I implement a ROM using the case construct, I see in FPGA Editor that everything is neatly arranged, but inside CLBs. Does this mean that the design is slower and the routing more difficult? In practice, what is the difference between the case where the synthesizer recognizes the ROM and the case where it does not?

2) Why is a clock required for RAM but not for ROM? (I know it's a stupid question, but I don't know the answer.)

3) Does a ROM with a clock port make sense?

4) Do you suggest using one BlockRAM for each block of memory, or one BlockRAM for all three blocks of memory, given that the three blocks are not the same size?

Thanks

Antonio

Article: 39022
> Our idea is to use the Hard-drive memory to store the various FPGA configurations, and to use a small 32 I/O MCU (8051) to perform the FPGA reconfiguration from the HDD. (the 8051 would share the IDE bus with the FPGA, but they would have a mutual exclusive use of the HDD, since the MCU would only be used during reconfiguration)
>

This means that your FPGA has to contain 'code' to access your hard drive. Why not assign this job to the MCU (since it is already implemented there for the (re)configuration of the FPGA)? This will create less overhead in your FPGA.

> Now we are wondering whether this idea is good or not :), we are specifically concerned with :

The idea itself looks challenging ... If power consumption is not a big issue here ... why not?

>
> - PCB layout and signal integrity problems due to the fact that the IDE connection is shared between the MCU and the FPGA. For ex. would it be possible to use high-speed IDE (50Mhz clock) protocols from the FPGA ?
>

Is this 50MHz clock really needed? If not, the MCU can handle this task.

> - Feasibility : How difficult would it be to design and debug such a system ?

Development and debug time will be shorter if you don't implement the IDE interface in your FPGA.

>
> Any advice, comments, critics, ideas are welcome,
>

Just my thoughts ...

> Thank you in advance.
> Steven
>

You are welcome,
Geert

Article: 39023
Use a low-pass filter on the clock line (let's say 100 Ohm, 100pF), followed by a Schmitt trigger.

Regards
Falk

Article: 39024
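For the values suggested in the filter post above, and assuming a simple first-order series-R, shunt-C low-pass, the time constant and corner frequency work out as follows:

```latex
\begin{align*}
\tau &= RC = 100\,\Omega \times 100\,\text{pF} = 10\,\text{ns} \\
f_c  &= \frac{1}{2\pi RC} = \frac{1}{2\pi \times 10\,\text{ns}} \approx 15.9\,\text{MHz}
\end{align*}
```

A clock well below this corner passes with slowed edges, which is why the Schmitt trigger is suggested to restore clean transitions.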