Jezwold wrote:
> GHz-speed FPGAs with sub-ns delays are seriously expensive, but as you say
> there must be a market, otherwise they wouldn't make them. I've just never
> come across anyone who used them to implement a general purpose CPU

Thank you, I'm learning. (don't worry, I'll shut up soon!!)

Article: 81101
Hi dave,

> I am probably opening a can of worms, but why are FPGAs so slow?

Because they're general-purpose logic embedded in a sea of wires.

> The CPU cores at www.opencore.org represent an ever growing number of
> excellent and very practical, but slow processor implementations. What I
> mean is 240-500MHz FPGAs, when market CPUs are in the 3GHz range.

These CPUs make use of the same kind of technology, but the difference is that all the silicon is designed for one single purpose. A CPU is a CPU, and that's all it can be. An FPGA can be a CPU, a bingo machine, an audio effects processor, a medical imager, you name it.

> Surely there must be 1 or 2 GHz FPGAs available with sub-nanosecond
> gate switch/propagation times.

The switching time of a Stratix II logic element is about 250ps (it depends on which input is being used and what output path is chosen), which would theoretically allow such speeds, and I'm sure that if I write an oscillator for one logic cell, I will get an oscillating signal in the GHz range. Depending on which routing structure I use, I will get either a 2GHz (internal feedback path) or a 1GHz (local routing) signal.

> Or possibly it is a verilog, vhdl or synthesis problem with the designs?

Nope. ASIC synthesis is indeed more fine-grained than FPGA synthesis. If an equation needs a NAND, in an ASIC a NAND is placed, and dedicated wires are laid out to connect the NAND to its inputs and outputs. In an FPGA, a connection is made to a design element, and this element is then configured to function as a NAND. A single FPGA 'design' element is capable of performing much more complex functions than NANDs alone, though.

The output of the design element then goes into a big multiplexer. This multiplexer connects the output of the design element to a variety of routing structures. These routing structures, in turn, connect to other multiplexers, which connect either to other routing structures, or to the inputs of other design elements.
As you can see, not only are the design elements in an FPGA general-purpose, but so is the interconnect. So, in short, the silicon in a modern FPGA is indeed high-performance, but due to its general-purpose nature, it can never be as fast as dedicated ASIC logic.

> Should I just use mass manufactured high speed CPUs and relegate the
> other discrete logic to CPLD/FPGAs??

Horses for courses. An 'industry CPU', as you call it (I'm thinking AMD or Intel), requires a lot of support chips to function properly - it thrives in an environment that conforms to a (large) number of conditions. An FPGA can basically be plonked into any situation that requires some sort of digital function - and a CPU is just one of the many functions that can be integrated in that little square with legs. It will never be as good as dedicated silicon, but in many cases, dedicated silicon just isn't there.

> Lastly, how fast is NIOSII?

Pretty fast. It's a 32-bit RISC softcore with a 1/5/6-stage pipeline that runs at ~150MHz in a Stratix II and at ~80MHz in a Cyclone - but Your Mileage May Vary, depending on custom instructions for the ALU, other logic getting in the way, speed grades, fill factor etc. Note that if you have a parallelizable algorithm, you can spread the load over multiple NIOSen in the same FPGA...

Best regards,

Ben

Not from the marketing department ;-)

Article: 81102
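Ben's 2GHz/1GHz oscillator figures follow directly from the element delay: a ring of inverting logic oscillates with a period of twice its total loop delay. A quick Python sanity check - the 250ps figure is from the post above, and the assumption that the local-routing path roughly doubles the loop delay (to ~500ps) is mine, chosen only because it reproduces the 1GHz figure:

```python
def ring_osc_freq_ghz(loop_delay_ps):
    """Frequency of an inverting feedback loop with total propagation
    delay loop_delay_ps: the signal must traverse the loop twice
    (one rise + one fall) per period."""
    period_ps = 2 * loop_delay_ps
    return 1000.0 / period_ps

print(ring_osc_freq_ghz(250))  # internal feedback, ~250ps loop -> 2.0 GHz
print(ring_osc_freq_ghz(500))  # assumed local-routing loop      -> 1.0 GHz
```

The same relation explains why adding even one extra routing hop to the loop halves the oscillation frequency.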
dave wrote:
> I am probably opening a can of worms, but why are FPGAs so slow?
> The CPU cores at www.opencore.org represent an ever growing number of
> excellent and very practical, but slow processor implementations. What I
> mean is 240-500MHz FPGAs, when market CPUs are in the 3GHz range.

Your 3GHz CPU can do a few adds or multiplies at 3GHz. An FPGA can do hundreds or thousands of adds or multiplies at a few hundred MHz. Which is faster?

-- glen

Article: 81103
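Glen's point is easy to put numbers on. A back-of-the-envelope sketch - the unit counts below are illustrative assumptions, not measurements of any particular device:

```python
# Aggregate throughput = clock rate * number of parallel units.
cpu_ops  = 3.0e9 * 2     # 3 GHz CPU, assume ~2 adds/multiplies per cycle
fpga_ops = 300e6 * 100   # 300 MHz FPGA, assume 100 parallel adders/multipliers

print(fpga_ops / cpu_ops)  # -> 5.0: the FPGA wins despite a 10x slower clock
```

With thousands of units instead of a hundred, the gap widens accordingly - which is the whole argument for doing DSP-style workloads in an FPGA.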
"morpheus" <saurster@gmail.com> wrote in message news:1111094017.939450.52810@o13g2000cwo.googlegroups.com...
> Hi All,
> If anyone knows of a bit rounding algorithm, please forward the
> information to me. I am trying to round off 24 bits to 12 bits.
> Thanks
> MORPHEUS
> p.s. the 24 bits is the result of an addition between two 24 bit
> numbers. I need to round off the result and feed it to a 12-bit DAC.
> THNX

It depends on what you're doing with the output of the DAC. Under some circumstances, it helps to take the "dropped" twelve bits, delay them by a clock, then add them in again. This has the effect of making the quantizing noise higher frequency, and often less objectionable. This is known as error feedback. For plain rounding, just add half an output lsb, then truncate.

Article: 81104
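The error-feedback trick described above can be modeled in a few lines. This is a behavioral sketch in Python; the function name and the choice of unsigned 24-bit inputs are assumptions for illustration, not part of the post:

```python
def round_with_error_feedback(samples_24bit):
    """Quantize 24-bit samples to 12 bits; the 12 dropped LSBs are
    delayed one clock and added back into the next sample, pushing the
    quantization noise up in frequency (first-order noise shaping)."""
    out = []
    err = 0
    for s in samples_24bit:
        acc = s + err           # add back the previous sample's dropped bits
        out.append(acc >> 12)   # keep the top 12 bits for the DAC
        err = acc & 0xFFF       # the "dropped" twelve bits, delayed a clock
    return out

# A constant input of half an output LSB dithers between DAC codes 0 and 1,
# so the long-run average preserves the sub-LSB information:
print(round_with_error_feedback([0x000800] * 4))  # -> [0, 1, 0, 1]
```

Because the dropped bits are never discarded, the accumulated output error stays bounded by one output LSB no matter how long the sequence runs.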
Thank you. Ben. Ben Twijnstra wrote: > Hi dave, > > >>I am probably opening a can of worms, but why are FPGAs so slow? > > > Because they're general-purpose logic embedded in a sea of wires. > > >>The CPU cores at www.opencore.org represent an ever growing number of >>excellent and very practical, but slow processor implementations. What I >>mean is 240-500Mhz FPGAs, when market CPUs are in the 3Ghz range. > > > These CPUs make use of the same kind of technology, but the difference is > that all the silicon is designed for one single purpose. A CPU is a CPU, > and that's all it can be. An FPGA can be a CPU, a bingo machine, an audio > effects processor, a medical imager, you name it. > > >>Surely there must be 1 or 2 Ghz FPGAs available with sub nano second >>gate switch/propagation times. > > > The switching time of a Stratix II logic element is about 250ps (depends on > which input is being used and what output path is chosen), which would > theoretically allow such speeds, and I'm sure that if I write an oscillator > for one logic cell, that I will get an oscillating signal in the GHz range. > Depending on which routing structure I use I will either get a 2GHz > (internal feedback path) or a 1GHz (local routing) signal. > > >>Or possibly it is a verilog, vhdl or synthesis problem with the designs? > > > Nope. ASIC synthesis is indeed more fine-grained than FPGA synthesis. If an > equation needs a NAND, in an ASIC, a NAND is placed, and dedicated wires > are laid out to connect the NAND to its inputs and outputs.. In an FPGA, a > connection is made to a design element, and this element is then configured > to function as a NAND. A single FPGA 'design' element is capable of > performing much more complex functions than NANDs alone though. > > The output of the design element then goes into a big multiplexer. This > multiplexer connects the output of the design element to a variety of > routing structures. 
These routing structures, in turn connect to other > multiplexers, which connect to either other routing structures etc, or to > the inputs of other design elements. > > As you can see, not only the design elements in an FPGA are general purpose, > but so is the interconnect. > > So, in short, the silicon in a modern an FPGA is indeed high-performance, > but due to its general-purpose nature, it can never be as fast as dedicated > ASIC logic. > > >>Should I just use mass manufactured high speed CPUs and relegate the >>other discrete logic to CPLD/FPGAs?? > > > Horses for courses. An 'industry CPU', as you call it (I'm thinking AMD or > Intel), requires a lot of support chips to properly function - it thrives > in an environment that conforms to a (large) number of conditions. An FPGA > can basically be plonked into any situation that requires some sort of > digital function - and a CPU is just one of the many functions that can be > integrated in that little square with legs. It will never be as good as > dedicated silicon, but in many cases, dedicated silicon just isn't there. > > >>Lastly, how fast is NIOSII? > > > Pretty fast. It's a 32bit RISC softcore with a 1/5/6-stage pipeline that > runs at ~150MHz in a Stratix II and at ~80MHz in a Cyclone - but Your > Mileage May Vary, depending on custom instructions for the ALU, other logic > getting in the way, speed grades, fill factor etc etc. Note that if you > have a parallelizable algorithm, you can spread the load over multiple > NIOSen in the same FPGA... > > Best regards, > > > Ben > > Not from the marketing department ;-) >Article: 81105
glen herrmannsfeldt wrote: > dave wrote: > >> I am probably opening a can of worms, but why are FPGAs so slow? > > >> The CPU cores at www.opencore.org represent an ever growing number of >> excellent and very practical, but slow processor implementations. What >> I mean is 240-500Mhz FPGAs, when market CPUs are in the 3Ghz range. > > > Your 3GHz CPU can do a few adds or multiplies at 3GHz. > > An FPGA can do hundreds or thousands of adds, or multiplies > at a few hundred MHz. Which is faster? > > -- glen > Ok.... Thank you.Article: 81106
"dave" <dave@dave.dave> wrote in message news:d1cs40$ja9$1@news6.svr.pol.co.uk...
<snip>
> Contrary to what you may think there is a market for GHz speed flexible
> FPGAs. But hey, what do I know, I am just a HDL newbie.
<snip>

I completely agree. There is a market for GHz speed FPGAs. There's also a market for terahertz speed processors. And a market for safe, $1500 cars.

Article: 81107
Couple of things.

One factor affecting the speed of the circuit is the process. Some of the FPGAs like Virtex-II are fabricated in 0.13um technology and are far behind the present CPU technology. However, the latest FPGAs from Xilinx and Altera are 90nm (same as the P4).

I think the reason for the performance difference for these guys is, as you suggested, because of the tools. ASICs like CPUs are carefully optimized for area, timing, power and so on at every level. With FPGAs, this is taken care of by the synthesis and PAR tools: all you do is code the design and let the synthesizer and PAR tools do their best possible job. I also think this process is further complicated by the inherent design of the FPGAs.

Article: 81108
On Thu, 17 Mar 2005 15:00:59 +0800, Sea Squid top-posted:
> I found PP is unable to drive such LEDs, which need 20mA, but what is the
> converter chip I shall order?
> Thanks

ULN2803 - Eight darlingtons in a DIP
http://www.st.com/stonline/books/ascii/docs/1536.htm

Of course, you'll need a separate supply - there is no reliable +5V Vcc at the LPT port.

Good Luck!
Rich

> "Sea Squid" <Sea.Squid@hotmail.com> wrote in message
> news:423928c3@news.starhub.net.sg...
>> I want to experiment with the parallel port with eight LEDs tied to a cut
>> parallel port cable, then send instructions with Visual Basic to create
>> some patterns. Is there any danger to my laptop?
>> Thanks.

Article: 81109
morpheus wrote:
> Hi All,
> If anyone knows of a bit rounding algorithm, please forward the
> information to me. I am trying to round off 24 bits to 12 bits.
> Thanks
> MORPHEUS
> p.s. the 24 bits is the result of an addition between two 24 bit
> numbers. I need to round off the result and feed it to a 12-bit DAC.
> THNX

Depends on your tolerance to quantization noise and bias.

The simplest approach is to simply truncate the 12 lower bits. This results in an error between 0 and +1 lsb, which introduces a bias of 1/2 LSB.

The bias can be reduced using simple rounding: add 1/2 of the retained LSB weight before truncating. In your case, you'd add 0x0800 (a '1' in the top discarded bit) to the 24 bit value before truncating off the 12 LSBs. This reduces the bias to 1/2 of the lsb weight of the 24 bit word, but does not totally eliminate it. A bias remains because 0.5 (0x800 in the lowest 12 bits) always rounds up; 0.5 is equidistant from 0 and 1, so it introduces a small bias.

The bias can be eliminated by modifying the rounding algorithm to either round toward or away from zero, or round towards even or odd. Note that with simple rounding, +0.5 rounds up to 1 and -0.5 rounds up to 0. With round-toward-zero or round-away-from-zero, the direction of rounding is modified when the value is negative, so that +0.5 rounds up to 1 and -0.5 rounds down to -1 (or both round to 0 if the sense is reversed). This is called symmetric rounding. It can be accomplished by adding 0.5 minus one LSB (in your case 0x7FF) and then connecting the most significant bit (the sign bit) to the carry in of the adder (invert the carry in to reverse the sense). If the carry-in is 0, then 0.5 is rounded down, and if it is 1 then 0.5 is rounded up. Symmetric rounding uses the sign to reverse the direction of rounding based on the sign of the input.
Rounding to even or odd is similar, except the value of the input bit corresponding to the LSB of the rounded output is used as the carry in, so that rounding of n.5 is always to the even (or to the odd) value. Round to even/odd is useful when you are rounding as part of an arithmetic process before complete results are available, because it is relatively easy to pre-compute the LSB value.

If quantization noise is an issue, then you can improve noise performance with feedback of the error introduced by rounding. That is a bit more complicated, so I'll save discussion of it for another time.

In summary:

input   truncation   simple   symmetric   rnd to even
-2.7        -3         -3        -3           -3
-2.5        -3         -2        -3           -2
-2.3        -3         -2        -2           -2
-1.5        -2         -1        -2           -2
-0.5        -1          0        -1            0
 0.5         0          1         1            0
 1.5         1          2         2            2
 2.5         2          3         3            2

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Article: 81110
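The four behaviors Ray tabulates can be modeled directly. A Python sketch, operating on real numbers rather than fixed-point words purely to mirror the summary table (in hardware each of these is the add-then-truncate construction described in the post):

```python
import math

def truncate(x):
    """Drop the fraction bits of a two's-complement word:
    always rounds toward negative infinity."""
    return math.floor(x)

def simple_round(x):
    """Add half an output LSB, then truncate: 0.5 always rounds up."""
    return math.floor(x + 0.5)

def symmetric_round(x):
    """Round halves away from zero, using the sign to steer the
    direction (the carry-in trick from the post)."""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

def round_to_even(x):
    """Exact halves go to the nearest even value; everything else
    rounds normally, so the half-step bias cancels on average."""
    f = math.floor(x)
    if x - f == 0.5:
        return f if f % 2 == 0 else f + 1
    return math.floor(x + 0.5)

# Reproduce the -2.5 row of the summary table:
modes = (truncate, simple_round, symmetric_round, round_to_even)
print([fn(-2.5) for fn in modes])  # -> [-3, -2, -3, -2]
```

Running all eight input values through these four functions reproduces the table row for row.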
Hi, I'm a poor student who purchased the 6.3 version of the EDK a few months back. Up till now I have been using the evaluation version of ISE 6.3 (too expensive for me). I have been unable to use the WebPack version of ISE 6.3 since "all" I have is a Virtex4 LX25. I had been planning to switch to the ISE 7.1 WebPack (which supports my device) but I am unable to get it working with my EDK 6.3 installation. I get the following error:

$XILINX does not point to an iSE 6.3 installation

Isn't it possible to use the ISE 7.1 WebPack with EDK 6.3?

thanks

Article: 81111
Hi dave,
> Jezwold wrote:
>> I quite agree with John_H its a mistake to compare FPGA functionality
>> and CPU functionality, they are just fundamentally different things. I
>> also think its a mistake to implement a CPU in a FPGA but I'll prolly
>> get flamed for saying that.
>
> Honest question, why is it a mistake?

Embedded CPUs are 'hot' at the moment. In 1997 I implemented a PIC controller in an Altera Flex FPGA as a proof-of-concept. The implementation ran at three times the speed of the PIC, but cost $150, versus the $8 of the PIC. Nowadays, FPGA gate pricing has come down to levels that make implementing a CPU on an FPGA economically viable. A NIOS II will easily fit into half an Altera EP1C3 FPGA costing around $12 (in low volume), leaving the other half for more specialized logic.

Then again, a PIC12/16/18 is available with lots of nice peripherals for prices around $6 or lower, so if you just want to use a CPU with some standard peripherals, then please just get a PIC (or a Cypress PSoC - they have a reprogrammable analog array as well!!). If there's some nonstandard digital function you need to build, and there happens to be a CPU on the board you're building as well, _then_ you may want to look at whether you can stuff the CPU in an unused corner of your FPGA. Otherwise, just go dedicated.

Best regards,

Ben

Article: 81112
dave wrote:
> I am probably opening a can of worms, but why are FPGAs so slow?

Dynamic versus static hardware configuration. A road from your home to your work place can be a lot faster if you do not require it to be used by thousands of other people who use the same road for different routes. You can get along without traffic lights, cross roads, and so on. It's the same for FPGAs: they have switches where an ASIC has wires.

So an FPGA will have a slower CPU than a CPU ASIC, and it will have a slower FIR filter than a FIR ASIC, but the FPGA can do both, which the other two can't. For that reason the FPGA is often the faster part: it is a faster CPU than the FIR ASIC, and a faster FIR filter than the CPU ASIC.

> The CPU cores at www.opencore.org represent an ever growing number of
> excellent and very practical, but slow processor implementations. What I
> mean is 240-500MHz FPGAs, when market CPUs are in the 3GHz range.

Clock speed is not everything. These people have a 90MHz FPGA ray tracing hardware that beats a P4-3GHz by a factor of four: http://www.saarcor.de/

> Surely there must be 1 or 2 GHz FPGAs available with sub-nanosecond
> gate switch/propagation times.

The typical carry chain delay of an FPGA is 50ps.

Kolja Sulimma

Article: 81113
Herb T wrote:
> Folks,
> I am trying to learn how to program Spartan II (XC2S100-5PQ208C) and
> Spartan 3 (XC3S400-4PQ208C) Xilinx FPGAs. I looked at the data sheets
> for these parts, and the more I do the more mystified I get.

You don't need to read the chip's data sheet in order to know how to program it. All you need to know is the I/O pin assignment on your FPGA board.

> Based on these descriptions, about how long does it take to write
> simple VHDL programs that work, or become fluent enough to know a good
> design from a chip fryer?

Start simple. Start by building a 2-input AND gate. That shouldn't take more than a few minutes. The tool should have the feature to connect the input and output of your AND gate to any I/O available in the FPGA chip. Digilent (an FPGA board vendor, www.digilentinc.com) has a video tutorial on programming their FPGA board. You can also look at www.engr.sjsu.edu/crabill - look at lab1. The author has a good tutorial on programming an FPGA board. He used Verilog, but the procedure for VHDL should be similar.

Good Luck!
Hendra

Article: 81114
KCL wrote:
> and what is the price of this board??

-1- TRND1 - Tornado board
Special Introductory price: 295 €uros VAT excl. + shipping

-2- TEK5 - Tornado Education Kit, 5 x boards + Tuition material
Special Introductory price: 1 480 €uros VAT excl. + shipping

Thanks for mentioning the wrong links, this Web page should have been updated anyway... my fault. The real site is either http://www.alse-fr.com (French) or http://www.alse-fr.com/english (guess :-). You'll find all the contact information there.

Both 1 & 2 ship with all you need, including the design software and ready-to-use VHDL and Verilog example(s), scripts etc... You still need a PC running Win2k or XP though (we haven't ported our USB programmer to Linux yet). And to get serious, a VHDL (or Verilog) simulator is indeed welcome.

A beginner should have his first FPGA synthesized and running in less than 15 minutes after unpacking the board. Everything from HDL code to bitstream download and board running requires only _one_ single command: a double-click on make.bat.

Teachers should like our ready-to-use Tuition material. Hobbyists should like the conditioned I/Os and RC servo. Experts should like the advanced features like fast ADC, Smart Card, I2C-PS2 links, USB transfers up to 1 Mbytes/s, etc...

Documents and Tuition materials are available in English or in French. Education Kit solutions are in VHDL only for now; they will be done in Verilog upon demand. Board examples are in both Verilog and VHDL.

Best regards,
Bert

Article: 81115
sam wrote:
> Couple of things
>
> One factor affecting the speed of the circuit is the process. Some of
> the FPGAs like Virtex-II are fabricated in .13um technology and are
> far behind the present CPU technology.
>
> However, the latest FPGAs from Xilinx and Altera are 90nm (same as P4).
> I think the reason for the performance difference for these guys is, as
> you suggested, because of the tools. ASICs like CPUs are carefully
> optimized for area, timing, power and so on at every level. With FPGAs,
> this is taken care of by the synthesis and PAR tools. All you do is
> code the design and let the synthesizer and PAR tools do their best
> possible job. I also think this process is getting further complicated
> by the inherent design of the FPGAs.

Anyway, it is strange that a PPC405 will only do ~500MHz in 90nm in 2005 while a P4 in 90nm does >3GHz (a factor of 6!!!). It suggests that the FPGA silicon is far less optimized than a CPU's would be, since the process used for modern FPGAs equals that of the P4. Of course it all depends on the maximum levels of logic between two clock edges, expressed in FO4 delays: the fewer the levels, the tougher the design. A P4 is not a "tough" design in this perspective, and the PPC405 is not either, so the question remains why... Maybe the P4 is designed transistor by transistor, while a PPC405 in a V4 is only synthesized by some less efficient synthesis tools?? For sure it gives hope for the next generation of FPGAs.

Roel

Article: 81116
Hi, I am trying to figure out how to evaluate the speed of Distributed Arithmetic architectures for FIR filter design. I was referring to the paper "A Guide to Field Programmable Gate Arrays for Application Specific Digital Signal Processing Performance" by G.R. Goslin at Xilinx. The paper says that by using a parallel distributed arithmetic architecture (where more bits of the inputs are processed at the same time), greater sampling speed (number of samples per second) can be achieved compared to serial distributed arithmetic.

I am confused about this. According to me, if you pipeline the design, you can achieve the sampling speed you want. I can see how, using more resources, you can achieve shorter latency, but the sampling speed should not be affected. There is probably something I am missing here. If any of you are familiar with this field and know the answer, please let me know.

Thank you,
Anup

Article: 81117
"anup" <anuphosh@yahoo.com> writes:
> I am confused about this. According to me, if you pipeline the design,
> you can achieve the sampling speed you want. I can see how using more
> resources, you can achieve shorter latency, but the sampling speed
> should not be affected. There is probably something I am missing here.

Adding pipelining is usually done to reduce the cycle time (increasing the clock rate) while also increasing the latency.

Suppose you wanted to build a floating-point multiply-and-add unit. Perhaps if you make it fully combinatorial, it takes 100 ns for each cycle. You can process ten million samples per second, and the latency is 100 ns.

Suppose instead you break it up into a pipeline with a combinatorial multiplier, a pipeline register, and a combinatorial adder. Suppose the multiplier and adder each take 60 ns separately, and the pipeline register setup and clock-to-output-valid time adds 10 ns. Now your latency (full operation time) is 130 ns, which is longer. But your sample clock can be as fast as 70 ns, so you can process over fourteen million samples per second.

Now suppose you internally pipeline the actual adder and the multiplier. Perhaps each has three stages that take 20 ns each, and you still have 10 ns of delays for combined setup and clock-to-output-valid time of the pipeline registers. Now you have a latency of 170 ns, which is longer yet. But your sample clock can now be 30 ns, so you can process over thirty million samples per second.

Note that the times used in this example are probably not representative of real times for any actual system.

Eric

Article: 81118
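Eric's three cases follow one accounting rule: the cycle time is the slowest stage plus one register overhead, while the latency is the sum of all stages plus one register overhead per pipeline boundary. A small Python sketch of that accounting (the function name is just illustrative):

```python
def pipeline_timing(stage_ns, reg_ns=10):
    """Return (latency_ns, cycle_ns, msamples_per_s) for a pipeline whose
    combinatorial stages take stage_ns each; reg_ns is the register
    setup + clock-to-output overhead paid at each pipeline boundary."""
    n = len(stage_ns)
    latency = sum(stage_ns) + (n - 1) * reg_ns   # one overhead per boundary
    cycle = max(stage_ns) + (reg_ns if n > 1 else 0)
    return latency, cycle, 1000.0 / cycle

print(pipeline_timing([100]))      # -> (100, 100, 10.0)    combinatorial
print(pipeline_timing([60, 60]))   # -> (130, 70, ~14.3)    two stages
print(pipeline_timing([20] * 6))   # -> (170, 30, ~33.3)    six stages
```

Each extra register stage makes the latency worse and the throughput better, which is exactly the latency/sampling-speed distinction anup was asking about.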
Hi, I was wondering if anyone has tried to use a STAPL file generated from the Xilinx iMPACT software to program a XCF02S using Altera's Jam software or any other third-party programming software. I believe the iMPACT software generates incorrect data streams that are used to program the device. The thing is, the ACA data format used to program generates the correct bit stream, but the Hex data format seems to be scrambled. I verified this by reading back the program written to the device and comparing it to the *.mcs file generated by iMPACT. When the two files are compared, locations that were programmed with the hex format were scrambled; those that used the ACA format were fine. Any comments?

thanks
Dave Colson

Article: 81119
Roel wrote:
> Anyway, it is strange that a PPC405 will only do ~500MHz in 90nm in 2005
> while a P4 in 90nm does >3GHz (a factor of 6!!!),

No, it is not. Even outside FPGAs the market share of slow processors is a lot larger than that of fast processors. For every 3GHz P4 there are dozens of 200MHz RISCs and hundreds of 10MHz MCUs sold. Why should Xilinx go for the exotic niche market that desktop PCs are from a processor builder's view?

> it suggests that the FPGA
> silicon is far less optimized than a CPU's would be,

Both are optimized for different optimization goals. The PPC405 is by far smaller than a P4 and uses a lot less power. Here is a recent processor that the makers of the P4 consider highly optimized. It runs at up to 520MHz: http://www.intel.com/design/embeddedpca/products/pxa270/techdocs.htm

Two P4 cores would burn more than a hundred watts. They would also need large caches and many I/O pins to access external memory quickly enough. (There is an empirical law in computer architecture that memory scales with performance.)

> the process used for
> modern FPGAs equals that of the P4.

No, it used to be that DRAMs used the most advanced technology first; now that switches more and more to FPGAs. CPUs usually adopt the technology many months later.

Kolja Sulimma

Article: 81120
Hi Eric, thank you for your response. I guess I did not frame my question properly. Your answer actually strengthens my doubt. According to what you said, by using more pipeline stages, you can increase the sampling speed. But I read this article where they use more resources to make use of the parallelism in the application (FIR filters) and reduce the latency - and they claimed that the sampling speed is also increased. My doubt is that, even in the original design (with fewer resources), you can achieve higher sampling speed by pipelining the design.

One relation I see between "more resources" and "sampling speed" is that you need fewer pipeline stages to achieve higher sampling speed. For example, consider this computation:

Y = Y + A[i]*X[i];

Assume that I have only 2 adders. I can implement this as:

Y1 = A[i]*X[i] + A[i+1]*X[i+1];
Y  = Y + Y1;

Assume that it takes 10 ns to do (A[i]*X[i] + A[i+1]*X[i+1]). Therefore, without pipelining the adder, I can achieve sampling of 200 million samples per second. Now suppose I have 4 adders; I can do:

Y1 = A[i]*X[i] + A[i+1]*X[i+1];
Y2 = A[i+2]*X[i+2] + A[i+3]*X[i+3];
Y' = Y1 + Y2;
Y  = Y + Y';

Now I can achieve a sampling of 400 million samples per second. The original system (with 2 adders) can also achieve the same sampling speed if the adder is pipelined. I guess that if there is a limit on the amount of pipelining that you can do, adding more resources is the way to increase sampling speed. (I am trying to answer my own question here.)

Anyways, thanks for your help.
-Anup

Eric Smith wrote:
> "anup" <anuphosh@yahoo.com> writes:
> > I am confused about this. According to me, if you pipeline the design,
> > you can achieve the sampling speed you want. I can see how using more
> > resources, you can achieve shorter latency, but the sampling speed
> > should not be affected. There is probably something I am missing here.
> > Adding pipelining is usually done to reduce the cycle time (increasing > the clock rate) while also increasing the latency. > > Suppose you wanted to build a floating-point multiply-and-add > unit. Perhaps if you make it fully combinatorial, it takes > 100 ns for each cycle. You can process ten million samples per > second, and the latency is 100 ns. > > Suppose instead you break it up into a pipeline with a combinatorial > multiplier, a pipeline register, and a combinatorial adder. Suppose the > multiplier and adder each take 60 ns separately, and the pipeline > register setup and clock-to-output-valid time adds 10 ns. Now your > latency (full operatin time) is 130 ns, which is longer. But your > sample clock can be as fast as 70 ns, so you can process over fourteen > million samples per second. > > Now suppose you internally pipeline the actual adder and the multiplier. > Perhaps each have three stages that take 20 ns each, and you still have > 10 ns of delays for combined setup and clock-to-output-valid time of the > pipeline registers. Now you have a latency of 170 ns, which is longer > yet. But your sample clock can now be 30 ns, so you can process over > thirty million samples per second. > > Note that the times used in this example are probably not representative > of real times for any actual system. > > EricArticle: 81121
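Anup's 200/400 Msps arithmetic comes down to one line: sampling rate is the number of input samples absorbed per cycle divided by the cycle time. A quick Python check of his two cases (the function name is just for illustration):

```python
def sample_rate_msps(samples_per_cycle, cycle_ns):
    """Throughput in million samples/s: inputs consumed each clock,
    divided by the cycle time set by the unpipelined adder tree."""
    return samples_per_cycle * 1000.0 / cycle_ns

print(sample_rate_msps(2, 10))  # 2 adders, 10 ns cycle -> 200.0 Msps
print(sample_rate_msps(4, 10))  # 4 adders, same cycle  -> 400.0 Msps
```

Both knobs work: doubling the parallel resources doubles samples_per_cycle, while pipelining the adder shrinks cycle_ns - which is why the same target rate can be reached either way.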
Roel wrote:
> sam wrote:
>> Couple of things
>>
>> One factor affecting the speed of the circuit is the process. Some of
>> the FPGAs like Virtex-II are fabricated in .13um technology and are
>> far behind the present CPU technology.
>>
>> However, the latest FPGAs from Xilinx and Altera are 90nm (same as P4).
>> I think the reason for the performance difference for these guys is, as
>> you suggested, because of the tools. ASICs like CPUs are carefully
>> optimized for area, timing, power and so on at every level. This is
>> taken care of by the synthesis and PAR tools while designing the
>> circuits using FPGAs. All you do is code the design and let the
>> synthesizer and PAR tools do their best possible job. I also think this
>> process is getting further complicated by the inherent design of the
>> FPGAs.
>
> Anyway, it is strange that a PPC405 will only do ~500MHz in 90nm in 2005
> while a P4 in 90nm does >3GHz (a factor of 6!!!). It suggests that the
> FPGA silicon is far less optimized than a CPU's would be; the process
> used for modern FPGAs equals that of the P4. Of course it all depends on
> the maximum levels of logic between two clock edges, expressed in FO4
> delays: the fewer the levels, the tougher the design. A P4 is not a
> "tough" design in this perspective, and the PPC405 is not either, so the
> question remains why... Maybe the P4 is designed transistor by
> transistor, while a PPC405 in a V4 is only synthesized by some less
> efficient synthesis tools?? For sure it gives hope for the next
> generation of FPGAs.

You need to be careful to compare BUS speeds rather than CLK speeds. CLK speed can refer to how fast a single node in the chip toggles (== marketing fluff), and on that basis the FPGAs could be pitched as 10GHz devices :) since the top-end ones can do 10GHz comms.... Once you work with bus-bandwidth numbers, the differences greatly reduce, and bus bandwidth is determined as much by the memory devices as it is by the CPU/FPGA process.

FPGAs have more general I/Os (and can easily make a bus wider), whilst PCs try to save pins and can focus the I/O purely on DDR memory.

-jg

Article: 81122
Use a series resistor of at least 3.3 kohm to keep the current under 1 mA. Most LEDs will give out enough light at this current to be visible.

Glenn.

Sea Squid wrote:
> I want to experiment with the parallel port with eight LEDs tied to
> a cut parallel port cable, then send instructions with Visual Basic
> to create some patterns. Is there any danger to my laptop?
> Thanks.

Article: 81123
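The 3.3k figure is consistent with simple Ohm's-law sizing. A sketch in Python - the 5 V port-high level and ~2 V LED forward drop are assumptions for illustration, not values from the post:

```python
def min_series_resistance(v_drive, v_led, i_max):
    """Smallest series resistor that keeps the LED current at or
    below i_max: the resistor drops whatever the LED doesn't."""
    return (v_drive - v_led) / i_max

# Assumed 5 V parallel-port high, ~2 V red-LED forward drop, 1 mA limit:
print(min_series_resistance(5.0, 2.0, 1e-3))  # ~3000 ohms
```

3.3 kohm, the next common E12 value above 3 kohm, then keeps the current safely under 1 mA even if the drive voltage runs a little high.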
"morpheus" <saurster@gmail.com> wrote in message news:1111094017.939450.52810@o13g2000cwo.googlegroups.com...
> Hi All,
> If anyone knows of a bit rounding algorithm, please forward the
> information to me. I am trying to round off 24 bits to 12 bits.
> Thanks
> MORPHEUS
> p.s. the 24 bits is the result of an addition between two 24 bit
> numbers. I need to round off the result and feed it to a 12-bit DAC.
> THNX

MORPHEUS, I know you are asking specifically about rounding, but I am a little concerned about the addition of two 24 bit numbers with a 24 bit result. You need a 25 bit result. Do you have overflow issues here?

Doug

Article: 81124
Thank you Jim. I was aware that data2mem takes in a FULL bitstream of my compiled design and outputs an updated FULL bitstream of the design. Since I am using a Virtex 6000, the time required to configure the FPGA becomes intolerable. I am able to write automation scripts to employ the "small bit manipulation" trick to compare two bitstreams and get a differential partial bitstream, but I am concerned whether this is the right approach. Besides that, is it possible to automate iMPACT to configure the FPGA, for example 1 configuration per 5 minutes, since I intend to do an exhaustive test of all the 1000 input vectors?

"Jim Wu" <nospam@nospam.com> wrote in message news:d1btiq$18g1@cliff.xsj.xilinx.com...
> Check the "data2mem" program installed with ISE.
>
> HTH,
> Jim
> jimwu88NOOOSPAM@yahoo.com (remove capital letters)
> http://www.geocities.com/jimwu88/chips
>
> "Sea Squid" <Sea.Squid@hotmail.com> wrote in message
> news:4238e536$1@news.starhub.net.sg...
> > I made use of two 1K*10B single port RAMs generated with coregen
> > which is modified to contain my test vector, and P&R with that. However,
> > I have one thousand test vector files in plain text to send to the FPGA
> > one at a time.
> >
> > I am wondering about whether I can write a perl script to manipulate the
> > bitstream and generate an *incremental* bitstream so that I can avoid
> > running ISE for one thousand times? Where can I find such information?
> >
> > Thanks.