Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
fortizzz@hotmail.com (Fernando) wrote in message news:<b650f553.0303201444.5b2a8089@posting.google.com>...
> tom@arithlogic.com (Tom Hawkins) wrote in message news:<cf474e53.0303192012.6267c920@posting.google.com>...
> > DSP.
> >
> > A lot of people new to DSP on FPGAs at first tend to believe they need
> > floating point precision. In our development process we usually hand
> > off a C math model early on to let our clients experiment with various
> > levels of fixed-point precision. Often this convinces them that fixed
> > point works just as well, and sometimes even better, than floating
> > point. Also, don't forget Cordics for fixed-point sines and cosines.
> > They may be a better alternative than consuming memory for lookup
> > tables.
>
> I disagree with what you said. Floating-point units are coming along nicely,
> and the features in the new FPGAs (BRAM, multipliers) give a huge boost to
> FP performance. Cordic is (in my opinion) not the way to go if you want high
> performance, because it requires a lot more cycles than LUT-derived methods.
> You can even add (floating-point) linear interpolation to decrease the
> memory footprint, and the latency will still be lower than cordic, keeping
> the compatibility and range of the IEEE-754 format.
>
> However, I would like to hear the opinion of the experts,
>
> Fernando

The problem with floating point isn't the arithmetic, but the need for variable shifting. Multiply requires a variable shift after the mantissa multiplication to renormalize the result. Add and subtract operations are even worse because they need a barrel shifter to align the binary point, in addition to the shifter for post-arithmetic normalization. A variable shift is a fundamental limitation. Whether you implement it as a barrel shifter or as a pipelined check-and-shift operation, you still have the standard speed vs. area tradeoff. For trigonometric functions, ROM lookup is the clear choice if precision requirements are low enough.
But as precision requirements get higher, cordics become more attractive. They are fairly inexpensive too: just a bunch of adders, with the shifting absorbed by the signal routing. And if the design has a unidirectional dataflow, the cordic can be fully elaborated and micro-pipelined.

-Tom

--
Tom Hawkins  tom@arithlogic.com
Arithlogic -- FPGA Consulting -- http://www.arithlogic.com/
Translating Data Intensive Algorithms to Digital Logic

Article: 53751
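Tom's description of a cordic as "just a bunch of adders" with the shifts absorbed into routing can be sketched in a few lines. Below is a minimal rotation-mode CORDIC for sine/cosine, with Python standing in for the hardware datapath; the 16-iteration count and the use of float arithmetic are illustrative assumptions, not anything from the thread.

```python
import math

N_ITER = 16  # illustrative; angular precision is roughly 2^-N_ITER radians
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# CORDIC gain: each stage stretches the vector by sqrt(1 + 2^-2i).
# In hardware this constant is simply folded into the initial x value.
K = 1.0
for i in range(N_ITER):
    K *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(theta):
    """Rotation-mode CORDIC for |theta| < ~1.74 rad.
    Each iteration is one add/subtract pair plus a fixed shift by 2^-i,
    which is exactly the adder-per-stage structure described above."""
    x, y, z = 1.0 / K, 0.0, theta
    for i in range(N_ITER):
        d = 1.0 if z >= 0.0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return x, y  # approximately (cos(theta), sin(theta))

c, s = cordic_sin_cos(0.5)
```

In a fully elaborated, unidirectional pipeline each loop iteration becomes its own stage, so the "shift" is just a wiring offset into the next stage's adder.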
Hi;

I've run into this situation -- the synthesis tool generating a design you didn't expect, as well as timing issues not found with functional simulation. (I've been working with the older XC4000XLA family parts as well as XCV300 and XCV1000E.) In addition to what Vishker suggested, if your synthesis tool allows looking at the technology-generated schematics and/or RTL-level schematics, I'd take a look at them too. Sometimes, due to the way items are coded, the synthesis tools can create circuits that don't exactly match what the designer originally intended...

-bob

Vishker wrote:
>
> Try running simulation for post-synthesis and post-place&route
> results. There is a switch in ISE which outputs a post-place&route
> netlist in Verilog/VHDL. I am sure you'll find something similar in
> synthesis tools also. Maybe the tools are messing up the logic.
>
> -Vs
>
> ninjak@gmx.de (zerang shah) wrote in message news:<4d6c559c.0303182011.46e6e561@posting.google.com>...
> > Hello,
> >
> > I am working on a fairly large project on a Xilinx XCV1000, and I am
> > using ISE 5.1i for synthesis. I've tested my design in simulation and
> > it behaves as it is supposed to, but when I program my FPGA, it
> > behaves weirdly. I'm sure that everything is wired correctly and I can
> > get simple VHDL programs to synthesize correctly. I'm certain that the
> > issue isn't timing; I've slowed the clock down to molasses speed. I've
> > spent many hours trying to debug this, and I've had no luck. I was
> > wondering: does anyone here have some suggestions? This is really
> > frustrating...
> >
> > Thanks for the help.

Article: 53752
"roy hansen" <royhansen@removethis_norway.online.no> wrote > If so, what is the typical speed increase I can hope for > compared to a state-of-the-art 1 CPU PC (i.e. P4 3GHz). Some links tangential to this question: On FPGAs as PC Coprocessors (1996): http://www.fpgacpu.org/usenet/fpgas_as_pc_coprocessors.html "One of the "way out speculation" questions asked at FCCM 96 (IEEE Symposium on FPGAs for Custom Computing Machines) was "when, if ever, will an FPGA coprocessor ship on every PC motherboard?" Ignoring the daunting language and interface standards issues, and just looking at current hardware approaches to FPGA "coprocessors", the answer must be "not any time soon". ..." philosophical musing -- P4 vs. FPGA MPSoC (Oct'02): http://groups.yahoo.com/group/fpga-cpu/message/1411 Jan Gray, Gray Research LLCArticle: 53753
Hi,

I'm looking for schematics or layouts that show what I minimally need to be able to use an FPGA. Specifically, a Xilinx Spartan II or IIE. I'm looking for something I can build on (I need to add some of my own stuff to it, etc.). I searched the Xilinx site, but didn't find anything of use. Perhaps I'm missing something?

thanks,

--buddy

Article: 53754
The FFT is complex. A real-only FFT zeroes the Q input, which in turn can reduce the amount of processing. If the input is real only, you can also 'fold' the input and process it using a complex FFT kernel half the size of the FFT and then unfold the result, or you can process two real FFTs using one complex FFT operation. So the advantage of doing a real FFT is that fewer operations are required for a given size. The complex FFT is the more general case.

The FFT inputs and outputs are generally in Cartesian form. Even a real-only FFT has a complex output. In some applications, you are only interested in magnitude, in which case the FFT output needs to be transformed from Cartesian to polar form. If log magnitude is required, then a log stage is also needed.

Bob wrote:
> Hi Folks,
>
> Some questions for the experts.
>
> 1. What are the benefits of a complex FFT rather than a
> non-complex FFT, and for what applications would each be used?
>
> 2. Is the FFT output from an ASIC or FPGA normally in
> logarithmic or non-logarithmic form?
>
> Thanks for any info
>
> Bob Carter

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Article: 53755
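Ray's "two real FFTs with one complex FFT" trick can be demonstrated in a few lines. A naive DFT stands in for the complex FFT kernel here; the packing and conjugate-symmetry unpacking are the point (a Python sketch for illustration, not production code).

```python
import cmath

def dft(x):
    """Naive complex DFT, standing in for a hardware complex FFT kernel."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def two_real_ffts(a, b):
    """Transform two real sequences with ONE complex transform.
    Pack a into the real part and b into the imaginary part, then
    separate the two spectra using conjugate symmetry of real inputs."""
    n = len(a)
    X = dft([complex(ar, br) for ar, br in zip(a, b)])
    A = [(X[k] + X[-k % n].conjugate()) / 2 for k in range(n)]
    B = [(X[k] - X[-k % n].conjugate()) / (2j) for k in range(n)]
    return A, B

a = [1.0, 2.0, 3.0, 4.0]
b = [4.0, 3.0, 2.0, 1.0]
A, B = two_real_ffts(a, b)
```

The separation works because a real input gives a conjugate-symmetric spectrum, so the combined transform X[k] splits cleanly into the symmetric part (spectrum of a) and the antisymmetric part (spectrum of b).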
Hi,

I have a Spartan2e that is not programming consistently. That is, sometimes it goes and sometimes it does not. When it does not, the pins are as follows:

Done = '0'
init_not = '1'
din = '1'
cclk = still running

The mode is serial master. The PROM is the Spartan XC17S50A series. Is init_not supposed to stay low during configure? Any observations as to what might be happening? Also, is there a clear, simple timing diagram available somewhere that shows exactly what these 4 pins do during configuration at power-up? I seem to recall seeing one, but I cannot seem to find it.

Thanks,
Theron

Article: 53756
Hi Petter - you are right, it would save time if you accomplish both programming and testing after the CPLD is mounted on the board, using the JTAG. --Neeraj "Petter Gustad" <newsmailcomp4@gustad.com> wrote in message news:m3y939t5ts.fsf@scimul.dolphinics.no... > "Neeraj Varma" <neeraj@cg-coreel.com> writes: > > > Hi Klaus - I think if Mauro wants to accomplish mass CPLD programming, a > > hardware programmer like the Xilinx HW-130 or a 3rd party programmer like > > the one from ALL-07 is more suitable than JTAG cable programming. However, > > adapters for particular packages and devices can be very expensive... > > Are these much faster than programming using JTAG? Assuming you're not > using a slow serial or parallel cable (e.g. a JTAG Technologies BV > DataBlaster, Xilinx MultiLinx USB, etc.) > > The cards have to be tested after the CPLD's have been mounted anyway. > Then why not program the PLD and test the card in the same step? Of > course you would use a less fragile connector and more capable > software. > > Petter > -- > ________________________________________________________________________ > Petter Gustad 8'h2B | ~8'h2B http://www.gustad.com/petterArticle: 53757
Hello, I am in need of a VHDL module of a n x m generic sequential divider. Does anyone have one or can tell me where to find one? Many thanks in advance! --Tom --- Tom Curran tom_curran@memecdesign_dot_comArticle: 53758
Hi,

I had similar issues with my NIOS installation and compiling sources under Quartus. In my case this was caused by old references to directories of former installations of the Cygnus tools. You may find those stale directories in the PATH variable of your system folder.

regards
Thomas

"Franz Hollerer" <nospam@nospam.com> schrieb im Newsbeitrag news:Xns93459A6561A9Ehollererdecomsyscom@213.129.232.14...
> Hi,
>
> I try to compile the hello example described in the
> EPXA1 Development Kit Getting Started User Guide using
> GNUPro.
>
> If I call 'make debug' from the command line it works,
> but if I use Quartus II v2.1 I get some non-informative error message:
>
> > Design Debug: Software build was unsucessful. 0 errors, 0 warnings.
>
> Can someone give me a hint what's going wrong?
>
> Best regards,
>
> Franz Hollerer

Article: 53759
Fernando wrote:
> tom@arithlogic.com (Tom Hawkins) wrote in message
> news:<cf474e53.0303192012.6267c920@posting.google.com>... DSP.
>>
>> A lot of people new to DSP on FPGAs at first tend to believe they
>> need floating point precision. In our development process we
>> usually hand off a C math model early on to let our clients
>> experiment with various levels of fixed-point precision. Often this
>> convinces them that fixed point works just as well, and sometimes
>> even better, than floating point. Also, don't forget Cordics for
>> fixed-point sines and cosines. They may be a better alternative than
>> consuming memory for lookup tables.

> I disagree with what you said. Floating-point units are coming along
> nicely, and the features in the new FPGAs (BRAM, multipliers) give a
> huge boost to FP performance.

> However, I would like to hear the opinion of the experts,

Certainly not an expert, but here's my £0.02.

As an undergraduate, I worked on an FPGA-based hardware accelerator for ray tracing (see http://zeus.jesus.cam.ac.uk/~jrs53/dissertation.pdf ). This used a pretty large Altera FPGA (20k1000) and floating point arithmetic for ease of integration with the host PC. With a very naive implementation of the arithmetic modules, I just about made the 33 MHz of the PCI bus. Note that this was with a very loose interpretation of IEEE 754! Even with this (then) gigantic FPGA, getting all the required modules for just one ray-object intersection calculator unit consumed almost half the device, so this kinda limited my ability to look at increasing parallelism.

So what am I saying? I think if I did the project again, I'd go for fixed point. The intricacies of IEEE 754 mean that getting something that produces results as accurate as you'd expect from a dedicated FPU is probably more effort than it's worth. Now clearly this depends on your application, but you probably shouldn't discard fixed point out of hand.

James

p.s.
the finished project did end up being between 3 and 40 times faster than the fastest PC I could find!

James Srinivasan
PhD Student
University of Cambridge Computer Laboratory

Article: 53760
> I have a Spartan2e that is not programming consistently. That is >sometimes it goes and sometimes it does not. When it does not the pins >are as follows. First thing I would look for is glitches on the clock line. Second would be power. >Also, is there a clear simple timing diagram available somewhere that >shows exactly what these 4 pins do during configuration at power up? I >seem to recall seeing one, but I cannot seem to find it. If it works sometimes, you should be able to capture your own timing diagram. There might be one in the data sheet. -- The suespammers.org mail server is located in California. So are all my other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited commercial e-mail to my suespammers.org address or any of my other addresses. These are my opinions, not necessarily my employer's. I hate spam.Article: 53761
Just buy one from Tony. His are expandable, very expandable. The last thing you want is to be debugging some download weirdness instead of your design. www.burched.com SH On Fri, 21 Mar 2003 15:12:29 +0000 (UTC), Buddy Smith <nullset@dookie.net> wrote: >Hi, > >I'm looking for schematics or layouts that show what i need minimally to >be able to use an FPGA. > >Specifically, a xilinx spartan II or IIE. I'm looking for something I >can build on (need to add some of my own stuff to it, etc). > >I searched xilinx site, but didn't find anything of use. perhaps i'm >missing something? > >thanks, > >--buddyArticle: 53762
There is an online VHDL generator for parallel CRCs at http://www.easics.com/webtools/crctool . I don't know offhand if the license permits commercial use, however.

Benoit wrote:
> Hi all,
>
> I need to calculate and verify a CRC in frames, at more than 4 Gbps, in an
> FPGA. The common CRC implementation is based on a linear feedback shift
> register architecture which can be used to process 1 bit per clock cycle.
> Easy but slow: typically only for low rates.
>
> To meet the requirements of CRC calculation for gigabit networks, a
> solution is to calculate CRC32 in several steps using the "Galois method".
>
> So I'm looking for free VHDL source code using this method in an FPGA.
> Can you help me please?
>
> Thanx,
> Benoit.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Article: 53763
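The "several bits per clock" idea behind parallel CRC generators like the easics tool rests on CRC being linear over GF(2): any fixed number of serial LFSR steps collapses into one precomputable XOR network. Here is a sketch at 8 bits per step, checked against zlib; Python is used for illustration only, and a hardware generator would emit the same collapsed recurrence as combinational equations for whatever word width the datapath needs.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial, as used by zlib/Ethernet

def crc32_bitserial(data, crc=0xFFFFFFFF):
    """Reference LFSR: one bit per 'clock', like the slow serial circuit."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF

# Because the CRC step is linear over GF(2), eight serial steps can be
# collapsed into one lookup: the same trick a parallel CRC implements
# as a fixed XOR network consuming W input bits per clock.
TABLE = []
for b in range(256):
    r = b
    for _ in range(8):
        r = (r >> 1) ^ POLY if r & 1 else r >> 1
    TABLE.append(r)

def crc32_parallel(data, crc=0xFFFFFFFF):
    """Eight bits per 'clock' using the collapsed transition table."""
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF

msg = b"123456789"
```

The standard check value for CRC-32 over "123456789" is 0xCBF43926, which both versions (and zlib) produce.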
Hi,

An IEEE-754 single-precision adder takes around 500 LUTs to implement and will have a latency of 5 clocks at 100 MHz. A 32-bit fixed-point addition consumes 32 LUTs and has a latency of 1 clock at 250 MHz. So the factor operations/LUT is the following:

For IEEE-754: (100_000_000/5) / 500 = 40_000 operations/LUT
For fixed point: (250_000_000/1) / 32 = 7_812_500 operations/LUT

The difference is 195x. What would you implement?

Göran Bilski

Fernando wrote:
> tom@arithlogic.com (Tom Hawkins) wrote in message news:<cf474e53.0303192012.6267c920@posting.google.com>...
> DSP.
> >
> > A lot of people new to DSP on FPGAs at first tend to believe they need
> > floating point precision. In our development process we usually hand
> > off a C math model early on to let our clients experiment with various
> > levels of fixed-point precision. Often this convinces them that fixed
> > point works just as well, and sometimes even better, than floating
> > point. Also, don't forget Cordics for fixed-point sines and cosines.
> > They may be a better alternative than consuming memory for lookup
> > tables.
>
> I disagree with what you said. Floating-point units are coming along nicely,
> and the features in the new FPGAs (BRAM, multipliers) give a huge boost to
> FP performance. Cordic is (in my opinion) not the way to go if you want high
> performance, because it requires a lot more cycles than LUT-derived methods.
> You can even add (floating-point) linear interpolation to decrease the
> memory footprint, and the latency will still be lower than cordic, keeping
> the compatibility and range of the IEEE-754 format.
>
> However, I would like to hear the opinion of the experts,
>
> Fernando

Article: 53764
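Taking Göran's figures at face value, the arithmetic works out as he says; note that his LUT counts, latencies, and clock rates are assumptions from his post, not measurements reproduced here.

```python
# Göran's stated figures:
#   IEEE-754 single-precision adder: 500 LUTs, 5-clock latency at 100 MHz
#   32-bit fixed-point adder:         32 LUTs, 1-clock latency at 250 MHz
fp_ops_per_lut = (100_000_000 / 5) / 500   # = 40_000 operations/LUT
fx_ops_per_lut = (250_000_000 / 1) / 32    # = 7_812_500 operations/LUT
ratio = fx_ops_per_lut / fp_ops_per_lut    # ~195x in favor of fixed point
```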
Hi,

I need to synthesize an HDL design, but some components are removed and optimized away when Leonardo Spectrum runs. How do I proceed (attributes in VHDL or Leonardo commands) to run Leonardo Spectrum without this optimization?

Eduardo Wenzel Brião
Catholic University of Rio Grande do Sul State - PUCRS
Porto Alegre city
Brazil

Article: 53765
Hi,

I'm using Xilinx ISE WebPack 5.1i with ModelSim XE Starter and I'm trying to use XPower. I first synthesize my design and do the place and route process. Then I click on my testbench .vhd file and run the 'post place and route simulation'. I chose the option to 'automatically generate vcd file'. ModelSim starts and the simulation begins. However, it takes forever to simulate a small amount of time. I think this is because my design exceeds the ModelSim Starter maximum gate count. Anyway, after a day or so, I have a small vcd file (100k). I then run the XPower tool from the ISE post-process window and load my vcd file. Then I get about one million messages saying that every node in my circuit 'has not had its activity set'. Am I doing something wrong? Could it be because my .vcd file is too small?

Thanks

David

Article: 53766
Thank you very much Alan, it works!!!! I'm really happy, because I had been trying for two weeks to do that; I tried many things and nothing worked. I was very disappointed with myself and nobody could help me. I really appreciate your tips.

Thanks for all

Regards

Gerardo

Article: 53767
"Ray Andraka" <ray@andraka.com> wrote in message news:3E7B2D8A.3B5106D0@andraka.com... > The FFT is complex. A real only FFT zero's the Q input, > which in turn can reduce the amount of processing. If the > input is real only, you can also 'fold' the input and > process it using a complex FFT kernel half the size of the > FFT and then unfold the result, or you can process two real > FFTs using one complex FFT operation. So the advantage of > doing a real FFT less operations are are required for a > given size. The complex FFT is the more general case. FFT is usually done in floating point, though that isn't required. We would need to know more about the real goal to answer the question. If the goal is to speed up a program that uses an FFT subroutine then it would need to be done in a similar data representation. Both floating point addition and multiplication are hard to do fast in FPGA's, and division is even worse. --glenArticle: 53768
"Tom Hawkins" <tom@arithlogic.com> wrote in message news:cf474e53.0303210623.384c8880@posting.google.com... (snip) wrote: > > > A lot of people new to DSP on FPGAs at first tend to believe they need > > > floating point precision. In our development process we usually hand > > > off a C math model early on to let our clients experiment with various > > > levels of fixed-point precision. Often this convinces them that fixed > > > point works just as well, and sometimes even better, than floating > > > point. Also, don't forget Cordics for fixed-point sines and cosines. > > > They may be a better alternative than consuming memory for lookup > > > tables. I do agree that floating point is overused. > > I disagree with what you said. Floating-point units are coming along nicely, > > and the features in the new fpgas (BRAM, multipliers) give a huge boost to > > fp peformance. Cordic is (in my opinion) not the way to go if you want high > > performance, becase it requires a lot more cycles than LUT-derived methods. > > You can even add (floating-point) linear interpolation to decrease the > > memory footprint, and the latency will be still be lower than cordic, > > keeping the compatibility and range of the IEEE-754 format. > > > > However, I would like to hear the opinion of the experts, > The problem with floating point isn't the arithmetic, but the need for > variable shifting. Multiply requires a variable shift after the > mantissa multiplication to renormalize the result. Add and subtract > operations are even worse because they need a barrel shifter to align > the binary point in addition to the shifter in post arithmetic > normalization. I have recommended using base 16 floating point, as IBM used in S/360 and S/370. That greatly reduces the shift requirements. It is a little harder to do the error analysis in numerical algorithms, but that might not be so much of a problem in this case. > A variable shift is a fundamental limitation. 
Whether you implement > it as a barrel shifter or as a pipelined check-and-shift operation, > you still have the standard speed vs. area tradeoff.Article: 53769
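Glen's point about base-16 floating point can be quantified: normalizing in 4-bit steps needs at most a quarter as many shift steps as normalizing bit by bit, so the variable shifter gets correspondingly shallower. A toy normalizer follows; the 24-bit mantissa width is an illustrative assumption.

```python
def normalize(mantissa, exponent, base_bits):
    """Normalize so the top 'digit' is nonzero, shifting in units of
    base_bits (1 = binary floating point, 4 = IBM-style hexadecimal).
    Returns (mantissa, exponent, number of shift steps taken)."""
    steps = 0
    top = 1 << (24 - base_bits)   # assumed 24-bit mantissa
    while mantissa and mantissa < top:
        mantissa <<= base_bits
        exponent -= 1
        steps += 1
    return mantissa, exponent, steps

m = 0x000ABC  # a mantissa with 12 leading zero bits
_, _, binary_steps = normalize(m, 0, 1)   # 12 single-bit shifts
_, _, hex_steps = normalize(m, 0, 4)      # only 3 nibble shifts
```

In hardware the same saving shows up in the barrel shifter: base-16 shift distances are multiples of 4, so the shifter needs fewer mux levels for the same mantissa width.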
That may be true for software implementations of the FFT, but not necessarily for hardware. Depending on the size of the FFT, floating point may not buy anything. For smaller FFTs it is much more economical to work with a wider fixed point to get the needed dynamic range than it is to do floating point. For larger FFTs, the FFT is generally accomplished using small FFTs combined using the mixed-radix algorithm. In these cases, it often makes sense to do the small FFTs in fixed point and then normalize and adjust the exponent between passes. Often a block floating point scheme is sufficient, in which case the common part of the exponent is stripped off when denormalizing the data before each pass. That common part of the exponent is then used to scale the final result.

Glen Herrmannsfeldt wrote:
> FFT is usually done in floating point, though that isn't required. We
> would need to know more about the real goal to answer the question. If the
> goal is to speed up a program that uses an FFT subroutine, then it would need
> to be done in a similar data representation. Both floating point addition
> and multiplication are hard to do fast in FPGAs, and division is even
> worse.
>
> --glen

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Article: 53770
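The block floating point scheme Ray describes amounts to: find the common headroom of a block, shift it out, and carry the count as a shared exponent applied once at the end. A minimal sketch of one such renormalization step; the 16-bit fixed-point format with 15 fractional bits is an assumed format for illustration.

```python
def normalize_block(samples, frac_bits=15):
    """One block-floating-point step: find the block's common headroom,
    shift every sample up by that amount, and return the shift so it
    can be accumulated into a running block exponent."""
    peak = max(abs(s) for s in samples)
    limit = 1 << frac_bits
    shift = 0
    while peak and (peak << (shift + 1)) < limit:
        shift += 1
    return [s << shift for s in samples], shift

# Between mixed-radix passes: renormalize, accumulate the exponent,
# and apply the total exponent once to scale the final result.
block = [100, -350, 75, 20]
scaled, e = normalize_block(block)
```

All samples in the block share one exponent, so the datapath stays pure fixed point while the dynamic range grows with the accumulated shift counts.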
Hal Murray wrote: > > I have a Spartan2e that is not programming consistently. That is > >sometimes it goes and sometimes it does not. When it does not the pins > >are as follows. > > First thing I would look for is glitches on the clock line. > Second would be power. > > >Also, is there a clear simple timing diagram available somewhere that > >shows exactly what these 4 pins do during configuration at power up? I > >seem to recall seeing one, but I cannot seem to find it. > > If it works sometimes, you should be able to capture your own > timing diagram. > > There might be one in the data sheet. > > -- > The suespammers.org mail server is located in California. So are all my > other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited > commercial e-mail to my suespammers.org address or any of my other addresses. > These are my opinions, not necessarily my employer's. I hate spam. As it turns out, the problem is most likely in the power supply. The system has multiple boards, each with one FPGA. I was trying to power them up in a daisy chain fashion. The daisy chained power enable signal is sometimes glitching high at the wrong time (too early) and thus trying to boot the FPGA. I am trying to add a stronger pull down to eliminate the problem. It seems to help but I have to add it to all the boards and see if it totally solves the problem. Thanks for the input, Theron HicksArticle: 53771
1> (1-0.001953125)
2> (1-0.001953125)^6
3> (1-0.001953125)^1/2

Article: 53772
hi everybody,

Please, can anybody tell me whether the Altera FLEX10K100E is only a 3.0V device? I am a bit confused, as somewhere I have read that it's a multivoltage device. Will it work on 5.0V?

regards,
shridhar.

Article: 53773
Hi, all

I defined a bidir port d[7..0] to send commands and receive data in Max+Plus II. However, it does not work; the help messages tell me that I should use a tri to feed the bidir. I added tri (buffer[7..0] : tri) before d[], with buffer[].oe=vcc and d[]=buffer[].out, but the problems still exist. Any suggestions? Thank you.

Jian

Article: 53774
Hi all. I recently had to replace a dead BurchED B5 board. The old one had an XC2S200 chip; the new one has an XC2S300E. WP3.3 did not support it, so I downloaded WP4.2, because WP5.2 will not run on my Win98 system. I installed it in a separate directory, to retain both versions. However, WP4.2 is not compiling some code that compiled on 3.3. Are they unable to co-exist on the same PC at the same time?

In one file, it won't let me take just one bit from a sub-component's multi-bit output. I can get round this by connecting it to an intermediate multi-bit signal, then using just the bit required. It works, only a bit more long-winded. In some other files, it says a signal is assigned but not used, even though it explicitly _is_ connected to a component later in the file. The syntax checker and hierarchy checker report no problems.

Thanks in advance,
K.