Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Hi Andrew,

> one of the things that you are running into is caused by the router and
> the part - these LUT based models cannot be accurately simulated to this
> point because you end up hand routing and then running simulations in a
> circle - I end up using Actel SX parts for critical timing (they have
> very deterministic delays through all of the cells that allow you to do
> this). I like Xilinx and Altera parts very much, but they have
> limitations, and timing accuracy is one of them. Consider the
> architecture of the part when you decide on your needs.

I am unfamiliar with these parts, but I took a peek through the datasheet, and I don't see what it is that would make these chips any more deterministic than other FPGAs (especially non-segmented ones like FLEX10K, APEX20K, etc.). Can you elaborate?

One thing to be aware of is that the delay the software reports for an element may not be very accurate at all -- the goal of FPGA software is to ensure that your design runs correctly at the frequency reported. There are numerous effects that can either be swept under the rug (and covered by guard-band/conservative results) or modeled accurately. Some things I can think of off the top of my head:

- Skew along a routing wire (the further you go, the longer it takes). Small effect on short wires though...
- Loading effects (the more active loads on a wire, the slower it is)
- Rise/Fall delay (usually rise and fall are not perfectly matched; software can't figure out which you care about, reports max)
- Asymmetries in the combinational logic (some paths may be faster than others due to circuit design).
- Edge rates (depending on path taken, may have different edge rate, but software may just model a single, worst-case edge rate)
- Proximity effects in layout. Though all wires may look the same, are they? Depending on neighbouring metal, some wires may have higher/lower R & C for a variety of reasons. Does the software model them as such?
- Other subtle differences in layout.

This means that though one part may *appear* more deterministic, this may just be a sign of a less sophisticated timing model. I'm not saying that's the case for SX (I have no clue), but it is something to be aware of in general.

As previous posters have indicated, relying on a particular delay to be predictable is not a good idea. Process, temperature and voltage variation make it next to impossible to do so. The "-7" chip you have may be a "-5" chip that has been down-binned. Tools usually report the delay assuming worst-case temperature, voltage, IR drop, silicon that is at the edge of the bin, etc.

Regards,
Paul Leventis
Altera Corp.
Article: 58226
vbetz@altera.com (Vaughn Betz) wrote in message news:<48761f7f.0307162147.57d6095a@posting.google.com>... > already5chosen@yahoo.com (Michael S) wrote in message news:<f881b862.0307150558.53ac3a8c@posting.google.com>... > > I know that respective regulars of this newsgroup don't like to give > > decisive answers to A vs. X type of questions, but... This last visit > > of the Xilinx representative was a shocker ! > > <snip> > > it was a shock: about three time as many multipliers as in similar > > size/similar price Stratix chip. Two and a half times more multipliers > > than in significantly bigger Startix chip ! > > XC2P30 - 136 18x18 Dedicated multipliers > > EP1S30 - 48 18x18 Embedded multipliers (the price of the parts is > > similar to XC2P30) > > EP1S40 - 56 18x18 Embedded multipliers > > <snip> > > It's almost too good to be true. > > IMHO if there is no catch (availability ?) here the XC2P > > parts draws Stratix into irrelevance for nearly all DSP-intensive > > applications. > > Before making a decision, you should also check out Stratix's "soft > multipliers." Basically RAM blocks (which Stratix has lots of) are > used to do constant-coefficient multiplies using distributed > arithmetic -- exactly what FIR filters need. Since the RAM blocks are > re-writable, you can also update the filter coefficients by rewriting > the RAMs, in case you're doing some adapative FIR filtering. Altera's > FIR compiler will automate the implementation in RAM blocks for you. > > See http://www.altera.com/literature/hb/stx/ch_9_vol_2.pdf for the > detailed documentation. > > The document lists (in Table 9-3) how many 16x16 multiplies (with a > net throughput of one result per cycle) you can get with this > technique in various Stratix devices, even without fully optimizing > the design (i.e. wasting some of the RAM inputs & bits). 
>
> 1S30: 159 more 16x16 multiplies
> 1S40: 187 more 16x16 multiplies
>
> If you really need 18x18 multiplies (not 16x16) you'll lose about 25%
> of these multipliers since you will need more M512 RAMs per
> multiplier. On the other hand, this table (Table 9-3) assumes you've
> done a fairly poor job of packing your FIR taps into the RAMs, so you
> could come out even with a real, optimized implementation, and achieve
> roughly the numbers as above even for 18x18.

M512 supports a 32x18 configuration. So 16x18 multiplies come for free.

>
> Adding in the 18x18 multipliers in the DSP blocks takes you to a total
> "soft" + hard multiplier count of about 200 for the 1S30, and about
> 250 for the 1S40 (can't be exact without knowing your number of taps
> and coefficient widths precisely).
>
> Vaughn
> -- I am an Altera employee, in case you didn't notice the email
> address :).

I recently spent many hours analyzing the feasibility of the DA solution. It is probably possible to fit our processing into an S30 or S40 part with DA. But:

1. The target system has several modes of operation. Each mode requires a slightly different topology of the filtering chain, i.e. different relationships between the input/output decimation factor and the convolution length. When the solution is based on HW multipliers alone, it is relatively easy to implement all topologies in one design (or two at worst). The Hard+Soft approach would almost certainly need different designs for different topologies. That leads to bigger on-board flash memory, complicated logistics and, most important, more complex verification.
2. Design effort. The HW-only design is easier.
3. Compilation time. We hope that synthesis+p&r of the HW-only design will take about 10 to 30 minutes. The Hard+Soft design will take several hours at best, even without taking into account the issue of multiple designs. The productivity gain from a faster compilation/simulation or compilation/timing analysis cycle should never be overlooked.
4. Higher risk. What if it fits, but due to 95% utilization of device resources can't meet the timing requirement? Through blood and sweat it would meet at last :(

My current conclusion is: if we decide to stay with Altera and fight through DA, there is no sense in using the S30 or S40. We can go one step further and build our solution from one Stratix S10 part (for the NCO, digital mixers, advanced I/O, etc.) and four Cyclones (C12 or C20) for FIR filtering. At least this way we would have some pay-off (cheap parts) for our pain. And the compilation is faster too :-)

BTW, DA is not limited to Altera. It is very possible in Xilinx too. Of course XC2P has no M512 blocks, but 16 SRL16 memories provide 80% of the worth of an M512 for DA. So dedicating 20% of the XC2P20's LCs to SRL16s gives us about 50 more MACs and the ability to fit into this even cheaper device.
Article: 58227
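The multiplier counts traded back and forth in this exchange can be tallied with a few lines of Python. This is only a sketch using the figures quoted in the posts above (the hard 18x18 counts from the original question and the soft 16x16 counts attributed to Table 9-3); the dictionary and function names are made up for illustration, and the sum mixes 16x16 and 18x18 multipliers exactly as the "about 200" and "about 250" estimates do.

```python
# Figures as quoted in the thread; all names here are illustrative only.
HARD_18X18 = {"XC2P30": 136, "EP1S30": 48, "EP1S40": 56}
SOFT_16X16 = {"EP1S30": 159, "EP1S40": 187}  # quoted Table 9-3 numbers


def total_multipliers(part: str) -> int:
    """Hard multipliers plus any RAM-based "soft" multipliers quoted for the part."""
    return HARD_18X18[part] + SOFT_16X16.get(part, 0)


def hard_ratio(part_a: str, part_b: str) -> float:
    """Ratio of hard 18x18 multiplier counts between two parts."""
    return HARD_18X18[part_a] / HARD_18X18[part_b]


if __name__ == "__main__":
    for part in ("EP1S30", "EP1S40"):
        print(part, "hard + soft:", total_multipliers(part))
    print("XC2P30 vs EP1S30:", round(hard_ratio("XC2P30", "EP1S30"), 2))
    print("XC2P30 vs EP1S40:", round(hard_ratio("XC2P30", "EP1S40"), 2))
```

The totals are raw sums; the quoted caveat that 18x18 soft multipliers may cost about 25% of the soft count is not applied here.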
I have a requirement to be able to enable/disable a PCI card at the board level. Could someone tell me which pin would be the best to interrupt to accomplish this? I have tried leaving PRSNT1# and PRSNT2# open, but the card is still picked up. Messing with pins such as IRDY# and TRDY# causes the system not to boot.

Thanks,
Chris Harwood
Article: 58228
Henning.Bahr@ncl.ac.uk (Henning Bahr) wrote in message news:<8679149d.0307162201.49094c06@posting.google.com>...
> Hi there!
> I hope this isn't too trivial:
> I'm having a digital system with a finite state machine and a few
> other modules which send a control signal to the FSM. Do you think it
> is possible to use only clock and only posedge Flip Flops in such a
> design? I can't manage it without the inverted clock so that the
> control signals change at half the clock signal. But is there a way to
> avoid this without violating setup and hold times?
>
> Cheers,
> Henning

You can write your code using both the rising and falling edges of the clock. All you need is to select a device (FPGA) that accommodates the clock frequency and meets the timing constraints. Applying timing constraints will help you implement (write) the VHDL code so that it does not violate setup and hold times (this is easy to do in Xilinx ISE).

Regards,
DanR
Article: 58229
Chris Harwood wrote: > I have a requirement to be able to enable/disable a pci card at the > board level. could someone tell me which pin would be the best to > interupt to accomplish this. I have tried using prsnt1 and prsnt2 open > but the card is still picked up. messing with pins such as IRDY and > TRDY causes the system not to boot. Hmmm, perhaps it is the best to manipulate FRAME#. Assuming that FRAME# is always inactive, the card shall not pick up any transaction and should stay quiet. Regards, MarioArticle: 58230
You can disable a PCI card by writing to the card's PCI command register. Bit one is used to disable memory transactions. You can also remove the clock to the PCI device. "Chris Harwood" <chryselwood@hotmail.com> wrote in message news:d6a7ba72.0307170602.4fdd6e42@posting.google.com... > I have a requirement to be able to enable/disable a pci card at the > board level. could someone tell me which pin would be the best to > interupt to accomplish this. I have tried using prsnt1 and prsnt2 open > but the card is still picked up. messing with pins such as IRDY and > TRDY causes the system not to boot. > > Thnx > > Chris HarwoodArticle: 58231
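The command-register approach suggested above amounts to clearing a few enable bits. Here is a minimal Python sketch of the bit manipulation only, assuming the standard layout of the 16-bit PCI command register at configuration offset 0x04 (bit 0 I/O space, bit 1 memory space, bit 2 bus master). How the register is actually read and written is platform-specific and not shown; the function names are hypothetical.

```python
# Enable bits in the 16-bit PCI command register (configuration offset 0x04).
IO_SPACE_ENABLE = 1 << 0      # device responds to I/O accesses
MEMORY_SPACE_ENABLE = 1 << 1  # device responds to memory accesses
BUS_MASTER_ENABLE = 1 << 2    # device may initiate transactions


def disable_memory_only(command: int) -> int:
    """Clear only bit 1, as described in the post above."""
    return command & ~MEMORY_SPACE_ENABLE & 0xFFFF


def disable_card(command: int) -> int:
    """Clear the I/O, memory and bus-master enables; leave other bits alone."""
    mask = IO_SPACE_ENABLE | MEMORY_SPACE_ENABLE | BUS_MASTER_ENABLE
    return command & ~mask & 0xFFFF


if __name__ == "__main__":
    print(hex(disable_memory_only(0x0007)))
    print(hex(disable_card(0x0147)))
```

Even with all three bits cleared, the device still answers configuration accesses, so it remains visible to enumeration software.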
Let me dig down to the bottom of this question:

Some designers worry that feeding data from one flip-flop output to the input of another flip-flop clocked by the same clock is "dangerous", since a late-arriving clock on the destination flip-flop might clock in the already changed info from the first flip-flop. This is called a race condition, or a hold-time violation.

The obvious cure to this "problem" is to alternate between rising and falling clock edges. This "cure" works, but it creates unnecessary complexity and cuts performance in half.

Our answer is: Don't worry, be happy! The chip designers have taken care of this situation and given you a very fast, low-skew clock distribution net (actually many of these global clock nets) that completely eliminates the theoretical "problem".

But if you use normal routing resources to distribute the clock, then it is wise to worry. Running the clock in the opposite direction of the data flow, so that the destination flip-flop is clocked no later than the source, is a well-known cure.

Peter Alfke, Xilinx
=====
Dan RADUT wrote:
>
> Henning.Bahr@ncl.ac.uk (Henning Bahr) wrote in message news:<8679149d.0307162201.49094c06@posting.google.com>...
> > Hi there!
> > I hope this isn't too trivial:
> > I'm having a digital system with a finite state machine and a few
> > other modules which send a control signal to the FSM. Do you think it
> > is possible to use only clock and only posedge Flip Flops in such a
> > design? I can't manage it without the inverted clock so that the
> > control signals change at half the clock signal. But is there a way to
> > avoid this without violating setup and hold times?
> >
> > Cheers,
> > Henning
>
> You can write your code using both the rising and falling edges of the clock.
> All you need is to select a device(FPGA)that accomodates the clock frequency
> and meets the timing constraints. Using timing constraints will help you to
> implement(write) the VHDL code such as not to violate setup and hold times(in
> Xilinx ISE is easy to do).
>
> Regards,
>
> DanR
Article: 58232
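The hold-time concern in the post above comes down to one inequality: the new data must not reach the destination flip-flop before the (possibly late) clock edge plus its hold time. A minimal Python sketch of that check follows. All delay values are invented for illustration and are not taken from any datasheet; the function name is hypothetical.

```python
def hold_margin_ps(clk_to_q_min: int, data_route_min: int,
                   clock_skew: int, hold_time: int) -> int:
    """Hold margin in picoseconds for a flop-to-flop path on one clock.

    clock_skew is how much later the clock reaches the destination flop
    than the source flop. A negative result means a hold-time violation.
    """
    earliest_data_change = clk_to_q_min + data_route_min
    latest_sampling_need = clock_skew + hold_time
    return earliest_data_change - latest_sampling_need


if __name__ == "__main__":
    # Low-skew global clock net (illustrative numbers): comfortable margin.
    print(hold_margin_ps(500, 400, 100, 0))
    # Clock routed on general interconnect with large skew: violation.
    print(hold_margin_ps(500, 400, 1200, 0))
    # Clock routed against the data flow: negative skew adds margin.
    print(hold_margin_ps(500, 400, -300, 0))
```

With these made-up numbers the three cases give 800 ps, -300 ps and 1200 ps, which matches the qualitative advice: global nets are safe, routed clocks deserve worry, and clocking against the data direction helps.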
Willem Oosthuizen wrote: > > You can disable a PCI card by writing to the card's PCI command register. > Bit one is used to disable memory transactions. But then it will still respond to configuration transactions... > You can also remove the clock to the PCI device. Yes, that should work as well. Although manipulating the clock is always a little bit "special" because of its crucial purpose. Chris, I guess with "interrupting a line" you mean placing some masking gate there, isn't it? In this case you should also keep the PCI timing requirements in mind. Regards, MarioArticle: 58233
Does anyone know of a free software tool that generates a VHDL file from a state machine description? Thanks in advance.
Article: 58234
Peter Alfke wrote:
> The obvious cure to this "problem" is to alternate between rising and
> falling clock edges. This "cure" works, but it creates unnecessary
> complexity and cuts performance in half.
> Our answer is: Don't worry, be happy! The chip designers have taken
> care of this situation and given you a very fast, low-skew clock
> distribution net (actually many of these global clock nets) that
> completely eliminate the theoretical "problem".

I don't worry, and have been happy for a long time. Having low-skew global nets is the upside of using FPGAs. In cases where the falling edge seems to be needed to generate narrower pulses, you can still be happy by using the on-chip PLL/DLL to make and distribute a 2x clock.

-- Mike Treseler
Article: 58235
Jay wrote: > Just wondering, I never got any responses to my original (and slightly > off-topic) post. Was I off in my design insights? I'm actually thinking > about using my aforementioned design in a project, I would like to know > if it's a good starting point or not... Consider finishing your research, running some simulations and coming back with code and more specific questions. Since PLLs/DLLs are already built-in to the devices, and work just fine, you need an interesting angle. > One thing I was particularly curious about, when thinking about my own > design, do I have to worry about synchronizing / clock domain problems > between the two reference inputs to the loop or does the standard two- > DFF design actually take care of that? I have always used a two flop shifter for synchronization without problems to date. Some use three or four. > I'm guessing that the *outputs* (Up/Down) from the DFF loop should be > put through a de-metastablizer (two DFF in series or something) since > those outputs could change irrespective of my system clock. Comments? Whenever possible, synchronize first and do logic later. > Oh, finally, is there a better frequency/phase detector than the > standard two-DFF one? I know that the two-DFF one hunts when the > phase/frequency are close to being locked. Google is your friend. Read these and tell us the answer. http://www.google.com/search?q=frequency+phase+detector+DFF -- Mike TreselerArticle: 58236
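The "two flop shifter" mentioned above can be modelled behaviourally in a few lines. This Python sketch only shows the structure and the resulting latency; it does not model metastability itself, which is an analog effect, and the class and attribute names are invented for illustration.

```python
class TwoFlopSynchronizer:
    """Behavioural model of two D flip-flops in series on the same clock."""

    def __init__(self, reset_value: int = 0) -> None:
        self.meta = reset_value  # first flop: may go metastable in real hardware
        self.sync = reset_value  # second flop: the only output logic should use

    def clock(self, async_in: int) -> int:
        """Advance one rising clock edge and return the synchronized output."""
        # Both flops capture their pre-edge inputs at the same instant.
        self.sync, self.meta = self.meta, async_in
        return self.sync


if __name__ == "__main__":
    s = TwoFlopSynchronizer()
    print([s.clock(bit) for bit in [1, 1, 1, 0, 0]])
```

In this model a change on the asynchronous input shows up on the output one edge after the edge that first sampled it. Adding a third or fourth flop, as some designers do, adds one more edge of latency per stage.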
The Xilinx design tools - PC edition only - should have what used to be "StateCAD" as part of the suite. I haven't used it and don't know the details but I've wanted to see what it can do; my problem is I use Unix for my tool platform. Before Xilinx acquired StateCAD, the tool looked interesting as a stand-alone program. "Valeria Dal Monte" <prova@microsoft.com> wrote in message news:LZzRa.183127$lK4.5182616@twister1.libero.it... > Someone knows a free software tool to generate a VHDL file starting > from a state machine description? Thanks in advance.Article: 58237
StateCAD in the Xilinx WebPack bundle... http://www.xilinx.com/xlnx/xil_prodcat_landingpage.jsp?title=ISE+WebPack "Valeria Dal Monte" <prova@microsoft.com> wrote in message news:LZzRa.183127$lK4.5182616@twister1.libero.it... > Someone knows a free software tool to generate a VHDL file starting > from a state machine description? Thanks in advance. > >Article: 58238
hi, it's lucky for me that I got Synplify yesterday the VHDL version can be synthesized but now I'm puzzled how to add my own core using V2PDK ... btw, I've tried using the MontaVista 3.0 Pro so far, it seems good for creating a kernel for the ML300 tk Antti Lukats wrote: > "tk" <tokwok@hotmail.com> wrote in message > news:<bf3iqc$sh4$1@www.csis.hku.hk>... >> thx very much, antti !!!!! >> >> > 1) I use Verilog (VHDL gives error in v2pdk_lib_utils or somewhere..) >> i also encounter the VHDL problem >> too bad, i'm not familiar with Verilog!! > > same here :( > >> btw, how's ur progress with EDK ? > > not much, tried a little to get the TFT working, but the > tft.c vs xtft.c files are all messed up, also something is wrong > with the timing or plb to line buffer - the screen I see is > all in stripes like 8 pixels ok, then 8 pixels dark. > > I am little in the wait up mode, as the tft and linux will > most likely not used in the project I work for, more like > doing the research for the future. > >> any good news for its support with Linux ? > > same - used ELDK from www.denx.de to write hello.c and run it > on ml300 MVista linux, worked ok, and thats it. > > when a EDK design (linux cap) comes will checkit again. > > in the meanwhile need to get some hardware designed > > currently was fighting with Microblaze and 16 Bit Flash memory > as usual when all things are right then it works right away > but it takes a lots of time to get things right :) > > anttiArticle: 58239
Hi, The software disable mechanism that someone else mentioned (via the command register) is preferred. However, that means you need to have your device driver involved, which may not be an option for you. If you simply want to make your device appear to be invisible, you could consider using a 2-input gate so that you can locally make the PCI RST# signal on your card appear permanently asserted (like with a jumper or something...) You cannot just ground the RST# signal because that may disturb other devices on the bus. While you have forced it into reset (locally) all the I/O is supposed to be three-stated so you could consider it electrically disconnected from the bus. Even if the PRSNT# pins indicate something is "there" the O/S will not see it, not configure it, and it will be effectively disabled. Of course, doing anything like this is a technical violation of the PCI specification. However, I do not think you'll ever have your system/card fail if you do this... Chris Harwood wrote: > > I have a requirement to be able to enable/disable a pci card at the > board level. could someone tell me which pin would be the best to > interupt to accomplish this. I have tried using prsnt1 and prsnt2 open > but the card is still picked up. messing with pins such as IRDY and > TRDY causes the system not to boot. > > Thnx > Chris HarwoodArticle: 58240
Hi, Okay, I think it may be bad form to reply to myself but you don't even need to use a 2-input gate, you can just do this to the RST# signal with an appropriate jumper arrangement (break connection to the bus RST# and then locally tie it to ground...). A 3 pin header... Eric Eric Crabill wrote: > > Hi, > > The software disable mechanism that someone else > mentioned (via the command register) is preferred. > However, that means you need to have your device > driver involved, which may not be an option for you. > > If you simply want to make your device appear to be > invisible, you could consider using a 2-input gate > so that you can locally make the PCI RST# signal > on your card appear permanently asserted (like with > a jumper or something...) You cannot just ground > the RST# signal because that may disturb other devices > on the bus. > > While you have forced it into reset (locally) all the > I/O is supposed to be three-stated so you could consider > it electrically disconnected from the bus. Even if the > PRSNT# pins indicate something is "there" the O/S will > not see it, not configure it, and it will be effectively > disabled. > > Of course, doing anything like this is a technical > violation of the PCI specification. However, I do > not think you'll ever have your system/card fail if > you do this... > > Chris Harwood wrote: > > > > I have a requirement to be able to enable/disable a pci card at the > > board level. could someone tell me which pin would be the best to > > interupt to accomplish this. I have tried using prsnt1 and prsnt2 open > > but the card is still picked up. messing with pins such as IRDY and > > TRDY causes the system not to boot. > > > > Thnx > > Chris HarwoodArticle: 58241
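The 2-input gate suggested for RST# is easy to get backwards because the signal is active low. A tiny Python truth-table sketch follows, assuming a hypothetical active-high card_enable input (for example from the jumper); the function name is also invented.

```python
def local_rst_n(bus_rst_n: int, card_enable: int) -> int:
    """Active-low reset seen by the card.

    The card's reset is asserted (0) whenever the bus asserts RST# or the
    card is disabled, which is a plain 2-input AND of the two signals.
    The bus-side RST# net itself is never driven by this gate.
    """
    return bus_rst_n & card_enable


if __name__ == "__main__":
    for bus_rst_n in (0, 1):
        for card_enable in (0, 1):
            print(bus_rst_n, card_enable, local_rst_n(bus_rst_n, card_enable))
```

Only the combination bus_rst_n = 1 and card_enable = 1 releases the card from reset.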
petersander@despammed.com (Peter Sander) wrote in message news:<9501f612.0307170159.147e311e@posting.google.com>...
> Hi,
>
> I'm having some trouble with Quartus II 3.0 on win2k and
> a selfmade byteblaster cable.
>
> I'm getting a
> 'Attempted to acccess JTAG server - internal error code 82 occured'
> when clicking on the 'Add Hardware' button in the Hardware setup window.
>
> Since the hardware is rather simple I guess it's a driver problem :-/
>
> Has anyone similar problems with Quartus II?
>
>
> Peter

Peter,

This problem is seen if jtagserver is not running on the computer attached to the ByteBlaster. Jtagserver runs as a service. On Windows 2K this means that you need to be "administrator" to install, start and stop the service. If you were "administrator" when you installed Quartus, jtagserver would have been installed properly as a service and will start automatically each time you reboot the computer. To make sure jtagserver is installed properly, do the following:

1. Log in as administrator
2. Run \quartus\bin\jtagserver --uninstall
3. Run \quartus\bin\jtagserver --install

- Subroto Datta
Altera Corp.
Article: 58242
Hi guys,

In my new project, the input data to a parallel FIR arrives at up to 150 MHz, which I think may be too fast for PDA. I have used the CoreGen PDA at 50 MHz or slower and it ran well, but I don't know how it performs at this rate. The Xilinx data sheet for the FIR core says nothing about speed. I know speed depends on many factors such as placement, routing, ... Say a design uses 80% of the resources of an XC2V300 in speed grade -4: how fast can one expect the FIR to run?
Article: 58243
"javid" <javodv@yahoo.es> wrote in message news:c10cd8da.0307140345.f0999e@posting.google.com...
> Hello to All,
>
> I was wondering if it is possible to program a PLD/CPLD via a PIC
> (without connecting external memory). The PIC I am using has a
> internal RAM of 768 bytes and 16k of flash. I have seen some app.notes
> from Altera/Xilinx/Lattice but I think that I need a more powerful
> micro for doing the CPLD reprograming with it. Is there any new small
> CPLD easy to reprogram?. I would appreciate any suggestion or link.
>

In one of my projects with a Spartan FPGA (sorry, I have no experience with Xilinx PLDs/CPLDs) I used a Microchip PIC16F876 with an external 24LC65 serial EEPROM. The I2C communication from the PIC to the EEPROM is implemented in software, and a few lines of asm code from the PIC to the Xilinx part are enough to program the FPGA (I don't remember precisely, but the total configuration time can be around 6 seconds for the XCS05 with the PIC clocked at 20 MHz).

Best regards
Pow

--
----------------------------------------------------
Love your craft with passion
It is the meaning of your life
Auguste Rodin (1840-1917)

"Not everything that counts can be counted, and not everything that can be counted counts". Albert Einstein

"Hunt for the Engineer and the Engineer will hunt you." Hellraiser5 Pinhead
Article: 58244
Hi,

I have to pick a drive strength for an output pin that is driving a memory chip with 5 pF of capacitance. This is a 50 ohm impedance line, about 3000 mil long. Will a 6 mA driver be good enough? What equation or rule of thumb can I use?

Re
Article: 58245
Hi all,

I am using GDB to debug my design. First I generated XMDSTUB and downloaded the bit file to the FPGA. Then I started the XMD engine to establish a connection with the stub on the FPGA through UARTlite (the command used is: mbconnect stub -comm serial -port com1 -baud 9600). I followed the same procedure given in the "EDK MicroBlaze tutorial". But XMD is unable to establish the connection and shows the following:

ERROR: Unable to sync with stub on the board using the UART
Closing serial port.

My board is an XSV-800 from XESS, and I am using the serial port to connect with the stub on the board. I am really stuck in my work. Please help me get past this problem.

Thanks & Regards,
Viral Parikh.
Article: 58246
Hi Viral, Viral Parikh wrote: > But, XMD is unable to establish the connection and show following. > ERROR: Unable to sync with stub on the board using the UART Closing > serial port. Check your reset circuit - often the push buttons on these prototyping boards are active low, so when not pressed they are putting 5V (or whatever) on the reset pin, and holding the processor in reset. Check this Xilinx answer to find out how to invert the internal polarity of the reset line: http://support.xilinx.com/xlnx/xil_ans_display.jsp?iLanguageID=1&iCountryID=1&getPagePath=14337 Other simple things to check: - external clock frequency correctly set as a parameter to the UART core - UART baud rate parameter set and matches what you give to XMD. - properly set the UART as the debug peripheral (in MSS file) Hope this helps, JohnArticle: 58247
you can find the jblaster project on the sourceforge.net Jim Flanagan <jflan@ieee.org> wrote in message news:<MPG.197fad6474e7fddb989682@netnews.worldnet.att.net>... > [This followup was posted to comp.arch.fpga and a copy was sent to the > cited author.] > > Hi.. > I am searching for a 'standalone' command line utility that will > allow me to program Altera CPLD parts using the ByteBlasterMV cable > and WITHOUT using Max-Plus,etc. The MaxPlus sw comes bundled with a > small executable (with 'C' source) that will allow you to program using > .RBF (raw binary files) but not .POF files. Either I need to modify > the source to accomodate POF files (don't have the specification, any > help?) or get a utility that will convert POF to RBF format. > > In any event, I could use some direction. The reason for the standalone > tool, is that I want to integrate the CPLD programming into a production > environment and do not want an operator to have to run a program such > as MaxPlus. > > Any help would be appreciate... Thanks. > > JimArticle: 58248
Xilinx can also do multipliers in the fabric in the same way. In the case of Xilinx, using SRL16s gives you LUTs for constant-coefficient multipliers that can be easily reprogrammed using a serial reload. The Stratix structure can sort of do that using the M512 memories; it is not quite the same as actually changing the LUTs, but it is a very workable solution to get more multiplies. It also works for distributed arithmetic, which is simply an algorithm for doing a sum of products with reduced hardware. The long and short of it is that there are many hardware tricks available for compressing the physical size of the filter by taking advantage of the hardware and the idiosyncrasies of your particular application.

Vaughn Betz wrote:
>
>
> Before making a decision, you should also check out Stratix's "soft
> multipliers." Basically RAM blocks (which Stratix has lots of) are
> used to do constant-coefficient multiplies using distributed
> arithmetic -- exactly what FIR filters need. Since the RAM blocks are
> re-writable, you can also update the filter coefficients by rewriting
> the RAMs, in case you're doing some adapative FIR filtering. Altera's
> FIR compiler will automate the implementation in RAM blocks for you.
>
> See http://www.altera.com/literature/hb/stx/ch_9_vol_2.pdf for the
> detailed documentation.
>
> The document lists (in Table 9-3) how many 16x16 multiplies (with a
> net throughput of one result per cycle) you can get with this
> technique in various Stratix devices, even without fully optimizing
> the design (i.e. wasting some of the RAM inputs & bits).
>
> 1S30: 159 more 16x16 multiplies
> 1S40: 187 more 16x16 multiplies
>
> If you really need 18x18 multiplies (not 16x16) you'll lose about 25%
> of these multipliers since you will need more M512 RAMs per
> multiplier. On the other hand, this table (Table 9-3) assumes you've
> done a fairly poor job of packing your FIR taps into the RAMs, so you
> could come out even with a real, optimized implementation, and achieve
> roughly the numbers as above even for 18x18.
>
> Adding in the 18x18 multipliers in the DSP blocks takes you to a total
> "soft" + hard multiplier count of about 200 for the 1S30, and about
> 250 for the 1S40 (can't be exact without knowing your number of taps
> and coefficient widths precisely).
>
> Vaughn
> -- I am an Altera employee, in case you didn't notice the email
> address :).

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 58249
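The LUT-based constant-coefficient multiplier described above can be illustrated in software. This is a rough Python sketch, assuming unsigned inputs and 16-entry tables addressed by 4-bit slices of the input (16 entries being the depth of an SRL16); in hardware each table output bit would need its own 16x1 element, and the function names are invented for illustration.

```python
def build_kcm_lut(coefficient: int) -> list:
    """16-entry table: the constant multiplied by every possible 4-bit value."""
    return [coefficient * n for n in range(16)]


def kcm_multiply(x: int, lut: list, width: int = 16) -> int:
    """Multiply unsigned x (width bits) by the constant baked into lut.

    Each 4-bit slice of x addresses the table; the partial products are
    shifted into place and summed, so no general multiplier is needed.
    """
    result = 0
    for shift in range(0, width, 4):
        nibble = (x >> shift) & 0xF
        result += lut[nibble] << shift
    return result


if __name__ == "__main__":
    lut = build_kcm_lut(23)
    print(kcm_multiply(1234, lut), 23 * 1234)
    lut = build_kcm_lut(-7)  # "reprogramming" is just reloading the table
    print(kcm_multiply(1234, lut), -7 * 1234)
```

Changing the coefficient only changes the table contents, which is why a serially reloadable LUT is attractive for adaptive filters.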
Actually, DA gets inefficient as you move up to larger memories. The SRL16 is a nice size for a DA LUT, and provides a reprogramming path that is not available if you do the DA with hard LUTs. The smallest grained reprogrammable element in the Stratix is the M512, so the DA LUTs are necessarily bigger. The SRL16's also give you a compact delay queue for the filter's tapped delay as long as you have more than one clock per sample. Sorry Vaughn, but I still like the SRL16's more than the M512s. Regarding the comment that the M512 gets you a 16x18 for free, I don't follow. The M512, configured for a 32x18 memory provides at most a 5x18 product, not 16x18. You can reuse the LUT sequentially with a scaling accumulator, or expand with an adder tree and multiple LUTs, but I'd hardly call it 'free'. All that said, you can fit a design to either the stratix or virtex parts and be successful. In order to fully utilize the part, you'll need to make some design optimizations to take advantage of the features of the family. The choice between using M512's or SRL16's is typical of the trades you need to make at the architectural level in your design. Michael S wrote: > vbetz@altera.com (Vaughn Betz) wrote in message news:<48761f7f.0307162147.57d6095a@posting.google.com>... > > already5chosen@yahoo.com (Michael S) wrote in message news:<f881b862.0307150558.53ac3a8c@posting.google.com>... > > > I know that respective regulars of this newsgroup don't like to give > > > decisive answers to A vs. X type of questions, but... This last visit > > > of the Xilinx representative was a shocker ! > > > <snip> > > > it was a shock: about three time as many multipliers as in similar > > > size/similar price Stratix chip. Two and a half times more multipliers > > > than in significantly bigger Startix chip ! 
> > > XC2P30 - 136 18x18 Dedicated multipliers > > > EP1S30 - 48 18x18 Embedded multipliers (the price of the parts is > > > similar to XC2P30) > > > EP1S40 - 56 18x18 Embedded multipliers > > > <snip> > > > It's almost too good to be true. > > > IMHO if there is no catch (availability ?) here the XC2P > > > parts draws Stratix into irrelevance for nearly all DSP-intensive > > > applications. > > > > Before making a decision, you should also check out Stratix's "soft > > multipliers." Basically RAM blocks (which Stratix has lots of) are > > used to do constant-coefficient multiplies using distributed > > arithmetic -- exactly what FIR filters need. Since the RAM blocks are > > re-writable, you can also update the filter coefficients by rewriting > > the RAMs, in case you're doing some adapative FIR filtering. Altera's > > FIR compiler will automate the implementation in RAM blocks for you. > > > > See http://www.altera.com/literature/hb/stx/ch_9_vol_2.pdf for the > > detailed documentation. > > > > The document lists (in Table 9-3) how many 16x16 multiplies (with a > > net throughput of one result per cycle) you can get with this > > technique in various Stratix devices, even without fully optimizing > > the design (i.e. wasting some of the RAM inputs & bits). > > > > 1S30: 159 more 16x16 multiplies > > 1S40: 187 more 16x16 multiplies > > > > If you really need 18x18 multiplies (not 16x16) you'll lose about 25% > > of these multipliers since you will need more M512 RAMs per > > multiplier. On the other hand, this table (Table 9-3) assumes you've > > done a fairly poor job of packing your FIR taps into the RAMs, so you > > could come out even with a real, optimized implementation, and achieve > > roughly the numbers as above even for 18x18. > > M512 supports 32x18 configuration. So 16x18 multiplies come for free. 
> > > > > Adding in the 18x18 multipliers in the DSP blocks takes you to a total > > "soft" + hard multiplier count of about 200 for the 1S30, and about > > 250 for the 1S40 (can't be exact without knowing your number of taps > > and coefficient widths precisely). > > > > Vaughn > > -- I am an Altera employee, in case you didn't notice the email > > address :). > > I spent many hours recently analyzing possibility of the DA solution. > It probably does possible to fit our processing into S30 or S40 part > with DA. > But: > 1. The target system has several modes of operation. Each mode > requires slightly different topology of the filtering chain, i.e. > different relationships between input/output decimation factor and the > convolution length. When the solution is based on HW multipliers > alone, it is relatively easy to implement all topologies in one design > (or two at worst). Hard+Soft approach almost certainly would need > different designs for different topologies. It leads to bigger > on-board flash memory, complicated logistics and most important more > complex verification. > 2. Design effort. HW-only design is easier. > 3. Compilation time. We hope that synthesis+p&r of the HW-only design > will take about 10 to 30 minutes. Hard+Soft design will take at best > several hours even not taking into account issue of multiple designs. > The productivity gain from the faster compilation/simulation or > compilation/timing analysis cycle should never be overlooked. > 4. Higher risk. What if it fits, but due to 95% utilization of device > resources can't meet timing requirement ? Through blood and sweet at > last it'd meet :( > > My current conclusion is: if we would decide to stay with Altera and > fight through DA there is no sense to use S30 or S40. We can go one > step further and build our solution from one Stratix S10 part for NCO, > digital mixers, advanced I/O etc) and four Cyclons (C12 or C20) for > FIR filtering. 
At least this way we would have some pay-off (cheap > parts) for our pain. And the compilation is faster too :-) > > BTW, DA is not limited to Altera. It is very possible in Xilinx too. > Of coarse XC2P has no M512 blocks, but 16 SRL16 memories provide 80% > worth of M512 for DA. So dedicating 20% of XC2P20's LC's to SRL16 give > us about 50 more MACs and ability to fit into this even cheaper > device. -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759
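For readers unfamiliar with distributed arithmetic, the LUT-plus-scaling-accumulator arrangement mentioned above can be sketched in Python. This is an illustration under simplifying assumptions (unsigned samples, one bit of every sample processed per step, five taps so that the table has 32 entries like a memory with 5 address bits); the coefficients and function names are made up.

```python
def build_da_lut(coeffs):
    """Table indexed by one bit from each sample: the sum of the
    coefficients whose sample bit is set. len(coeffs) taps -> 2**taps entries."""
    taps = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << taps)]


def da_sum_of_products(samples, lut, sample_bits=8):
    """Bit-serial DA: one table lookup and one shifted add per sample bit."""
    acc = 0
    for bit in range(sample_bits):
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << k  # gather bit `bit` of every sample
        acc += lut[addr] << bit            # scaling accumulator
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 4, 1, -5]
    samples = [10, 20, 30, 40, 50]
    lut = build_da_lut(coeffs)
    print(len(lut), da_sum_of_products(samples, lut))
    print(sum(c * x for c, x in zip(coeffs, samples)))
```

The table size doubles with every added tap, which illustrates the point that DA gets less efficient as the memory grows: a 16-entry table covers 4 taps, a 32-entry one only 5.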