Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
> The memory RAM and RAM management will be a more important feature for your
> next portable computer.

I agree with Laurent: RAM will be a major issue if you have large designs and you plan to do place & route on your laptop. I'm not sure what size design you're planning to develop on your portable, but for our large Virtex/Virtex2 designs we loaded our desktops with 1 GB of RAM. It was a bit of overkill, and 512 MB would probably have been just right. The key thing to remember: if your place & route software has to use virtual memory, buy more RAM.

> Make sure your DDRAM is not shared with Video Card !!!

Yeah, a lot of laptops have good specs but poor design. Watch out for things that will cause a memory bottleneck. Unfortunately I don't have much in the way of guidelines for you. Just remember that the system design is just as important as the speed of the CPU and the amount of RAM.

Theoretically speaking, the memory system on laptops might be such a bottleneck that the speed of the CPU is not as critical. But I have no experience or evidence of this, so take it with a grain of salt. Just something to think about.

Regards,
Vinh
Article: 60976
Your comments after your question do not make sense to me.

Usually a frame buffer is where you render. The result then goes through a DAC and several video timing control units and reaches a CRT via a VGA or 13W3 connector.

Bypassing this path means direct electron beam control on the CRT. Yeah, you can do anything if you can control that beam.

---Bob

"Martin Euredjian" <0_0_0_0_@pacbell.net> wrote in message
news:XPBcb.6059$ju4.2633@newssvr27.news.prodigy.com...
> I know about the various algorithms to draw lines, circles, etc. All of
> these pretty much rely on painting onto a frame buffer that is later used to
> scan out to a CRT.
>
> Does anyone know of any algorithms to draw primitives that work without the
> intermediate frame buffer step. In other words, the algorithm's input would
> be the current x,y pixel being painted on the screen and the desired shape's
> parameters. Horizontal and vertical lines (and rectangles), of course, are
> easy. But, how do you do curves or diagonal lines?
>
> It seems to me that you'd take y and solve for x, which could produce
> multiple results (say, a line near 0 degrees). You'd have to save the
> results for that y coordinate in a temporary buffer that would then be used
> to compare to x. That's as simple as I can come up with.
>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Martin Euredjian
>
> To send private email:
> 0_0_0_0_@pacbell.net
> where
> "0_0_0_0_" = "martineu"
>
>
Article: 60977
I am confused by your terminology. Correct me if I am wrong.

1. You are looking for a "synchronous binary counter". Isn't it the one you just wrote?
2. What is the matter with a ripple counter? And how does it differ from a "synchronous binary counter"?
3. Gray code should be able to walk through all the addresses.

---Bob

"Denis Gleeson" <dgleeson-2@utvinternet.com> wrote in message
news:184c35f9.0309250354.763cf662@posting.google.com...
> Hello All
>
> OK Im a bit confused.
>
> the verilog code
>
> always @(posedge CLK or posedge CLR)
> begin
>   if (CLR)
>     Q <= 20'b0;
>   else
>     if (CE) // Is counter Enabled.
>       begin
>         Q<=Q+1;
>       end
> end
>
> produces a ripple counter? Yes, No?
>
> When I look at my simulation results I see glitches in the count output
> leading me to assume that I have implemented a ripple counter.
>
> IS it possible to remove the glitches with a synchronous Binary counter?
> If yes what is the verilog code?
>
> Ive read about Gray counters but as Im using the counter to step through
> SRAM addresses I dont want to loose storage locations in my SRAM just because
> my counter doesnt count through all possible binary counts.
>
>
> Thanks in advance for all suggestions.
>
>
> Denis
Article: 60978
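On Bob's third point to Denis (a Gray counter does not skip addresses), a quick software model is one way to convince yourself. The sketch below is Python rather than Verilog and is only a checking aid; the function names and the 4-bit width are illustrative assumptions, not anything taken from Denis's design.

```python
# Illustrative model only: names and widths are assumptions, not from the thread.

def binary_to_gray(n: int) -> int:
    """Reflected-binary Gray code of n."""
    return n ^ (n >> 1)


def gray_sequence(width: int) -> list:
    """Gray codes produced as a binary count steps through 0 .. 2**width - 1."""
    return [binary_to_gray(i) for i in range(1 << width)]


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def check_gray_counter(width: int) -> bool:
    """True if the sequence hits every address once and flips one bit per step."""
    seq = gray_sequence(width)
    visits_all = sorted(seq) == list(range(1 << width))
    # Include the wrap-around step from the last code back to the first.
    one_bit_steps = all(
        bits_changed(seq[i], seq[(i + 1) % len(seq)]) == 1
        for i in range(len(seq))
    )
    return visits_all and one_bit_steps


if __name__ == "__main__":
    print(check_gray_counter(4))
```

The addresses come out in a different order than a plain binary count, which should not matter for an SRAM as long as reads and writes step through it with the same counter.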
"T. Irmen" <tirmen@gmx.net> wrote in message news:<bkvkuq$9im$1@online.de>... > Hi Antti, > > > 5 intercept dll, it exposes xilinx DLL entry points and call xilinx > > dll > > (that you have renamed) part of the calls are re routed to your dll > > that then talks to your hardware > > that sounds good to me. > > I never thought of that way. Isn´t it possible to stack a filter onto xilinx > kernel driver? stupid, me, yes it would. its just that I havent written such drivers > Do you know which Xilinx DLL I have to deal with? :-) not hard to find, check out whitch one talks the xpc4drvr.sys anttiArticle: 60979
"James Williams" <james@williams-eng.com> wrote in message news:<bkv30s$pp0$1@news3.infoave.net>... > I found that it did synthesise into an XC95144, however is does not fit into > the device use the fitter of the ISE. > > Regards, with plug and play and debug disabled it fits into XC9572 with ISE anttiArticle: 60980
Jan Panteltje <panteltjeNSOAPM@yahoo.com> wrote in message news:<1064411674.535376@evisp-news-01.ops.asmr-01.energis-idc.net>...
> What more can I say: I lost yahoo now.
> That virus searches Usenet for email addresses, and then sends
> thousands of times that microft fix with the worm.

Hi,
I am also facing the same problem. Is there a fix to avoid this?

Regards,
Jaideep
Article: 60981
Hello,

I wrote a simple process to synchronize the Dat signal with a clock. When a "Dat" pulse edge is very close to the "Clock" edge, timing simulation shows "X", and simulation of the remaining signals fails. How can I solve this?

My text is:

A0: process(Clock)
begin
  if Clock'event and Clock='0' then
    Dat2 <= Dat;
  end if;
end process A0;

I put the resulting timing diagram (Aldec 5.2) at:

http://www.electronicsdesigns.net/img/timings.gif

Thank you in advance for any suggestion.
Article: 60982
Hi folks,

I am using a Virtex2P in my design, and I now want to estimate the total design power. Xilinx provides a spreadsheet for power calculation. What is the typical range of power consumption for good, field-proven designs that use a Virtex2P device? And what heat sinks are appropriate if the power consumption is higher? Could somebody help me out with this?

rgds,
praveen
Article: 60983
On Fri, 26 Sep 2003 04:53:29 GMT, "Bob Feng" <yi.feng@sbcglobal.net> wrote:

>Your comments after your question do not make sense to me.
>
>Usaually a frame buffer is where you render. And then it goes through a DAC
>and several video timing control units and reaches a CRT via a VGA or 13w3
>connector.
>
>Bypassing this path means a direct electron beam control on CRT. Yeah, you
>can do anything if you can control that beam.

ISTR that one of the early personal computers, perhaps the Sinclair ZX81, generated video on the fly in software. This was done in groups of eight pixels with an external eight-bit shift register (and I think there was an external character generator ROM as well). This meant that the software had to render a byte of display every two microseconds or so.

Regards,
Allan.
Article: 60984
Your problem could be the fact that your sensitivity list is incomplete.

Replace this:

A0: process(Clock)

With:

A0: process(Clock,Dat)

An incomplete sensitivity list can cause strange simulation results in Aldec. You would think Aldec would give you a warning when your sensitivity list is incomplete. So what I end up doing is: enter my design in Aldec, synthesize the design in Synplicity (which does give you a warning), fix the sensitivity lists, and THEN simulate it in Aldec. Perhaps there is a way to make Aldec give you a warning, or maybe the latest version of Aldec does it.

Well, hopefully that fixes your problem. I can't see any other reason for the 'X'.

Regards,
Vinh
Article: 60985
"Bob Feng" wrote:

> Your comments after your question do not make sense to me.

You are thinking computers. I'm thinking video.

Here's one possible scenario. You are required to overlay graphics (primitives: lines, circles, etc.) and text onto an incoming video feed, which is then output in the same format. The allowable input-to-output delay is on the order of just a few clocks --if that much at all-- not frames, not lines, a few clocks at best. Frame buffers are out of the question, of course. In addition to this, due to cost constraints, you are not allowed to have rendering memory for the graphics overlay. You, therefore, must render text and graphics on the fly, in real time, as the signal flows through.

Text and horizontal/vertical lines are pretty easy to deal with. You start getting into circles and rotated lines or polygons and it gets interesting real fast. In contrast to this, rendering these primitives to a frame buffer (in a "traditional" computer-type application) is a no-brainer.

> Usaually a frame buffer is where you render. And then it goes through a DAC
> and several video timing control units and reaches a CRT via a VGA or 13w3
> connector.

So, in the above context, nothing goes through a DAC. The video feed (assume analog) simply goes through some analog switches that, under FPGA control, select, on a pixel-by-pixel basis, from among the incoming video signal and a set of pre-established analog values (say white, to keep it simple).

Hope this example clarifies it for you.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"
Article: 60986
Peter Alfke <peter@xilinx.com> wrote in message news:<3F734E61.3300CE68@xilinx.com>...
> I remember the 70's well. :-)
> The TTL logic we used was slow by today's standards, with output delays
> of 25 ns and gentle rise and fall times.
> But the interconnect was fast, just wires, at 1 to 2 ns per foot.
>
> Now, in FPGAs, you have very fast logic, with extremely short transition
> times of <1ns, but the FPGA-internal interconnects are comparatively
> slow. It is now fairly normal to spend as much time travelling in the
> on-chip interconnects, as propagating through the logic. In the old
> TTL/MSI days, the logic and the flip-flops were very much slower than
> the interconnect.
>
> That means: The slow logic used to tolerate and swallow up interconnect
> delays, and the transitions were so slow that there were hardly any
> transmission-line effects.
> Now the logic is so fast and unforgiving, that the interconnect delays
> can no longer be ignored, and almost any pc-board trace must be treated
> as a transmission line.
>
> These "cultural differences" may bite you in your translation effort.
> So, beware of decoding glitches on the clock lines, of uncontrolled
> clock distribution delay differences, and hold time issues. And crank
> the chip-output drivers down to the weakest drive strength!
> You do not need speed; speed is your enemy, but the enemy is very much alive...
>
> Have fun!
> Peter Alfke
> ==========================

Hi all!

Thanks to everyone for the comments on my design. All those comments are scaring me, so please help me out in this situation. I'll try to explain the design a bit.

As I said, the design has 200+ I/Os, more than 80 of them inputs, and nearly 180 ICs. No information on these inputs is available apart from their connections in the schematic. There are around 24 flip-flops (D and JK type) and a few other ICs requiring clocks.

We have an on-board 20 MHz clock which clocks some of the FFs, while some are clocked from external port inputs and some from combinatorial clocks. The design is clearly asynchronous. Also, because no information about the I/Os is available, I can't trace the design functionality (the nets are too many). What I have done is model the ICs in Verilog where possible and instantiate them as per the schematic. A few observations:

1) Some FFs have their inputs permanently tied to 1 or 0. (OK, I feel?)
2) Some FFs share a common clock, but their inputs are combinational logic fed by other FFs on the same clock. (That's OK, I guess.)
3) Here there seems to be a problem: an FF is clocked by the local 20 MHz clock, but its input is combinational logic which itself derives its inputs from the outputs of FFs clocked by signals from 2 external ports. (Clearly asynchronous. So why did the designer use it? How could he have been sure at that time that there would be no asynchronous effect?)
4) An input from an I/O port passes through a buffer IC. This buffer's output is the input to a clocked IC, which is in turn clocked by combinatorial logic. Again asynchronous.

The card is authentic and from a reliable company. My questions:

Wasn't there any concern about asynchronous design at that time (1980), given that the designer used so many asynchronous techniques?

Is it possible that there is no REQUIREMENT for synchronism in this design (that is, the system interfacing to the I/Os of the card can handle it, which is something we have no knowledge of)?

Isn't asynchronous design sometimes a necessity, when synchronous design is neither possible nor feasible?

Also, why am I going to be in trouble? Is it because I am moving to an FPGA, or would there be the same problems even if I redesigned the card with the same components it uses now?

If the FPGA is the problem, then why do they claim that we can translate our obsolete components into an FPGA?

If a design is inherently asynchronous, should we forcibly synchronise it merely for the sake of the FPGA implementation?

Thanks to all who commented.
Rider
Article: 60987
Congratulations.. I think you have simulated metastability :-)

Dat doesn't need to be in your sensitivity list, as Dat changing won't affect the simulation result unless the clock changes.

What you are seeing is a setup time violation. What you need to do is sample the data only when it's valid or, if the data is truly asynchronous to the clock, chain 2 or 3 flip-flops to form a shift register. This doesn't cure the problem, but it will stop it occurring most of the time.

Then go and search the web and this newsgroup for metastability issues.

Simon

"Vakaras" <dainius@electronicsdesigns.net> wrote in message
news:1682a848.0309252247.6374df38@posting.google.com...
> Hello,
>
> I wroted a simple process to synchronyze Dat signal with a clock,
> and when a "Dat" pulse edge is very close to "Clock", timing
> simulation shows "X". And a simulation of the rest signals fails. How
> to solve it?
>
> My text is:
>
> A0: process(Clock)
> begin
>   if Clock'event and Clock='0' then
>     Dat2 <= Dat;
>   end if;
> end process A0;
>
> Result of timing diagram (Aldec 5.2) I put at:
>
> http://www.electronicsdesigns.net/img/timings.gif
>
> Thank you in advance for any suggestion.
Article: 60988
I use Xilinx ISE WebPACK 6.1 SP1.
In my project I tried to add constraints like:

NET "probes<0><0>" LOC = "D11" | PULLUP ;
NET "probes<0><1>" LOC = "D12" | PULLUP ;
NET "probes<0><2>" LOC = "C12" | PULLUP ;

These signals are all inputs.

The translate report says:

Attached a PULLUP primitive to pad net probes<0><2>
Attached a PULLUP primitive to pad net probes<0><1>
Attached a PULLUP primitive to pad net probes<0><0>

But in the place & route report there is no reference to pullups:

Resolved that IOB <probes<0><0>> must be placed at site D11.
Resolved that IOB <probes<0><1>> must be placed at site D12.
Resolved that IOB <probes<0><2>> must be placed at site C12.

Even the pad report does not mention a pullup resistor for those signals.

How can I be sure that the pullup resistors are present on those ports?

thanks
Article: 60989
Cutting thru the words... I will have to agree with Austin here.. as speeds go up and voltages go down, simulation is going to go from "well maybe" to "essential"...

Sure, your 12 MHz 8031 micro will never need to be simulated.. but your 800 MHz LVDS bus may not even work.. it's just a matter of getting relative, and you will find the same with your FPGA... fast edges and fast clocks will stir a hornets' nest .. just be prepared for them..

And you think this is bad .. wait for the Spartan 4 at .6 instead of .9 .. give it a bad look and it will jump off the board .. but it will probably run at a Gig and run on an oily rag

Simon

"rickman" <spamgoeshere4@yahoo.com> wrote in message
news:3F7388A0.A6A9568A@yahoo.com...
> I find your tone offensive Austin. I am simply trying to understand the
> issue being discussed by yourself as well as others. On my board there
> will be no traces that are near 12" much less 24" and the power supply

[snip]
Article: 60990
You've got quite a set of challenging constraints there. So you're saying an FPGA is cheaper than using a frame buffer and a micro-controller?

Another thing to think about: you'll need memory for each drawing primitive in your overlay graphics, which might be a problem if you're going to have a lot of primitives.

Unfortunately I've never tinkered with analog video signals, so I've got no real advice. By the way, how are you planning to sync to the incoming video signal?

Well, if you pull this off, it'll be quite impressive. Please let us know of your progress.

Regards,
Vinh
Article: 60991
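For the frame-buffer-less rendering discussed in this thread, one possible approach is to evaluate an implicit test for each primitive at the current (x, y) as the scan passes, rather than solving y for x and buffering the results. The Python sketch below is only a software model of that idea, not Martin's method: the function names, the half-pixel line width and the one-pixel ring tolerance are all assumptions chosen for illustration.

```python
# Software model of per-pixel primitive tests; all names are hypothetical.

def on_line(x, y, x0, y0, x1, y1, half_width=0.5):
    """True if pixel (x, y) lies within half_width of segment (x0,y0)-(x1,y1)."""
    dx, dy = x1 - x0, y1 - y0
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: treat it as a single point.
        return (x - x0) ** 2 + (y - y0) ** 2 <= half_width ** 2
    # cross = perpendicular distance * segment length (no square root needed)
    cross = (x - x0) * dy - (y - y0) * dx
    # dot locates the pixel's projection along the segment
    dot = (x - x0) * dx + (y - y0) * dy
    inside_ends = 0 <= dot <= length_sq
    return inside_ends and cross * cross <= half_width * half_width * length_sq


def on_circle(x, y, cx, cy, r):
    """True if pixel (x, y) is within roughly half a pixel of the circle outline."""
    d = (x - cx) ** 2 + (y - cy) ** 2 - r * r
    # |dist^2 - r^2| <= r is approximately |dist - r| <= 0.5
    return abs(d) <= r


def render_scanline(y, width, primitives):
    """Build one output line in scan order; each primitive is a test f(x, y)."""
    return "".join(
        "#" if any(test(x, y) for test in primitives) else "."
        for x in range(width)
    )


if __name__ == "__main__":
    prims = [
        lambda x, y: on_line(x, y, 0, 0, 15, 7),
        lambda x, y: on_circle(x, y, 8, 4, 3),
    ]
    for row in range(8):
        print(render_scanline(row, 16, prims))
```

Each test uses only multiplies, adds and compares on the current coordinates, so no per-line buffer is needed, but the parameters of every primitive still have to be stored somewhere, which is the memory cost Vinh points out.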
Hi,

I would like to know if anybody here has tested the Avnet Virtex-II Pro evaluation board with the XC2VP7 or XC2VP20 chip (board ref: ADS-XLX-V2PRO-DEVP7-5 and ADS-XLX-V2PRO-DEVP20-5):
http://www.silica.com/eval_kits/ads-20030515.html

I would like to know:

- Is it possible to install an OS like Linux or uClinux on this board? What about the Linux port from http://penguinppc.org/dev/kernel.shtml ? What about ucOSII (http://www.micrium.com)?

- Is there any bottleneck in accessing the SRAM from the chip through the PCI bridge? By the way, has anybody tried to put it in a PC and communicate with it? The card seems to be delivered with a Windows interface. What is this? How can it be used with a PC running Linux? Is it possible to program the FPGA through PCI (once in a PC), or is it mandatory to use the standard way?

- Is there a way to connect it to a display (LCD or video)?

More generally, about Virtex-II Pro: it seems that all the CoreConnect bus stuff has to be synthesized using FPGA resources? By contrast, it seems the Excalibur ARM solution provides a basic microsystem (a CPU with some peripherals) which preserves FPGA resources. Am I right? What's the complexity of the CoreConnect bus? How much room is left in the XC2VP7 or XC2VP20 chips for, say, a single PLB bus, an SDRAM/RAM controller, UART, timer and an interface to PCI (to the PC or PMC daughter cards)? Do I have to use two PLB buses if I use the XC2VP20 chip (with 2 PowerPCs)? I understand that it doesn't consume multipliers, but what about combinatorial/sequential logic for the bus stuff?

Thanks a lot for your future responses

Stéphane
Article: 60992
> congratulations.. I think you have simulated metastability :-)
> dat doesn't need to be in your sensitivity list as dat changing won't affect
> the simulation result unless clk changes.
>
> What you are seeing is a setup time violation.

Simon, I think if you look carefully at the timing diagram, it's clearly a hold-time violation. And what an awful flip-flop, to have such a long metastability resolution time. Perhaps an upgrade to Aldec 6.1 will provide better-performing flip-flop models. Just joking, of course :_)

Regards,
Vinh
Article: 60993
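Whether an edge like the one in this thread counts as a setup or a hold violation comes down to which side of the active clock edge the data changes on, and how close it is. A minimal Python sketch of that classification follows; the function name and the example timing numbers are hypothetical, and real limits would come from the device data sheet or the back-annotated timing model.

```python
# Hypothetical helper: classifies a data edge against a flip-flop's timing window.

def classify_violation(data_edge_ns, clock_edge_ns, t_setup_ns, t_hold_ns):
    """Return 'setup', 'hold' or 'ok' for one data edge near one clock edge."""
    # Negative delta: data changed before the active clock edge.
    delta = data_edge_ns - clock_edge_ns
    if -t_setup_ns < delta < 0:
        return "setup"  # data changed too late before the edge
    if 0 <= delta < t_hold_ns:
        return "hold"   # data changed too soon after (or at) the edge
    return "ok"


if __name__ == "__main__":
    # Example numbers only: 0.5 ns setup, 0.3 ns hold, clock edge at 10 ns.
    for t in (9.0, 9.8, 10.1, 11.0):
        print(t, classify_violation(t, 10.0, 0.5, 0.3))
```

A simulator's timing checks do essentially this comparison and drive the output to 'X' when the data edge falls inside the window, which is the behaviour Vakaras reported.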
On 26 Sep 2003 00:30:38 -0700, cialdi@firenze.net (Max) wrote:

>I use xilinx ise webpack 6.1 sp1.
>In may project I tried to add contrains like:
>
>NET "probes<0><0>" LOC = "D11" | PULLUP ;
>NET "probes<0><1>" LOC = "D12" | PULLUP ;
>NET "probes<0><2>" LOC = "C12" | PULLUP ;
>
>This signals are all input.
>
>In translate report is reported:
>
>Attached a PULLUP primitive to pad net probes<0><2>
>Attached a PULLUP primitive to pad net probes<0><1>
>Attached a PULLUP primitive to pad net probes<0><0>
>
>
>But in place&route report there is no reference to pullups:
>Resolved that IOB <probes<0><0>> must be placed at site D11.
>Resolved that IOB <probes<0><1>> must be placed at site D12.
>Resolved that IOB <probes<0><2>> must be placed at site C12.
>
>Even in pad report is not mentioned pullup resistor for that signals.
>
>How can I be sure about the presence of pullup resistors on that ports?

Didn't you just ask this question in comp.lang.vhdl?

There are a number of ways:

1. Get a better version of the software, which will allow you to use fpga_editor to view the configuration of the pin.
2. Use ngd2vhdl (or whatever) to generate a human-readable version of the chip contents. I *think* that this will include the pullup if it's there.
3. Download the bitstream into an FPGA and measure the electrical characteristics on a curve tracer, multimeter, or whatever test equipment comes to hand.

Allan.
Article: 60994
Only 24 FFs? Sounds like the board synchronizes incoming data+clock to one system clock. Maybe there is some metastability improvement technique incorporated.

Do the FF ICs have a clock enable? If not, that could be done by using combinatorial clocks.

Is there much feedback between all the FFs? If so, I guess there is some state machine in there (scary...); if not, then I'm sure it's relatively easy to reverse engineer the board.

What's the board's function, anyway? That would give you a good start.

Homann
--
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se
Article: 60995
Max wrote:
> I use xilinx ise webpack 6.1 sp1.
> In may project I tried to add contrains like:
>
> NET "probes<0><0>" LOC = "D11" | PULLUP ;
> NET "probes<0><1>" LOC = "D12" | PULLUP ;
> NET "probes<0><2>" LOC = "C12" | PULLUP ;
>
> This signals are all input.
>
> In translate report is reported:
>
> Attached a PULLUP primitive to pad net probes<0><2>
> Attached a PULLUP primitive to pad net probes<0><1>
> Attached a PULLUP primitive to pad net probes<0><0>
>
>
> But in place&route report there is no reference to pullups:
> Resolved that IOB <probes<0><0>> must be placed at site D11.
> Resolved that IOB <probes<0><1>> must be placed at site D12.
> Resolved that IOB <probes<0><2>> must be placed at site C12.
>
> Even in pad report is not mentioned pullup resistor for that signals.
>
> How can I be sure about the presence of pullup resistors on that ports?
>
> thanks

Hi,

In your constraint file (.ucf), add the following lines to add a pull-up:

################################################################################
## PULLUP DESCRIPTION (IF NEEDED)
################################################################################
NET your_netname PULLUP;

It works well!

Laurent
www.amontec.com
______________________________________________
Amontec provides new low cost solutions for
FPGA Download and Processor Debug
Article: 60997
Hello,

Has anybody tried the partial reconfiguration flow with the latest ISE version, 6.1? Are there any fundamental differences between the current and previous releases (e.g. the MODE = RECONFIG attribute)?

In the assemble phase, PAR aborts while guiding my design with the message:

abnormal program termination

(without any further information)

Thanks in advance.
Christian
Article: 60998
Hong,

Firstly, my apologies for the delay in replying. I have inserted my responses to your points below:

"Hong Shan Neoh" <hsneoh@netscape.net> wrote in message
news:2ff2f33d.0309111656.12ef44cd@posting.google.com...
> Ken,
> While the RSG solution may yield smaller designs for specific cases,
> the Altera FIR Compiler gives you more flexibility in terms of
> optimizing area vs.speed.

Agreed - RSG only produces full-parallel filters. If multiple clocks per output can be used (depending on data rates and available clock/power consumption requirements, of course), then I would be the first to say use Distributed Arithmetic (DA) in multi-cycle mode.

> For instance, the numbers presented in the RSG datasheet is based on a
> pipeline=2 setting for the Altera FIR Compiler. Using the FIR
> Compiler, the design yields an fmax of 322MHz (single rate, single
> channel). This is much higher than the 154MHz cited for the filter
> using the RSG approach. This is the classic speed/area trade-off
> scenario.

The figure of 154MHz is not in the datasheet you are referring to (available from http://www.dspec.org/rsg/downloads/datasheets/RSG_Overview.pdf) - I do not know where you got this number from. (I seem to remember it being in an old version, which did not include Altera results, but I removed it because the 154MHz was Xilinx-specific for a particular filter on a particular device and has no bearing at all on Altera devices/filter implementations.)

The test filters I did came in at various speeds between [244-283MHz] for RSG and [217-293MHz] for the Altera FIR Compiler (on exactly the same device and with the same constraints). This is why I used pipeline level 2 for FIR Compiler: pipeline level 1 made the FIR Compiler filters too slow, and pipeline level 3 made the FIR Compiler filters ridiculously large and not that much faster. Hence, pipeline level 2 does give a fair comparison with RSG.

For pipeline level 1, the FIR compiler clock rates were in most cases slower than for RSG and the FIR compiler implementations were also larger. The 8-channel singlerate FIR was the only one that was the same size but the RSG filter ran at 251MHz and FIR Comp at 206MHz. Cheers, Ken > > Regards, > HS > > Tero Rissa <tpr@doc.ic.ac.uk> wrote in message news:<bjhr0m$7f7$1@harrier.doc.ic.ac.uk>... > > I went around the irregularity issue by having sub-multiplier > > block architecture that has have fixed interface to the routing > > and have fixed (yet reasonable) area. Therefore, when the > > coefficients are changed, no place and route is required and > > the latency remains the same (unless you change the number taps). > > The generation of coefficients can done at reconfiguration time > > thanks to symmetry in the FPGA used (Atmel 40K40). Naturally, > > there is the problem of hassling with run-time reconfiguration > > and everything that comes with that... > > > > As part of this work we looked also into common subexpression > > sharing in that particular FPGA family and found it very unlikely > > that benefits could be obtained with similar multiplier-block > > architecture. This is mainly due the fact that it is different > > story to be able to generate the most useful common subexpressions > > that it is to really use them before the routing becomes congested. > > > > http://www.doc.ic.ac.uk/~tpr/papers/rissa_FPT02.pdf > > > > T.Rissa > > > > > > Ray Andraka <ray@andraka.com> wrote: > > > I agree the multiplier block style filters are more efficient area-wise. It > > > sounds like you have addressed the irregularity issues by using a program > > > to do the generation, which I think is pretty much a necessity. As I thought > > > I alluded to, the biggest problem with multiplier block filters is that the > > > layout/size is not a constant if you change the coefficients. 
This means that > > > the fiter coefficients have to be constant and known earlier in the design > > > cycle, and necessitates a rerun of synthesis, place and route for any filter > > > changes. Depending on the implementation, it may also mean a change in the > > > filter's pipeline latency. These factors can make them difficult to use on > > > some projects. The filters typically used in my projects often need to be > > > adjusted by the customer or late in the project to accommodate minor > > > requirements changes. I prefer to use a filter with reloadable coefficients > > > for that reason. > > > > > > > > > Ken wrote: > > > > >> Ray, > > >> > > >> I sent this to Michael via email and he suggested the group would be > > >> interested also... > > >> > > >> My PhD (now drawing to the end) has been on implementing full-parallel > > >> Transpose FIR filters using multiplier blocks that you mention (I use > > >> techniques/algorithms that exceed the efficiency of CSD in terms of FPGA > > >> area). > > >> > > >> The upshot of my work is that I have written a C++ program that will > > >> generate RTL VHDL given the quantised filter coefficients, the type of > > >> filter required (singlerate, interpolation, decimation etc.) and the > > >> appropriate parameters (input width, signed/unsigned input, number of > > >> channels, rate-change factor etc.) > > >> > > >> The VHDL my program generates exceeds the functionality (at a lower > > >> cost) of that provided by Xilinx's Distributed Arithmetic core and Altera's > > >> FIR Compiler (also DA). In fact, my program allows interpolation and > > >> decimation factors up to the number of filter coefficients and any number of > > >> data channels (for interpolation/decimation filters also). > > >> > > >> The main point is that, once synthesised and mapped to a specific FPGA, the > > >> filters my program generates require far less FPGA area (slices/logic cells) > > >> than those generated using Distributed Arithmetic. 
The critical path in my > > >> filters is just the longest adder carry chain so very high speeds are > > >> possible. E.g. 154MHz for a singlerate filter (25 bit output) in a Xilinx > > >> xc2v3000-fg676-5 - obviously the speed will depend on the device > > >> family/speed grade and the longest carry chain. The facility for multiple > > >> channels in interpolation/decimation filters (not supported by Xilinx) > > >> allows lower than full-parallel sampling rates to be efficiently processed > > >> in one filter. > > >> > > >> As Michael points out in his post, this technique would be very suitable for > > >> a > > >> Xilinx Spartan-IIE and indeed any FPGA - there are many cases where these > > >> filters would be useful even on devices with dedicated multipliers (when > > >> they are all in use for example! ;-) ). > > >> > > >> You can find out more at http://www.dspec.org/rsg.asp - there are also > > >> datasheets here that provide comparisons with Xilinx and Altera and > > >> demonstrate the output of another application (written in java) that > > >> generates schematic representations of the filters for use in reports, > > >> meetings and thesises! :-) > > >> > > >> I hope this information is of use to you - please contact me if you have any > > >> questions, > > >> > > >> Thanks for your time, > > >> > > >> Ken > > >> > > >> -- > > >> To reply by email, please remove the _MENOWANTSPAM from my email address. > > >> > > >> "Ray Andraka" <ray@andraka.com> wrote in message > > >> news:3F54F936.5E694FD1@andraka.com... > > >> > The problem with the multiplier block approach is that the > > >> > construction is predicated on the specific coefficients. As > > >> > a result it is considerably harder to use for an arbitrary > > >> > set of coefficients. It may reduce area over a straight FIR > > >> > filter running at the same clocks per sample, but at a > > >> > considerable cost in design time and flexibility. 
You also > > >> > give up regularity in the structure, which may reduce the > > >> > overall performance. Essentially what the block multiplier > > >> > and distributed arithmetic approaches are is a rearrangement > > >> > of the bitwise product terms. The mutliplier block takes > > >> > advantage of duplicate terms by adding the inputs before > > >> > they are multiplied by the term. > > >> > > > >> > Michael Spencer wrote: > > >> > > > >> > > Hello, > > >> > > > > >> > > Has anyone compared FPGA implementations of full-rate > > >> > > digital FIR filters based on the use of Multiplier Blocks > > >> > > vs. traditional FIRs with constant coefficient > > >> > > multipliers? By full rate, I mean: one output result per > > >> > > clock cycle and no interpolation or decimation. > > >> > > > > >> > > For anyone not familiar, a multiplier block is a network > > >> > > of shifters and adders that performs multiplications by > > >> > > several coefficients efficiently by exploiting common > > >> > > sub-expressions. The multiplier block can be exploited in > > >> > > FIR filters by transposing the standard filter so that the > > >> > > products of all the coefficients with the current > > >> > > input-sample are required simultaneously. > > >> > > > > >> > > Also, by representing the coefficients in the > > >> > > Canonical-Signed-Digit number system (a small number of > > >> > > +1 and -1's) along common sub-expression sharing the > > >> > > multiplier block can get even smaller. > > >> > > > > >> > > For example, the multiplier block for a 100 tap FIR filter > > >> > > (fp=0.10 and fs=0.12) can be realized with only 61 adds > > >> > > (zero explicit multiplications). 
See filter example #4 in > > >> > > "FIR Filter Synthesis Algorithms for Minimizing the Delay > > >> > > and the Number of Adders," > > >> > > http://ics.kaist.ac.kr/~dk/papers/TCAD2001.pdf > > >> > > If the adder depth is constrained to a maximum of four, > > >> > > then the authors' algorithm can do the multiplier block in > > >> > > 69 additions. > > >> > > > > >> > > It would seem that this approach would be very efficient > > >> > > in a target such as the Xilinx Spartan-IIE (with no > > >> > > dedicated multipliers). > > >> > > > > >> > > Another question: If we only need one result per K clock > > >> > > periods (K ~= 1000 for audio applications), could a > > >> > > multiplier block approach realized with, say, bit-serial > > >> > > addition be more efficient than some other approach such > > >> > > as distributed arithmetic? > > >> > > > > >> > > Comments welcome. Thanks. > > >> > > > > >> > > -Michael > > >> > > ______________________ > > >> > > Michael E. Spencer, Ph.D. > > >> > > President > > >> > > Signal Processing Solutions, Inc. > > >> > > Web: http://www.spsolutions.com > > >> > > > >> > -- > > >> > --Ray Andraka, P.E. > > >> > President, the Andraka Consulting Group, Inc. > > >> > 401/884-7930 Fax 401/884-7950 > > >> > email ray@andraka.com > > >> > http://www.andraka.com > > >> > > > >> > "They that give up essential liberty to obtain a little > > >> > temporary safety deserve neither liberty nor safety." > > >> > -Benjamin > > >> > Franklin, 1759 > > >> > > > >> > > > > > > -- > > > --Ray Andraka, P.E. > > > President, the Andraka Consulting Group, Inc. > > > 401/884-7930 Fax 401/884-7950 > > > email ray@andraka.com > > > http://www.andraka.com > > > > > "They that give up essential liberty to obtain a little > > > temporary safety deserve neither liberty nor safety." > > > -Benjamin Franklin, 1759Article: 60999
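To make the Canonical-Signed-Digit idea in Michael's original question concrete, here is a small Python sketch that recodes an integer coefficient into CSD digits and multiplies using only shifts, adds and subtracts. It covers a single coefficient only: the common sub-expression sharing across coefficients that a real multiplier block relies on is not modelled, and the function names are illustrative assumptions.

```python
# Per-coefficient CSD recoding; does not model sub-expression sharing.

def to_csd(n: int) -> list:
    """CSD digits of n, least significant first, each in {-1, 0, +1}."""
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> digit +1, n mod 4 == 3 -> digit -1
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def from_csd(digits: list) -> int:
    """Inverse of to_csd: rebuild the integer from its signed digits."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def adder_count(n: int) -> int:
    """Adders/subtractors for a lone shift-and-add multiply by constant n."""
    nonzero = sum(1 for d in to_csd(n) if d != 0)
    return max(nonzero - 1, 0)


def multiply_csd(x: int, n: int) -> int:
    """Compute x * n using only shifts and adds/subtracts, as hardware would."""
    acc = 0
    for i, d in enumerate(to_csd(n)):
        if d == 1:
            acc += x << i
        elif d == -1:
            acc -= x << i
    return acc


if __name__ == "__main__":
    for coeff in (7, 23, 100):
        print(coeff, to_csd(coeff), adder_count(coeff))
```

For example, 7 recodes to 8 - 1, so a multiply by 7 needs one subtractor instead of the two adders that plain binary (4 + 2 + 1) would use. Sharing sub-expressions between coefficients is what would push the total down further, towards figures like the 61 additions quoted above.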
Is there a human-readable file of the XPower analysis (something in *.txt format)? I have only seen the *.ncd file.