maxedman3503@yahoo.com (Max Edmand) wrote in message news:<3a30996f.0203291324.18cb9ba2@posting.google.com>...
> Hello all,
>
> I'm trying to design a block to perform correlation
> on two vectors: A and B. (A and B each have two elements:
> A=(a1, a2) and B=(b1, b2).)
> The correlation between A and B should be calculated like
> this:
>
> corr = [(a1 * b1) + (a2 * b2)] / [sqrt(a1^2 + a2^2) * sqrt(b1^2 + b2^2)]

I think the CORDIC algorithm can be useful here. Since sqrt(a^2 + b^2) is the magnitude of the vector [a, b], you could use the CORDIC algorithm to compute the vector magnitude and save a lot of resources. Check Mr. Andraka's website for more information: http://www.fpga-guru.com/cordic.htm

/ Jonas Thor
thor(at)sm.luth.se (replace (at) with @)

Article: 41651
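As an illustration of the CORDIC suggestion in Jonas Thor's reply to the correlation question, here is a minimal Python sketch of CORDIC in vectoring mode, which finds a vector magnitude using only shifts and adds. The function names, the 16 fractional bits, and the 16 iterations are assumptions chosen for the example and are not taken from the thread; a real FPGA implementation would fold the constant gain correction into a later stage.

```python
import math

FRAC_BITS = 16  # assumed fixed-point precision for this sketch


def cordic_magnitude(a, b, iterations=16):
    """Approximate sqrt(a^2 + b^2) for integers a, b with shift-and-add CORDIC."""
    # Fold the vector into the first quadrant; the magnitude is unchanged.
    x = abs(a) << FRAC_BITS
    y = abs(b) << FRAC_BITS
    gain = 1.0
    for i in range(iterations):
        # Rotate the vector toward the x axis by atan(2^-i).
        if y >= 0:
            x, y = x + (y >> i), y - (x >> i)
        else:
            x, y = x - (y >> i), y + (x >> i)
        # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i).
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # x now holds gain * magnitude in fixed point; undo both scalings.
    return x / (gain * (1 << FRAC_BITS))


def correlation(a, b):
    """corr = (a1*b1 + a2*b2) / (|A| * |B|), with |.| computed by CORDIC."""
    mag_a = cordic_magnitude(a[0], a[1])
    mag_b = cordic_magnitude(b[0], b[1])
    if mag_a == 0 or mag_b == 0:
        raise ValueError("correlation is undefined for a zero vector")
    return (a[0] * b[0] + a[1] * b[1]) / (mag_a * mag_b)
```

The final division is done in floating point here only for brevity; in hardware it would still need its own divider or a reciprocal approximation.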
Hi,

What follows below is my personal opinion. I think Viewdraw by Viewlogic (Innoveda) is the best schematic entry tool I have ever used. I've used Viewdraw 6.1 on Solaris and also Workview Office 7.5 on Windows NT. This tool does have a bit of a learning curve. After using it for a while, though, it now seems natural, and the keyboard shortcuts and command line really let you fly through your work...

Hope that helps,
Eric

Christopher Saunter wrote:
>
> Greetings All,
>
> I have always found it more natural to work with schematics than an HDL
> (although learning VHDL is proving very useful in some areas...)
>
> I have used the Aldec schematic capture program from Xilinx Foundation 3/4
> and ECS from Webpack.
>
> Thus far, I am somewhat underwhelmed by these tools - I have always felt
> that a good tool (e.g. text editor, IDE etc.) should allow you to work about
> as fast as you can enter data, and this is just not the case with the
> schematic capture tools I have used.
>
> So my question is: Does anyone know of a powerful, flexible schematic
> editor with decent (preferably configurable) key bindings, rock-like
> stability, a nice user interface, highly intuitive, that is fast and a
> pleasure to use etc?
>
> One that uses an HDL description of each schematic behind the scenes, ECS
> style, is probably a plus.
>
> Or should I just be grateful I'm not directly entering netlists... ;-)
>
> Cheers,
> Chris Saunter

Article: 41652
The way you describe it, you have crosstalk between (almost) adjacent pins. That is not too uncommon, and depends on the output impedance of the affected pin. I suggest you check this by testing the output with an external pull-down resistor of 100 Ohm to 1 kilohm.

If the output impedance is around 20 Ohm, you should not have this crosstalk. Assuming at most 5 pF of coupling capacitance, the time constant would be 100 ps, invisible on most 'scopes. If you find the output impedance to be 20 kilohm or more, then I am not surprised about the crosstalk, since the coupling time constant could be 0.1 microsecond.

Moral: A pin with no active or low-impedance pull-up is easily affected by adjacent signals. This has nothing to do with FPGA technology; it's a package issue, and is usually irrelevant. Who cares about a signal that has no driver attached to it?

Peter Alfke
======================================
Frank Zampa wrote:
>
> The ripple I saw is on the positive logic level, but I can't say
> that it is "only" on the positive level, because the duration of the low
> level of the signal with the ripple is shorter than the period of the
> ripple, so I can't see the 0.8 V PP on the low level.
> I think that the signal is not LVTTL because the Spartan I use does
> not use those levels. On the signal without pollution the high logic
> level is about 4.1-4.3 Volts.
>
> Frank.

Article: 41653
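The two time constants in Peter Alfke's crosstalk reply follow from tau = R * C. A short Python check of that arithmetic, using the 5 pF, 20 Ohm and 20 kilohm figures from his post (the function name is just for illustration):

```python
def rc_time_constant(r_ohms, c_farads):
    """Time constant of a simple RC coupling path, in seconds."""
    return r_ohms * c_farads


COUPLING_CAP = 5e-12  # 5 pF, the upper bound assumed in the post

tau_driven = rc_time_constant(20, COUPLING_CAP)       # low-impedance output
tau_floating = rc_time_constant(20e3, COUPLING_CAP)   # weakly held pin

print(f"20 Ohm:     {tau_driven * 1e12:.0f} ps")
print(f"20 kilohm:  {tau_floating * 1e9:.0f} ns")
```

The factor of 1000 in impedance carries straight through to the time constant, which is why the weakly held pin shows coupling that a driven pin does not.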
JB wrote:
> I guess that implementing a monostable multivibrator using a Xilinx FPGA
> should be pretty common.
>
> Maybe somebody can provide me with a hint or an example?

You could use a shifter, an edge detector, and a counter.

-- Mike Treseler

Article: 41654
Peter Alfke wrote:
>
> "Cyrille de Brébisson" wrote:
>
> > In our design we are using an ARM CPU. My question is:
> > Can we put an ARM in the Virtex-II Pro?
> > Where can I find/buy an ARM CPU core source (or precompiled) file to program
> > in my FPGA?
> >
>
> Cyrille,
> the answer to both your questions is: No.
> The PowerPC in Virtex-II Pro is a "hard" implementation, packing the
> microprocessor with its caches and MMU into the smallest possible silicon
> area, <4 square millimeters.
> What you seem to be looking for is a "soft" implementation, using the
> programmable logic "fabric".
> That solution is impractical for something as complex as PowerPC or even ARM.
> It would take up an unreasonable portion of a large chip, and achieve mediocre
> performance at best.
> Xilinx offers a soft microprocessor, called MicroBlaze, especially tuned for
> efficient implementation in the Virtex architecture. It is not as fast and
> capable as PowerPC, but uses only ~900 slices.
> "Half the size and twice the speed of NIOS" is the Xilinx slogan. Please, no
> flames...
>
> Peter Alfke, Xilinx Applications

You can definitely put an ARM in an FPGA. On the last project I worked on, I did an ASIC prototype of an SoC with an ARM7-TDMI-S in a Virtex-E. Right now I'm working on something similar, but in a Virtex-II, so that I can hopefully get more of the clock gating in the design working in the prototype.

Size and performance will not be like a hard implementation, but for a prototype that doesn't really matter as long as the performance is enough and the design fits a chip you can buy. And if you need to, there are things that could be changed to better fit an FPGA, so performance could be increased, but for a prototype you don't want to do that unless you have to.

But anyway, buying the source code for an ARM will probably cost you an arm ;) and a leg,

-Lasse
-- Lasse Langwadt Christensen,
-- Aalborg, Danmark

Article: 41655
"Peter Alfke" <peter.alfke@xilinx.com> wrote in message news:3CAB7A34.F16B1AC4@xilinx.com...
> No surprise, and an excellent argument for on-chip microprocessors running out of on-chip caches
> and BlockRAM, and having good connectivity to the FPGA fabric.
> Let me stop here, before I get into my Virtex-II Pro with PowerPC pitch... :-)

Peter Alfke, always on duty!
SCNR ;-)

--
Regards,
Falk

Article: 41656
In article <E30r8.139983$u77.31687100@news02.optonline.net>, jbonill1@optonline.net says...
> I am a newcomer to the FPGA business.
>
> I guess that implementing a monostable multivibrator using a Xilinx FPGA
> should be pretty common.

Do you have a clock? What parameters? Do you want retriggerable or non-retriggerable? Level-sensitive trigger? Edge?

>
> Maybe somebody can provide me with a hint or an example?

Hint 1: don't think about doing this with a resistor/capacitor or, Deity forbid, a chain of inverters for timing.

Hint 2: If you have a clock with a period less than your time delay, think down counter. Set the counter on whatever trigger you want. Block the triggers you don't want. Flip on trigger, flop on count = 0.

----
Keith

Article: 41657
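Keith's "Hint 2" (load a down counter on the trigger, block further triggers, drop the output when the count reaches zero) can be sketched as a clock-by-clock Python model. This is a behavioral illustration of a non-retriggerable, rising-edge-triggered one-shot, not HDL from the thread; the function name and the choice of edge triggering are assumptions.

```python
def one_shot(trigger_samples, pulse_cycles):
    """Model a clocked, non-retriggerable monostable.

    trigger_samples: trigger input sampled once per clock (0 or 1).
    pulse_cycles:    output pulse width in clock periods.
    Returns the output value after each clock edge.
    """
    count = 0       # down counter; output is high while count > 0
    previous = 0    # last trigger sample, used for edge detection
    output = []
    for sample in trigger_samples:
        rising_edge = sample == 1 and previous == 0
        if count > 0:
            count -= 1              # pulse in progress: triggers are blocked
        elif rising_edge:
            count = pulse_cycles    # "flip" on trigger: load the counter
        previous = sample
        output.append(1 if count > 0 else 0)   # "flop" when count hits 0
    return output


# A second trigger arriving during the pulse is ignored.
print(one_shot([0, 1, 0, 1, 0, 0, 0, 0], pulse_cycles=3))
```

A retriggerable variant would reload the counter on every rising edge instead of only when the count is zero.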
"Austin Lesea" <austin.lesea@xilinx.com> wrote in message news:3CAC80DE.40E4D721@xilinx.com...
> If 405ppc are everywhere, you may dedicate them to tasks that seem horribly
> inefficient if you continue to think in terms of the one big expensive monster
> processor.

110% acknowledge!!!!!!!

This "one big CPU for all tasks" is the ancient approach of those Intel guys. I remember a day, not too long ago, when Intel saw the future of the personal computer as just a big RAM and a CPU, doing everything in software. :-0

Hey guys, see those graphics controllers nowadays? See how many transistors they have? See how many OPS they do? Yes? So go home and cry. ;-)

--
Regards,
Falk

Article: 41658
Additional view from a computer architect type:

In article <3CAC80DE.40E4D721@xilinx.com>, Austin Lesea <austin.lesea@xilinx.com> wrote:
>> 1) How come there isn't a dedicated DDR interface on the chip. I've never
>> seen a PPC application that didn't require DRAM, a dedicated interface
>> would be cheaper and higher performing than using valuable CLBs to build
>> a soft interface. (If I'm mistaken about the lack of a dedicated DDR
>> interface please let me know, I didn't see any mention of one when I read
>> the spec).
>
>DDR is built out of the DDR FF in the IOB's and logic in the FPGA. DDR isn't
>the only standard, and customers have many other applications. DDR is neat,
>but too specific.

There is also a design philosophy (which I can agree with for some uses, but not for others here) that only the minimally useful set should be implemented, because that is the cheapest and usable by the most people.

A dedicated DDR SDRAM interface would be very nice, but it would consume a couple of mm^2 of silicon, which is only usable by those who are going to plunk down a DDR interface, on a specific set of pins.

>> 2) I don't see the need for putting four processors on a die. In almost
>> all cases a single 405 should be adequate, in a few case you could make
>> good use of two but I don't think that you would ever need four. There
>> should have been a wider choice of parts with a single 405 core.
>
>We just don't know how customers will use all of this power. If 405ppc's are
>'free', you can use one executing out of internal cache to handle the "error
>404", and another running off internal cache to monitor QOS, etc.
>
>When electric motors were very expensive, a machine shop had one, and leather
>belts to every tool station. When fractional horsepower motors became
>inexpensive and ubiquitous, they were used everywhere, with no thought.
>
>If 405ppc are everywhere, you may dedicate them to tasks that seem horribly
>inefficient if you continue to think in terms of the one big expensive monster
>processor.

And processors these days, for a simple core, are INCREDIBLY cheap, especially this one:

It has no memory (those are the BlockRAMs), only the register file, datapath, and control logic.

Even in synthesis, discounting the register file and caches, a 5-stage SPARC uP core takes 1.3 mm x 0.85 mm in a 0.18 um process. The caches, built out of four 1024x32b memories, are almost as big as the core itself! http://www.eecg.toronto.edu/~pagiamt/research/leon.html

So in the area of about 8-10 Virtex-II BlockRAMs (1024x18b memories), you can fit a SYNTHESIZED SPARC core (without a hardware multiplier/divider or MMU). I suspect that the Virtex-II Pro PPC core is even smaller, with most of the actual area being the interfacing of the core to everything else.

I'd love to get my hands on an XC2VP4 or larger die or die photo, just to verify these hunches about area in more detail.

But according to the datasheet, the XC2VP2 uses 4 columns, 4 high, of BlockRAMs, with the top and bottom of the center columns replaced with the RocketIO transceivers, so a pitch of 4 CLB slices/BlockRAM.

The XC2VP4 uses 4 columns, 10 high (it's a 40x22 instead of a 16x22 array) and has 28 BlockRAMs, so 8 BlockRAMs are replaced for the PPC core, along with 128 CLBs (500 slices) of logic. This is pretty CHEAP!

If you have a function that is not very time critical (e.g., one with a fair path length that doesn't need to be pipelined every cycle), and you can replace just 128 CLBs with the use of the processor core, you've won, big time. So my assumption here is that the 8 BlockRAMs of area are replaced with the uP core, with the rest going to a heck of a lot of interface logic.

>> 4) On chip Flash RAM would be useful. An embedded PPC is going to require
>> some Flash. Also it would be nice if the serial Flash RAM were on chip,
>> I bet every one is sick of the extra part that most Xilinx designs
>> require.
>
>Flash requires a process that is usually two years behind the leading
>process. To do a flash capable FPGA would be to be obsolete on day 1 of the
>introduction. Not very exciting.

The only way I could conceive of there being Flash on the die is some fancy packaging, e.g., a chip-up smaller flash chip bonded to internal pads on a chip-down larger part. And do you REALLY want to spend an extra $20 just to reduce your part count from 2 to 1, and save 16-30 external pins?

>> 6) This is a Virtex II issue, not just a Virtex II Pro issue. How about
>> offering versions of the Virtex II without the on board multipliers. The
>> multipliers make sense for DSP applications but they are a waste of money
>> and power for everything else. In my 12 years doing Xilinx designs I have
>> never needed a multiplier. I've frequently needed a CAM so I wouldn't
>> mind a few CAMs on board, but I'd rather have a cheaper part without the
>> multipliers.
>
>Well, they take up a tiny amount of area, so the cost savings is washed out
>completely by having to make two parts, with lower volumes in each.

And, as Ray Andraka has pointed out, a multiplier makes a great shifter as well. A variable shift is surprisingly expensive in an FPGA fabric: there are a lot of muxes, yet it is an operation that is surprisingly common.

An 18x18 multiplier can implement an 18-bit variable rotation with just 18 LUTs' worth of logic to decode the shift amount, and an additional 18 LUTs' worth of logic if you want to make it a left shift/rotate, or an additional 36 LUTs' worth if you want to make a variable left/right shift.

The multiplier blocks are an example of something which IS very common.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 41659
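The multiplier-as-shifter point in Nicholas Weaver's post rests on the identity that shifting left by n is multiplying by 2^n, so the shift amount only needs to be decoded to a one-hot value and fed to the multiplier. The Python sketch below models that idea for an 18-bit word; it treats the operands as unsigned and ignores the signed-operand details of a real 18x18 block, and the way the rotate is assembled (OR-ing the two halves of the product) is an assumption about how the extra LUTs would be used.

```python
WIDTH = 18
MASK = (1 << WIDTH) - 1


def decode_shift(n):
    """Decode a shift amount into the one-hot multiplier operand 2**n."""
    if not 0 <= n < WIDTH:
        raise ValueError("shift amount must be in 0..WIDTH-1")
    return 1 << n


def shift_left(value, n):
    """Variable left shift: keep the low WIDTH bits of value * 2**n."""
    product = (value & MASK) * decode_shift(n)
    return product & MASK


def rotate_left(value, n):
    """Variable left rotate: OR the high half of the product into the low half."""
    product = (value & MASK) * decode_shift(n)
    low = product & MASK       # bits that stayed inside the word
    high = product >> WIDTH    # bits that were shifted out of the top
    return low | high


print(bin(rotate_left(0b100000000000000001, 1)))  # top bit wraps to bit 0
```

The two halves never overlap, because the low half has zeros in its bottom n bits and the high half occupies only those n bits, which is why a plain OR is enough.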
In article <3CAC9AC3.C181A79D@ieee.org>, Lasse Langwadt Christensen <langwadt@ieee.org> wrote: >But anyways, buying the source code for an ARM will probably cost you an >arm ;) and a leg, While SPARC is free. :) http://www.gaisler.com/leon.html -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 41660
In article <a8i7t9$c1b$1@newsreader.mailgate.org>, Kevin Brace <ihatespam84kevinbraceusenet@ihatespam84hotmail.com> wrote:
> I have seen one Xilinx employee in this newsgroup saying that
>automatic P&R is getting better, so low level tools like floorplanner or
>FPGA Editor are getting less important.

Every time Xilinx/Altera/tool people say this, I have to laugh.

There is so much low-hanging fruit in datapath recognition, which the tools fail MISERABLY to recognise. A simple first-order pass that aligns the datapath can be such a win.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 41661
In article <a8i8kh$sh28a$2@ID-84877.news.dfncis.de>, Falk Brunner <Falk.Brunner@gmx.de> wrote:
>110% acknowledge!!!!!!!
>
>This "one big CPU for all tasks" is the ancient approach of those Intel guys.
>I remember a day, not too long ago, when Intel saw the future of the
>personal computer as just a big RAM and a CPU, doing everything in
>software. :-0
>Hey guys, see those graphics controllers nowadays? See how many transistors
>they have? See how many OPS they do?
>Yes?

Pfah. Big bloated pieces of silicon. :)

It has ALWAYS been the case that several small processors are more "efficient" than one big processor, and it has always been a matter of programmability.

A classic example is the Intel IXP1200 network processor: it consists of a single ARM core and 6 small RISC-like cores, with context switch on event (memory miss). A really powerful architecture if you can program it, and small too. Excluding the numerous interfaces (SDRAM, PCI, IXP bus, etc.), it ends up being in the ~$10 silicon range.

There is a lot of space still left for architectures with such performance that are also easier to program. Remember, an 8x8 mm die, in a wafer-level package, can buy you >200 pins [1] and 10+ 32b Gops/second, in the sub-$10/chip range. [2]

[1] Albeit at a 0.5 mm pitch. Then again, 200 pins, any other way, is going to easily add another $4-5 to the chip cost. So it is a tradeoff: higher board cost, lower part cost and area.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 41662
Lasse Langwadt Christensen wrote:
>
> You can definitely put an ARM in an FPGA. On the last project I worked on, I
> did an ASIC prototype of an SoC with an ARM7-TDMI-S in a Virtex-E. Right now I'm
> working on something similar, but in a Virtex-II, so that I can hopefully get
> more of the clock gating in the design working in the prototype.

Clock gating for an ASIC design can be automatically converted to enables in Certify with no source code changes. This covers flops, latches, and memories (inferred or instantiated).

Ken McElvain
CTO
Synplicity, Inc.

>
> Size and performance will not be like a hard implementation, but for a
> prototype that doesn't really matter as long as the performance is enough and
> the design fits a chip you can buy. And if you need to, there are things that
> could be changed to better fit an FPGA, so performance could be increased, but
> for a prototype you don't want to do that unless you have to.
>
> But anyway, buying the source code for an ARM will probably cost you an
> arm ;) and a leg,
>
> -Lasse
> -- Lasse Langwadt Christensen,
> -- Aalborg, Danmark
>

Article: 41663
Austin answered the specific questions very well. Please allow me to add some philosophical comments:

We are in the business of providing programmable solutions, but there is always a temptation to add dedicated circuitry because it is smaller and faster and may consume less power. We have to make agonizing choices, because any specialization detracts from the universality, and any one of the special circuits we add burdens each chip and must be paid for by every user, while it may help only certain users or applications.

Over the years we have added global clocks, carry logic, BlockRAM, clock management, lots of I/O standards, on-chip termination resistors, multipliers, triple-DES decryption, and now also PowerPC and 3-gigabit SerDes dedicated circuitry. Every one of these additions was made after carefully evaluating the trade-off between the dedicated area (cost) and general usefulness. And we are happy with our choices. There is a long list of potential candidates that were rejected (I was in favor of adding a dedicated PCI interface to the XC4000, which luckily was rejected).

Some of our competitors have populated a graveyard (or at least a retirement community) of commercially unsuccessful attempts to add excessive or poorly executed specialization to programmable logic, and IMHO Excalibur with its glued-on ARM and Mercury with its limited-speed, incomplete dedicated clock recovery may be headed in the same direction. Whenever you add something costly, you should do it right, and not leave the job half completed!

Xilinx is obviously also adding dedicated circuitry, but only after very careful consideration of the technical and economic trade-offs. And it looks like we have been right in our choices so far. But keep the suggestions coming. We are listening!

Peter Alfke

Article: 41664
I have seen one Xilinx employee in this newsgroup saying that automatic P&R is getting better, so low level tools like Floorplanner or FPGA Editor are getting less important. That can be true to some extent, but automatic P&R is still so bad that, when I have to reduce the setup time (Tsu) of my PCI IP core, I still have to rely on Floorplanner.

In theory, I can route my design many times to improve the timing, but typically the improvement seems to end after the 10th routing pass, and after that things don't improve at all. What I discovered through wasting lots of time routing my design multiple times is that the problem with Xilinx's or Altera's P&R tool is that it doesn't place the timing-critical LUTs and FFs in the right place, or keep related LUTs and FFs within a CLB (in Xilinx) or a LAB (in Altera). Because the timing-critical LUTs and FFs are placed physically so far away from the destination (typically FFs), routing the design multiple times just won't save it, because the path will inevitably have a greater routing delay. That's when the designer has to force the placement to a certain location by using a floorplanner.

If you are using ISE WebPACK, click on "View Floorplanner" after you P&R your design. Use the UCF flow if you are using Floorplanner for the first time. You should download the Xilinx Floorplanner manual before trying it, but it only explains how the thing works, and it doesn't include anything like a tutorial. There is no tutorial available from Xilinx (I asked such a question several months ago, but no one gave me a reply. It turns out Xilinx doesn't really have such a tutorial.), but if you are going to use Floorplanner and target Virtex architecture FPGAs, including Spartan-II, keep all related LUTs within a CLB, because the routing delay within the CLB is small. Getting out of a CLB costs a lot in terms of routing delay, but the delay to a horizontally adjacent CLB is still fairly small. Another obvious piece of advice is to keep signal path distances to a minimum, because the greater the distance, the greater the routing delay.

Also, weren't you looking for a low cost PCI card? Insight Electronics recently released an upgraded version of the Spartan-II PCI card. The new one is a little more expensive ($225) than the older one, but it has a bigger chip and more stuff on the card.

http://www.insight.na.memec.com/cgi-bin/bvutf8/memec/scripts/local/mc_loc_b.jsp?Div=INSIGHT&Reg=AMERICAS&Country=UNITED_STATES&Lang=EN&EDOID=187428

Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.)

Jimmy Zhang wrote:
>
> Just keep hearing about this hand placement thing, don't know how it
> is done in reality. Does someone actually use their hands to do the
> placement as opposed to CAD based P&R. Any hints?

Article: 41665
The placer starts with a random placement. Then it takes the nets (usually in alphabetical order) and, for each one, estimates the wire distance to all the pins it is connected to. This is the "cost" of each net. Then it takes two components (LUTs or flops) and swaps them. If the cost is lower it keeps the swap; otherwise it doesn't. The placement can be really improved just by placing a few LUTs or flops by hand. The placed components act like "attractors" for the rest of the components connected to them.

I had a chance to look at the old ppr code. I was able to speed up the cost function by 9.8x by putting the function in hardware.

Steve

"Nicholas Weaver" <nweaver@CSUA.Berkeley.EDU> wrote in message news:a8i8mh$n57$1@agate.berkeley.edu...
> In article <a8i7t9$c1b$1@newsreader.mailgate.org>,
> Kevin Brace <ihatespam84kevinbraceusenet@ihatespam84hotmail.com> wrote:
> > I have seen one Xilinx employee in this newsgroup saying that
> >automatic P&R is getting better, so low level tools like floorplanner or
> >FPGA Editor are getting less important.
>
> Every time Xilinx/Altera/tool people say this, I have to laugh.
>
> There is so much low-hanging fruit in datapath recognition, which the
> tools fail MISERABLY to recognise. A simple first-order pass that aligns
> the datapath can be such a win.
> --
> Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 41666
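The swap-and-keep-if-better loop that Steve describes for the placer can be sketched in a few lines of Python. This is only an illustration of the idea, not the ppr algorithm: the half-perimeter bounding box used as the wire-distance estimate, the data layout, and every name below are assumptions. The `fixed` argument models hand-placed components, which never move and so act as the "attractors" he mentions.

```python
import random


def net_cost(net, position):
    """Estimate one net's wire length as the half-perimeter of its bounding box."""
    xs = [position[component][0] for component in net]
    ys = [position[component][1] for component in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_cost(nets, position):
    """Sum of the estimated wire lengths of all nets."""
    return sum(net_cost(net, position) for net in nets)


def improve_by_swaps(nets, position, attempts=1000, fixed=(), seed=0):
    """Swap random pairs of movable components, keeping only swaps that lower the cost."""
    rng = random.Random(seed)
    movable = [component for component in position if component not in fixed]
    cost = total_cost(nets, position)
    for _ in range(attempts):
        a, b = rng.sample(movable, 2)
        position[a], position[b] = position[b], position[a]
        new_cost = total_cost(nets, position)
        if new_cost < cost:
            cost = new_cost                                       # keep the swap
        else:
            position[a], position[b] = position[b], position[a]   # undo it
    return cost
```

Recomputing the total cost after every trial swap is the expensive inner step of this loop, which gives some feel for why the cost function was the part worth moving into hardware.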
Nicholas,

Just one minor point: the 405ppc has its own caches (16K for data, and 16K for instructions), so you can execute quite a bit right out of those without ever using a BRAM.

Austin

Nicholas Weaver wrote:
> And processors these days, for a simple core, are INCREDIBLY cheap,
> especially this one:
>
> It has no memory (those are the BlockRAMs), only the register file,
> datapath, and control logic.

Article: 41667
I have to disagree that a part with dedicated pins is a net loss for Xilinx. For example, my patent http://www.delphion.com/details?pn=US06178494__ suggests that it might be useful to have a part that can be inserted into a pre-existing socket. For example, if there were a part that fit into the second slot of a Pentium system, there is a good chance you could sell millions and millions of them.

Steve

"Peter Alfke" <peter.alfke@xilinx.com> wrote in message news:3CACAFC2.140DBBFD@xilinx.com...
> Austin answered the specific questions very well.
> Please allow me to add some philosophical comments:
>
> We are in the business of providing programmable solutions, but there is always
> a temptation to add dedicated circuitry because it is smaller and faster and may
> consume less power. We have to make agonizing choices, because any
> specialization detracts from the universality, and any one of the special
> circuits we add burdens each chip and must be paid for by every user, while it
> may help only certain users or applications.
>

Article: 41668
In article <3CACC517.AC390FDE@xilinx.com>, Austin Lesea <austin.lesea@xilinx.com> wrote: >Nicholas, > >Just one minor point: the 405ppc has its own caches (16K for data, and 16K for >instructions) so you can execute quite a bit right out of that without ever using a >BRAM. OK. That makes even more sense (i shoulda noticed something was wrong), because otherwise it would take a HELL of a lot of interface logic to occupy 128 CLBs worth of logic. In any case, the assertion is: A uP is small. Including a fair number of them in a large FPGA is rather low cost. -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 41669
I argued ten years ago, and I am still convinced: The human brain is better than any computer in recognizing the underlying structure ( and thus drive some basic hand placement). But a computer is much better at the tedious job of routing. That's why routers have become very good, but the placer is still the problem child. And a bad placement is very difficult to remedy later. Peter AlfkeArticle: 41670
In article <k24r8.1447$Jl4.914143265@newssvr13.news.prodigy.com>, Steve Casselman <sc.nospam@vcc.com> wrote:
>I have to disagree that a part with dedicated pins is a net loss for Xilinx.
>For example my patent http://www.delphion.com/details?pn=US06178494__
>suggests that it might be useful to have a part that can be inserted into a
>pre-existing socket. For example if there were a part that fit into the
>second slot of a Pentium system there is a good chance you could sell
>millions and millions of them.

However, the only consistent dedicated pins NEEDED are power and ground. Otherwise, thanks to the joys of reconfiguration, you can match the interface as long as the reconfigurable logic is fast enough.

Also, any dedicated circuitry is much harder to test, as it adds irregularities which need to be tested.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 41671
Steve Casselman wrote:
> I had a chance to look at the old
> ppr code. I was able to speed up the cost function by 9.8x by putting the
> function in hardware.

Sounds interesting. What did you do?

Article: 41672
I think you're in the ballpark at 80MHz. Why don't you just design the circuit and let Max-Plus/Quartus tell you the answer to your query? However, the low density/high speed design you are describing might be better suited to a CPLD architecture.

Regards,
Jay

"Sławomir Balon" <antyspam.bsl@post.pl> wrote in message news:<a8gvvm$8q8$1@news.tpi.pl>...
> >It depends on what you're trying to do with your clock, need to supply
> >more detail...
>
> OK, I'm planning to use it for acquiring data from two 8-bit flash ADCs
> (AD9057), both clocked at 80MHz but with the clocks shifted by 180deg in phase
> (effectively 160MHz). Will an APEX -3 be fast enough to work with, or should I
> use a -2 device? (This data will be stored in fast 16-bit SRAM.)
>
> regards
> Slawek

Article: 41673
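The numbers in the question above can be put into a rough back-of-envelope sketch. This is only arithmetic on the figures quoted in the post (two 8-bit ADCs, 80MHz each, 180 degrees apart, 16-bit SRAM); the function names and the choice of which ADC lands in which byte are assumptions made for illustration, and nothing here says whether a given APEX speed grade meets timing. Only the fitter's timing report can answer that.

```python
# Back-of-envelope figures for two interleaved 8-bit ADCs feeding a 16-bit SRAM.
# All names are illustrative; the input numbers come from the post above.

ADC_BITS = 8
ADC_CLOCK_HZ = 80e6
NUM_ADCS = 2
SRAM_WIDTH_BITS = 16


def aggregate_sample_rate(adc_clock_hz, num_adcs):
    """Effective sample rate when the ADC clocks are evenly phase-shifted."""
    return adc_clock_hz * num_adcs


def sram_write_rate(adc_clock_hz, num_adcs, adc_bits, sram_width_bits):
    """SRAM words per second needed to store every sample with no gaps."""
    bits_per_second = adc_clock_hz * num_adcs * adc_bits
    return bits_per_second / sram_width_bits


def pack_pair(sample_a, sample_b):
    """Pack one sample from each ADC into a 16-bit word.

    Putting ADC A in the low byte is an arbitrary choice for this sketch.
    """
    if not (0 <= sample_a <= 0xFF and 0 <= sample_b <= 0xFF):
        raise ValueError("samples must be 8-bit values")
    return (sample_b << 8) | sample_a


if __name__ == "__main__":
    rate = aggregate_sample_rate(ADC_CLOCK_HZ, NUM_ADCS)
    words = sram_write_rate(ADC_CLOCK_HZ, NUM_ADCS, ADC_BITS, SRAM_WIDTH_BITS)
    print(f"aggregate sample rate: {rate / 1e6:.0f} MS/s")
    print(f"SRAM write rate: {words / 1e6:.0f} Mwords/s")
    print(f"time per SRAM write: {1e9 / words:.1f} ns")
    print(f"packed word: {pack_pair(0x12, 0x34):#06x}")
```

Under these assumptions, pairing one sample from each ADC per 16-bit word means the SRAM sees one write per 80MHz period rather than one per 160MHz period, which is why the 80MHz figure is the one that matters for the memory interface.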
I took the cost function, put it in hardware, and ran the database past it several times. The cost function accounted for 30% of the placer performance. That part of the placer took about 1/3 of an xc4010. From my analysis I concluded that ppr could be sped up by 10x and would take about 50K gates. This holds to the normal 90/10 rule. Of course Xilinx was moving over to par at the time and they concluded that they didn't need the speedup. After spending a lot of time with the code I'm convinced that P&R is a sure bet for acceleration. Now with the PPC and Virtex II I'm sure that overall speedups of 8-10x would be pretty straightforward. I estimate about 2 man-years of work, and a design with 4-8 gig on board would do it.

Steve

"Tim" <tim@rockylogic.com.nooospam.com> wrote in message news:1017961909.26884.0.nnrp-01.9e9832fa@news.demon.co.uk...
> Steve Casselman wrote:
> > I had a chance to look at the old
> > ppr code. I was able to speed up the cost function by 9.8x by putting the
> > function in hardware.
>
> Sounds interesting. What did you do?
>

Article: 41674
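The figures in Steve Casselman's post above (a cost function that is 30% of the placer, sped up 9.8x, versus a projected 10x overall) can be related with Amdahl's law. This is a sketch, not the poster's stated analysis: it assumes "30% of the placer performance" means 30% of run time, and the function name is made up for illustration.

```python
import math


def overall_speedup(fraction_accelerated, local_speedup):
    """Amdahl's law: overall speedup when only part of the run time is accelerated.

    fraction_accelerated: share of the original run time that gets faster (0..1).
    local_speedup: how much faster that share becomes (math.inf = free).
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    remaining = 1.0 - fraction_accelerated
    return 1.0 / (remaining + fraction_accelerated / local_speedup)


if __name__ == "__main__":
    # Only the cost function in hardware: 30% of run time, 9.8x faster.
    print(f"cost function only: {overall_speedup(0.30, 9.8):.2f}x")
    # Upper bound if 90% of the run time could be made essentially free.
    print(f"90% accelerated: {overall_speedup(0.90, math.inf):.1f}x")
```

Under that assumption, accelerating the cost function alone gives roughly 1.37x overall. A 10x overall figure needs about 90% of the run time moved into hardware, which is consistent with the 90/10 rule mentioned in the post.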
We've heard the "place and route is good enough you don't need to do floorplanning unless you are doing the 1% designs from hell" line for as far back as I can remember from Xilinx. Fact is, floorplanning seems to be getting larger gains, not smaller, with the new devices. I typically see 50-70% performance improvement over an automatic placement. Routing multiple times without running placement is not going to get much in the way of performance gains. The router does a pretty decent job if the placement is good, and can't do much to salvage a poor placement.

Xilinx, as a company, promotes not using the floorplanner, probably to avoid a feeling that the devices are more difficult to design in (which is not the case; in fact the ability to improve performance and density through floorplanning is a big plus). Floorplanning has always been the closet case, and from the looks of it will continue to be. Therefore you get poor documentation, lots of bugs, and very low priority on getting the bugs fixed compared with the rest of the software package. Until floorplanning becomes a mainstream design event, I doubt it will ever be anything more than the poor cousin no one will admit to having.

Unfortunately, the mainstream doesn't use it because a) they are told they don't need it*, that it is only there for the FAEs to get you out of trouble in special cases, b) they don't know the benefits because those are not told to them and the tool is not easy to learn without doing it a lot (and living with numerous bugs), and c) even if someone convinces them to use it, the documentation is next to useless as far as learning how to floorplan.

Part of the problem is that floorplanning is sort of like putting together a puzzle with many acceptable solutions. Some people have the knack for it, some don't, and if you don't you will probably never pick it up.

Kevin Brace wrote:
> I have seen one Xilinx employee in this newsgroup saying that
> automatic P&R is getting better, so low level tools like floorplanner or
> FPGA Editor are getting less important.
> That can be true to some extent, but still, automatic P&R is so bad
> that, when I have to reduce setup time (Tsu) of my PCI IP core, I still
> have to rely on floorplanner.
> and so on...

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759