Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
In comp.arch.fpga Kevin Neilson <kevin-neilson@removethistextattbi.com> wrote:
: Word. C->gates is still a pipe dream. Maybe some day...
: "Ray Andraka" <ray@andraka.com> wrote in message
<4.5 kBytes of quote deleted>

Kevin,

please cut down what you quote to avoid spoiling the archives.

Also, faked return addresses are deprecated.

Bye
--
Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de
Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------
Article: 45427
I suspect it always will be. Software and hardware have totally different design constraints. C code is inherently sequential, and it takes great pains in coding to make it map to parallel hardware. The processors for C can afford to have very elaborate instruction units because there is usually only one and it is used for all instructions. Hardware solutions, on the other hand, should strive to minimize the complexity of the data path, because each part is typically used by only a small part of the algorithm.

Kevin Neilson wrote:
> Word. C->gates is still a pipe dream. Maybe some day...
>
Article: 45428
Domagoj wrote:
>
> Verification patterns (like in E) and formal verification patterns seem like
> an obvious copying of well-established and studied programming patterns.
> Project management and versioning is pretty much the same, too.

--snip--

I think you're right. But we will see a pick & choose approach to software methods. Why? Moore's Law drives processor and, to a lesser extent, ASIC design. What drives software design? Gate's Law? My point is that chip design is already more complex than software design with no slowdown in sight. Chip design methodology must advance faster than software methodology.

-Don
Article: 45429
Re: the carry chains. The Tciny and the time to get off the carry chain into the flip-flop are both much longer than you would expect considering the die shrink and everything. Unlike Virtex and Virtex-E (which often got limited by SRL16 minimum pulse width, routing distances and fanout), it looks like the carry chain is going to set the upper limit on DSP performance in V2. The improved multipliers are around 250 MHz, depending on which set of numbers you believe and how carefully you add pipeline registers immediately before and after the multipliers. You can get to the timing numbers by putting the CONFIG_STEPPING property on the multipliers.

Kevin Neilson wrote:
> Ray,
> Thanks for that observation about the carry chains. I thought I might be
> crazy because I was getting worse response from the adders on the V2 than I
> was on the Virtex Es. I posted a question about this but never got a
> response. I think there is a value called Tciny(?) that deals with getting
> data on the carry chain that's a lot bigger for V2 than it was for VE. It's
> depressing looking at the paths, because the carry chains are so fast, but
> getting data to the chain and on it is so slow.
>
> How much faster are the enhanced multipliers going to be? I've been
> disappointed with the fact that with each service pack the multipliers get
> slower and that the pipelined multiplier doesn't yield very much benefit,
> but I have to say that I'm still very glad Xilinx put these in. In the
> design I'm doing right now I'm using about 25 for halfband interpolators,
> mixers, linear interpolators, gain stages, etc. I would never have the
> gates to do this with fabric-based multipliers. In many cases the
> multipliers weren't fast enough for my needs and I had to double up and
> operate in parallel, but since I have 40 multipliers in the part this isn't
> a problem.
>
> Hua:
> I'd recommend this technique over using pipelined fabric multipliers. A
> fabric multiplier can get you over 200MHz if you pipeline every stage, but
> this will eat up a lot of gates. Using two embedded multipliers will also
> get you over 200MHz in fewer cycles if you demux the data stream into two
> multicycle paths and use two multipliers and then remux. (You have to set
> the constraints and clock enables properly.) Then you haven't burned up
> nearly as many of the valuable fabric gates. This might not work in every
> application, but works in most DSP applications.
> -Kevin
>
> "Ray Andraka" <ray@andraka.com> wrote in message
> news:3D3CAE7C.9325F62F@andraka.com...
> > This depends on which silicon. Silicon produced before this spring has
> > slow multipliers that are pretty easy to beat with a pipelined multiplier
> > in the fabric. The silicon with the fixed multipliers is difficult to
> > match the speeds of the multipliers with a pipelined multiplier in the
> > fabric because it takes a pretty long time getting on and off the carry
> > chains, in fact from what I've seen so far the carry chains are no faster
> > than, and perhaps a little slower than the virtexE carry chains :-(.
> >
> > Jay wrote:
> >
> > > Those hardwired multipliers are about as fast as you're going to get
> > > for a single cycle multiply that wide, in that process technology. If
> > > you can stand the latency, you could probably get a faster pipelined
> > > multiplier using the logic and hand placement. What speed do you
> > > need? Are both factors really 16 bits wide and every bit can vary
> > > every clock?
> > >
> > > Regards
> > >
> > > HUA QIAN <qianhua@ece.gatech.edu> wrote in message
> > > news:<3D39828F.189F00AB@ece.gatech.edu>...
> > > > Hello, all,
> > > >
> > > > I noticed that Xilinx Virtex-II provide 18*18 multipliers, which
> > > > introduce a lot of delays. Can I generate a more efficient 16*16
> > > > multiplier, which is my target, and give me a shorter delay?
> > > >
> > > > Another question is how to determine the clock speed for the Virtex-II
> > > > embedded multiplier?
> > > >
> > > > Any advice or help is greatly appreciated!
> > > >
> > > > Hua
> >
> > --
> > --Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930 Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com
> >
> > "They that give up essential liberty to obtain a little
> > temporary safety deserve neither liberty nor safety."
> > -Benjamin Franklin, 1759
> >
>

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 45430
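Editor's sketch for the carry-chain/multiplier post above: the demux/remux idea in Kevin Neilson's quoted text (two embedded multipliers, each taking every other sample) can be modelled in a few lines of Python. This is only a behavioural model of the data ordering, not HDL. The function name and the list-based streams are invented for illustration, and the multicycle constraints and clock enables he mentions are not modelled.

```python
def two_multiplier_remux(a_stream, b_stream):
    """Split a full-rate stream across two half-rate multipliers, then remux.

    Hypothetical helper: a_stream and b_stream are equal-length lists of ints.
    """
    # Demux: multiplier 0 sees the even-numbered samples, multiplier 1 the odd
    # ones, so each multiplier has two sample periods to finish a product.
    even_products = [a * b for a, b in zip(a_stream[0::2], b_stream[0::2])]
    odd_products = [a * b for a, b in zip(a_stream[1::2], b_stream[1::2])]

    # Remux: interleave the two half-rate result streams back into sample order.
    out = []
    for i in range(len(a_stream)):
        source = even_products if i % 2 == 0 else odd_products
        out.append(source[i // 2])
    return out
```

The output sequence is the same as multiplying every sample pair directly; what changes in hardware is that each multiplier only has to accept a new operand pair every second clock.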
Kevin, you probably could have used distributed arithmetic to get the logic to fit within the fabric if you did not have the multipliers. After all you have a total of 40 multiplies which are presumably used in a sum of products architecture and that have a top clock rate in the 130-150 MHz range if you use pipeline registers before and after them.

Kevin Neilson wrote:
> How much faster are the enhanced multipliers going to be? I've been
> disappointed with the fact that with each service pack the multipliers get
> slower and that the pipelined multiplier doesn't yield very much benefit,
> but I have to say that I'm still very glad Xilinx put these in. In the
> design I'm doing right now I'm using about 25 for halfband interpolators,
> mixers, linear interpolators, gain stages, etc. I would never have the
> gates to do this with fabric-based multipliers. In many cases the
> multipliers weren't fast enough for my needs and I had to double up and
> operate in parallel, but since I have 40 multipliers in the part this isn't
> a problem.
>

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 45431
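Editor's sketch for the distributed-arithmetic suggestion above: the post names the technique without describing it, so the following Python model shows the usual bit-serial form, in which a sum of products with constant coefficients is computed from a lookup table and a shift-accumulate, with no multipliers. It assumes unsigned input samples and constant coefficients; the function names are hypothetical and nothing here is taken from Ray Andraka's own designs.

```python
def build_da_lut(coeffs):
    """LUT entry `addr` holds the sum of the coefficients whose bit is set in addr."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_sum_of_products(coeffs, samples, width):
    """Compute sum(c * x) for unsigned `width`-bit samples, one bit plane at a time."""
    lut = build_da_lut(coeffs)
    acc = 0
    for bit in range(width):
        # Gather bit `bit` of every sample into one LUT address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << k
        # Weight the looked-up partial sum by the bit position and accumulate.
        acc += lut[addr] << bit
    return acc
```

In an FPGA the table maps onto LUTs or small RAMs and the loop over `bit` becomes either `width` clock cycles or `width` parallel copies, which is the area/speed trade the post alludes to.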
The construction is the same as pipelined multipliers but without the registers. You might look at the multipliers page on my website as a starting point.

Reala wrote:
> Dear all,
>
> I would like to design of 16 X 16 multiplier with single clock cycle.
> I try to search internet but fail. Any free design and hint to design this
> kind of multiplier? Size reduction is need for the design.
>
> Thank a lot.
> Reala

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 45432
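Editor's sketch for the single-cycle multiplier answer above: one common construction that fits "a pipelined multiplier without the registers" is a shift-and-add array, where every bit of one operand gates a shifted copy of the other and all partial products are summed combinationally. The Python below models that for unsigned operands. It is an assumption about what such a multiplier could look like, not a description of the designs on the website the post refers to, and the function name is made up.

```python
def array_multiply(a, b, width=16):
    """Unsigned width x width multiply as a sum of gated, shifted partial products."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    for i in range(width):
        # Partial product i: `a` shifted left by i, kept only if bit i of `b` is 1.
        if (b >> i) & 1:
            product += a << i
    return product
```

In hardware each loop iteration is a row of AND gates feeding an adder; a pipelined version would place registers between those adder rows, and the single-cycle version simply leaves them out and accepts the longer combinational path.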
Back in the 60's RTL meant resistor-transistor logic, which was a forerunner to TTL logic. Today it means register transfer level. Register transfer level design basically means that you specify the registers explicitly and the logic between the registers implicitly (behaviorally). Most HDL synthesizers do RTL synthesis.

Reala wrote:
>
> What is RTL (Register Tran...Logic) I know the name but not really know the
> meaning? What tools for RTL synthesis?
>

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 45433
The pride of getting hello world (dip switches to seven-segment displays) working in Verilog on my SpartanIIe board has been replaced by a bunch of new questions. I hope you can help me out a bit on these ones; I'm a bit at a loss when it comes to handling the different constraints files and constraints databases in WebPack.

What I do is change the name of an input or output in my .v file and hit "Edit Implementation Constraints (Constraints Editor)", and it reports errors when parsing the .ucf file. What is the recommended way to maintain constraints?

Also, how should I enter timing and compactness constraints? And, equally important, which report do I read to find out what timing can be expected from the actual implementation?

Thanks,
Børge
Article: 45434
This is interesting. I too work on something that is tested in Verilog and later to be implemented in an ASIC. When the Verilog code works on the FPGA board, I'm going to be travelling the same route as you are.

What my boss said was that we could work in parallel on the code and on implementing and laying out some dedicated building blocks. For example, it is quite probable that we will need a 4-bit adder and friends.

Our ASIC process is proprietary and has to work with some touchy analog stuff. So what we do is make our own digital cells. I guess, though I am in no way experienced in this, there are programs which can be told what your cells look like (fanout, fanin, digital function, etc.) and produce a netlist for that particular "library". Even greater would be if it could also be told about the pin locations of the cells and then autoroute the whole thing.

Any information about typical FPGA and digital ASIC flows will be appreciated! I have worked mostly with analog board layout, where the flow is draw schematic, simulate, layout. The digital flow seems a bit more complicated to me.

Regards,
Børge

"Reala" <manfield.chow@scoreconcept.com> wrote in message
news:ahitls$qkr10@imsp212.netvigator.com...
> Dear Kevin,
>
> Thank you for your detailed reply.
> Actually, I work in a IC design company. My boss want to develop a low-end
> DSP chip. However, we are less experience in this.
> We think that one of the important building block is 16X16 small size,
> single cycle multiplier.
> I write simple verilog and synthesis by Xilinx Web pack tools. It seems that
> work.
> Assuming it is work, I want to open some output files to see what "circuit"
> is synthesised, because I will design a DSP chip. But i do not know which
> output files mention the netlist of the "synthesised design" in gate level.
>
> I guess that the verilog code will be synthesised by synthesis according to
> synthesis tool's library. Am I correct? Can i force the synthesis tool to
> synthesis the verilog code without using library? (I means the design is
> synthesis in gate level ...AND OR XOR.....) Then, can i see the netlist in
> gate level such that I can study the design synthesised by the synthesis
> tool?
>
> You say that:
> >To make sure the synthesized design was synthesized correctly,
> >do a gate-level simulation of the synthesized design.
> >You should be able to run the same testbench code you used for an RTL
> >simulation.
>
> I am not really understand because I am a beginner of IC design.
> what is the meaning of gate-level simulation? by what kind of tools?
> Modelsim? Xilinx? or other?
>
> What is RTL (Register Tran...Logic) I know the name but not really know the
> meaning? What tools for RTL synthesis?
>
> Thank again ^_^
> Reala
>
>
> "Kevin Brace" <killspam4kevinbraceusenet@killspam4hotmail.com> wrote in
> message news:ahip0o$rea$1@newsreader.mailgate.org...
> > You will want to avoid using vendor specific features (vendor
> > specific primitives) as much as possible if your goal is to do an FPGA
> > to ASIC conversion.
> > Also, you will need to have sufficient volume to justify the NRE (Non
> > Recurring Engineering) fee you will have to pay upfront.
> > There are firms like AMI Semiconductor, Chip Express, Lightspeed
> > Semiconductor, and NEC (And a few more I cannot think of right now.)
> > that do an FPGA to ASIC conversion if you submit them the EDIF netlists
> > generated from your FPGA synthesis tool.
> > To make sure the synthesized design was synthesized correctly,
> > do a gate-level simulation of the synthesized design.
> > You should be able to run the same testbench code you used for an RTL
> > simulation.
> > Also, before firing up your FPGA, it is probably a good idea to do a
> > post P&R simulation of your P&Red design.
> > I always do a post P&R simulation before firing up an FPGA board I got.
> > When I made sure the design worked fine in a post P&R simulation, my
> > design always worked fine in a real system.
> >
> >
> > Kevin Brace (In general, don't respond to me directly, and respond
> > within the newsgroup.)
> >
> >
> > Reala wrote:
> > >
> > > Hi,
> > >
> > > I have a question is that : if I design a circuit by verilog. Then, I
> > > synthesis this and implement by FPGA.
> > > Assuming the design is work in FPGA, then, I want to make it a custom
> > > IC.
> > > Can I know the netlist in gate level of the design after synthesis?
> > > Otherwise, how can i translate the design from FPGA to IC?
> > >
> > > Thanks a lot. ^_^
> > > Reala
> >
Article: 45435
Ken Schmidt <kschmidt@peerless.com> wrote:
> ISE 4.2 seems like a big step backwards. What happened? Are others
> having troubles with 4.2?

My own experience is that it's a fairly painless upgrade from 4.1, for Virtex-II designs. I don't remember much real change in the tools; I just track them for the latest bug fixes, plus access to the latest speed files. For Virtex-E it seems to be a significant upgrade from 4.1.

It sounds like most of your problems are with XST - I don't use it, so my experience is limited to the routing tools only. I don't use Verilog either. I suppose XST behaves pretty similarly for Virtex-E and Virtex-II.

> I am very disappointed in Xilinx and in ISE 4.2. Are we doing
> something wrong, or is a 4.2 a step backwards???

Use Synplify?

Hamish
--
Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>
Article: 45436
You can do a gated clock, but only with extreme care. You will need to instantiate the global clock buffer and feed it with the logic-gated clock signal. Depending on your application, care will also be needed when you change over the clock. You should also add constraints on the path through the gating logic, or your timing could be anywhere. Check also that the clock period constraints are being applied properly and not ignored; you may have to add period constraints to the net after the clock buffer. And of course, treat the gated clock as a different clock from the source: apply clock boundary crossing techniques, or ensure that you meet setup and hold where you cross.

John Adair
Enterpoint Ltd.
--
The views expressed in this message are those of the writer and not necessarily those of Enterpoint Ltd. The use of information in this message is without warranty and persons using the information are advised to make their own checks as to its validity. No responsibility will be accepted for any incorrect, inaccurate or misleading information supplied.

"Jason Crawford" <jace@cisco.com> wrote in message
news:3D3BAC69.A03017A@cisco.com...
> Hi,
>
> Apart from using clock-enables, does anyone know of any
> way to use clock-gating in Virtex-E parts?
>
> We have a design that is partially written for an ASIC
> target and expects to see a gated clock. Rather than have
> to get the designers to pore through the code and add
> clock enables to all flip flops (I can hear teeth gnashing
> already) I am hoping against hope that someone has an
> alternate answer to this rather difficult problem.
>
> yours in hope,
> Jason.
Article: 45437
In order that I, personally, don't "spoil the archives," what is it about the post that was bad? The original posting had expired from my newsgroup, so seeing the original text would take a journey to groups.google.com. Are a few kbytes really that important?

Also, it's pretty standard in the newsgroups to mark the email address with obviously bogus text that anyone reading the address can remove. This prevents the massive amounts of spam, including porn, viagra, and get-rich-quick schemes. I wish now that I'd done the same thing when I first started posting; I'm thinking it's useless to change now.

I appreciate the forum and the people who participate. I like to see everyone get along reasonably. The professional exchange benefits many people, myself included.

- John_H

Uwe Bonnes wrote:
> Kevin,
>
> please cut down what you quote to avoid spoiling the archives.
>
> Also, faked return addresses are deprecated.
>
> Bye
Article: 45438
Uwe Bonnes wrote:
>
> ... faked return addresses are deprecated.
>

I would describe kevin-neilson@removethistextattbi.com as munged, rather than faked. It's easy to see what the real address is.

Jerry
--
"The rights of the best of men are secured only as the rights of the vilest and most abhorrent are protected." -Chief Justice Charles Evans Hughes, around when I was born.

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Article: 45439
Hmm, I think a translation is in order:

spiral = circling the drain
waterfall = falling off the cliff
watersluice = down the tubes

Pick your poison :-)

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 45440
Fellow experts:

I'm hoping someone has fallen on this one and found a workable solution. So far the Xilinx hotline has been unusually unresponsive in handling this case (they've had it since 7/12 and haven't even acknowledged whether or not they see the problem with the test case I sent). Seems the hotline is not as responsive and helpful as it once was, which is a shame.

Anyway, this is the case: RLOC_ORIGIN being ignored by placer. The macro shows up in the correct position in the floorplanner but with the G and Y elements missing (that is a separate issue which is still an open case). The RLOC origin is being ignored by the place and route, so the macro is not landing in the specified position. It is critical I be able to specify the RLOC origins in this design (2V6000, 200 MHz, high utilization). Is this a known problem? (I don't see it in the answers database.) I tried adjusting the RLOC origin by single slice steps as suggested in answer record 12192, to no avail.

The RPMs are not created under FPGA editor; rather, they are RLOC'd in the VHDL source. That gets me relative locations for each BEL in the design, which in turn give me a macro that I should be able to place or let the tools place. I am trying to put an RLOC_ORIGIN on it by adding an RLOC_ORIGIN attribute to the UCF file. The syntax is correct, and indeed the part of the RPM that is not decimated by the floorplanner shows up in the floorplanner editable window in the correct position (there is a previously reported bug in the floorplanner that prevents the RPM from showing correctly unless you first go through auto PAR, then constrain from placement, then unbind and bind the RPM). However, when the design is run through the PAR, the macro is placed at some location other than that indicated by the RLOC_ORIGIN.

Thanks in advance for any info you may have seen on this problem.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 45441
Hi Don,

"Abernathey Family" <family2@aracnet.com> wrote in message
news:3D3D4B88.E7621A86@aracnet.com...
> I think you're right. But we will see a pick & choose approach to
> software methods. Why? Moore's Law drives processor to a lesser extent
> ASIC design. What drives software design? Gate's Law? My point is that
> chip design is already more complex than software design with no
> slowdown in sight. Chip design methodology must advance faster than
> software methodology.
> -Don

Perhaps we are flattering ourselves :) . I agree that the whole ASIC design process is composed of many complex, different activities (simulation, synthesis, floorplanning...). Software design is composed of activities that are actually very similar to each other, and that's why it's simpler to design software. But the complexity of the software itself is enormous today. I believe that trends seen in the software arena, like object cooperation, AOP and dependability, are far ahead of any programming related to digital design.

Also, formal verification of hardware is much easier than of software because hardware is simpler in its nature.

--
Domagoj Babic
domagoj (et) engineer.com
Article: 45442
I've seen a lot of replies on software use of these methods, and on how sw is different from hw. But these are not the real issues. In hardware, as in software, you still have requirements, architectural plans, design, verification plans, and validation. What we are talking about is METHODOLOGY!

In spiral, you don't have a hard spec, but start to build right away, and then keep on correcting until you get something that works. In waterfall, you MUST define all your requirements/architectural plans BEFORE you build.

It's a bit like the process involved in getting a house built. First you define plans/architecture (on paper), then you have them approved (by the city in this case), then the builder builds the house according to the plans. Inspectors verify that the house conforms to design and standards.

If you were to build a house using the spiral method, you have an idea, and start pouring the concrete. If you don't like what you see, you just tear out everything, or some of the cement, and make changes. You keep up this process until the house is built. Then you call the inspectors (maybe).

Bottom line: spiral/waterfall deals with PROCESSES, not with sw/hw differences or with adapting a methodology defined for sw. When you think about it, the methodology applies to many other disciplines (e.g., construction, relationships, traveling, jobs, professions, etc.)

For the record, on hw designs I had more experience using the waterfall approach, and that process is documented in the book "Component Design by Example".

Ben
----------------------------------------------------------------------------
Ben Cohen Publisher, Trainer, Consultant (310) 721-4830
http://www.vhdlcohen.com/ vhdlcohen@aol.com
Author of following textbooks:
* Real Chip Design and Verification Using Verilog and VHDL, 2002 isbn 0-9705394-2-8
* Component Design by Example, 2001 isbn 0-9705394-0-1
* VHDL Coding Styles and Methodologies, 2nd Edition, 1999 isbn 0-7923-8474-1
* VHDL Answers to Frequently Asked Questions, 2nd Edition, isbn 0-7923-8115
------------------------------------------------------------------------------
Article: 45443
Hi Antonio,

Antonio Martínez Álvarez wrote:
> I'm using Handel-C and VHDL to make a 2D filter. (DOG (Difference of
> Gaussians) filter indeed).
> I'm more interested in Handel-C. (DK-1.1)

I haven't used Handel-C, but I don't think it will ever match the speed from well-designed hand-written VHDL, particularly for DSP applications.

> I'm doing it sequentially. For every pixel I read the pixels which are
> below the mask and multiply...
>
> Well. I'm using a RAM that I've defined for a Virtex-E. The filter
> works... but it's very slow. (12 frames per second).

The bottleneck in the design you describe is probably the RAM access to get the pixel values. It sounds like you are fetching NxN pixel values for each output pixel, and this is slowing you down.

Depending on your available on-chip RAM/LUT resources, a faster way to do it is to shift the image pixels out of RAM into a local line buffer. The SRL16 in the Virtex architecture is great for that, but if you've got a very wide image it will eat a lot of floor space.

For a 3x3 mask and a 256x256 image, you need a shift register that is (256*2+3) = 515 pixels long, tapped at positions 0,1,2,256,257,258,512,513,514. Each new pixel is shifted in at position 0, and the rest of the pixels are shifted along.

If your line buffer can support it, you may wish to take the taps in parallel, to a set of parallel multipliers. Doing it this way you could also pipeline each multiplier, increasing the delay slightly but giving you blistering throughput.

Using the on-chip line buffer concept, but with a sequential non-pipelined multiplier, I achieved about 45 frames/sec in a Virtex (speed grade 4) for a 256x256 8-bit image, 3x3 convolution, without any real optimisation or fine-tuning. By doing the multiplies in parallel and pipelining them, I think the frame rate could ultimately be limited by the RAM access time for a single pixel.

Hope this helps,

Regards,

John
--
Dr John Williams, Postdoctoral Research Fellow
High Performance Computing Group, CRC for Satellite Systems
Queensland University of Technology, Brisbane, Australia
Phone : (+61 7) 3864 2427 Fax : (+61 7) 3864 1517
Web : http://www.crcss.bee.qut.edu.au/comp.html
Article: 45444
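Editor's sketch for the line-buffer post above: the tapped shift register it describes (length 2*W+3, taps at 0,1,2,W,W+1,W+2,2W,2W+1,2W+2 for a 3x3 mask) can be checked with a short Python model. This is a software illustration only. The function names, the dictionary of results, and the choice to emit only fully valid windows are assumptions made for the example, not part of the original design.

```python
from collections import deque


def line_buffer_taps(width, mask=3):
    """Tap positions in the shift register for a mask x mask window."""
    return [r * width + c for r in range(mask) for c in range(mask)]


def convolve_stream(pixels, width, kernel):
    """Stream row-major pixels through a tapped shift register (3x3 kernel).

    Returns {(row, col): value} keyed by the top-left corner of each window.
    """
    taps = line_buffer_taps(width)
    length = taps[-1] + 1                      # 2*width + 3 for a 3x3 mask
    weights = [w for row in kernel for w in row]
    shift_reg = deque([0] * length, maxlen=length)
    out = {}
    for n, pixel in enumerate(pixels):
        shift_reg.appendleft(pixel)            # new pixel enters at position 0
        y, x = divmod(n, width)
        if y >= 2 and x >= 2:                  # window lies fully inside the image
            # Tap r*width + c holds the pixel at (y - r, x - c).
            out[(y - 2, x - 2)] = sum(w * shift_reg[t] for w, t in zip(weights, taps))
    return out
```

Calling `line_buffer_taps(256)` reproduces the nine tap positions listed in the post. Each output needs only one new pixel from RAM, which is the point of the scheme; the nine multiplies per output can then be done in parallel or sequentially as the post discusses.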
If I have two separate boards in a system, each with a VirtexII DCM used in frequency synthesis mode, both generating the same frequency, is there anything I can do with the DCM rst line (or anything else) to ensure that they come up in phase, or is it impossible to guarantee the phase?

Thanks,
Doug
Article: 45445
John,
You definitely don't want to use flops for this. (I don't think there are even enough; you need 24500 flops and that part only has about 6000 I think.) You could use the SRLs, which are 16-bit shift registers. You would need about 1530 of these, and the part has about 6000 of these. Better yet would be to use the dual-port blockRAMs. You could use 24 of these in a 16wide x 256deep configuration. Then you just set the write and read addresses 16 apart to get a 16-deep pipe. If your clock is slow enough, you split up the RAM and use only six RAMs. This part has about 32 blockRAMs I think.
-Kevin

"John Hovell" <jhovell@yahoo.com> wrote in message
news:9402973.0207231559.2030b94a@posting.google.com...
> Hello all --
>
> I am trying to implement a delay pipe that is 384 bits wide and 64
> bits long in Verilog.
>
> I was trying to build one out of fairly simple D-flops, but my design
> has been "synthesizing" in Xilinx Web Pack for nearly 2 hours now, so
> I think I must have done something wrong.
>
> Is there an efficient or correct way to implement such a pipe on a
> 300K gate Spartan IIe? I think the size should be OK since a Spartan
> IIe 300K gate could theoretically make a 98kbit distributed memory...
> but maybe I am missing something here.
>
> TiA for any help, pointers, etc.
>
> Cheers,
> John
Article: 45446
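Editor's sketch for the delay-pipe reply above: the rounded resource figures in Kevin Neilson's answer follow from simple arithmetic on the 384-bit by 64-deep requirement in the question. The Python below spells that out. The function and parameter names are hypothetical, and the device capacities quoted in the post ("about 6000", "about 32") are not checked here.

```python
import math


def delay_pipe_resources(width_bits=384, depth=64, srl_depth=16,
                         bram_width=16, time_mux=1):
    """Rough resource counts for a width_bits-wide, depth-deep delay pipe."""
    # One flip-flop per stored bit.
    flops = width_bits * depth
    # Each bit lane needs enough chained 16-bit shift registers to cover the depth.
    srl16s = width_bits * math.ceil(depth / srl_depth)
    # Block RAMs sit side by side to cover the width; time-multiplexing by
    # `time_mux` lets one RAM serve that many slices of the word.
    block_rams = math.ceil(width_bits / (bram_width * time_mux))
    return {"flops": flops, "srl16s": srl16s, "block_rams": block_rams}
```

With the defaults this gives 24576 flops, 1536 SRL16s and 24 block RAMs, which matches the "24500", "about 1530" and "24" in the post; `time_mux=4` gives the six-RAM variant suggested for a slow enough clock.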
I like the BlockRAM idea. But - to get 32 bit mode - use the same read and write address (cycle through n addresses) and use port A as the upper 16 bits and port B as the lower 16 bits. The output data is the registered version of the memory at that address so there aren't any timing problems. The memory can look like a 32 wide by 128 deep in this mode. Only 12 memories needed, which would fit in a Spartan-IIE 150! Lots of extra logic to develop a "pong" game as well.

I tried compiling an array of 384x64 regs in Synplify and I quit after 8 minutes of compiling. Synplify usually goes much faster! The SRL16s with arrays of instantiations would probably work much faster in Synplify, but I don't know if Webpack supports the "arrays of instances" which - I believe - was Verilog 1995, just not supported by many.

The BlockRAM approach would work out so very nicely.

Kevin Neilson wrote:
> John,
> You definitely don't want to use flops for this. (I don't think there are
> even enough; you need 24500 flops and that part only has about 6000 I
> think.) You could use the SRLs, which are 16-bit shift registers. You
> would need about 1530 of these, and the part has about 6000 of these.
> Better yet would be to use the dual-port blockRAMs. You could use 24 of
> these in a 16wide x 256deep configuration. Then you just set the write and
> read addresses 16 apart to get a 16-deep pipe. If your clock is slow
> enough, you split up the RAM and use only six RAMs. This part has about 32
> blockRAMs I think.
> -Kevin
>
> "John Hovell" <jhovell@yahoo.com> wrote in message
> news:9402973.0207231559.2030b94a@posting.google.com...
> > Hello all --
> >
> > I am trying to implement a delay pipe that is 384 bits wide and 64
> > bits long in Verilog.
> >
> > I was trying to build one out of fairly simple D-flops, but my design
> > has been "synthesizing" in Xilinx Web Pack for nearly 2 hours now, so
> > I think I must have done something wrong.
> >
> > Is there an efficient or correct way to implement such a pipe on a
> > 300K gate Spartan IIe? I think the size should be OK since a Spartan
> > IIe 300K gate could theoretically make a 98kbit distributed memory...
> > but maybe I am missing something here.
> >
> > TiA for any help, pointers, etc.
> >
> > Cheers,
> > John
Article: 45447
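Editor's sketch for the BlockRAM reply above: the "same read and write address, cycling through n addresses" scheme behaves like a circular buffer whose output is the word written n clocks earlier. The Python model below shows that behaviour for a single memory. It assumes the old contents of a location are read out before the new word overwrites it, which is an assumption about the memory's read/write ordering rather than something the post or a data sheet is quoted on. The class name is invented, and the port A/port B width split is not modelled.

```python
class BlockRamDelay:
    """Delay line built from one memory and one free-running address counter."""

    def __init__(self, depth):
        self.mem = [0] * depth   # memory contents, all zero at start-up
        self.addr = 0            # shared read/write address

    def clock(self, din):
        """One clock: return the word stored `depth` clocks ago, then store din."""
        dout = self.mem[self.addr]           # read old contents first (assumed ordering)
        self.mem[self.addr] = din            # then overwrite with the new word
        self.addr = (self.addr + 1) % len(self.mem)
        return dout
```

For example, with `depth=64` the first 64 outputs are the initial zeros and every later output is the input from 64 clocks before, so the delay is set purely by how many addresses the counter cycles through.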
Here, I am Daryl and I have to trouble you. :-)

I am designing a chip for an optical network, and a lot of effort must be made to increase the clock speed and reduce the chip resource cost. In a timing interface module, there is a 14-bit counter that provides timing to the outgoing frame, and a comparator that compares the counter word with a series of registers set by the controller.

I've noticed that the slice cost increases seriously and the maximum clock speed decreases a lot when the counter and the comparator get wider. Troubled by this, I first tried a wider counter (14-bit) and a narrower comparator (4-bit) and got a 20MHz speed improvement and more than 20 slices saved. Then I tried a 4-bit counter and a 14-bit comparator, with a result of a 10MHz improvement and about 10 slices saved.

So, I think the critical factor is the wide comparator. This is confirmed by studying the report and schematics from the synthesis tools (FCII 3.6.1 and Synplify Pro with Amplify).

To improve the performance, I've tried using the CoreGen tool to generate a comparator core. But after implementation, the result is no better than from my own code.

The synthesis tool I used is FCII 3.6.1, the device is a Virtex-II 1000, implemented with ISE 4.2 SP3.

Here are the results of my trials:

14-bit counter, 14-bit comparator and other logic : 63 slices used (36 FFs and 105 LUTs); 95MHz
4-bit counter, 14-bit comparator and other logic : 50 slices used (26 FFs and 85 LUTs); 115MHz
14-bit counter, 4-bit comparator and other logic : 41 slices used (26 FFs and 62 LUTs); 127MHz

Would you give me some advice about it from your experience? Or some resources to study?

Thanks in advance for your time!

Daryl
Article: 45448
> Although I have a fair amount of experience with FPGA, there is little
> chance for us to startup on our own. Unlike the US, companies here are not
> out-sourcing.
> They would rather buy tools and hire people to do in-house development.

My impression working in Southern California (so I have a small view of the EE world!) is that US companies prefer to keep project development in-house. If an outside vendor already has a finished, working IP block, the company will consider buying it, in essence trading money for time (the idea being they can 'buy' the IP block and just drop it in their current project). Otherwise, the company will have to invest money AND time (since the contractor's development cycle would be non 0-day). In that case, for a little additional expenditure, the company can keep the development expertise in-house, so why pay someone else to learn your project?

I know there are some success stories when it comes to design services companies. But given the size of the fabless semiconductor industry, wouldn't you expect there to be *more* design services consulting?
Article: 45449
I'm looking for a synthesizable 32-bit 33MHz PCI Target only design to be placed into an FPGA or large CPLD. Minimal implementation is fine. Does anybody know if such a thing is available in VHDL or Verilog and is open sourced? I seem to recall Xilinx publishing a target only design quite some time ago but I can no longer find it on their web site.

Any help is much appreciated!

Jeff