I am very new to programming FPGAs but old (by today's standards) in C programming. I am trying to find a generic method of determining what size FPGA I need for a given algorithm, most likely written in C. I have some idea of the compiled size of the C program, but can I relate the compiled size to how many gates will be necessary in an FPGA?

I have four algorithms to code into an FPGA. Two are data manipulation, one is pixel compression using a new algorithm, and the last is a standard Principal Component Analysis algorithm. I know the software used to program the FPGAs will give an approximate gate count, but only after development of the VHDL code. As this will be a proposal, I was wondering if there is a method to estimate the FPGA size prior to all the coding. I know I don't want to oversize the FPGA, due to unnecessary time delays and power requirements.

ANY information will be helpful, since I haven't found any yet and I seem to have run out of places to search on the Web. None of the tutorials seem to deal with the algorithm-to-gate mapping. The two books I have don't mention this either.

Thanks
Rick Dempster
red@usgmrl.ksu.edu

Article: 20626
gazit@my-deja.com wrote:
> Matt,
> try to use "map -c 1 ..........." it will improve your device
> utilization.
> My experience shows that the ratio between the real gate count and
> Xilinx numbers is ~1/5 ( 60K ASIC system gates into a "xcv300" device
> sounds like a reasonable ratio).
> If your design is going to be changed you should consider migrating
> to a larger device. I believe you can use the same footprint.
> Good luck.

I think the 1/5 ratio is a little pessimistic, especially if there's a fair amount of memory involved. We recently did an ASIC prototype in a Xilinx XCV300. The FPGA usage was 89% of LUTs, 55% of CLB FFs, and 14 out of 16 Block RAMs [each of these only 30-50% used]. The final ASIC gate count was about 210K pre scan insertion. With scan & boundary scan it went up to about 240K. The ASIC partition between logic & memory was about 55/45.

However, I would agree with using a bigger FPGA. The XCV300-4 was all we could get back last April, and it was really not big enough to allow fast P&R; getting decent timing took a lot of work. The best we could get with a (nearly) pure HDL design & no floorplanning [the F1.5i floorplanner didn't support Virtex] was 66MHz. The ASIC signed off @ 91MHz.

Article: 20627
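As a rough check on the XCV300 prototype figures in the post above (210K ASIC gates, a 55/45 logic/memory split, 89% of LUTs used), here is a minimal sketch of the arithmetic. It assumes the XCV300 has 3,072 slices with 2 LUTs each, and the function name and the "gates per LUT" density figure are illustrative only, not a vendor metric.

```python
# Crude ASIC-gates-per-LUT estimate from the prototype figures quoted above.
# Assumption: an XCV300 has 3,072 slices, each with 2 LUTs.
XCV300_SLICES = 3072
LUTS_PER_SLICE = 2


def asic_gates_per_lut(asic_gates, logic_fraction, lut_utilization,
                       slices=XCV300_SLICES):
    """Return ASIC logic gates per occupied LUT (memory gates excluded)."""
    luts_used = lut_utilization * slices * LUTS_PER_SLICE
    logic_gates = asic_gates * logic_fraction
    return logic_gates / luts_used


density = asic_gates_per_lut(asic_gates=210_000,
                             logic_fraction=0.55,
                             lut_utilization=0.89)
print(f"about {density:.0f} ASIC logic gates per used LUT")
```

The result lumps flip-flops and routing overhead into one number, so it is only good for the kind of order-of-magnitude sizing discussed in this thread.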
Matt,

I would not worry (at least not now...)

My experiences with Virtex and Virtex-E PAR produce similar SLICE utilization results. Consider the following when interpreting the SLICE utilization number:

1) If the Xilinx mapper determines that the design requires only a portion of the device, it will "spread" the FFs / LUTs out into a greater number of SLICEs than required (the "-c 100" MAP option). This "SLICE-spreading" is performed to simplify routing.

2) When MAP reports X SLICEs utilized, we have no way of determining whether a given SLICE used 1 of the 2 FFs, 1 of the 2 LUTs, 1 FF and 0 LUTs, 0 FFs and 1 LUT, etc. To see the "real" implementation, you would have to look at it in FPGA Editor.

3) If you want to see the MINIMAL required SLICEs, run MAP with the "-c 1" option.

On the "gate" count issue - Xilinx "gate" counts are constructed to fit the device's part number. For example, we are using the 2000E devices, which are billed as 2 Million gate devices. Xilinx arrived at the 2 Million gate number by assuming the average design would use X% of the CLBs as logic, Y% of the CLBs as distributed memory, and Z% of the Block RAM (where X,Y,Z << 100%). Consider that the 2000E has 160 BRAMs of 4096 bits each. At 4 "gates" per bit, the 2000E has 2.6 Million gates in Block RAM alone - a number which greatly exceeds the marketed 2 Million gate number for the entire device.

Jeff

In article <38A978D6.F2364810@collins.rockwell.com>,
Matt Gavin <mtgavin@collins.rockwell.com> wrote:
> FPGA gurus,
>
> I am trying to fit a design in a Virtex XCV300 (2.5V part).
> The Xilinx mapper reports 100% slice utilization, which shocked me.
> However, if you do the math on their flop and LUT counts,
> (assuming 2 flops and LUTs per slice), the flop and LUT utilizations
> are 55% and 76%, respectively, which is what I expected.
> The equivalent gate count is ~60K, which isn't that high
> (especially since Xilinx claims that 322K system gates can fit in
> a XCV300.)
>
> Should I be worried that my slice utilization is 100%?
> Why would the mapper choose to use every single slice
> if the flop and LUT counts are so low?
>
> The report is given below for reference.
>
> Thanks for any help,
>
> Matt Gavin
> mtgavin@collins.rockwell.com
>
> Design Information
> ------------------
> Command Line   : map -p xcv300-5-pq240 -o map.ncd mimas_fpga.ngd
> mimas_fpga.pcf
> Target Device  : xv300
> Target Package : pq240
> Target Speed   : -5
> Mapper Version : virtex -- C.19
> Mapped Date    : Mon Feb 14 17:08:18 2000
>
> Design Summary
> --------------
>    Number of errors:        0
>    Number of warnings:      4
>    Number of Slices:              3,072 out of  3,072  100%
>       Slice Flip Flops:     3,408
>       4 input LUTs:         4,682 (4 used as a route-thru)
>    Number of Slices containing
>       unrelated logic:              948 out of  3,072   30%
>    Number of bonded IOBs:           120 out of    166   72%
>    Number of GCLKs:                   4 out of      4  100%
>    Number of GCLKIOBs:                3 out of      4   75%
> Total equivalent gate count for design:  60,367
> Additional JTAG gate count for IOBs:    5,904
>
>

Sent via Deja.com http://www.deja.com/
Before you buy.

Article: 20630
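The arithmetic in the post above (flop and LUT utilization from the MAP report, and the Block RAM "gate" count for the 2000E) can be reproduced with a short sketch. The figures come from the post; the function names are illustrative, and the "minimum slices" value assumes ideal packing of 2 LUTs and 2 flops per slice, which a real mapper may not achieve.

```python
# Figures taken from the MAP report and the 2000E example quoted above.
SLICES = 3072
FFS_USED = 3408
LUTS_USED = 4682
PER_SLICE = 2  # 2 flip-flops and 2 LUTs per Virtex slice


def utilization(used, slices=SLICES, per_slice=PER_SLICE):
    """Fraction of the device's flops (or LUTs) that are occupied."""
    return used / (slices * per_slice)


def min_slices(ffs, luts, per_slice=PER_SLICE):
    """Lower bound on slices if packing were ideal (an assumption)."""
    worst = max(ffs, luts)
    return -(-worst // per_slice)  # ceiling division


def bram_gates(blocks, bits_per_block=4096, gates_per_bit=4):
    """Marketing-style 'gate' count attributed to Block RAM alone."""
    return blocks * bits_per_block * gates_per_bit


ff_util = utilization(FFS_USED)
lut_util = utilization(LUTS_USED)
print(f"flops {ff_util:.0%}, LUTs {lut_util:.0%}")
print("ideal-packing slice floor:", min_slices(FFS_USED, LUTS_USED))
print("2000E Block RAM 'gates':", bram_gates(160))
```

The slice floor is only a bound: it says how far a dense pack such as "map -c 1" could in principle shrink the 100% figure, not what the tool will actually report.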
Rick Filipkiewicz wrote:
> Looking at the Virtex data sheet there's a timing parameter for the GSR->IOB/CLB FF
> outputs given. For a -4 part it's 12.5nsec. The question is whether this includes GSR
> routing. If it doesn't, it's got to be the slowest async reset since LS TTL.

Of course it includes the max routing delay. But it's a max delay, and some flip-flops are closer to the source and have a much shorter delay. So this delay (unlike all other delays in the data sheet) has an enormous spread; you really should assume anything from almost zero up to the max value. That's what causes the problems that Ray and I discussed before.

Peter Alfke

Article: 20631
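To put a number on the spread Peter Alfke describes, here is a minimal sketch. It assumes the 12.5 ns maximum quoted for a -4 part, a minimum delay of roughly zero, and, purely for illustration, the 74 MHz clock mentioned elsewhere in the GSR reset discussion in this archive. The function names are made up for the example.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def release_spread_in_cycles(max_delay_ns, freq_mhz, min_delay_ns=0.0):
    """Fraction of a clock period over which GSR release can be smeared."""
    return (max_delay_ns - min_delay_ns) / clock_period_ns(freq_mhz)


period = clock_period_ns(74)
spread = release_spread_in_cycles(12.5, 74)
print(f"period {period:.2f} ns, release spread {spread:.2f} of a cycle")
```

With a spread close to one full period, flip-flops near the GSR source and flip-flops far from it can easily come out of reset on different clock edges unless the design tolerates that.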
Gidday there,

I'm looking for a better way to manage projects than is available from Xilinx Foundation.

I want to be able to store the appropriate source files for synthesis and implementation in a CVS system, and use something like 'make' to handle compilation of subsystems.

I want to be able to create libraries of VHDL, and correctly make use of all the hierarchical features of this language. I am very new to this (just over a year) and come from an ANSI-C+GNU development background.

All this GUI management stuff is horrible, and the management of source versions is absolutely non-existent. Or am I missing something?

Any hints about projects or products that would be able to help me would be appreciated.

Thanks

Joshua Lamorie
Systems Designer
Xiphos Technologies Inc.

Article: 20632
In article <38A80F1C.499D85@ids.net>,
Ray Andraka <randraka@ids.net> wrote:
> You need to use signed arithmetic. The input and all subsequent stages
> should be sign extended to the width of the adders.

I had made a mistake: I was using STD_LOGIC instead of SIGNED, because my input was in 2's complement (direct from the A/D converter). Now the sign bit generation is correct!

> Also, keep in mind
> there is a gain through the filter, so you need to either limit the input
> bits or extend the output width to accommodate the gain. You might try not
> truncating first to get it working then prune the adders. At least then
> you'll be able to determine if the pruning is causing the problem.

Now I'm a little bit confused. I didn't truncate any register except the last one. Should I use more bits for each intermediate stage than defined by Hogenauer's paper?

TIA,

Flávio

>
> flavioas@my-deja.com wrote:
>
> > Hi,
> >
> > We are attempting to implement a CIC Interpolation Filter,
> > following Hogenauer's recipe, in a FLEX 10K device.
> > The parameters are: Bin = 12; Bout = 12; R = 4; M = 1 and N = 4.
> > So, the minimum register width for each stage is: 13, 14, 15, 15,
> > 15, 16, 16, 18.
> > We used 2's complement addition rules, i.e., all numbers unsigned,
> > simple binary addition, and carries past the sign bit are ignored.
> > We did the proper sign extension from one stage to another.
> > But the overall frequency response is not the expected one; it didn't
> > work.
> > Looking at the freq. resp. at each stage, we found the desired shape
> > up to the first (N+1) interpolator stage. From this point, as we add
> > more stages (N+2,..., 2N), things get worse. The outputs saturate,
> > we think, and we see nothing useful.
> > Is this problem familiar to someone? Any hint?
> >
> > Thanks in Advance,
> >
> > Flávio
>
> --
> -Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930 Fax 401/884-7950
> email randraka@ids.net
> http://users.ids.net/~randraka

Article: 20633
Hi all,

My company is working on a networking product that uses an FPGA for performing some analysis of Ethernet packets. The algorithms require quick access to some RAM-based tables, and dual-port on-chip block RAM structures fit the bill perfectly. The final product is for a price-sensitive market, so Xilinx's Spartan-II line looks perfect, but...

I called a distributor to get pricing for 50,000 XC2S100 Spartan-II chips and received a quote of $58.65 (down from $77 for a single chip). Yet Xilinx's literature claims this chip costs under $10 in volume. What constitutes "volume" to get this kind of price? Is there an FPGA with 30-40K of dual-port block RAM that costs <= $10 in volumes of 50,000?

Quote from http://www.xilinx.com/products/spartan2/index.htm:
"Say hello to a new level of performance. The Spartan-II family delivers 100,000 system gates for under $10, at speeds of 200 MHz and beyond, giving you design flexibility that's hard to beat."

Also, I looked and looked and could not find any disclaimers or volume quotes for these prices. There are plenty of flashing GIFs proclaiming this price though.

Thanks,
Andy

Article: 20634
You can take a look at http://www.xess.com/fndmake.pdf. This document shows you how to implement Xilinx projects using a makefile and batch mode processing. You can store the makefiles and VHDL files in a CVS tree and recall them to regenerate your project bit files. The makefile described in the document is a bit simple, but you can probably modify it to make it smarter.

> Gidday there,
>
> I'm looking for a better way to manage projects than is available from
> Xilinx Foundation.
>
> I want to be able to store the appropriate source files for synthesis
> and implementation in a CVS system, and use something like 'make' to
> handle compilation of subsystems.
>
> I want to be able to create libraries of VHDL, and correctly make use of
> all the hierarchical features of this language. I am very new to this
> (just over a year) and come from an ANSI-C+GNU development background.
>
> All this GUI management stuff is horrible, and the management of source
> versions is absolutely non-existent. Or am I missing something.
>
> Any hints to projects, or products that would be able to help me would
> be appreciated.
>
> Thanks
>
> Joshua Lamorie
> Systems Designer
> Xiphos Technologies Inc.

Dave Vanden Bout
FPGA Product Manager
XESS Corp.
2608 Sweetgum Drive, Apex, NC 27502, USA
(919) 387-0076 (work), (919) 387-1302 (fax)
devb@xess.com
http://www.xess.com

Article: 20635
Mark,

Why not directly instantiate the RAMB4_S16_S16 in your HDL? In this case, Coregen just adds a layer of unnecessary complexity.

Jeff

In article <88c4df$4r2@news.Informatik.Uni-Oldenburg.DE>,
"Mark Hillers" <Mark.Hillers@Informatik.Uni-Oldenburg.DE> wrote:
> Hello,
>
> I think I have found a bug in the Xilinx tool Coregen 2.1i.
> It happens when creating single-port block RAMs with words larger than 16
> bits.
>
> The resulting ".edn" file (for Synopsys) uses one RAMB4_S16_S16
> component where the
> lower 16 bits of the 24-bit word are mapped to port A (DOA[15:0]) and the
> upper 8 bits are mapped to port B (DOA[15:8]).
> But - and here comes the bug - the address of the desired word is simply
> mapped to both address ports (ADDRA and ADDRB (8 bits wide)) in the
> following way:
>
> ADDRA(4 downto 0) = myaddress(4 downto 0)
> ADDRA(7 downto 5) = "000"
> ADDRB(4 downto 0) = myaddress(4 downto 0)
> ADDRB(7 downto 5) = "000"
>
> The problem is now that both ports always load the same address and with
> it the same data. The result is an output which has the form CDABCD,
> where A, B, C, D are hex digits.
>
> In application note XAPP130 (V1.1) there is a solution to this problem. The
> mapping of the address ports should be:
>
> ADDRA(4 downto 0) = myaddress(4 downto 0)
> ADDRA(7 downto 5) = "000"
> ADDRB(4 downto 0) = myaddress(4 downto 0)
> ADDRB(7 downto 5) = "100"
>
> Now I am looking for a simple patch. The simplest would be a new version
> of Coregen, because I am not good at writing ".edn" files :-(.
>
> greetings
> mark

Article: 20636
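The two address mappings quoted in the post above can be written out as a small sketch. This only models the address bits as described there (5 address bits from "myaddress", upper 3 bits tied to a constant); the function name and the "fixed" flag are illustrative and not part of any Xilinx tool or library.

```python
def port_addresses(myaddress, fixed=True):
    """Return (ADDRA, ADDRB) as 8-bit integers for a 5-bit word address.

    fixed=False models the mapping reported as buggy (both ports use "000"
    in bits 7..5); fixed=True models the mapping attributed to XAPP130
    in the post (port B uses "100" in bits 7..5).
    """
    low = myaddress & 0x1F                  # myaddress(4 downto 0)
    addr_a = low                            # ADDRA(7 downto 5) = "000"
    upper_b = 0b100 if fixed else 0b000     # ADDRB(7 downto 5)
    addr_b = (upper_b << 5) | low
    return addr_a, addr_b


print(port_addresses(0x13, fixed=False))
print(port_addresses(0x13, fixed=True))
```

With the buggy mapping both ports read the same 16-bit location, which is consistent with the repeated byte in the output pattern Mark reports; with the offset, port B reads from the upper half of the RAM instead.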
Ray,

Yes, if it doesn't meet 74MHz, then the GSR net is useless for us, but it is worth enquiring.

If a non-GSR reset net is used, then this uses routing (I know Xilinx say that Virtex has plentiful routing) that could impact timing etc., and it uses the SR input of the Virtex CLB, meaning that RAM LUTs can't coexist with DFFs.

I understand your concern about synchronising the Reset signal with a CLB DFF (chicken and egg scenario), but there is a technical note by Peter Alfke (?) which details the construction of a synchronising FF from a CLB's LUTs (cross-coupled NAND gates).

Cheers, Mark.

On Mon, 14 Feb 2000 23:53:14 GMT, Ray Andraka <randraka@ids.net> wrote:

>It's only free if it meets timing. I think you'll find that you are past it at
>74MHz. Even in the 4K parts, GSR was only good up to a fraction of the clock
>rate the part can easily achieve with careful design. Also, the GSR hits every
>single flip-flop in the design, which in some cases can cause you grief
>(especially when you consider the need to resync the reset).
>
>Mark Luscombe wrote:
>
>> Ray,
>>
>> Thanks for your input, but if the GSR net is used, then routing and
>> CLB resources are not used, as it is a "free" function.
>>
>> The Xilinx Rep is coming to see me Monday, so hopefully he'll be able
>> to say whether I can use the GSR net at 74MHz.
>>
>> Cheers, Mark.
>>
>> On Sun, 13 Feb 2000 17:27:01 GMT, Ray Andraka <randraka@ids.net>
>> wrote:
>>
>> >Not every flip-flop in an FPGA design needs to be reset. You only need to
>> >reset select flip-flops to make sure that 'loops' in the logic reach a known
>> >state after some number of clock cycles. The data path will self-clear, so
>> >there's no need to apply explicit resets. You may also want to reset the
>> >flip-flops closest to the outputs, and hold them reset for a number of
>> >clocks after reset is released.
>> >
>> >I know that this makes the ASIC guys' blood curdle, but the fact of the
>> >matter is that it uses up resources in the FPGA and slows down your design.
>> >
>> >Rickman wrote:
>> >
>> >> Mark Luscombe wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > I am trying to work out a satisfactory method for resetting a
>> >> > synchronously designed Virtex running at 74MHz.
>> >> >
>> >> > Now, the reset signal needs to be synchronised with the 74MHz clock,
>> >> > and the propagation delay from this to the CLB and IOB DFFs needs to
>> >> > be less than 13ns to ensure that all registers within the device are
>> >> > reset on the same clock edge.
>> >> >
>> >> > Xilinx seem to have been telling people not to use the GSR net, as it
>> >> > is too slow, but it does seem a pity not to use it, and to use extra
>> >> > routing and CLB inputs for a global reset.
>> >> >
>> >> > It seems as though the STARTUP_VIRTEX component can accept a USER_CLK
>> >> > input, i.e. the 74MHz, so is this a good solution?
>> >> > Also, this component has a GSR input for an external reset signal;
>> >> > does anybody know if this is also synchronised with the USER_CLK input?
>> >> > The device is configured in 8-bit parallel with CCLK, which is related
>> >> > to the 74MHz.
>> >> >
>> >> > What have other designers done in this situation?
>> >> >
>> >> > Cheers, Mark.
>> >>
>> >> This is a subject that is often discussed here. What you describe with
>> >> using a user clock for startup is one way to do it. That should work if
>> >> the GSR net is fast enough to operate within your clock cycle.
>> >>
>> >> Another way to use the GSR which does not depend on synchronized release
>> >> of the GSR is to make sure that all of the inputs to your various FSMs
>> >> or other synchronous logic are in a state that will not cause the
>> >> machines to make a state change. For example, if the FSMs reset to an
>> >> IDLE state, then make sure that none of the inputs that let the machine
>> >> leave the IDLE state are asserted. Then even if the GSR is released on
>> >> different clock cycles for different FFs, it will not matter.
>> >>
>> >> Or use a couple of delay FFs to generate (from the GSR) a separate,
>> >> synchronized input to the FSMs which will delay state changes from this
>> >> initial state until it releases. This net will not need to go to all of
>> >> the FFs in your design and can be routed much faster.
>> >>
>> >> Another method which is similar to this last one is to have a separate,
>> >> external reset signal which is controlled by a micro or other logic.
>> >> This will only be released well after the config is complete and is
>> >> synchronized to the clock. As in the last method, this reset will not
>> >> need to go to every FF in the FPGA and so can be routed more quickly.
>> >>
>> >> The GSR is nice in that it puts every FF into a known state and it is
>> >> async, so it does it NOW! But releasing it can be a problem. A second,
>> >> more limited reset is a good way to get the FPGA started on the right
>> >> foot.
>> >>
>> >> I can't remember other ways that have been described, but I am sure
>> >> there are some.
>> >>
>> >> --
>> >>
>> >> Rick Collins
>> >>
>> >> rick.collins@XYarius.com
>> >>
>> >> remove the XY to email me.
>> >>
>> >> Arius - A Signal Processing Solutions Company
>> >> Specializing in DSP and FPGA design
>> >>
>> >> Arius
>> >> 4 King Ave
>> >> Frederick, MD 21701-3110
>> >> 301-682-7772 Voice
>> >> 301-682-7666 FAX
>> >>
>> >> Internet URL http://www.arius.com
>> >
>> >--
>> >-Ray Andraka, P.E.
>> >President, the Andraka Consulting Group, Inc.
>> >401/884-7930 Fax 401/884-7950
>> >email randraka@ids.net
>> >http://users.ids.net/~randraka
>> >
>> >
>
>--
>-Ray Andraka, P.E.
>President, the Andraka Consulting Group, Inc.
>401/884-7930 Fax 401/884-7950
>email randraka@ids.net
>http://users.ids.net/~randraka
>
>

Article: 20637
Hey all,

As we are approaching the end of a few projects, we have a need to make a few demo boards for our FPGAs. We want to have two - a Virtex board for one design, and an Altera APEX board for another. I have never done a demo board before, and while I am not responsible for it, it would heighten my learning to find out a little more about them.

My question - what are some of the basic/general things that you include on a demo board? For instance, we plan on having a PCI interface on the APEX board... granted, that is specific to what we want, but a general "Well, when I start a demo board, the first things I make sure I have on the board are: ..." is what I am asking for.

Thanks - please post or email (remove the spam block to email)

-Xanatos

Article: 20638
I think Hogenauer's paper took into account the growth bits. Much of his paper was dedicated to truncating the LSBs and the effects of doing so.

flavioas@my-deja.com wrote:

> In article <38A80F1C.499D85@ids.net>,
> Ray Andraka <randraka@ids.net> wrote:
> > You need to use signed arithmetic. The input and all subsequent stages
> > should be sign extended to the width of the adders.
>
> I made a mistake, was using STD_LOGIC instead of SIGNED, because
> my input was in 2's complement (direct from A/D converter). Now,
> the sign bit generation is correct!
>
> > Also, keep in mind
> > there is a gain through the filter, so you need to either limit the input
> > bits or extend the output width to accommodate the gain. You might try not
> > truncating first to get it working then prune the adders. At least then
> > you'll be able to determine if the pruning is causing the problem.
>
> Now, I'm a little bit confused. I didn't truncate any
> register, except the last one. Should I use more bits for each
> intermediate stage than defined by Hogenauer's paper?
>
> TIA,
>
> Flávio
>
> >
> > flavioas@my-deja.com wrote:
> >
> > > Hi,
> > >
> > > We are attempting to implement a CIC Interpolation Filter,
> > > following Hogenauer's recipe, in a FLEX 10K device.
> > > The parameters are: Bin = 12; Bout = 12; R = 4; M = 1 and N = 4.
> > > So, the minimum register width for each stage is: 13, 14, 15, 15,
> > > 15, 16, 16, 18.
> > > We used 2's complement addition rules, i.e., all numbers unsigned,
> > > simple binary addition, and carries past the sign bit are ignored.
> > > We did the proper sign extension from one stage to another.
> > > But the overall frequency response is not the expected one; it didn't
> > > work.
> > > Looking at the freq. resp. at each stage, we found the desired shape
> > > up to the first (N+1) interpolator stage. From this point, as we add
> > > more stages (N+2,..., 2N), things get worse. The outputs saturate,
> > > we think, and we see nothing useful.
> > > Is this problem familiar to someone? Any hint?
> > >
> > > Thanks in Advance,
> > >
> > > Flávio
> >
> > --
> > -Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930 Fax 401/884-7950
> > email randraka@ids.net
> > http://users.ids.net/~randraka

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 20639
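On the growth bits discussed in the CIC thread above, here is a minimal sketch of the register-width calculation for a CIC interpolator. The gain formula and the M = 1 special case are written as I recall them from Hogenauer's paper and should be treated as an assumption to check against the paper itself; the function name is illustrative.

```python
from fractions import Fraction


def cic_interpolator_widths(b_in, R, M, N):
    """Minimum register width per stage (N combs, then N integrators).

    Assumed gain up to and including stage j (to be verified against
    Hogenauer's paper):
        G_j = 2**j                               for j = 1..N   (combs)
        G_j = 2**(2N-j) * (R*M)**(j-N) / R       for j = N+1..2N (integrators)
    Width W_j = b_in + ceil(log2(G_j)), with W_N = b_in + N - 1 when M == 1.
    """
    widths = []
    for j in range(1, 2 * N + 1):
        if j <= N:
            gain = Fraction(2) ** j
        else:
            gain = Fraction(2) ** (2 * N - j) * Fraction(R * M) ** (j - N) / R
        bits = 0
        while Fraction(2) ** bits < gain:   # exact ceil(log2(gain)), gain >= 1
            bits += 1
        widths.append(b_in + bits)
    if M == 1:
        widths[N - 1] = b_in + N - 1        # special case for the last comb
    return widths


print(cic_interpolator_widths(b_in=12, R=4, M=1, N=4))
```

For Bin = 12, R = 4, M = 1, N = 4 this sketch agrees with the list posted in the thread except at the seventh stage, where it gives 17 bits rather than 16. If the formula above is right, that register may be worth re-checking, since an undersized integrator would be consistent with the later stages misbehaving.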
John Fielden wrote in message <88cs39$dkp$1@schbbs.mot.com>...
>Which OBUFT do you have instantiated. There are two different kinds, active
>high and active low enabled. Maybe you need the opposite one.

Not instantiated; inferred. The IOB has a mux that selects the proper polarity, so it shouldn't have mattered.

One of the Xilinx apps guys was kind enough to send me a note about this. Apparently, it's a known issue that Synopsys dropped the ball on. It's something that would only become obvious when you aren't meeting timing and you start peeking into what's going on.

-- a
-----------------------------------------
Andy Peters
Sr Electrical Engineer
National Optical Astronomy Observatories
950 N Cherry Ave
Tucson, AZ 85719
apeters (at) noao \dot\ edu

"Money is property; it is not speech."
-- Justice John Paul Stevens

Article: 20640
Hi everybody,

I am trying to get started with FPGA design on a Virtex. Up to now, I have designed my circuits with the Xilinx 4000 family, using LogiBLOX when needed. However, now that I have moved to Virtex, I have realized that the Foundation software (version 2.1i) does not allow LogiBLOX modules to be designed for Virtex.

Does anybody know where LogiBLOX has gone? Does anyone know how to use some other design tool instead of LogiBLOX?

Thanks in advance

Federico Silla

Article: 20641
The M2.1i software now reports hold times on input pads in the data sheet timing report file, and, of course, I have some significant (up to 2.5 ns) hold times relative to the system clock.

This does not happen when using the IOB flip-flop, with its delay line. It does happen when there is a small amount of logic between the input and the first flip-flop (so that the IOB flip-flop cannot be used), and when both are placed together in a CLB near the pad.

What is the best (easy+automatic) way to eliminate these hold times? Has anyone else noticed this?

--
/* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}

Article: 20642
jhallen@world.std.com (Joseph H Allen) writes:

> The M2.1i software now reports hold times on input pads in the data sheet
> timing report file, and, of course, I have some significant (up to 2.5 ns)
> hold times relative to the system clock.
>
> This does not happen when using the IOB flip flop, with its delay line. It
> does happen when there is small amount of logic between the input and the
> first flip flop (so that the IOB flip flop can not be used), and when both
> are placed together in a CLB near the pad.
>
> What is the best (easy+automatic) way to eliminate these hold times? Has
> anyone else noticed this?

No, I haven't noticed that they had hold times. I wonder if you can specify it in the timing constraints. Might be a good idea.

I'm not sure why you want to eliminate hold times on the output. I thought one normally wanted to eliminate clock to out. They might of course be related, but not always. Remember, the hold times are probably minimum, with the (implied) maximum at Tco.

Homann
--
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se

Article: 20643
In article <ltr9ecamgh.fsf@mis.dtek.chalmers.se>, Magnus Homann <d0asta@mis.dtek.chalmers.se> wrote: >No, I haven't noticed that they had hold times. I wonder if you can >specify it in the timing constraints. Might be a good idea. I don't see how. >I'm not sure why you want to eliminate hold times on the output. I >thought one normally wanted to eliminate clock to out. They might of >course be related, but not always. Remember, the hold times are >probably minimum, with the (implied) maximum at Tco. I'm working on a project which may actually be subjected to the entire temperature range, plus the chips driving the pins with the hold times are both fast, subject to clock-skew and close by. The problem is simplified (but of course, not eliminated) if the hold times are zero (the timing window is the same size, but the windows of all the pins are more likely to maximally overlap if they have the same hold time spec., which gives a greater overall window size). -- /* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */ int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0) +r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2 ]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
Article: 20644
I just went through the recent discussion in the group about conditional compilation, and I need to take it to a different level. I want to have a test bench that does: procedure ... if (a) {} else {} end proc ... where a is a value passed in from the command line. In Verilog, I can simply pass in a plusarg and then use the $test$plusargs system task. Is there a similar function in VHDL to make runtime decisions based on command-line parameters? Cheers, Gary spivey@ieee.com
Article: 20645
"Joseph H Allen" <jhallen@world.std.com> wrote in message news:Fq1J1q.BDA@world.std.com... > The M2.1i software now reports hold times on input pads in the data sheet > timing report file, and, of course, I have some significant (up to 2.5 ns) > hold times relative to the system clock. > > This does not happen when using the IOB flip flop, with its delay line. It > does happen when there is small amount of logic between the input and the > first flip flop (so that the IOB flip flop can not be used), and when both > are placed together in a CLB near the pad. > > What is the best (easy+automatic) way to eliminate these hold times? Has > anyone else noticed this? > I have noticed this, but I have not really come up with a great solution. The solutions I used: 1) Put a layer of flops in after the input. You might not be able to accommodate this change, depending on your design, but the extra layer of flops significantly lowered the hold times for me. 2) Move the nearest flop as close as possible to the IOB. This is a horrid solution (using RLOC etc). 3) Check, using EPIC or the FPGA Editor in 2.1i, that the inputs are not going through the DELAY element. This is turned ON by default in 2.1i, and was off in 1.5. Go figure. In the UCF file, tag all input pins with NODELAY if you haven't done so already. This one seemed to save the most in the timing department for me. Hope it helps....and if not, sorry, but I tried. -Xanatos
Article: 20646
Here's another (hopefully) easy one ... I am trying to write a debug message to stdout. It appears that the only way to do this is with the assert command (which also gives me a bunch of other stuff). And if I want to view variables, it gets even more arcane. So, in Verilog, I would type $display ("mem[%d] = %d", i, mem[i]); and in VHDL I get something like assert (1=0) report "mem[" & integer'image(i) & "] = " & integer'image(mem(i)) severity note; Is this the only way to do this? Does textio only work on a file or can it be used on stdout? Cheers, Gary spivey@ieee.org
Article: 20647
This is a non-trivial problem. My first approach would be to estimate the number of flip-flops, and then select a part that has twice as many as required. All vendors tell you how many flip-flops they have per block (XC4000 has 2, Virtex and Spartan2 have 4 FFs per CLB). But that can be wrong in either direction. If you have a lot of complex combinatorial logic, the estimated chip might be too small. If you can "hide" some of what you think of as FFs in the RAMs, BlockRAMs or even the 16-bit shift registers available in each Virtex LUT, then you can be far more compact. I think you have to invest a bit of time in studying the architectures. Hell, there are only two or three contenders. I obviously prefer the "end of the alphabet"... Peter Alfke, Xilinx Applications =============================== Richard Dempster wrote: > I am very new to programming FPGA's but old (today's standards) in C > programing. I am trying to find a generic method of defining what size > FPGA I need given a certain algorithm most likely written in C. Thus, I > have some idea what the compiled size of the C program but can I relate > the compiled size to how many gates will be necessary in a FPGA. > > I have four algorithms to code into a FPGA. Two are data manipulation, > one is pixel compression using a new algorithm, and the last is a > standard Principal Component Analysis algorithm. > > I know the software used to program the FPGAs will give an apporximate > gate count but only after development of the VHDL code. > > As this will be a proposal, I was wondering if there was a method to > estimate the FPGA size prior to all the encoding. I know I don't want > to oversize the FPGA area due to unnecessary time delays and power > requirements. > > ANY information will be helpful since I haven't found any information > yet and I seemed to have run out of places to search on the WEB. None > of the tutorials seem to deal with the algoritm to gate mapping. The > two books I have don't mention this either. > > Thanks > Rick Dempster > red@usgmrl.ksu.edu
Article: 20648
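The flip-flop rule of thumb from Peter Alfke's sizing reply (article 20647) can be sketched in a few lines of Python. This is only a rough illustration: the flip-flops-per-CLB figures are the ones quoted in that reply, while the function name `clbs_needed`, the `margin` parameter and the 1,500 flip-flop design are hypothetical and chosen purely for the example.

```python
import math

# Flip-flops per CLB as quoted in the reply; any other family would need
# its own data-sheet figure.
FFS_PER_CLB = {"XC4000": 2, "Virtex": 4, "Spartan2": 4}


def clbs_needed(estimated_ffs, family, margin=2.0):
    """CLB count for a part holding `margin` times the estimated flip-flops.

    margin=2.0 mirrors the advice to pick a part with twice as many
    flip-flops as the design is thought to require.
    """
    target_ffs = math.ceil(estimated_ffs * margin)
    return math.ceil(target_ffs / FFS_PER_CLB[family])


if __name__ == "__main__":
    # Hypothetical design: 1,500 flip-flops counted from register widths.
    for family in FFS_PER_CLB:
        print(family, clbs_needed(1500, family))
```

As the reply warns, a number like this can be off in either direction: heavy combinatorial logic pushes the real requirement up, while storage moved into RAMs, BlockRAMs or LUT shift registers pulls it down.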
The classical solution to this old problem is to utilize the input flip-flop with its input delay, but configured as a latch, and hold it permanently transparent. Peter Alfke ============================== Joseph H Allen wrote: > The M2.1i software now reports hold times on input pads in the data sheet > timing report file, and, of course, I have some significant (up to 2.5 ns) > hold times relative to the system clock. > > This does not happen when using the IOB flip flop, with its delay line. It > does happen when there is small amount of logic between the input and the > first flip flop (so that the IOB flip flop can not be used), and when both > are placed together in a CLB near the pad. > > What is the best (easy+automatic) way to eliminate these hold times? Has > anyone else noticed this? > > -- > /* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */ > int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0) > +r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2 > ]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
Article: 20649
jhallen@world.std.com (Joseph H Allen) writes: > In article <ltr9ecamgh.fsf@mis.dtek.chalmers.se>, > Magnus Homann <d0asta@mis.dtek.chalmers.se> wrote: > > >No, I haven't noticed that they had hold times. I wonder if you can > >specify it in the timing constraints. Might be a good idea. > > I don't see how. *blush* You clearly said hold times on the INPUT. As you can see below, I read it as output. That's why I'm confused, and you are right. > >I'm not sure why you want to eliminate hold times on the output. I > >thought one normally wanted to eliminate clock to out. They might of > >course be related, but not always. Homann -- Magnus Homann, M.Sc. CS & E d0asta@dtek.chalmers.se