Reconfiguration is triggered by pulling PROG_B Low. The chip recognizes this and answers with a Low on INIT_B. There is no need to hold PROG_B Low beyond that point. The 300 ns number is for "open loop" operation, where you do not look at INIT_B.

Peter Alfke, Xilinx Applications

Patrick Muller wrote:

> Hi
>
> To reconfigure a Xilinx Virtex-II, the Prog_B signal has to be held low for
> at least 300 ns. Is it sufficient to hold the Prog_B signal low only
> until the Init_B signal goes low? Or is the delay from the falling edge of
> the Prog_B signal to the falling edge of the Init_B signal longer
> than 300 ns anyway?
>
> Thanks,
>
> Patrick

Article: 35351
Hi,

A recent thread (Programming flash connected to CPLD via JTAG) and the chicken-and-egg problems associated with SRAM-based FPGA configuration reminded me that configuring Xilinx FPGAs from standard bytewide memory was much more convenient with previous FPGA series, and that some other previously available and useful functions were also lost for no apparent reason.

---

In Xilinx's early days (XC2000 / XC3000) there was a "Master Parallel Mode" that allowed the FPGA to be configured directly from a standard bytewide ROM / EPROM / Flash.
http://www.xilinx.com/partinfo/3000.pdf (page 25)
Spartan-II / Virtex also have a parallel mode, but only in the "slave" role, without the address generation logic (in fact a simple binary counter).

Most of my apps have both a processor and an FPGA, and not being able to configure the FPGA directly from the same parallel flash that holds the processor's application increases the system's chip count / EMI / power consumption / cost in a very artificial and inefficient way.

Namely, the 3 main ways to configure a Xilinx FPGA are:

- Use a serial config PROM, which is often more expensive and harder to find than the FPGA it configures. In systems that don't already use a bytewide flash, the cost and availability of one often make up for the few extra pins needed for the function. In systems that already use a bytewide flash, configuration storage in flash is either free or requires the next bigger flash size, for an added cost often under $1.

- Add an address counter / flash interface CPLD between the flash and the FPGA. That is a piece of silicon that takes board space, increases cost, is often power hungry, and is completely useless once the FPGA is configured. There is also the "how to fill the flash?" problem, and then the "how to fill the flash in a reasonable time?" problem, which is even more difficult to solve. Have a look at the "Programming flash connected to CPLD via JTAG" thread for more information on this problem.

- Put the FPGA out of the processor's data path, so that the processor can start running even if the FPGA is not configured yet, and configure the FPGA later (using software). That's usually what I end up doing, but it adds discrete glue logic and artificial complexity, and prevents the FPGA from being used for functions in the processor's data path (memory management, bus demultiplexing, bus arbitration, etc.), since these must be functional before the FPGA is configured.

Since putting the "Master Parallel Mode" back as a configuration option would add virtually no silicon area / cost to the FPGA (it's just a binary counter), I can't see any reason for not putting it back, except that it might hurt sales of expensive serial PROMs.

A few years ago, when FPGAs were mostly used in projects where cost was not a real concern, adding the config PROM was no big deal. Now that FPGAs are used in more cost-constrained applications (with the Spartan series), the costs associated with configuration need to be reduced, and the return of the "Master Parallel Mode" would be the easiest way to do it.

Also, sharing the configuration flash with a processor is a good way to allow FPGA configuration changes under processor software control, for late product differentiation / evolution / bug correction. This is not possible with OTP serial PROMs and requires extra circuitry / software with reprogrammable serial PROMs.

The other way around, it also allows the FPGA to program, in system, the bytewide flash containing its own application configuration and the processor's program / data: use serial or JTAG configuration to load the FPGA with a flash programmer / JTAG (or serial) emulation configuration, and then use that to program the flash.

-------

While I'm at it, I also can't see why the configuration oscillator, which was available to the user after configuration up until Spartan, is no longer offered.
Having an internal clock source (no matter how imprecise it was) was not only handy, but a very good way to add a watchdog to a processor-based design, or a low-power standby mode (put the main processor in clock-stop mode and keep some background tasks going, such as keyboard scanning, analog voltage (ADC) monitoring, or simply a periodic processor wake-up).
Now that this oscillator has been enhanced to allow a programmable frequency in master serial configuration mode, it would be even more useful.

-------

Another option that existed with early FPGAs was the crystal oscillator. With new-generation FPGAs supporting so many different I/O standards, adding a low-gain inverter option on at least a pair of pins, so that they can be used as a quartz oscillator, would certainly be both easy and useful.

http://www.xilinx.com/partinfo/3000.pdf (page 16)

-------

Since I'm writing a wish list, a device that I would really like to have in FPGAs, and that would be useful in virtually all applications, is a good programmable PLL frequency generator similar to this one:
http://www.icst.com/pdf/ics3070102.pdf
Clock generation nirvana would be (nearly!) attained if it could use a low-power, low-EMI, smallest-size, very low-cost 32768 Hz watch crystal as the reference.
You can find an example of such a design here (page 44):
http://www.semiconductors.philips.com/acrobat/3070.pdf
Some Atmel ARM processors also use a 32768 Hz crystal / PLL to generate their clock.

Also, lowering the minimum frequency of at least one of the DLLs from 25 MHz to something around 12 MHz would allow more choice of fundamental-mode crystal oscillator frequencies (oscillators beyond 24 MHz usually require 3rd-overtone mode, which is inherently less reliable, needs more parts including an inductor, and has a longer startup time), as well as reducing power and EMI generation.

-----------

I really hope to see the first 2 features back in newer FPGAs, and the next 2 would add value for many applications at negligible cost.

Am I the only one interested in these features, or would other users also like to see them in future Xilinx products?

Regards,
Eric.

Article: 35352
hamish@cloud.net.au wrote:

> Tom Brooks <tbrooks@corepower.com> wrote:
> > No, in fact, with 3.3, i was seeing runtimes
> > around 30 minutes, with 4.1, it took well over
> > an hour.
>
> Did you run with equivalent options? "-l 5" (effort level 5)
> implies "-xe 1" (extra effort level 1), which didn't exist on
> 3.1i. You might need -xe 0 to make the options equivalent.
>
> I have been quite pleased with the 4.1i SP1 results
> for a 2V6000 design I'm working on.
>
> Hamish
> --
> Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>

I have a query here, outstanding for some time. Somewhere in the Synplify documentation it says that use of Xilinx PAR effort level above 4 is not recommended. Anybody know why? Is it just a fossil from some old Synplify or Fndtn version?

Somewhere nearby there's also the statement that use of the -k 5 flag to MAP [map to 5 input functions] is also not recommended?

Article: 35353
Does anyone know if xchecker works with NT? I tried it, and it doesn't seem to work correctly...

Article: 35354
On Sun, 30 Sep 2001 16:12:51 -0400, Eric <erv_NO_SPAM@sympatico.ca> wrote:

>- put the FPGA out of the processor's data path so that it can start running
>even if the FPGA is not configured yet and configure it later (using software).
>That's usually what I end up doing, but it adds discrete glue logic and artificial
>complexities, and prevents the FPGA from being used in the processor's data
>path (memory management / bus demultiplexing, bus arbitration, Etc ...) since
>they must be functional before the FPGA is configured.

I commonly hang a Xilinx FPGA right on my uP's data bus, and configure it serially in bit-bang mode after the processor is up and running, using three parallel port pins. The FPGA config pattern data is just built into the processor's EPROM code. The FPGA is tristated until it's configured... this works fine.

John

Article: 35355
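The bit-bang approach John describes can be sketched roughly as follows. This is a Python model of the pin activity only, not driver code. The post does not say which three pins are used, so the CCLK and DIN names, the MSB-first bit order, and sampling on the rising edge of CCLK are all assumptions made for illustration.

```python
def bitbang_events(data: bytes):
    """Yield (cclk, din) pin states for serial bit-bang configuration.

    Assumptions (not from the post): each byte goes out MSB first, and
    DIN is sampled on the rising edge of CCLK, so DIN is set up while
    CCLK is still low.
    """
    for byte in data:
        for bit_index in range(7, -1, -1):
            bit = (byte >> bit_index) & 1
            yield (0, bit)  # CCLK low, present the data bit
            yield (1, bit)  # CCLK high, the FPGA samples DIN here


def sampled_bits(events):
    """Recover what the FPGA would see: DIN at each rising CCLK edge."""
    bits = []
    previous_cclk = 0
    for cclk, din in events:
        if cclk == 1 and previous_cclk == 0:
            bits.append(din)
        previous_cclk = cclk
    return bits
```

On real hardware each yielded pair would become one write to the port register; the config pattern itself would be the byte array already built into the processor's EPROM image.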
Just FYI. If you are using a processor plus FPGA or CPLD, you also might be interested in the Triscend Configurable System-on-Chip (CSoC) devices. The Triscend CSoC parts embed an industry-standard processor, peripherals, memory, a high-speed bus, and a block of LUT-based programmable logic, all in a single device. The CSoC device boots from a single external memory device, per your request. The boot PROM holds the configuration data for the embedded programmable logic plus the application program for the embedded processor. You can also download and debug the device directly via a JTAG port. There are two Triscend CSoC families available today--one based around the 8-bit 8051/8052 microcontroller and another based around the 32-bit ARM7TDMI RISC processor. There is more information at the following link. http://www.triscend.com/products The E5 family--based around the 8051--has both an internal ring oscillator and a crystal-oscillator amplifier that operates up to 40 MHz. The A7 family--ARM7TDMI-based--uses a 32.768 kHz watch crystal to synthesize system clock frequencies between 1 and 60 MHz. It too has an internal ring oscillator. Eric <erv_NO_SPAM@sympatico.ca> wrote in message news:<3BB77CC3.DBD58021@sympatico.ca>... [snip] > > --- > > In xilinx's early days (XC2000 / XC3000) there was a "Master Parallel Mode" > that allowed the FPGA to be configured directly from a standard bytewide > Rom / Eprom / Flash. > http://www.xilinx.com/partinfo/3000.pdf (page 25) > Spartan II / Virtex also have a parallel mode, but only in the "slave" role, > without the proper address generation logic (a simple binary counter in facts). > > Most of my apps have both a processor and a FPGA, and no being able > to configure it directly from the same parallel flash that holds the processor's > application increases the system's chip count / EMI / power consumption / cost > in a very artificial and inefficient way. 
> [snip] > > ------- > > Another option that existed with early FPGAs was the crystal oscillator. > With new generation FPGAs supporting so many different IO standards, > adding low gain inverter option on at least a pair of pins so that they > can be used as a quartz oscillator would certainly be both easy and useful. > > http://www.xilinx.com/partinfo/3000.pdf (page 16) > > ------- > > Since I'm in a wish list, a device that I would really like to have in > FPGAs and that would be useful in virtually all applications is a good > programmable PLL frequency generator similar to this one : > http://www.icst.com/pdf/ics3070102.pdf > Clock generation nirvana would be (nearly !) attained if it could use a > low power, low EMI, smallest size, very low cost 32768 Hz watch crystal > as the reference. > You can find here an example of such a design (page 44) : > http://www.semiconductors.philips.com/acrobat/3070.pdf > some Atmel ARM processors also use a 32768 hz crystal / PLL to generate > their clock. > [snip]Article: 35356
On Sun, 30 Sep 2001 21:25:12 +0100, Rick Filipkiewicz <rick@algor.co.uk> wrote:

>hamish@cloud.net.au wrote:
>
>> Tom Brooks <tbrooks@corepower.com> wrote:
>> > No, in fact, with 3.3, i was seeing runtimes
>> > around 30 minutes, with 4.1, it took well over
>> > an hour.
>>
>> Did you run with equivalent options? "-l 5" (effort level 5)
>> implies "-xe 1" (extra effort level 1), which didn't exist on
>> 3.1i. You might need -xe 0 to make the options equivalent.
>>
>> I have been quite pleased with the 4.1i SP1 results
>> for a 2V6000 design I'm working on.
>>
>> Hamish
>> --
>> Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>
>
>I have a query here, outstanding for some time. Somewhere in the
>Synplify documentation it says that use of Xilinx PAR effort level above
>4 is not recommended. Anybody know why ? Is it just a fossil from some
>old Synplify or Fndtn version ?

Most of my designs won't work with effort level less than 5. That's at more than 160 MHz though.

>Somewhere nearby there's also the statement that use of the -k 5 flag to
>MAP [map to 5 input functions] is also not recommended ?

I once had a design that wouldn't work with -k 4 or -k 5. It needed -k 6 to route to speed. Usually -k 4 gives good results.

Synplify also recommends that we use general purpose routing for async resets instead of the built-in GSR net. I ignore that "advice" as well.

Regards,
Allan.

Article: 35357
Hi,

Was wondering if anyone could lend some help. I am implementing a 45 degree complex mixer. The coefficient sequence should be, if I am correct in saying:

1, (.707 + j.707), +j, (-.707 + j.707), -1, (-.707 - j.707), -j, (.707 - j.707)

So, for a 4-bit input and output, an input of (6,3) gives, correctly according to the above sequence, [(6,3),(2,6),(D,6),(A,2),(A,D),(E,A),(3,A),(6,E)]. I get this output from my timing simulations. However, this waveform is not a mixed waveform (looking at the output of my FPGA, and also checking the shape of the simulated output in a spreadsheet). I am running at a slow 16 MHz, so I shouldn't be having any timing problems.

I know this post should probably go to some DSP newsgroup, but I thought that possibly someone here could give an answer to my dilemma.

Thanks
adrian

Article: 35358
"Ray Andraka" <ray@andraka.com> wrote > Very nice. I hadn't taken the time to figure out how to parse a string to get it > into a form usable as a bit vector. We had been building a library of fmapped > functions, but it has grown to the point where it is unwieldy. I've been looking > to do something similar, but have not had the time to put my mind to it. What are > your license terms for use of the VExprEval function? Is it free for use?, > licensed?, GPL? It's free. No guarantees, but no known bugs :) As regards Don's problem with generating stuff for Verilog, maybe a Perl pre-processing stage would do the job. A little text processing plus the Perl 'eval()' should do the job. Start with this: LUT3 Magic_Label(Ready = (Shift[0] & Sending | Stall)); and pre-process to this: LUT3 #('hnn) Label(Shift[0], Sending, Stall, Ready); I speak Verilog, but not with enthusiasm. Perhaps Professor Bromley could set it as homework for one of his classes.Article: 35359
Check out the Aldec website at www.aldec.com and find the Evita HDL tutorial. It should be at
http://www.aldec.com/Registration/download_form.asp?product=Evita&version=Verilog

There is a very good introductory course on Verilog; it starts at a basic level and covers the concepts.

Sean.
Principle Design Engineer
Calyptech
www.calyptech.com

> What is the fastest way to become a Verilog expert these days? Classic books
> like the Thomas/Moorby "The Verilog Hardware Description Language" seem dated,
> and Smith's "HDL Chip Design" is cluttered (IMHO) with VHDL, which we can't use
> here.

Article: 35360
Sorry, just found the problem. During simulation, I made the mistake of using a positive number input... however, upon trying a negative input, the mixer falls over. Have found the problem... moral of the story is: don't trust spreadsheets to do 2's complement arithmetic for you (unless anyone can tell me how, and which spreadsheet...)

adrian

Article: 35361
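On adrian's point about checking 2's complement values outside a simulator: a minimal Python sketch of the 45 degree mixer arithmetic, assuming 4-bit two's complement I/Q samples, the coefficient sequence from the earlier post, truncation toward zero, and plain wrap-around on overflow (no saturation). The function names are made up for illustration; the real design's rounding and overflow behaviour may differ.

```python
import math

WIDTH = 4
MASK = (1 << WIDTH) - 1
C = 0.707  # cos(45 degrees), rounded as in the post

# One full rotation in 45 degree steps, as (real, imag) pairs.
COEFFS = [(1, 0), (C, C), (0, 1), (-C, C), (-1, 0), (-C, -C), (0, -1), (C, -C)]


def to_signed(bits):
    """Interpret a WIDTH-bit pattern as a two's complement integer."""
    return bits - (1 << WIDTH) if bits & (1 << (WIDTH - 1)) else bits


def to_bits(value):
    """Wrap a signed integer back into a WIDTH-bit pattern (no saturation)."""
    return value & MASK


def mix(i_bits, q_bits):
    """Multiply one complex sample by each of the eight coefficients."""
    i, q = to_signed(i_bits), to_signed(q_bits)
    out = []
    for cr, ci in COEFFS:
        real = math.trunc(i * cr - q * ci)  # truncate toward zero (assumed)
        imag = math.trunc(i * ci + q * cr)
        out.append((to_bits(real), to_bits(imag)))
    return out
```

With the input (6,3) this reproduces the hex sequence quoted in the original question. Feeding it a negative sample such as (0xA, 0xD), i.e. (-6, -3), exercises the sign handling that the spreadsheet check missed.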
When I try to use a bufgmux in a virtex2 vhdl design I get the following error message.

# WARNING[1]: test.vhd(58): No default binding for component: "bufgmux". (No entity named "bufgmux" was found)

I have the following in my top level design

    --
    -- pragma translate_off
    library UNISIM;
    use UNISIM.VCOMPONENTS.ALL;
    -- pragma translate_on
    --

In the body I have as a component

    component BUFGMUX
      port (
        I0 : in  std_logic;
        I1 : in  std_logic;
        S  : in  std_logic;
        O  : out std_logic
      );
    end component;

and I used the following as an instantiation

    U_BUFGMUX: BUFGMUX
      port map (
        I0 => zero,   -- insert clock input used when select (S) is Low
        I1 => clk,    -- insert clock input used when select (S) is High
        S  => ena,    -- insert Mux-Select input
        O  => bufclk  -- insert clock output
      );

Note: I am using Foundation ISE 3.3i.

Any answers?

Thanks,
Theron Hicks

Article: 35362
The latest release of VHDL Studio is now available for the SPARC/Solaris platform. This is the first low-cost, integrated VHDL design suite on the market for the SPARC/Solaris platform. VHDL Studio integrates proven VHDL simulation technology with advanced VHDL entry tools including a state-machine editor and graphical testbench editor. Node locked licenses available for 1595 USD with no annual maintenance fees. For more information, see http://www.gmvhdl.com --Scott Thibault Green Mountain Computing Systems, Inc. http://www.gmvhdl.comArticle: 35363
In article <1001952786.11968.0.nnrp-07.9e9832fa@news.demon.co.uk>, Tim <tim@rockylogic.com.nospam.com> writes

>It's free. No guarantees, but no known bugs :)

Cheap at aleph-null times the price :-)

>As regards Don's problem with generating stuff for Verilog, maybe a
>Perl pre-processing stage would do the job. A little text processing
>plus the Perl 'eval()' should do the job.
>
>Start with this:
>    LUT3 Magic_Label(Ready = (Shift[0] & Sending | Stall));
>and pre-process to this:
>    LUT3 #('hnn) Label(Shift[0], Sending, Stall, Ready);
>
>I speak Verilog, but not with enthusiasm. Perhaps Professor Bromley
>could set it as homework for one of his classes.

Miaow! And I don't speak Perl at all, and I'm allergic to preprocessors. And I never was a Professor (except in the lower-case sense); and these days my classes are populated with people whose employers have paid for them, so leaving things "as a trivial exercise for the student" is less easy to get away with than it once was :-)

Seriously - the LUT-evaluator thing in VHDL looks like a really nice piece of work, one of those things that's easy enough in principle and really handy if you've access to it "off the shelf", but too much trouble to write for most people to be bothered. So, many thanks!

--
Jonathan Bromley
DOULOS Ltd.
Church Hatch, 22 Market Place, Ringwood, Hampshire BH24 1AW, United Kingdom
Tel: +44 1425 471223 Email: jonathan.bromley@doulos.com
Fax: +44 1425 471573 Web: http://www.doulos.com

Article: 35364
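The 'hnn value in the pre-processing idea quoted above is just the truth table of the expression packed into a number. A rough Python sketch of that evaluation step follows; it is not the VHDL function discussed in the thread. It assumes that INIT bit n holds the LUT output for the input combination whose binary encoding is n, and that the first listed signal (Shift[0]) is the least significant address bit. The function names are hypothetical.

```python
def lut_init(func, num_inputs):
    """Return the INIT value of a LUT implementing func.

    Assumption: INIT bit n is the output for address n, with the first
    argument of func as the least significant address bit.
    """
    init = 0
    for address in range(1 << num_inputs):
        inputs = [(address >> k) & 1 for k in range(num_inputs)]
        if func(*inputs):
            init |= 1 << address
    return init


def ready(shift0, sending, stall):
    # The expression from the post: Ready = (Shift[0] & Sending | Stall).
    # In Verilog, & binds tighter than |, hence the grouping below.
    return (shift0 & sending) | stall
```

Under these assumptions `lut_init(ready, 3)` evaluates to 0xF8, so the pre-processed line would read `LUT3 #('hF8)`. A different input ordering would give a different constant, which is why the port order has to be fixed before the value means anything.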
Steve,

How big is your FPGA fabric these days (i.e. how many CLBs)? Last I looked, you were using a rough equivalent of the Xilinx 4K architecture. IIRC the equivalent device size was not very big, so it didn't meet the needs of something that needed a lot of FPGA plus some microprocessor. What is the top of your line now?

"Steven K. Knapp" wrote:

> Just FYI. If you are using a processor plus FPGA or CPLD, you also
> might be interested in the Triscend Configurable System-on-Chip (CSoC)
> devices. The Triscend CSoC parts embed an industry-standard
> processor, peripherals, memory, a high-speed bus, and a block of
> LUT-based programmable logic, all in a single device.
>
> The CSoC device boots from a single external memory device, per your
> request. The boot PROM holds the configuration data for the embedded
> programmable logic plus the application program for the embedded
> processor. You can also download and debug the device directly via a
> JTAG port.
>
> There are two Triscend CSoC families available today--one based around
> the 8-bit 8051/8052 microcontroller and another based around the
> 32-bit ARM7TDMI RISC processor. There is more information at the
> following link.
> http://www.triscend.com/products
>
> The E5 family--based around the 8051--has both an internal ring
> oscillator and a crystal-oscillator amplifier that operates up to 40
> MHz.
>
> The A7 family--ARM7TDMI-based--uses a 32.768 kHz watch crystal to
> synthesize system clock frequencies between 1 and 60 MHz. It too has
> an internal ring oscillator.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 35365
LOL Tim wrote: > I speak Verilog, but not with enthusiasm. -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 35366
I am trying to configure a Spartan-II XC2S50 using Slave Parallel Mode and am unsuccessful. DONE never goes high and INIT never goes low. Is there any way to debug what state the configuration is in? For instance, does the FPGA drive a pin after it detects the sync pattern (or something like that)?

The manual states that "The D0 pin is considered the MSB bit of each byte." I thought I knew how to interpret that, but I'm not sure. The 5th byte in the bit stream is 0xaa. Do the bits D7-D0 read 10101010 or 01010101? I have tried both with no success, so I must have other problems.

Here is what I am doing:

1. Drive PROG low.
2. Wait for DONE to go low.
3. Drive PROG high.
4. Wait for INIT to go high.
5. Drive CS and WRITE low.
6. Write 64 bytes of 0xff (CCLK is driven low then high during each byte).
7. Write 69900 bytes of bit stream (CCLK is driven low then high during each byte).
8. Write 64 bytes of 0x00 (CCLK is driven low then high during each byte).
9. Drive CS and WRITE high.
10. Wait for INIT to go low or DONE to go high (neither occurs).

I added the 64 bytes of 0xff at the beginning because I don't have a free-running CCLK and need to make sure the FPGA configuration state machine is ready. I wasn't sure if I should have CS and WRITE low during these writes, but it looks like it is designed to throw out the extra 0xff bytes before the sync word. The same thing goes for the 0x00 bytes at the end.

If anyone can help me shed some light on this, it would be greatly appreciated.

-Chris

Article: 35366
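On the D0-is-MSB question in Chris's post: read literally, the manual's statement means the most significant bit of each bitstream byte appears on pin D0, so a host whose data bit 0 is wired to D0 would have to present each byte bit-reversed. A small Python sketch of that reversal is given here as an illustration; whether it is actually needed depends on the board wiring and on how the bitstream file was generated, neither of which the post states.

```python
def bit_swap(byte):
    """Reverse the bit order of one byte (bit 7 <-> bit 0, and so on)."""
    swapped = 0
    for position in range(8):
        if byte & (1 << position):
            swapped |= 1 << (7 - position)
    return swapped


def swap_stream(data: bytes) -> bytes:
    """Bit-reverse every byte of a bitstream before writing it out."""
    return bytes(bit_swap(b) for b in data)
```

Note that 0xaa reverses to 0x55, so that byte looks different in the two orderings, whereas the 0xff padding is the same either way and gives no hint about which ordering the host is using.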
Sounds to me like you need to re-compile your unisim library with the unisim files from 3.1i SP8; these have the Virtex II primitives in them and some bug fixes. Either that, or you should check that you have your Modelsim library mappings set up correctly for unisim.

Note that if your system has been upgraded to SP8 using the Xilinx patches, make sure that the SP6 patch has not been missed out; this is the critical one, which adds most of the Virtex II functionality. We had a problem whereby Xilinx 'forgot' to send us the SP6 patch but remembered 7 and 8 (applying 7 and 8 without 6 means problems with Virtex 2), so it might be worth checking with your sysadmin.

Jon

Theron Hicks <hicksthe@egr.msu.edu> wrote in message news:3BB88E39.DD27850A@egr.msu.edu...

> When I try to use a bufgmux in a virtex2 vhdl design I get the following
> error message.
>
> # WARNING[1]: test.vhd(58): No default binding for component: "bufgmux".
> (No entity named "bufgmux" was found)
>
> I have the following in my top level design
>
> --
> -- pragma translate_off
> library UNISIM;
> use UNISIM.VCOMPONENTS.ALL;
> -- pragma translate_on
> --
>
> In the body I have as a component
>
> component BUFGMUX
> port (
> I0 : in std_logic;
> I1 : in std_logic;
> S : in std_logic;
> O : out std_logic
> );
> end component;
>
> and I used the following as an instantiation
>
> U_BUFGMUX: BUFGMUX
> port map (
> I0 => zero, -- insert clock input used when select (S) is Low
> I1 => clk, -- insert clock input used when select (S) is High
> S => ena, -- insert Mux-Select input
> O => bufclk -- insert clock output
> );
>
> Note: I am using Foundation ISE 3.3i.
>
> Any answers?
>
> Thanks,
> Theron Hicks

Article: 35368
If you're not seeing any problems, then you've probably got sufficient decoupling caps - and the programmable grounds are helpful as well. One thing to try if you're worried about this, is to set some macrocells to low power mode - this will help prevent so many signals clocked off the same edge from transitioning all at the same time. Another alternative is to adjust the slew rate of some outputs to slow to help ward off ground bounce issues. Regards, ArthurArticle: 35369
SI Trick: Take the Vcc to its highest operational value (but do not exceed the absolute maximum DC specifications). Take the design down to its coldest operating environment (and add an extra -10C, ie get colder). See if you have any problems there. If there is an issue with Simultaneous Switching Outputs (SSOs), they will be far worse at this "corner." Austin Arthur wrote: > If you're not seeing any problems, then you've probably got sufficient decoupling caps - and the programmable grounds are helpful as well. > > One thing to try if you're worried about this, is to set some macrocells to low power mode - this will help prevent so many signals clocked off the same edge from transitioning all at the same time. Another alternative is to adjust the slew rate of some outputs to slow to help ward off ground bounce issues. > > Regards, > ArthurArticle: 35370
I was reading XAPP151 and noticed that it said for Virtex devices, bit 2 of the CTL register must be zero. Is this also the case for Virtex-E devices? I have a .bit file for a Virtex-E that has a 1 in bit 2 of the CTL register, and as it happens, we can't get this device to configure. Do I need to have a zero there?

Thanks,
Dave

Article: 35371
Dave,

It is my understanding that the only change from Virtex to Virtex-E in configuration was the longer frame size to accommodate the extra memory bits. Control, etc. all remained identical.

Austin

Dave Brown wrote:

> I was reading XAPP151 and noticed that it said for Virtex devices, bit 2 of
> the CTL register must be zero. Is this also the case for Virtex E devices? I
> have a .bit file for a Virtex E that has a 1 in bit 2 of the CTL register,
> and as it happens, we can't get this device to configure. Do I need to have
> a zero there?
> Thanks,
> Dave

Article: 35372
Hi,

I have implemented a barrel shifter in a Virtex-E part as:

    module barrel_up (datain, bs, dataout);

      // port definitions
      input  [31:0] datain;   // data in
      input  [4:0]  bs;       // barrel shift value
      output [31:0] dataout;  // barrel shift out data

      reg [31:0] output_reg;

      always @(datain or bs)
      begin
        case (bs)
          5'h00: output_reg = datain;
          5'h01: output_reg = {datain[30:0], 1'b0};
          5'h02: output_reg = {datain[29:0], 2'b0};
          5'h03: output_reg = {datain[28:0], 3'b0};
          5'h04: output_reg = {datain[27:0], 4'b0};
          5'h05: output_reg = {datain[26:0], 5'b0};
          5'h06: output_reg = {datain[25:0], 6'b0};
          5'h07: output_reg = {datain[24:0], 7'b0};
          5'h08: output_reg = {datain[23:0], 8'b0};
          5'h09: output_reg = {datain[22:0], 9'b0};
          5'h0a: output_reg = {datain[21:0], 10'b0};
          5'h0b: output_reg = {datain[20:0], 11'b0};
          5'h0c: output_reg = {datain[19:0], 12'b0};
          5'h0d: output_reg = {datain[18:0], 13'b0};
          5'h0e: output_reg = {datain[17:0], 14'b0};
          5'h0f: output_reg = {datain[16:0], 15'b0};
          5'h10: output_reg = {datain[15:0], 16'b0};
          5'h11: output_reg = {datain[14:0], 17'b0};
          5'h12: output_reg = {datain[13:0], 18'b0};
          5'h13: output_reg = {datain[12:0], 19'b0};
          5'h14: output_reg = {datain[11:0], 20'b0};
          5'h15: output_reg = {datain[10:0], 21'b0};
          5'h16: output_reg = {datain[9:0], 22'b0};
          5'h17: output_reg = {datain[8:0], 23'b0};
          5'h18: output_reg = {datain[7:0], 24'b0};
          5'h19: output_reg = {datain[6:0], 25'b0};
          5'h1a: output_reg = {datain[5:0], 26'b0};
          5'h1b: output_reg = {datain[4:0], 27'b0};
          5'h1c: output_reg = {datain[3:0], 28'b0};
          5'h1d: output_reg = {datain[2:0], 29'b0};
          5'h1e: output_reg = {datain[1:0], 30'b0};
          5'h1f: output_reg = {datain[0], 31'b0};
          default: output_reg = datain;
        endcase
      end

      assign dataout = output_reg;

    endmodule

This synthesizes to 5 logic levels, which seems like a lot to me. I tried doing it another way but it was also 5 levels. Anybody have any ideas on how to make it less, or is that just the way it is?

I am using Synplify for synthesis.

Nate
----------------------------------------------------
Nate Goldshlag nateg@pobox.com
Arlington, MA http://www.pobox.com/~nateg

Article: 35373
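For comparison with the case statement in Nate's post, the same shift can be written as five cascaded two-way selections, one per bit of bs (shift by 16, 8, 4, 2, 1). The following is a Python behavioural model of that structure, offered as a sketch only. It does not reduce the number of levels, and it says nothing about what a given synthesizer will infer; the function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def barrel_up_layers(datain, bs):
    """Left shift built from five cascaded 2:1 selections."""
    value = datain & MASK32
    for stage in (4, 3, 2, 1, 0):  # shift amounts 16, 8, 4, 2, 1
        if (bs >> stage) & 1:
            # Select the shifted input of this layer's muxes.
            value = (value << (1 << stage)) & MASK32
        # Otherwise the layer passes its input straight through.
    return value


def barrel_up_case(datain, bs):
    """Reference model: what the 32-entry case statement computes."""
    return (datain << bs) & MASK32
```

Each layer here corresponds to 32 two-input muxes controlled by a single bit of bs, so the logic grows with the number of layers rather than with the number of case entries.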
How about a multiplier? Definitely would be good in Virtex II.

We synthesize with Exemplar Leonardo and it will add pipeline registers within the multiplier logic which will make for a very fast clock rate. I can get well over 100MHz clock rate for a multiplier that Leonardo creates with pipeline registers.

I don't know if Synplify will do that for you.

Good luck,

Tom Dillon
Dillon Engineering, Inc.
http://www.dilloneng.com

>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<

On 10/1/01, 8:15:17 PM, Nate Goldshlag <nateg@pobox.com> wrote regarding barrel shifter in Xilinx Virtex-E:

> Hi,
> I have implemented a barrel shifter in a Virtex-E part as:
> [snip: barrel_up module, quoted unchanged from the original post]
> This synthesizes to 5 logic levels, which seems like a lot to me. I tried
> doing it another way but it was also 5 levels. Anybody have any ideas on how
> to make it less, or is that just the way it is?
> I am using Synplify for synthesis.
> Nate
> ----------------------------------------------------
> Nate Goldshlag nateg@pobox.com
> Arlington, MA http://www.pobox.com/~nateg

Article: 35374
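The multiplier suggestion rests on a simple identity: shifting left by bs is the same as multiplying by 2^bs, so the shift amount is first decoded to a one-hot value and that value is fed to the multiplier's second operand. A minimal Python check of the arithmetic follows, as an illustration only; it models the maths, not the timing, width or resource use of any particular hardware multiplier.

```python
def shift_via_multiply(datain, bs):
    """Left shift of a 32-bit word expressed as a multiplication."""
    one_hot = 1 << bs  # decoded shift amount, exactly one bit set
    return (datain * one_hot) & 0xFFFFFFFF  # keep the low 32 bits
```

Only the low 32 bits of the product are kept here, which matches the case statement dropping the bits shifted out at the top.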
The multipliers in Virtex-II do indeed work as a shifter, but they are quite slow compared to what you can do in the fabric.

The barrel shift, as implemented in the previous post, has n^2 cost. By using a merged tree structure you can reduce that to n*log(n). If you look at an individual output bit, it can be represented as a function of a number of inputs. That function can be represented as a tree of logic blocks. A merged tree is the result of taking the intermediate values and sharing them among the adjacent trees for the adjacent bits.

In the case of a barrel shift, the merged tree consists of layers of 2:1 muxes. In your case, one layer selects between straight through and a shift by 16 bits. The next layer takes that output and either passes it through or shifts it by 8 bits, and so on. The advantage of the merged tree is less logic and considerably less routing congestion, which makes it faster. It is also very easy to pipeline as well as to describe in RTL code. No synthesizer will infer a merged tree from the description given in the code below.

BTW, you need 5 layers if your kernel is a 4-input LUT, because you can only get a 2:1 mux in each layer. You can use the F5 and F6 muxes to make the layers faster, but they still count as layers of logic. You may have to instantiate the MUXF5s and MUXF6s to get your synthesizer to use them consistently.

Tom Dillon wrote:

> How about a multiplier? Definitely would be good in Virtex II.
>
> We synthesize with Exemplar Leonardo and it will add pipeline registers
> within the multiplier logic which will make for a very fast clock rate. I
> can get well over 100MHz clock rate for a multiplier that Leonardo
> creates with pipeline registers.
>
> I don't know if Synplify will do that for you.
>
> Good luck,
>
> Tom Dillon
> Dillon Engineering, Inc.
> http://www.dilloneng.com
>
> >>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<
>
> On 10/1/01, 8:15:17 PM, Nate Goldshlag <nateg@pobox.com> wrote regarding
> barrel shifter in Xilinx Virtex-E:
>
> > Hi,
> >
> > I have implemented a barrel shifter in a Virtex-E part as:
> >
> > module barrel_up (datain, bs, dataout);
> >
> > // port definitions
> > input[31:0] datain; // data in
> > input[4:0] bs; // barrel shift value
> > output[31:0] dataout; // barrel shift out data
> >
> > reg[31:0] output_reg;
> >
> > always @(datain or bs)
> > begin
> >   case (bs)
> >     5'h00: output_reg = datain;
> >     5'h01: output_reg = {datain[30:0], 1'b0};
> >     5'h02: output_reg = {datain[29:0], 2'b0};
> >     5'h03: output_reg = {datain[28:0], 3'b0};
> >     5'h04: output_reg = {datain[27:0], 4'b0};
> >     5'h05: output_reg = {datain[26:0], 5'b0};
> >     5'h06: output_reg = {datain[25:0], 6'b0};
> >     5'h07: output_reg = {datain[24:0], 7'b0};
> >     5'h08: output_reg = {datain[23:0], 8'b0};
> >     5'h09: output_reg = {datain[22:0], 9'b0};
> >     5'h0a: output_reg = {datain[21:0], 10'b0};
> >     5'h0b: output_reg = {datain[20:0], 11'b0};
> >     5'h0c: output_reg = {datain[19:0], 12'b0};
> >     5'h0d: output_reg = {datain[18:0], 13'b0};
> >     5'h0e: output_reg = {datain[17:0], 14'b0};
> >     5'h0f: output_reg = {datain[16:0], 15'b0};
> >     5'h10: output_reg = {datain[15:0], 16'b0};
> >     5'h11: output_reg = {datain[14:0], 17'b0};
> >     5'h12: output_reg = {datain[13:0], 18'b0};
> >     5'h13: output_reg = {datain[12:0], 19'b0};
> >     5'h14: output_reg = {datain[11:0], 20'b0};
> >     5'h15: output_reg = {datain[10:0], 21'b0};
> >     5'h16: output_reg = {datain[9:0], 22'b0};
> >     5'h17: output_reg = {datain[8:0], 23'b0};
> >     5'h18: output_reg = {datain[7:0], 24'b0};
> >     5'h19: output_reg = {datain[6:0], 25'b0};
> >     5'h1a: output_reg = {datain[5:0], 26'b0};
> >     5'h1b: output_reg = {datain[4:0], 27'b0};
> >     5'h1c: output_reg = {datain[3:0], 28'b0};
> >     5'h1d: output_reg = {datain[2:0], 29'b0};
> >     5'h1e: output_reg = {datain[1:0], 30'b0};
> >     5'h1f: output_reg = {datain[0], 31'b0};
> >     default: output_reg = datain;
> >   endcase
> > end
> >
> > assign dataout = output_reg;
> >
> > endmodule
> >
> > This synthesizes to 5 logic levels, which seems like a lot to me. I tried
> > doing it another way but it was also 5 levels. Anybody have any ideas on how
> > to make it less, or is that just the way it is?
> >
> > I am using Synplify for synthesis.
> >
> > Nate
> >
> > ----------------------------------------------------
> > Nate Goldshlag nateg@pobox.com
> > Arlington, MA http://www.pobox.com/~nateg

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
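
Editor's illustration: the layered 2:1 mux structure Ray Andraka describes above might be written roughly as follows. This is a minimal sketch and not code from the thread. The ports mirror the quoted barrel_up module, while the module name barrel_up_tree and the intermediate wires s16, s8, s4 and s2 are names assumed here for illustration. Whether a given synthesizer keeps this structure, or uses the F5/F6 muxes, is not guaranteed.

```verilog
// Sketch of a merged-tree (logarithmic) left barrel shifter.
// Ports match the barrel_up module quoted in the thread.
// barrel_up_tree and s16..s2 are hypothetical names.
module barrel_up_tree (datain, bs, dataout);

input  [31:0] datain;   // data in
input  [4:0]  bs;       // barrel shift value
output [31:0] dataout;  // barrel shift out data

// Each stage is one layer of 32 2:1 muxes. A stage either passes
// its input straight through or shifts it left by a fixed power
// of two, selected by one bit of bs.
wire [31:0] s16 = bs[4] ? {datain[15:0], 16'b0} : datain;
wire [31:0] s8  = bs[3] ? {s16[23:0],     8'b0} : s16;
wire [31:0] s4  = bs[2] ? {s8[27:0],      4'b0} : s8;
wire [31:0] s2  = bs[1] ? {s4[29:0],      2'b0} : s4;

// Final layer: shift by 0 or 1.
assign dataout  = bs[0] ? {s2[30:0],      1'b0} : s2;

endmodule
```

With five layers of 32 muxes each, this uses 160 2:1 muxes, which is the n*log(n) figure for n = 32, compared with one 32:1 selection per output bit in the case-statement version. Registering s16, s8, s4 and s2 would be one way to pipeline it, at the cost of added latency.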